Re: filtering HTTPS/CONNECT (summary and continuation of discussion) from Marcus Kool on 2012-03-16 (squid-dev)

From: Marcus Kool <marcus.kool_at_urlfilterdb.com>
Date: Fri, 16 Mar 2012 21:39:45 -0300

Alex Rousskov wrote:
> On 03/16/2012 03:05 PM, Marcus Kool wrote:
>> There were 4 threads about 'filtering HTTPS' and I will try to
>> summarise here.
>>
>> Current situation with Squid 3.1.19:
>> What happens inside a CONNECT is practically not filterable because
>> 1) sslBump is not used, or
>> 2) sslBump is used and SSL+HTTP can be filtered, but it breaks the
>> other data streams for Skype et al. Using the unsafe options
>> 'sslproxy_cert_error allow all' and 'sslproxy_flags DONT_VERIFY_PEER'
>> to circumvent the latter problem is far from desirable.
>>
>> The wiki features pages say that Alex Rousskov is working on
>> BumpSslServerFirst
>> and MimicSslServerCert but unfortunately Alex has not (yet) participated
>> in the discussion.
>
> Sorry, I was on a business trip when the discussion started and could
> not respond until now (I tried!).

ok, no need to apologise.

>
>> To filter HTTP is trivial. To filter HTTPS there are two options:
>> 1) to filter without sslBump and then the filter only receives
>> "CONNECT <endpoint>:443" on which it has to make a decision to block
>> or not. This cripples the filter since it does not have access to the
>> content and in many cases cannot detect which application sends
>> what (type of) data.
>> An additional drawback is that the connection can be blocked but an
>> understandable error message cannot be presented to the end user.
>
> I believe this is already supported.

Yes. It technically works, but not being able to give
the end user a different error than "cannot connect to server"
is annoying to users.
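
To illustrate how little a filter has to work with in this mode, here is a minimal sketch in Python. The function name and the blocklist are hypothetical and not part of Squid or ufdbGuard; the only input is the CONNECT request line, so the decision can only be based on host and port.

```python
def decide_connect(request_line: str, blocked_hosts: set) -> str:
    """Decide on a CONNECT request using only the endpoint (hypothetical helper)."""
    method, target, _version = request_line.split()
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    # The target is "host:port"; no path and no content are visible here.
    host, _, _port = target.rpartition(":")
    return "block" if host.lower() in blocked_hosts else "allow"
```

For example, `decide_connect("CONNECT blocked.example:443 HTTP/1.1", {"blocked.example"})` yields a block decision, but the filter cannot tell which application is behind the request.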

>
>> 2) use sslBump. The filter will receive "CONNECT <endpoint>:443" as well as
>> "https://endpoint/path" (and content for RESPMOD) for SSL+HTTP based
>> connections so this is optimal for filtering SSL+HTTP connections.
>> The discussion was much around what to do with data streams that are not
>> SSL+HTTP. This can be any protocol encapsulated by SSL or simply any
>> protocol.
>>
>> To be able to filter all data, Squid needs a modification to present raw
>> data
>> about the non-SSL+HTTP data streams to a filter (URL redirector or ICAP).
>
> or eCAP.

I read about eCAP but when I decided to make a new URL filter
(I already wrote ufdbGuard, a URL redirector), I decided for ICAP
since it is more widespread and eCAP has not yet matured.

My new ICAP server (no better name yet than ufdbicapd) is multithreaded,
loads a 200 MB URL database in memory and is not that straightforward
to put inside Squid as a loadable module.
I do not want to judge eCAP since I know little about it, also
because there is not that much documentation.
I think I will look at it again to see if a hybrid solution is
feasible.

>> To keep the discussion focussed on one type of filter I will assume that
>> an ICAP server is used as the filter.
>>
>> The ICAP protocol has a considerable overhead (CPU processing) and
>> extending
>> the ICAP protocol for data stream filtering is not the first choice.
>> Amos and Henrik were "optimistic" about implementing a new pipe filter.
>>
>> The data streams for a bidirectional pipe have a different behavior than
>> HTTP and SSL+HTTP. Both client and server can send data at any time. And
>> for some, the server initiates the protocol and for others, the client
>> initiates. OpenVPN is a chameleon and can pretend to be an SSL+HTTP server
>> but is also a VPN server.
>>
>> In all cases that Squid sends a request to a filter, it would be
>> a *big* plus if it informs the filter what it already knows about the
>> CONNECT endpoint. E.g. If it has SSL/TLS or not.
>>
>> Since sslBump is being rewritten for 3.3 it is a good opportunity
>> to make Squid suitable for filtering *all* data streams.
>
> Sure, although please keep in mind that the bump-server-first and
> certificate mimicking code is pretty much complete. We are going through
> beta testing and code polishing cycles now. I hope I would not have to
> rewrite a lot of stuff that already works!

well, always good to hear that a project is almost done.

>
>> The new sslBump flow could be something like this:
>>
>> A) open socket to server. If error, close socket to client.
>
> If there is an error, bump-ssl-server-first returns an error to the
> client, after establishing a secure connection with it. Closing the
> connection can sometimes be a good option as well, of course.

Yeah, this depends on the error. When Squid cannot make a connection
to the server, it could simply close the socket to the client.
Just an idea. But doing a full handshake with a client and giving
a user-friendly error message is very nice.

>> B) do the logic for ICAP REQMOD CONNECT endpoint:443
>
> Bump-ssl-server-first does not change the order of ICAP processing and
> server connection establishment. And it would be wrong to change it,
> IMO. In other words, your (B) should come before (A) because (B) may
> change where we are connecting or even prohibit the CONNECT request
> (among other things):
>
> 1. Receive CONNECT.
> 2. Authenticate/etc.
> 3. Adapt/redirect/etc.
> 4. Bump.

You are right. What I suggested amounts to a REQMOD post-cache vectoring
point, which I totally forgot about. OK, let's stick with what we have.
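
The agreed order (receive, authenticate, adapt, bump) can be sketched roughly as follows. This is only an illustration in Python; the `authenticate` and `adapt` callbacks are hypothetical stand-ins for Squid's access checks and ICAP/eCAP REQMOD, not real Squid interfaces.

```python
from enum import Enum, auto

class Step(Enum):
    RECEIVE_CONNECT = auto()
    AUTHENTICATE = auto()
    ADAPT = auto()
    BUMP = auto()

def handle_connect(endpoint, authenticate, adapt):
    """Return (final_endpoint, steps); final_endpoint is None if the request is denied."""
    steps = [Step.RECEIVE_CONNECT, Step.AUTHENTICATE]
    if not authenticate(endpoint):
        return None, steps
    steps.append(Step.ADAPT)
    # Adaptation runs before any server connection: it may rewrite the
    # endpoint or return None to prohibit the CONNECT altogether.
    endpoint = adapt(endpoint)
    if endpoint is None:
        return None, steps
    steps.append(Step.BUMP)
    return endpoint, steps
```

The point of the ordering is that bumping only ever sees the endpoint that adaptation has approved.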

>
>> C) start SSL handshake to server and take care of all certificate issues.
>
> Bump-ssl-server-first does that.
>
>
>> If the SSL handshake fails with a PROTOCOL error, the socket must be
>> closed, a new socket must be opened, and Squid will assume that the
>> endpoint uses another protocol than SSL. Squid goes into tunnel mode and
>> all filtering will be done by the new pipe filter.
>> Squid may get a new option to define its behaviour in case the SSL
>> handshake fails. The options could be called sslBumpForNoneSSL with values
>> prohibitNoneSSL (terminate connection), passNoneSSL (always allow),
>> filterNoneSSL (default value - let new pipe filter decide).
>
>
> s/None/Not/ or s/None/Non/
>
> I suspect we should not allow any default here because the right
> decision is impossible to guess correctly as it depends on why SslBump
> was enabled in the first place. We could
>
> - serve a secure error (current bump-server-first code);
> - tunnel things through (not yet supported),
> - terminate the client connection (not yet supported);
> - ask a 3rd party filter (supported via external ACLs?).

The current bump-server-first code breaks various applications
that use CONNECT. SSH tunnels, VPNs and Skype are broken.

How will bump-ssl-server-first behave?
After an unsuccessful SSL handshake (protocol error), will it
close the socket and open a new, clean socket to be used in tunnel mode?
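
For reference, the three behaviours proposed for a failed handshake could be modelled as below. This is a sketch in Python only; the value names follow the proposal in this thread (with the s/None/Non/ correction applied) and are not existing Squid directives. The policy is a required argument, in line with the remark that no default should be assumed.

```python
from enum import Enum

class NonSslPolicy(Enum):
    PROHIBIT = "prohibitNonSSL"   # terminate the connection
    PASS = "passNonSSL"           # always allow, plain tunnel
    FILTER = "filterNonSSL"       # tunnel, but let the pipe filter decide

def action_after_protocol_error(policy: NonSslPolicy) -> str:
    """Map the configured policy to what happens after an SSL protocol error."""
    if policy is NonSslPolicy.PROHIBIT:
        return "terminate"
    if policy is NonSslPolicy.PASS:
        return "tunnel"
    return "tunnel-with-pipe-filter"
```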

>
>> D) Squid now knows that the connection has a SSL/TLS wrapper but does
>> not know yet if inside the wrapper HTTP is used.
>> Squid monitors what the client *and* the server send on the pipe. If the
>> client sends first and sends a valid HTTP command, Squid assumes that
>> the connection has SSL+HTTP.
>> If there is no SSL+HTTP Squid goes into tunnel mode
>
> Yes, provided Squid was configured to do that.

I am not sure what you mean. I think that Squid has no other option
than to go into tunnel mode when it sees that the protocol is
SSL+anything but not SSL+HTTP.

What type of configurable options are you thinking of?
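
The check described in D) could look roughly like this. It is a simplified Python sketch under stated assumptions: the function names are hypothetical, the decrypted first bytes are assumed to contain a complete request line, and the pattern is a loose approximation of an HTTP request line rather than a full parser.

```python
import re

# Loose approximation of "METHOD target HTTP/x.y" at the start of the data.
_REQUEST_LINE = re.compile(rb"^[A-Z]+ \S+ HTTP/\d\.\d\r?\n")

def looks_like_http_request(first_bytes: bytes) -> bool:
    return _REQUEST_LINE.match(first_bytes) is not None

def classify(first_sender: str, first_bytes: bytes) -> str:
    """Return "ssl+http" only if the client speaks first and sends an HTTP request."""
    if first_sender == "client" and looks_like_http_request(first_bytes):
        return "ssl+http"
    # Server-first protocols and non-HTTP client data fall back to tunnel mode.
    return "tunnel"
```

A real implementation would also have to handle partial reads and decide how long to wait for either side to speak first.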

>
>> and all filtering will be done with the new pipe filter.
>
> Yes, provided Squid was configured to do that.

indeed

>> E) do the "normal processing" and ICAP REQMOD/RESPMOD for
>> https://endpoint/path
>>
>> The total work of Squid+filter can be reduced if B) is done after C) since
>> Squid can inform the filter about the SSL handshake and the filter does
>> not have to do its own probe.
>
> If the filter needs handshake information, it can get it when it is
> available, but please do not change the order of ICAP/eCAP CONNECT
> adaptation -- adaptation must happen before we do anything with the
> CONNECT request (unless you add support for a post-cache REQMOD
> vectoring point).

agreed.

>> There was a suggestion for a connection cache which allows it to skip
>> checks and make assumptions about a new CONNECT to an endpoint that was
>> CONNECTed before.
>
> Sure, but that is just a secondary optimization.
>
>
>> The new pipe filter requires a new protocol yet to be defined.
>
> And/or a new API like eCAP.

Is it possible with the eCAP hooks to build functionality similar to
what we called the pipe filter?

>> Squid initially tells the filter what it already knows about the endpoint.
>> I.e. uses SSL or not, time to CONNECT, endpoint address, cached
>> information. The Squid pipe sends copies of all data to the filter and the filter can
>> reply with one of the following: OK (proceed with this data), REPLACE-CONTENT
>> (content and a flag to optionally also terminate the connection), TERMINATE (just
>> close sockets), OK-FOR-ALL (proceed and do not consult me any more for this
>> connection). Squid also informs the filter when the connection is terminated by the
>> client or the server.
>
> The above is very similar to ICAP except every I/O (instead of every
> HTTP message) becomes a transaction. I do not think you can specify all
> ICAP-like interactions in a short paragraph. Even ICAP folks had to add
> a few extensions to make the data pipeline more efficient, and here the
> efficiency would be more critical because we would be dealing with
> bytes, not messages.
>
> It would be nice not to repeat ICAP errors for this new protocol/API. I
> suggest that you look at a relatively recent ICAP extension and
> incorporate it in your design:
> http://www.icap-forum.org/documents/specification/draft-icap-ext-partial-content-07.txt

Of course, the one paragraph functional spec was only to give a rough idea
of what to expect.
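
To make the rough idea a bit more concrete, the four replies could drive a relay loop like the one below. This is a Python sketch only: the pipe filter protocol is not defined yet, so every name here (`Verdict`, `Reply`, `relay`, `ask_filter`) is hypothetical, and the loop covers one direction of the pipe.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List

class Verdict(Enum):
    OK = "OK"
    REPLACE_CONTENT = "REPLACE-CONTENT"
    TERMINATE = "TERMINATE"
    OK_FOR_ALL = "OK-FOR-ALL"

@dataclass
class Reply:
    verdict: Verdict
    content: bytes = b""      # replacement data for REPLACE_CONTENT
    terminate: bool = False   # optional flag for REPLACE_CONTENT

def relay(chunks: Iterable[bytes], ask_filter: Callable[[bytes], Reply]) -> List[bytes]:
    """Return the chunks that would be forwarded to the other side."""
    forwarded: List[bytes] = []
    consult = True
    for chunk in chunks:
        if not consult:
            forwarded.append(chunk)   # OK-FOR-ALL was given earlier
            continue
        reply = ask_filter(chunk)
        if reply.verdict is Verdict.OK:
            forwarded.append(chunk)
        elif reply.verdict is Verdict.OK_FOR_ALL:
            forwarded.append(chunk)
            consult = False           # do not consult the filter any more
        elif reply.verdict is Verdict.REPLACE_CONTENT:
            forwarded.append(reply.content)
            if reply.terminate:
                break
        else:                         # TERMINATE: just close the sockets
            break
    return forwarded
```

Note that every chunk is a round trip to the filter until OK-FOR-ALL arrives, which is where the efficiency concern about working on bytes instead of messages shows up.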

>
>> How do we go on from here?
>
> I will respond to that separately.
>
>
> Thank you,
>
> Alex.

Thanks

Marcus
Received on Sat Mar 17 2012 - 00:39:50 MDT