Re: issue with ICAP message for redirecting HTTPS/CONNECT

From: Marcus Kool <marcus.kool_at_urlfilterdb.com>
Date: Sun, 08 Jun 2014 19:18:14 -0300

On 06/08/2014 04:20 PM, Alex Rousskov wrote:
> On 06/08/2014 10:02 AM, Nathan Hoad wrote:
>
>> There's a bug in your ICAP server with how it's handling the
>> Encapsulated header that it sends back to Squid.
> ...
>> The Encapsulated header says that the HTTP object that has been sent
>> back contains HTTP response headers, and no body. This leads Squid to
>> believe it should be parsing a HTTP response
>
>
> Hello Marcus,
>
> In addition to the Encapsulated header wrongly promising an HTTP
> response, the ICAP response also contains an encapsulated HTTP body
> chunk (of zero size) when the Encapsulated header promised no body at
> all. That ICAP server bug is present in both GET and CONNECT adaptation
> transactions (but the correct behavior would be different in each of
> those two cases).

Thanks for pointing that out.
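
As an illustration of the two fixes described above, here is a minimal
sketch in Python of a REQMOD reply whose Encapsulated header matches what
is actually sent. It is not the ufdbICAPd code: build_reqmod_reply and its
parameters are hypothetical names, and the ISTag value is simply copied
from the trace quoted below. The sketch assumes the reply never carries an
encapsulated body, so it always ends with null-body and sends no chunk.

```python
CRLF = "\r\n"

def build_reqmod_reply(http_head_lines, is_http_response):
    """Build a body-less ICAP 200 reply (hypothetical helper).

    http_head_lines: start line plus header lines, without line endings.
    is_http_response: True if the encapsulated message is an HTTP response
    (announce res-hdr), False if it is a modified request (announce req-hdr).
    """
    http_head = (CRLF.join(http_head_lines) + CRLF + CRLF).encode("ascii")
    section = "res-hdr" if is_http_response else "req-hdr"
    icap_head = CRLF.join([
        "ICAP/1.0 200 OK",
        'ISTag: "5394572c-4567"',
        # null-body offset = length of the encapsulated header block
        f"Encapsulated: {section}=0, null-body={len(http_head)}",
    ]) + CRLF + CRLF
    # null-body means nothing follows the headers: no "0\r\n\r\n" chunk
    return icap_head.encode("ascii") + http_head

# The rewritten CONNECT request from the trace, sent as a modified request
reply = build_reqmod_reply(
    [
        "CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.0",
        "Host: blockedhttps.urlfilterdb.com",
        "User-Agent: Wget/1.12 (linux-gnu)",
        "X-blocked-URL: googleads.g.doubleclick.net",
        "X-blocked-category: ads",
    ],
    is_http_response=False,
)
print(reply.decode("ascii"))
```

For the GET case, where the server answers with an HTTP response instead
of a modified request, the same helper would be called with
is_http_response=True, and the trailing zero-size chunk is still omitted.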

> If you are writing yet another ICAP server, please note that free and
> commercial ICAP servers are available. Are you sure you want to go
> through the pains of writing yet another broken one? And that you
> actually need ICAP?

For this project I do need ICAP.
I was not satisfied with the free ICAP servers, and since I will release
my ICAP server into the public domain, a commercial one is not an option.

> Finally, please note that rewriting and even satisfying CONNECT requests
> is difficult because the browser has certain expectations about the
> origin server and the browser's security model prevents many CONNECT
> request and response manipulations.

Yes, I am aware of the troubles with certificates and how browsers deal with them.
ICAP was designed for HTTP, not HTTPS, but ICAP is all we have for content filtering.

I am aware that eCAP exists, but eCAP runs inside the Squid process
and has no support for multithreading, which is a must-have for this project,
so eCAP is not suitable for technical reasons.

Thanks
Marcus

> Cheers,
>
> Alex.
>
>
>
>> On 9 June 2014 00:22, Marcus Kool <marcus.kool_at_urlfilterdb.com> wrote:
>>> I ran into an issue with the ICAP interface.
>>> The issue is that a GET/HTTP-based URL can be successfully rewritten but a
>>> CONNECT/HTTPS-based URL cannot. I used debug_options ALL,9 to find out what
>>> is going wrong
>>> but I fail to understand Squid.
>>>
>>> GET/HTTP to http://googleads.g.doubleclick.net works:
>>>
>>> Squid writes:
>>> REQMOD icap://127.0.0.1:1344/reqmod_icapd_squid34 ICAP/1.0<0d>
>>> Host: 127.0.0.1:1344<0d>
>>> Date: Sun, 08 Jun 2014 13:54:09 GMT<0d>
>>> Encapsulated: req-hdr=0, null-body=135<0d>
>>> Preview: 0<0d>
>>> Allow: 204<0d>
>>> X-Client-IP: 127.0.0.1<0d>
>>> <0d>
>>> GET http://googleads.g.doubleclick.net/ HTTP/1.0<0d>
>>> User-Agent: Wget/1.12 (linux-gnu)<0d>
>>> Accept: */*<0d>
>>> Host: googleads.g.doubleclick.net<0d>
>>> <0d>
>>>
>>> ICAP daemon responds:
>>> ICAP/1.0 200 OK<0d>
>>> Server: ufdbICAPd/1.0<0d>
>>> Date: Sun, 08 Jun 2014 13:54:09 GMT<0d>
>>> ISTag: "5394572c-4567"<0d>
>>> Connection: keep-alive<0d>
>>> Encapsulated: res-hdr=0, null-body=233<0d>
>>> X-Next-Services: <0d>
>>> <0d>
>>> HTTP/1.0 200 OK<0d>
>>> Date: Sun, 08 Jun 2014 13:54:09 GMT<0d>
>>> Server: ufdbICAPd/1.0<0d>
>>> Last-Modified: Sun, 08 Jun 2014 13:54:09 GMT<0d>
>>> ETag: "498a-00000001-5394572c-4567"<0d>
>>> Cache-Control: max-age=10<0d>
>>> Content-Length: 0<0d>
>>> Content-Type: text/html<0d>
>>> <0d>
>>> 0<0d>
>>> <0d>
>>>
>>>
>>> CONNECT/HTTPS does not work:
>>>
>>> Squid writes:
>>> REQMOD icap://127.0.0.1:1344/reqmod_icapd_squid34 ICAP/1.0<0d>
>>> Host: 127.0.0.1:1344<0d>
>>> Date: Sun, 08 Jun 2014 12:29:32 GMT<0d>
>>> Encapsulated: req-hdr=0, null-body=87<0d>
>>> Preview: 0<0d>
>>> Allow: 204<0d>
>>> X-Client-IP: 127.0.0.1<0d>
>>> <0d>
>>> CONNECT googleads.g.doubleclick.net:443 HTTP/1.0<0d>
>>> User-Agent: Wget/1.12 (linux-gnu)<0d>
>>> <0d>
>>>
>>> ICAP daemon responds:
>>> ICAP/1.0 200 OK<0d>
>>> Server: ufdbICAPd/1.0<0d>
>>> Date: Sun, 08 Jun 2014 12:29:32 GMT<0d>
>>> ISTag: "5394572c-4567"<0d>
>>> Connection: keep-alive<0d>
>>> Encapsulated: res-hdr=0, null-body=193<0d>
>>> X-Next-Services: <0d>
>>> <0d>
>>> CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.0<0d> --
>>> NOTE: also fails: CONNECT https://blockedhttps.urlfilterdb.com HTTP/1.0<0d>
>>> Host: blockedhttps.urlfilterdb.com<0d>
>>> User-Agent: Wget/1.12 (linux-gnu)<0d>
>>> X-blocked-URL: googleads.g.doubleclick.net<0d>
>>> X-blocked-category: ads<0d>
>>> <0d>
>>> 0<0d>
>>> <0d>
>>>
>>> and Squid in the end responds to wget:
>>> HTTP/1.1 500 Internal Server Error
>>> Server: squid/3.4.5
>>> Mime-Version: 1.0
>>> Date: Sun, 08 Jun 2014 13:59:27 GMT
>>> Content-Type: text/html
>>> Content-Length: 2804
>>> X-Squid-Error: ERR_ICAP_FAILURE 0
>>> Vary: Accept-Language
>>> Content-Language: en
>>> X-Cache: MISS from XXX
>>> X-Cache-Lookup: NONE from XXX:3128
>>> Via: 1.1 XXX (squid/3.4.5)
>>> Connection: close
>>>
>>> A fragment of cache.log is below.
>>> I think that the line
>>> HttpReply.cc(460) sanityCheckStartLine: HttpReply::sanityCheckStartLine:
>>> missing protocol prefix (HTTP/) in 'CONNECT blockedhttps.urlfilterdb.com:443
>>> HTTP/1.0<0d>
>>> indicates where the problem is.
>
>
>
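
The two defects discussed in this thread (a res-hdr section that does not
start with an HTTP status line, and bytes sent after a promised null-body)
can be checked mechanically. The following is a rough Python sketch of such
a check. It is not Squid's parser: check_icap_reply and parse_encapsulated
are hypothetical names, the sample reply is shortened from the trace above,
and the check assumes the whole reply is already in one buffer.

```python
def parse_encapsulated(value):
    """'res-hdr=0, null-body=193' -> [('res-hdr', 0), ('null-body', 193)]"""
    pairs = []
    for item in value.split(","):
        name, _, offset = item.strip().partition("=")
        pairs.append((name, int(offset)))
    return pairs

def check_icap_reply(raw):
    """Return a list of mismatches between Encapsulated and the payload."""
    icap_head, _, payload = raw.partition(b"\r\n\r\n")
    value = None
    # Skip the ICAP status line, then look for the Encapsulated header
    for line in icap_head.decode("ascii").split("\r\n")[1:]:
        name, _, rest = line.partition(":")
        if name.strip().lower() == "encapsulated":
            value = rest.strip()
    if value is None:
        return ["no Encapsulated header"]

    sections = parse_encapsulated(value)
    first_name = sections[0][0]
    last_name, last_offset = sections[-1]
    start_line = payload.split(b"\r\n", 1)[0]

    problems = []
    if first_name == "res-hdr" and not start_line.startswith(b"HTTP/"):
        problems.append("res-hdr promised, but start line is not an HTTP status line")
    if first_name == "req-hdr" and start_line.startswith(b"HTTP/"):
        problems.append("req-hdr promised, but start line is an HTTP status line")
    if last_name == "null-body" and len(payload) > last_offset:
        extra = len(payload) - last_offset
        problems.append(f"null-body promised, but {extra} extra bytes follow the headers")
    return problems

# Shortened version of the failing CONNECT reply from the trace
http_head = (
    b"CONNECT blockedhttps.urlfilterdb.com:443 HTTP/1.0\r\n"
    b"Host: blockedhttps.urlfilterdb.com\r\n"
    b"\r\n"
)
encapsulated = b"Encapsulated: res-hdr=0, null-body=%d\r\n" % len(http_head)
bad_reply = b"ICAP/1.0 200 OK\r\n" + encapsulated + b"\r\n" + http_head + b"0\r\n\r\n"

problems = check_icap_reply(bad_reply)
for p in problems:
    print(p)
```

Run against the sample, the sketch reports both the res-hdr mismatch and
the stray zero-size chunk.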
Received on Sun Jun 08 2014 - 22:18:19 MDT