Re: [squid-users] Strange Problem regarding Accept-Encoding and compression / Regex anyone?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 29 Apr 2009 20:41:30 +1200

Stefan Hartmann wrote:
>
> Amos Jeffries wrote:
>>> Hello,
>>>
>>> i am running squid as reverse proxy in front of a web server farm. We
>>> are trying to implement Content-Compression, and it gets broken from
>>> "time to time".
>>>
>>> The www-servers are windows IIS 5, and the compression is done using a
>>> ISAPI Filter (no, not the original broken M$ filter from the server).
>>>
>>> We are using Version 2.7.STABLE6 in our setup. The www-servers are all
>>> sending a "Vary: Accept-Encoding" header, and the setup is working
>>> perfectly in my test scenarios. We have no "broken_vary_encoding"
>>> configured, and no ETag in the responses (we are only using Expires:
>>> headers).
>>>
>>> We installed the ISAPI Filters last week without putting the "Vary:
>>> Accept-Encoding" header on the www-servers in place, and blocked the
>>> "Accept-Encoding:" and the "Vary:" headers at squid, waiting for
>>> maintenance window to activate it. The site worked without any problem.
>>>
>>> During the last maintenance window, we activated the "Accept-Encoding:"
>>> and "Vary:" Headers (no longer blocking it in squid), and set up the
>>> WWW-Servers to send "Vary: Accept-Encoding" headers, and it works -
>>> sometimes with some browsers.
>>>
>>> The failures we see are content pages which end after some kB of
>>> correct data, e.g. the homepage is about 150 kB uncompressed, compressed
>>> around 30 kB (this is why we want compression), and the Serverfarm
>>> delivered Content-Pages consisting of the first 18 to 25 kB
>>> (uncompressed, different sizes possible) of the complete page, never
>>> coming to an end. This never happened in our test setup.
>>>
>>> The pages were (as intended) cached by squid, so we had the situation
>>> that for example Internet Explorer was working, but Firefox got the
>>> short page. And vice versa, sometimes Firefox worked, but IE failed. And
>>> sometimes all browsers worked.
>>>
>>> From tonight logs:
>>> 11:30 pm to 01:00 am: IE pages broken
>>> 01:00 am to 09:30 am: all working
>>> 09:30 am to 10:15 am: Firefox pages broken
>>> 10:15 am to 11:00 am: IE and Firefox pages broken
>>>
>>> Ok, perhaps the ISAPI filter is faulty in some conditions we did not
>>> test, with some browsers or bots or... so we uninstalled the ISAPI
>>> filter from all WWW-Servers, but left the "Vary: Accept-Encoding" header
>>> in place.
>>>
>>> Result: The error did not stop! I had to block the "Accept-Encoding:"
>>> and "Vary:" in squid to get the site working properly.
>>>
>>> Next step was to remove the Vary: Header from the WWW-Servers and not
>>> blocking the "Accept-Encoding:" and "Vary:" headers in squid: the site
>>> is working properly.
>>>
>>> So... are there any issues regarding squid and WWW-Servers sending
>>> "Vary: Accept-Encoding" (without actually doing Content-Compression)?
>>>
>>> When the error occurs, our logs are showing connections with "short"
>>> pages (i.e. 18 kB vs. 150 kB normally), which are obviously aborted after
>>> 900 seconds:
>>>
>>> Mon Apr 27 15:38:29 2009 900301 111.111.111.111 TCP_MISS/200 18806 GET
>>> http://real.server.de/ - DEFAULT_PARENT/real.server.de text/html
>>> [
>>> Accept-Encoding: gzip, deflate
>>> User-Agent: Nutscrape/1.0 (CP/M; 8-bit)
>>> Host: real.server.de
>>> Cookie: WT_SET=id=213.253.......
>>> Cache-Control: max-age=259200
>>> ]
>>>
>>> [
>>> HTTP/1.0 200 OK
>>> Date: Mon, 27 Apr 2009 13:23:29 GMT
>>> X-Powered-By: ASP.NET
>>> X-AspNet-Version: 2.0.50727
>>> Realserver-info: BuildTime: 27.04.2009 15:23:29; TimeSpan:
>>> 00:00:02.6719434; CacheTime: 120; Server: WWW31
>>> Publisher: Real-Server
>>> Expires: Mon, 27 Apr 2009 13:25:29 GMT
>>> Content-Type: text/html; charset=iso-8859-1
>>> Content-Length: 168830
>>> X-Cache: HIT from accel3
>>> Connection: close
>>> ]
>>>
>>> Please help!
>>>
>> Since the browser seems to be erratic, I assume that one particular client
>> request is causing some bad data to enter squid cache and being served for
>> all following clients for a period.
>> Look to the requests at the beginning of the time when things break. If
>> you can find the exact conditions or client it will be much easier to
>> track through the logs on later occurrences.
>>
>> It sounds a little bit like:
>> http://squidproxy.wordpress.com/2008/04/29/chunked-decoding/
>>
>> except for a few factors that don't fit:
>> IIS 5 is not known for this issue,
>> 2.7 has a decoding hack to fix it
>> and Vary: seemed to show relevance.
>>
>> I'd try raising the debug_options levels for request processing a bit and
>> see what becomes visible.
>
> Amos,
>
> thanks for the reply. debugging is somewhat tricky, since the serverfarm
> has to handle lots of traffic (around 200 Mio content pages per month)
> and debugging the real servers would generate a (too) huge amount of
> data. And in my test scenario I don't get the error...
>
> I will try to filter the "bad" requests. The idea is to stop the
> Accept-Encoding headers if they are "crazy", i.e. (all seen live)
>
> Accept-Encoding: FFFF, FFFFFFF
> Accept-Encoding: mzip, meflate
> Accept-Encoding: identity, deflate, gzip
> Accept-Encoding: gzip;q=1.0, deflate;q=0.8, chunked;q=0.6,
> identity;q=0.4, *;q=0
> Accept-Encoding: gzip, deflate, x-gzip, identity; q=0.9
> Accept-Encoding: gzip,deflate,bzip2
> Accept-Encoding: nnnnndeflate
> Accept-Encoding: x-gzip, gzip
> Accept-Encoding: gzip,identity
> Accept-Encoding: gzip, deflate, compress;q=0.9
> Accept-Encoding: gzip,deflate,X.509
>

lol. Thanks.

> and only let pass these two:
>
> Accept-Encoding: gzip,deflate
> Accept-Encoding: gzip, deflate
>
> first one is Firefox, the other is IE. This will match in about 80-90%
> of all requests, which would be ok.
>
> so i tried
>
> acl zipit req_header Accept-Encoding ^gzip,deflate$
> acl zipit req_header Accept-Encoding ^gzip, deflate$
> [...]
> header_access Accept-Encoding allow zipit
>
> but something seems to be wrong with the regex above, squid will let
> pass not only "gzip,deflate" as i would expect but also
> "gzip,deflate,xx" and "gzip,xx". "bla" will be blocked.
>
> seems like squid will let pass the header if it starts with gzip,
> disregarding the rest. am i wrong with my regex?

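Squid splits ACL values on whitespace. The space in your second pattern
splits it into "^gzip," and "deflate$", so together with the first line
the acl holds three patterns, and anything starting with "gzip," matches.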

I'd use:
  acl zipit req_header Accept-Encoding ^gzip,.deflate$
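
To see why the original acl let "gzip,xx" through, here is a rough
Python sketch of the behaviour described above. It is not Squid code:
acl_patterns and acl_matches are made-up helper names, Python's re module
stands in for the regex library Squid uses, and the whitespace
tokenizing is modelled with str.split(). For patterns this simple the
two regex flavours should agree.

```python
import re

def acl_patterns(*values):
    # Assumption for illustration: each acl value is tokenized on
    # whitespace, and every token becomes its own regex.
    patterns = []
    for value in values:
        patterns.extend(value.split())
    return patterns

def acl_matches(patterns, header_value):
    # The acl matches as soon as any one of its patterns matches.
    return any(re.search(p, header_value) for p in patterns)

# The two acl lines from the question: the space yields a third pattern.
original = acl_patterns("^gzip,deflate$", "^gzip, deflate$")
# The single-pattern variant, with "." standing in for the space.
suggested = acl_patterns("^gzip,.deflate$")

print(original)
for header in ("gzip,deflate", "gzip, deflate", "gzip,deflate,xx",
               "gzip,xx", "bla"):
    print(repr(header),
          "original:", acl_matches(original, header),
          "suggested:", acl_matches(suggested, header))
```

Note that "." needs exactly one character, so in this sketch the
pattern covers "gzip, deflate" but not the no-space form
"gzip,deflate". That form would still need its own acl line, or a "?"
after the dot.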

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
   Current Beta Squid 3.1.0.7
Received on Wed Apr 29 2009 - 08:41:43 MDT
