Re: [squid-users] Zero sized reply and other recent access problems

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sun, 6 Mar 2005 21:07:54 +0100 (CET)

On Sat, 5 Mar 2005, H Matik wrote:

> Recently all of us are having problems with squid not serving certain
> pages/objects anymore.

Examples please.

> We do know that squid most probably does detect correct or incorrect html
> codes and tells it via its error messages.
>
> But I am not so sure if this should be a squid task.

It isn't, and Squid does nothing of the kind. Squid could not care less
about what HTML is. To Squid an HTML page is just a sequence of characters
with no meaning.

As Reuben said, Squid only cares about the validity of the HTTP protocol,
and the things it cares about are there for good reasons (mostly security).
It is known that there are several quite broken web sites out there which
will not work via 2.5.STABLE9, and due to the nature of the bugs in these
sites it is unlikely they will work with any later Squid release until the
site administrators fix their critical server bugs.

> Squid IMO should cache and serve what it gets from the server.

And this is what Squid does. The server must however speak the HTTP
protocol in a somewhat meaningful dialect for Squid to understand what the
server says and not reject it as a hacking attempt or other malicious
activity.
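
To illustrate the distinction, here is a minimal Python sketch of a
protocol-level check that looks only at the HTTP response head and never
at the body. It is not Squid's parser: the function name
check_response_head and the three checks it performs are assumptions
chosen for illustration only.

```python
def check_response_head(head):
    """Return a list of protocol-level problems in an HTTP response head.

    Hypothetical illustration only; this is not how Squid is implemented.
    The body (HTML or anything else) is never passed in or inspected.
    """
    problems = []
    lines = head.split(b"\r\n")

    # The status line must at least identify the protocol.
    if not lines[0].startswith(b"HTTP/"):
        problems.append("status line does not start with HTTP/")

    lengths = set()
    for line in lines[1:]:
        if not line:
            break  # an empty line ends the header block
        name, sep, value = line.partition(b":")
        if not sep:
            problems.append("header line without colon")
            continue
        if name != name.strip():
            problems.append("whitespace around header name")
        if name.strip().lower() == b"content-length":
            lengths.add(value.strip())

    # Two different lengths make the message boundary ambiguous.
    if len(lengths) > 1:
        problems.append("conflicting Content-Length headers")
    return problems
```

A response whose head is well formed passes regardless of how broken the
HTML in its body may be, while a head with two different Content-Length
values is flagged because the end of the message cannot be determined
safely.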

> The code check should be done by the browser - means incorrect code is a
> browser problem or a web server problem so it should be advised by the
> browser not by anything in the middle.

And this is exactly how it is.

> We here do use transparent squid on lots of sites and as soon as someone
> complains about this kind of problem we rewrite our fwd rules so that it
> does not go through squid anymore.

You complain this much about what a proxy should or should not do, and
still you intentionally and forcibly violate the fundamentals of TCP/IP by
hijacking your users' requests? Transparent interception violates Internet
Standard #3 "Requirements for Internet hosts" and also the general spirit
of the design of TCP/IP.

> IMO I think it might be better for squid not checking code.

There are certain things Squid must check in the HTTP protocol used for
transferring the HTML code. But Squid absolutely does NOT care about the
HTML or other contents of the requested site.

> Customers say: "Without your cache I can access the site, with your cache not.
> I do not want to know about and if you do not resolve this problem for me I
> do not use your service anymore but another where it works."

Unfortunately the world is not so unambiguous.

It may be worth mentioning that many of the sites failing with Squid
2.5.STABLE9 are likely to start failing with newer browsers as well, for
the same reasons Squid pukes on these sites.

> So even if "I" lose first my customer second they do not use squid anymore. I
> believe it could be considered to think about this.

I believe the 2.5.STABLE9 release has a very good balance in this.

Sure, there may still be a few buggy web servers out there where Squid
could safely work around the server bugs, but each of these has to be
analyzed very carefully and individually.

In addition, the only way of getting this done is to spend some time on
identifying why Squid rejects the responses from a certain site, and then
open a discussion here on squid-users on how Squid could perhaps work
around that broken web server.

If you cannot or will not investigate why problems arise but still expect
everything to work, then you should have a support contract, either for
Squid from one of the Squid support providers or for a commercial
proxy/cache if you prefer.

Just complaining without any information won't get you anywhere, except
perhaps blacklisted by some of the subscribers here.

> I like to add that we here are using squid since 97/98 and what I wrote here
> is not in any way meant as offending criticism of the developers but a point
> to think about. So what do you think about this?

And believe me, we think very carefully about these things.

If we did not then Squid-2.5.STABLE8 would have been released with the
HTTP parser in its strictest setting, i.e. the equivalent of
2.5.STABLE9 configured with "relaxed_header_parser off" and in addition
yelling a screenful of complaints per request in cache.log on each
malfunctioning web server seen.
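
For reference, the directive named above would appear in squid.conf
roughly as follows. This is a sketch only: the three values shown are the
ones understood to be documented for 2.5.STABLE9, so check the
squid.conf.default shipped with your release before relying on them.

```
# squid.conf fragment (Squid 2.5.STABLE9 or later)
# on   : accept malformed but unambiguous HTTP messages (default)
# warn : accept them, and log each occurrence in cache.log
# off  : reject such messages
relaxed_header_parser on
```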

Regards
Henrik
Received on Sun Mar 06 2005 - 13:07:57 MST
