Re: [squid-users] Can Squid hide all 404s from clients?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 22 Aug 2008 14:55:50 +1200 (NZST)

> Hello, Leonardo.
>
> Thanks for your prompt reply!
>
> On 8/21/08 3:02 PM, "Leonardo Rodrigues Magalhães"
> <leolistas_at_solutti.com.br> wrote:
>> are you sure you wanna do this kind of configuration ???
> Yes. I am aware that my request is unusual, as this is to be a
> special-purpose installation of squid.
>
>> have you
>> ever imagined that you can be feeding your users with really
>> invalid/very outdated pages ???
> Of course -- it is implicit in my request. The "users" of this particular
> squid instance will not be humans with desktop browsers, but other
> software
> systems that need a layer of "availability fault-tolerance" between
> themselves and the URL resources they depend on.
>
>> If the site admin took the files off
>> the webserver, maybe there's some reason for that ....
> The servers this Squid server will pull URLs from are many and diverse,
> and
> maintained by admins with widely different competency levels and uptime
> expectations.
>
>> i dont know if this can be done,
> Bummer. After all that explanation of why I shouldn't, I thought for sure
> you were going to say "But if you REALLY want to...". ;)
>
> So, now that I have explained my need, the question remains unanswered:
> Is it possible to configure Squid so that it always serves the "latest
> available" version of any given URL, even if the URL is no longer
> available
> at the original source?
>

That's how HTTP works, yes, it serves the 'latest available'. What you
don't seem to understand is that a 404 page IS the latest available copy
once an object has been administratively withdrawn.

Your fault-tolerance requirement brings to mind the stale-if-error
capability recently sponsored into Squid 2.7. That will let Squid serve a
stale copy of something if the origin source fails. I'm not too savvy on
the details, but it's worth a look. It should get you past a lot of the
possible temporary errors.
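
A rough sketch of how the pieces fit together (the values are
illustrative, so check the 2.7 release notes for the exact syntax): the
origin marks responses as reusable on error via the stale-if-error
Cache-Control extension, and squid.conf can put an upper bound on
staleness with max_stale:

  # HTTP response header sent by the origin server:
  # fresh for 10 minutes, reusable for a day if revalidation fails
  Cache-Control: max-age=600, stale-if-error=86400

  # squid.conf: cap on how stale an object may be served when
  # validation against the origin fails
  max_stale 1 week

Tune both numbers to how long a withdrawn or unreachable object is still
acceptable to your consumers.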

Achieving 100% retrieval of objects is not possible. Any attempt at it
ends up serving stale objects that should never have been stored.

The closest you can get is to:
 a) use stale-if-error where you can be sure it's safe to do so;
 b) get as many of the source apps/servers as possible to send correct
cache control headers (an example follows below);
 c) accept that there will be occasional 404s coming out (though rarely
in a well-tuned system).
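
For (b), "correct" just means ordinary HTTP caching headers from the
origin. A well-behaved response looks something like this (all values
illustrative):

  HTTP/1.1 200 OK
  Date: Fri, 22 Aug 2008 02:00:00 GMT
  Last-Modified: Thu, 21 Aug 2008 10:00:00 GMT
  ETag: "abc123"
  Cache-Control: public, max-age=3600, stale-if-error=86400

With a validator (Last-Modified or ETag) present, Squid can revalidate
cheaply instead of re-fetching, and stale-if-error says how long the
cached copy may stand in when the origin is down.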

Given the variance in levels of admin knowledge, you can confidently say
that the Squid layer handles a certain level of faults and recovery, and
provides a higher level of availability than it accepts. Then you can
start beating the admins at both the supply and drain ends with the
clue-stick when their apps fail to provide sufficient confidence on the
Squid input, or fail to handle a low level of faults. The problem is then
entirely theirs, and you can prove it by comparison with the working app
feeds.

The client software MUST be able to cope gracefully with occasional
network idiocy. Anything less is a major design flaw.

Amos