Re: Redundancy

From: WWW server manager <>
Date: Thu, 6 Nov 1997 10:50:24 +0000 (GMT)

Henny Bekker wrote:
> [snip; about how to avoid transient disruptions to service due to
> reconfiguring, rotating logs, etc., also crashes]
> True... But in practice Netscape and Microsoft Internet Explorer are
> the most used browsers. I agree it's no solution for other browsers such
> as lynx (which I'm using quite often). However, the main proxy does not
> crash that often (I hope).

Crashes are a fact of life, and hopefully rare. More of a problem is that

 * when asked to load a new configuration, Squid rejects connections while
   it writes a new cache/log file containing details of the current cached
   objects. On our server, that takes 2-5 minutes. That's in addition to
   the configurable timeout during which it rejects new connections but
   attempts to finish old ones; with the sample configuration, that adds
   a 30-second delay.

 * when asked to rotate the logs, in addition to rotating the "real" logs
   (the ones recording what it's been doing for later analysis) it also
   writes out the current object details to cache/log. That's good since
   the file would otherwise grow without bound, but it's bad because
   while Squid seems to continue accepting connections and to maintain
   existing ones, those connections are "on hold" until it's finished
   writing out cache/log, which again may take 2-5 minutes on our server.

When users start getting annoyed if a remote server cannot be reached or
fails to return the desired document within maybe 5-10 seconds, multi-minute
periods with connections rejected or frozen are *not* helpful and are, as
indicated in the original query, liable to result in users bypassing the
cache and never bothering to use it again.

The same issue arises in the all too common situation where "it takes
forever to load via the cache but is immediate if I bypass the cache",
mostly due to parent caches failing to respond quickly (for whatever
reason), though I'm not convinced that's the whole story. It's not relevant
to the question under discussion, except insofar as it shapes how users
react to the cache's behaviour (but if there's a subtle yet common cause
that can readily be fixed, details please!).

> You can also use it as a load-balancing mechanism. The only problem with
> that is that you want to have specific Web traffic on the same Web caching
> server to improve the HIT-rate.

Load-balancing how? Unless things have changed relatively recently, few if
any browsers will try more than the first DNS A record found under the
cache server's name. And if one server of two (say) is not responding,
relying on DNS round-robin shuffling of A records would at best arrange for
users to get a working cache server 50% of the time. For a page with many
embedded icons or other images, that means that even if the textual part of
the page happens to be loaded via the working server, 50% of the images
will be "broken image" icons or equivalent, representing failed retrievals
via the unresponsive cache.
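That failure mode is easy to check with a toy simulation. The sketch below
(in Python, purely illustrative; the server addresses are made up) assumes
the DNS server rotates the A records on each query and the browser only
ever tries the first record it gets back:

```python
import random

def first_record(records, rng):
    """Model a DNS server that shuffles A records per query and a
    browser that only ever tries the first record returned."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    return shuffled[0]

def broken_image_fraction(records, dead, n_images, rng):
    """Fraction of embedded-image fetches that land on a dead server,
    assuming each fetch triggers a fresh (shuffled) DNS answer."""
    failures = sum(
        1 for _ in range(n_images)
        if first_record(records, rng) in dead
    )
    return failures / n_images

rng = random.Random(42)  # fixed seed so the estimate is repeatable
servers = ["10.0.0.1", "10.0.0.2"]  # hypothetical cache A records
fraction = broken_image_fraction(servers, {"10.0.0.2"}, 10_000, rng)
print(f"about {fraction:.0%} of image fetches hit the dead server")
```

With one of two servers dead, the simulated broken-image rate comes out
around the 50% claimed above; a browser that caches its DNS answer would
instead fail all-or-nothing per session, which is arguably worse.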

It would be extremely helpful if Squid could be enhanced so as to avoid the
extended delay while writing out cache/log, either with the sockets still
live (as with log rotation) or already closed (hence no new connections, as
with reconfiguration). I suspect this might be tricky, though...?

Would it be viable if, in those situations, Squid were to treat all requests
handled while the file was being written as proxy-only, using cache files
which would be erased immediately after the document had been transmitted to
the browser? The only specific issue that occurs to me is that it would be
essential to avoid reusing a cache file, so that after reloading cache/log
a cache file could never exist on disk yet not contain the object that the
cache/log data identifies it as holding ...
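The bookkeeping that suggestion would need might look something like the
following sketch (hypothetical names throughout; this is not Squid's actual
code): hand out throwaway file numbers only from outside the set recorded
in the snapshot, so a stale cache/log entry can never point at a file that
now holds something else:

```python
import itertools

class SwapFileAllocator:
    """Hand out cache-file numbers while a cache/log snapshot is written.

    Numbers already recorded in the snapshot are off limits: reusing one
    could leave a file on disk whose contents no longer match the object
    cache/log names for it after a restart.
    """

    def __init__(self, snapshotted_numbers):
        self.frozen = set(snapshotted_numbers)  # numbers in the snapshot
        self.counter = itertools.count(max(self.frozen, default=-1) + 1)

    def allocate_throwaway(self):
        # Always a number above anything the snapshot mentions.
        n = next(self.counter)
        assert n not in self.frozen
        return n

alloc = SwapFileAllocator(snapshotted_numbers={0, 1, 2, 7})
tmp = alloc.allocate_throwaway()  # use for a proxy-only request,
print(tmp)                        # then unlink the file immediately -> 8
```

Starting the counter above the highest snapshotted number is the crudest
safe choice; anything cleverer would have to prove the number cannot
collide with an entry in the half-written snapshot.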

An alternative approach might be to keep track of documents processed while
cache/log was being written, and freeze processing of requests only very
briefly while those details are appended to cache/log after the pre-existing
object details have been written and before closing the file.
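The alternative could be sketched like this (again hypothetical, not Squid
internals): write the long-lived entries without blocking anyone, collect
whatever gets cached meanwhile, and freeze request handling only for the
short final append:

```python
def write_snapshot(path, existing_entries, new_during_write):
    """Write the pre-existing object details first; 'new_during_write'
    is a list that request handling keeps appending to while this runs.
    Only the final, short append needs requests frozen."""
    with open(path, "w") as log:
        for entry in existing_entries:   # the slow, multi-minute part
            log.write(entry + "\n")
        # --- freeze request processing only for this brief append ---
        for entry in new_during_write:
            log.write(entry + "\n")
        # --- unfreeze ---

existing = ["obj-%d" % i for i in range(5)]
delta = []
delta.append("obj-5")  # cached while the snapshot was being written
write_snapshot("cache_log_demo", existing, delta)
print(open("cache_log_demo").read().splitlines()[-1])  # -> obj-5
```

The pause then scales with the handful of objects cached during the write,
not with the whole cache index.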

This would also help in the case of Squid restarts, system shutdowns, etc.,
where at present you also get a configurable delay with the sockets closed,
then a long delay while cache/log is written. Either approach would mean
that the "dead time" would be much reduced.

This doesn't directly address the period with sockets closed after a HUP or
shutdown request, but it's implicit in the above suggestions that the
sockets would *not* be closed as early as they are now; otherwise, making
it possible to continue processing requests would be useful only in the
log rotation case.

A configurable (down to zero) delay with new connections rejected, while
attempting to finish old requests, *after* writing cache/log (with
in-progress requests proxied, not cached) would make more sense if it were
possible to continue handling requests while writing cache/log.

Is any of this feasible?

One of my few regrets in deciding to use Squid rather than Netscape's proxy
server is that with the latter, reconfiguration is "instantaneous" and
likewise shutdown, at the expense of dropping current requests on the floor;
in terms of maintaining the confidence of your user community and not having
them give up on the cache as "useless", dropping a handful of in-progress
requests with no perceptible period rejecting new connections is a big plus.
2-5 minutes of no response or rejected connections equates to a lot more
disruption than dropping a handful of current connections!

                                John Line

University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to
Received on Thu Nov 06 1997 - 02:52:52 MST

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:37:27 MST