What is the best way to bypass parent caches for some domains?

From: WWW server manager <webadm@dont-contact.us>
Date: Tue, 28 Oct 1997 23:51:35 +0000 (GMT)

What is the best (most efficient, least risk of mistakes) way to specify a
non-trivial (though as yet not really large) number of specific servers
and/or domains for which parent caches should never be used (i.e. Squid
should always connect direct)?

The situation is that we have site access by source domain to a growing
number of subscription-based electronic journals. The general rule is that
our cache passes any requests for targets outside JANET - the UK academic
network - to the JANET national cache, and those systems are outside our
domain. Hence if the e-journal requests are routed via the national cache,
the journal servers will reject them as unauthorised. We therefore need to
ensure that access to those servers is always direct from our cache.

What I've done so far is to use local_domain. The comments in the sample
squid.conf say "This tag specifies a list of domains local to your
organization.", which is clearly not true of these servers, but they also say

"For URLs which are in one of the local domains, the object is always
fetched directly from the source and never from a neighbor or parent."

which is pretty much what we want. It would actually be OK for requests to
go to siblings within our domain, but there would probably not be enough
sibling cache hits to make much difference, and going direct keeps things
simple.
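
For concreteness, a minimal sketch of that arrangement in squid.conf terms.
The parent hostname, the port numbers and the journal domains are invented
placeholders rather than our real entries, and the cache_host argument order
is as I understand it from the sample squid.conf:

```
# Hypothetical parent, standing in for the JANET national cache
# (hostname and ports are placeholders)
cache_host parent.cache.example.ac.uk parent 3128 3130

# Hypothetical e-journal domains that must always be fetched direct,
# never via a neighbour or parent
local_domain journals.example.com ejournal.example.org
```

The appeal is that the exclusion list is written once and does not mention
any parent by name.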

The alternatives seem to be:

 * use cache_host_domain with a *long* list of exclusions for each
   special-case target server or domain - clumsy and error-prone.
 * use cache_host_acl and define an access control list naming the
   relevant hosts/domains. Better, but still needs to be specified for
   each parent, hence scope for accidents e.g. if the details (names!) of the
   parent cache systems change and the basic parent definitions get edited
   but related definitions such as special-case exclusions get overlooked
   or misedited. "Be careful" is a reasonable response, but doesn't help
   when something goes wrong. :-)
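
As a sketch of what the two alternatives would look like (only one of them
would be used, and all host and domain names are hypothetical placeholders;
the directive usage follows my reading of the sample squid.conf, so treat the
details as assumptions):

```
# Alternative 1: per-parent domain exclusions.
# A leading "!" means the parent is not used for that domain.
cache_host_domain parent.cache.example.ac.uk !journals.example.com !ejournal.example.org

# Alternative 2: name the targets once in an ACL, then exclude
# that ACL for the parent.
acl ejournals dstdomain journals.example.com ejournal.example.org
cache_host_acl parent.cache.example.ac.uk !ejournals
```

In both cases the last line has to be repeated for every parent, and it
refers to the parent by name, which is where the scope for accidents comes
from.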

My feeling is that local_domain is actually the cleanest solution, but
is its matching implemented on the assumption that the list will be short
(typically one domain), so that it becomes inefficient with a long list?
We're up to 21 entries already.

What solutions do other people use in this sort of situation? Is there
actually a significant performance improvement for using an acl rather than
local_domain, or is acl matching no more efficient?

On a related point - is there a way to split overlong Squid configuration
lines over multiple lines in squid.conf? Some configuration directives are
clearly allowed to be repeated and act cumulatively (as shown by examples
that people have quoted), but I've not seen any indication whether that's
always allowed. Even if it is allowed in all cases where it would be useful,
I don't find it very clear compared to a single definition with the line
wrapped and continuations (probably) indented.
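
To make the comparison concrete, these are the two layouts I have in mind.
Neither is confirmed to work: whether local_domain may be repeated and act
cumulatively is exactly the open question, and the backslash continuation is
purely hypothetical syntax, so it is shown commented out. The domain names
are placeholders.

```
# Layout A: repeated directive
# (assumes local_domain accumulates across lines; unverified)
local_domain journals.example.com
local_domain ejournal.example.org

# Layout B: one definition wrapped over several lines
# (hypothetical continuation syntax, not known to be supported by Squid)
# local_domain journals.example.com \
#              ejournal.example.org
```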

                                John Line

University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk
Received on Tue Oct 28 1997 - 16:02:01 MST
