Re: Ranting on ipcache bogosity

From: Oskar Pearson <oskar@dont-contact.us>
Date: Wed, 21 Aug 1996 11:43:21 +0200 (GMT)

Chris Fedde wrote:
>
> The Problem:
>
> With positive_dns_ttl set to zero squid exhibits a behavior which
> is confusing for multi-server sites with inconsistent failure modes.
> The behavior with positive_dns_ttl set to non zero is much worse so I
> will not even go into that (see the ps below).
>
> My Exploration:

> 1) Forget about caching DNS results in the server. Named
> does a fine job of it and the async technique used by the
> dnsservers work well to access a local (or remote) IP
> address cache. Further, the current ip cache implementation
> is broken.
>
> 2) Cache _failed_ ip addresses for each HTTP attempt. Set
> a TTL for these cache entries at something like the TTL
> for the DNS record itself.
>
> 3) Check the failed ip cache before making a connection attempt.
>
> 4) Try ALL THE ADDRESSES returned by DNS before returning a failure
> to the client.
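Points (2)-(4) above could be sketched roughly like this in C. All names here (try_connect, is_failed, mark_failed, connect_any) are illustrative, not squid's actual code, and TTL expiry of the failed entries is left out for brevity:

```c
/* Sketch: check a failed-IP cache before each attempt (point 3),
 * record failures (point 2), and only give up once every address
 * has been tried (point 4). */
#include <string.h>

#define MAX_ADDRS 16

static const char *failed[MAX_ADDRS];
static int n_failed = 0;

static int is_failed(const char *ip) {
    for (int i = 0; i < n_failed; i++)
        if (strcmp(failed[i], ip) == 0)
            return 1;
    return 0;
}

static void mark_failed(const char *ip) {
    if (n_failed < MAX_ADDRS)
        failed[n_failed++] = ip;
}

/* stand-in for a real connect(); here only one address "works" */
static int try_connect(const char *ip, const char *good) {
    return strcmp(ip, good) == 0;
}

/* returns the index of the address that connected, or -1 when the
 * whole list is exhausted - only then does the client see a failure */
int connect_any(const char **addrs, int n, const char *good) {
    for (int i = 0; i < n; i++) {
        if (is_failed(addrs[i]))
            continue;              /* point (3): consult failed cache */
        if (try_connect(addrs[i], good))
            return i;
        mark_failed(addrs[i]);     /* point (2): remember the failure */
    }
    return -1;                     /* point (4): all addresses tried */
}
```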
What about the following too:

When multiple DNS entries are returned, cache them all identically.
My reasoning is as follows - when the new version of Netscape came out,
they used a random load-balancing system to keep their servers sane.

BUT - they had about 20 servers, so while one of our users downloaded
Netscape from ftp1, 19 other people were downloading it from the
other servers... This problem can only get worse :(

Fixing this is going to be hard to implement because
Netscape does its load balancing in a strange way: they set a very
low TTL on their records, and then rotate the records on their
name server. (ie - if you look up the address for ftp.netscape.com now,
it may give you two records, but if you do it five minutes later, you
will get a different address...)

You could (I suppose) get it to cache all DNS requests for something like
24 hours :) (Evil grin ;) and only re-look them up if you issue a reload command.

(Yes, I know that there are all sorts of bad points about not following
the RFCs regarding DNS lookups etc, but please don't flame me!)

On the other hand, something like www.microsoft.com works on a much
easier system to cache:
When you ask for www.microsoft.com, it returns 16 different IP addresses...

It seems that squid stores its files using an algorithm based on
the IP address?

Assuming that this is true, it would be best to modify the squid code as little
as possible. What I think would work is the following (feel free to stomp on it):
When squid queries the cache with a hostname, the cache looks it up.
If there are multiple addresses, it randomly chooses one (it MUST be random,
otherwise all the caches in the world could end up querying a single server for
Netscape). It passes this address to squid, along with a flag that says
"There are multiple addresses!".
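A minimal sketch of that random pick, with the "multiple addresses" flag just carried alongside the chosen address (the struct and function names are invented for illustration):

```c
/* Given several addresses for one name, choose one at random so that
 * every cache in the world doesn't converge on the same server. */
#include <stdlib.h>

struct lookup_result {
    const char *addr;   /* the address handed to squid           */
    int multiple;       /* flag: more than one address available */
};

struct lookup_result pick_addr(const char **addrs, int n) {
    struct lookup_result r;
    r.addr = addrs[rand() % n];   /* MUST be random across caches */
    r.multiple = (n > 1);
    return r;
}
```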

If squid can't connect, it can re-query the dnsserver with a flag that
says "Dump that one for this address!". The dnsserver would then trash
that IP for a definable period of time (preferably the TTL reported by the
DNS, or a fraction thereof) and return the "next best" to the cache.
(We must make sure that there is some kind of "out of possible IPs" flag
that would be returned to squid, which would then fail.)
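The "dump that one" idea might look something like this in the dnsserver, keeping a bad-until timestamp per address; a dumped address is skipped until its penalty (here, the record's TTL) expires, and NULL stands for the "out of possible IPs" case squid would fail on. All names here are invented, not squid's:

```c
/* Sketch: per-address penalty table in the dnsserver. */
#include <stddef.h>
#include <string.h>
#include <time.h>

#define NADDR 3

struct addr_entry {
    const char *ip;
    time_t bad_until;   /* 0 = usable */
};

static struct addr_entry table[NADDR] = {
    { "203.0.113.1", 0 },
    { "203.0.113.2", 0 },
    { "203.0.113.3", 0 },
};

/* squid said "Dump that one": penalise it for ttl seconds */
void dump_addr(const char *ip, time_t now, time_t ttl) {
    for (int i = 0; i < NADDR; i++)
        if (strcmp(table[i].ip, ip) == 0)
            table[i].bad_until = now + ttl;
}

/* return the next usable address, or NULL for "out of possible IPs" */
const char *next_best(time_t now) {
    for (int i = 0; i < NADDR; i++)
        if (table[i].bad_until <= now)
            return table[i].ip;
    return NULL;
}
```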

Is there currently any passing of flags between squid and dnsserver?

I am not sure how you would propagate the information between the
multiple dnsserver programs either...

Saving the flags to the cache server's cache system would also be a possible
problem - you might not be able to use your old cached docs with the
new server <grimace>

I am not sure how the cache keeps track of the contents of its
cache... Is it in some kind of linear log?

> Conclusion:
>
> I suspect that these problems are a historical part of the way
> that squid's ip cache system has worked since the early harvest
> days. The brokenness was not exposed back when most sites were

It may be broken, it may also be historical, but the underlying structure
could become a workaround for the problem.

> and will in-fact be a simpler implementation.
Now that is probably true ;)

I am afraid that I have not had time to have a look at the squid
source code... I have been too busy with work and hacking the linux
networking source. Once I do so, I may see that my idea is total
junk... Please don't take me too seriously if you don't agree!

Oskar Pearson
Received on Wed Aug 21 1996 - 02:45:34 MDT
