Ranting on ipcache bogosity

From: Chris Fedde <cfedde@dont-contact.us>
Date: Wed, 21 Aug 1996 01:57:14 -0600

The Problem:

    With positive_dns_ttl set to zero, squid exhibits a behavior that
    is confusing for multi-server sites with inconsistent failure modes.
    The behavior with positive_dns_ttl set to a nonzero value is much
    worse, so I will not even go into that (see the PS below).

My Exploration:

    In looking for an example of this problem I stumbled on this site.

    Consider the following four addresses:

        www.geocities.com. 299 A 43200 PTR www2.geocities.com.
        www.geocities.com. 299 A 43200 PTR www3.geocities.com.
        www.geocities.com. 299 A 43200 PTR www4.geocities.com.
        www.geocities.com. 299 A 43200 PTR www.geocities.com.

    History of a manual telnet to each IP on port 80 at Tue Aug 20
    23:51:21 MDT 1996. Responses confirmed with "GET /":

        Host www2 refuses the connection
        Host www3 hangs on connect (SYN packet times out on retries)
        Host www4 connects and responds
        Host www  connects and responds

    Connecting from a browser through the squid proxy, the behavior is erratic
    but predictable. Since positive_dns_ttl = 0 there is no IP caching in the
    server, so normal DNS round-robin rules are followed.
        On the first attempt DNS returns www2 and the site fails immediately.
        On the second attempt DNS returns www3 and the site times out after 94
        seconds (a black hole).

        On the third attempt DNS returns www4. The base HTML is fetched and
        squid starts GETing the img references. These each follow a timeline
        like the above: the first GET succeeds (www), the next fails with a
        broken image (www2), the third GET hangs (www3), the fourth succeeds,
        the fifth succeeds, and so on.
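    The pattern above can be reproduced with a minimal sketch. This is my
    own illustration, not squid code: it assumes the round-robin order and
    per-host reachability from the telnet checks, with one connection
    attempt per request and no memory of past failures.

```python
# Hypothetical model of the observed behavior: with positive_dns_ttl = 0,
# each request takes the next address in DNS round-robin order and makes
# exactly one connection attempt.
from itertools import cycle

# Assumed reachability, matching the manual telnet checks above.
REACHABLE = {
    "www2": False,  # refuses connection
    "www3": False,  # SYN times out (black hole)
    "www4": True,   # connects and responds
    "www":  True,   # connects and responds
}

def attempt_requests(order, n):
    """One attempt per request; no failure cache, no retries."""
    rr = cycle(order)
    return ["ok" if REACHABLE[next(rr)] else "fail" for _ in range(n)]

results = attempt_requests(["www2", "www3", "www4", "www"], 8)
print(results)  # -> ['fail', 'fail', 'ok', 'ok', 'fail', 'fail', 'ok', 'ok']
```

    Every four requests repeat the same fail/hang/succeed/succeed cycle,
    which is exactly the broken-image pattern seen in the browser.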

My Recommendation:

    While the current behavior is to be expected given the ipcache
    implementation, I suspect that a better behavior would be easy
    to implement. I'd like to see squid keep information about its
    successful connection attempts. The little discussion below details
    my thinking along these lines.

        1) Forget about caching DNS results in the server. Named
        does a fine job of it, and the async technique used by the
        dnsservers works well for accessing a local (or remote) IP
        address cache. Further, the current ip cache implementation
        is broken.

        2) Cache _failed_ IP addresses for each HTTP attempt. Set
        the TTL for these cache entries to something like the TTL
        of the DNS record itself.

        3) Check the failed ip cache before making a connection attempt.

        4) Try ALL THE ADDRESSES returned by DNS before returning a failure
        to the client.
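    Points 2 through 4 can be sketched in a few lines. This is only an
    illustration of the idea, not squid's actual data structures; the
    names, the stubbed connect function, and the example addresses are
    all made up.

```python
# Sketch of the recommendation: a negative cache of failed IPs (point 2),
# consulted before each connect (point 3), with every address tried
# before giving up (point 4).
import time

class FailedIpCache:
    def __init__(self):
        self._failed = {}  # ip -> expiry timestamp

    def mark_failed(self, ip, ttl):
        # Point 2: remember the failure for roughly the DNS record's TTL.
        self._failed[ip] = time.time() + ttl

    def is_failed(self, ip):
        expiry = self._failed.get(ip)
        if expiry is None:
            return False
        if time.time() >= expiry:  # entry expired; forget it
            del self._failed[ip]
            return False
        return True

def connect_any(addresses, dns_ttl, cache, try_connect):
    """Point 4: try ALL the addresses before returning failure."""
    for ip in addresses:
        if cache.is_failed(ip):        # point 3: skip known-bad addresses
            continue
        if try_connect(ip):
            return ip
        cache.mark_failed(ip, dns_ttl)
    return None

# Usage with a stub standing in for a real TCP connect:
cache = FailedIpCache()
up = {"10.0.0.2": False, "10.0.0.3": False, "10.0.0.4": True}
winner = connect_any(list(up), 299, cache, lambda ip: up[ip])
print(winner)  # -> 10.0.0.4; .2 and .3 are now negatively cached
```

    With such a scheme the geocities site above would cost one slow first
    request, after which the dead addresses are skipped until their TTL
    expires.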


    I suspect that these problems are a historical part of the way
    that squid's IP cache system has worked since the early Harvest
    days. The brokenness was not exposed back when most sites were
    single-hosted, and that may well still be the case for the vast
    volume of sites. Now, with very large-volume sites using
    multi-hosted servers, the problem that I discuss above has turned
    out to be a show stopper. It is my opinion that something like
    the above recommendation will provide a better overall product
    and will in fact be a simpler implementation.

Best Regards

PS.  Setting positive_dns_ttl to a nonzero value causes squid to
"fixate" on a single address for any given site.  To see this for
yourself, run tcpdump or another packet analyzer and trace accesses
to the site listed above.  Note that squid never uses any address
other than the first returned by the initial DNS query.  You may
draw your own conclusions from this.
Received on Wed Aug 21 1996 - 00:59:20 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:32:49 MST