RE: Squid DNS caching behaviour

From: Andrew Carroll <acarroll@dont-contact.us>
Date: Fri, 9 Jul 2004 11:18:11 +0800

Henrik,

The experiment I have implemented involves adding the original
destination address to the FwdState data structure, and populating it in
fwdStart() at the same point the client address is stored when
LINUX_TPROXY is defined.

The assumption is then made that commConnectStart() will always receive
a FwdState instance as its data parameter (commConnectStart is not
modified). In commConnectDnsHandle(), the FwdState instance is obtained
from cs->data. A linear search of the ipcache_addrs list (ia parameter)
for the original destination address is then performed and ia->in_addr
is set to that address if it's found. If not found, it resorts to
default behaviour of using the address at index ia->cur.
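
The lookup described above might be sketched roughly as follows. This is
only an illustration: addr_list is a simplified, hypothetical stand-in
for ipcache_addrs, and select_addr_index() is a made-up name, not a
function in Squid.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>

/* Simplified, hypothetical stand-in for Squid's ipcache_addrs. */
typedef struct {
    struct in_addr *in_addrs; /* addresses returned by the DNS lookup */
    int count;                /* number of entries in in_addrs */
    int cur;                  /* index the default behaviour would use */
} addr_list;

/* Return the index of orig_dst in the list, or ia->cur if it is absent. */
int
select_addr_index(const addr_list *ia, struct in_addr orig_dst)
{
    int i;
    assert(ia != NULL);
    for (i = 0; i < ia->count; i++) {
        if (ia->in_addrs[i].s_addr == orig_dst.s_addr)
            return i; /* original destination found among the answers */
    }
    return ia->cur; /* not found: fall back to the default choice */
}
```

The caller would then connect to the address at the returned index, so
the fallback path behaves exactly as the unmodified code does.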

This works well except for large sites with DNS load balancing, as
stated previously.

If I were to modify storeKeyPublic to MD5 hash the original destination
address as well as the URL and method, how many dependencies would this
affect? From what I've gathered, it looks like I'll need to add the
original destination address to the request_t structure as well.
Are there many functions that directly depend on storeKeyPublic and
storeKeyPublicByRequest?
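
To make the idea concrete, here is a rough sketch of a key that also
covers the original destination address. It is not the real
storeKeyPublic: store_key_with_dst() and its signature are hypothetical,
and FNV-1a is swapped in for MD5 purely to keep the example
self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* FNV-1a, used here ONLY as a stand-in for MD5 to keep the sketch short. */
uint64_t
fnv1a_update(uint64_t h, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;
    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL; /* 64-bit FNV prime */
    }
    return h;
}

/* Hypothetical key: hash the method, then the URL, then the original
 * destination address, so the same URL fetched via two different
 * destination IPs maps to two different cache keys. */
uint64_t
store_key_with_dst(unsigned char method, const char *url,
                   struct in_addr orig_dst)
{
    uint64_t h = 0xcbf29ce484222325ULL; /* 64-bit FNV offset basis */
    assert(url != NULL);
    h = fnv1a_update(h, &method, sizeof(method));
    h = fnv1a_update(h, url, strlen(url));
    h = fnv1a_update(h, &orig_dst.s_addr, sizeof(orig_dst.s_addr));
    return h;
}
```

Any caller that builds a public key would then need the original
destination address to hand, which is why it seems to belong in
request_t.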

Cheers,

Andrew.

-----Original Message-----
From: Henrik Nordstrom [mailto:hno@squid-cache.org]
Sent: Thursday, 8 July 2004 9:45 PM
To: Andrew Carroll
Cc: Henrik Nordstrom
Subject: RE: Squid DNS caching behaviour

Please keep discussion on the mailing list.

On Thu, 8 Jul 2004, Andrew Carroll wrote:

> I use the word "hint" here for the original destination server IP
> because I have been using it in a search of the list of IPs in an
> ipcache_entry for the URL supplied by the client: that IP being used for
> the outgoing connection if found in the list.

Unfortunately many load balanced DNS servers of the major web sites
return different IPs depending on time and numerous other factors, so
chances are high the original IP is not even found in the list of IPs
returned in the DNS lookup.

> If we can assume that we still use DNS lookups, the ipcache_entry's list
> of IPs doesn't get flushed but grows (if URL IPs are rotating due to DNS
> load balancing) when the TTL of the last update expires, then over time
> the IP cache will have all the IPs to which the URL is mapped, i.e. the
> TTL problem reduces over time and still avoids cache pollution (more or
> less breaking attempts to try it).

Could work. You still need to extend the same area of Squid as pointed
out before: commConnectStart needs to be given the original intended IP,
and be able to select this among the available answers.

What do you propose doing in the case the DNS lookup performed by
Squid does not return the same IP? Use whatever Squid got, or the
original IP requested by the client?

> Is this latter approach feasible in your opinion? If so, can you point
> me to the appropriate functions for this solution's implementation?

It might, but I would prefer first having the option to just use the IP
requested by the client, as this is still needed for full functionality,
then extend this to be cache friendly by performing DNS lookups and, if
the IP is found to match, caching on the site name.
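
A minimal sketch of that two-step policy, under stated assumptions:
forward_decision and decide_forward() are hypothetical names, and the
choice to simply not cache under the site name when the DNS answers do
not contain the client's IP is an assumption, since that case is left
open above.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>

/* Hypothetical result of the policy decision. */
typedef struct {
    struct in_addr connect_to; /* always the IP the client asked for */
    int cache_by_name;         /* 1 if DNS confirms that IP for the host */
} forward_decision;

/* Always connect to the client's original destination; only allow
 * caching on the site name when Squid's own DNS answers include it. */
forward_decision
decide_forward(const struct in_addr *answers, int count,
               struct in_addr orig_dst)
{
    forward_decision d;
    int i;
    assert(count == 0 || answers != NULL);
    d.connect_to = orig_dst;
    d.cache_by_name = 0; /* assumed default when no answer matches */
    for (i = 0; i < count; i++) {
        if (answers[i].s_addr == orig_dst.s_addr) {
            d.cache_by_name = 1;
            break;
        }
    }
    return d;
}
```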

Regards
Henrik
Received on Thu Jul 08 2004 - 21:17:12 MDT
