Hosts with multiple IP addresses...

From: Mike Pelletier <mikep@dont-contact.us>
Date: Thu, 15 May 1997 16:43:49 -0400 (EDT)

As usual, Microsoft is a highly effective headache source. The
www.microsoft.com server has about 14 IP addresses, and when a particular
address is overloaded, it starts refusing connections, which unfortunately
is permitted by the RFC. The behaviour that I'd expect of the proxy given
this situation would be to try the next IP address in the list, and return
an error to the user only once all the IP addresses prove to be unusable.
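
A minimal sketch of that fallback behaviour, not taken from the Squid source. The names connect_any, try_connect_fn, and refuse_below are hypothetical, and the callback stands in for a real connection attempt so the loop can be read on its own:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns 0 if a connection to addr succeeds,
 * non-zero otherwise. Stands in for a real connect() attempt. */
typedef int (*try_connect_fn)(unsigned long addr, void *ctx);

/* Try each address in turn, starting at index `start` and wrapping around.
 * Returns the index of the first address that accepts the connection,
 * or -1 only after every address in the list has been tried and failed. */
int connect_any(const unsigned long *addrs, size_t count, size_t start,
                try_connect_fn try_connect, void *ctx)
{
    assert(addrs != NULL && try_connect != NULL);
    if (count == 0)
        return -1;
    for (size_t n = 0; n < count; n++) {
        size_t i = (start + n) % count;
        if (try_connect(addrs[i], ctx) == 0)
            return (int) i;
    }
    return -1;
}

/* Example stand-in for a connection attempt: refuses every address below
 * the threshold pointed to by ctx and accepts the rest. */
int refuse_below(unsigned long addr, void *ctx)
{
    unsigned long threshold = *(const unsigned long *) ctx;
    return addr < threshold ? -1 : 0;
}
```

The caller would report an error to the user only when connect_any() returns -1.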

I started to delve into this problem a little bit, and I'm beginning to
see why nobody's wanted to tackle it. The IP cache throws another wrinkle
into the mix, and the Squid code is quite elegant and intricate. I was
hoping that someone who's a little more familiar with the Squid code
layout could determine whether the approach I'm considering is the best
(as in least-work, best-performance, least-intrusive) one for fixing this
issue.

I'm looking at "commConnectHandle()". It looks like, if the connection
succeeds, the "cur" IP address indicator in the cache entry structure is
incremented, so that the next person who requests the host from the
ipcache gets the next address in the list. If the connection fails, that
address is considered "bad" and removed from the cache list.
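
To make that behaviour concrete, here is a simplified model of it. This is an illustration of the description above, not Squid's actual code: struct addr_list, addr_list_cycle, and addr_list_remove are hypothetical names, and the real ipcache entry carries more state than this.

```c
#include <assert.h>
#include <string.h>

#define MAX_ADDRS 16

/* Simplified stand-in for the address list held in a cache entry. */
struct addr_list {
    unsigned long in_addrs[MAX_ADDRS];
    int count;  /* number of valid addresses */
    int cur;    /* index handed to the next requester */
};

/* After a successful connection: rotate to the next address. */
void addr_list_cycle(struct addr_list *l)
{
    assert(l != NULL);
    if (l->count > 0)
        l->cur = (l->cur + 1) % l->count;
}

/* After a failed connection: drop the address from the list.
 * Returns 1 if the address was found and removed, 0 otherwise. */
int addr_list_remove(struct addr_list *l, unsigned long addr)
{
    int i;
    assert(l != NULL);
    for (i = 0; i < l->count; i++)
        if (l->in_addrs[i] == addr)
            break;
    if (i == l->count)
        return 0;
    /* Close the gap left by the removed address. */
    memmove(&l->in_addrs[i], &l->in_addrs[i + 1],
            (size_t) (l->count - i - 1) * sizeof l->in_addrs[0]);
    l->count--;
    if (i < l->cur)
        l->cur--;           /* keep cur on the same address */
    if (l->cur >= l->count)
        l->cur = 0;         /* wrap, or reset when the list is empty */
    return 1;
}
```

In this model, repeated failures shrink the list until count reaches zero, which is the depletion case asked about further down.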

However, in the www.microsoft.com situation, a "bad" address could be
"good" ten seconds later, after the load lessens and the server starts
accepting connections again, and the COMM_ERROR state it sets could be
avoided simply by trying the next address in the list.

Now, what I'm wondering is whether setting up a loop to try all the
addresses in the list should be done here in commConnectHandle()? Are
there any ramifications of doing this that I'm not aware of? What if the
list of addresses has been depleted by a series of "BadAddress" calls?

In most cases, a bad address really should be expunged from the cache, but
perhaps in the case of multi-IP hosts, the purge should not be immediate,
but after two or three tries. It would appear that another member of the
ipcache_entry struct would need to be added to keep track of this. Yuck.
 
Or, perhaps when dealing with multiple-IP hosts, ipcacheRemoveBadAddress()
shouldn't be called until all the addresses fail, and then the entire
cache entry should be ipcache_release()d. This would sort of defeat the
purpose of ipcacheRemoveBadAddress(), though.

Maybe the address should be kept in the list only if the error for it is
"connection refused," and removed if it's some other error...
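
As a sketch of that last idea, assuming the connection error is available as a POSIX errno value. The enum and function name are hypothetical:

```c
#include <assert.h>
#include <errno.h>

enum addr_action { ADDR_KEEP, ADDR_REMOVE };

/* Decide what to do with an address after a failed connection attempt.
 * err is the errno value from the failed attempt. */
enum addr_action classify_connect_error(int err)
{
    assert(err != 0);   /* only call this for a real failure */
    switch (err) {
    case ECONNREFUSED:
        /* The host answered but refused: it may simply be overloaded,
         * so keep the address for a later retry. */
        return ADDR_KEEP;
    default:
        /* Anything else (timeout, unreachable, ...) is treated as bad. */
        return ADDR_REMOVE;
    }
}
```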

So, what do you think?

Thanks for your input on this. If someone out there is already working on
this, please let me know and save me some work.

        -Mike Pelletier.
Received on Thu May 15 1997 - 13:46:04 MDT
