[squid-users] Cache Peers and Load Balancing

From: Dean Weimer <dweimer_at_ORSCHELN.com>
Date: Mon, 29 Sep 2008 15:41:33 -0500

I am looking at implementing a new proxy configuration using multiple peers and load balancing. I have been looking through the past archives, but I haven't found answers to some of my questions.

Here is what I am trying to accomplish:
Three parent proxy servers, each connected to a DSL line, which I will call PARENT1 (1.1.1.1), PARENT2 (2.2.2.2) and PARENT3 (3.3.3.3).
One child proxy that the users will connect to, configured with null storage, which I will call CHILD (4.4.4.4).
I want to keep it simple and not include other load balancers. That probably won't give the best performance, but it is the easiest to deploy and maintain. This of course leaves the cache_peer options as my method for load balancing.

Obviously the simplest method would be:
cache_peer 1.1.1.1 parent 3128 3130 round-robin
cache_peer 2.2.2.2 parent 3128 3130 round-robin
cache_peer 3.3.3.3 parent 3128 3130 round-robin
cache_peer_access 1.1.1.1 allow all
cache_peer_access 2.2.2.2 allow all
cache_peer_access 3.3.3.3 allow all
The problem with this is that some websites require the source IP as part of the session state. To handle those, it would be necessary to add the sourcehash option:
cache_peer 1.1.1.1 parent 3128 3130 round-robin sourcehash
cache_peer 2.2.2.2 parent 3128 3130 round-robin sourcehash
cache_peer 3.3.3.3 parent 3128 3130 round-robin sourcehash
cache_peer_access 1.1.1.1 allow all
cache_peer_access 2.2.2.2 allow all
cache_peer_access 3.3.3.3 allow all
The question here is what problems the source hash brings to the table. For starters, I know this is persistent and doesn't change once established unless a condition changes. But what happens when PARENT1 goes down? I would expect the hashes associated with it to be load balanced using round-robin between PARENT2 and PARENT3. What happens when PARENT1 comes back into service? Will the original hashes associated with it resume using it, or will they stay on the new parent for the foreseeable future?
This brings another possible option to the table. The websites that need a persistent source IP are limited; currently I know of only 8 in use by our users that need this. So there is the possibility of using the source hash only with sites known to need it. This of course adds maintenance overhead, but it can be handled by adding a secondary address to each parent and using an access list:
acl HASHNEEDED dstdomain "/usr/local/squid/etc/sourcehash.list"
cache_peer 1.1.1.1 parent 3128 3130 round-robin
cache_peer 11.11.11.11 parent 3128 3130 round-robin sourcehash
cache_peer 2.2.2.2 parent 3128 3130 round-robin
cache_peer 22.22.22.22 parent 3128 3130 round-robin sourcehash
cache_peer 3.3.3.3 parent 3128 3130 round-robin
cache_peer 33.33.33.33 parent 3128 3130 round-robin sourcehash
cache_peer_access 1.1.1.1 allow !HASHNEEDED
cache_peer_access 11.11.11.11 allow HASHNEEDED
cache_peer_access 2.2.2.2 allow !HASHNEEDED
cache_peer_access 22.22.22.22 allow HASHNEEDED
cache_peer_access 3.3.3.3 allow !HASHNEEDED
cache_peer_access 33.33.33.33 allow HASHNEEDED
This should give me the best of both previous options, at the cost of increased maintenance and user calls whenever users come across a new site that doesn't work without the source hash.
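
For illustration, the file referenced by the HASHNEEDED acl could look something like the sketch below. The domain names are placeholders, not the actual sites. A dstdomain file takes one domain per line, and a leading dot matches the domain and all of its subdomains.

```
# /usr/local/squid/etc/sourcehash.list
# Placeholder entries only; replace with the sites that need a stable source IP.
.example.com
.example.net
.example.org
```

Each new site that breaks without the source hash would then be a one-line addition to this file, followed by a reconfigure of CHILD.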

The other question is whether or not I should configure the 3 parent servers as siblings of one another.
Would doing so break the source hash?
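
To make the sibling question concrete, this is roughly what I have in mind for PARENT1's own squid.conf (PARENT2 and PARENT3 would each list the other two in the same way). This is only a sketch; the proxy-only option is my assumption, added so that a parent would not store a second copy of objects fetched from a sibling.

```
# On PARENT1 (1.1.1.1): treat the other two parents as siblings.
# proxy-only is an assumed option, not something I have tested here.
cache_peer 2.2.2.2 sibling 3128 3130 proxy-only
cache_peer 3.3.3.3 sibling 3128 3130 proxy-only
```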

Please let me know if any of you have other suggestions that are completely different, keeping in mind that I would like to stick entirely within Squid and not use other technologies. Feel free to tell me I am completely wrong about how Squid works with the above configurations; I am pretty much a complete newbie when it comes to the cache_peer options. There seems to be a lack of information on this on the web page and wiki (please forgive me if it's already there and I just didn't search for the right term to find it), so I will gladly do my best to put everything I learn in this project on the Squid wiki to help people in the future.

Thanks,
     Dean Weimer
     Network Administrator
     Orscheln Management Co
Received on Mon Sep 29 2008 - 20:41:49 MDT
