[squid-users] cache hierarchy question

From: Rob Williams <rob.williams_at_gmail.com>
Date: Sun, 10 Aug 2008 20:25:46 -0700

I've been experimenting more with ICP and CARP. I've noticed that if I
have two squid machines configured like so:

squid1.test.rob:

icp_access allow all
icp_port 3130
http_port 80 accel vhost vport
cache_peer squid2.test.rob parent 80 3130 proxy-only
cache_peer freebsd1.test.rob parent 80 0 originserver

squid2.test.rob:

icp_access allow all
icp_port 3130
http_port 80 accel vhost vport
cache_peer freebsd1.test.rob parent 80 0 originserver

then, when squid2 has index3.html in cache and squid1 does not, a
request to squid1 via a load balancer (lynx --head --dump
http://freebsd1:81/index3.html) results in these headers:

X-Cache: HIT from squid2.test.rob
X-Cache-Lookup: HIT from squid2.test.rob:80
X-Cache: MISS from squid1.test.rob
X-Cache-Lookup: MISS from squid1.test.rob:80

That is what I would expect from a properly working array using ICP.
My question: if I want to distribute the load across an array of
reverse proxies using a cache peering protocol (ICP, CARP, HTCP,
Cache Digests, whatever), do all requests to my array / mesh need to
come through a 'master' squid acting as a router/load balancer? Or do
I put a load balancer in front of the array and distribute HTTP
requests randomly to the squids in the array?

The problem I'm having is that if I put a load balancer in front of
my squid array, then each squid must be aware of all the other peers
in order for the array to act as one large cache. With ICP and CARP
this results in forwarding loops and other problems that I've been
unable to get around so far. But if I don't put a load balancer in
front of the array, how do I efficiently distribute the load to the
squids? If all requests come into one 'master' squid, wouldn't that
squid simply cache everything from the origin server itself?

It seems that for a cache array to work properly when every squid in
the array serves content from a single origin server, a load balancer
must be used to distribute the requests (and therefore the cached
objects) across the machines in the array. Is my thinking correct?
And if so, how do I configure the cache_peer lines on the squids? I
would assume CARP is the best protocol to use in this instance, as it
assumes a linear array instead of a hierarchy, but at this point I'd
be happy to see ANY cache protocol example that would distribute the
cached objects in a squid cluster.
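
To make "distribute the cached objects" concrete, here is a minimal
Python sketch of the general idea behind CARP-style distribution: each
URL is hashed together with each array member's name, and the member
with the highest score owns that URL. This is an illustration only.
The function name pick_member and the use of SHA-256 are assumptions
made for the sketch; Squid's actual CARP hash and peer weighting are
different, and this is not a Squid configuration.

```python
import hashlib


def pick_member(url, members):
    """Return the array member that owns `url`.

    Hypothetical helper: scores every member against the URL and picks
    the highest score, so all front ends agree on one owner per URL
    without talking to each other.
    """
    def score(member):
        # Combine member name and URL, then take 64 bits of the digest.
        digest = hashlib.sha256(f"{member}|{url}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(members, key=score)


if __name__ == "__main__":
    # Host names taken from the configuration above.
    members = ["squid1.test.rob", "squid2.test.rob"]
    for i in range(1, 7):
        url = f"http://freebsd1:81/index{i}.html"
        print(url, "->", pick_member(url, members))
```

Because the owner depends only on the URL and the member list, any
node that receives a request can compute which member should hold the
object, and each object ends up cached on one member rather than on
all of them.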

-Rob
Received on Mon Aug 11 2008 - 03:25:49 MDT
