[squid-users] CARP cache_peer hashing based on order found in config?

From: john allspaw <jallspaw_at_yahoo.com>
Date: Fri, 19 Dec 2008 11:24:04 -0800 (PST)

We run squid accelerator setups and are trying to get away from using layer-7 load balancers for URL hashing, so we thought we'd use CARP.
The basics: a pool of 18 servers, balanced round-robin. Each runs a CARP squid instance on port 80 and a caching squid instance on port 81.

Our carp cache_peer lines look like:
cache_peer 69.147.123.121 parent 81 7 carp no-query name=photocache201 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.122 parent 81 7 carp no-query name=photocache202 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.123 parent 81 7 carp no-query name=photocache203 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.124 parent 81 7 carp no-query name=photocache204 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.125 parent 81 7 carp no-query name=photocache205 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.126 parent 81 7 carp no-query name=photocache206 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.32 parent 81 7 carp no-query name=photocache207 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.33 parent 81 7 carp no-query name=photocache208 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.34 parent 81 7 carp no-query name=photocache209 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.35 parent 81 7 carp no-query name=photocache210 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.36 parent 81 7 carp no-query name=photocache211 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.37 parent 81 7 carp no-query name=photocache212 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.38 parent 81 7 carp no-query name=photocache213 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.39 parent 81 7 carp no-query name=photocache214 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.40 parent 81 7 carp no-query name=photocache215 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.41 parent 81 7 carp no-query name=photocache216 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.42 parent 81 7 carp no-query name=photocache217 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.43 parent 81 7 carp no-query name=photocache218 monitorurl=/index.htm monitorinterval=60

The CARP instances get an equal share of traffic from the load balancer, but we don't see an equal balance across the caching
squid instances. Instead, we see a distribution that closely follows the sequence of hosts listed as cache_peers. For
example, photocache201 gets the most requests, the share falls off roughly down the line, and the last hosts
(photocache217, photocache218) get among the fewest.

Is this expected? How can we get a real balance?

The cache manager confirms what we're seeing:

$ squidclient -p 80 cache_object://127.0.0.1/carp
HTTP/1.0 200 OK
Server: squid/2.7.STABLE5
Date: Fri, 19 Dec 2008 18:52:21 GMT
Content-Type: text/plain
Expires: Fri, 19 Dec 2008 18:52:21 GMT
X-Cache: MISS from photocache201.flickr
X-Cache-Lookup: MISS from photocache201.flickr
Via: 1.0 photocache201.flickr (squid/2.7.STABLE5)
Connection: close

                Hostname       Hash Multiplier     Factor     Actual
             apache_peer          0   0.000000   0.000000   0.007863
           photocache201   b7d71c0d   1.000000   0.055556   0.162550
           photocache202   e4836670   1.000000   0.055556   0.133271
           photocache203   114fb0d4   1.000000   0.055556   0.072396
           photocache204   3e1bfb37   1.000000   0.055556   0.076387
           photocache205   6ac8459a   1.000000   0.055556   0.064102
           photocache206   97948ffd   1.000000   0.055556   0.045340
           photocache207   c440da60   1.000000   0.055556   0.058199
           photocache208   f10d24c3   1.000000   0.055556   0.039018
           photocache209   1dd96f27   1.000000   0.055556   0.050036
           photocache210   b7d0820d   1.000000   0.055556   0.043762
           photocache211   e49ccc70   1.000000   0.055556   0.038504
           photocache212   114916d4   1.000000   0.055556   0.032422
           photocache213   3e156137   1.000000   0.055556   0.031728
           photocache214   6ac1ab9a   1.000000   0.055556   0.026673
           photocache215   978df5fd   1.000000   0.055556   0.028562
           photocache216   c45a4060   1.000000   0.055556   0.030378
           photocache217   f1068ac3   1.000000   0.055556   0.024259
           photocache218   1dd2d527   1.000000   0.055556   0.034549

(The apache_peer entry is a local peer on each box for origin-server health checks; it can be ignored.)
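
To put numbers on the imbalance, here is a small Python sketch that compares the "Actual" column against the even share of 1/18 that the "Factor" column reports. The values are copied from the output above; the names ACTUAL, EXPECTED, skew and above_expected are made up for this illustration.

```python
# Actual request share per caching peer, copied from the cache manager
# output above (apache_peer is left out, as it can be ignored).
ACTUAL = {
    "photocache201": 0.162550, "photocache202": 0.133271,
    "photocache203": 0.072396, "photocache204": 0.076387,
    "photocache205": 0.064102, "photocache206": 0.045340,
    "photocache207": 0.058199, "photocache208": 0.039018,
    "photocache209": 0.050036, "photocache210": 0.043762,
    "photocache211": 0.038504, "photocache212": 0.032422,
    "photocache213": 0.031728, "photocache214": 0.026673,
    "photocache215": 0.028562, "photocache216": 0.030378,
    "photocache217": 0.024259, "photocache218": 0.034549,
}

# With 18 equally weighted peers, each should see 1/18 of the requests,
# which matches the 0.055556 shown in the Factor column.
EXPECTED = 1 / len(ACTUAL)


def skew(actual):
    """Ratio between the busiest and the quietest peer's share."""
    return max(actual.values()) / min(actual.values())


def above_expected(actual, expected=EXPECTED):
    """Names of peers that receive more than the even share."""
    return [name for name, share in actual.items() if share > expected]


if __name__ == "__main__":
    print(f"expected share per peer: {EXPECTED:.6f}")
    print(f"busiest / quietest ratio: {skew(ACTUAL):.2f}")
    print("peers above the even share:", ", ".join(above_expected(ACTUAL)))
```

Only the peers near the top of the list come out above the even share, and the busiest peer carries several times the load of the quietest one.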

To confirm this, we even reversed the order of the cache_peer lines; that resulted in photocache218 getting the most requests
and photocache201 the fewest. :)
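
For comparison, in a highest-score hashing scheme of the kind CARP is built on, the position of a peer in the list should not matter. The Python sketch below is only an illustration of that idea and is not Squid's implementation: it assumes MD5 as the mixing function, made-up URLs of the form /photos/N.jpg, and hypothetical helper names (score, pick_peer, distribution).

```python
import hashlib
from collections import Counter

# Peer names modelled on the cache_peer lines above.
PEERS = [f"photocache{n}" for n in range(201, 219)]


def score(peer, url):
    """Mix peer name and URL into a 64-bit score.

    MD5 is an assumption made for this sketch; Squid uses its own
    hash function for CARP.
    """
    digest = hashlib.md5(f"{peer}|{url}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_peer(url, peers=PEERS):
    """Return the peer with the highest score for this URL."""
    return max(peers, key=lambda peer: score(peer, url))


def distribution(n_urls=18000, peers=PEERS):
    """Count how many synthetic URLs land on each peer."""
    return Counter(pick_peer(f"/photos/{i}.jpg", peers) for i in range(n_urls))


if __name__ == "__main__":
    counts = distribution()
    for peer in PEERS:
        print(f"{peer:>24} {counts[peer] / 18000:.6f}")
```

Because each peer's score depends only on its name and the URL, reversing PEERS selects the same peer for every URL, and with a well-mixed hash each peer ends up close to the even 1/18 share. That is the behaviour we expected from the carp option.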

What gives?
--john

      
Received on Fri Dec 19 2008 - 19:24:11 MST
