Re: [squid-users] CARP Failover behavior - multiple parents chosen for URL

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 12 May 2009 15:33:11 +1200 (NZST)

> Moving this to squid-dev due to increasingly propellerhead-like
> content... :)
>
> Looking over the code and some debugging output, it's pretty clear
> what's happening here.
>
> The carpSelectParent() function hashes each URL+parent combination
> and ranks the results. To determine whether or not the
> highest-ranked parent is the parent that should, in fact, be
> returned, it uses peerHTTPOkay() as its test.
>
> The problem here is that peerHTTPOkay only returns 0 if the peer in
> question has been marked DEAD; carpSelectParent has no way of knowing
> if the peer is down unless squid has "officially" marked it DEAD.
>
> So, if the highest-ranked peer is a peer that is refusing connections
> but isn't marked DEAD yet, then peer_select tries to use it, and when
> it fails, falls back to ANY_PARENT - this actually shows up in the
> access.log, which I didn't realize when I initially sent this in. Once
> we've tried to hit the parent 10 times, we officially mark it DEAD,
> and then carpSelectParent() does the Right Thing.
>
> So, we have a couple of options here as far as how to resolve this:
>
> 1. Adjust PEER_TCP_MAGIC_COUNT from 10 to 1, so that a parent is
> marked DEAD after only one failure. This may be overly sensitive
> however. Alternatively, carpSelectParent() can check peer->tcp_up and
> disqualify the peer if it's not equal to PEER_TCP_MAGIC_COUNT; this
> will have a similar effect without going through the overhead of
> actually marking the peer DEAD and then "reviving" it.

Patches went in recently to make that setting a squid.conf option.

Squid-3:
  http://www.squid-cache.org/Versions/v3/HEAD/changesets/b9678.patch

Squid-2:
  http://www.squid-cache.org/Versions/v2/HEAD/changesets/12208.patch
  http://www.squid-cache.org/Versions/v2/HEAD/changesets/12209.patch
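
For the other half of option 1 (checking tcp_up directly), here is a rough
sketch of the idea. It is not Squid code: struct sketch_peer and the sketch_*
functions are made-up stand-ins, and it assumes tcp_up counts down from
PEER_TCP_MAGIC_COUNT on each failed connect and is reset on a successful one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PEER_TCP_MAGIC_COUNT 10 /* value quoted in this thread */

/* Hypothetical, stripped-down stand-in for Squid's peer structure. */
struct sketch_peer {
    const char *name;
    int tcp_up; /* assumed: failures left before the peer counts as DEAD */
};

/* Assumed model of a failed connect: count down, never below zero. */
void sketch_connect_failed(struct sketch_peer *p)
{
    assert(p != NULL);
    if (p->tcp_up > 0)
        p->tcp_up--;
}

/* Assumed model of a successful connect: restore the full count. */
void sketch_connect_succeeded(struct sketch_peer *p)
{
    assert(p != NULL);
    p->tcp_up = PEER_TCP_MAGIC_COUNT;
}

/* Behaviour described in the thread: only a DEAD peer is rejected. */
bool sketch_peer_not_dead(const struct sketch_peer *p)
{
    assert(p != NULL);
    return p->tcp_up > 0;
}

/* Stricter test from option 1: any failure since the last success
 * disqualifies the peer for CARP selection. */
bool sketch_peer_fully_up(const struct sketch_peer *p)
{
    assert(p != NULL);
    return p->tcp_up == PEER_TCP_MAGIC_COUNT;
}
```

With the stricter test, a single refused connection is enough for the CARP
code to skip the peer, while the peer itself is not marked DEAD and does not
need to be revived.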

>
> 2. Somehow have carpSelectParent() return the entire sorted list of
> peers, so that if the top choice is found to be down, then
> peer_select() already knows where to go next...
>
> 3. Add some special-case code (I'm guessing this would be either in
> forward.c or peer_select.c) so that if a connection to a peer selected
> by carpSelectParent() fails, then increment a counter (which would be
> unique to that request) and call carpSelectParent() again. This
> counter can be used in carpSelectParent() to ignore the X highest-ranked
> entries. Once this peer gets officially declared DEAD, this becomes
> moot.
>
> Personally, I'm partial to #3, but other approaches are welcome :)
>

I'm partial to #2. But not for any particular reason.
Patches for either #2 or #3 are welcome.
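
To make the two approaches concrete, a minimal standalone sketch follows. It
is only an illustration: struct ranked_peer, toy_hash(), rank_peers() and
pick_peer() are hypothetical names, and the hash is a toy FNV-1a mix, not the
real CARP hash. rank_peers() stands in for #2 (keep the whole sorted list) and
the skip argument of pick_peer() stands in for the per-request counter in #3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-request view of a parent; not a Squid structure. */
struct ranked_peer {
    const char *name;
    int usable;     /* nonzero if the peer passes the liveness test */
    uint32_t score; /* filled in by rank_peers() */
};

/* Toy string hash (FNV-1a). NOT the CARP hash function. */
uint32_t toy_hash(const char *s)
{
    uint32_t h = 2166136261u;
    assert(s != NULL);
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* qsort comparator: highest score first, name as a tie-breaker. */
static int by_score_desc(const void *a, const void *b)
{
    const struct ranked_peer *pa = a;
    const struct ranked_peer *pb = b;
    if (pa->score != pb->score)
        return pa->score < pb->score ? 1 : -1;
    return strcmp(pa->name, pb->name);
}

/* Approach #2: score every peer for this URL and sort best-first,
 * so the caller already knows where to go next. */
void rank_peers(struct ranked_peer *peers, size_t n, const char *url)
{
    uint32_t url_hash = toy_hash(url);
    assert(peers != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        peers[i].score = (url_hash ^ toy_hash(peers[i].name)) * 2654435761u;
    if (n > 1)
        qsort(peers, n, sizeof(*peers), by_score_desc);
}

/* Approach #3: ignore the `skip` highest-ranked entries, then return
 * the best usable peer, or NULL if none is left. */
const struct ranked_peer *pick_peer(const struct ranked_peer *ranked,
                                    size_t n, size_t skip)
{
    for (size_t i = skip; i < n; i++)
        if (ranked[i].usable)
            return &ranked[i];
    return NULL;
}
```

A caller would rank once per request, try pick_peer(ranked, n, 0), and on a
connect failure retry with skip incremented, which gives the same fallback
order either way.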

Amos
Received on Tue May 12 2009 - 03:33:25 MDT
