Re: Round-Robin becomes unbalanced when a peer dies and comes back

From: Mark Nottingham <mnot_at_yahoo-inc.com>
Date: Fri, 6 Jun 2008 11:53:53 +1000

Oh, I chopped off the last part; if people agree with that plan, I'll
produce a patch.

On 06/06/2008, at 11:46 AM, Mark Nottingham wrote:

> <http://www.squid-cache.org/bugs/show_bug.cgi?id=2376>
>
> When a peer goes down and then comes back, its round-robin counters
> aren't reset, causing it to get a disproportionate amount of traffic
> until it "catches up" with the rest of the peers in the round-robin
> pool.
>
> If it was down for load-related issues, this has the effect of making
> it more likely that it will go down again, because it's temporarily
> handling the load of the entire pool.
>
> Normally, this isn't a concern, because the number of requests that
> it can get out-of-step is relatively small (bounded by how many
> requests it can be given before it is considered down -- is this 10
> in all cases, or are there corner cases?). But in an accelerator case
> where the origin has a process-based request-handling model, or
> back-end processes are CPU-intensive, it is a concern.
>
> It looks like the way to fix this is to call peerClearRR from
> neighborAlive in neighbors.c. However, that just clears one peer - it's
> necessary to clear *all* peers simultaneously.
>
> Therefore, I suggest:
>
> 1) calling peerClearRR from neighborAlive
>
> 2) changing the semantics of peerClearRR to clear all neighbours at
> once, and changing how it's called accordingly.
>
>
> --
> Mark Nottingham mnot_at_yahoo-inc.com
>
>
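
To make the imbalance and the two proposed steps concrete, here is a
rough, self-contained C sketch. It is not Squid code: the toy_peer
struct and the pick_round_robin, clear_all_rr, mark_alive and
revived_share helpers are all made up for illustration, and it assumes
that round-robin selection simply picks the live peer with the lowest
request counter (weights and everything else are ignored).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cache peer; hypothetical, NOT Squid's real struct. */
struct toy_peer {
    int alive;     /* 1 if the peer is currently considered up */
    int rr_count;  /* requests handed to this peer so far */
};

/* Assumed selection rule: pick the live peer with the lowest counter
 * and charge it one request. Returns its index, or -1 if none is alive. */
static int pick_round_robin(struct toy_peer *peers, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!peers[i].alive)
            continue;
        if (best < 0 || peers[i].rr_count < peers[best].rr_count)
            best = (int)i;
    }
    if (best >= 0)
        peers[best].rr_count++;
    return best;
}

/* Step 2 of the proposal: clear the counters of all peers at once. */
static void clear_all_rr(struct toy_peer *peers, size_t n)
{
    for (size_t i = 0; i < n; i++)
        peers[i].rr_count = 0;
}

/* Step 1 of the proposal: when a peer comes back, optionally reset the
 * whole pool. reset_all == 0 models leaving the counters untouched. */
static void mark_alive(struct toy_peer *peers, size_t n, size_t idx,
                       int reset_all)
{
    assert(idx < n);
    peers[idx].alive = 1;
    if (reset_all)
        clear_all_rr(peers, n);
}

/* Three peers; peer 0 is down while down_reqs requests are served, then
 * comes back. Returns how many of the next `window` requests it gets. */
static int revived_share(int down_reqs, int window, int reset_all)
{
    struct toy_peer peers[3] = { {0, 0}, {1, 0}, {1, 0} };
    int hits = 0;

    for (int i = 0; i < down_reqs; i++)
        pick_round_robin(peers, 3);

    mark_alive(peers, 3, 0, reset_all);

    for (int i = 0; i < window; i++)
        if (pick_round_robin(peers, 3) == 0)
            hits++;
    return hits;
}
```

In this toy model, revived_share(60, 30, 0) returns 30: the revived
peer takes every one of the next 30 requests while its counter catches
up. With the pool-wide reset, revived_share(60, 30, 1) returns 10, an
even one-third share.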

--
Mark Nottingham       mnot_at_yahoo-inc.com
Received on Fri Jun 06 2008 - 01:55:04 MDT