Re: [PATCH] icap_oldest_service_failure option

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 20 Feb 2010 14:19:53 +1300

Alex Rousskov wrote:
> Added icap_oldest_service_failure option to forget old ICAP errors.
>
> A busy or remote ICAP server may produce a steady but shallow stream of
> errors. Any ICAP server may become nearly unusable in a short period of
> time, producing a burst of errors. To avoid disabling a generally usable
> service, it is important to distinguish these two cases. Just counting
> the number of errors and suspending the service after
> icap_service_failure_limit is reached often either suspends the service
> in both cases or never suspends it at all, depending on the option
> value.
>
> One way to distinguish a large burst of errors from a steady but shallow
> error stream is to forget about old errors. The added
> icap_oldest_service_failure option instructs Squid to ignore errors that
> are "too old" to be counted as a part of a burst.
>
> Another way to look at this feature is to say that the combination of
> the old icap_service_failure_limit and the new
> icap_oldest_service_failure limits the ICAP error _rate_. For example,
> # suspend service usage after 10 failures in 5 seconds:
> icap_service_failure_limit 10
> icap_oldest_service_failure 5 seconds
>
> Squid does not remember every transaction error that occurred within the
> allowed "oldest error" time period. That would result in a precise
> but too expensive implementation, especially during error bursts on a
> busy server. Instead, Squid divides the period into ten slots, counts the
> number of errors that occurred in each slot, and forgets the oldest
> slot(s) as needed. Thus, the algorithm has about 90% precision as far as
> timing of the failures is concerned. That 90% precision ought to be good
> enough for any deployment.
>
> The patch is for Squid v3.1+ but we will port to trunk if approved.
>
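
The slot-based counting described in the quoted text can be sketched
roughly as follows. This is only an illustration of the idea, not the
code from the patch: the class name, method names, the use of a
caller-supplied clock value, and the "count >= limit" suspension test
are all assumptions made for the example.

```python
class SlottedFailureCounter:
    """Hypothetical sketch: count failures in a sliding window of N slots."""

    def __init__(self, limit, window_seconds, slots=10):
        self.limit = limit                      # like icap_service_failure_limit
        self.slots = slots                      # the description uses ten slots
        self.slot_len = window_seconds / slots  # like icap_oldest_service_failure / 10
        self.counts = [0] * slots               # per-slot failure counts (ring buffer)
        self.newest = None                      # absolute index of the newest slot

    def _advance(self, now):
        """Move the window forward to `now`, forgetting slots that are too old."""
        idx = int(now // self.slot_len)
        if self.newest is None:
            self.newest = idx
            return
        # Clear every slot that has been skipped or is about to be reused;
        # clearing more than `slots` entries would be redundant.
        steps = min(idx - self.newest, self.slots)
        for i in range(1, steps + 1):
            self.counts[(self.newest + i) % self.slots] = 0
        if idx > self.newest:
            self.newest = idx

    def count(self, now):
        """Number of failures still remembered at time `now` (seconds)."""
        self._advance(now)
        return sum(self.counts)

    def record_failure(self, now):
        """Record one failure; return True if the service should be suspended.

        Assumption: suspension triggers once the remembered count reaches
        the limit. The real comparison in Squid may differ.
        """
        self._advance(now)
        self.counts[self.newest % self.slots] += 1
        return sum(self.counts) >= self.limit
```

With a limit of 10 and a 5 second window, one failure per second never
accumulates more than five remembered failures, while ten failures
inside a single second reach the limit. Memory use stays fixed at one
counter per slot regardless of how many errors arrive.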

+1. I definitely like the idea.

Would it be possible to deflate the options a little bit though?

If what is being achieved really is meant to be a rate, I would expect
that icap_service_failure_limit could be extended instead of adding a
new option, to make it easier for users to understand what's going to
happen.

Something like:
   'icap_service_failure_limit' number [ ('/'|'per') period ]

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE8 or 3.0.STABLE24
   Current Beta Squid 3.1.0.16
Received on Sat Feb 20 2010 - 01:20:03 MST
