RE: [squid-users] RE: Essential ICAP service down error not working reliably

From: Justin Lawler <jlawler_at_amdocs.com>
Date: Tue, 18 Oct 2011 11:12:08 +0000

Hi Amos, thanks for that.

Yeah, we're in the middle of running against a JVM with tuned GC settings, which we hope will resolve the issue.

One problem is that we need to be 100% sure the issue is caused by long GC pauses, as the patch has to go into a busy production system. Currently we're not, because we don't get an ICAP error for every long GC pause; we see ICAP errors for only about 20% of them.

Thanks,
Justin

-----Original Message-----
From: Amos Jeffries [mailto:squid3_at_treenet.co.nz]
Sent: Tuesday, October 18, 2011 7:03 PM
To: squid-users_at_squid-cache.org
Subject: Re: [squid-users] RE: Essential ICAP service down error not working reliably

On 18/10/11 18:02, Justin Lawler wrote:
> Hi,
>
> Just a follow up to this. Anyone know how/when squid will trigger ICAP
> service as down?
>

When it stops responding.

> From ICAP logs, we can see squid is sending in an 'OPTIONS' request
> every second. Is this request a health-check on the ICAP service? Or
> is there any other function to it?
>

Yes, and yes. A service responding to OPTIONS is obviously running.

See the ICAP specification for what else it's used for:
http://www.rfc-editor.org/rfc/rfc3507.txt section 4.10

> We're still seeing very long pauses in our ICAP server that should
> really trigger an ICAP error on squid, but it isn't always.
>
> Thanks, Justin

Can you run it against a better GC? I've heard that Java has had several competing GC algorithms over the last few years, each with different behaviour trade-offs.

>
> -----Original Message-----
> From: Justin Lawler
>
> Hi,
>
> We have an application that integrates with squid over ICAP - a java
> based application. We're finding that the java application has very
> long garbage collection pauses at times (20+ seconds), where the
> application becomes completely unresponsive.
>
> We have squid configured to use this application as an essential
> service, with a timeout of 20 seconds. If the application goes into
> a GC pause, squid can throw an 'essential ICAP service is down'
> error.
>
> The problem is most of the time it doesn't. It only happens maybe 20%
> of the time - even though some of the pauses are 25 seconds+.
>
> Squid is set up to do an 'OPTIONS' request on the java application
> every second, so I don't understand why it doesn't detect the java
> application becoming unresponsive.
>

It's very likely these requests are being made and being serviced, just
very much later.

http://www.squid-cache.org/Doc/config/icap_connect_timeout/
   Note the default is: 30-60 seconds inherited from [peer_]connect_timeout.

Also http://www.squid-cache.org/Doc/config/icap_service_failure_limit/

So 10 failures in a row are required to detect an outage. Each failure
takes 30+ seconds to be noticed.
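
As a rough illustration of that arithmetic, here is a minimal Python sketch. It is a back-of-the-envelope model only and does not reproduce Squid's internal logic. The function names are hypothetical, the 30-second timeout is assumed from the low end of the default quoted above, and the failure count of 10 is the figure quoted above.

```python
# Back-of-the-envelope model of the two settings discussed above.
# Illustration only: the function names are made up for this example
# and nothing here is taken from Squid's source code.

def pause_counts_as_failure(pause_s: float, timeout_s: float) -> bool:
    """A stall is only noticed as a failure if it lasts at least one full timeout."""
    return pause_s >= timeout_s


def min_seconds_to_detect_outage(failure_limit: int, timeout_s: float) -> float:
    """Lower bound: each consecutive failure takes one full timeout to be noticed."""
    return failure_limit * timeout_s


if __name__ == "__main__":
    # Assumed values: 25 s is one of the reported GC pauses,
    # 30 s is the low end of the default connect timeout.
    print(pause_counts_as_failure(25, 30))       # False
    print(min_seconds_to_detect_outage(10, 30))  # 300
```

Under these assumptions a 25-second pause never registers as a failure at all, and a real outage needs at least 300 seconds to be detected. In this model, lowering either value shrinks that window.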

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.16
   Beta testers wanted for 3.2.0.13
Received on Tue Oct 18 2011 - 11:12:21 MDT