RE: [squid-users] RE: Essential ICAP service down error not working reliably from Justin Lawler on 2011-10-19 (squid-users)

From: Justin Lawler <jlawler_at_amdocs.com>
Date: Thu, 20 Oct 2011 03:44:26 +0000

Hi Amos,

We're seeing these OPTIONS health-check requests coming in every second in the ICAP server. Is this correct behavior?

Is this customizable in the squid.conf file? Or does squid calculate this setting itself?

We're seeing these requests come in every second in production, but in our test environment, they're coming in every 40-60 seconds - and we're a little confused as to why.

Thanks and regards,
Justin

-----Original Message-----
From: Justin Lawler
Sent: Tuesday, October 18, 2011 7:12 PM
To: squid-users_at_squid-cache.org
Subject: RE: [squid-users] RE: Essential ICAP service down error not working reliably

Hi Amos, thanks for that.

Yes - we're in the middle of running against a JVM with tuned GC settings, which we hope will resolve the issue.

One problem is that we need to be 100% sure the issue is being caused by long GC pauses, as the patch has to go into a busy production system. Currently we're not, as we don't get an ICAP error for every long GC pause - we only see ICAP errors for maybe 20% of them.

Thanks,
Justin

-----Original Message-----
From: Amos Jeffries [mailto:squid3_at_treenet.co.nz]
Sent: Tuesday, October 18, 2011 7:03 PM
To: squid-users_at_squid-cache.org
Subject: Re: [squid-users] RE: Essential ICAP service down error not working reliably

On 18/10/11 18:02, Justin Lawler wrote:
> Hi,
>
> Just a follow up to this. Anyone know how/when squid will trigger ICAP
> service as down?
>

When it stops responding.

> From ICAP logs, we can see squid is sending in an 'OPTIONS' request
> every second. Is this request a health-check on the ICAP service? Or
> is there any other function to it?
>

Yes, and yes. A service responding to OPTIONS is obviously running.

See the ICAP specification for what else it's used for:
http://www.rfc-editor.org/rfc/rfc3507.txt section 4.10
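
As a rough illustration of what an OPTIONS response carries besides proof of life: RFC 3507 section 4.10.2 defines an Options-TTL header giving the number of seconds the response stays valid, which a client can use to decide when to ask again. The sketch below parses that header from a made-up response. The sample response text and the function names are assumptions for illustration only, not output from the ICAP server in this thread, and whether this header accounts for the one-second interval discussed here is not established.

```python
from typing import Optional

# Hypothetical OPTIONS response; header names are from RFC 3507,
# the values are invented for this example.
SAMPLE_RESPONSE = """ICAP/1.0 200 OK
Methods: RESPMOD
Service: Example-ICAP-Server
ISTag: "example-tag-1"
Options-TTL: 60
Encapsulated: null-body=0

"""


def parse_icap_headers(raw: str) -> dict:
    """Return the ICAP headers as a dict with lower-cased names."""
    lines = raw.strip().splitlines()
    headers = {}
    for line in lines[1:]:  # skip the status line
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def options_ttl_seconds(raw: str) -> Optional[int]:
    """Return Options-TTL in seconds, or None when the header is absent."""
    value = parse_icap_headers(raw).get("options-ttl")
    return int(value) if value is not None else None


print(options_ttl_seconds(SAMPLE_RESPONSE))
```

Comparing the Options-TTL value returned by the production and test ICAP servers would be one way to check whether the two environments advertise different validity periods.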

> We're still seeing very long pauses in our ICAP server that should
> really trigger an ICAP error on squid, but it isn't always.
>
> Thanks, Justin

Can you run it against a better GC? I've heard that there were competing GC algorithms in Java these last few years with various behaviour benefits.

>
> -----Original Message-----
> From: Justin Lawler
>
> Hi,
>
> We have an application that integrates with squid over ICAP - a java
> based application. We're finding that the java application has very
> long garbage collection pauses at times (20+ seconds), where the
> application becomes completely unresponsive.
>
> We have squid configured to use this application as an essential
> service, with a timeout of 20 seconds. If the application goes into a
> GC pause, squid can throw an 'essential ICAP service is down'
> error.
>
> The problem is most of the time it doesn't. It only happens maybe 20%
> of the time - even though some of the pauses are 25 seconds+.
>
> Squid is set up to do an 'OPTIONS' request on the java application
> every second, so I don't understand why it doesn't detect the java
> application becoming unresponsive.
>

It's very likely these requests are being made and being serviced, just very much later.

http://www.squid-cache.org/Doc/config/icap_connect_timeout/
Note the default is: 30-60 seconds inherited from [peer_]connect_timeout.

Also http://www.squid-cache.org/Doc/config/icap_service_failure_limit/

So 10 failures in a row are required to detect an outage. Each failure takes 30+ seconds to be noticed.
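
The arithmetic above can be sketched as follows. This is a deliberately simplified model, not Squid's actual failure accounting: it assumes attempts run one after another, that each failure is noticed only when the connect timeout expires, and it uses the figures quoted in this message (a limit of 10 failures, 30 seconds per failure). The function names are made up for the example.

```python
def seconds_to_detect_outage(failure_limit: int, timeout_s: float) -> float:
    """Time for failure_limit consecutive timeouts to accumulate."""
    return failure_limit * timeout_s


def timeouts_during_pause(pause_s: float, timeout_s: float) -> int:
    """Whole timeouts that fit inside a pause, assuming serial attempts."""
    return int(pause_s // timeout_s)


# Figures from the message above: 10 failures, 30 seconds each.
print(seconds_to_detect_outage(10, 30))

# A 25 second GC pause is shorter than a single 30 second timeout,
# so under this model it produces no counted failure at all.
print(timeouts_during_pause(25, 30))
```

Under these assumptions a pause has to last several minutes before the failure limit is reached, which is consistent with requests being serviced late rather than counted as failed.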

Amos

--
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.16
   Beta testers wanted for 3.2.0.13

Received on Thu Oct 20 2011 - 03:44:45 MDT
