[squid-users] Strange performance effects on squid during off peak hours

From: Martin Sperl <Martin.Sperl_at_amdocs.com>
Date: Wed, 15 Sep 2010 14:01:10 +0100

Hi everyone,

we are seeing a strange response-time effect over the course of 24 hours when delivering content via Squid plus an ICAP service (3.0.STABLE9 - I know it is old, but getting something changed in a production environment can be VERY hard...). The ICAP server we use rewrites some request URLs and also rewrites parts of the response content.
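(For context, the ICAP wiring in squid.conf looks roughly like the minimal sketch below. This is an illustration of a 3.0-style setup only; the service names and the icap:// URL are placeholders, not our actual configuration.)

    icap_enable on
    # REQMOD service that rewrites some request URLs (placeholder address)
    icap_service svc_req reqmod_precache 0 icap://icap.example.net:1344/reqmod
    icap_class class_req svc_req
    icap_access class_req allow all
    # RESPMOD service that rewrites parts of the response body
    icap_service svc_resp respmod_precache 0 icap://icap.example.net:1344/respmod
    icap_class class_resp svc_resp
    icap_access class_resp allow all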

Essentially we see that during peak hours the average response time (ART) is better than during off-peak hours.
Here is a report for one day, covering all CSS files delivered with cache status TCP_MEM_HIT (as taken from Squid's extended access logs), for a single server (all servers show similar effects):

Here is the quick overview:
+------+------+---------+
| hour | hits | ART (s) |
+------+------+---------+
|    0 | 4232 |   0.016 |
|    1 | 4553 |   0.015 |
|    2 | 4238 |   0.015 |
|    3 | 4026 |   0.018 |
|    4 | 1270 |   0.024 |
|    5 |  390 |   0.042 |
|    6 |   61 |   0.054 |
|    7 |  591 |   0.034 |
|    8 |  445 |   0.038 |
|    9 |  505 |   0.035 |
|   10 |  716 |   0.034 |
|   11 | 1307 |   0.030 |
|   12 | 2552 |   0.023 |
|   13 | 3197 |   0.021 |
|   14 | 3567 |   0.020 |
|   15 | 4095 |   0.019 |
|   16 | 4037 |   0.019 |
|   17 | 4670 |   0.017 |
|   18 | 5349 |   0.016 |
|   19 | 5638 |   0.017 |
|   20 | 6262 |   0.014 |
|   21 | 5634 |   0.014 |
|   22 | 4809 |   0.016 |
|   23 | 5393 |   0.016 |
+------+------+---------+
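For reference, a minimal sketch of how such a report can be pulled from the access log (assuming the default native log format, where field 1 is the UNIX timestamp, field 2 the elapsed time in milliseconds, field 4 the result code and field 7 the URL):

    #!/usr/bin/env python3
    # Hourly hit count and average response time (ART) for CSS objects
    # served as TCP_MEM_HIT, from a squid native-format access.log.
    import sys, time
    from collections import defaultdict

    hits = defaultdict(int)
    total_ms = defaultdict(float)

    for line in open(sys.argv[1]):
        f = line.split()
        if len(f) < 7:
            continue
        # the result code is logged as e.g. TCP_MEM_HIT/200
        if not f[3].startswith('TCP_MEM_HIT') or not f[6].endswith('.css'):
            continue
        hour = time.gmtime(float(f[0])).tm_hour   # report hours in UTC
        hits[hour] += 1
        total_ms[hour] += float(f[1])

    for hour in sorted(hits):
        print('%02d %6d %7.3f' % (hour, hits[hour], total_ms[hour] / hits[hour] / 1000.0))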

Obviously there are statistical effects involved, but the per-hour percentage histograms make the pattern clearer. Each row below is one hour (UTC); the columns are the percentage of requests falling into 0.001 s response-time buckets from 0.000 to 0.030 s, plus a final column for everything above 0.030 s:
00 0.2 6.0 3.2 3.3 3.3 2.7 4.0 3.2 3.4 3.0 4.4 4.7 4.2 3.9 3.6 3.9 3.1 3.1 3.0 2.6 3.7 2.8 2.9 2.1 1.7 2.0 1.7 1.5 1.5 1.6 1.5 8.2
01 0.2 6.1 3.7 3.3 3.4 3.2 3.3 3.3 3.9 3.8 4.3 5.1 4.0 4.4 3.4 3.7 3.3 2.8 3.1 2.7 3.6 2.6 2.3 2.1 2.1 1.8 1.5 1.7 1.4 1.8 1.4 6.6
02 0.1 6.5 3.4 3.5 3.2 3.3 3.4 3.1 3.0 3.3 4.5 5.0 4.3 3.8 3.4 3.6 3.7 3.1 3.3 2.6 2.9 3.1 2.8 2.0 2.3 1.9 2.0 1.4 1.1 1.6 1.7 7.3
03 0.3 6.1 3.2 2.9 3.2 3.2 3.6 3.5 3.2 3.4 4.5 5.1 3.8 4.0 4.2 4.0 3.4 3.2 3.1 3.2 2.8 3.1 2.6 1.5 1.7 1.9 1.7 1.6 1.1 1.5 1.5 8.0
04 0.4 3.6 1.9 1.5 1.8 1.3 2.0 2.0 1.9 2.0 3.8 3.5 2.8 3.1 1.7 3.1 2.4 2.8 2.4 2.2 3.0 3.6 2.6 1.7 2.5 1.7 1.3 2.1 1.7 3.4 3.9 26.3
05 0.0 0.5 0.8 0.0 0.0 0.3 0.3 0.0 0.0 0.8 1.0 0.8 0.5 0.3 1.0 0.5 1.5 0.0 1.0 0.3 1.8 1.8 0.3 0.8 1.0 0.8 0.5 1.5 0.8 4.6 5.1 71.5
06 0.0 0.0 1.6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.6 1.6 3.3 91.8
07 0.5 2.4 0.3 0.2 0.7 0.3 0.5 0.8 0.2 0.7 1.2 2.2 2.4 2.0 0.7 1.0 1.5 0.3 0.8 2.2 2.0 3.2 2.5 1.9 2.0 0.8 2.2 1.0 1.4 2.4 5.6 54.0
08 0.4 1.6 1.1 0.0 1.1 0.9 1.1 0.4 0.2 0.4 0.9 0.9 0.9 0.7 1.3 1.1 0.7 1.6 1.1 1.6 1.1 3.6 1.8 2.0 1.8 1.8 1.8 1.3 1.8 1.3 5.2 58.2
09 0.0 1.6 0.8 1.0 0.8 0.6 0.4 0.8 0.4 0.2 1.4 1.0 0.6 0.6 0.8 1.8 1.4 1.0 1.2 1.0 2.8 3.0 1.0 2.2 2.8 1.0 1.6 1.4 1.6 3.2 5.5 56.8
10 0.3 1.3 0.6 0.7 0.4 0.0 0.3 0.1 0.6 0.8 0.3 1.1 1.3 1.1 1.1 1.4 1.7 0.8 1.1 1.4 3.8 2.8 1.1 1.5 2.2 1.0 2.2 1.1 1.4 4.5 8.4 53.6
11 0.4 3.0 1.0 0.8 0.9 0.5 0.7 0.4 0.8 0.3 1.8 2.1 2.0 1.4 1.2 1.3 1.8 1.3 1.3 1.0 3.3 3.3 2.5 2.7 2.2 2.4 1.8 2.3 1.8 4.3 4.8 44.6
12 0.2 3.4 1.8 1.4 1.3 1.7 1.5 1.5 1.3 1.2 3.0 3.6 2.8 2.9 2.7 2.4 2.7 2.2 2.3 2.3 3.7 3.7 2.9 2.5 2.5 2.6 2.2 2.4 2.2 2.8 4.4 25.7
13 0.2 4.6 1.5 1.8 2.2 1.6 1.6 2.0 2.2 1.8 3.7 4.2 2.7 2.9 3.1 3.1 2.8 2.9 2.7 3.1 3.6 3.8 2.9 2.6 2.3 2.6 2.3 1.9 1.7 2.7 3.7 19.3
14 0.4 4.3 1.9 1.9 2.4 1.7 2.6 1.8 2.2 1.8 3.8 4.3 3.2 3.2 3.1 2.5 3.4 2.7 3.0 2.9 3.8 4.0 2.7 2.9 2.4 2.6 2.2 2.5 2.0 2.5 3.0 16.7
15 0.4 5.3 2.0 2.3 2.1 2.2 2.5 2.5 2.5 2.7 3.9 4.7 3.4 3.5 3.3 3.0 3.2 3.0 3.0 2.8 4.4 3.8 2.7 2.1 2.6 1.9 2.2 1.5 2.1 1.9 2.1 14.3
16 0.3 5.3 3.1 2.7 2.4 2.5 3.0 2.6 2.5 2.9 3.4 4.9 4.0 3.5 3.3 3.5 3.5 2.9 2.7 2.8 3.4 3.1 2.6 2.7 2.3 2.7 2.1 1.9 1.5 1.9 1.8 12.5
17 0.2 5.4 2.7 2.6 2.8 3.0 2.5 2.5 3.0 3.2 4.5 4.8 3.9 3.6 4.2 3.3 3.3 3.3 3.2 3.5 3.5 3.3 2.5 2.3 1.9 2.0 2.0 1.6 1.5 1.9 2.0 9.8
18 0.4 5.5 3.0 3.2 2.8 3.1 3.0 3.5 2.9 2.9 4.3 5.5 3.8 3.1 3.9 3.1 3.2 3.3 2.7 3.4 3.3 3.1 2.7 2.3 2.2 1.9 1.6 1.5 1.7 1.6 1.8 9.4
19 0.3 6.2 3.1 3.0 3.4 3.4 3.1 2.9 3.5 2.9 4.2 5.0 4.5 3.5 3.4 3.6 2.9 3.2 2.9 3.0 3.4 3.1 2.7 2.1 1.9 1.6 1.6 1.7 1.3 1.7 2.0 8.6
20 0.4 6.8 3.7 3.4 3.6 3.7 3.3 3.4 3.4 3.4 5.1 5.3 4.1 3.5 4.0 3.6 3.6 2.5 3.1 2.8 3.0 2.8 2.5 2.1 2.1 1.7 1.4 1.2 1.1 1.3 1.5 6.5
21 0.4 7.1 3.6 4.1 3.7 3.3 3.6 3.8 3.6 3.3 4.4 4.6 5.1 4.5 4.1 3.1 3.2 3.2 2.7 2.9 3.3 2.9 2.3 1.9 1.8 1.5 1.5 1.2 1.2 1.3 1.2 5.6
22 0.3 6.8 3.6 3.2 2.7 3.1 3.2 3.0 3.4 2.9 4.2 4.4 3.8 3.2 3.5 3.9 3.0 3.1 2.9 2.5 3.7 3.2 2.7 2.4 2.1 1.5 1.9 1.4 1.4 1.6 2.0 9.4
23 0.4 6.0 3.8 3.2 3.4 3.0 3.0 2.8 3.0 3.1 4.5 4.5 4.1 3.9 3.6 3.7 3.4 2.8 2.8 3.1 4.0 2.9 2.7 1.9 2.1 2.1 1.6 1.6 1.3 1.6 1.5 8.5
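The histogram above was built the same way, just bucketed into 1 ms steps; a sketch under the same log-format assumptions:

    #!/usr/bin/env python3
    # Per-hour percentage histogram of response times: 1 ms buckets for
    # 0.000-0.030 s (31 columns) plus one overflow column for >0.030 s.
    import sys, time
    from collections import defaultdict

    SLOTS = 32                       # slots 0..30 = 0..30 ms, slot 31 = overflow
    histo = defaultdict(lambda: [0] * SLOTS)

    for line in open(sys.argv[1]):
        f = line.split()
        if len(f) < 7:
            continue
        if not f[3].startswith('TCP_MEM_HIT') or not f[6].endswith('.css'):
            continue
        hour = time.gmtime(float(f[0])).tm_hour
        ms = int(float(f[1]))        # elapsed time in whole milliseconds
        histo[hour][min(ms, SLOTS - 1)] += 1

    for hour in sorted(histo):
        n = float(sum(histo[hour]))
        print('%02d' % hour, ' '.join('%4.1f' % (100 * c / n) for c in histo[hour]))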

You can see that during off-peak hours the slow tail dominates: at 06:00 UTC, 91.8% of all TCP_MEM_HIT requests for CSS files take more than 0.030 seconds.
During peak hours, by contrast, the distribution peaks in the 0.001 s and 0.011 s buckets (at 18:00 each of those holds 5.5% of all requests).

I know that the numbers reported by Squid also include some effects of the network itself.
But we also see similar behavior in our active monitoring of HTML and image downloads within our span of control (this is one of our KPIs, and we are exceeding it during graveyard-shift hours...).

We have tried a lot of things:
* virtualized versus physical hardware (a 0.002 s improvement during peak hours)
* removing the disk cache (Squid then uses the default settings compiled in when no disk cache is defined - at least in the version of Squid that we have)
* moving the disk cache to a ramdisk and increasing its size (this has a negative effect!!!) - I wanted to change to aufs, but the binary we have does not support it...
* tuning some Linux kernel parameters to increase the TCP buffers (example settings below)
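The TCP buffer tuning mentioned in the last point was along these lines (an illustrative /etc/sysctl.conf fragment; the values shown are common examples, not necessarily our exact production settings):

    # maximum socket receive/send buffer sizes
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    # min / default / max TCP receive and send buffers
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216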

Has anyone experienced similar behavior, and does anyone have recommendations on what else we can do or test (besides upgrading to Squid 3.1, which is a major effort from the testing perspective and may not resolve the issue either)?

Thanks,
        Martin

P.S.: Here is the squid -v output:
Squid Cache: Version 3.0.STABLE9
configure options: '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--prefix=/opt/squid' '--enable-icap-client' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' 'CXXFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' 'FFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'