RE: [squid-users] Strange performance effects on squid during off peak hours

From: Martin Sperl <Martin.Sperl_at_amdocs.com>
Date: Mon, 20 Sep 2010 08:45:01 +0100

Hi!

I have run a test with ab running 10000 hits with a single thread loading an image with a spacing of 5 minutes of inactivity and there is still a "variation" even measurable with apache bench:
Date time mean RT measured by AB
2010-09-16 02:55:12 37.033
2010-09-16 03:01:22 37.633
2010-09-16 03:07:38 37.245
2010-09-16 03:13:51 38.867
2010-09-16 03:20:20 41.326
2010-09-16 03:29:09 38.427
2010-09-16 03:40:33 40.313
2010-09-16 03:52:16 41.049
2010-09-16 04:02:12 42.100
2010-09-16 04:19:13 40.650
2010-09-16 04:36:00 37.490
2010-09-16 04:52:15 36.126
2010-09-16 05:08:16 37.390
2010-09-16 05:24:30 34.031
2010-09-16 05:40:10 30.392
2010-09-16 05:55:14 26.779
2010-09-16 06:09:42 24.118
2010-09-16 06:23:43 24.283
2010-09-16 06:37:46 24.423
2010-09-16 06:51:50 23.334
2010-09-16 07:05:44 23.633
2010-09-16 07:19:40 22.333
2010-09-16 07:33:24 21.460
2010-09-16 07:46:58 20.632
2010-09-16 08:00:25 21.047
2010-09-16 08:13:55 20.049
2010-09-16 08:27:16 18.903
2010-09-16 08:40:25 19.244
2010-09-16 08:53:37 21.181
2010-09-16 09:07:09 21.196
2010-09-16 09:20:41 19.102
2010-09-16 09:33:52 19.755
2010-09-16 09:47:10 18.674
2010-09-16 10:00:16 18.832
2010-09-16 10:13:25 17.063
2010-09-16 10:26:16 18.207
2010-09-16 10:39:18 18.328
2010-09-16 10:52:21 17.980
2010-09-16 11:05:23 17.868
2010-09-16 11:18:21 17.417
2010-09-16 11:31:16 16.421
2010-09-16 11:44:00 17.059
2010-09-16 11:56:51 17.350
2010-09-16 12:09:44 16.641
2010-09-16 12:22:31 18.211
2010-09-16 12:35:33 16.686
2010-09-16 12:48:20 17.278
2010-09-16 13:01:13 17.172
2010-09-16 13:14:05 16.528
2010-09-16 13:26:50 16.124
2010-09-16 13:39:31 16.353
2010-09-16 13:52:15 18.287
2010-09-16 14:05:18 16.728
2010-09-16 14:18:05 17.055
2010-09-16 14:30:56 17.452
2010-09-16 14:43:50 16.491
2010-09-16 14:56:35 15.851
2010-09-16 15:09:14 16.407
2010-09-16 15:21:58 15.822
2010-09-16 15:34:36 17.049
2010-09-16 15:47:27 16.052
2010-09-16 16:00:07 16.307
2010-09-16 16:12:50 16.408
2010-09-16 16:25:34 17.201
2010-09-16 16:38:26 16.686
2010-09-16 16:51:13 16.076
2010-09-16 17:03:54 17.277
2010-09-16 17:16:47 16.468
2010-09-16 17:29:32 14.842
2010-09-16 17:42:00 15.721
2010-09-16 17:54:37 15.734
2010-09-16 18:07:15 16.160
2010-09-16 18:19:56 16.131
2010-09-16 18:32:38 15.951
2010-09-16 18:45:17 14.994
2010-09-16 18:57:47 15.365
2010-09-16 19:10:21 16.774
2010-09-16 19:23:09 17.303
2010-09-16 19:36:02 16.790
2010-09-16 19:48:50 16.421
2010-09-16 20:01:34 16.380
2010-09-16 20:14:18 15.523
2010-09-16 20:26:53 16.499
2010-09-16 20:39:38 16.596
2010-09-16 20:52:24 16.116
2010-09-16 21:05:05 16.445
2010-09-16 21:17:50 15.919
2010-09-16 21:30:29 16.928
2010-09-16 21:43:18 15.841
2010-09-16 21:55:57 16.378
2010-09-16 22:08:41 17.232
2010-09-16 22:21:33 15.755
2010-09-16 22:34:11 17.264
2010-09-16 22:47:03 18.250
2010-09-16 23:00:06 18.700
2010-09-16 23:13:13 19.165
2010-09-16 23:26:25 23.088
2010-09-16 23:40:15 23.505
2010-09-16 23:54:11 22.105
2010-09-17 00:07:52 23.635
2010-09-17 00:21:48 29.841
2010-09-17 00:36:46 29.847
2010-09-17 00:51:45 31.886
2010-09-17 01:07:04 31.010
2010-09-17 01:22:14 33.142
2010-09-17 01:37:46 35.977
2010-09-17 01:53:45 38.067
2010-09-17 02:10:06 38.245
2010-09-17 02:26:29 39.521
2010-09-17 02:43:04 39.803
2010-09-17 02:59:42 34.372
2010-09-17 03:15:26 33.135

The test got done on a server sitting next to the one tested in the same network segment.
Command executed: ab -n 10000 -c 1 -X <serverip>:3128 http://<hostname>/<image URL>

As you can see there is still a variation even though ab is producing lots of hits/s: something like 30hits/s

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking <hostname> [through <serverip>:3128] (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software: Apache-Coyote/1.1
Server Hostname: <hostname>
Server Port: 80

Document Path: /<image URL>
Document Length: 927 bytes

Concurrency Level: 1
Time taken for tests: 331.351 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 13320000 bytes
HTML transferred: 9270000 bytes
Requests per second: 30.18 [#/sec] (mean)
Time per request: 33.135 [ms] (mean)
Time per request: 33.135 [ms] (mean, across all concurrent requests)
Transfer rate: 39.26 [Kbytes/sec] received

Connection Times (ms)
              min mean[+/-sd] median max
Connect: 0 0 1.8 0 130
Processing: 0 33 17.1 31 422
Waiting: 0 33 17.1 31 422
Total: 0 33 17.3 32 423

Percentage of the requests served within a certain time (ms)
  50% 32
  66% 40
  75% 46
  80% 50
  90% 58
  95% 60
  98% 61
  99% 62
 100% 423 (longest request)

There is also no visible change on all the requests measured.

I also did an analysis of the ICAP service RT (as measured via an analysis of tcpdumps taken) and I came to the conclusion that the Request Modification time was 0.0006s and the Response modification took 0.0017s. But the image is NOT passing thru the ResponseModification (as it is delivered from Memory, and my experience shows, that with TCP_MEM_HIT the Response Modification is not "triggered") - the overall ratio between request and response modification count measured is 3:1.

Any further Ideas how to improve the RT?
In the meantime I will push for an upgrade to squid 3.1

Thanks,
        Martin

> -----Original Message-----
> From: Martin Sperl
> Sent: Mittwoch, 15. September 2010 21:53
> To: Amos Jeffries; squid-users_at_squid-cache.org
> Subject: RE: [squid-users] Strange performance effects on squid during off
> peak hours
>
> Hi Amos!
>
> Thanks for your feedback.
>
> > Squid is still largely IO event driven. If the network IO is less than
> > say 3-4 req/sec Squid can have a queue of things waiting to happen which
> > get delayed a long time (hundreds of ms) waiting to be kicked off.
> > Your overview seems to show that behaviour clearly.
> >
> > There have been some small improvements and fixes to several of the
> > lagging things but I think its still there in even the latest Squid.
>
> Here the Hit/s statistics on this specific server for the time:
> +------+-------+-------+
> | h | allHPS| cssART|
> +------+-------+-------+
> | 0 | 48.34 | 0.016 |
> | 1 | 49.80 | 0.015 |
> | 2 | 49.01 | 0.015 |
> | 3 | 47.08 | 0.018 |
> | 4 | 17.34 | 0.024 |
> | 5 | 4.00 | 0.042 |
> | 6 | 0.52 | 0.054 |
> | 7 | 9.02 | 0.034 |
> | 8 | 7.18 | 0.038 |
> | 9 | 8.25 | 0.035 |
> | 10 | 9.45 | 0.034 |
> | 11 | 14.71 | 0.030 |
> | 12 | 23.94 | 0.023 |
> | 13 | 31.04 | 0.021 |
> | 14 | 35.02 | 0.020 |
> | 15 | 38.87 | 0.019 |
> | 16 | 40.92 | 0.019 |
> | 17 | 43.39 | 0.017 |
> | 18 | 45.62 | 0.016 |
> | 19 | 47.58 | 0.017 |
> | 20 | 51.91 | 0.014 |
> | 21 | 53.65 | 0.014 |
> | 22 | 40.87 | 0.016 |
> | 23 | 47.40 | 0.016 |
> +------+-------+-------+
>
> So to summarize it: we need to keep the number of hits above 30 hits/s for
> squid, so that we get an acceptable Response time.
>
> I believe it will need some convincing of management to get this assumption
> tested in production ;)
>
> One other Question: is squid 3.1 "better" in this respect than 3.0?
>
> Thanks,
> Martin
>
> This message and the information contained herein is proprietary and
> confidential and subject to the Amdocs policy statement,
> you may review at http://www.amdocs.com/email_disclaimer.asp

This message and the information contained herein is proprietary and confidential and subject to the Amdocs policy statement,
you may review at http://www.amdocs.com/email_disclaimer.asp
Received on Mon Sep 20 2010 - 07:45:23 MDT

This archive was generated by hypermail 2.2.0 : Mon Sep 20 2010 - 12:00:03 MDT