Re: [squid-users] What does storeClientCopyEvent mean?

From: Adrian Chadd <adrian_at_squid-cache.org>
Date: Thu, 11 Dec 2008 21:34:10 -0500

Which version of Squid are you using again? I patched the latest
Squid-2.HEAD with some aufs-related fixes that reduce the amount of
callback checking that is done.

Uhm, check src/fs/aufs/store_asyncufs.h :

/* Which operations to run async */
#define ASYNC_OPEN 1
#define ASYNC_CLOSE 0
#define ASYNC_CREATE 1
#define ASYNC_WRITE 0
#define ASYNC_READ 1

That's the default on Squid-2.HEAD. I've just changed them all to be
async under cacheboy-1.6, and this performs great under FreeBSD 7 +
AUFS in my testing.

Adrian

2008/12/11 Bin Liu <binliu.lqbn_at_gmail.com>:
> Thanks for your reply, Adrian. I really appreciate your help.
>
>> I'd suggest using your OS profiling to figure out where the CPU is
>> being spent. This may be a symptom, not the cause.
>
> Here is the top output snapshot:
>
> last pid: 76181; load averages: 1.15, 1.12, 1.08 up 6+05:35:14 22:25:07
> 184 processes: 5 running, 179 sleeping
> CPU states: 24.2% user, 0.0% nice, 3.8% system, 0.0% interrupt, 72.0% idle
> Mem: 4349M Active, 2592M Inact, 599M Wired, 313M Cache, 214M Buf, 11M Free
> Swap: 4096M Total, 4096M Free
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 38935 nobody 27 44 0 4385M 4267M ucond 1 302:19 100.00% squid
> 46838 root 1 44 0 24144K 2344K select 0 3:09 0.00% snmpd
> 573 root 1 44 0 4684K 608K select 0 0:34 0.00% syslogd
> 678 root 1 44 0 24780K 4360K select 1 0:12 0.00% perl5.8.8
> 931 root 1 44 0 10576K 1480K select 0 0:11 0.00% sendmail
> 871 root 1 44 0 20960K 508K select 3 0:08 0.00% sshd
> 941 root 1 8 0 5736K 424K nanslp 2 0:03 0.00% cron
> 14177 root 1 44 0 40620K 2648K select 0 0:02 0.00% httpd
>
>
> # iostat 1 5
> tty da0 da1 da2 cpu
> tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
> 4 75 28.86 5 0.13 41.08 22 0.88 40.01 42 1.66 6 0 4 0 89
> 0 230 22.86 7 0.16 21.33 6 0.12 44.78 23 1.00 23 0 4 0 73
> 0 77 16.00 1 0.02 51.56 27 1.35 40.38 48 1.88 22 0 6 0 72
> 0 77 16.00 8 0.12 18.29 7 0.12 26.64 22 0.57 24 0 3 0 72
> 0 77 16.00 2 0.03 32.00 2 0.06 41.43 35 1.41 24 0 4 0 71
>
> # vmstat 1 5
>  procs     memory       page                    disks     faults        cpu
>  r b w      avm    fre  flt re pi po  fr   sr da0 da1   in    sy    cs us sy id
>  1 2 0 13674764 386244  455  9  0  0 792 4672   0   0 4147  6847  6420  6  4 89
>  1 1 0 13674764 383112 1365  4  0  0 147    0   2   4 5678  9065 16860 18  6 76
>  1 1 0 13674764 383992  894  3  0  0 916    0   5   6 5089  7950 16239 22  5 73
>  1 1 0 13674764 378624 1399 11  0  0  52    0  11   1 5533 10447 18994 23  5 72
>  1 1 0 13674768 373360 1427  6  0  0  30    0   9   3 5919 10913 19686 25  5 70
>
>
> ASYNC IO Counters:
> Operation # Requests
> open 2396837
> close 1085
> cancel 2396677
> write 3187
> read 16721807
> stat 0
> unlink 299208
> check_callback 800440690
> queue 14
>
>
> I've noticed that the 'queue' counter is relatively high, whereas it
> should normally always be zero. But the disks seem pretty idle: I've
> tested that by copying some large files into the cache_dir, and that
> was very fast. So there must be something blocking Squid. I've got two
> boxes with the same hardware/software configuration running
> load-balanced; when one of them was blocking, the other one ran pretty well.
>
> I'm using FreeBSD 7.0 + AUFS, and I've also noticed what you wrote
> several days ago
> (http://www.squid-cache.org/mail-archive/squid-users/200811/0647.html),
> which mentions that some operations may block under FreeBSD. So could
> that be causing this problem?
>
> Thanks again.
>
> Regards,
> Liu
>
>
> On Tue, Dec 9, 2008 at 23:28, Adrian Chadd <adrian_at_squid-cache.org> wrote:
>> It's a hack used to defer a storage manager transaction from starting
>> while another one is in progress for that same connection.
>>
>> I'd suggest using your OS profiling to figure out where the CPU is
>> being spent. This may be a symptom, not the cause.
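>>
>> Roughly, the pattern is something like this (a sketch of the idea,
>> not the exact store_client.c code):
>>
>> static void
>> storeClientCopy2(StoreEntry * e, store_client * sc)
>> {
>>     if (sc->flags.store_copying) {
>>         /* a copy for this store client is already in progress;
>>          * retry via a zero-delay, zero-weight event instead of
>>          * starting a second one */
>>         eventAdd("storeClientCopyEvent", storeClientCopyEvent, sc, 0.0, 0);
>>         return;
>>     }
>>     /* ... normal copy path ... */
>> }
>>
>> That kind of zero-delay, weight-0 retry is consistent with the
>> entries you see in the cachemgr events list.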
>>
>>
>> adrian
>>
>> 2008/12/7 Bin Liu <binliu.lqbn_at_gmail.com>:
>>> Hi there,
>>>
>>> Squid is pegging the CPU at 100% with "storeClientCopyEvent", and hit
>>> service time soars up to several seconds here. The following is what I
>>> see in cachemgr:events:
>>>
>>> Operation                  Next Execution       Weight  Callback Valid?
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       -0.019010 seconds    0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> storeClientCopyEvent       0.000000 seconds     0       yes
>>> MaintainSwapSpace          0.980990 seconds     1       N/A
>>> idnsCheckQueue             1.000000 seconds     1       N/A
>>> ipcache_purgelru           5.457004 seconds     1       N/A
>>> wccp2HereIam               5.464900 seconds     1       N/A
>>> fqdncache_purgelru         5.754399 seconds     1       N/A
>>> storeDirClean              10.767635 seconds    1       N/A
>>> statAvgTick                59.831274 seconds    1       N/A
>>> peerClearRR                110.539127 seconds   0       N/A
>>> peerClearRR                279.341239 seconds   0       N/A
>>> User Cache Maintenance     1610.136367 seconds  1       N/A
>>> storeDigestRebuildStart    1730.225879 seconds  1       N/A
>>> storeDigestRewriteStart    1732.267852 seconds  1       N/A
>>> peerRefreshDNS             1957.777934 seconds  1       N/A
>>> peerDigestCheck            2712.910515 seconds  1       yes
>>>
>>> So what does "storeClientCopyEvent" mean? Is it disk I/O that is causing this problem?
>>>
>>> Regards,
>>> Liu
>>>
>>>
>>
>
>
>
> --
> Bin LIU
> GreenCache Team,
> Gforce Inc.
>
>
Received on Fri Dec 12 2008 - 02:34:21 MST
