Re: [squid-users] squid cpu problem from Amos Jeffries on 2014-05-08 (squid-users)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 08 May 2014 21:22:21 +1200

On 8/05/2014 12:33 a.m., a.afach wrote:
> Hi amos
> as i see the problem is still occurring with other errors in GDB the CPU
> still goes to 100%
>

The "problem" is that very big objects do exist and occasionally need to
be moved from memory to disk.

> this is the GDB output:
>
>
> Loaded symbols for /lib64/libnss_db.so.2
> 0x000000000050a1c8 in linklistPush (L=0x9fb429e8, p=0x53e0be0) at
> list.cc:47
> 47 list.cc: No such file or directory.
> in list.cc
> (gdb) backtrace
> #0 0x000000000050a1c8 in linklistPush (L=0x9fb429e8, p=0x53e0be0) at
> list.cc:47
> #1 0x0000000000594841 in UFSStoreState::write (this=0xb7775918,
> buf=0x723b7c70
> "I\223\324\004\245\201\315\306\354P\276\372e\373\r\235\250\311\033\275P\333\344\323\211\354\275\200\362>A",
> size=4096, aOffset=-1, free_func=
> 0x50f220 <memNodeWriteComplete(void*)>) at ufs/store_io_ufs.cc:247
> #2 0x00000000005436e0 in doPages (anEntry=<optimized out>) at
> store_swapout.cc:160
<snip>

>
> i tried to change config with no success
> the problem occurs in peak times or when no load in random times.
> how can i know if the problem is a hardware problem or squid ????

Neither and both.

It is a "non-problem" in that storing a large object to disk in small
incremental bits is going to take a lot of CPU cycles. The nature of the
task itself causes large CPU usage.

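The Squid code doing this store is not great. Each queued write walks
the linked list of pending memory blocks to find its end (the
"while (*L)" loop in linklistPush), so queuing N blocks costs roughly
(N^2)/2 list steps during the store operation.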
Also, the version you are using does not distinguish between objects
stored for future use and objects being discarded immediately. They all
go to disk on their way through Squid. So there is no way to avoid it by
configuring storage of smaller objects.
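
To put a rough number on that list walk, here is a minimal sketch. It is
not Squid's list.cc: the Node type and the names pushWalkingToTail and
totalStepsForAppends are made up for illustration. The only thing taken
from the backtrace is the idea of an append that loops on "while (*L)"
until it reaches the tail.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a singly linked list node (not Squid's type).
struct Node {
    void *payload;
    Node *next;
};

// Append by walking from the head to the tail on every call, in the style
// of the "while (*L)" loop seen at list.cc:47 in the backtrace.
// Returns how many links were followed to find the tail.
std::size_t pushWalkingToTail(Node **head, void *payload) {
    std::size_t steps = 0;
    while (*head) {
        head = &(*head)->next;
        ++steps;
    }
    *head = new Node{payload, nullptr};
    return steps;
}

// Append n blocks to an empty list and return the total number of links
// followed. The i-th append walks i existing nodes, so the total is
// 0 + 1 + ... + (n-1) = n*(n-1)/2, i.e. roughly (N^2)/2.
std::uint64_t totalStepsForAppends(std::size_t n) {
    Node *head = nullptr;
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += pushWalkingToTail(&head, nullptr);

    // Release the list again so the sketch does not leak.
    while (head) {
        Node *next = head->next;
        delete head;
        head = next;
    }
    return total;
}
```

Assuming every block is queued before any is written out (the worst
case), a 16 MB object in the 4096-byte pages shown in the backtrace is
4096 blocks, and totalStepsForAppends(4096) comes to about 8.4 million
pointer hops. The cost grows with the square of the object size, so an
object 10 times larger needs about 100 times as many.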

The hardware is not able to cope with that operation being done on the
size of objects you are proxying.

Amos

>
> thanks
>
>
> On 2014-04-05 03:37, Amos Jeffries wrote:
>> This looks like the CPU cycles are being consumed by walking one or more
>> very long lists of memory pieces and writing them to disk one by one.
>> Note the UFSStoreState::write parameter size=4096 in the backtrace for
>> how big those memory pages are.
>>
>> Which could happen if you cached a very big object in cache_mem and then
>> a random time later it needed swapping out to disk to free up memory.
>>
>> It could also happen if Squid needed to suddenly swap out a large number
>> of smaller items to make memory space available for a large one which is
>> about to arrive.
>>
>> So, have you configured Squid to allow very large objects (many MB or
>> GB) in memory storage?
>>
>>
>> Note these causes would not show up in the testing you mentioned unless
>> you had a very wide range of test object sizes being pumped randomly
>> through the proxy. A tool like web polygraph is best to test that
>> traffic behaviour accurately.
>>
>> Amos
>>
>>
>> On 5/04/2014 1:59 a.m., a.afach wrote:
>>> Dear all
>>> i still have the CPU spikes even when i used
>>> disable-strict-error-checking without using Cflags
>>>
>>> this is the gdb backtrace while the CPU spikes
>>>
>>> 0x000000000051b348 in linklistPush (L=0x11853e188, p=0xce6d4300) at
>>> list.cc:47
>>> 47 while (*L)
>>> (gdb) backtrace
>>> #0 0x000000000051b348 in linklistPush (L=0x11853e188, p=0xce6d4300) at
>>> list.cc:47
>>> #1 0x00000000005a70a1 in UFSStoreState::write (this=0xb3970e28,
>>> buf=0x11fe69ca0
>>> "!v\253r[/\307\232G\b\375`\237:\213\256^\335\373{\241%\232\363\021\071>`\342\033\177a\202G\320{\323%\236K\342\243*\332\316\351\231=\360\370\313Ro=\317\262\243\315\027\351,\221\230\353Z\023\024q\"QSC\036\214:M\242{@\351m\020\337Cw_\214\216\304\226\265\a\375\031\211\243V\222T\320\016\227\312-\211Sz\326^\346\230\251\327\222\n\373I\032\341\303==U\214\277\264\244\205\b1\346S=\230\215\204\245\254>\312\223\066\336\230PpP\227\271\370\266;\362\226\242\036\225\235w\330\325\061\316{o_\364\021\062\351\376\062|\313\006`\357m\206FQ0\021\030C\224\004]\336\315\371\033h1\361\363\350d\366\066"...,
>>>
>>> size=4096, aOffset=-1, free_func=0x5203b0 <memNodeWriteComplete(void*)>)
>>> at ufs/store_io_ufs.cc:247
>>> #2 0x0000000000554ca0 in doPages (anEntry=<optimized out>) at
>>> store_swapout.cc:160
>>> #3 StoreEntry::swapOut (this=0x372ca10) at store_swapout.cc:279
>>> #4 0x000000000054c986 in StoreEntry::invokeHandlers (this=0x372ca10) at
>>> store_client.cc:714
>>> #5 0x00000000004dc1a7 in FwdState::complete (this=0xbb502b48) at
>>> forward.cc:341
>>> #6 0x00000000005579a5 in ServerStateData::completeForwarding
>>> (this=0xf8030588) at Server.cc:239
>>> #7 0x00000000005571bd in ServerStateData::serverComplete2
>>> (this=0xf8030588) at Server.cc:207
>>> #8 0x00000000004ff3dc in HttpStateData::processReplyBody
>>> (this=0xf8030588) at http.cc:1382
>>> #9 0x00000000004fd367 in HttpStateData::readReply (this=0xf8030588,
>>> io=...) at http.cc:1161
>>> #10 0x0000000000503156 in JobDialer<HttpStateData>::dial
>>> (this=0xde75ca50, call=...) at base/AsyncJobCalls.h:175
>>> #11 0x0000000000569ee4 in AsyncCall::make (this=0xde75ca20) at
>>> AsyncCall.cc:34
>>> #12 0x000000000056cb76 in AsyncCallQueue::fireNext (this=<optimized
>>> out>) at AsyncCallQueue.cc:53
>>> #13 0x000000000056ccf0 in AsyncCallQueue::fire (this=0x2586400) at
>>> AsyncCallQueue.cc:39
>>> #14 0x00000000004d385c in EventLoop::runOnce (this=0x7fffcb3518d0) at
>>> EventLoop.cc:130
>>> #15 0x00000000004d3938 in EventLoop::run (this=0x7fffcb3518d0) at
>>> EventLoop.cc:94
>>> #16 0x000000000051d35b in SquidMain (argc=<optimized out>,
>>> argv=<optimized out>) at main.cc:1418
>>> #17 0x000000000051dd83 in SquidMainSafe (argv=<optimized out>,
>>> argc=<optimized out>) at main.cc:1176
>>> #18 main (argc=<optimized out>, argv=<optimized out>) at main.cc:1168
>>>
>>>
>>> any idea about what's causing the cpu spike
>>>
>>>
>>> On 2014-03-31 16:34, Amos Jeffries wrote:
>>>> On 2014-04-01 02:10, a.afach wrote:
>>>>> Dear Eliezer
>>>>> these are the configure options ...
>>>>> configure options: '--prefix=/usr/local/squid-3.1.19'
>>>>> '--sysconfdir=/etc' '--sysconfdir=/etc/squid' '--localstatedir=/var'
>>>>> '--enable-auth=basic,digest,ntlm' '--enable-removal-policies=lru,heap'
>>>>> '--enable-digest-auth-helpers=password'
>>>>> '--enable-basic-auth-helpers=PAM,getpwnam,NCSA,MSNT'
>>>>> '--enable-external-acl-helpers=ip_user,session,unix_group'
>>>>> '--enable-ntlm-auth-helpers=fakeauth'
>>>>> '--enable-ident-lookups--enable-useragent-log'
>>>>> '--enable-cache-digests' '--enable-delay-pools' '--enable-referer-log'
>>>>> '--enable-arp-acl' '--with-pthreads' '--with-large-files'
>>>>> '--enable-htcp' '--enable-carp' '--enable-follow-x-forwarded-for'
>>>>> '--enable-snmp' '--enable-ssl' '--enable-storeio=ufs,diskd,aufs'
>>>>> '--enable-async-io' '--enable-linux-netfilter' '--enable-epoll'
>>>>> '--with-squid=/usr/squid-3.1.19' '--disable-ipv6' '--with-aio'
>>>>> '--with-aio-threads=128' 'build_alias=x86_64-pc-linux-gnu'
>>>>> 'host_alias=x86_64-pc-linux-gnu' 'CC=x86_64-pc-linux-gnu-gcc'
>>>>> 'CFLAGS=-O2 -pipe -m64 -mtune=generic' 'LDFLAGS=-Wl,-O1
>>>>> -Wl,--as-needed' 'CXXFLAGS=' '--cache-file=/dev/null' '--srcdir=.'
>>>>>
>>>>
>>>> Some more reasons to upgrade:
>>>> * --disable-strict-error-checking avoids issues on Gentoo with -Werror
>>>> * CFLAGS affects the C compiler, not the C++ compiler. C compiler is
>>>> only used by Squid-3 to build some libraries.
>>>> * current verified stable Gentoo Squid version is 3.3.8.
>>>> * updating anything on Gentoo involves rebuilding a surprising number
>>>> of components from scratch. So when you get a difference like this it
>>>> really could be anywhere. Including buried in the compiler itself -
>>>> your flags are possibly changing optimization levels and CPU-specific
>>>> assembly instructions used by it.
>>>>
>>>> Amos
Received on Thu May 08 2014 - 09:22:43 MDT
