Re: Re: [squid-users] Antwort: Re: [squid-users] Memory and CPU usage squid-3.1.4

From: <Martin.Pichlmaier_at_continental-corporation.com>
Date: Thu, 1 Jul 2010 10:45:39 +0200

Hello list,

I just wanted to post the results with valgrind.
Unfortunately the memcheck tool needs so much CPU that I could not put a
high load on the squid; the maximum was only about 5-10 req/s.

# ./squid -v
Squid Cache: Version 3.1.3
configure options: '--prefix=/appl' '--localstate=/var'
'--with-filedescriptors=16384' '--enable-storeio=aufs'
'--enable-auth=ntlm,basic' '--enable-external-acl-helpers=wbinfo_group'
'--enable-icap-client' --enable-ltdl-convenience

also recompiled and tried with:
# squid -v
Squid Cache: Version 3.1.3
configure options: '--prefix=/appl' '--localstate=/var'
'--with-filedescriptors=16384' '--enable-storeio=aufs'
'--enable-auth=ntlm,basic' '--enable-external-acl-helpers=wbinfo_group'
'--enable-icap-client' '--with-valgrind-debug' 'CFLAGS=-g -O2'
--enable-ltdl-convenience

I ran valgrind repeatedly with: "valgrind --leak-check=yes -v squid -N &"
and found:

==24141== 3,311,957 bytes in 3,784 blocks are definitely lost in loss
record 26 of 27
==24141== at 0x4A05809: malloc (vg_replace_malloc.c:149)
==24141== by 0x5ABAA7: xmalloc (util.c:508)
==24141== by 0x5AA35A: rfc1035MessageUnpack (rfc1035.c:433)
==24141== by 0x4B15A7: idnsGrokReply(char const*, unsigned long)
(dns_internal.cc:939)
==24141== by 0x4B22F0: idnsRead(int, void*) (dns_internal.cc:1178)
==24141== by 0x4AC154: comm_select (comm_epoll.cc:308)
==24141== by 0x5455AC: CommSelectEngine::checkEvents(int)
(comm.cc:2682)
==24141== by 0x4B712D: EventLoop::checkEngine(AsyncEngine*, bool)
(EventLoop.cc:51)
==24141== by 0x4B7282: EventLoop::runOnce() (EventLoop.cc:125)
==24141== by 0x4B7377: EventLoop::run() (EventLoop.cc:95)
==24141== by 0x4FB36C: SquidMain(int, char**) (main.cc:1379)
==24141== by 0x4FB975: main (main.cc:1141)

I looked a bit at the source code but did not really find what could cause
this.
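
To make clear what I was looking for: a trace like this usually means some
exit path in the caller skips the matching free of the unpacked message.
Below is a tiny, self-contained sketch of that shape -- illustration only,
not squid code; unpack()/destroy() are made-up stand-ins for
rfc1035MessageUnpack() and whatever frees the message again in rfc1035.c.

/* Illustration only, NOT squid code: the leak shape suggested by the
 * valgrind report above.  unpack() heap-allocates a message and an
 * early return in the caller skips the matching destroy(). */
#include <cstdlib>
#include <cstring>

struct message {                        /* stand-in for rfc1035_message */
    char *payload;
};

static message *unpack(const char *buf, std::size_t sz)
{
    message *m = static_cast<message *>(std::malloc(sizeof(message)));
    m->payload = static_cast<char *>(std::malloc(sz));
    std::memcpy(m->payload, buf, sz);
    return m;
}

static void destroy(message *m)
{
    std::free(m->payload);
    std::free(m);
}

static void handle_reply(const char *buf, std::size_t sz, bool query_known)
{
    message *m = unpack(buf, sz);
    if (!query_known)
        return;                         /* LEAK: skips destroy(m) */
    /* ... process answers ... */
    destroy(m);
}

int main()
{
    handle_reply("reply", 5, false);    /* run under valgrind to see the lost block */
    return 0;
}

If the real leak has this shape, the fix would simply be to free the message
on every return path of idnsGrokReply(), but I could not spot such a path
myself.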

Sometimes DNS did not seem to lose memory, but I found this instead:

==29780== 987,870 (987,046 direct, 824 indirect) bytes in 1,321 blocks are
definitely lost in loss record 27 of 28
==29780== at 0x4A05809: malloc (vg_replace_malloc.c:149)
==29780== by 0x5ABAA7: xmalloc (util.c:508)
==29780== by 0x5ABBAB: xstrdup (util.c:756)
==29780== by 0x4B3E15: errorTryLoadText(char const*, char const*, bool)
(errorpage.cc:313)
==29780== by 0x4B494F: ErrorState::BuildContent() (errorpage.cc:1007)
==29780== by 0x4B551D: ErrorState::BuildHttpReply() (errorpage.cc:881)
==29780== by 0x4B58E5: errorAppendEntry (errorpage.cc:432)
==29780== by 0x51D656: store_client::callback(long, bool)
(store_client.cc:164)
==29780== by 0x51DA2F: store_client::scheduleMemRead()
(store_client.cc:448)
==29780== by 0x51E567: storeClientCopy2(StoreEntry*, store_client*)
(store_client.cc:331)
==29780== by 0x51E8D3: store_client::copy(StoreEntry*, StoreIOBuffer,
void (*)(void*, StoreIOBuffer), void*) (store_client.cc:264)
==29780== by 0x4A0D0E: clientReplyContext::doGetMoreData()
(client_side_reply.cc:1675)
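
My guess here is that the xstrdup()'ed error page template is kept (or
dropped) somewhere without a matching free. Again only a self-contained
sketch of that shape -- illustration with made-up names, not squid code:

/* Illustration only, NOT squid code: an object keeps an xstrdup()-style
 * copy of an error page template; if that member is never freed again,
 * every generated error page leaks one copy, which would match the
 * "1,321 blocks definitely lost" above. */
#include <cstdlib>
#include <cstring>

static char *dup_text(const char *s)    /* stand-in for xstrdup() */
{
    char *copy = static_cast<char *>(std::malloc(std::strlen(s) + 1));
    std::strcpy(copy, s);
    return copy;
}

class ErrorPage {                       /* stand-in for ErrorState */
public:
    ErrorPage() : content(0) {}
    ~ErrorPage() {
        /* LEAK: forgetting std::free(content) here loses one block
         * per generated error page */
    }
    void buildContent(const char *tmpl) {
        content = dup_text(tmpl);       /* copy kept by the object */
    }
private:
    char *content;
};

int main()
{
    for (int i = 0; i < 1321; ++i) {
        ErrorPage e;
        e.buildContent("<html>ERR_DNS_FAIL</html>");
    }                                   /* valgrind: 1321 blocks lost */
    return 0;
}

Whether the copy should be freed right after building the reply or in the
destructor I cannot tell from the code; this is only meant to show the
pattern I suspect.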

When running valgrind with 3.0.STABLE23 I did not find similar lost blocks,
only a few KB lost during initialization, but 3.1 loses a few KB at that
point as well.

I monitored a squid 3.0.STABLE25 and a squid 3.1.3/3.1.4 over a longer
period and found that both need more memory over time, but 3.0 eventually
stops growing. 3.1 continues to grow until CPU usage rises to nearly 100%;
only then does the memory consumption seem to stop.

Does anyone have an idea where the problem could be?

Martin

Marcus Kool <marcus.kool_at_urlfilterdb.com> wrote on 17.06.2010 16:15:09:

> Martin,
>
> Valgrind is a memory leak detection tool.
> You need some developer skills to run it.
>
> If you have a test environment with low load you may want
> to give it a try.
> - download the squid sources
> - run configure with CFLAGS="-g -O2"
> - run squid with valgrind
> - wait
> - kill squid with a TERM signal and look at the valgrind log file
>
> Valgrind uses a lot of memory for its own administration and has
> a lot of CPU overhead, so reduce cache_mem to a small value like 32MB.
>
> Most likely you will see many memory leaks because
> Squid does not free everything when it exits. This is normal.
> You need to look at the repeated memory leaks, i.e. the leaks that
> occur often, and file a bug report.
>
> Please do not post the whole valgrind output to this list.
>
> Marcus
>
>
>
> Martin.Pichlmaier_at_continental-corporation.com wrote:
> > Hello,
> >
> > I just wanted to report back the last tests:
> >
> > After the memory cache is filled to 100% the squid (3.1.4 or 3.1.3)
> > still needs more memory over time when under load, about 1-2 GB a day.
> > memory_pools off did not change anything; the process size still rises.
> > The high CPU usage seems to start when the process grows over a certain
> > size, but I am not sure about that.
> > Example memory consumption of squid-3.1.4:
> > from 8.0 GB (4pm) to 8.4 GB (7pm) to 8.5 GB (4am next day) to 9.4 GB
> > (2pm).
> > At night there is low load on the squid, maybe 20-50 req/s.
> > 3.1.3 behaves the same, so it does not seem to be related to the
> > "ctx enter/exit level" topic discussed in the last mails.
> >
> > I am now reverting the proxies back to 3.0.STABLE25 but will keep one
> > proxy
> > on 3.1.4 for testing.
> > Probably something in my setup causes squid to consume too much
> > memory.
> >
> > Amos, do you have another idea what might cause this and where to
> > look, for example which debug level? I can do some tests and have the
> > possibility to slowly put this proxy under load and take it out of the
> > productive environment afterwards...
> >
> > Regards,
> > Martin
> >
> >
> >
> > Amos Jeffries <squid3_at_treenet.co.nz> wrote on 15.06.2010 13:31:40:
> >
> > <snip>
> >>> I am now checking with mem pools off on one of the proxies and will
> >>> report later whether it changes anything.
> > <snip>
> >>> 2010/06/15 11:37:06| ctx: enter level 2059: '<lost>'
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> >>> 2010/06/15 11:40:56| ctx: exit level 2060
> >>> 2010/06/15 11:40:56| ctx: enter level 2060: '<lost>'
> >>> 2010/06/15 11:40:56| ctx: enter level 2061: '<lost>'
> >>>
> >> Ouch. We've been wondering about these ctx loops. It is not something to
> >> be terribly worried about, but can cause some "weird stuff" (yes that is
> >> probably the best explanation).
> >>
> >> Thanks to your reminder, I've just had another look and found one more
> >> in header processing. Hopefully that was it.
> >>
> >> Amos
> >> --
> >> Please be using
> >> Current Stable Squid 2.7.STABLE9 or 3.1.4
> >
> >
> >
Received on Thu Jul 01 2010 - 08:46:06 MDT
