Re: [squid-users] Antwort: Re: [squid-users] Memory and CPU usage squid-3.1.4

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 15 Jun 2010 23:31:40 +1200

Martin.Pichlmaier_at_continental-corporation.com wrote:
> Amos Jeffries <squid3_at_treenet.co.nz> wrote on 15.06.2010 10:48:33:
>
>> Martin.Pichlmaier_at_continental-corporation.com wrote:
>>> Hello list,
>>>
>>> I have a question regarding memory and CPU usage change from 3.0 to
> 3.1.
>>> I have 4 forwards proxies with ICAP (c-icap and clamav), NTLMv2
>>> authentication, all four proxies each have about 200-400 req/sec
>>> on RedHat AS5 64bit servers with each 16GB mem for about 15k to 30k
> users.
>>> cache_mem is set to 3.5 GB, no disk cache, some ACLs are used.
>>> With 3.0.STABLE23 the typical memory usage was about 7 GB and CPU
> usage of
>>> about 30-40% for the main squid process, without the helpers or other
>>> processes.
>>> With 3.1.4 the proxies use up to 2GB more memory and CPU usage rose at
>
>>> times to 60% or 80%.
>>> After a restart of one of the proxies it behaved "normally" since
> then.
>>> Status: (the CPU usage, Client HTTP out and requests per second are
> all
>>> from last 5 minute-info)
>>>
>>> Proxy1: 36% CPU 7.2 GB mem 3.3 MB/s client out 341
> req/s
>>> Version 3.1.4
>>> Proxy2: 77% CPU 8.6 GB mem 2.6 MB/s client out 384
> req/s
>>> Version 3.1.4 <-----
>>> Proxy3: 53% CPU 8.8 GB mem 6.8 MB/s client out 402
> req/s
>>> Version 3.1.4
>>> Proxy4: 32% CPU 6.8 GB mem 2.7 MB/s client out 405
> req/s
>>> Version 3.0.STABLE23
>>>
>>> Configuration of the proxies is the same, with only the changes for
> 3.1,
>>> mainly icap config.
>>>
>>> Is there some kind of memory leak that additionally causes massive CPU
>
>>> usage, could it be load related or is this normal behaviour?
>>>
>>> Martin
>> With an explicit cache_mem there should be no difference between the
> two.
>> Maybe; ICAP needs to buffer traffic twice as much as normal. One buffer
>> queue to ICAP server, one to client.
>>
>> Maybe; with HTTP/1.1 being advertised now, if you have ignore_expect_100
>
>> turned on you can see the number of waiting clients rise. These are
>> active but 'hung' connections which waste more resources until the
>> client times out and continues.
>>
>> Maybe; There are some known leaks in 3.0/3.1 auth. But that does not
>> account for the extra CPU unless it contributes to making the box swap
>> memory pages.
>>
>> In general 3.1.4 has a lot of memory fixes than 3.0. Which are supposed
>> to cause less resource waste. Earlier 3.1.x default memory features were
>
>> broken on 64-bit. Check if turning memory pools on/off has any good
> effect.
>> Amos
>> --
>> Please be using
>> Current Stable Squid 2.7.STABLE9 or 3.1.4
>
>
> Hi Amos,
>
> thanks for your reply and suggestions!
>
>
> ICAP needs about two times of memory, that is true but was already the
> same with 3.0.
> ignore_expect_100 is not configured, therefore the default (turned off)
> is used and about the same amount of ESTABLISHED and TIME_WAIT is seen
> on 3.0 and 3.1, and only few other connection states, therefore I
> "think" there is no problem of hanging connections.
> Auth - hmm I have no idea to trace mem leaks here but at least the box
> itself does not swap.
> I am now checking with mem pools off on one of the proxies and report
> later whether it changes anything.
>
> But I have found two things to mention:
> First squid is slowly needing more memory over time. In 3 hours the
> "Process Data Segment Size via sbrk():" rose about 600MB. I will check
> whether it will continue rising so fast and whether it changes with
> memory_pool off.
>
> The other thing is that with 3.1.4 I now have many
> "ctx: enter level 1234: '<lost>'" messages in cache.log.
> ctx messages were also with 3.0 but normally I had
> "ctx enter level 0: 'some text'" immediately followed by
> "ctx exit level 0".
> With all three 3.1.4 proxies I see many "ctx enter level <rising number>"
> without or only very few "ctx exit level" messages. This is new with
> 3.1.4 -- 3.1.3 here behaved like 3.0.
> Here is a snip from the cache.log:
>
> 2010/06/15 11:36:09| ctx: enter level 2007: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2008: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2009: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2010: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2011: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2012: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2013: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2014: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2015: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2016: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2017: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2018: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2019: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2020: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2021: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2022: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2023: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2024: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2025: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2026: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2027: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2028: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2029: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2030: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2031: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2032: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2033: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2034: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2035: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2036: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2037: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2038: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2039: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2040: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2041: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2042: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2043: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2044: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2045: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2046: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2047: '<lost>'
> 2010/06/15 11:36:09| ctx: enter level 2048: '<lost>'
> 2010/06/15 11:36:09| # ctx: suspiciously deep (2048) nesting:
> 2010/06/15 11:37:06| ctx: exit level 2048
> 2010/06/15 11:37:06| ctx: enter level 2048: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2049: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2050: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2051: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2052: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2053: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2054: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2055: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2056: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2057: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2058: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2059: '<lost>'
> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> 2010/06/15 11:37:06| ctx: exit level 2060
> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> 2010/06/15 11:37:06| ctx: exit level 2060
> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> 2010/06/15 11:37:06| ctx: exit level 2060
> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> 2010/06/15 11:37:06| ctx: exit level 2060
> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> X:}0/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {
> 2010/06/15 11:40:56| ctx: exit level 2060
> 2010/06/15 11:40:56| ctx: enter level 2060: '<lost>'
> 2010/06/15 11:40:56| ctx: enter level 2061: '<lost>'
>

Ouch. We've been wondering about these ctx loops. It is not something to
be terribly worried about, but can cause some "weird stuff" (yes that is
probably the best explanation).

Thanks to your reminder, I've just had another look and found one more
in header processing. Hopefully that was it.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.4
Received on Tue Jun 15 2010 - 13:26:10 MDT

This archive was generated by hypermail 2.2.0 : Thu Jun 17 2010 - 12:00:03 MDT