Re: [squid-users] Squid 3.3 is very aggressive with memory

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Sun, 12 Jan 2014 21:26:06 -0700

On 12/22/2013 12:39 AM, Nathan Hoad wrote:
> On Wed, Dec 18, 2013 at 4:54 PM, Alex Rousskov
> <rousskov_at_measurement-factory.com> wrote:
>> On 12/16/2013 10:24 PM, Nathan Hoad wrote:
>>
>>> While running under this configuration, I've confirmed that memory
>>> usage does go up when active, and stays at that level when inactive,
>>> allowing some time for timeouts and whatnot. I'm currently switching
>>> between the two instances every fifteen minutes.
>>>
>>> Here is a link to the memory graph for the entire running time of the
>>> second process, at 1 minute intervals:
>>> http://getoffmalawn.com/static/mem-graph.png. The graph shows memory
>>> use steadily increasing during activity, but remaining reasonably
>>> stable during inactivity.
>>
>> I agree that this looks like a memory leak, but (in general) it could
>> also be some kind of memory pooling or cache entry accumulation.
>>
>>
>>> Where shall we go from here?
>>
>>
>> I recommend the following next steps:
>>
>> 1. Set "memory_pools off".
>>
>> 2. Disable all caching with "cache deny all".
>>
>> Do you see a similar memory growth pattern after the above two steps?
>
> I do see a similar pattern, although slowed, which makes sense given
> the directives that I've added, so it would appear that it's not
> related to pooling or caching. Memory usage still reaches a point
> where I have to kill everything to prevent the system from being
> OOM'd. I'm happy to go in the other direction and raise the size of
> the memory pools, if that could be useful.
>
>>
>> * If yes: Time for valgrind or ALL,9 debugging. I can help you make that
>> choice if needed. You can actually do those things now, without doing
>> steps 1 and 2 first, but valgrind and log analysis take time so if we
>> can avoid it by eliminating false positives and/or simplifying the
>> setup, we should do that first...
>
> I have got an ALL,9 log

Hi Nathan,

    Sorry, it takes a while for me to resume working on a project that
was suspended for a while. I finally looked at your log. I wrote a
script that finds all "alive" objects in the log that follow a few
common construction/destruction logging patterns. That script did find
many "alive" objects in your log, but since the log is partial (Squid
did not quit at the end) that alone is normal.
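For reference, the core of that pass can be sketched in a few lines of
awk. This is a hypothetical simplification: it assumes allocations are
logged as "cbdataAlloc: 0xADDR" and frees as "cbdataFree: 0xADDR"
(I have not confirmed the free-side pattern here; the real script
handles several construction/destruction patterns):

```shell
# Sketch: pair "cbdataAlloc: 0xADDR" lines with matching
# "cbdataFree: 0xADDR" lines; whatever is never freed is still "alive".
# (A real cache.log uses several such patterns; this handles only one.)
awk '
  /cbdataAlloc: 0x/ { born[$NF] = NR }   # remember the creation line
  /cbdataFree: 0x/  { delete born[$NF] } # object destroyed; forget it
  END { for (a in born) print "alive:", a, "(created at line " born[a] ")" }
' cache.log
```

As the text above notes, a partial log makes "alive" ambiguous: any
object created near the end of the log will look leaked even if Squid
would have destroyed it at shutdown.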

Nevertheless, I did look through the list of long-lived alive objects in
the hope of finding good suspects with a common pattern. The vast majority of
them are associated with the external ACL cache. For example, object
0xb2c2a8a0 has lived for 13 minutes and was created here:

> 2013/12/20 09:43:03.675 kid1| external_acl.cc(1276) external_acl_cache_add: external_acl_cache_add: Adding '10.12.10.216 http://www.... 80 -' = ALLOWED
> 2013/12/20 09:43:03.675 kid1| cbdata.cc(324) cbdataInternalAlloc: cbdataAlloc: 0xb2c2a8a0

I assume these long-lived cache entries are legitimate and not leaking,
but I cannot really tell based on a partial log.

Since you already suspect SSL inspection, the leaking objects may be
specific to SSL state and not closely tracked by Squid debugging(**),
making valgrind your best bet (if you can make it work).

> Running valgrind produces repeated, spurious errors - claims that the
> debugs() macro has a mismatched free() / delete / delete [] on each
> call, which naturally gets a little noisy. If this is a known issue
> and not too much of a problem, I'm happy to continue doing it, though.
> What parameters should I run it with? I am not too experienced with
> Valgrind.

I have not seen valgrind complain about debugs(). Could be
platform-specific though.

You should use "ALL,1" and ./configure --with-valgrind-debug when
running with valgrind. And a short test should be enough (if we assume
nearly every transaction leaks a little). Is that what you were doing?
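Concretely, the build step looks something like this (a sketch; keep
whatever ./configure options you normally use, and adjust paths for
your installation):

```shell
# Rebuild with valgrind hooks so Squid's memory pools do not hide
# allocations from valgrind's bookkeeping:
./configure --with-valgrind-debug   # plus your usual options
make && make install

# In squid.conf, keep debugging light while running under valgrind:
#   debug_options ALL,1
```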

Please note that Squid should exit cleanly, without pending transactions
getting killed, for the valgrind report to be easier to digest. If you
cannot deprive Squid of traffic before stopping the Squid process, a
"dirty" valgrind report may still be useful.
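For the clean-exit part, stopping the test traffic and then asking the
running Squid to shut down gracefully should be enough:

```shell
# Request a graceful shutdown; Squid finishes (or times out) pending
# transactions before exiting, so valgrind's final leak report is not
# polluted by still-in-flight objects.
squid -k shutdown
```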

Here are the parameters you may find useful:

  valgrind -v \
    --trace-children=yes \
    --num-callers=30 \
    --log-file=valgrind-%p.log \
    --leak-check=full \
    --suppressions=valgrind.supp

My suppression list (attached) is incomplete and outdated, but should
not hurt.
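For completeness, entries in a suppression file look roughly like this
(a made-up example; real entries are best generated by re-running
valgrind with --gen-suppressions=all and pasting the blocks it prints):

```
{
   squid_example_leak
   Memcheck:Leak
   fun:malloc
   ...
   fun:xmalloc
}
```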

HTH,

Alex.
(**) I assume the debugging logs you uploaded did have bumped
transactions, but I have not checked.

Received on Mon Jan 13 2014 - 04:26:13 MST

This archive was generated by hypermail 2.2.0 : Mon Jan 13 2014 - 12:00:05 MST