Re: [squid-users] squid: Memory utilization higher than expected since moving from 3.3 to 3.4 and Vary: working

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 16 Jul 2014 18:49:22 -0600

On 07/14/2014 05:36 AM, Martin Sperl wrote:

> * Pools that increase a lot (starting at below 20% of the current KB 2 days ago) - which are (sorted from biggest to smallest KB footprint):
> ** mem_node
> ** 4K Buffer
> ** Short Strings
> ** HttpHeaderEntry
> ** 2K Buffer
> ** 16K Buffer
> ** 8K Buffer
> ** Http Reply
> ** Mem Object
> ** Medium Strings
> ** cbdata BodyPipe (39)
> ** HttpHdrCc
> ** cbdata MemBuff(13)
> ** 32K Buffer
> ** Long Strings

> So there must be something that links all of those in the last group together.

MemObject structures contain or tie together most (possibly all) of the
above objects. MemObjects are used for current transactions and for
non-shared memory cache storage. The ones used for non-shared memory
cache storage are called "hot objects". However, some current
transactions might affect the "hot objects" counters as well, I guess.
These stats are messy and imprecise.

Please note that every MemObject must have a StoreEntry, but a
StoreEntry may lack a MemObject. When working with large caches, most
of the StoreEntries without a MemObject would correspond to on-disk
objects that are _not_ also cached in memory.

The above is more complex for SMP-aware caches which, I think, you are
not using.
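
FWIW, those counters can be collected with a one-liner along these
lines (just a sketch: it assumes squidclient can reach the proxy on
its default port, and the log path is made up):

  # append both StoreEntries counters to a timestamped log, once per day
  squidclient mgr:info | grep 'StoreEntries' \
    | sed "s/^/$(date +%Y%m%d-%H%M%S): /" >> /var/log/squid/store-counters.log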

> So here the values of "StoreEntries" for the last few days:
> 20140709-020001: 1472007 StoreEntries
> 20140710-020001: 1475545 StoreEntries
> 20140711-020001: 1478025 StoreEntries
> 20140712-020001: 1480771 StoreEntries
> 20140713-020001: 1481721 StoreEntries
> 20140714-020001: 1482608 StoreEntries
> These stayed almost constant...

OK, the total number of unique cache entry keys (among memory and disk
caches) is not growing much.

> But looking at "StoreEntries with MemObjects" the picture is totally different.
> 20140709-020001: 128542 StoreEntries with MemObjects
> 20140710-020001: 275923 StoreEntries with MemObjects
> 20140711-020001: 387990 StoreEntries with MemObjects
> 20140712-020001: 489994 StoreEntries with MemObjects
> 20140713-020001: 571872 StoreEntries with MemObjects
> 20140714-020001: 651560 StoreEntries with MemObjects

OK, your memory cache is filling, possibly from swapped-in disk entries
(so that the total number of keys does not grow much)?

FWIW, the "StoreEntries with" part of the label is misleading. These are
just MemObjects. However, that distinction is only important if
MemObjects are leaking separately from StoreEntries.

> So if you look at the finer details and traffic patterns, we again see the same pattern for:
> * storeEntries with MemObjects
> * Hot Object Cache Items

Which are both about MemObjects.

> And these show similar behavior to the pools mentioned above.

Yes, the "StoreEntries with MemObjects" counter is just the MemObject
pool counter.
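
You can double-check that by comparing the two reports directly (a
sketch; the exact pool label may differ between Squid versions). The
two counts should match:

  squidclient mgr:info | grep 'StoreEntries with MemObjects'
  squidclient mgr:mem  | grep -E 'Mem ?Object'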

> If I sum up the "inmem_hi:" values I get: 2918369522, so 2.9GB.
>
> So it seems as if there must be some major overhead for those inmem objects...

How do you calculate the overhead? 2.9GB is useful payload, not
overhead. Are you comparing 2.9GB with your total Squid memory footprint
of about 9GB?
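
For the record, that sum can be recomputed straight from the in-memory
object report with something like this (a sketch; it assumes
mgr:vm_objects prints one "inmem_hi:" field per in-memory entry):

  squidclient mgr:vm_objects | awk '/inmem_hi:/ { sum += $2 } END { print sum }'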

> So the question is: why do we underestimate memory_object sizes by a
> factor of approximately 2?

Sorry, you lost me here. What do you mean by "memory_object sizes",
where do we estimate them, and x2 compared to what?

Please note that the above comments and questions are _not_ meant to
indicate that there is no leak or that your analysis is flawed! I am
just trying to understand if you have found a leak or still need to keep
looking [elsewhere].

Are you willing to run Squid with a tiny memory cache (e.g., 100MB) for
a while? This would remove the natural memory cache growth as a variable...
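
Something like this in squid.conf should be enough (a minimal sketch;
the exact values are arbitrary):

  # shrink the memory cache so its natural growth cannot mask a leak
  cache_mem 100 MB
  # optionally, also cap the size of objects admitted to memory
  maximum_object_size_in_memory 512 KB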

Thank you,

Alex.

> -----Original Message-----
> From: Martin Sperl
> Sent: Friday, 11 July 2014 09:06
> To: Amos Jeffries; squid-users_at_squid-cache.org
> Subject: RE: [squid-users] squid: Memory utilization higher than expected since moving from 3.3 to 3.4 and Vary: working
>
> The basic connection stats are in the mgr:info:
> File descriptor usage for squid:
> Maximum number of file descriptors: 65536
> Largest file desc currently in use: 1351
> Number of file desc currently in use: 249
> Files queued for open: 0
> Available number of file descriptors: 65287
> Reserved number of file descriptors: 100
> Store Disk files open: 0
>
> Also: our load balancer will disconnect idle connections after some time, and I believe the config has similar settings...
>
> Will send you the hourly details since the restart in a personal email due to size limits of the mailing list.
>
> Here the current size of the process:
> squid 15022 9.5 29.6 4951452 4838272 ? Sl Jul08 317:01 (squid-1) -f /opt/cw/squid/squid.conf
>
> Martin
>
> -----Original Message-----
> From: Amos Jeffries [mailto:squid3_at_treenet.co.nz]
> Sent: Friday, 11 July 2014 05:24
> To: Martin Sperl; squid-users_at_squid-cache.org
> Subject: Re: [squid-users] squid: Memory utilization higher than expected since moving from 3.3 to 3.4 and Vary: working
>
> On 8/07/2014 10:20 p.m., Martin Sperl wrote:
>> The problem is that it is a "slow" leak - it takes some time (months) to find it...
>> Also, it only happens on real live traffic with high volume plus high utilization of "Vary:".
>> Moving our prod environment to head would be quite a political issue inside our organization.
>> Arguing to go to the latest stable version 3.4.6 would be possible, but I doubt it would change a thing.
>>
>> In the meantime we have not restarted the squids yet, so we still got a bit of information available if needed.
>> But we cannot keep it up in this state much longer.
>>
>> I created a core-dump, but analyzing that is hard.
>>
>> Here the top strings from that 10GB core-file - taken via: strings corefile | sort | uniq -c | sort -rn | head -20.
>> This may give you some idea:
>> 2071897 =0.7
>> 1353960 Keep-Alive
>> 1343528 image/gif
>> 877129 HTTP/1.1 200 OK
>> 855949 GMT
>> 852122 Content-Type
>> 851706 HTTP/
>> 851371 Date
>> 850485 Server
>> 848027 IEND
>> 821956 Content-Length
>> 776359 Content-Type: image/gif
>> 768935 Cache-Control
>> 760741 ETag
>> 743341 live
>> 720255 Connection
>> 677920 Connection: Keep-Alive
>> 676108 Last-Modified
>> 662765 Expires
>> 585139 X-Powered-By: Servlet/2.4 JSP/2.0
>>
>> Another thing I thought we could do is:
>> * restart squids
>> * run mgr:mem every day and compare the daily changes for all the values (maybe others?)
>>
>> Any other ideas on how to "find" the issue?
>>
>
> Possibly a list of the open file descriptors (mgr:filedescriptors) will
> show whether there are any hung connections/transactions, or long-polling
> connections holding onto state.
>
>
> Do you have the mgr:mem reports over the last few days? I can start
> analysing to see if anything else pops out at me.
>
> Amos
Received on Thu Jul 17 2014 - 00:49:51 MDT
