RE: [squid-users] squid: Memory utilization higher than expected since moving from 3.3 to 3.4 and Vary: working

From: Martin Sperl <Martin.Sperl_at_amdocs.com>
Date: Fri, 18 Jul 2014 07:14:10 +0000

Hi Alex!

Well, as for data: I have posted the raw data as an Excel sheet, including graphs, to the ticket: http://bugs.squid-cache.org/show_bug.cgi?id=4084
The tab "ChartKBto100%" in particular shows the wavy pattern of the memory pools (graphed as % of the last measured value).
The tab "ChartKBto100%Differencial" shows the same data, but graphed as the delta to the previous hour.

And finally there is "ChartInternal Data Structures", which graphs the behavior of the four basic data structures.
 
The wavy patterns I have mentioned are visible in each of those graphs and follow our traffic curve.
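
For anyone looking at the raw numbers rather than the workbook, the two views are easy to recompute - a minimal Python sketch, assuming one KB sample per hour for a given pool (the variable names and sample values below are mine, not from the spreadsheet):

    # hourly_kb: one KB measurement per hour for a single memory pool (example values only)
    hourly_kb = [120000, 131500, 140200, 152800]

    # "ChartKBto100%" view: each sample as a percentage of the last measured value
    pct_of_last = [100.0 * kb / hourly_kb[-1] for kb in hourly_kb]

    # "ChartKBto100%Differencial" view: delta to the previous hour
    hourly_delta = [b - a for a, b in zip(hourly_kb, hourly_kb[1:])]

    print(pct_of_last)
    print(hourly_delta)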

Well - as for the theories:
If it is "underestimating the real memory-footprint of a mem_object", then:
a reduction of the memory cache size to 2GB should show a leveling of
If it is instead wasted memory per request, then decreasing the memory settings would not have an impact.
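
(For completeness, the concrete test I mean is just capping the memory cache for a test run and watching whether process size levels off - a minimal squid.conf excerpt, with 2048 MB only as the test value discussed above, not a tuning recommendation:)

    # test run: cap the in-memory cache at 2 GB
    cache_mem 2048 MB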

Here are some more stats on the request distribution with regard to cache hit/miss status and HTTP status code for a 20-hour period (a sketch of how such a breakdown can be pulled from access.log follows the table):
Count CacheStatus-StatusCode
11846509 total requests
6425676 TCP_MEM_HIT-200
1462630 TCP_MISS-200
1460093 TAG_NONE-302
1378134 TCP_IMS_HIT-304
390021 TCP_MISS-302
244197 TCP_REFRESH_MODIFIED-200
101942 TCP_NEGATIVE_HIT-404
88726 TAG_NONE-400
69045 TCP_REFRESH_UNMODIFIED-200
33006 TCP_MISS_ABORTED-0
30973 TAG_NONE_ABORTED-0
26063 TCP_MISS-304
23619 TCP_MISS-301
20107 TCP_DENIED_REPLY-302
17303 TCP_MISS-404
17267 TCP_MISS_ABORTED-200
16727 TCP_DENIED-302
8215 TCP_MEM_HIT-404
7921 TCP_HIT-200
7478 TCP_MEM_HIT_ABORTED-200
4909 TCP_MISS-303
3914 TCP_MISS-206
2746 TCP_DENIED-403
1448 TCP_MEM_HIT-206
777 TAG_NONE-405
670 TCP_MISS_TIMEDOUT-200
448 TCP_MISS_ABORTED-206
390 TCP_MISS-500
310 TCP_HIT_ABORTED-0
240 TCP_DENIED_REPLY-307
232 TCP_MEM_HIT_TIMEDOUT-200
169 TCP_MISS-502
146 TCP_MISS-503
139 TCP_DENIED_ABORTED-302
132 TCP_REFRESH_MODIFIED-302
69 TCP_NEGATIVE_HIT-500
62 TCP_REFRESH_UNMODIFIED-206
60 TCP_NEGATIVE_HIT-502
54 TCP_REFRESH_MODIFIED-206
50 TCP_REFRESH_UNMODIFIED_ABORTED-200
44 TCP_MISS-403
44 TAG_NONE_ABORTED-400
42 TCP_REFRESH_FAIL_ERR-500
36 TCP_REFRESH_MODIFIED_ABORTED-200
35 TCP_DENIED_ABORTED-403
33 TCP_NEGATIVE_HIT-403
30 TCP_DENIED-307
21 TCP_CLIENT_REFRESH_MISS-200
19 TCP_MISS_ABORTED-302
16 TCP_MISS-400
8 TCP_MISS-504
8 TCP_IMS_HIT_ABORTED-304
8 TCP_HIT_ABORTED-200
5 TCP_MISS-416
5 TCP_MEM_HIT_ABORTED-206
4 TCP_REFRESH_MODIFIED-404
4 TCP_REFRESH_FAIL_OLD-200
4 TCP_NEGATIVE_HIT_ABORTED-404
4 TCP_NEGATIVE_HIT-503
4 TCP_MISS_TIMEDOUT-206
4 TCP_MISS-204
4 TCP_MEM_HIT-500
4 TAG_NONE_TIMEDOUT-400
3 TCP_NEGATIVE_HIT-400
2 TCP_REFRESH_UNMODIFIED_TIMEDOUT-200
2 TCP_HIT_TIMEDOUT-200
1 TCP_SWAPFAIL_MISS-200
1 TCP_REFRESH_MODIFIED_TIMEDOUT-200
1 TCP_MISS_ABORTED-404
1 TCP_DENIED_REPLY_ABORTED-302
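
(For reference, a minimal sketch of producing such a breakdown from access.log - assuming the default native log format, where the fourth field is CacheStatus/StatusCode; the field index needs adjusting for a custom logformat:)

    from collections import Counter

    counts = Counter()
    with open("access.log") as log:
        for line in log:
            fields = line.split()
            if len(fields) < 4:
                continue
            # fourth field looks like "TCP_MEM_HIT/200"; report it as "TCP_MEM_HIT-200"
            counts[fields[3].replace("/", "-")] += 1

    for status, count in counts.most_common():
        print(count, status)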

I will try to do some more analysis including the latest data and share the results as well - this time also correlating TPS...
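
(The TPS side of that is easy to get from access.log as well - a minimal sketch, assuming the usual epoch timestamp in the first field; the per-hour counts can then be lined up against the hourly pool measurements:)

    import time
    from collections import Counter

    # requests per hour, derived from the epoch timestamp in field 1 of access.log
    per_hour = Counter()
    with open("access.log") as log:
        for line in log:
            try:
                ts = float(line.split()[0])
            except (IndexError, ValueError):
                continue
            per_hour[time.strftime("%Y-%m-%d %H:00", time.localtime(ts))] += 1

    for hour in sorted(per_hour):
        # raw request count for the hour, plus the average TPS over that hour
        print(hour, per_hour[hour], round(per_hour[hour] / 3600.0, 2))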

Martin

> -----Original Message-----
> From: Alex Rousskov [mailto:rousskov_at_measurement-factory.com]
> Sent: Donnerstag, 17. Juli 2014 23:03
> To: squid-users_at_squid-cache.org
> Cc: Martin Sperl
> Subject: Re: [squid-users] squid: Memory utilization higher than expected
> since moving from 3.3 to 3.4 and Vary: working
>
> On 07/17/2014 02:49 AM, Martin Sperl wrote:
> > This is why I have been mentioning all the pools that show similar (wavy)
> > memory increase pattern. There must be one of those that is the root of
> > all the others.
>
> Unfortunately, I do not share your certainty. As I see it, at least two
> theories more-or-less fit your data:
>
> A) MemObjects are leaking.
>
> B) Your memory cache is growing, explaining some or all of the pools
> growth and muddying the waters along the way. Something unrelated to
> MemObjects is leaking. That something may or may not be pooled.
>
> My bet is on (B).
>
> If you do not mind reminding me, does accounted-for memory pools growth
> correspond to the actual/total Squid memory growth? Or is the "pooled"
> vs "total/real" gap widening with time?
>
>
> > the strange thing is that if you look at the
> > distribution of vm_objects, then you see that they have expired long ago
> > (16267 days ago to be exact, so with EX:-1 - 42511 exactly).
> > If these have been expired, then why would they get loaded into memory?
>
> because you have free cache space and Squid can still serve that cached
> data (after revalidation)? In general, there is no benefit from purging
> something from a non-full cache, especially if that something is reusable.
>
> I have not checked the code and responses producing those "16267 days"
> stats so I treat that number with a grain of salt. :-)
>
>
> > Well: SUM(inmem_hi) is the memory for the payload (possibly without headers)
> > against which we compare cache_mem.
> >
> > But if the squid process consumes 9GB, then there must be more than a
> > factor of 2 of overhead so that we get to those 9GB.
>
> Yes if (A) above is correct. If (B) above is correct, that "overhead" is
> a leak of something other than memory cache entries.
>
>
> > Report-file(with Date+Time) psmemory MemObjects kb/MemObject
> > report-20140708-234225.txt 1795932 123979 6.03
> ...
> > report-20140717-000001.txt 10662148 845404 11.37
>
> > Which shows that the average size/memObject is increasing constantly!
>
> Yes, and that is one of the reasons I bet on (B). (B) is a lot less
> surprising than a constantly growing MemObject overhead that (A) has to
> explain :-). Running with a small cache size (i.e., with a full cache)
> would confirm that.
>
>
>
> > Another point here more on the reporting/debugging side:
> > Total space in arena: -2034044 KB
>
> Yes, this is a known problem mostly outside of Squid control:
> http://www.squid-cache.org/mail-archive/squid-users/201402/0296.html
>
> Squid should probably stop using that API [when it can detect that it is
> broken].
>
>
> And yes, I know that my response does not really help you. I just wanted
> to make sure you consider both (A) and (B) theories in your investigation.
>
>
> Alex.
> P.S. Another suggestion that you probably cannot use in your environment
> is to run trunk or a trunk-based branch. Perhaps the leak you are after
> has been fixed. And if not, somebody may be more motivated to help you
> find it in trunk.

