Re: [squid-users] "Quadruple" memory usage with squid

From: Marcus Kool <marcus.kool_at_urlfilterdb.com>
Date: Wed, 25 Nov 2009 14:18:54 -0200

Linda Messerschmidt wrote:
> On Wed, Nov 25, 2009 at 7:43 AM, Marcus Kool
> <marcus.kool_at_urlfilterdb.com> wrote:
>> The result of the test with vm.pmap.pg_ps_enabled set to 1
>> is ... different than what I expected.
>> The values of vm.pmap.pde.p_failures and vm.pmap.pde.demotions
>> indicate that the page daemon has problems creating and
>> maintaining superpages. Maybe the load is too high
>> to create superpages or the algorithm of the page daemon
>> needs to be tuned.
>
> Well the load on that machine is 0.09. :-)

I meant 'memory load', i.e. total system memory usage.
The FreeBSD list may have an explanation for why there are
superpage demotions before we expect them (when there are no forks
and no big demand for memory).
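
To see when the demotions happen, the counters can be sampled in a
loop; a simple sh loop like this (using the sysctl names from your
output) shows whether they keep climbing:

   # sample the superpage counters every 10 seconds
   while true; do
       sysctl vm.pmap.pde.promotions vm.pmap.pde.p_failures \
           vm.pmap.pde.demotions
       sleep 10
   done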

>> But one last try from me: The machine has 24 GB and Squid has
>> 19 GB. I guess that on the first fork the OS demotes many
>> superpages because it needs to map the child process to
>> virtual memory and superpages cannot be swapped and therefore
>> will be demoted. The second fork demotes more superpages...
>> To make the first fork fast, Squid must use
>> less than 10 GB, so that Squid and its child fit within
>> physical memory.
>
> The demotions occur prior to the fork; I was able to watch both
> counters increment during the day yesterday.

Strange indeed. I think the heuristic for demoting superpages
is too aggressive.

> But what you are saying is where we are now: assign no more than
> 25% of system memory to cache_mem and accept that 50% is wasted at
> all times except during log rotation.
>
> I did make one change last night which appears to have made a big
> difference. I noticed in the memory report I posted:
>
> Idle pool limit: 5.00 MB
>
> We did not have such a value anywhere in our config file; according to
> the documentation the default is unlimited. So I don't know where
> that value came from; maybe a documentation tweak is appropriate. But
> in light of it, I added:
>
> memory_pools_limit 10 GB
>
> And restarted squid. The memory wastage is the same (still 50%
> missing with no explanation) but the pmap values are quite different:
>
> vm.pmap.pmap_collect_active: 0
> vm.pmap.pmap_collect_inactive: 0
> vm.pmap.pv_entry_spare: 10202
> vm.pmap.pv_entry_allocs: 203328446
> vm.pmap.pv_entry_frees: 203268592
> vm.pmap.pc_chunk_tryfail: 0
> vm.pmap.pc_chunk_frees: 1231037
> vm.pmap.pc_chunk_allocs: 1231454
> vm.pmap.pc_chunk_count: 417
> vm.pmap.pv_entry_count: 59854 <---------
> vm.pmap.pde.promotions: 16718 <----------
> vm.pmap.pde.p_failures: 247130
> vm.pmap.pde.mappings: 0
> vm.pmap.pde.demotions: 12403 <-----------
> vm.pmap.shpgperproc: 200
> vm.pmap.pv_entry_max: 7330186
> vm.pmap.pg_ps_enabled: 1
>
> The memory usage of squid at this time is only 7544M, but the
> whole-system ratio of VSZ in KB to pv_entry_count is way up to 160
> (from 4.55 before), and the number of superpages is much higher, at
> (16718 - 12403) = 4315. Also, I
> cannot watch "demotions" increase in real time as before; it has been
> sitting at 12403 all morning and that is barely higher than the 11968
> reading last night well before the change.
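
A quick sanity check, assuming the usual 2 MB superpage size on amd64:

   (16718 - 12403) superpages x 2 MB  =  4315 x 2 MB  ~=  8.4 GB

That is in the same ballpark as the 7544M the process uses, so it
looks like most of Squid's memory is now mapped with superpages.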
>
> So we will leave it running until rotation tonight to see if this has
> any effect. One can hope it makes squid fork 40x faster. :)
>
>> There are alternative solutions to the problem:
>> 1. redesign the URL rewriter into a multithreaded application that
>> accepts multiple requests from Squid simultaneously (use
>> url_rewrite_concurrency in squid.conf)
>> This way there will be only one child process and only one fork on
>> 'squid -k rotate'
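
For the record, the squid.conf side of option 1 is small (the rewriter
path here is just an example):

   url_rewrite_program /usr/local/bin/rewriter
   url_rewrite_children 1
   url_rewrite_concurrency 100

With concurrency enabled, Squid prefixes every request line with a
channel ID, and the rewriter must echo that ID back in front of its
answer, so a single process can have many requests in flight.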
>
> As with multithreading in general, that would entail a lot of work to
> turn a very simple, easy-to-verify, reliable program into a very
> complex, hard-to-verify, error-prone one. It's something we've talked
> about doing, but there is not much enthusiasm for doing it to
> something so important.
>
>> 2. redesign the URL rewriter so that it rereads
>> its configuration/database on receipt of a signal, and
>> send this signal every 24 hours instead of doing the 'squid -k rotate'.
>> This way squid does not fork.
>> (maybe you want a 'squid -k rotate' once per week though).
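
For illustration, the reload trigger could be as simple as a daily
cron job that signals the rewriter processes (the process name
'url_rewriter' is hypothetical):

   # crontab: ask all rewriters to reload their database at 04:00
   0 4 * * *    pkill -HUP url_rewriter

The rewriter would install a SIGHUP handler that rereads its
configuration without exiting, so Squid never notices.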
>
> We don't want/need the URL rewriter to *ever* stop; it has no need
> to. The -k rotate is solely to keep on top of the gigabytes of logs
> Squid generates every day.
>
>> 4. use fewer URL rewriters. You might get an occasional
>> 'not enough rewriters' warning from Squid, in which case the
>> redirector might be bypassed (use url_rewrite_bypass).
>> Depending on what exactly the URL rewriter does, this
>> might be acceptable.
>> This way Squid does fewer forks.
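
In squid.conf terms, option 4 is just something like:

   url_rewrite_children 8
   url_rewrite_bypass on

with the caveat that a bypassed request is not rewritten at all.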

option 5 (multi-CPU systems only):
use 2 instances of Squid:
1. a front-end with a null cache, a small cache_mem (e.g. 100 MB),
   the 16 URL rewriters, and instance 2 configured as its Squid parent
2. a parent with a null cache and a HUGE cache_mem

Both Squid processes will rotate/restart fast.
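
A rough sketch of the two squid.conf files (the ports, paths and
sizes are only examples):

   # instance 1: front-end, small memory, runs the rewriters
   http_port 3128
   cache_mem 100 MB
   cache_dir null /tmp        # assuming the null store, as you use now
   url_rewrite_program /usr/local/bin/rewriter
   url_rewrite_children 16
   cache_peer 127.0.0.1 parent 3129 0 no-query no-digest default
   never_direct allow all     # force everything through the parent

   # instance 2: parent, holds the big memory cache
   http_port 3129
   cache_mem 19 GB
   cache_dir null /tmp

On 'squid -k rotate' only the small front-end forks its 16 rewriters,
so the copy-on-write cost of forking the 19 GB process disappears.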

> We cannot bypass the rewriters, and we have already cut them from 48 to 16.
>
> In the ordinary case, one rewriter can already handle it: they take on
> average 150us to respond so one rewriter can handle about 6600
> requests per second and the average workload is only 3000 - 10000
> requests per minute. But there are sometimes non-average cases so we
> would not want to serialize incoming requests, and we may get slammed
> or abused with very high request rates during which we need the
> extras.
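
The arithmetic checks out:

   1 request / 150 us   =  1 / 0.000150 s  ~=  6,667 requests/s
   10,000 requests/min  ~=  167 requests/s average

so a single rewriter has roughly 40x headroom over the average rate;
the extra children really are only for bursts.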
>
> Provided the superpages now have a positive effect, the only thing
> left to do will be to get to the bottom of the memory usage situation.
>
> Does everybody, especially on other platforms (e.g. Linux, Solaris),
> see this same behavior, where the process VSZ is double what squid
> can account for?

I have never used Squid with a null cache, but the extra memory beyond
what the objects account for seems high.

Marcus

> Thanks!
>
>
Received on Wed Nov 25 2009 - 16:19:04 MST
