Re: [squid-users] squid 3.1.10 page allocation failure. order:1, mode:0x20 from inittab on 2013-08-20 (squid-users)

From: inittab <inispam_at_gmail.com>
Date: Tue, 20 Aug 2013 09:31:31 -0400

Hello,

Thanks for the suggestions.
I've moved up to squid version 3.3.5, changed the raid5 into a raid0,
tweaked the value of cache_dir to 100000, moved the acl manager lines,
removed hierarchy_stoplist, and enabled memory_pools.

I have also added RPS monitoring to our Cacti instance so I can get a
better idea when I give this a shot again.

When running multiple processes of squid to deal with multiple (slow)
cores, how many processes do you recommend running? 1 per core or
less? I currently have 10 processes set up on 12 cores but do not know
if this is the correct way to go about it.

Thanks,

Richard

On Mon, Aug 19, 2013 at 3:27 AM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
> On 17/08/2013 6:45 a.m., inittab wrote:
>>
>> Hello,
>>
>> I wanted to get some suggestions on my current setup and ask if I'm
>> expecting too much out of my hardware for the traffic load.
>
>
> Sorry for the slow reply.
>
> NOTE: If you determine that it is a memory leak, please upgrade to the
> current Squid-3.3 or later versions. There are a few dozen leaks in 3.1 and
> 3.2 series of various sizes which have been fixed. Not everybody is hitting
> them due to specific behaviour causing each one, but you may be.
>
>
>
>> It appears I am running into out-of-memory problems and hitting swap;
>> the squid processes then end up dying.
>> [root@squid01 squid]# dmesg | grep "page allocation"
>> swapper: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> kswapd0: page allocation failure. order:1, mode:0x20
>> squid: page allocation failure. order:1, mode:0x20
>>
>>
>>
>> I currently have 2 Dell 2950s running squid 3.1.10; we generally see
>> ~200Mbps total.
>
>
> The number of HTTP requests/second is the most relevant traffic speed
> metric for Squid.
>
> FYI: 200Mbps of traffic could be coming from 1 single HTTPS / CONNECT
> request per day, or from a million IMS requests. The effect on and by Squid
> CPU and memory is drastically different for each of those cases and varies
> greatly for all permutations in between.
>
> Each request requires some KB of buffer memory. Compare 1 request/day with a
> million requests/day and you can see where the relevance starts to appear
> for your particular problem.
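
A rough way to get the requests/second figure without waiting for graphs is to count log lines per second. The sketch below is only an illustration: it assumes Squid's native access.log format, where the first field is a UNIX timestamp with milliseconds, and the function name is made up for this example. The configuration posted in this thread has emulate_httpd_log on, so its logs would need a different timestamp parser.

```python
from collections import Counter


def peak_requests_per_second(log_lines):
    """Return (peak_rps, average_rps) for an iterable of access.log lines.

    Assumption: the first whitespace-separated field of each line is a
    UNIX timestamp such as 1376989891.123 (Squid native log format).
    """
    per_second = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            second = int(float(fields[0]))
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        per_second[second] += 1

    if not per_second:
        return 0, 0.0

    # Average over the whole observed window, including idle seconds.
    span = max(per_second) - min(per_second) + 1
    return max(per_second.values()), sum(per_second.values()) / span


if __name__ == "__main__":
    sample = [
        "1376989891.123 45 10.96.0.1 TCP_MISS/200 1024 GET http://example.com/ - DIRECT/192.0.2.1 text/html",
        "1376989891.900 12 10.96.0.2 TCP_HIT/200 512 GET http://example.com/a - NONE/- text/css",
        "1376989892.001 30 10.96.0.3 TCP_MISS/200 2048 GET http://example.com/b - DIRECT/192.0.2.1 text/html",
    ]
    print(peak_requests_per_second(sample))
```

Run per proxy instance, the peak value is the R that matters for buffer memory, since each instance buffers only its own active requests.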
>
>
>> box stats are:
>> 2x Six-Core AMD Opteron(tm) Processor 2427 @ 2.2GHz
>> 32GB RAM
>> 1x Intel E1G44HTBLK Server Adapter I340-T4 all 4 ports bonded with 802.3ad
>> /var/spool/squid 512G raid5
>
>
> Ah. RAID. Well, there are some more disk I/O overheads you could possibly
> avoid:
> http://wiki.squid-cache.org/SquidFaq/RAID
>
> Keep in mind that the cache data is effectively a local _backup_ of data
> elsewhere. It is non-critical. The only benefit you gain from RAID is
> advance warning about disk failures and some time to correct them without
> Squid crashing.
>
>
>> The boxes are both running 10 squid processes on different ports in
>> transparent mode
>> I am using iptables rules to redirect traffic to the different squid ports
>> ex:
>> 22M 1351M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3120
>> 20M 1216M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3121
>> 18M 1094M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3122
>> 16M 985M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3123
>> 15M 886M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3124
>> 13M 798M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3125
>> 12M 718M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3126
>> 11M 647M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3127
>> 9631K 582M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3128
>> 8668K 524M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3129
>>
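
The counters in the listing above shrink by roughly 10% per rule (22M, 20M, 18M, ...). That is what one would expect if each rule matches 10% of whatever the earlier rules left over, rather than 10% of the total. The sketch below works through that arithmetic. It assumes the rules are evaluated in order and that the listing shows the whole chain; if there is a catch-all rule after these ten that was not posted, the unmatched share would land there instead. The function names are invented for this example.

```python
def cascade_shares(probabilities):
    """Share of new connections matched by each rule in a sequential chain.

    Each rule only sees traffic that every earlier rule declined, so rule k
    receives (share left over so far) * (its own probability).
    Returns (list of per-rule shares, share matched by no rule).
    """
    remaining = 1.0
    shares = []
    for p in probabilities:
        shares.append(remaining * p)
        remaining *= 1.0 - p
    return shares, remaining


def uniform_probabilities(n):
    """Per-rule probabilities that give each of n sequential rules 1/n of the total.

    Rule i (counting from 0) must match 1/(n - i) of what reaches it; the
    last rule ends up at 1.0, i.e. it matches everything that is left.
    """
    return [1.0 / (n - i) for i in range(n)]


if __name__ == "__main__":
    shares, unmatched = cascade_shares([0.1] * 10)
    print("as posted:", [round(s, 3) for s in shares], "unmatched:", round(unmatched, 3))

    even = uniform_probabilities(10)
    shares, unmatched = cascade_shares(even)
    print("probabilities for an even split:", [round(p, 4) for p in even])
    print("resulting shares:", [round(s, 3) for s in shares], "unmatched:", round(unmatched, 3))
```

With ten rules at 0.1 each, the first instance takes 10% of new connections, the last only about 3.9%, and about 35% match none of the rules. Probabilities of 1/10, 1/9, 1/8, ... down to 1 would spread the load evenly under the same assumptions.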
>> sysctl.conf:
>> net.ipv4.ip_forward = 0
>> net.ipv4.conf.default.rp_filter = 1
>> net.ipv4.conf.default.accept_source_route = 0
>> kernel.sysrq = 0
>> kernel.core_uses_pid = 1
>> net.ipv4.tcp_syncookies = 1
>> net.bridge.bridge-nf-call-ip6tables = 0
>> net.bridge.bridge-nf-call-iptables = 0
>> net.bridge.bridge-nf-call-arptables = 0
>> kernel.msgmnb = 65536
>> kernel.msgmax = 65536
>> kernel.shmmax = 68719476736
>> kernel.shmall = 4294967296
>> net.netfilter.nf_conntrack_max = 196608
>>
>>
>> example squid config file: squid-p3120.conf
>> acl adminnet src 10.3.25.0/24
>> acl proxyvlan src 10.5.22.0/24
>> acl SSL_ports port 443
>> acl Safe_ports port 80 # http
>> acl Safe_ports port 21 # ftp
>> acl Safe_ports port 443 # https
>> acl Safe_ports port 70 # gopher
>> acl Safe_ports port 210 # wais
>> acl Safe_ports port 1025-65535 # unregistered ports
>> acl Safe_ports port 280 # http-mgmt
>> acl Safe_ports port 488 # gss-http
>> acl Safe_ports port 591 # filemaker
>> acl Safe_ports port 777 # multiling http
>> acl CONNECT method CONNECT
>> http_access allow manager localhost
>> http_access allow manager adminnet
>> http_access allow manager proxyvlan
>> http_access deny manager
>
>
> For high speed Squid-3.2 or later I am recommending that people at least
> place the manager ACL tests down ...
>
>
>> http_access deny !Safe_ports
>> http_access deny CONNECT !SSL_ports
>> http_access deny to_localhost
>
> ... here, so that the faster port rejections can protect better against some
> DDoS issues. ("manager" has become a regex test.)
>
>
>> http_access allow localhost
>> http_access allow customers
>
>
> NP: given the above ACLs are all allow, you could in this proxy even move
> the manager allow lines down to here. They will be far rarer than your
> normal client traffic I think.
>
>
>> http_access deny all
>> hierarchy_stoplist cgi-bin ?
>
> You can simplify the config by removing hierarchy_stoplist.
>
>
>> coredump_dir /var/spool/squid/p3120
>> refresh_pattern ^ftp: 1440 20% 10080
>> refresh_pattern ^gopher: 1440 0% 1440
>> refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
>> refresh_pattern . 0 20% 4320
>> hosts_file /etc/hosts
>> dns_nameservers 10.5.7.13 10.5.7.23
>> cache_replacement_policy heap LFUDA
>> cache_swap_low 90
>> cache_swap_high 95
>> maximum_object_size_in_memory 96 KB
>> maximum_object_size 100 MB
>> cache_dir aufs /var/spool/squid/p3120 204800 16 256
>> cache_mem 100 MB
>> logfile_rotate 10
>> memory_pools off
>
>
> It does vary between installations, but memory_pools can reduce a lot of
> memory allocator overhead when it is enabled.
>
>
>> quick_abort_min 0 KB
>> quick_abort_max 0 KB
>> log_icp_queries off
>> client_db off
>> buffered_logs on
>> half_closed_clients off
>> url_rewrite_children 20
>> pid_filename /var/run/squid-p3120.pid
>> unique_hostname squid01-p3120.eng.XXXXXX
>> visible_hostname squid.eng.XXXXXXX
>> icp_port 3100
>> tcp_outgoing_address 10.5.22.101
>> emulate_httpd_log on
>>
>>
>>
>> Anyone have any suggestions on whether or not I'm doing something
>> terribly wrong here or missing some kind of performance tuning?
>
>
> Your memory requirements in MB of RAM per proxy are:
> 100 (cache_mem) + 15*0.1 (cache_mem index) + 15*205 (cache_dir index) +
> 0.25 * R (active request buffers)
>
> I note that this is already 3.1GB just for the index values. So 10 proxies
> will leave only ~1GB of RAM for the operating system, other processes, and
> Squid's active request buffering.
>
> I suggest dropping the cache_dir size to 100000 and measuring the RAM usage
> on the box to see how much you can increase it back up.
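
The per-proxy estimate quoted above can be written as a small calculation, which makes it easier to try different cache_dir sizes. This is only a sketch of the rule of thumb from this message: it reads the factor 15 as MB of index per GB of cache, treats 1 GB as 1000 MB to match the rounding in the message (204800 MB is roughly 205 GB), and takes R to be the number of concurrently active requests. That reading of R and the function name are assumptions, not taken from Squid documentation.

```python
def squid_ram_estimate_mb(cache_mem_mb, cache_dir_mb, active_requests=0):
    """Rough RAM needed by one Squid instance, in MB.

    Terms follow the formula in the message above:
      cache_mem itself
      + 15 MB of index per GB of cache_mem
      + 15 MB of index per GB of cache_dir
      + 0.25 MB per active request (assumed meaning of R)
    """
    index_mb_per_gb = 15.0
    cache_mem_index = index_mb_per_gb * cache_mem_mb / 1000.0
    cache_dir_index = index_mb_per_gb * cache_dir_mb / 1000.0
    request_buffers = 0.25 * active_requests
    return cache_mem_mb + cache_mem_index + cache_dir_index + request_buffers


if __name__ == "__main__":
    proxies = 10
    for cache_dir_mb in (204800, 100000):
        per_proxy = squid_ram_estimate_mb(100, cache_dir_mb)
        print(cache_dir_mb, "MB cache_dir:", per_proxy, "MB per proxy,",
              per_proxy * proxies, "MB for", proxies, "proxies (before request buffers)")
```

With the original cache_dir of 204800 this gives about 3.1GB per proxy before any request buffers, so ten proxies nearly fill a 32GB box. At 100000 it is about 1.6GB per proxy, which leaves roughly half the RAM free for request buffers and the operating system.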
>
> Amos
Received on Tue Aug 20 2013 - 13:31:38 MDT