Re: [squid-users] high load issues

From: Luis Daniel Lucio Quiroz <luis.daniel.lucio_at_gmail.com>
Date: Wed, 10 Feb 2010 12:15:05 -0600

On Wednesday 10 February 2010 11:41:29, Justin Lintz wrote:
> We're seeing the symptoms across 4 servers on different hardware.
> What would be the reason for adjusting the cache_swap_high to 96?
> Thanks
>
> - Justin Lintz
>
>
>
> On Wed, Feb 10, 2010 at 11:45 AM, Luis Daniel Lucio Quiroz
>
> <luis.daniel.lucio_at_gmail.com> wrote:
> > On Wednesday 10 February 2010 10:36:40, Justin Lintz wrote:
> >> Squid ver: squid-2.6.STABLE21-3
> >> The server is a xen virtual with 6GB of ram available to it.
> >>
> >> relevant lines in Squid.conf:
> >>
> >> hierarchy_stoplist cgi-bin ?
> >> acl apache rep_header Server ^Apache
> >> broken_vary_encoding allow apache
> >> cache_mem 4096 MB
> >> maximum_object_size 8192 KB
> >> maximum_object_size_in_memory 4096 KB
> >> cache_swap_low 95
> >> cache_swap_high 96
> >> cache_dir aufs /www/apps/squid/var/cache 4096 16 256
> >> logformat combined %>a %ui %un [%tl] "%rm %ru HTTP/%rv" %Hs %<st
> >> "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh %tr
> >> access_log /www/logs/squid/access.log combined
> >> cache_log /www/logs/squid/cache.log
> >> cache_store_log /www/logs/squid/store.log
> >> debug_options ALL,1 33,2
> >> refresh_pattern ^ftp: 1440 20% 10080
> >> refresh_pattern ^gopher: 1440 0% 1440
> >> refresh_pattern . 0 20% 4320
> >> negative_ttl 0
> >> collapsed_forwarding on
> >> refresh_stale_hit 5 seconds
> >> half_closed_clients off
> >> acl all src 0.0.0.0/0.0.0.0
> >> acl manager proto cache_object
> >> acl localhost src 127.0.0.1/255.255.255.255
> >> acl to_localhost dst 127.0.0.0/8
> >> acl SSL_ports port 443
> >> acl Safe_ports port 80 # http
> >> acl Safe_ports port 21 # ftp
> >> acl Safe_ports port 443 # https
> >> acl Safe_ports port 70 # gopher
> >> acl Safe_ports port 210 # wais
> >> acl Safe_ports port 1025-65535 # unregistered ports
> >> acl Safe_ports port 280 # http-mgmt
> >> acl Safe_ports port 488 # gss-http
> >> acl Safe_ports port 591 # filemaker
> >> acl Safe_ports port 777 # multiling http
> >> acl CONNECT method CONNECT
> >> acl PURGE method PURGE
> >> http_access allow manager localhost
> >> http_access deny manager
> >> http_access deny PURGE
> >> http_access allow localhost
> >> http_access allow all
> >> http_reply_access allow all
> >> icp_access allow all
> >> httpd_suppress_version_string on
> >> cachemgr_passwd none config
> >> error_directory /www/apps/squid/errors
> >> coredump_dir /var/spool/squid
> >> minimum_expiry_time 15 seconds
> >> max_filedesc 8192
> >>
> >> Symptoms:
> >> - High load avg on box ranging from 6-10 during traffic hours
> >> - CPU iowait time during those times will be between 20-50%
> >> - SO_FAIL status codes seen in store.log
> >> - MaintainSwapSpace is continually running under a second. This
> >> appears to be normal though looking at our dev and stage squid setups
> >> which have no load.
> >> - From squidaio_counts, seeing the Queue spike upwards to 200 or
> >> more. I saw a mention in the O'Reilly book that if this number is
> >> greater than 5x the # of IO threads, then squid is overworked.
> >> - Cache_dir storage size is constantly at the cache_swap_low value
> >> (94%). Does this mean squid is continually garbage collecting and
> >> possibly causing the high IO? Originally we had the number at 90, but
> >> after reading some threads, adjusted the number to 94 for the low and
> >> 95 for the high hoping to reduce IO with smaller amount of data being
> >> garbage collected. This change didn't have any impact
> >> - Saw a couple of warnings in cache.log saying
> >> "squidaio_queue_request: WARNING - Disk I/O overloading"
> >> - High number of create.select_fail events in store_io screen in the
> >> cache manager. Seeing this number at 12% of the total IO calls.
> >>
> >> From reading around the list of people with similar issues, I see one
> >> suggestion we will implement next will be configuring a second
> >> cache_dir to increase the number of threads available for IO.
> >>
> >> I wanted to know if you had any other suggestions for tweaks that
> >> could be made that would hopefully alleviate the load on the box.
> >>
> >> A couple of other tweaks we have currently implemented are putting the
> >> noatime option on the partition where the cache is stored and using
> >> tcmalloc in place of gnu malloc.
> >>
> >> I saw a recommendation of changing the store_dir_select_algorithm to
> >> round-robin but from reading this
> >> http://www.squid-cache.org/mail-archive/squid-users/200011/0794.html
> >> it sounded like the change would increase the response times.
> >>
> >>
> >>
> >>
> >> - Justin Lintz
> >
> > Change your
> > cache_swap_high 96
> >
> > to something higher, 98 could be.
> > look for hardware errors
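
To put numbers on that, here is a rough sketch of what the two watermarks
mean in megabytes for the 4096 MB cache_dir quoted above. The function name
is made up for illustration, and it only does the percentage arithmetic; it
does not model how Squid schedules object removal internally.

```python
def watermark_gap_mb(cache_dir_mb, swap_low_pct, swap_high_pct):
    """Return (low_mb, high_mb, gap_mb) for a cache_dir of the given size.

    Hypothetical helper: plain percentage arithmetic on the configured
    cache_dir size, nothing Squid-specific.
    """
    low_mb = cache_dir_mb * swap_low_pct / 100.0
    high_mb = cache_dir_mb * swap_high_pct / 100.0
    return low_mb, high_mb, high_mb - low_mb


# Compare the current setting (95/96) with the suggested one (95/98).
for high in (96, 98):
    low_mb, high_mb, gap_mb = watermark_gap_mb(4096, 95, high)
    print("low=95 high=%d: %.1f MB to %.1f MB, gap %.1f MB"
          % (high, low_mb, high_mb, gap_mb))
```

With 95/96 the gap between the marks is about 41 MB; with 95/98 it is about
123 MB.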

Please don't top-post.

We run several heavily loaded Squids, and we noticed that browsing is
sometimes slow. We found that the cause is disk I/O (you can see it in top
as more than 1% I/O wait), so we purge our cache to keep it from reaching the
cache_swap_high percentage very often.
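
About the squidaio_counts queue you mentioned: the rule of thumb you quoted
from the O'Reilly book (a queue longer than 5x the number of I/O threads
means Squid is overworked) can be sketched like this. The thread count of 16
is only an assumption for a single aufs cache_dir, so check what your build
actually uses; the function name is made up for illustration.

```python
def aio_queue_overloaded(queue_len, io_threads, factor=5):
    """Apply the quoted rule of thumb: queue > factor * threads.

    Hypothetical helper; io_threads must come from your own Squid build.
    """
    return queue_len > factor * io_threads


# Queue of 200 as reported, 16 threads assumed: threshold is 5 * 16 = 80.
print(aio_queue_overloaded(200, 16))
```

By that rule a queue of 200 is well past the threshold for the assumed
thread count.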
Received on Wed Feb 10 2010 - 18:15:13 MST
