Re: [squid-users] Frequent cache rebuilding

From: Chris Robertson <crobertson_at_gci.net>
Date: Thu, 22 Jan 2009 10:02:00 -0900

Andreev Nikita wrote:
>>>> Why does squid eat 100% of processor if the problem is in FS?
>>>>
>
>
>> How is your cache_dir defined? aufs (in general) is a better choice
>> than ufs, diskd might still have some stability issues under load, and
>> coss is a good supplement as a small object cache. Conceivably if Squid
>> is set up with a ufs cache_dir mounted as NFS, it's spending a lot of
>> time in a wait state, blocked while the I/O completes.
>>
>
> For 6 days uptime:
> # vmstat
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 2 0 92 104052 235704 2309956 0 0 3 43 24 33 10 16 73 1 0
>
> As you can see system has spent only 1% of CPU time in I/O wait.
> (cpu-wa column).
>

This also shows the CPU has been 73% idle, but that figure is an
average since boot. If I'm not mistaken, you stated that you have a
two-CPU (core) system, so even with one core pegged at 100% the system
as a whole would still show roughly 50% idle. Running "vmstat 2" while
you are experiencing the load will give more insight.
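As a sketch, you could filter the interval reports down to just the
CPU columns (the column positions assume a Linux procps vmstat with
the layout shown above; adjust if yours differs):

```shell
#!/bin/sh
# vmstat's first data line is an average since boot and hides spikes;
# each later line covers one 2-second interval. The awk filter skips
# the two header lines plus that first since-boot line and keeps the
# CPU columns: us (user), sy (system), id (idle), wa (I/O wait).
vmstat 2 5 | awk 'NR > 3 {print $13, $14, $15, $16}'
```

If wa climbs while the problem is occurring, disk (or NFS) latency is
the bottleneck; if us/sy climb instead, Squid itself is burning the CPU.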

> My cache dir directive looks like:
> cache_dir ufs /var/spool/squid 16384 64 1024
>

Make sure your Squid can support it (check the output of "squid -v" for
aufs) and change this line to...

cache_dir aufs /var/spool/squid 16384 64 1024

...to enable asynchronous cache accesses.
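For example, to confirm aufs support before switching (the version
string below is only illustrative -- run the real "squid -v" on your
own proxy):

```shell
#!/bin/sh
# Hypothetical "squid -v" output; on the proxy itself you would run
#   squid -v | tr ' ' '\n' | grep storeio
# aufs must appear in the --enable-storeio list.
SQUID_V='Squid Cache: Version 2.7.STABLE5
configure options: --enable-storeio=aufs,ufs --enable-async-io'
echo "$SQUID_V" | grep -o 'enable-storeio=[^ ]*'
```

Since aufs uses the same on-disk layout as ufs, the existing cache
should be reusable; restarting Squid after the config change should
suffice, with no need to re-run "squid -z".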

> # vmstat -d
> disk- ------------reads------------ ------------writes----------- -----IO------
> total merged sectors ms total merged sectors ms cur sec
> ram0 0 0 0 0 0 0 0 0 0 0
> ram1 0 0 0 0 0 0 0 0 0 0
> ram2 0 0 0 0 0 0 0 0 0 0
> ram3 0 0 0 0 0 0 0 0 0 0
> ram4 0 0 0 0 0 0 0 0 0 0
> ram5 0 0 0 0 0 0 0 0 0 0
> ram6 0 0 0 0 0 0 0 0 0 0
> ram7 0 0 0 0 0 0 0 0 0 0
> ram8 0 0 0 0 0 0 0 0 0 0
> ram9 0 0 0 0 0 0 0 0 0 0
> ram10 0 0 0 0 0 0 0 0 0 0
> ram11 0 0 0 0 0 0 0 0 0 0
> ram12 0 0 0 0 0 0 0 0 0 0
> ram13 0 0 0 0 0 0 0 0 0 0
> ram14 0 0 0 0 0 0 0 0 0 0
> ram15 0 0 0 0 0 0 0 0 0 0
> sda 50114 8198 1197972 239114 771044 986524 13061742 1616345 0 1239
> sdb 125 1430 2383 100 3 20 184 43 0 0
> sdc 547181 13909 6116481 6209599 2893943 6771249 77505040 42580590 0 8027
> dm-0 6659 0 143594 45401 528574 0 4228592 1248409 0 269
> dm-1 13604 0 408122 82828 883993 0 7071944 3118925 0 677
> dm-2 150 0 1132 387 2 0 10 2 0 0
> dm-3 36240 0 639146 173982 178529 0 1428232 540632 0 229
> dm-4 164 0 1136 610 35 0 76 155 0 0
> dm-5 216 0 1240 817 166439 0 332884 262910 0 185
> hda 0 0 0 0 0 0 0 0 0 0
> fd0 0 0 0 0 0 0 0 0 0 0
> md0 0 0 0 0 0 0 0 0 0 0
>

Right. Unless it's mounted as part of a logical volume, NFS doesn't
show up here.
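One quick way to confirm where the cache directory actually lives (the
path is the one from your cache_dir line):

```shell
#!/bin/sh
# df -T prints the filesystem type backing a path; a Type of "nfs" in
# the second column would confirm the cache_dir is NFS-mounted.
# The default path below is the cache_dir from the config above.
CACHE_DIR=${1:-/var/spool/squid}
df -T "$CACHE_DIR"
```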

> If it's not an I/O wait problem then what can cause squid to use 100%
> of CPU core?

For a 4 Mbit/s circuit on recent hardware, using lots (thousands) of
regex ACLs would do it. But...

> I tried to clear cache but after an hour or so squid
> began to use as much CPU as usual (~100%).
>

...indicates to me that it's cache related. So I think it's either the
cache_dir type you are using (ufs) or the way the cache_dir is mounted
(NFS).
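If you want to rule out the regex-ACL explanation anyway, a rough
count of regex-driven ACL lines is easy to get (the config path is an
assumption -- adjust for your install):

```shell
#!/bin/sh
# Each regex ACL is evaluated per request, so thousands of them can
# pin a CPU core. Count the acl lines that use a *_regex type
# (url_regex, urlpath_regex, dstdom_regex, ...).
CONF=${1:-/etc/squid/squid.conf}
grep -cE '^acl[[:space:]].*_regex' "$CONF"
```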

> I'm not sure but maybe it started after we enlarged our outer link
> from 2Mbps to 4Mbps.
>
> I will try to move squid cache to local disk but squid works in VMware
> Virtual Infrastructure. So if I move any of virtual machine partitions
> from shared to local storage I wouldn't have an ability to move squid
> VM from one HA cluster node to the other ('cause local partitions on
> cluster nodes are different from each other).
>

Then if changing the cache_dir type doesn't help, look into using AoE
(ATA over Ethernet) or iSCSI. Both present the shared storage as a
block device, so you keep the ability to migrate the VM between
cluster nodes without NFS's file-level overhead.

> Regards,
> Nikita.

Chris
Received on Thu Jan 22 2009 - 19:00:02 MST

This archive was generated by hypermail 2.2.0 : Mon Jan 26 2009 - 12:00:02 MST