Re: [squid-users] squid3.0.25 hogging the CPU, serving little

From: Ralf Hildebrandt <Ralf.Hildebrandt_at_charite.de>
Date: Wed, 11 Aug 2010 15:00:09 +0200

* Amos Jeffries <squid3_at_treenet.co.nz>:
> Ralf Hildebrandt wrote:
> >3.0.STABLE25 is showing the following behaviour during normal operation:
>
> Hi Ralf,
> Thank you for all this, but I'm wondering why you are putting so
> much work into 3.0?

3.1 sucks even more? See my other bug reports! That stuff is crashing
all over the place. Need to get some stability here :)

> I ask because the major performance gains are aimed at 3.2. It could
> do with this type of analysis as part of the polish up.

I COULD run 3.2 if you like.

> Okay, add to that a sudden extreme loss of known clients. And a
> sudden 'instant' drop in memory usage before the growth.
>
> This looks to me like the usual culprit:
> A Squid crash followed by a dirty rebuild of a large cache's index.

Could be

> The behaviour in such a situation is complete non-response on the
> ports for a short period (extreme service times for existing clients,
> they simply get no further traffic and time out).

Yes, but that should only be a problem for the clients.

> Followed by a period of heavy reads as the entire cache_dir gets
> scanned file-by-file for meta data to build the index. Some of which
> will fail as the un-closed files from the previous instance are found.
> Accompanied by heavy writes as the swap.state journal gets rebuilt
> from each of those meta-data reads.
>
> Under heavy client load this extra disk IO can lead to delays
> processing other actions and slower new client service times.

OK, but for such a long time?

> Potentially a huge backlog of buffered in-transit data waiting to be
> stored in the cache. Which can't be written to until the index is
> loaded properly.
>
> This latter can be alleviated by a sufficiently large in-memory
> cache, though older versions did not permit that space to be used
> until after the rebuild either.
>
> >proxy 10426 88.5 17.3 380636 357040 ? R 10:42 110:54 /usr/sbin/squid3 -NsYC
> >
> >% strace -c -p 10426
>
> Over how long a time was the strace taken? Just that 1.6 seconds or
> something longer?

That trace is from about 15 seconds.
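
Since the sampling window matters for reading the numbers, here is a
rough sketch of how a `strace -c` summary could be normalised by the
time the trace ran. It assumes the usual `strace -c` column layout
(% time, seconds, usecs/call, calls, errors, syscall). The function name
`syscall_rates` and the returned field names are made up for
illustration and are not part of strace or Squid.

```python
import re

# One data row of a `strace -c` summary: four numeric columns, an
# optional "errors" column, then the syscall name.
ROW = re.compile(
    r"^\s*(?P<pct>\d+\.\d+)\s+(?P<seconds>\d+\.\d+)\s+(?P<usecs>\d+)\s+"
    r"(?P<calls>\d+)\s+(?:(?P<errors>\d+)\s+)?(?P<name>\w+)\s*$"
)


def syscall_rates(summary_text, window_seconds):
    """Normalise a `strace -c` summary by the length of the trace.

    summary_text   -- the table printed by `strace -c` (hypothetical input)
    window_seconds -- how long strace was attached, e.g. 15.0
    Returns {syscall: {"calls_per_second", "time_fraction", "errors"}}.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")

    rates = {}
    for line in summary_text.splitlines():
        match = ROW.match(line)
        # Header, separator and blank lines do not match; skip the
        # trailing "total" row so it is not mistaken for a syscall.
        if not match or match.group("name") == "total":
            continue
        seconds = float(match.group("seconds"))
        calls = int(match.group("calls"))
        rates[match.group("name")] = {
            "calls_per_second": calls / window_seconds,
            # Share of the window accounted for by this syscall's
            # "seconds" column.
            "time_fraction": seconds / window_seconds,
            "errors": int(match.group("errors") or 0),
        }
    return rates
```

Read this way, a "seconds" total of 1.6 over a window of about 15
seconds would mean that only around a tenth of the window is accounted
for in the syscall table.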

-- 
Ralf Hildebrandt
  Geschäftsbereich IT | Abteilung Netzwerk
  Charité - Universitätsmedizin Berlin
  Campus Benjamin Franklin
  Hindenburgdamm 30 | D-12203 Berlin
  Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
  ralf.hildebrandt@charite.de | http://www.charite.de
Received on Wed Aug 11 2010 - 13:00:20 MDT