Re: [squid-users] Squid sporadic performance - goes OFF every 5-10 minutes

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 21 May 2009 14:48:10 +1200 (NZST)

>
> Hello everyone,
>
> I'm trying to implement Squid project at our non-protfit organization
> (Squid
> + Squid Guard) for web content filtering. I use Squid 3.0.
>
> Because main focus is content filtering I disabled Squid's file caching
> feature. I use Pentium 2 PC with 1 GB of RAM. Because caching is OFF I
> think
> that is pretty capable machine for this task.
>
> The worst problem that I faced is very sporadic performance: users would
> have internet access, then after 5 minutes internet would be not
> accessible,
> then after another 10 minutes it would come back again... Users were
> really
> annoyed so I had to quit all my experiments... :)
>
> From the browser error page it was clear that squid was not working
> properly...
>
> First I thought that the described problem was caused bu SquidGuard, but
> after disabling it, problem didn't go away... So only Squid is left...
>
> For me it looks like problems with DNS resolution, but I'm pretty new with
> Squid and with Linux, so I'm just guessing here.
>
> Attached are config files for Squid and SquidGuard and medley of squid
> logs
> with my remarks.
>
> If you will have time I would be very thakful for your help. Also it may
> be
> something that other readers of your article may face.
>
> Thank you!
>
> Here is short exerpt from logs:
>
> access.log
>
> Here client could not connect:
>
> 1234909886.006 59821 10.127.35.1 TCP_MISS/503 0 CONNECT
> urs.microsoft.com:443 - DIRECT/- -
<snip>
> 1234910314.927 121758 10.127.35.19 TCP_MISS/503 2212 GET
> http://dss2.siteadvisor.com/DSS/Ping? - DIRECT/dss2.siteadvisor.com
> text/html

Note the "503" result status. This means squid was unable to connect to
the requested server.

>
> After 5 mins client could connect:
>

<snip>
> 1234910598.412 11689 10.127.35.1 TCP_MISS/000 0 GET http://google.com/ -
> DIRECT/google.com -
> 1234910720.238 121823 10.127.35.1 TCP_MISS/503 2749 GET
> http://www.google.com/ - DIRECT/www.google.com text/html
> 1234910918.519 223 10.127.35.19 TCP_MISS/301 589 GET
> http://www.bbc.co.uk/hindi - DIRECT/212.58.251.197 text/html
> 1234910919.071 537 10.127.35.19 TCP_MISS/200 74824 GET
> http://www.bbc.co.uk/hindi/ - NONE/- text/html

5 min is suspiciously alike the default negative_ttl on old proxies. This
would cause squid to cache the 5xx or 4xx error pages and send them to
clients for a minimum of 5 minutes.

Check that setting is "0 seconds" in your config. Older squid need it
explicitly set to that or they use the bad default.

>
>
> store.log
>

This is a log of what gets stored. It is not useful and slows things down.
Particularly useless if, as you said, you are not storing/caching stuff
long-term.

Set it to "none" in the config.

>
> cache.log
>
> Here client could not connect:
>
<snip intro info from another request>

<next request comes in>
> 2009/02/17 16:38:35.501| fd_open FD 20 HTTP Request
> 2009/02/17 16:38:35.501| fdc_t::acceptOne accepted: FD 15 handler:
> 0x80785d0
> newfd: 20
> 2009/02/17 16:38:35.501| comm_accept: FD 15 handler: 0x80785d0
> 2009/02/17 16:38:35.501| httpAccept: FD 20: accepted
> 2009/02/17 16:38:35.501| comm_add_close_handler: FD 20, handler=1,
> data=0x8598728
> 2009/02/17 16:38:35.501| commSetTimeout: FD 20 timeout 300
> 2009/02/17 16:38:35.501| aclCheckFast: list: 0x8324d40
> 2009/02/17 16:38:35.501| ACLChecklist::preCheck: 0xbfda3854 checking
> 'ident_lookup_access deny all'
> 2009/02/17 16:38:35.501| ACLList::matches: checking all
> 2009/02/17 16:38:35.501| ACL::checklistMatches: checking 'all'
> 2009/02/17 16:38:35.501| aclMatchIp: '10.127.35.19' found
> 2009/02/17 16:38:35.502| ACL::ChecklistMatches: result for 'all' is 1
> 2009/02/17 16:38:35.502| ACLList::matches: result is true
> 2009/02/17 16:38:35.502| aclmatchAclList: 0xbfda3854 returning true (AND
> list satisfied)
> 2009/02/17 16:38:35.502| ACLChecklist::markFinished: 0xbfda3854 checklist
> processing finished
> 2009/02/17 16:38:35.502| clientReadSomeData: FD 20: reading request...
> 2009/02/17 16:38:35.502| comm_read, queueing read for FD 20
> 2009/02/17 16:38:35.502| ACLChecklist::~ACLChecklist: destroyed 0xbfda3854

.. squid waits for the request headers to be read.

<next request comes in>
> 2009/02/17 16:38:35.502| fd_open FD 21 HTTP Request
> 2009/02/17 16:38:35.502| fdc_t::acceptOne accepted: FD 15 handler:
> 0x80785d0
> newfd: 21
> 2009/02/17 16:38:35.502| comm_accept: FD 15 handler: 0x80785d0
> 2009/02/17 16:38:35.502| httpAccept: FD 21: accepted
> 2009/02/17 16:38:35.502| comm_add_close_handler: FD 21, handler=1,
> data=0x859850c
> 2009/02/17 16:38:35.502| commSetTimeout: FD 21 timeout 300
> 2009/02/17 16:38:35.502| aclCheckFast: list: 0x8324d40
> 2009/02/17 16:38:35.503| ACLChecklist::preCheck: 0xbfda3854 checking
> 'ident_lookup_access deny all'

.. some lot of time skipped...

>
> After 5 mins client could connect:
>
>
> 2009/02/17 16:40:37.154| event.cc(225) checkEvents
> 2009/02/17 16:40:37.163| EventDispatcher::dispatch: Running
> 'idnsCheckQueue'
> 2009/02/17 16:40:37.164| idnsCheckQueue: ID 0x0404timeout
> 2009/02/17 16:40:37.164| idnsCheckQueue: ID 404: giving up after 16 tries
> and 121.65 seconds
> 2009/02/17 16:40:37.164| ipcacheParse: Lookup failed 'Timeout' for
> 'dss2.siteadvisor.com'
> 2009/02/17 16:40:37.164| ipcacheParse: Lookup failed 'Timeout' for
> 'dss2.siteadvisor.com'
> 2009/02/17 16:40:37.164| ipcacheRelease: Releasing entry for
> 'dss2.siteadvisor.com'
> 2009/02/17 16:40:37.164| commConnectDnsHandle: Unknown host:
> dss2.siteadvisor.com
> 2009/02/17 16:40:37.164| commConnectCallback: FD 14, data 0x85bb438
> 2009/02/17 16:40:37.164| comm_remove_close_handler: FD 14, handler=1,
> data=0x85c73d0
> 2009/02/17 16:40:37.164| commSetTimeout: FD 14 timeout -1
> 2009/02/17 16:40:37.164| fwdConnectDone: Unknown host:
> dss2.siteadvisor.com
> 2009/02/17 16:40:37.164| fwdFail: ERR_DNS_FAIL "Service Unavailable"
>

There you go. DNS did not send back an answer within 300 seconds.
Looks like the siteadvisor browser plugin. If that is stalling the page
request from loading bad delays may follow.
This is where the 503 messages are coming from.

> Generated Tue, 17 Feb 2009 22:40:37 GMT by MBCCOPproxy (squid/3.0.STABLE1)

Ah, There have been a lot of improvements to 3.0 over the last 18 months.
... and I mean a LOT, including some very serious security flaws.

Please use the current 3.0.STABLE15 or STABLE16-RC1.

Amos
Received on Thu May 21 2009 - 02:48:19 MDT

This archive was generated by hypermail 2.2.0 : Thu May 21 2009 - 12:00:01 MDT