Re: [squid-users] site slow in loading

From: Brett Lymn <brett.lymn_at_baesystems.com>
Date: Mon, 31 Oct 2011 11:30:53 +1030

On Thu, Oct 27, 2011 at 04:37:20PM +1030, Brett Lymn wrote:
>
> OK, but, the 2.7 stable 6 machines that work well share the same parents
> as the 3.1.15 machines - they even talk to the same DNS servers.
>

I had a bit of a dig at this on the weekend and can confirm that the
problem is DNS related: a combination of broken DNS and the way squid
does lookups. It looks like the new directive in 3.1.16 would help in
this case.

What looks to be happening is that squid never tries to look up the A
record: the remote server just times out on the AAAA lookup, and that
takes so long that the timeout clobbers the DNS request in the queue. I
see this in a tcpdump:

   192.168.3.3.65473 > 192.231.203.132.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)
19:38:25.132968 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 68)
    192.168.3.3.65472 > 192.231.203.3.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)
19:38:30.154854 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 68)
    192.168.3.3.65473 > 192.231.203.132.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)
19:38:31.177449 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 68)
    192.168.3.3.65472 > 192.231.203.3.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)
19:38:36.197481 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 68)
    192.168.3.3.65473 > 192.231.203.132.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)
19:38:37.217890 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 68)
    192.168.3.3.65472 > 192.231.203.3.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)
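
For a longer capture it can be easier to tally the query types than to
read them by eye. The following is only a rough sketch: the regular
expression and the tally_queries name are illustrative assumptions, and
it handles only the simple "ID+ TYPE? name." query form seen above, not
every tcpdump output variant.

```python
import re
from collections import Counter

# Matches the query part of a tcpdump DNS line,
# e.g. "10060+ AAAA? www.example.com. (40)".
QUERY_RE = re.compile(r"\s\d+\+?\s+(?P<qtype>[A-Z0-9]+)\?\s+(?P<name>\S+)")

def tally_queries(lines):
    """Count (name, qtype) pairs seen in tcpdump text output."""
    counts = Counter()
    for line in lines:
        m = QUERY_RE.search(line)
        if m:
            counts[(m.group("name"), m.group("qtype"))] += 1
    return counts

# Two query lines and one header line copied from the capture above.
sample = [
    "    192.168.3.3.65473 > 192.231.203.132.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)",
    "19:38:25.132968 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 68)",
    "    192.168.3.3.65472 > 192.231.203.3.domain: [udp sum ok] 10060+ AAAA? www.my.commbank.com.au. (40)",
]
counts = tally_queries(sample)
for (name, qtype), n in sorted(counts.items()):
    print(name, qtype, n)
```

If no ("name", "A") entry ever shows up for the host in question, that
is consistent with the A lookup never being attempted.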

And this in the cache.log with debug_options 78,3:

2011/10/30 19:22:57.089| idnsRead: starting with FD 11
2011/10/30 19:22:57.089| idnsRead: FD 11: received 40 bytes from 192.231.203.3:53
2011/10/30 19:22:57.089| idnsGrokReply: ID 0xcbe0, -2 answers
2011/10/30 19:22:57.089| idnsGrokReply: error Server Failure: The name server was unable to process this query. (2)
2011/10/30 19:22:57.089| idnsGrokReply: Query result: SERV_FAIL
2011/10/30 19:23:58.160| idnsCheckQueue: ID 0x54bftimeout
2011/10/30 19:24:58.996| idnsCheckQueue: ID 0x54bftimeout
2011/10/30 19:24:58.996| idnsCheckQueue: ID 54bf: giving up after 4 tries and 121.91 seconds
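
The timestamps can be checked against the "121.91 seconds" figure with
a little arithmetic. This is a sketch under one loud assumption: the
excerpt does not show when query 0x54bf was first sent, so the first
line of the excerpt (19:22:57.089, which belongs to a different query
ID) is used as a stand-in start time. The parse_ts helper is
hypothetical.

```python
from datetime import datetime

def parse_ts(line):
    """Parse the leading 'YYYY/MM/DD HH:MM:SS.mmm' stamp of a cache.log line."""
    stamp = line.split("|", 1)[0]
    return datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S.%f")

log = [
    # Assumed start time: first line of the excerpt, not the 0x54bf send time.
    "2011/10/30 19:22:57.089| idnsRead: starting with FD 11",
    "2011/10/30 19:23:58.160| idnsCheckQueue: ID 0x54bftimeout",
    "2011/10/30 19:24:58.996| idnsCheckQueue: ID 0x54bftimeout",
]
times = [parse_ts(line) for line in log]
# Seconds between consecutive log lines, then the overall span.
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
total = (times[-1] - times[0]).total_seconds()
print([f"{g:.3f}" for g in gaps], f"{total:.3f}")
```

Under that assumption the two gaps are about 61 seconds each and the
span is 121.907 seconds, which lines up with the figure in the final
"giving up" line.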

In the code I can see that the A record is supposed to be tried after a
SERV_FAIL has happened a few times, but in this case the retries take so
long that the DNS request gets killed out of the queue before that part
of the code is executed.
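
The interaction can be illustrated with a toy model. To be clear, this
is not Squid's code: the resolve function, its parameters and all the
numbers are hypothetical, chosen only to show how a queue lifetime
check can fire before a "fall back to A after N failures" rule is ever
reached.

```python
def resolve(attempt_waits, lifetime, fallback_after):
    """Toy model of one AAAA query; not taken from the Squid source.

    attempt_waits  -- seconds each AAAA attempt takes before it fails
    lifetime       -- seconds a query may sit in the queue before being dropped
    fallback_after -- failed AAAA attempts needed before an A lookup is tried
    """
    elapsed = 0.0
    for failures, wait in enumerate(attempt_waits, start=1):
        elapsed += wait
        if elapsed >= lifetime:
            # The lifetime check wins: the query is dropped first.
            return ("dropped", failures, elapsed)
        if failures >= fallback_after:
            return ("fallback_to_A", failures, elapsed)
    return ("pending", len(attempt_waits), elapsed)

# Hypothetical numbers: slow failures exhaust the lifetime after two tries,
# quick failures reach the fallback rule on the third try.
slow = resolve([60, 60, 60], lifetime=120, fallback_after=3)
fast = resolve([5, 5, 5], lifetime=120, fallback_after=3)
print(slow, fast)
```

With slow failures the query is dropped on the second attempt, one
short of the fallback threshold; with quick failures the A lookup is
reached well inside the lifetime.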

What I eventually did at home was rebuild squid with --disable-ipv6
(actually, it would be nice if this was a config directive rather than
a compile-time option). Once I had done this, the comm bank site was
actually reasonably usable, since the AAAA lookups were not being tried
at all.
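
Before going as far as a rebuild, one way to see whether AAAA lookups
for a given host are the slow part is to time each address family
separately. This is a minimal sketch: time_lookup is a hypothetical
helper, and it goes through the operating system's resolver via
socket.getaddrinfo rather than Squid's internal DNS client, so the
timings are only indicative.

```python
import socket
import time

def time_lookup(host, family, resolver=socket.getaddrinfo):
    """Return (seconds, outcome) for one lookup restricted to an address family.

    outcome is a sorted list of addresses, or an "error: ..." string.
    The resolver argument exists so a stand-in can be supplied for testing.
    """
    start = time.monotonic()
    try:
        infos = resolver(host, None, family, socket.SOCK_STREAM)
        outcome = sorted({info[4][0] for info in infos})
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        outcome = f"error: {exc}"
    return time.monotonic() - start, outcome
```

Calling time_lookup("www.my.commbank.com.au", socket.AF_INET) and then
the same with socket.AF_INET6 gives the two timings to compare; a
multi-second wait only on the AF_INET6 call would point the same way as
the capture above.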

-- 
Brett Lymn
Received on Mon Oct 31 2011 - 01:01:06 MDT
