On 2014-02-25 02:12, Simon Beale wrote:
>> On 2014-02-21 06:10, Simon Beale wrote:
>>> I've got a problem at the moment with our general squid proxies where
>>> occasionally requests take a long time that shouldn't (i.e. 5+
>>> seconds or a timeout, instead of milliseconds).
>>> 
>>> This is most common on our proxies doing 100 reqs/sec, but it happens
>>> overnight too when they're running at 10 reqs/sec. I've got this
>>> happening with both v3.4.2 and also with a box I've downgraded back
>>> to v3.1.10. For v3.4.2, it's happening in both multiple-worker and
>>> single-worker modes.
> 
> As a follow-up, we've narrowed this down to the internal DNS resolver.
> When I deploy a 3.4.2 (which is what we're running elsewhere) that's
> been recompiled with "--disable-internal-dns", the problem completely
> goes away.
> 
>> What sort of CPU loading do you have at ~100 req/sec?
>>   Is that at or near your local installation's req/sec capacity?
> 
> For the box running with a single worker, it consumes 50% of one core
> at 100 req/sec.
> For the boxes running with 9 workers, each worker consumes 5% of a
> core at the same rate.
> 
>>> The test is not reproducible, sadly, but I've got a cronjob running
>>> on localhost on these boxes testing access times to various URLs
>>> covering: HTTPS, non-HTTPS static content, using an IP rather than a
>>> hostname over both HTTP and HTTPS, and a URL on the same VLAN as the
>>> proxies. All of these test cases have it happen occasionally, but
>>> not repeatedly/reliably.
>> 
>> Some ideas:
>>   * DNS lookup delays?
> 
> Yeah, when I enabled the DNS resolution time logging in squid, that
> became apparent.
> 
> Quite why the internal DNS resolver shows this, but the external one
> doesn't, I don't know. The DNS server query logs show both DNS servers
> in /etc/resolv.conf getting the request in turn and answering it
> (though 5 seconds apart). It's happening for us in multiple
> datacentres, so it is unlikely to be port errors or internal packet
> loss.
The dnsserver helper used when internal DNS is disabled uses 
gethostbyname()/getaddrinfo() and thus the local machine's resolver. It 
has a limit of ~250 req/sec on most systems, and usually does not 
support IPv6 DNS resolvers configured through squid.conf (it does 
support them if configured through resolv.conf).
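To illustrate that bottleneck (this is not Squid's actual helper code): each helper process makes one blocking system-resolver call at a time, so total throughput is capped at roughly helper count divided by per-lookup latency. Python's socket.getaddrinfo() goes through the same libc resolver path, so a rough sketch looks like:

```python
import socket

def lookup(hostname):
    # One blocking system-resolver call, as a dnsserver helper would make.
    # While this call waits on the network, the helper can do nothing else.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Collect the unique addresses out of the (family, type, proto,
    # canonname, sockaddr) tuples returned.
    return sorted({info[4][0] for info in infos})

print(lookup("localhost"))  # typically includes 127.0.0.1 and/or ::1
```

With, say, 5 helpers and 20 ms per lookup, that yields on the order of 250 lookups/sec, which matches the limit mentioned above.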
The internal DNS client uses a form of happy-eyeballs scheduling: it 
sends the A and AAAA queries in parallel but waits for *both* responses 
before continuing (unlike full-blown happy eyeballs, which goes with 
the first response regardless of missing IPs). It should only be 
contacting one of the resolvers at a time.
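A minimal sketch of that "send both, wait for both" behaviour, using threads and stand-in query functions rather than Squid's event loop (the hostnames, addresses, and delays are made up for illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def query(qtype, delay, answer):
    # Stand-in for sending one DNS packet and awaiting its response.
    time.sleep(delay)
    return (qtype, answer)

def resolve_wait_for_both(host):
    # Like Squid's internal client: fire A and AAAA together, but do not
    # continue until *both* answers (or timeouts) are in. A slow AAAA
    # therefore delays the whole lookup, unlike true happy eyeballs,
    # which proceeds with whichever answer arrives first.
    with ThreadPoolExecutor(max_workers=2) as pool:
        a = pool.submit(query, "A", 0.01, ["192.0.2.1"])
        aaaa = pool.submit(query, "AAAA", 0.05, ["2001:db8::1"])
        return dict([a.result(), aaaa.result()])

start = time.monotonic()
answers = resolve_wait_for_both("example.net")
elapsed = time.monotonic() - start
print(answers)  # both record types present; total time is set by the slower reply
```

The point of the sketch: if either the A or the AAAA response is delayed (or lost and retransmitted), the whole lookup is delayed by that amount.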
From your description above it sounds like the first resolver 
configured is occasionally "failing" after 5 seconds and Squid is 
moving on to the second, which works.
  Do you have "dns_timeout 5 seconds" configured?
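For reference, the internal resolver takes its server list from /etc/resolv.conf, so a two-server setup like the one described would look something like this (addresses are placeholders):

```
# /etc/resolv.conf (placeholder addresses)
nameserver 192.0.2.53
nameserver 198.51.100.53
```

If I remember the defaults correctly, Squid's internal client retries a dropped query (moving on to the next listed server) after dns_retransmit_interval, which defaults to 5 seconds; that would match the 5-second gap between the two servers in your query logs.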
With internal DNS enabled your cachemgr "idns" report has a lot of 
detail on the particular errors and actions happening.
NP: with Squid-3.4 the DNS lookup timeout has been detached from the 
TCP connect_timeout and is applied only once per connection 
destination. So you can set each as short as you wish without affecting 
the other connection setup steps.
Amos
> 
> It's only (or mostly?) apparent on our squid servers that do desktop
> proxying, and so make lots of DNS requests to everywhere; the squid
> servers that handle just our datacentre servers don't show this
> problem, but only really go to about 40 hosts in total.
> 
> Thanks
> 
> Simon
Received on Mon Feb 24 2014 - 23:49:11 MST