Re: dnsserver and always_keepalive.

From: Scott Hess <scott@dont-contact.us>
Date: Fri, 6 Aug 1999 15:05:06 -0700

Duane Wessels <wessels@ircache.net> wrote:
> On Thu, 5 Aug 1999, Scott Hess wrote:
> > A couple hours after turning this on, squid reported in the error logs
> > that it got a timeout reading from a dnsserver, and that the dnsserver
> > in question had exited. For the next fifteen minutes, it served up
> > pages with status 504, 503, and 000. 000! The error logs look like:
>
> Given that you've established a relationship between enabling TCP
> keepalive and this happening, I don't find these messages too
> surprising.
>
> I'm a little surprised that turning on keepalives caused the dnsserver
> to exit. The TCP connection is definitely established. I don't see
> why keepalives cause this.

It's odd. I told squid to reconfigure, which restarted the dnsserver
processes - and none of them went down overnight. It _may_ be related to
dnsserver connections which were very old at the time I turned on the flag
(presumably they were idle for the last four days, the time of the last
reboot on that machine).

> > Also, I notice that as of right now, it's only got a single dnsserver
> > process, though it's configured for 5. There are no messages regarding
>
> Ok, so we can assume that the other four got disconnected somehow
> because of the keepalives. Probably the rate at which Squid looks up
> DNS names is so low (at this point in time) that only one of the
> dnsservers is required.

Really low - it's in http_accel mode, so it's only looking up a single IP
address.

> > Second question - why doesn't squid catch the dead dnsservers earlier?
> > Obviously a number of them have died, but squid hasn't logged the fact.
>
> Squid only reads from the dnsserver socket when it's waiting for
> an answer. When the DNS server is sitting idle, Squid does not
> "select" on that file descriptor for reading. So we wouldn't get
> the error until we try to write to it.

Sensible, though it would be nice if it periodically did a non-blocking
wait() to catch potentially dead children. Or periodically select() on idle
file descriptors for reading.
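
To make the two suggestions concrete, here is a minimal sketch in Python
(not Squid's C, and not how Squid is implemented). It assumes POSIX pipes
rather than the sockets Squid uses, and spawn_helper(), reap_exited() and
idle_fds_at_eof() are hypothetical names invented for the illustration.
The first check is a non-blocking wait on each child; the second is a
zero-timeout select() on the idle read ends, where "readable but empty"
means the other side has gone away.

```python
import select
import subprocess
import sys


def spawn_helper():
    # Hypothetical stand-in for a dnsserver child: it idles until its
    # stdin is closed and never writes anything on its own.
    return subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdin.read()"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )


def reap_exited(helpers):
    # Popen.poll() is a non-blocking wait: on POSIX it calls
    # waitpid(pid, WNOHANG) and returns None while the child is alive.
    return [h for h in helpers if h.poll() is not None]


def idle_fds_at_eof(helpers):
    # Zero-timeout select() on the read ends of idle helpers. An idle
    # helper has nothing to say, so a readable descriptor whose peek()
    # comes back empty has hit EOF: the child closed its end or died.
    readable, _, _ = select.select([h.stdout for h in helpers], [], [], 0)
    return [h for h in helpers
            if h.stdout in readable and h.stdout.peek(1) == b""]
```

Either function could be called from a periodic timer in the main loop;
neither blocks, so a dead child would be noticed within one timer period
instead of at the next lookup.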

> > Third question - why, when it saw that the dnsserver had died, did squid
> > lose its mind?
>
> Squid lost its mind?
>
> Which one is "the dnsserver"? You said that 4 of 5 exited, but
> things kept on working.

For some time, possibly up to 15 minutes, Squid wasn't serving requests from
the machine it was accelerating for. Unfortunately, nobody saw what it
_was_ serving, because squid was up, the accelerated machine's server was
working fine, and all our automated checks were passed. :-(.

I say "lost its mind" because it didn't restart the dnsserver immediately
and try again. When I say 4 of 5 dnsserver processes had exited, I meant
that as of a couple hours after the event, only one was running, and the
other four that normally run alongside it were not. I assume that squid
restarted the one that exited, and that this is the one that was still
running later. What is unclear is why it wouldn't have immediately
restarted the dnsserver.
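
The behaviour I would have expected, "restart immediately and try again",
can be sketched roughly as below. This is Python for brevity and only an
illustration under assumptions: the Helper class, HELPER_CMD and the
one-line-in, one-line-out protocol are all hypothetical, and the child
just upper-cases its input instead of resolving names.

```python
import subprocess
import sys

# Hypothetical helper: answers each input line with its upper-case form.
HELPER_CMD = [
    sys.executable, "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n",
]


class Helper:
    def __init__(self):
        self.restarts = 0
        self._spawn()

    def _spawn(self):
        self.proc = subprocess.Popen(
            HELPER_CMD, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True)

    def _ask_once(self, query):
        self.proc.stdin.write(query + "\n")
        self.proc.stdin.flush()          # raises BrokenPipeError if dead
        reply = self.proc.stdout.readline()
        if reply == "":                  # EOF: helper died before replying
            raise EOFError("helper closed its pipe")
        return reply.rstrip("\n")

    def ask(self, query):
        try:
            return self._ask_once(query)
        except (BrokenPipeError, EOFError):
            # Helper is gone: reap it, start a replacement, retry once.
            self.proc.wait()
            self.restarts += 1
            self._spawn()
            return self._ask_once(query)
```

A single retry keeps the sketch from looping forever if the replacement
dies too; a second failure propagates to the caller, which can then
return an error to the client.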

> Also note that in version 2.3, there are no more dnsserver processes :-).
> Squid will do native DNS queries all by itself.

Ooooh. That sets the hook, a bit. I only semi-trust the multiprocess-based
fixes to unblocking gethostbyname...

Thanks,
scott
Received on Fri Aug 06 1999 - 15:50:21 MDT