Re: Multiple issues in Squid-3.2.3 SMP + rock + aufs + a bit of load

From: Henrik Nordström <henrik_at_henriknordstrom.net>
Date: Fri, 30 Nov 2012 22:25:25 +0100

fre 2012-11-30 klockan 10:47 -0700 skrev Alex Rousskov:

> IIRC, Rock store diskers should process queries while rebuilding so db
> open should not fail due to rebuild itself. I bet this is actually
> related to the problem discussed below.

I think so too.

> > - kid registration failures at startup (timeout). This seem to be due to
> > the local cache rebuild in each kid taking more than 6 seconds to
> > complete, confusing the registration timeout management.
>
> The registration code needs to be redesigned. It is not robust enough,
> especially when we have legacy code that blocks Squid for more than 6
> seconds at startup.

But the aufs rebuild should not block.. unless it's forced into
foreground rebuild which I think not.

I think the default timeouts SHOULD be quite sufficient I think if it's
only about communication timeouts. Should not get blocked by a normal
background store rebuild.

> > -const double IpcIoFile::Timeout = 7; // seconds; XXX: ALL,9 may require more
> > +const double IpcIoFile::Timeout = 300; // seconds; XXX: ALL,9 may require more
>
> Do not do that. The timeout is used for both file opening timeout and
> regular/ongoing I/O timeout checks. The latter should happen more
> frequently than once in 5 minutes, especially on busy Squids. Split this
> into OpenTimeout and IoTimeout if you want to increase the former.

Ok.

> > IpcIoFile::IpcIoFileList IpcIoFile::WaitingForOpen;
> > IpcIoFile::IpcIoFilesMap IpcIoFile::IpcIoFiles;
> > std::auto_ptr<IpcIoFile::Queue> IpcIoFile::queue;
> > @@ -218,7 +218,7 @@
> > {
> > bool ioError = false;
> > if (!response) {
> > - debugs(79, 3, HERE << "error: timeout");
> > + debugs(79, 1, HERE << "error: timeout");
>
> Do not do that. It is a common/benign case under overload.

This was needed to show the Open timeout event which is not at all
benign.

> > if ((nulpos = (char*)memchr(header_start, '\0', header_end - header_start))) {
> > - debugs(55, 1, "WARNING: HTTP header contains NULL characters {" <<
> > + debugs(55, 2, "WARNING: HTTP header contains NULL characters {" <<
> > getStringPrefix(header_start, nulpos) << "}\nNULL\n{" << getStringPrefix(nulpos+1, header_end));
> > goto reset;
> > }
>
> While I agree that Squid should not warn about malformed headers by
> default, others disagree, and this is unrelated to the rest of the changes.

Yes it's unrelated. This is a full unsorted patch of the changes I had.

> > --- squid-3.2.3/src/ipc/Strand.cc 2012-10-20 15:39:49.000000000 +0300
> > +++ squid-3.2.3/src/ipc/Strand.cc 2012-11-29 00:12:12.385439783 +0200
> > @@ -53,7 +53,7 @@
> > TypedMsgHdr message;
> > ann.pack(message);
> > SendMessage(coordinatorAddr, message);
> > - setTimeout(6, "Ipc::Strand::timeoutHandler"); // TODO: make 6 configurable?
> > + setTimeout(600, "Ipc::Strand::timeoutHandler"); // TODO: make 6 configurable?
> > }
>
> Please keep the TODO comment in sync.

Yes, except that the change should not really be needed.

> This change is discussed at the beginning of this email. In summary, 10
> minutes is sometimes worse than 6 seconds (and sometimes is not enough
> anyway). The registration algorithm should be adjusted, and the code
> that blocks Squid process for 9 minutes should be fixed. Until then, I
> am not objecting to this specific change, but perhaps you can add a
> comment discussing why this is so large.

We should look into why it is at all needed. From what I can understand
it should not be needed.

Have the foreground/background store rebuild logics been changed
somehow?

Regards
Henrik
Received on Fri Nov 30 2012 - 21:25:32 MST

This archive was generated by hypermail 2.2.0 : Sat Dec 01 2012 - 12:00:32 MST