Re: Any takers on Bug #1703? (diskd stuck at 100% CPU)

From: Steven <swilton@dont-contact.us>
Date: Sun, 30 Jul 2006 13:15:37 +0800 (WST)

On Sat, 29 Jul 2006, Henrik Nordstrom wrote:

> Sat 2006-07-29 at 23:16 +0800, Steven wrote:
>
> > I was seeing the msgrcv() calls while running strace, but they weren't in
> > the same loop as reported in the bug. Looks like I just found another bug
> > while trying to reproduce this one :)
>
> I was not aware there were msgrcv() calls in pthreads.

I had a COSS + diskd setup. The msgrcv() syscalls were coming from the
diskd cache_dirs. They were happening every 10ms, but there was a call to
epoll() in between each msgrcv(), so it's not the same bug.

> We don't have a backtrace in the bug, so it could be the same and I was
> chasing down the wrong path...
>
> Guess we have to wait for Ralf to answer about the details of his setup.

I'm going to compile the same version of squid, set it up on Debian with
only diskd cache_dirs, and see if I can reproduce. There are 2
possibilities that I can think of. The first is that diskdinfo->away is
not being decremented every time (or is being incorrectly incremented),
so squid is waiting for replies to messages that have not actually been
sent. The other is that the diskd processes have stopped due to the
reconfigure signal, but squid is still waiting for them to send a
message.
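
To make the first possibility concrete, here is a rough standalone sketch
of the failure mode. It is not squid code: diskd_sim, sim_send() and
sim_drain() are made-up names, in_queue stands in for the real message
queue, and only the "away" counter is taken from the discussion above.
The max_polls cap exists only so the sketch terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cache_dir diskd state.
 * Only the name "away" comes from the thread; the rest is illustrative. */
typedef struct {
    int away;      /* requests believed to be outstanding */
    int in_queue;  /* replies that will actually arrive (simulated queue) */
} diskd_sim;

/* Account for one request. When really_sent is false the counter is still
 * incremented, which models "away" drifting out of step with reality. */
void sim_send(diskd_sim *d, bool really_sent)
{
    assert(d != NULL);
    d->away++;
    if (really_sent)
        d->in_queue++;
}

/* Poll for replies until away reaches zero or max_polls is hit.
 * Returns the number of polls performed. A real loop without the cap
 * would spin for as long as away stays above zero. */
int sim_drain(diskd_sim *d, int max_polls)
{
    int polls = 0;

    assert(d != NULL);
    while (d->away > 0 && polls < max_polls) {
        polls++;
        if (d->in_queue > 0) {
            /* A reply was available: consume it and decrement away. */
            d->in_queue--;
            d->away--;
        }
        /* Otherwise nothing arrived and the loop simply tries again. */
    }
    return polls;
}
```

With two real sends the drain finishes in two polls. With one real send
and one phantom increment it runs until the cap with away stuck at 1. The
second possibility (stopped diskd processes) would look the same from
squid's side, since the replies never show up in the queue either.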

Either way, it may only happen on a loaded system (which may make it
harder to reproduce on a test system). I'll find this out shortly.

Steven
Received on Sat Jul 29 2006 - 23:15:45 MDT
