[squid-users] aufs versus 2.6-S13 + diskd is freaky bugged

From: Michel Santos <michel@dont-contact.us>
Date: Thu, 21 Jun 2007 06:40:34 -0300 (BRT)

Just as a reminder, I get something like the following after an unclean diskd shutdown:

>>> Store rebuilding is -0.3% complete
>>> Store rebuilding is -0.4% complete
>>> Store rebuilding is -0.4% complete
>>> Store rebuilding is -0.3% complete
>>> Store rebuilding is -0.4% complete
>>> Store rebuilding is -0.3% complete
>>> Store rebuilding is -0.4% complete
>>> ....
>>> until suddenly ...
>>> ....
>>> Store rebuilding is 1291.7% complete
>>> Store rebuilding is 743.5% complete
>>> Store rebuilding is 1240.4% complete
>>> Store rebuilding is 725.0% complete
>>> Store rebuilding is 1194.1% complete
>>> Store rebuilding is 1150.4% complete
>>> Store rebuilding is 707.9% complete
>>>

With squid-2.5 I instead get the cache_dir emptying out (Squid believes the disk is full)
rather than swap.state growing.
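
In case it helps to see how such figures can come about: if the progress number is
computed as roughly 100 * objects_scanned / objects_expected, with the expected count
estimated from swap.state, then a bogus estimate (negative, or far too small because
swap.state is itself being rewritten while the rebuild runs) reproduces exactly these
numbers. I have not checked the source, so this is only a sketch of the arithmetic with
made-up counts, not the actual Squid code:

awk 'BEGIN {
  # hypothetical counts, only to show how the percentages can come out
  printf "Store rebuilding is %4.1f%% complete\n", 100 * 1000    / -250000   # -> -0.4%
  printf "Store rebuilding is %4.1f%% complete\n", 100 * 1033360 /   80000   # -> 1291.7%
}'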

So I re-tested aufs, following the advice from the list, and first of all I soon get

squidaio_queue_request: WARNING - Queue congestion
squidaio_queue_request: WARNING - Queue congestion
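
(As far as I understand, that warning just means the aufs async-I/O request queue fills
up faster than the worker threads can drain it. If it kept coming back I would try
rebuilding with more worker threads, assuming the stock configure option is still there
in 2.6, for example:

./configure --with-aufs-threads=32

plus the usual options. But that is not the problem I am chasing here.)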

Those warnings appeared with the default configure options, but I do not care about that
right now. What really matters: I forced resets and aufs *also gets it wrong*, it just
acts differently:

Store rebuilding is 100% complete
Store rebuilding is 100% complete
Store rebuilding is 100% complete
Store rebuilding is 100% complete
Store rebuilding is 100% complete
... repeating ...

and swap.state grows until the disk is full.

The difference between diskd and aufs is that with diskd I get a "hit" after almost any
reset, while aufs needs to be reset twice or thrice in order to "get it done".

Then, about swap.state corruption, which is said to be the culprit for the cache_dir
emptying after unclean shutdowns:

I let a script run which copies/backs up swap.state every second, and then reset the
server.
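
The script is nothing fancy, roughly along these lines (not the exact one, just to give
the idea; the paths are examples):

while true; do
    cp -p /cache1/swap.state /backup/swap.state.$(date +%H%M%S)
    sleep 1
done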

After fsck finishes, the swap.state is identical to its copy.

As soon as I start squid/diskd, things go crazy.

So I guess this swap.state story might be the cause in some cases, but I have not found
a single such case after 20 forced cache_dir incidents, so I guess there is something
else doing weird things.

Any ideas?

Might it be possible that there is a ctime/mtime problem, or some other time-comparison
confusion, in the code that rebuilds the cache_dirs?

Michel

...

****************************************************
Datacenter Matik http://datacenter.matik.com.br
E-Mail and Data Hosting Service for Professionals.
****************************************************