Re: [squid-users] Squid Performance Issues - reproduced

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Thu, 02 Jan 2003 06:02:21 +0100

Mika Aleksandroff wrote:

> Strangest thing, I've been experimenting with this thing for a while now
> and I am actually experiencing the exact same thing. I even downloaded and
> compiled the daily (20021231) package and enabled pthreads to no apparent
> effect. I've used 'wget -O /dev/null URL' for testing.

I have now managed to randomly reproduce the situation here on my poor
old P133/64MB home development machine, giving almost exact 200KB/se hit
transfer rate for a single aufs cache hit or sometimes a couple in
parallell (each then receiving 200KB/s). Exact conditions triggering
this is yet unknown but I would guess there is some kind of oscillation
in the queue management due to the extremely even and repetive pattern
I/O pattern.

1041475349.135 5222 127.0.0.1 TCP_HIT/200 1048953 GET
http://localhost/%7Ehenrik/test2 - NONE/- text/plain

If you give it a little more varied load the symtom seems to clear up,
and then only seems to return if the Squid is left idle for some time.

Normal transfer rate on my P133 using wget over loopback is about
  4MB/s for TCP_MEM_HIT
  3MB/second TCP_HIT in disk cache buffer
  2MB/s for TCP_HIT on cold objects fully read from disk

TCP_MISS times is irrelevant, as the machine is seriously underpowered
to run all of wget, squid and apache, but averages around 2MB/s.

1041475044.568 253 127.0.0.1 TCP_MEM_HIT/200 1048952 GET
http://localhost/%7Ehenrik/test2 - NONE/- text/plain

041475086.615 1091 127.0.0.1 TCP_HIT/200 3146105 GET
http://localhost/%7Ehenrik/test3 - NONE/- text/plain

1041475289.082 17499 127.0.0.1 TCP_HIT/200 41943419 GET
http://localhost/%7Ehenrik/test - NONE/- text/plain

Hmm.. 200KB/s / 4KB per I/O request = 50 * 2 queue operations per
request = 100 = number of clock ticks/s (HZ) on a Linux Intel X86..

Wait a minute.. yes, naturally. This will quite likely happen on an aufs
Squid which has nothing to do as the I/O queue is only polled once per
comm loop, and an "idle" comm loop runs at 100/s. The question is more
why it does not always happen, and why it is 200KB/s and not 400KB/s.
What can be said is also that the likelyhood that this won't happen
decreases a lot on SMP machines as the main thread then have no
copmetition with the I/O threads for the CPU.

I have an old async-io patch somewhere which introduces a signal to
force the main process to resume when an disk I/O operation have
completed, but it may have other ill effects so it is not included (or
maintained).

Testing on my old UP P133 glibc 2.1.3 gives some surpricing results in
thread scheduling. It seems that under some conditions the signalled
thread is not awakened until much later than expected. I would expect
the signalled thread to be awakeden when the signalling thread calls
pthread_mutex_unlock or at latest at sched_yield, but when Squid is in
"200K/s aufs cache hit mode" the I/O thread stays suspended until the
next system tick or something similar it seems... will try to do
additional investigation later on.

Anyway, this 200K/s is by no means an indication of the performance or
speed of your Squid. It is only a signal that your Squid needs more
things to do for using aufs efficiently. The more concurrent things you
give it to work with the faster it goes, up to the point where your
hardware cannot handle more.

Regards
Henrik
Received on Wed Jan 01 2003 - 22:03:14 MST

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 17:12:24 MST