[squid-users] SO_FAIL in store_log under high load

From: Nguyen, Khanh, INFOT <nguyenkt@dont-contact.us>
Date: Fri, 19 Jan 2007 16:45:10 -0500

Hi,

I am seeing a high cache miss rate caused by swap-out failures under high
load.

Background setup:
-------------------

I run Squid 2.6.STABLE4-20061009 on Red Hat AS 4 Update 3 (64-bit
kernel).

The server hardware is a Dell 2950 with 4 GB of RAM and 6 hard drives of
164 GB each. Virtual memory is set to 20 GB.

I compiled Squid with --enable-async-io=10,
--enable-follow-x-forwarded-for, --enable-auth=basic, --disable-wccp,
--enable-snmp, --enable-x-accelerator-vary, --enable-remove-policies=lru

I configured 5 cache_dirs, one on each of 5 hard drives. Each drive has
140 GB available, of which 50 GB is dedicated to the cache. The logs are
written to a separate hard drive. cache_mem is set to 2.5 GB.

I set up Squid to run in reverse proxy mode with 100 domains, each with
an upstream origin server. The first 33 domains share one origin server,
the next 33 domains share a second origin server, and the last 34
domains share a third origin server (3 origin servers in total). Logging
to access_log is disabled.

There are two object sizes: 220 KB and 2 KB. 50% of requests are for
220 KB objects and 50% are for 2 KB objects.

I prepopulated the cache server with all the objects (200000 objects):
100000 objects of 220 KB and 100000 objects of 2 KB (2000 objects per
domain). Each object has a time to live of 15 minutes.
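
To put the sizes above in perspective, here is a rough back-of-the-envelope
calculation in Python. It is only a sketch: it assumes 1 KB = 1024 bytes and
that every object is exactly 220 KB or 2 KB, and it ignores HTTP headers and
Squid's own per-object overhead. The variable names are made up for
illustration.

```python
# Rough sizing from the figures quoted in this post (illustrative only).
KB = 1024
GB = 1024 ** 3

large_count, large_size = 100_000, 220 * KB
small_count, small_size = 100_000, 2 * KB

# Total bytes across all 200000 prepopulated objects.
working_set = large_count * large_size + small_count * small_size

cache_mem = 2.5 * GB       # cache_mem setting
disk_cache = 5 * 50 * GB   # five cache_dirs, 50 GB each

print(f"working set        : {working_set / GB:.1f} GB")
print(f"cache_mem / set    : {cache_mem / working_set:.1%}")
print(f"disk cache / set   : {disk_cache / working_set:.1f}x")
```

Under these assumptions the working set is about 21 GB, so at most roughly
12% of it could sit in cache_mem at once, while the disk cache is more than
ten times larger than the working set. Most hits therefore have to be served
from disk.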

Behavior:
------------

Randomly accessing the 200000 objects:
--Under low load, approximately 60 r/s, the server operates well with CPU
at 30-60%; the request hit rate and byte hit rate are approximately 95%.
--As soon as I increase the load beyond this, there are a lot of SO_FAIL
entries in store.log (swap failures). The affected objects are no longer
valid, so the server starts to retrieve them from the origin server even
though it should not (the objects have not changed). The request/byte hit
rate goes down to 50%.
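
One way to quantify the SO_FAIL behaviour described above is to tally the
action tags in store.log. The Python sketch below assumes only that each
line starts with a timestamp followed by the action tag (SWAPOUT, SO_FAIL,
RELEASE, ...) as whitespace-separated fields; the remaining fields are
ignored. The function names and the log path are placeholders, not anything
taken from this setup.

```python
import sys
from collections import Counter


def count_store_actions(lines):
    """Count store.log action tags (second whitespace-separated field)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:   # skip blank or truncated lines
            counts[fields[1]] += 1
    return counts


def swap_fail_ratio(counts):
    """Fraction of swap-out attempts that ended in SO_FAIL."""
    attempts = counts["SWAPOUT"] + counts["SO_FAIL"]
    return counts["SO_FAIL"] / attempts if attempts else 0.0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an example): python count_so_fail.py /path/to/store.log
    with open(sys.argv[1], errors="replace") as log:
        totals = count_store_actions(log)
    for tag, n in totals.most_common():
        print(f"{tag:10s} {n}")
    print(f"SO_FAIL ratio: {swap_fail_ratio(totals):.1%}")
```

Comparing the ratio from a 60 r/s run with one from a higher-load run would
show how sharply the swap-out failures increase with load.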

My questions are
----------------
1) What causes the swap failures? Is it due to the async threads or to
disk failure?
2) What is the optimum thread value to choose when compiling with
async-io?
3) Are there any other parameters in squid.conf that might affect the
ability to swap an object out to disk?

Note: for the first few runs I was able to achieve 600 r/s, but now the
best I can do is 200 r/s with a lot of SO_FAIL and the request/byte hit
rate hovering around 50%. I even tried restarting with an empty cache,
clearing all cache_dirs, but that does not fix it.
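
For reference, the request rates mentioned in this post translate into the
following approximate data rates. This is a sketch only: it assumes the
50/50 mix of 220 KB and 2 KB objects holds exactly, that 1 MB = 1024 KB, and
it ignores header overhead. The function name is made up for illustration.

```python
# Mean object size for a 50/50 mix of 220 KB and 2 KB responses.
MEAN_OBJECT_KB = 0.5 * 220 + 0.5 * 2   # 111 KB


def mean_throughput_mb(requests_per_second):
    """Approximate payload rate in MB/s at the given request rate."""
    return requests_per_second * MEAN_OBJECT_KB / 1024


for rps in (60, 200, 600):
    print(f"{rps:4d} r/s is about {mean_throughput_mb(rps):.1f} MB/s")
```

That works out to about 6.5 MB/s at 60 r/s, 21.7 MB/s at 200 r/s and
65 MB/s at 600 r/s of payload, most of which is random disk I/O when the
hit rate is high.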

I hope someone who has insight into this, or has had experience with
this issue, can give me some advice or suggestions to improve the
operation of my cache.

Thanks,
Khanh
Received on Fri Jan 19 2007 - 14:45:20 MST