Re: Multiple storeio types?

From: Joe Cooper <joe@dont-contact.us>
Date: Fri, 09 Nov 2001 01:58:48 -0600

Henrik Nordstrom wrote:

> On Friday 09 November 2001 07.51, Joe Cooper wrote:
>
>
>>Quick query regarding least load and the UFS type. Is there a reason
>>for choosing the constant 999 as the loadav for that store type? I mean,
>>is it magic in some way that I'm not seeing? Could it just as easily be,
>>say, 500? That's not a real fix, of course. A real fix doesn't look
>>trivial (hence the reason it's returning a constant, I guess ;-), as I
>>don't see any sensible way to gather the load. But 500 might be fairer
>>when mixed with other storedir types, and would behave the same as it
>>currently does if all dirs are the ufs type.
>>
>
> It needs to return something. As it cannot adjust to load, I think a very
> low priority was intentionally selected to make Squid prefer other stores
> capable of load adjustment.

Makes sense. I'll tweak it if I need to, and worry about the Right Way
at some later date.
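
Just to check that I'm reading the intent right, here's the toy model I
have in my head (illustrative C only, not Squid's actual source, and the
load figures are invented): every cache_dir reports a load number and the
lowest number wins, so a ufs dir that always answers 999 only gets picked
when nothing else can do better; and if every dir is ufs, the exact value
of the constant stops mattering anyway.

    #include <stdio.h>

    struct swap_dir {
        const char *type;
        int (*check_load)(void);    /* smaller result = more preferred */
    };

    /* ufs can't measure its own load, so it reports a constant */
    static int ufs_load(void)   { return 999; }
    /* pretend measurements for the async stores */
    static int aufs_load(void)  { return 250; }
    static int diskd_load(void) { return 400; }

    int main(void)
    {
        struct swap_dir dirs[] = {
            { "ufs",   ufs_load   },
            { "aufs",  aufs_load  },
            { "diskd", diskd_load },
        };
        int i, best = 0;

        /* least-load selection: walk the dirs, keep the lowest load */
        for (i = 1; i < 3; i++)
            if (dirs[i].check_load() < dirs[best].check_load())
                best = i;

        printf("least-load winner: %s\n", dirs[best].type);
        return 0;
    }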

>>Another, somewhat related query: are the diskd and aufs loadav
>>calculations 'compatible'? I guess it would take test data that
>>probably doesn't exist yet to know, since they both rely on somewhat
>>arbitrary load indicators and both behave very differently under
>>those conditions.
>>
> Probably not entirely compatible. In fact, both most likely need some tuning
> of the load algorithm to find the correct measurements and scales for
> their load.

Ok, that's what I thought. I'll give this some study when I return
from the cacheoff in a week. It's simpler stuff than I expected now that
I've looked at it (the MAGIC numbers confused me before!).
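
When I do dig in, my starting guess is that 'compatible' mostly means
getting both stores to map their raw measurements onto the same scale
before the least-load comparison happens. Something like this
illustrative C (the metric names and limits here are my own assumptions,
not what diskd or aufs actually measure):

    #include <stdio.h>

    /* clamp a raw reading and map it onto a shared 0..1000 scale,
     * where 0 means idle and 1000 means saturated */
    static int scale_load(int raw, int raw_max)
    {
        if (raw < 0)
            raw = 0;
        if (raw > raw_max)
            raw = raw_max;
        return raw * 1000 / raw_max;
    }

    int main(void)
    {
        int diskd_queue_len = 12;   /* assumed: requests waiting on the diskd queues */
        int aufs_pending_ops = 40;  /* assumed: ops queued for the I/O threads */

        printf("diskd load: %d\n", scale_load(diskd_queue_len, 75));
        printf("aufs load:  %d\n", scale_load(aufs_pending_ops, 500));
        return 0;
    }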

>>If I find that least load is more efficient for my needs, I'm going to
>>give the diskd storedir type a try on my RAM disk. Maybe it's lighter
>>weight than 16 threads... Any thoughts on whether I'm mistaken in that
>>assumption?
>
> Should work. The aufs store should also work. The number of aufs threads can
> be tuned to fit your needs.

Per cache_dir?

> But I think in all cases you need to spend some time on how you want the load
> balanced. I guess there is a reason why you are mixing a RAM disk in the
> store with other types.

Yep, I have my reasons. ;-) Squid wants to write everything to disk,
and with only two IDE disks to write to, I can't afford those write
ops. I'm trying to push 150+ reqs/sec out of two 7200 RPM IDE
disks...which just does not work when Squid is writing every object (we
max out at about 110, no matter what the bdflush/elvtune parameters
are). So I'm 'filtering' the smaller objects into a RAM disk. Because
they're so small and the block size is 1024 bytes, we can cram a couple
hundred thousand objects into a 512MB RAM disk...about half the number
of objects that go onto the real disks at 7.5GB each. It works
surprisingly well, and with 2GB of RAM costing under $200, it's a
performance bargain (two more IDE disks would cost the same but gain
less throughput).
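
For reference, the layout I'm describing boils down to roughly this in
squid.conf (the paths, sizes, and the 8KB cutoff are made-up illustration
values, and the exact cache_dir option syntax depends on the Squid
version and patches in play):

    # small objects land on the RAM disk, capped by max-size
    cache_dir diskd /cache/ram   512 16 256 max-size=8192
    # everything else goes to the two 7.5GB spindles
    cache_dir aufs  /cache/disk1 7500 16 256
    cache_dir aufs  /cache/disk2 7500 16 256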

The same effect could theoretically be achieved, with less precision, by
giving Squid a big cache_mem setting and a minimum_object_size equal to
whatever my max-size on the RAM disk is. I experimented with that
first, but found that an extremely large cache_mem setting triggers some
unfortunate and weird behavior in the Linux 2.4.x kernel, and since the
cacheoff starts Monday, I don't have the time to figure out why or to
find a workaround.
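
For the record, that first experiment looked roughly like this
(placeholder numbers only): a big cache_mem to hold the small stuff, and
minimum_object_size to keep it off disk entirely.

    # keep small objects only in Squid's own memory cache
    cache_mem 512 MB
    # don't write anything under the RAM-disk cutoff to disk at all
    minimum_object_size 8 KB
    cache_dir aufs /cache/disk1 7500 16 256
    cache_dir aufs /cache/disk2 7500 16 256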

The weird behavior I refer to is that the kernel insists on getting very
swappy once Squid grows past ~450MB in process size, even though there
is a /ton/ of memory still available to play with. As soon as it has to
start swapping parts of Squid back in, performance drops, and eventually
everything slows to a crawl. I don't know why Linux would swap out
parts of a big process while plenty of memory is still free, but for
whatever reason a RAM disk of the same size doesn't cause this swap
thrashing. Maybe it's a phenomenon similar to what was seen on Irix,
where memory thrashing left no blocks large enough to allocate. I don't
know. I'll look deeper into it after the cacheoff.

Oh, yeah, the other benefit of the RAM disk vs. a big cache_mem/big
minimum_object_size is that on shutdown the RAM disk can be flushed to a
real disk so we don't lose those objects (it could even be done
periodically, minus swap.state, to avoid losing everything on an
improper shutdown or crash). Squid's in-memory objects can't be flushed
in any controlled manner. Maybe if I'm feeling smart one day, I'll try
to integrate that functionality, on the assumption that the Linux kernel
will eventually be fixed. I'm guessing Squid would probably be more
efficient without the fake-disk overhead of the RAM disk anyway. ;-)

Thanks for your thoughts, Henrik. Always enlightening.

-- 
Joe Cooper <joe@swelltech.com>
http://www.swelltech.com
Web Caching Appliances and Support