Re: [squid-users] Two Squid with common cache from Nyamul Hassan on 2009-03-06 (squid-users)

From: Nyamul Hassan <mnhassan_at_usa.net>
Date: Fri, 6 Mar 2009 21:31:17 +0600

>>>> This thread just raised another question in my mind. How would adding
>>>> another instance (in the same box) with "null cache_dir" improve my
>>>> performance? I'm a bit confused. Could you please explain a bit more?
>>>
>>> I don't think it would improve performance per se. It would meet your
>>> criterion of "i would like the cache_dir to be common between the 2
>>> squid".
>>>
>>
>> Exactly, as on-disk objects will be served by Squid A alone. Objects in
>> cache_mem, however, will be served by Squid A or B from their own
>> respective memory allocations, and a TCP_MISS will be served entirely
>> from the Internet (assuming Squid A is not configured to fetch from the
>> Internet on an ICP_MISS).
>>
>>> Two squid running on the same box:
>>> Squid-A has a large on-disk cache_dir
>>> Squid-B has no disk cache, and sources all requests from Squid-A.
>>>
>>> The overall effect is that only one copy of each cached object is held
>>> on disk at that machine (in Squid-A's cache).
>>>
>>> Since Squid-B passes on most requests to Squid-A, the actual response
>>> is less than 2x, more like: up to 1.5x capacity of only squid-A.
>>>
>>
>> I didn't get this increase of "Capacity". Also, what do you mean by
>> "response"? Response Time? Can you explain a bit more?
>
> "Request capacity": on a given CPU, Squid has an absolute maximum number
> of requests per second it can process.
>
> Using two squid in parallel load-balanced doubles that maximum req/sec,
> assuming the load balancer can handle it too. This setup with one as
> parent gets less than 1.5x (half-again) instead of 2x (double).
>

Is there a way to measure this "Maximum Absolute Number" of requests?

>>
>>> It's up to you if the loss of ~25% response capacity is worth the
>>> duplicate object removal.
>>>
>>
>> I'm stumped on this one too. Can you please also educate me on "Response
>> Capacity" and "Duplicate Object Removal"?
>
> Capacity: as above.
> "Duplicate object removal": having only one cache of objects instead of
> two.
>

On the "duplicate" note, if you've got sibling proxies configured as
"proxy-only", that should minimize duplication: anything that Squid A
finds on Squid B is served from Squid B, and Squid A does not cache that
content. The only duplication I can think of happens when the same object,
not yet cached in either of them, is requested from both Squid A and
Squid B simultaneously, which shouldn't happen often.

>>
>>> There is no net gain in response timing, since the Squid-B -> Squid-A
>>> request plus the Squid-A disk read is usually slower than a disk read
>>> alone.
>>>
>>
>> This I understand clearly.
>>
>>> Off to performance:
>>> for best performance, cache small objects in COSS directories and
>>> tune AUFS/DiskD to the OS type you are running on. Ensure only one
>>> disk spindle is used per cache_dir, with the fastest disks you can buy.
>>> Size doesn't matter, speed does.
>>>
>>
>> Yes, exactly as you and other gurus in the forum have said in numerous
>> posts. On this note, I have seen that COSS has minimal impact on disk
>> usage, almost negligible. I also read somewhere that Linux's use of
>> in-memory disk buffers is quite efficient. So, which is more
>> efficient - Squid's "cache_mem" or the OS's in-memory disk buffer?
>
> Ah, one of the really big questions.
>

Looks like we have yet to find a suitable answer to that! Is there a
way to test this? Say, by running with no "cache_mem", seeing how far we
can push it, and getting comparable readings on IO_Wait and service timers?

>>
>>> Then pump up the machine memory as far as it can go and allocate as
>>> many bytes to cache_mem as possible without causing any memory swapping
>>> (particularly prevent swapping when under highest load).
>>>
>>
>> How do you find out if the system is "swapping"? We already have
>> configured the Linux box to have NO swap space.
>
> That I'm not sure of myself, never having faced that much server load.
>

Well, in our case we've got only 8 GB of RAM, and cache_mem is set to 3GB.
Squid process shows it is consuming ~4GB of memory (in top).

>>
>> We have the following in each of our Squid boxes:
>> 4 x 250 GB SATA HDDs each having:
>> - 20 GB COSS with max-size=1000000 maxfullbufs=32 membufs=256
>> block-size=2048 max-size=65536
>> - 160 GB aufs with min-size=65537
>>
>> So, does that mean the Squid process needs 10 GB [(14MB / GB) x (180 GB
>> per disk) x 4 disks] of memory just to maintain the index? And, on top
>> of that the size of "cache_mem" that we specify in our config? And, also
>> on top of these, some space for disk buffers?
>
> I believe so. Yes.
>

Looks like we really need to move to 16 GB of memory!
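
For what it's worth, the arithmetic behind that conclusion can be sketched in a few lines of Python. This is only a rough estimate under the figures used in this thread: about 14 MB of index per GB of cache_dir (the rule of thumb quoted above, not something I measured), 180 GB of cache per disk, 4 disks, and cache_mem at 3 GB. The function and variable names are mine, purely for illustration, and the sketch ignores Squid's other overheads and the OS disk buffers.

```python
# Rough memory budget for one Squid box, using the figures from this thread.
# All names here are illustrative; the 14 MB/GB index figure is the rule of
# thumb quoted above, not a measured value.

INDEX_MB_PER_GB = 14  # assumed index overhead per GB of cache_dir


def index_memory_mb(cache_gb_per_disk, disks, mb_per_gb=INDEX_MB_PER_GB):
    """Estimate the RAM (in MB) needed to index all cache_dirs."""
    return cache_gb_per_disk * disks * mb_per_gb


def total_memory_gb(cache_gb_per_disk, disks, cache_mem_gb):
    """Index estimate plus cache_mem, in GB (1 GB = 1024 MB)."""
    return index_memory_mb(cache_gb_per_disk, disks) / 1024 + cache_mem_gb


if __name__ == "__main__":
    per_disk_gb = 20 + 160  # 20 GB COSS + 160 GB aufs on each disk
    index_mb = index_memory_mb(per_disk_gb, disks=4)
    total_gb = total_memory_gb(per_disk_gb, disks=4, cache_mem_gb=3)
    print(f"index: {index_mb} MB, index + cache_mem: {total_gb:.1f} GB")
```

With our numbers this comes to roughly 10 GB for the index alone and close to 13 GB once cache_mem is added, before any disk buffers, which is well past the 8 GB we have now.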

Regards
HASSAN
Received on Fri Mar 06 2009 - 15:31:46 MST
