Re: [squid-users] Set up a cluster of 10 squid servers using ~170GB of memory and no disk

From: Amos Jeffries <>
Date: Thu, 03 Oct 2013 00:11:44 +1300

On 2/10/2013 11:24 p.m., Jérôme Loyet wrote:
> thx for your reply amos
> 2013/10/2 Amos Jeffries:
>> On 2/10/2013 10:02 p.m., Jérôme Loyet wrote:
>>> Hello,
>>> I'm facing a particular situation. I have to set up a squid cluster on
>>> 10 servers. Each server has a lot of RAM (192GB).
>>> Is it possible and effective to set up squid to use only memory for
>>> caching (about 170GB)?
>> memory-only caching is the default installation configuration for Squid-3.2
>> and later.
> is there any problem by setting cache_mem to 170GB ?

Couple of potential problems:
* the usual 32-bit problems if you are not careful about using a 64-bit
build (only 4GB accessible - shows up as ~3GB used cache).

* Issues around not leaving enough for the other processes on the system
to use. You do *not* want a Squid machine to start swapping memory out.

* cache_mem is *only* the size of storage area allocated to cached HTTP
objects. The index for that cache takes up 15MB per GB of cache space.
Plus other traffic memory requirements amounting to some MB (possibly a
few GB on a busy proxy).

* Squid can only store 2^24-1 objects in any one given cache area, so
170GB may be well above what ~16.8 million objects will actually occupy.

* If Squid crashes for any reason memory-only caches get wiped back to
empty. This can cause random sudden peaks in bandwidth as the cache is
re-filled.
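To put rough numbers on those last two points, here is a quick
back-of-the-envelope calculation using only the figures quoted above
(15MB of index per GB of cache, and the 2^24-1 object cap):

```python
# Back-of-the-envelope sizing for a 170GB memory-only cache.
# Figures come from the discussion above; treat them as rules of thumb.

CACHE_MEM_GB = 170
INDEX_MB_PER_GB = 15           # in-memory index overhead per GB of cache_mem
MAX_OBJECTS = 2**24 - 1        # per-cache-area object limit (~16.8 million)

# Index overhead on top of cache_mem itself.
index_overhead_gb = CACHE_MEM_GB * INDEX_MB_PER_GB / 1024

# Mean object size needed to actually fill 170GB before hitting the
# object-count limit.
min_avg_object_kb = CACHE_MEM_GB * 1024**2 / MAX_OBJECTS

print(f"index overhead: ~{index_overhead_gb:.2f} GB")
print(f"mean object size needed to fill the cache: ~{min_avg_object_kb:.1f} KB")
```

So the index alone costs roughly 2.5GB extra, and unless cached objects
average above roughly 10-11KB the object-count cap will bite before 170GB
is used.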

>>> What directive should be tweaked ? (cache_mem,
>>> cache_replacement_policy, maximum_object_size_in_memory, ...). The
>>> cache will store object from several KB (pictures) up to 10MB (binary
>>> chunks of data).
>> All of the above memory cache parameters. (cache_replacement_policy is a
>> disk parameter; the memory equivalent is memory_replacement_policy.)
>> cache_mem and maximum_object_size_in_memory in particular. The default is
>> for only small objects to be memory cached.
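As a concrete illustration of those directives, a minimal memory-only
configuration might look like this (the values are examples fitted to
this thread's workload, not recommendations):

```
# Memory-only cache: simply omit any cache_dir line (the Squid-3.2+ default)
cache_mem 170 GB
# allow objects up to the ~10MB binary chunks mentioned above
maximum_object_size_in_memory 10 MB
memory_replacement_policy heap GDSF
# NB: the index (~15MB per GB) and traffic overheads come on top of cache_mem
```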
>>> With 10 sibling cache squid instances, what should I use as type of
>>> cache peering ? (10 siblings or a multicast ICP cluster, other ?)
>> With a memory cache holding under-32KB objects, SMP workers would be best.
>> They share a single memory cache, but it is limited to objects fitting in
>> 32KB memory pages due to Squid's internal storage design. I'm not sure yet
>> whether the work underway to extend Rock storage past this same limit is
>> going to help the shared memory cache as well (hope so, but don't know).
> I'm planning to set up one squid instance on each of the 10 servers.
> As the server won't be dedicated to squid, I want to limit squid to
> run on one single process (no SMP).

Not that it matters, but that logic does not quite hold together. A
Squid worker, whether in SMP mode or not, is single-threaded and will
consume at most one full CPU core. The more cores your box has, the
smaller the share of total CPU one Squid instance will consume.
SMP "just" allows easier configuration of multiple worker processes.
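In squid.conf terms that is a single directive; for example (the worker
count here is purely illustrative):

```
# Start 4 kid processes sharing one configuration and one shared memory cache
workers 4
```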

>> For now, if you want larger objects served from memory you will be best
>> off with a CARP cluster or an HTCP sibling configuration.
>> NP: I recommend staying away from ICP with Squid-3.2 and later. We have not
>> yet changed the defaults, but ICP has a very high false-positive HIT rate
>> on HTTP/1.1 traffic. HTCP is more bandwidth-hungry on the UDP side but far
>> better on HIT rate.
> what's your opinion about multicast for sharing cache ?

For those where it works it can be nice. It certainly reduces the
bandwidth and delays from ICP queries.
However, Squid only supports multicast for ICP (not HTCP), and that means
living with the ICP false-positive rate. All those popular sites using
HTTP/1.1 responses with Vary and content negotiation are almost guaranteed
to false-HIT with ICP.
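For comparison, the HTCP sibling mesh suggested earlier is one cache_peer
line per other server, along these lines (hostnames are placeholders;
3128 is the peer HTTP port and 4827 the default HTCP port):

```
# proxy-only avoids re-storing objects a sibling already holds
cache_peer peer1.example.com sibling 3128 4827 htcp proxy-only
cache_peer peer2.example.com sibling 3128 4827 htcp proxy-only
# ... and so on for the remaining siblings
```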

Received on Wed Oct 02 2013 - 11:11:50 MDT

This archive was generated by hypermail 2.2.0 : Wed Oct 02 2013 - 12:00:04 MDT