Re: [squid-users] Re: Filter squid cached files to multiple cache dirs

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 24 Aug 2014 11:46:00 +1200

On 24/08/2014 6:06 a.m., dxun wrote:
> So, to sum it all up (please correct me if I'm wrong) - it is possible to
> have multiple cache_dirs AND instruct a single squid instance to place files
> in those caches according to file size criteria using
> min_file_size/max_file_size params on the cache_dir directive. Also,
> maximum_object_size directive is basically a global max_file_size param
> applied to all cache_dirs, so it has to be specified BEFORE any particular
> cache_dir configuration.

Sort of.
 * default value for maximum_object_size is 4MB, which is used until you
change it.
 * maximum_object_size provides the default value for the cache_dir
max-size=N parameter. Its current value is applied only if you omit
that parameter from a cache_dir.

For example (not a good idea to actually do it like this):
  # default maximum_object_size is 4 MB
  cache_dir ufs /a 100 16 256

  maximum_object_size 8 MB
  cache_dir ufs /b 100 16 256

  maximum_object_size 2 MB
  cache_dir ufs /c 100 16 256

Is the same as writing:

  cache_dir ufs /a 100 16 256 max-size=4194304
  cache_dir ufs /b 100 16 256 max-size=8388608
  cache_dir ufs /c 100 16 256 max-size=2097152

>
> If that is the case, I am wondering - is this separation actually
> inadvisable for any reason?

It is advisable for better performance in high-throughput configurations
with multiple cache_dirs.
It does not matter much for other configurations.
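
For illustration, a minimal sketch of such a split using the cache_dir
min-size/max-size options (the paths, sizes and 64 KB boundary are
placeholders, not recommendations):

  # objects up to 64 KB go to the first disk
  cache_dir aufs /cache/small 10000 16 256 max-size=65536
  # everything larger goes to the second disk
  cache_dir aufs /cache/large 50000 16 256 min-size=65537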

> Is there a better way to separate files
> according to their transiency and underlying data store speed?

Squid automatically separates out the most recently and frequently used
objects for storage in the high-speed RAM cache. It also monitors the
drive I/O stats for overloading. There is just no differentiation between
HDD and SSD speeds (yet) - although indirectly, via the loading checks,
an SSD can see more object throughput than an HDD.

The rock cache type is designed to reduce disk I/O load on objects with
high temporal locality ("pages" often requested or updated together in
bunches), particularly if they are small objects.
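
As a minimal sketch (the directory and size are placeholders; rock takes
a Mbytes value but no L1/L2 values):

  # rock store for small objects; 32KB is the per-object limit in current stable releases
  cache_dir rock /cache/rock 5000 max-size=32768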

Transiency is handled in memory, by RAM-caching objects for a while
before they go anywhere near disk. This is controlled by
maximum_object_size_in_memory: in older Squid, objects over that limit
incur disk I/O regardless of transiency. The upcoming 3.5 releases only
involve the disk for them if they are actually cacheable.
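
The memory side of that is tuned with directives along these lines
(values here are illustrative only):

  # total size of the RAM cache
  cache_mem 256 MB
  # objects above this size never enter the RAM cache, so they involve disk I/O
  maximum_object_size_in_memory 512 KB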

> What would
> you recommend?

The upstream recommendation is to configure maximum_object_size, then
your cache_dir entries ordered by the size of objects going into them
(smallest to largest); see the sketch below.

Also, use a rock type cache_dir for the smallest objects. It can be
placed on the same HDD as an AUFS cache; working together, a rock store
for small objects and AUFS for large objects can utilize large HDDs
better than either cache type alone.
 * 32KB is the object size limit for rock in current stable releases;
that is about to be increased with squid-3.5.
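
Putting those two recommendations together, a rough sketch (paths and
sizes are placeholders, not tuning advice):

  # global object size cap, set before the cache_dir lines it applies to
  maximum_object_size 512 MB

  # rock for the smallest objects, listed first (smallest to largest)
  cache_dir rock /hdd/rock 10000 max-size=32768

  # AUFS on the same HDD for everything bigger
  cache_dir aufs /hdd/aufs 90000 16 256 min-size=32769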

Based on theory and second-hand reports: I would only use an SSD for a
rock type cache, with the block size parameter of the rock cache sized
to match the SSD sector or page size. That way, writing a single rock
block/page is the only thing that bumps each SSD sector/page further
towards its lifetime write limit.
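
A hedged sketch of that, assuming the slot-size=N option of the newer
(3.5+) rock store as the "block size" knob, and a 4 KB SSD page size
(check your drive's actual figure):

  # rock cache on the SSD; slot size matched to the SSD page size
  cache_dir rock /ssd/rock 20000 slot-size=4096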

Amos