Re: [RFC] cache architecture

From: Pieter De Wit <pieter_at_insync.za.net>
Date: Tue, 24 Jan 2012 20:51:48 +1300

Hi Amos,

My 2c :)

I know you have noted it already, but please, please don't change the
startup to something like another project does (a dirty scan at startup -
I have personally seen a 12-hour-plus startup time). Why they are still
on that, who knows... anyway, back to the email here :)

Alex, sorry if you already covered these things; I only briefly skimmed
your email (I was in transit), so count them as a +1.

On 24/01/2012 15:24, Amos Jeffries wrote:
> This is just a discussion at present for a checkup and possibly
> long-term re-design of the overall Architecture for store logics. So
> the list of SHOULD DO etc will contain things Squid already does.
>
> This post is prompted by
> http://bugs.squid-cache.org/show_bug.cgi?id=3441 and other ongoing
> hints about user frustrations on the help lists and elsewhere.
>
> Getting to the chase;
>
> Squid's existing methods of startup cache loading and error recovery
> are slow with side-effects impacting bandwidth and end-user experience
> in various annoying ways. The swap.state mechanism speeds loading up
> enormously as compared to the DIRTY scan, but in some cases is still
> too slow.
>
>
> Ideal Architecture;
>
> Squid starts with assumption of not caching. Cache spaces are loaded
> as soon as possible with priority to the faster types. But loaded
> asynchronously to the startup in a plug-n-play design.
>
> 1) Requests are able to be processed at all times, but storage ability
> will vary independent of Squid operational status.
> + minimal downtime to first request accepted and responded
> - lost all or some caching benefits at times
(+) The memory cache will be empty, so there is "plenty" of space for
objects... more on this later (not sure about squid -k reconfigure etc.).
>
> 2) cache_mem shall be enabled by default and first amongst all caches
> + reduces the bandwidth impact from (1) if it happens before first
> request
> + could also be setup async while Squid is already operating (pro from
> (1) while minimising the con)
>
> 3) possibly multiple cache_mem. A traditional non-shared cache_mem, a
> shared memory space, and an in-transit unstructured space.
> + non-shared cache_mem allows larger objects than possible with the
> shared memory.
> + separate in-transit area allows collapsed forwarding to occur for
> incomplete but cacheable objects
> note that private and otherwise non-shareable in-transit objects are
> a separate thing not mentioned here.
> - maybe complex to implement and long-term plans to allow paging
> mem_node pieces of large files should obsolete the shared/non-shared
> split.
I guess the multiple cache_mems will "load share" like the cache_dirs do
at the moment? Not sure why I mention this...
>
> 4) config load/reload at some point enables a cache_dir
> + being async means we are not delaying first response waiting for
> potentially long slow disk processes to complete
> - creates a high MISS ratio during the wait for these to be available
> - adds CPU and async event queue load on top of active traffic loads,
> possibly slowing both traffic and cache availability
>
> 5) cache_dir maintains distinct (read,add,delete) states for itself
> + this allows read-only (1,0,0) caches, read-and-retain (1,1,0) caches
> + also allows old storage areas to be gracefully deprecated using
> (1,0,1) with object count decrease visibly reporting the progress of
> migration.
+2 on this. Perhaps add a config option to cache_mem and cache_dir
called "state", e.g. state=ro|rw|expire, so "ro,expire" would allow a
cache to be expired, i.e. objects deleted (maybe Alex covered this?)
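
Just to make that concrete, a very rough sketch (invented names, this is
not existing Squid code) of how the (read,add,delete) triple and a
"state=" option could line up:

// Hypothetical illustration only - none of these names exist in Squid.
#include <string>

struct CacheDirState {
    bool canRead  = false;   // serve HITs from this store
    bool canAdd   = false;   // admit new objects
    bool canPurge = false;   // allow removal/expiry of objects
};

// "state=ro"        -> (1,0,0) read-only
// "state=ro,expire" -> (1,0,1) read-and-drain (graceful deprecation)
// "state=rw"        -> (1,1,1) normal operation
CacheDirState parseStateOption(const std::string &opt)
{
    CacheDirState s;                       // crude parsing, illustration only
    if (opt.find("ro") != std::string::npos ||
        opt.find("rw") != std::string::npos)
        s.canRead = true;
    if (opt.find("rw") != std::string::npos)
        s.canAdd = s.canPurge = true;
    if (opt.find("expire") != std::string::npos)
        s.canPurge = true;
    return s;
}
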
>
> 6) cache_dir structure maintains a "current" and a "max" available
> fileno setting.
> current always starting at 0 and being up to max. max being at
> whatever swap.state, a hard-coded value or appropriate source tells
> Squid it should be.
> + allows scans to start with caches set to full access, but limit the
> area of access to a range of already scanned fileno between 0 and
> current.
> + allows any number of scan algorithms beyond CLEAN/DIRTY and while
> minimising user visible impact.
> + allows algorithms to be switched while processing
> + allows growing or shrinking cache spaces in real-time
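
No comment on 6) beyond +1, but to check my understanding, a rough sketch
(hypothetical names) of the current/max window as I read it:

// Hypothetical illustration of point 6 - names are made up.
// Only filenos below "current" have been validated by the scan, so only
// those are eligible for HITs; the async scanner bumps "current" as it
// goes, and "max" bounds growing/shrinking of the cache area.
#include <atomic>
#include <cstdint>

struct ScanWindow {
    std::atomic<uint64_t> current{0};  // highest fileno validated so far
    uint64_t max = 0;                  // from swap.state, hard-coded, or other source

    bool readable(uint64_t fileno) const {
        return fileno < current.load(std::memory_order_acquire);
    }

    void advance(uint64_t scannedUpTo) {   // called by the scan as it progresses
        current.store(scannedUpTo, std::memory_order_release);
    }
};
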
>
> 7) cache_dir scan must account for corruption of both individual
> files, the index entries, and any meta data construct like swap.state
>
> 8) cache_dir scan should account for externally added files.
> Regardless of CLEAN/DIRTY algorithm being used.
> by this I mean check for and handle (accept or erase) cache_dir
> entries not accounted for by the swap.state or equivalent meta data.
> + allows reporting what action was taken about the extra files. Be it
> erase or import and any related errors.
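
For 8), a small sketch (invented names, nothing like this exists) of the
accept-or-erase-and-report decision per stray file, as I read it:

// Hypothetical illustration only.
#include <cstdint>
#include <string>

struct StrayReport {
    uint64_t imported = 0;   // extra files accepted into the index
    uint64_t erased = 0;     // extra files removed from the cache_dir
};

enum class StrayPolicy { Import, Erase };

// called once per on-disk entry not accounted for by swap.state
// (or the equivalent metadata), whatever the CLEAN/DIRTY algorithm
void handleStray(const std::string &path, StrayPolicy policy, StrayReport &report)
{
    if (policy == StrayPolicy::Import) {
        // (re)build an index entry for the object at 'path' - not shown
        ++report.imported;
    } else {
        // unlink 'path' from the cache_dir - not shown
        ++report.erased;
    }
    // the totals in 'report' give the admin the "what action was taken" log
}
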
>
>
> Anything else?
>
> Amos
>
When you mentioned "cache spaces" I thought:

Why not have blocks allocated (or block devices) laid out like this:

Block 0+ (of said block "thing") - a header/info block, replacing swap.state
Blocks x/y/z - allocated to objects

After reading block 0 you know which blocks are free, and you can start
caching into those straight away. Yes, this will cause duplicate objects,
but you get to your end goal, and you can expire the old "cached object"
later, recovering its blocks. The same method can be used for memory
blocks.

Perhaps too much work for little gain? "Leave it to the OS/filesystem"
comments are welcome.
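
A very rough sketch of the on-disk layout I have in mind (all names
invented, nothing like this exists):

// Hypothetical illustration only.
#include <cstdint>
#include <vector>

struct StoreHeader {       // lives in block 0+ in place of swap.state
    uint32_t magic;        // format/sanity check
    uint32_t blockSize;    // e.g. 4096 bytes
    uint64_t totalBlocks;
    // followed on disk by an allocation bitmap: 1 bit per block, 1 = in use
};

struct BlockMap {
    std::vector<uint8_t> bits;   // the bitmap, loaded from the header blocks

    bool inUse(uint64_t blk) const {
        return bits[blk / 8] & (1u << (blk % 8));
    }

    // new objects go into any block not marked in-use; in-use blocks keep
    // their (possibly duplicate/stale) objects until they are expired
    int64_t firstFree(uint64_t totalBlocks) const {
        for (uint64_t b = 0; b < totalBlocks; ++b)
            if (!inUse(b))
                return static_cast<int64_t>(b);
        return -1;   // device full
    }
};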

Perhaps a 9) Implement dual IO queues - I *think* the IO has been moved
into its own thread; if not, the queueing can still be applied. Any form
of checking the cache is going to affect Squid, so how do we ensure we
only do it while idle? Dual queues :) Queue 1 holds the requests for
Squid, queue 2 holds the admin/clean-up requests. The IO "thread" (if not
threaded), before handling an admin/clean-up request, checks Queue 1 for
requests and empties it *totally* before heading into Queue 2. This would
allow you to keep the same caching as now while relieving the start-up
problems. It might lead to the same double-caching of objects as above
(if you make the cache writable before the scan is done).
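
A sketch of the dual-queue idea (hypothetical types, just to show the
drain order):

// Hypothetical illustration only.
#include <deque>
#include <functional>

using IoJob = std::function<void()>;

struct DualQueue {
    std::deque<IoJob> client;   // queue 1: real traffic IO
    std::deque<IoJob> admin;    // queue 2: scan / clean-up / validation IO

    void runOnce() {
        // empty queue 1 *totally* before even looking at queue 2
        while (!client.empty()) {
            client.front()();
            client.pop_front();
        }
        if (!admin.empty()) {
            admin.front()();    // then at most one admin job per pass
            admin.pop_front();
        }
    }
};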

Back to my hole :)

Pieter