Re: [RFC] cache architecture

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 24 Jan 2012 22:32:44 +1300

On 24/01/2012 8:51 p.m., Pieter De Wit wrote:
> Hi Amos,
>
> My 2c :)
>
> I know you have noted it already, but please, please don't change the
> startup to something like another project does (a start-up dirty scan - I
> have personally seen a 12-hour-plus start-up time). Why they're still on
> that, who knows... anyway, back to the email here :)

Sure. I'm absolutely trying to avoid any such thing.

>
> Alex, sorry if you covered these things, I have briefly skimmed over
> your email (was in transit) so count them as a +1.
>
> On 24/01/2012 15:24, Amos Jeffries wrote:
>> This is just a discussion at present for a checkup and possibly
>> long-term re-design of the overall Architecture for store logics. So
>> the list of SHOULD DO etc will contain things Squid already does.
>>
>> This post is prompted by
>> http://bugs.squid-cache.org/show_bug.cgi?id=3441 and other ongoing
>> hints about user frustrations on the help lists and elsewhere.
>>
>> Getting to the chase;
>>
>> Squid's existing methods of startup cache loading and error recovery
>> are slow, with side-effects impacting bandwidth and end-user
>> experience in various annoying ways. The swap.state mechanism speeds
>> loading enormously compared to the DIRTY scan, but in some cases it
>> is still too slow.
>>
>>
>> Ideal Architecture;
>>
>> Squid starts with the assumption of not caching. Cache spaces are
>> loaded as soon as possible, with priority given to the faster types,
>> but asynchronously to startup in a plug-n-play design.
>>
>> 1) Requests can be processed at all times, but storage
>> ability will vary independently of Squid's operational status.
>> + minimal downtime to first request accepted and responded
>> - lost all or some caching benefits at times
> (+) The memory cache will be empty, "plenty" of space for
> objects....more on this later (not sure about squid -k reconfigure
> etc...)
>>
>> 2) cache_mem shall be enabled by default and first amongst all caches
>> + reduces the bandwidth impact from (1) if it happens before first
>> request
>> + could also be set up async while Squid is already operating (pro
>> from (1) while minimising the con)
>>
>> 3) possibly multiple cache_mem. A traditional non-shared cache_mem, a
>> shared memory space, and an in-transit unstructured space.
>> + non-shared cache_mem allows larger objects than possible with the
>> shared memory.
>> + separate in-transit area allows collapsed forwarding to occur for
>> incomplete but cacheable objects
>> note that private and otherwise non-shareable in-transit objects
>> are a separate thing not mentioned here.
>> - maybe complex to implement, and long-term plans to allow paging
>> mem_node pieces of large files should make the shared/non-shared
>> split obsolete.
> I guess the multiple cache_mem's will "load share" like the
> cache_dir's atm ? Not sure why I mention this....
>>
>> 4) config load/reload at some point enables a cache_dir
>> + being async means we are not delaying the first response waiting for
>> potentially long, slow disk processes to complete
>> - creates a high MISS ratio during the wait for these to be available
>> - adds CPU and async event queue load on top of active traffic loads,
>> possibly slowing both traffic and cache availability
>>
>> 5) cache_dir maintains distinct (read,add,delete) states for itself
>> + this allows read-only (1,0,0) caches, read-and-retain (1,1,0) caches
>> + also allows old storage areas to be gracefully deprecated using
>> (1,0,1), with the decreasing object count visibly reporting the
>> progress of migration.
> +2 on this. Perhaps add a config option to cache_mem and cache_dir
> called "state", e.g. state=ro|rw|"expire", so "ro,expire" will allow a
> cache to be expired, aka objects deleted (maybe Alex covered this ?)
>>
>> 6) cache_dir structure maintains a "current" and a "max" available
>> fileno setting.
>> current always starts at 0 and ranges up to max; max is whatever
>> swap.state, a hard-coded value, or another appropriate source tells
>> Squid it should be.
>> + allows scans to start with caches set to full access, while limiting
>> the area of access to the range of already-scanned fileno between 0
>> and current.
>> + allows any number of scan algorithms beyond CLEAN/DIRTY while
>> minimising user-visible impact.
>> + allows algorithms to be switched while processing
>> + allows growing or shrinking cache spaces in real-time
>>
>> 7) cache_dir scan must account for corruption of individual files,
>> the index entries, and any metadata construct like swap.state
>>
>> 8) cache_dir scan should account for externally added files,
>> regardless of the CLEAN/DIRTY algorithm being used.
>> By this I mean check for and handle (accept or erase) cache_dir
>> entries not accounted for by swap.state or equivalent metadata.
>> + allows reporting what action was taken about the extra files, be it
>> erase or import, and any related errors.
>>
>>
>> Anything else?
>>
>> Amos
>>
> When you mentioned "cache spaces" I thought:
>
> Why not have blocks allocated (or block devices) like this:
>
> Block 0+ (of said block "thing") - This is a header/info block/swap
> state replacement
> Block x/y/z - Allocated to objects
>
> After reading block 0, you know which blocks are free and you can start
> caching in those. Yes, this will cause duplicate objects, but you get to
> your end goal, and you can expire the "cached object", recovering the
> blocks. The same method can be used for memory blocks.
>
> Perhaps too much work for little gain? "Leave it to the OS/filesystem"
> comments welcome.

Good point. The loading is already done in one form or another by all
the cache types, but it needs formalizing in the design description. rock
and COSS already use records and slices in this fashion.

FYI: at the architectural level the "space" would be an arbitrary size
represented in squid.conf by a cache_dir line or a cache_mem line. The
"space" can be loaded in parallel or in overlapping batches or whatever.
A good space would break its full size into processing blocks like the
ones you describe.
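
To make that concrete, here is a rough, purely illustrative sketch of such
a block-structured space (the class and field names are made up for this
example, not Squid code): a small header records the block geometry, the
loader marks blocks in-use as it verifies them, and free blocks can accept
new objects while the scan is still running.

// Hypothetical sketch of a block-structured cache space.
// None of these names exist in Squid; sizes are arbitrary.
#include <cstdint>
#include <vector>
#include <iostream>

struct SpaceHeader {
    uint32_t blockSize;   // bytes per data block
    uint32_t blockCount;  // total data blocks in the space
};

class BlockSpace {
public:
    BlockSpace(uint32_t blockSize, uint32_t blockCount)
        : header_{blockSize, blockCount}, inUse_(blockCount, false) {}

    // Loader marks a block as holding a verified object.
    void markInUse(uint32_t block) { inUse_.at(block) = true; }

    // Free blocks can accept new objects immediately, even while
    // the rest of the space is still being scanned.
    bool allocate(uint32_t &block) {
        for (uint32_t i = 0; i < header_.blockCount; ++i) {
            if (!inUse_[i]) { inUse_[i] = true; block = i; return true; }
        }
        return false; // space full
    }

private:
    SpaceHeader header_;
    std::vector<bool> inUse_;
};

int main() {
    BlockSpace space(16 * 1024, 8); // 8 blocks of 16 KB, purely illustrative
    space.markInUse(2);             // pretend the scan verified block 2
    uint32_t b;
    if (space.allocate(b))
        std::cout << "new object stored in block " << b << "\n";
}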

>
> Perhaps a 9) Implement dual IO queues - I *think* the IO has been
> moved into its own thread; if not, the queuing can still be applied.
> Any form of checking the cache is going to affect squid, so how do we
> ensure we are idle? Dual queues :) Queue 1 holds the requests for
> squid, queue 2 holds the admin/clean-up requests. The IO "thread" (if
> not threaded), before handling an admin/clean-up request, checks Queue
> 1 for requests and empties it *totally* before heading into Queue 2. This
> will allow you to have the same caching as now, relieving the start-up
> problems ? Might lead to the same double caching of objects as above (if
> you make the cache writable before the scan is done)

I wonder about priority queues every now and then. It is an interesting
idea. The I/O is currently done with pluggable modules of various
kinds. DiskThreads and AIO sort of do this, but they are FIFO queued in N
parallel queues. Prioritised queues could be an interesting additional
DiskIO module.
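
For illustration, here is a minimal sketch of the dual-queue dispatch idea
you describe (hypothetical code, not an existing DiskIO module): the
dispatcher always drains the client-request queue before it touches any
admin/cleanup work.

// Sketch of a two-priority I/O dispatch queue; purely illustrative,
// not an actual Squid DiskIO module.
#include <functional>
#include <queue>
#include <iostream>

class DualQueue {
public:
    void pushClient(std::function<void()> op) { client_.push(std::move(op)); }
    void pushMaintenance(std::function<void()> op) { maint_.push(std::move(op)); }

    // Run one pending operation, preferring client traffic over
    // admin/cleanup work so scans never delay live requests.
    bool runOne() {
        if (!client_.empty()) { client_.front()(); client_.pop(); return true; }
        if (!maint_.empty()) { maint_.front()(); maint_.pop(); return true; }
        return false; // nothing queued
    }

private:
    std::queue<std::function<void()>> client_;
    std::queue<std::function<void()>> maint_;
};

int main() {
    DualQueue q;
    q.pushMaintenance([] { std::cout << "scan one cache entry\n"; });
    q.pushClient([] { std::cout << "serve a HIT from disk\n"; });
    while (q.runOne())
        ;  // the client read runs first, then the scan step
}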

What I'm looking for is a bit more abstract, at the architecture level
across cache types and implementations. At that scale we can't use any
form of "totally empty" queue condition, because on caches that receive
much traffic the queue would be quite full, maybe never actually empty.
Several of the problems we have now come from waiting for the cache load
to complete (i.e. the load action queue to empty) before a cache is even
considered for use.
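
Tying that back to points (5) and (6) above, here is one hypothetical way
the per-cache state could be modelled so a cache never has to wait for a
complete scan before being considered usable (a sketch only, with invented
names, not existing Store code):

// Hypothetical per-cache availability state combining the
// (read, add, delete) flags from point (5) with the current/max
// fileno range from point (6). Not existing Squid Store code.
#include <cstdint>
#include <iostream>

struct CacheSpaceState {
    bool canRead = false;
    bool canAdd = false;
    bool canDelete = false;
    uint64_t currentFileno = 0;  // highest fileno verified by the scan so far
    uint64_t maxFileno = 0;      // size the cache is allowed to grow to

    // A HIT lookup only needs the entry to fall inside the
    // already-scanned range; the scan can keep running behind it.
    bool mayServe(uint64_t fileno) const {
        return canRead && fileno < currentFileno;
    }
};

int main() {
    CacheSpaceState dir;
    dir.canRead = true;
    dir.canAdd = true;
    dir.maxFileno = 1 << 20;
    dir.currentFileno = 4096;   // scan has verified the first 4096 entries

    std::cout << "fileno 100: " << (dir.mayServe(100) ? "usable" : "wait") << "\n";
    std::cout << "fileno 9999: " << (dir.mayServe(9999) ? "usable" : "wait") << "\n";
}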

Amos