Re: [RFC] cache architecture

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 23 Jan 2012 22:16:57 -0700

On 01/23/2012 07:24 PM, Amos Jeffries wrote:
> This is just a discussion at present, for a checkup and possibly a
> long-term re-design of the overall architecture of the store logic. So
> the list of SHOULD DOs etc. will contain things Squid already does.
>
> This post is prompted by
> http://bugs.squid-cache.org/show_bug.cgi?id=3441 and other ongoing hints
> about user frustrations on the help lists and elsewhere.
>
> Cutting to the chase:
>
> Squid's existing methods of startup cache loading and error recovery
> are slow, with side-effects impacting bandwidth and end-user experience
> in various annoying ways. The swap.state mechanism speeds loading up
> enormously compared to the DIRTY scan, but in some cases it is still
> too slow.
>
>
> Ideal Architecture;
>
> Squid starts with assumption of not caching.

I believe you wanted to say something like "Squid starts serving
requests, possibly before Squid loads some or all of the cache contents,
if any".
Caching includes storing, loading, and serving hits. An ideal
architecture would not preclude storing and serving even if nothing was
loaded from disks yet. I believe you already document that below, but
the above sentence looks confusing/contradictory to me.

> Cache spaces are loaded as
> soon as possible, with priority given to the faster types, but loaded
> asynchronously to startup in a plug-n-play design.

Yes. Since loading a cache requires system resources, the faster we try
to load, the slower we can serve the regular traffic while we load the
cache. Some stores will need options to control how aggressive the
asynchronous load is. SMP helps, of course, but it does not solve the
problem completely in many cases.

Also, an ideal Store should accept (or at least gracefully reject) new
entries while its storage is being loaded. There should be no global "we
are loading the cache, use special care" code in main Squid.
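
To illustrate the kind of per-store behaviour I mean, here is a rough
sketch (the class and methods below are made up for illustration, not
the current Store API):

    // Hypothetical sketch, not real Squid code: a store that loads its
    // index asynchronously in bounded steps while still accepting or
    // gracefully rejecting new entries.
    #include <cstddef>
    #include <set>
    #include <string>

    class AsyncLoadingStore {
    public:
        AsyncLoadingStore(size_t indexEntriesToLoad, size_t capacity):
            remainingIndexEntries_(indexEntriesToLoad), capacity_(capacity) {}

        // Called from the event loop. The batch size is the "how aggressive
        // should the asynchronous load be" knob mentioned above.
        void loadStep(size_t maxEntriesPerStep) {
            for (size_t n = 0; n < maxEntriesPerStep && remainingIndexEntries_; ++n) {
                // ... read one swap.state (or equivalent) record here ...
                --remainingIndexEntries_;
            }
        }

        bool stillLoading() const { return remainingIndexEntries_ > 0; }

        // New entries are accepted or gracefully rejected at any time, so
        // main Squid code needs no global "we are loading the cache" state.
        bool add(const std::string &key) {
            if (keys_.size() >= capacity_)
                return false; // looks like any other "cannot cache this" case
            return keys_.insert(key).second;
        }

        bool has(const std::string &key) const { return keys_.count(key) > 0; }

    private:
        size_t remainingIndexEntries_; // index records not yet scanned
        size_t capacity_;              // crude stand-in for real store limits
        std::set<std::string> keys_;   // crude stand-in for the store map
    };

The point is only that the loading pace and the accept/reject decisions
live inside the store, not in the main request path.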

> 1) Requests are able to be processed at all times, but storage ability
> will vary independently of Squid's operational status.
> + minimal downtime until the first request is accepted and responded to
> - all or some caching benefits are lost at times
>
> 2) cache_mem shall be enabled by default and first amongst all caches
> + reduces the bandwidth impact from (1) if it happens before the first request
> + could also be set up async while Squid is already operating (the pro
> from (1) while minimising the con)

Sure.

> 3) possibly multiple cache_mem. A traditional non-shared cache_mem, a
> shared memory space, and an in-transit unstructured space.

In-transit space is not a cache, so we should not mix it with cache_mem
in an "ideal design" blueprint. Collapsed forwarding requires caching
and has to go through cache_mem, not in-transit space.

> + non-shared cache_mem allows larger objects than possible with the
> shared memory.
> + separate in-transit area allows collapsed forwarding to occur for
> incomplete but cacheable objects
> note that private and otherwise non-shareable in-transit objects are a
> separate thing not mentioned here.
> - maybe complex to implement, and long-term plans to allow paging
> mem_node pieces of large files should obsolete the shared/non-shared split.

Indeed. I do not see any compelling reasons to have shared _and_
non-shared caches at the same time. In the ideal design, the shared
cache will be able to store large objects, eliminating the need for the
non-shared cache. Please keep in mind that any non-shared cache would
violate HTTP in an SMP case.

In-transit space does not need to be shared (but it is separate from
caching as discussed above).
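
For reference, the memory-cache knobs on the configuration side would
look something like this in squid.conf (the directive names are the ones
I have in mind; the values are just an illustration, not a statement
about defaults):

    cache_mem 256 MB          # size of the memory cache
    memory_cache_shared on    # one memory cache shared by all SMP workers

Once the shared memory cache can hold large objects, the non-shared
variant adds nothing.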

> 4) config load/reload at some point enables a cache_dir
> + being async means we are not delaying the first response waiting for
> potentially long, slow disk processes to complete
> - creates a high MISS ratio during the wait for these to be available
> - adds CPU and async event queue load on top of active traffic loads,
> possibly slowing both traffic and cache availability
>
> 5) cache_dir maintains distinct (read,add,delete) states for itself
> + this allows read-only (1,0,0) caches, read-and-retain (1,1,0) caches
> + also allows old storage areas to be gracefully deprecated using
> (1,0,1), with the decreasing object count visibly reporting the
> progress of migration.
>
> 6) cache_dir structure maintains a "current" and a "max" available
> fileno setting.
> current always starts at 0 and ranges up to max; max is whatever
> swap.state, a hard-coded value, or another appropriate source tells
> Squid it should be.
> + allows scans to start with caches set to full access, but limits the
> area of access to the range of already scanned filenos between 0 and current.
> + allows any number of scan algorithms beyond CLEAN/DIRTY while
> minimising user-visible impact.
> + allows algorithms to be switched while processing
> + allows growing or shrinking cache spaces in real-time

Linear fileno space prevents many optimizations. I would not require it.
FWIW, Rock store does not use linear fileno space.
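
Just so we are talking about the same thing, here is how I read the
"current"/"max" idea from (6), as a rough sketch with made-up names
(and, as said, I would not require the fileno space to be linear):

    // Hypothetical sketch of the proposed current/max fileno window.
    // Hits are served only from the already-scanned part of the fileno
    // space, so the cache_dir can be "available" while the scan runs.
    #include <algorithm>
    #include <cstdint>

    class LinearFilenoScan {
    public:
        explicit LinearFilenoScan(uint32_t maxFileno): max_(maxFileno) {}

        // One step of whatever scan algorithm is in use (CLEAN, DIRTY, ...).
        // The algorithm can even change between steps; the rest of Squid
        // only cares about the "current" high-water mark.
        void scanStep(uint32_t filenosPerStep) {
            current_ = (max_ - current_ < filenosPerStep) ?
                max_ : current_ + filenosPerStep;
        }

        // A fileno may be used for a hit only if the scan has covered it.
        bool mayServe(uint32_t fileno) const { return fileno < current_; }

        bool scanComplete() const { return current_ >= max_; }

        // Growing or shrinking the cache in real time is just a max_ change.
        void resize(uint32_t newMax) {
            max_ = newMax;
            current_ = std::min(current_, max_);
        }

    private:
        uint32_t current_ = 0; // grows from 0 towards max_ as the scan runs
        uint32_t max_;         // from swap.state, a hard-coded value, etc.
    };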

> 7) cache_dir scan must account for corruption of individual files, the
> index entries, and any metadata construct like swap.state

Yes, ideally.

> 8) cache_dir scan should account for externally added files, regardless
> of the CLEAN/DIRTY algorithm being used.
> by this I mean: check for and handle (accept or erase) cache_dir
> entries not accounted for by the swap.state or equivalent metadata.
> + allows reporting what action was taken about the extra files, be it
> erase or import, and any related errors.

I think this should be left to individual Stores. Each may have its
own way of adding entries. For example, with Rock Store, you can add
entries even at runtime, but you need to update the shared maps
appropriately.

> Anything else?

0) Each Store (including memory cache) should have its own map. No
single global store_table and no assumptions on how a store maintains
its map. Main Squid code just needs to be able to add/search/remove and
summarize entries.
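
In other words, something along these lines is about all that main code
should rely on (a made-up sketch, not existing classes):

    // Hypothetical sketch: the per-Store map operations main code needs.
    // How a store implements them (hash table, shared memory, ...) is
    // its own business; there is no global store_table.
    #include <cstdint>
    #include <string>

    struct StoreSummary {
        uint64_t entryCount = 0;
        uint64_t byteCount = 0;
    };

    class StoreWithOwnMap {
    public:
        virtual ~StoreWithOwnMap() {}

        virtual bool add(const std::string &key) = 0;       // may reject
        virtual bool has(const std::string &key) const = 0; // search
        virtual void remove(const std::string &key) = 0;
        virtual StoreSummary summarize() const = 0;         // for reports
    };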

Thank you,

Alex.
Received on Tue Jan 24 2012 - 05:17:36 MST
