Re: [RFC] cache architecture

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 24 Jan 2012 21:51:29 +1300

On 24/01/2012 6:16 p.m., Alex Rousskov wrote:
> On 01/23/2012 07:24 PM, Amos Jeffries wrote:
>> This is just a discussion at present, for a checkup and possibly a
>> long-term re-design of the overall architecture for store logic. So the
>> list of SHOULD DOs etc. will contain things Squid already does.
>>
>> This post is prompted by
>> http://bugs.squid-cache.org/show_bug.cgi?id=3441 and other ongoing hints
>> about user frustrations on the help lists and elsewhere.
>>
>> Getting to the chase:
>>
>> Squid's existing methods of startup cache loading and error recovery are
>> slow with side-effects impacting bandwidth and end-user experience in
>> various annoying ways. The swap.state mechanism speeds loading up
>> enormously as compared to the DIRTY scan, but in some cases is still too
>> slow.
>>
>>
>> Ideal Architecture;
>>
>> Squid starts with assumption of not caching.
> I believe you wanted to say something like "Squid starts serving requests
> possibly before Squid loads some or all of the cache contents, if any".
> Caching includes storing, loading, and serving hits. An ideal
> architecture would not preclude storing and serving even if nothing was
> loaded from disks yet. I believe you already document that below, but
> the above sentence looks confusing/contradicting to me.

I meant something a bit more extreme than that: Squid should be prepared
to serve requests before it has even reached the async call which
initializes the first cache area.
We often think of cache_mem as being always present whenever any caching
is done, but there really is no such guarantee. An admin can already
configure several cache_dir entries and "cache_mem 0". The problem is
just that today's Squid does some horrible things when configured that
way, as side effects of our current design assumptions.
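
To put it in rough pseudo-terms (everything named below is invented for
illustration, none of it is real Squid code): the listener is usable
immediately and store initialization is just another queued async call,
so "no cache yet" is the normal initial state rather than a special one.

    // Illustrative sketch only; StoreRegistry and the queued lambdas are
    // hypothetical stand-ins, not Squid internals.
    #include <functional>
    #include <iostream>
    #include <queue>
    #include <string>
    #include <vector>

    struct StoreRegistry {
        std::vector<std::string> readyStores;
        bool anyStoreReady() const { return !readyStores.empty(); }
    };

    int main() {
        StoreRegistry stores;
        std::queue<std::function<void()>> asyncCalls; // stand-in for the event loop

        // Store initialization is queued, not completed before the first request.
        asyncCalls.push([&] { stores.readyStores.push_back("mem"); });
        asyncCalls.push([&] { stores.readyStores.push_back("disk0"); });

        // A request arriving before any init call has run is simply a MISS.
        std::cout << (stores.anyStoreReady() ? "try cache" : "forward only") << "\n";

        // Later the event loop drains the init calls and caching switches on.
        while (!asyncCalls.empty()) { asyncCalls.front()(); asyncCalls.pop(); }
        std::cout << (stores.anyStoreReady() ? "try cache" : "forward only") << "\n";
    }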

>
>> Cache spaces are loaded as
>> soon as possible with priority to the faster types. But loaded
>> asynchronously to the startup in a plug-n-play design.
> Yes. Since loading a cache requires system resources, the faster we try
> to load, the slower we can serve the regular traffic while we load the
> cache. Some stores will need options to control how aggressive the
> asynchronous load is. SMP helps, of course, but it does not solve the
> problem completely in many cases.
>
> Also, an ideal Store should accept (or at least gracefully reject) new
> entries while its storage is being loaded. There should be no global "we
> are loading the cache, use special care" code in main Squid.

Agreed. This is good wording for (6) below.
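
Something like this rough C++ sketch is what I read into that wording
(the class and method names are made up here, not the real Store API):
each store knows whether its own rebuild has finished and simply
declines new entries until then, so main Squid needs no special case.

    #include <iostream>
    #include <string>

    class Store {
    public:
        virtual ~Store() = default;
        // Each store decides for itself whether it can take an entry right now.
        virtual bool accept(const std::string &key) = 0;
    };

    class DiskStore : public Store {
        bool loaded = false;              // still replaying its own index
    public:
        bool accept(const std::string &key) override {
            if (!loaded) {
                // Gracefully decline while the rebuild is in progress; the
                // caller sees an ordinary "not stored", nothing special.
                std::cout << "declined " << key << " (rebuild running)\n";
                return false;
            }
            std::cout << "stored " << key << "\n";
            return true;
        }
        void rebuildFinished() { loaded = true; }
    };

    int main() {
        DiskStore d;
        d.accept("http://example.com/a");   // declined, rebuild not done yet
        d.rebuildFinished();
        d.accept("http://example.com/a");   // accepted now
    }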

>
>> 1) Requests are able to be processed at all times, but storage ability
>> will vary independent of Squid operational status.
>> + minimal downtime to first request accepted and responded
>> - lost all or some caching benefits at times
>>
>> 2) cache_mem shall be enabled by default and first amongst all caches
>> + reduces the bandwidth impact from (1) if it happens before first request
>> + could also be setup async while Squid is already operating (pro from
>> (1) while minimising the con)
> Sure.
>
>> 3) possibly multiple cache_mem. A traditional non-shared cache_mem, a
>> shared memory space, and an in-transit unstructured space.
> In-transit space is not a cache so we should not mix it and cache_mem in
> an "ideal design" blueprint. Collapsed forwarding requires caching and
> has to go through cache_mem, not in-transit space.

So what is the proposal for collapsible objects which are too large for
cache_mem? Or for when "cache_mem 0" is set?

>
>> + non-shared cache_mem allows larger objects than possible with the
>> shared memory.
>> + separate in-transit area allows collapsed forwarding to occur for
>> incomplete but cacheable objects
>> note that private and otherwise non-shareable in-transit objects are a
>> separate thing not mentioned here.
>> - maybe complex to implement, and long-term plans to allow paging
>> mem_node pieces of large files should make the shared/non-shared split obsolete.
> Indeed. I do not see any compelling reasons to have shared _and_
> non-shared caches at the same time. In the ideal design, the shared
> cache will be able to store large objects, eliminating the need for the
> non-shared cache. Please keep in mind that any non-shared cache would
> violate HTTP in an SMP case.

You have yet to convince me that the behaviour *is* a violation. Yes, the
objects coming back are not identical to the pattern of a traditional
Squid. But the new pattern is still within HTTP semantics IMO, in the
same way that two proxies on anycast don't violate HTTP. The cases
presented so far have been about side effects of already bad behaviour
getting worse, or about bad testing assumptions.

>
> In-transit space does not need to be shared (but it is separate from
> caching as discussed above).
>
>
>> 4) config load/reload at some point enables a cache_dir
>> + being async means we are not delaying first response waiting for
>> potentially long, slow disk processes to complete
>> - creates a high MISS ratio during the wait for these to be available
>> - adds CPU and async event queue load on top of active traffic loads,
>> possibly slowing both traffic and cache availability
>>
>> 5) cache_dir maintains distinct (read,add,delete) states for itself
>> + this allows read-only (1,0,0) caches, read-and-retain (1,1,0) caches
>> + also allows old storage areas to be gracefully deprecated using
>> (1,0,1) with object count decrease visibly reporting the progress of
>> migration.
>>
>> 6) cache_dir structure maintains a "current" and a "max" available
>> fileno setting.
>> current always starts at 0 and ranges up to max; max is whatever
>> swap.state, a hard-coded value, or another appropriate source tells
>> Squid it should be.
>> + allows scans to start with caches set to full access, but limit the
>> area of access to a range of already scanned fileno between 0 and current.
>> + allows any number of scan algorithms beyond CLEAN/DIRTY and while
>> minimising user visible impact.
>> + allows algorithms to be switched while processing
>> + allows growing or shrinking cache spaces in real-time
> Linear fileno space prevents many optimizations. I would not require it.
> FWIW, Rock store does not use linear fileno space.

Okay, so something else then: a non-linear map or tree of tri-state
values (quad-state, whatever): used, open, unchecked.

The linear form above is simple, but it adds sequential limits to the scan.
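
Roughly this shape, as a sketch (the map type and state names are only
illustrative): the keys need not be contiguous, and any scan algorithm
can flip the per-slot state in whatever order it likes.

    #include <cstdint>
    #include <iostream>
    #include <unordered_map>

    enum class SlotState { Unchecked, Open, Used };

    // fileno -> state; no requirement that the filenos form a linear range.
    using SlotMap = std::unordered_map<std::uint64_t, SlotState>;

    int main() {
        SlotMap map;
        map[7] = SlotState::Unchecked;     // known to exist, not validated yet
        map[1042] = SlotState::Unchecked;

        // A scan (CLEAN, DIRTY, or anything else) updates states as it goes;
        // hits would only be served from slots already marked Used.
        map[7] = SlotState::Used;          // validated, entry kept
        map[1042] = SlotState::Open;       // validated, slot free for new objects

        for (const auto &s : map)
            std::cout << "slot " << s.first << " -> state "
                      << static_cast<int>(s.second) << "\n";
    }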

>
>
>> 7) cache_dir scan must account for corruption of individual files,
>> the index entries, and any metadata construct like swap.state
> Yes, ideally.
>
>
>> 8) cache_dir scan should account for externally added files. Regardless
>> of CLEAN/DIRTY algorithm being used.
>> by this I mean check for and handle (accept or erase) cache_dir
>> entries not accounted for by the swap.state or equivalent meta data.
>> + allows reporting what action was taken about the extra files. Be it
>> erase or import and any related errors.
> I think this should be left to individual Stores. Each may have their
> own way of adding entries. For example, with Rock Store, you can add
> entries even at runtime, but you need to update the shared maps
> appropriately.

How does Rock recover from a third-party insertion of a record at the
correct place in the backing DB, followed by a shutdown?
Erase the slot? Overwrite it with something else later? Load the object
details during restart and use them?

For now it is perfectly possible to inject entries into UFS and COSS (at
least), provided one knows the storage structure and is willing to cope
with a DIRTY restart.
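
The reconcile step I have in mind for (8) looks roughly like this sketch
(the file names and the import-vs-erase policy flag are hypothetical, and
this is not how UFS/COSS/Rock actually lay out their storage):

    #include <iostream>
    #include <set>
    #include <string>
    #include <vector>

    int main() {
        // What the swap.state-equivalent claims is on disk ...
        std::set<std::string> indexed = {"00/0000", "00/0001"};
        // ... versus what a walk of the cache_dir actually finds.
        std::vector<std::string> onDisk = {"00/0000", "00/0001", "00/0002"};

        const bool importUnknown = true;   // policy: accept or erase extras

        for (const auto &f : onDisk) {
            if (indexed.count(f))
                continue;                  // accounted for, nothing to do
            if (importUnknown)
                std::cout << "imported extra object " << f << "\n";
            else
                std::cout << "erased extra object " << f << "\n";
            // Either way the action taken gets reported, per the '+' in (8).
        }
    }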

>> Anything else?
> 0) Each Store (including memory cache) should have its own map. No
> single global store_table and no assumptions on how a store maintains
> its map. Main Squid code just needs to be able to add/search/remove and
> summarize entries.

Oops. Yes. Thanks. I kind of assumed it for (1).
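
For the record, the interface I picture main Squid needing from each
store's map is about this small (a sketch with invented names, not a
proposal for the actual classes):

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <unordered_map>

    struct StoreSummary { std::size_t entries; };

    class StoreMap {
        std::unordered_map<std::string, std::string> entries_; // private to one store
    public:
        void add(const std::string &key, const std::string &value) { entries_[key] = value; }
        const std::string *search(const std::string &key) const {
            const auto it = entries_.find(key);
            return it == entries_.end() ? nullptr : &it->second;
        }
        void remove(const std::string &key) { entries_.erase(key); }
        StoreSummary summarize() const { return StoreSummary{entries_.size()}; }
    };

    int main() {
        StoreMap memMap, diskMap;          // one map per store, no global store_table
        memMap.add("http://example.com/", "entry");
        std::cout << (memMap.search("http://example.com/") ? "hit" : "miss") << "\n";
        std::cout << (diskMap.search("http://example.com/") ? "hit" : "miss") << "\n";
        std::cout << "mem entries: " << memMap.summarize().entries << "\n";
    }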

>
>
> Thank you,
>
> Alex.