Re: Large Rock Store

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 17 Oct 2012 09:30:45 -0600

On 10/17/2012 08:57 AM, Kinkie wrote:
> On Wed, Oct 17, 2012 at 4:34 PM, Alex Rousskov wrote:

>> Please ignore the current StoreHashIndex class name. It will not survive
>> this polishing.
>>
>> This class is needed to represent all disk caches taken together and
>> coordinate activities among individual disk caches (e.g., cache_dir
>> selection for a given miss object).
>>
>> Why separate the memory store from disk stores? While there is a lot in
>> common between disk and memory caches, there are a few significant
>> differences. For example:
>> * There may be many disk caches (that need coordination) but there is at
>> most one memory cache. That is why disk caches need a "disk cache
>> coordination" class and memory cache does not.
>>
>> * Memory cache access methods are synchronous. Most disk cache access
>> methods are not (or should not be).
>
> This is an interesting argument in my opinion, which I have spent some
> time thinking on in the past.
> In my opinion, the "memory cache" name is misleading.

I do not see why it is misleading. Your arguments below are not about
memory cache (you call it RAM-backed) but about adding more levels or
layers of caches. There is nothing wrong with that idea as such, but I
do not think it invalidates the MemStore class API we have now or the
current StoreController role as a coordinator of memory and disk caches.

> IMVHO eventually we could have a tiered cache system, with consistent
> API modeled after the current disk caches, with a "small and very fast
> cache" (e.g. ram-backed), a "bigger and quite fast cache" (e.g.
> rockstore+ssd - backed) cache, a "big and quite slow cache" (e.g.
> aufs), and a "very big and very slow" cache (e.g. somehow distributed,
> for instance: memcached, aufs over nfs, hadoop/cassandra/gfs...), and
> policies to promote/demote objects across the various tiers.

Agreed. We could, eventually, and our Storage classes will be ready for
that change. If that uncertain future comes, somebody will just rewrite
the relevant Storage::Controller methods to handle more caching levels
or layers.
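
As a rough illustration of what "handle more caching levels" might look
like, here is a minimal sketch. Tier and TieredController are hypothetical
names invented for this example, not existing Squid classes, and the sketch
ignores asynchronous disk access, object sizes, and promotion/demotion
policies. It only shows a controller that consults an ordered list of
tiers, fastest first:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for one cache level (memory, rock, aufs, ...).
class Tier {
public:
    explicit Tier(std::string name) : name_(std::move(name)) {}
    const std::string &name() const { return name_; }
    bool has(const std::string &key) const { return keys_.count(key) != 0; }
    void put(const std::string &key) { keys_.insert(key); }

private:
    std::string name_;
    std::unordered_set<std::string> keys_; // keys only; no object bodies
};

// Hypothetical controller that knows about N tiers instead of exactly two.
class TieredController {
public:
    // Tiers are added in order, fastest first.
    void addTier(std::string name) {
        assert(!name.empty());
        tiers_.emplace_back(std::move(name));
    }

    Tier &tier(std::size_t i) { return tiers_.at(i); }

    // Returns the name of the fastest tier holding the key,
    // or an empty string on a miss.
    std::string find(const std::string &key) const {
        for (const Tier &t : tiers_)
            if (t.has(key))
                return t.name();
        return std::string();
    }

private:
    std::vector<Tier> tiers_;
};
```

With a "memory" tier added before a "disk" tier, a key present only on
disk is reported as a disk hit, and the same key is reported as a memory
hit once it is also placed in the memory tier.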

> What is now the memory cache, in that context, would become more of a
> specialized, synchronous, transient area used to shuttle data (partial
> objects, collapsed forwarding, etc) to/from the http pipe, and
> possibly to support the object promotion/demotion activities. But then
> it wouldn't be really a cache anymore, would it?

No, it would not, but I suspect there still will be a memory cache and,
today, we do need a memory cache. Tomorrow, we might add a "specialized,
synchronous, transient area" as well. Nothing wrong with that, of
course, but I do not see how it would affect our class hierarchy today
-- we already have at least two levels/layers that we must handle so
some of the appropriate abstractions will be there if more layers are
needed tomorrow.

>> * An object may be placed in at most one disk cache, in at most one
>> memory cache, but it can be cached both on disk and in memory. The first
>> algorithm is (or will be) implemented by the StoreHashIndex replacement,
>> the second by MemStore itself, and the third by StoreController.
>
> Why?
> If we accept we can promote/demote objects in tiered caches, it means
> that an object could be (with different lifetimes if determined by
> capacity) in different caches.

I assume your "why" is for the "an object cannot be in two disk caches
at the same time" implication. This is just how the disk cache works today.

There is nothing wrong with the idea of supporting duplicates on
different disk-like storage, but (a) we do not support that today and
(b) the polishing I am proposing will make such support easier in the
future -- one would just need to change how the StoreHashIndex
replacement class manages disk caches (possibly merging that management
code with the corresponding Storage::Coordinator code if all stores
become the same when it comes to managing duplicates).
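
To make the division of labor concrete, here is a minimal sketch of the
three placement rules discussed above. All names (DiskCache,
DiskCoordinator, MemoryCache, Controller) are hypothetical and chosen only
for illustration; this is not the actual Squid code. The "most free space"
selection rule is also an assumption, and real disk access would not be
synchronous like this:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical model of one cache_dir.
struct DiskCache {
    std::string path;
    std::size_t freeBytes;
    std::unordered_map<std::string, std::size_t> objects; // key -> size
};

// Hypothetical stand-in for the StoreHashIndex replacement:
// all disk caches taken together.
class DiskCoordinator {
public:
    void add(DiskCache dir) { dirs_.push_back(std::move(dir)); }

    // Index of the disk cache holding the key, or -1 if none does.
    int find(const std::string &key) const {
        for (std::size_t i = 0; i < dirs_.size(); ++i)
            if (dirs_[i].objects.count(key) != 0)
                return static_cast<int>(i);
        return -1;
    }

    // Rule 1: an object lives in at most one disk cache.
    // Assumed selection policy: the cache_dir with the most free space.
    int store(const std::string &key, std::size_t size) {
        const int held = find(key);
        if (held >= 0)
            return held; // already on disk; do not duplicate
        int best = -1;
        for (std::size_t i = 0; i < dirs_.size(); ++i) {
            if (dirs_[i].freeBytes < size)
                continue;
            if (best < 0 || dirs_[i].freeBytes > dirs_[best].freeBytes)
                best = static_cast<int>(i);
        }
        if (best < 0)
            return -1; // no cache_dir can take the object
        assert(dirs_[best].freeBytes >= size);
        dirs_[best].objects[key] = size;
        dirs_[best].freeBytes -= size;
        return best;
    }

    // Number of disk caches holding the key (should never exceed 1).
    std::size_t copies(const std::string &key) const {
        std::size_t n = 0;
        for (const DiskCache &d : dirs_)
            n += d.objects.count(key);
        return n;
    }

private:
    std::vector<DiskCache> dirs_;
};

// Rule 2: the single memory cache keeps at most one copy per key.
class MemoryCache {
public:
    void store(const std::string &key) { keys_.insert(key); }
    bool has(const std::string &key) const { return keys_.count(key) != 0; }

private:
    std::unordered_set<std::string> keys_;
};

// Rule 3: the controller may cache an object both in memory and on disk.
struct Controller {
    MemoryCache memory;
    DiskCoordinator disks;

    void store(const std::string &key, std::size_t size) {
        memory.store(key);
        disks.store(key, size);
    }
};
```

Supporting duplicates across disk-like stores would then mostly be a
matter of changing DiskCoordinator::store() in this sketch, which is the
point of isolating that decision in one class.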

Are you suggesting changes to the Storage class hierarchy that would
contradict the proposed polishing? If yes, what are those changes?

Thank you,

Alex.
Received on Wed Oct 17 2012 - 15:30:55 MDT
