Re: [RFC] cache architecture

From: Pieter De Wit <pieter_at_insync.za.net>
Date: Tue, 24 Jan 2012 23:02:22 +1300

<snip>
>> Perhaps a 9) Implement dual IO queues - I *think* the IO has been
>> moved into its own thread; if not, the queuing can still be applied.
>> Any form of checking the cache is going to affect squid, so how do we
>> ensure we are idle? Dual queues :) Queue 1 holds the requests for
>> squid, queue 2 holds the admin/clean up requests. The IO "thread" (if
>> not threaded), before handling an admin/clean up request, checks Queue
>> 1 for requests and empties it *totally* before heading into Queue 2.
>> This will allow you to have the same caching as now, relieving the
>> start-up problems? Might lead to the same double caching of objects as
>> above (if you make the cache writable before the scan is done)
>
> I wonder about priority queues every now and then. It is an
> interesting idea. The I/O is currently done with pluggable modules for
> various forms. DiskThreads and AIO sort of do this but are FIFO queued
> in N parallel queues. Prioritised queues could be an interesting
> additional DiskIO module.
Would that be hard to implement, given that the "leg work" is already
done? How well does the current version of squid handle multiple cores,
and can this take advantage of them?
>
> What I'm looking for is a little bit more abstracted towards the
> architecture level across cache type and implementation. At that scale
> we can't use any form of "totally empty" queue condition because on
> caches that receive much traffic the queue would be quite full, maybe
> never actually empty. Several of the problems we have now come from
> waiting for the cache load to complete (i.e. the load action queue to
> be empty) before a cache is even considered for use.
>
> Amos
At that scale, no matter what you do, you will impact performance or your
"wanted" outcome. It's about reaching an acceptable balance, which I
think you, as a dev, will have a hard time predicting for every real-life
usage out there. Perhaps "we" (in quotes, since I have yet to contribute a
single line of code :) ) can make it "Weighted Priority" and, as such,
have squid.conf options to tune it. The admin has to decide how aggressive
squid must be at rebuilding the cache (makes me think of the RAID rebuild
options in HP RAID controllers). I am thinking of:

cache_rebuild_weight <0-"max int"> ?

For every x client requests, action one "admin/clean up" request; if
"Queue 1" is empty, drain "Queue 2" instead.
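
A minimal sketch of that weighted scheme, in Python for brevity rather
than Squid's C++. The class and attribute names are hypothetical and not
part of Squid; `weight` stands in for the proposed cache_rebuild_weight
value, and the sketch assumes it is at least 1 (what a weight of 0 should
mean is left open).

```python
from collections import deque


class WeightedDualQueue:
    """Hypothetical scheduler: one admin job per `weight` client requests."""

    def __init__(self, weight):
        if weight < 1:
            raise ValueError("weight must be >= 1")
        self.weight = weight
        self.requests = deque()  # "Queue 1": client I/O
        self.admin = deque()     # "Queue 2": rebuild / clean-up I/O
        self._served = 0         # client requests since the last admin job

    def next_job(self):
        """Return the next job to run, or None if both queues are empty."""
        # Queue 1 is empty: drain Queue 2.
        if not self.requests:
            self._served = 0
            return self.admin.popleft() if self.admin else None
        # After `weight` client requests, let one admin job through.
        if self._served >= self.weight and self.admin:
            self._served = 0
            return self.admin.popleft()
        self._served += 1
        return self.requests.popleft()
```

With weight 2, four queued client requests and two admin jobs, this
serves two client requests, one admin job, the remaining two client
requests, and then the last admin job once Queue 1 is empty. Admin work
still makes progress when Queue 1 never empties, at the cost of one
admin job per `weight` client requests.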

I am also thinking of a "third" queue, something like:

Queue 1 - Write requests (depends on cache state, but has the most
impact - writes are slow)
Queue 2 - Read requests (as above, but less of an impact)
Queue 3 - Admin/Clean up

The only problem I have so far is that Queue 1 sits above Queue 2... they
might need to be swapped, since you are reading more than writing?
Perhaps another config option...
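
As a rough illustration of making the order configurable, the sketch
below serves named queues strictly in whatever order it is given. It is
Python rather than Squid's C++, and the class, method, and queue names
are hypothetical.

```python
from collections import deque


class OrderedQueues:
    """Hypothetical: serve named queues in a configurable priority order."""

    def __init__(self, order=("write", "read", "admin")):
        self.order = tuple(order)
        self.queues = {name: deque() for name in self.order}

    def push(self, name, job):
        """Append a job to the named queue."""
        self.queues[name].append(job)

    def pop(self):
        """Return (queue name, job) from the first non-empty queue, else None."""
        for name in self.order:
            if self.queues[name]:
                return name, self.queues[name].popleft()
        return None
```

Swapping reads above writes is then just
`OrderedQueues(order=("read", "write", "admin"))`. Note that a strict
order like this can starve the lower queues on a busy cache, which is the
"never actually empty" concern raised above, so it would probably need
to be combined with some form of weighting.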

cache_dir /var/dir1 128G 128 128 Q1=read Q2=write (cache_dir syntax
wrong...)
cache_dir /var/dir2 32G 128 128 Q1=write Q2=read (as above, but this
might be on SSD)
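
To show how such per-directory options might be read, here is a small
Python sketch that pulls the queue order out of a line. The Qn=<name>
options are only the illustrative syntax from this mail, not real
cache_dir options, and the function name is hypothetical.

```python
def parse_queue_order(line):
    """Extract the queue order from hypothetical Qn=<name> options."""
    slots = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        # Accept only tokens of the form Q<number>=<name>.
        if sep and key[:1] == "Q" and key[1:].isdigit():
            slots[int(key[1:])] = value
    # The lowest slot number is served first.
    return [slots[n] for n in sorted(slots)]


order = parse_queue_order("cache_dir /var/dir1 128G 128 128 Q1=read Q2=write")
```

A line without any Qn= options yields an empty list, which a caller
could treat as "use the default order".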

I think this might be going too far?

Cheers,

Pieter
Received on Tue Jan 24 2012 - 10:02:30 MST
