Re: [squid-users] Ramdisks

From: Joe Cooper <joe@dont-contact.us>
Date: Fri, 23 Nov 2001 03:30:52 -0600

Henrik Nordstrom wrote:

> On Thursday 22 November 2001 01.04, Joe Cooper wrote:
>
>
>>Let me see if I get what you're pointing at by the term 'log structured':
>>
>>The log stores the meta-data and object location in a fixed position on
>>disk (pre-allocated large enough for the full cache_dir).
>>
>
> Not quite.
>
> log structured != fixed position on the drive.
>
> log structured == the storage space is viewed as a virtually infinite-sized
> sequential log. Any changes to the filesystem are stored at the tail of the
> log. Active portions of this virtual log are then mapped to your actual
> storage media.

Ok. So you're talking along the same lines as COSS. I was picturing a
more traditional FS with a journal-type log. I get it and agree
thoroughly (and I've read the long thread about it from about 1.5 years
back on your website).
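
To make sure I've really got it, here's roughly how I now picture the
core of it. This is a pure sketch in C with hypothetical names, not
actual Squid or COSS code:

    /* Sketch of a log-structured store: the store is one virtually
     * infinite append-only log; every change is an append at the
     * tail, and the active tail region is mapped onto the physical
     * media by wrapping around its size. */
    #include <stdint.h>
    #include <string.h>

    #define DISK_SIZE (8ULL * 1024 * 1024 * 1024) /* physical media size */

    struct log_store {
        uint64_t tail;        /* virtual tail offset, grows forever */
        unsigned char *media; /* the raw device, DISK_SIZE bytes */
    };

    /* Append at the tail; the virtual offset becomes the object's
     * address, and is mapped to a physical position on the media. */
    uint64_t log_append(struct log_store *s, const void *data, size_t len)
    {
        uint64_t virt = s->tail;
        uint64_t phys = virt % DISK_SIZE; /* virtual log -> media */
        /* (a real store must handle wrap-around and reclaim space) */
        memcpy(s->media + phys, data, len);
        s->tail += len;
        return virt;
    }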

 
>>Object blocks are pre-striped onto the raw disk (possibly allocated and
>>used in bigger-than-FS-block-size stripes), or simply 'imagined' on the
>>disk by the Squid log system if we are still sitting atop a traditional
>>UFS.
>>
>
> Objects are packed into the log, which is striped out on the disk in suitable
> chunks when needed to make room for more objects in the pre-write buffer.

This is where I got confused. So the tail of the log lives in memory,
but the log is 'infinite' as far as the interface is concerned... buffer
pressure flushes it to disk in stripes as needed, and popularity can put
an object back at the head of the log.
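
In other words, something like this (again just a sketch with invented
names):

    /* Sketch of the in-memory log tail: objects are packed into a
     * stripe-sized pre-write buffer, which is flushed to disk as one
     * big sequential write only when it fills up. */
    #include <string.h>
    #include <unistd.h>

    #define STRIPE_SIZE (1 << 20) /* flush in 1 MB stripes (made up) */

    struct prewrite_buf {
        int fd;         /* the cache disk */
        off_t disk_off; /* where the next stripe lands */
        size_t used;    /* bytes packed so far */
        char buf[STRIPE_SIZE];
    };

    /* Assumes len <= STRIPE_SIZE for brevity. */
    void buf_add_object(struct prewrite_buf *b, const void *obj, size_t len)
    {
        if (b->used + len > STRIPE_SIZE) {
            /* buffer pressure: one sequential write for the stripe */
            pwrite(b->fd, b->buf, b->used, b->disk_off);
            b->disk_off += b->used;
            b->used = 0;
        }
        memcpy(b->buf + b->used, obj, len);
        b->used += len;
        /* a popular object about to fall off the far end of the log
         * can simply be re-added here, putting it back at the head */
    }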

> The main ideas for log structured filesystems were presented around 1987-1990.
> First as an approach to storing a writeable filesystem on write-once
> media, then as a viable approach to optimizing a filesystem for writing.
>
> I did not know about "log structured filesystems" when I first presented the
> ideas here on squid-dev. It was someone here who pointed me to this
> quite interesting but lesser-known area of computing research.
>
> A quite good introduction to the subject is "Beating the IO Bottleneck:
> A Case for Log-Structured File Systems" by Ousterhout & Douglis. It can
> be found at http://collective.cpoint.net/lfs/papers/lfs-case.ps.gz
>
> A quite extensive collection of papers on the subject can be found at the
> same Linux Log-Structured Filesystem project page
> <http://collective.cpoint.net/lfs/>, even if the project as such died long
> before becoming anything remotely useful... There are a couple of partly
> relevant papers not on that list, but all the major ones are there.

Will read. Having only read journalling filesystem research (and
assuming log ~= journal), I wasn't aware this type of FS was so well
researched. (I know about the similar object store in INND, but haven't
studied it in detail either.)

>>Anyway, pretty cool. To be honest, though, at this point I think the
>>CPU usage problem is one of the biggest that Squid faces. When we can
>>get 80-100 reqs/sec from a single IDE disk (dependent on workload), but
>>only 130 from two, and only 150 from three, four, or five (even though
>>each has a dedicated bus), the law of diminishing returns is getting
>>pretty ugly and it really hurts scalability. Of course, CPU could be
>>saved somewhat by reducing disk overhead.
>>
>
> Yes, I know, and that is why I am also tinkering with the eventio model,
> and trying to find out how one can make a caching proxy SMP-scalable.
>
> eventio allows for experimenting with efficient networking designs.
>
> The filesystem design, including metadata, makes it easier for SMP systems
> with multiple writers to manage a shared cache.

Any reason why there can't be a super-thin delegation thread that
passes requests on to 'real' Squid processes based on a hash, so that
cache data sharing isn't required? That's what we're doing at the
network layer for dual-processor Squid boxes, and it works very well.
We've just got an iptables rule that splits on one bit of the last
octet (0.0.0.0/0.0.0.128 and 0.0.0.128/0.0.0.128). It's not quite an
even distribution, but it removes the requirement of sharing data.
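
Something like this, roughly (I'm quoting the rules from memory, and
the ports and chain here are just examples):

    iptables -t nat -A PREROUTING -p tcp --dport 80 \
             -s 0.0.0.0/0.0.0.128 -j REDIRECT --to-ports 3128
    iptables -t nat -A PREROUTING -p tcp --dport 80 \
             -s 0.0.0.128/0.0.0.128 -j REDIRECT --to-ports 3129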

>>This is a simple way to achieve it, and probably just as effective
>>as more complicated flushing decisions. This, of course, requires a
>>reasonably sized cache_mem and a policy to decide what to write
>>and when. I will look into adding a 'picky writes' option to Squid
>>to give it some testing.
>>
>
> With a good FS designed for a cache workload there is little reason for
> 'picky writes' unless you want to save disk space. Modern drives have
> plenty of bandwidth, and space is also plentiful. It is mostly I/O
> operations/s that the drives are short of. And the ratio of I/O bandwidth
> and space to operations/s is steadily increasing, making operations/s a
> bigger and bigger relative bottleneck with each drive generation... thus
> it is getting increasingly important to find solutions that try to
> utilize what the drives are good at, to maintain good cost-effectiveness.

True...but we don't have a good filesystem design working yet. ;-)
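
To put rough numbers on Henrik's point (back-of-the-envelope, assuming
a typical IDE drive of this era): a drive that can stream maybe 25 MB/s
sequentially but only do ~100 random operations/s delivers about
100 x 8 KB = 0.8 MB/s on a random small-object workload. That's roughly
3% of its bandwidth, and each drive generation grows the 25 much faster
than the 100.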

I think picky writes is a one-hour hack for me (50 minutes to find the
spot to make the change, and 10 minutes to change it and add a
squid.conf option), while the FS stuff is beyond my abilities at this
point. I'm very much looking forward to using COSS, at least in
experimental environments in the short term, but it still has at least
one crasher (and possibly a second, but I won't bug Adrian about it
until he's tracked down the first one I spotted). So even for folks who
have the knowledge and have wrapped their heads around the code, a good
object store for Squid is still a ways off.
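
Roughly what I have in mind for the hack (a pure sketch with
hypothetical option names, not real Squid internals):

    /* Sketch of a 'picky writes' policy: only objects that have
     * proven themselves (a repeat hit while still in cache_mem) get
     * swapped out to disk; the rest live and die in RAM. */
    #include <stdbool.h>
    #include <stddef.h>

    struct StoreEntry {            /* stand-in for Squid's StoreEntry */
        int refcount;              /* hits while the object sat in memory */
        size_t size;
    };

    static int picky_writes = 1;   /* imagined squid.conf toggle */
    static int picky_min_refs = 2; /* write only after a repeat hit */

    static bool should_swap_out(const struct StoreEntry *e)
    {
        if (!picky_writes)
            return true;           /* old behaviour: write everything */
        return e->refcount >= picky_min_refs;
    }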

I'll drop the idea of picky writes if and when COSS can give us
practically free writes. ;-) As it is now, write bandwidth is a
primary cause of the upper bound on Squid's performance.

-- 
Joe Cooper <joe@swelltech.com>
http://www.swelltech.com
Web Caching Appliances and Support