Re: memory-mapped files in Squid from Kevin Littlejohn on 1999-01-29 (squid-dev)

From: Kevin Littlejohn <darius@dont-contact.us>
Date: Sat, 30 Jan 1999 01:39:38 +1100

>>> Henrik Nordstrom wrote
> Oskar Pearson wrote:
>
> > If I understand this correctly, the major advantage of a fifo buffer is
> > that there is almost no fragmentation... right?
>
> That is one of the advantages, but there are more.
>
> * Almost no fragmentation
> * Writes can be optimized to use as few I/O ops as possible, even
> crossing object boundaries.
> * I/O can be fully elevator optimized to minimize seek times. No random
> seeking.

I'm not sure how you achieve this, given that the request pattern is random.
Unless you're planning on collating requests and serving several at once? I
had thought any sane fs would keep that in mind when seeking somewhere,
but implementing it in such a way as not to starve any given request
is the interesting part. No random seeking is a bold claim, any which way.
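
To make the "collating requests" idea concrete, here is a minimal sketch of serving one collated batch of reads in a single sweep. The function names and the block numbers are made up for illustration, and it is not taken from Squid or sfs. Because the batch is fixed before the sweep starts, every request in it is served during that sweep, and later arrivals wait for the next batch.

```python
def elevator_order(head, pending):
    """Order one collated batch of block numbers as a single up-then-down sweep."""
    up = sorted(b for b in pending if b >= head)
    down = sorted((b for b in pending if b < head), reverse=True)
    return up + down


def seek_distance(head, order):
    """Total head travel, in blocks, when serving `order` starting at `head`."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


batch = [10, 95, 52, 40, 70]   # hypothetical requests, in arrival order
swept = elevator_order(50, batch)
print(swept)                                               # [52, 70, 95, 40, 10]
print(seek_distance(50, batch), seek_distance(50, swept))  # 210 130
```

The seeks are still driven by a random request pattern; the sweep only shortens the travel within a batch, at the cost of holding requests back until the batch is formed.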

(Incidentally, sfs has been built with 'elevator' writes in mind - I thought
about it, and decided that you're always going to do fewer writes than
reads, so simply picking one block out of any blocks between you and the
block to read, and committing that on the way, is valid. Could conceivably
lead to maxing out the fs cache, but that's addressable fairly easily.
Read seeks are not optimised at all - the idea is simply to service each
request as immediately as possible. Chains of requests can be dropped
onto the service queue, so if you want a whole bunch of blocks from a
given file, you can sorta encourage the drive service thread to read them all
at once, or at least one after the other.)
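
A rough sketch of the write-on-the-way idea described above. This is not the sfs code: `service_read`, the set of dirty blocks, and the rule for choosing among candidates (nearest to the head) are all assumptions made for the example.

```python
def service_read(head, target, dirty):
    """Serve a read at `target`, committing at most one dirty block en route.

    `dirty` is a set of block numbers waiting to be written; it is updated
    in place. Returns the (operation, block) steps in the order performed.
    """
    lo, hi = sorted((head, target))
    on_the_way = [b for b in dirty if lo < b < hi]
    steps = []
    if on_the_way:
        # Assumption: take the candidate closest to the current head position.
        chosen = min(on_the_way, key=lambda b: abs(b - head))
        dirty.discard(chosen)
        steps.append(("write", chosen))
    steps.append(("read", target))
    return steps


dirty = {5, 40, 90}
print(service_read(10, 50, dirty))   # [('write', 40), ('read', 50)]
print(sorted(dirty))                 # [5, 90]
```

Blocks 5 and 90 stay dirty here because no read has passed over them, which is the cache build-up risk mentioned above.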

> * No or minimal filesystem overhead. An extremely simple layout of the
> store can be used.
> * No, or very small block sizes, giving a higher disk size utilization
> per stored object.

But if you're serving random objects, you still have to hold an index of
location somewhere. That index grows as the block size shrinks.
Not a killer, but worth bearing in mind.
(Where, on a cyclic fs, do you store static information, btw? ;)
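
A back-of-the-envelope sketch of that growth, assuming each index entry stores only a tightly packed starting block number (a real index holds more per object). The 16 GiB store and the one million objects are illustrative figures, not numbers from this thread.

```python
def address_bits(disk_bytes, block_bytes):
    """Bits needed to number every block on the store."""
    blocks = disk_bytes // block_bytes
    return max(1, (blocks - 1).bit_length())


def location_index_bytes(objects, disk_bytes, block_bytes):
    """Bytes spent on location fields alone, one packed field per object."""
    bits = objects * address_bits(disk_bytes, block_bytes)
    return (bits + 7) // 8


DISK = 16 * 2**30      # hypothetical 16 GiB store
OBJECTS = 1_000_000    # hypothetical object count

for block in (8192, 512):
    print(block, address_bits(DISK, block),
          location_index_bytes(OBJECTS, DISK, block))
# 8192 21 2625000
# 512 25 3125000
```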

> * Quite obvious how to make checkpoints, so we can have instant restarts
> without risking cache corruption even if Squid crashes and leaves the
> cache in a "dirty" state.

This is true (and the last thing you want to be doing is a 'fsck' of some
sort), but again you need to address the indexes of files. Unless you're
pulling the same trick sfs does, and making the location on disk the 'name'
of the file as well, and storing that in squid's own index. In which case,
pick a _big_ data type, given you want the small block sizes ;) Either way,
you need to pay close attention to what happens if the machine goes away
at any point while committing a new file and its relevant metadata to disk.
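
One way to reason about those crash points is to enumerate them. The sketch below is a toy model, not Squid or sfs code: the step tuples and helper names are hypothetical, and it assumes writes reach the disk in the order they are issued, which a real drive or OS cache does not guarantee without explicit flushing.

```python
def crash_states(steps):
    """Yield the set of steps that reached disk, for every possible crash point."""
    for k in range(len(steps) + 1):
        yield set(steps[:k])


def consistent(on_disk, data_steps, index_step):
    """An index record is only valid if all the data it points at is on disk."""
    return index_step not in on_disk or data_steps <= on_disk


def safe(steps, data_steps, index_step):
    """True if no crash point leaves an index record pointing at missing data."""
    return all(consistent(s, data_steps, index_step) for s in crash_states(steps))


data = {("data", 100), ("data", 101)}
index = ("index", 100)   # the object's name is its first block number

data_first = [("data", 100), ("data", 101), index]
index_first = [index, ("data", 100), ("data", 101)]

print(safe(data_first, data, index))    # True
print(safe(index_first, data, index))   # False
```

Writing the data before the index record means a crash can lose the new object but cannot leave the index pointing at blocks that were never written.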

>
> > I personally think that a fairly classic filesystem (something like the sfs
> > code) is probably the way to go.
>
> Let's see where a FIFO approach can end up. You have to admit that it is
> an interesting area to investigate.

Personally, I'd like to see both. I'm still very sceptical of a circular
fs design - I just don't think it matches the usage pattern or the expiry
pattern (and yes, I _like_ being able to tweak my refresh patterns, ta
very much ;), but if it works, then it's a worthy thing to have.

KevinL
(fs racing, anyone? ;)

--------------- qnevhf@obsu.arg.nh ---------------
Kevin Littlejohn,
Technical Architect, Connect.com.au
Don't anthropomorphise computers - they hate that.
Received on Tue Jul 29 2003 - 13:15:56 MDT
