Re: Squid performance wish-list

From: Michael O'Reilly <michael@dont-contact.us>
Date: 25 Aug 1998 12:16:04 +0800

Stewart Forster <slf@connect.com.au> writes:
> Hi Stephen,
>
> Sorry for getting a little frustrated.
>
> > Yes. Only that in reality, the numbers will not be so bad, as long
> > as fragmentation is not at it's maximum. I.e. if the 16MB is in one
> > chunk, then it will take only one disk access to delete it.
>
> True, but as we all know, fragmentation will set in pretty quickly
> once the disk is full and you'll never find 16MB chunks again once you
> delete a few files and allocate a few in place of the 16MB file.

I guess what Stephen is talking about here is an extent-based
filesystem.

There are a few points to keep in mind. We're not implementing a
general-purpose filesystem here...

#1. We delete as often as we create.
#2. We generally know how big things are going to end up before we
        create them.
#3. We normally sit fairly close to full. (i.e. 95% used).
#4. Objects are normally fairly small. (i.e. 5 orders of magnitude
        less than the size of the disk or better).
#5. Most objects are 'soft', in that we're allowed to delete them
        almost any time we like if we really need to.

#6. We never do anything other than sequential reading or writing.
#7. We never need to partial truncate. We only ever delete entirely.
#8. We only ever append to a file (never seek into the middle and
        write).

These points have some interesting consequences.

#2 means that we can be very very successful at avoiding
        fragmentation.
#4 means that we should have a very high success rate at finding
        places to put objects such that the entire object is in a
        single extent.
#5 says that we can 'fix' fragmentation if it gets disastrous.
#6 says that extents are really groovy, because it makes sense to say
"1234 blocks starting at 34" rather than "1234, 1235, 1236, ... "
#7 further justifies extents.
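
To make #2 and #4 a bit more concrete, here is a minimal sketch of
best-fit placement when the object size is known up front. The
FreeExtent table and alloc_extent() are hypothetical names invented for
illustration, not anything that exists in Squid, and a real allocator
would want a better structure than a linear scan.

```c
#include <stddef.h>

/* Hypothetical free-space record: 'len' free blocks starting at 'start'. */
typedef struct {
    unsigned long start;
    unsigned long len;
} FreeExtent;

/*
 * Best fit: pick the smallest free extent that still holds 'want'
 * blocks and carve the object off its front, so large free extents
 * are left alone for large objects.
 * Returns the first block of the object, or -1 if no single free
 * extent is big enough (the caller could then delete 'soft' objects,
 * as in #5, and retry).
 */
long alloc_extent(FreeExtent *fr, size_t nfree, unsigned long want)
{
    size_t best = nfree;            /* nfree means "nothing found yet" */

    if (want == 0)
        return -1;
    for (size_t i = 0; i < nfree; i++) {
        if (fr[i].len >= want &&
            (best == nfree || fr[i].len < fr[best].len))
            best = i;
    }
    if (best == nfree)
        return -1;

    long at = (long)fr[best].start;
    fr[best].start += want;         /* shrink the free extent from the front */
    fr[best].len -= want;
    return at;
}
```

Because the whole size is known before the first write, the object
lands in one extent whenever a large enough free extent exists.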

Note that there's a LOT of literature on fragmentation, and a decent
amount on extent-based filesystems.

Note that there's no need to embed the extent metadata in the
extent itself. It would be more normal to have an inode like:
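        typedef struct {
                unsigned long start;    /* first block of the extent */
                unsigned long len;      /* length in blocks */
        } Extent;

        struct inode {
                ....
                Extent e[10];           /* direct extents */
                Extent ie;              /* indirect: an extent of Extents */
                Extent die;             /* doubly indirect */
                Extent tie;             /* triply indirect */
        };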

where 'ie' is an extent of Extent structures, 'die' is a doubly indirect
block of Extent structures, etc.
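
As a rough illustration of how the direct extents would be used, here
is a sketch that maps a logical block number within an object to a disk
block by walking e[]. map_block() and the "len == 0 marks an unused
slot" convention are assumptions made for the example; the indirect
extents are not handled.

```c
#include <stddef.h>

#define NDIRECT 10      /* matches e[10] in the inode above */

typedef struct {
    unsigned long start;
    unsigned long len;
} Extent;

/*
 * Translate logical block 'lblk' of an object into a disk block using
 * only the direct extents. Returns -1 when the block lies past the
 * direct extents (a full version would then consult 'ie').
 * Assumption: a slot with len == 0 is unused and ends the list.
 */
long map_block(const Extent *e, size_t n, unsigned long lblk)
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].len == 0)
            break;                          /* no more extents in use */
        if (lblk < e[i].len)
            return (long)(e[i].start + lblk);
        lblk -= e[i].len;                   /* skip past this extent */
    }
    return -1;
}
```

For an object stored as "1234 blocks starting at 34", logical block 0
maps to disk block 34 and logical block 1233 to disk block 1267, all
from a single table entry.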

Note that in 99.9999% of cases it wouldn't even need an 'ie', let alone
a 'die' or a 'tie' (because you know how big the object will be, so
you simply put it into free space that is large enough to contain it,
so zero fragmentation. Only a very small fraction of the time will
you ever need to fragment [by virtue of #3, #4 and even #5 at a
pinch]).
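
For the rare case where no single free extent is large enough, one
possible fallback is to spread the object over several free extents.
The sketch below assumes the single-extent attempt has already failed;
place_fragmented() and the free table layout are hypothetical, and
deleting soft objects (#5) instead of fragmenting is an equally valid
choice.

```c
#include <stddef.h>

typedef struct {
    unsigned long start;
    unsigned long len;
} Extent;

/*
 * Spread an object of 'want' blocks over several free extents, taking
 * them in table order. The pieces are written to out[]; the return
 * value is the number of pieces, or 0 if there is not enough free
 * space or more than 'maxout' pieces would be needed. The free table
 * is only modified once success is certain.
 */
size_t place_fragmented(Extent *fr, size_t nfree, unsigned long want,
                        Extent *out, size_t maxout)
{
    unsigned long avail = 0;
    size_t pieces = 0;

    if (want == 0)
        return 0;

    /* Dry run: count how many free extents we would have to use. */
    for (size_t i = 0; i < nfree && avail < want; i++) {
        if (fr[i].len == 0)
            continue;
        avail += fr[i].len;
        pieces++;
    }
    if (avail < want || pieces > maxout)
        return 0;

    /* Commit: consume free extents until the object is placed. */
    size_t n = 0;
    for (size_t i = 0; want > 0; i++) {
        if (fr[i].len == 0)
            continue;
        unsigned long take = fr[i].len < want ? fr[i].len : want;
        out[n].start = fr[i].start;
        out[n].len = take;
        n++;
        fr[i].start += take;
        fr[i].len -= take;
        want -= take;
    }
    return n;
}
```

With ten direct slots in the inode, passing maxout = 10 keeps even a
fragmented object out of 'ie'.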

Michael.
Received on Tue Jul 29 2003 - 13:15:52 MDT