Re: Squid-FS

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Wed, 22 Apr 1998 22:10:44 +0200

Alex Rousskov wrote:

> If you go back to the previous posts on this topic, you will
> see that by SquidFS I mean "big", "DB-style" files with many
> objects per file _on top_ of an existing FS. Thus, we gain
> performance benefits (see previous posts for a long list) and
> preserve advantages of Unix FS (recovery and such).

No. You lose the structural consistency checks, block allocation and the
crash recovery built into an OS filesystem. Using one big file and
writing directly to a partition are essentially the same thing (except
that writing directly to a partition has slightly less overhead).

There are many things one can do to optimize the performance of a
standard filesystem: tuning OS/FS parameters, designing the application
in such a way that the OS caches get used efficiently, and using
asynchronous features to avoid blocking. Much of this is needed when
implementing a private FS as well.
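
To illustrate the last point (a rough, untested sketch, not actual
Squid code): with POSIX AIO a disk read can be queued and then polled
from the event loop, so the select() loop never blocks on the disk.
The descriptor and buffer here are hypothetical:

/* Queue a read without blocking the select() loop. */
#include <aio.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

static struct aiocb cb;
static char buf[8192];

int start_read(int fd, off_t offset)
{
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = offset;
    if (aio_read(&cb) < 0) {    /* queue the read, return at once */
        perror("aio_read");
        return -1;
    }
    return 0;
}

/* Poll from the event loop; returns bytes read, 0 if still pending. */
ssize_t check_read(void)
{
    if (aio_error(&cb) == EINPROGRESS)
        return 0;               /* not done; keep serving sockets */
    return aio_return(&cb);     /* completed: number of bytes read */
}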

What we gain by using a private FS is getting rid of the FS overhead
imposed by having one extra level of indirection (the directories), and
the user<->kernel mode switch overhead on open/close operations.
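
A rough sketch of that idea (hypothetical layout, not an actual Squid
design): with all objects stored in one large file that is opened once
at startup, an object fetch is a single pread() at a computed offset,
with no directory lookup and no per-object open()/close() pair:

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define SLOT_SIZE 4096          /* hypothetical fixed slot size */

static int store_fd = -1;       /* opened once at startup */

int store_open(const char *path)
{
    store_fd = open(path, O_RDWR);
    return store_fd < 0 ? -1 : 0;
}

/* Fetch one slot; one pread() replaces open()+read()+close(). */
ssize_t store_read_slot(unsigned slot, char *buf)
{
    return pread(store_fd, buf, SLOT_SIZE, (off_t)slot * SLOT_SIZE);
}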

> Is reducing hit response time by half "huge"?

It depends. Given the nature of uncached HTTP it probably isn't. I would
say that high sustained throughput is far more important than snappy
(as opposed to quick) response times, and of course most important of
all is stability, and speedy recovery when it does fail.

> At least a third of hit response time is due to local disk I/O.
> The actual effect is probably bigger (>=50%?) because disk I/Os
> compete with network I/Os in one select loop. I have no hard proof
> that SquidFS can eliminate all of this overhead, of course. These
> are just estimations.

I/O will always be there unless you have enough RAM to keep all objects
in RAM. And if the OS is tuned appropriately (enough directory and
inode caches, and no atime updates) the number of disk accesses should
be roughly the same.

I don't think a WWW cache can be compared with a Usenet news server
(this was where the discussion started previously). There are very
large differences between the two types of servers with respect to
storage maintenance.

Differences that pop to mind (not based on real statistics):

* Object size
News: Most objects are of similar size.
WWW: Object sizes vary much more.
= harder to control fragmentation

* Object "creation" (when new objects needs to be stored)
Not much difference here if a contignous News feed is used. Batched news
transfers is another issue..
= I don't think this matters...

* Object deletion
News: Almost FIFO.
WWW: Random order (LRU).
= very hard to control fragmentation (see the sketch after this list)

* Object re-validation and updates
News: Never done.
WWW: Objects need to be re-validated and slightly updated (updated/added
headers).
= no difference

* Object access
News: Index listings and objects. Objects are accessed in a somewhat
linear order by the client. No "misses".
WWW: Freshness queries and objects. No index listings. More random
access order, except that inline images are fetched/checked immediately
after the page which refers to them.
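
On the fragmentation point above: one common way a private store could
cope with widely varying object sizes and random-order (LRU) deletion
is size-class buckets, where each object is rounded up to a fixed slot
size so freed slots stay reusable, at the cost of some internal
fragmentation. A hypothetical illustration, not a Squid design:

#include <stddef.h>

#define MIN_BUCKET 512          /* smallest slot size, in bytes */
#define NBUCKETS   8            /* 512 bytes up to 64 KB */

/* Map an object size to a bucket index; -1 if too large. */
int size_to_bucket(size_t size)
{
    size_t slot = MIN_BUCKET;
    int i;
    for (i = 0; i < NBUCKETS; i++, slot <<= 1)
        if (size <= slot)
            return i;
    return -1;                  /* oversized objects need separate handling */
}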

/Henrik
