Re: [squid-users] file system type/params optimal for squid?

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sun, 26 Oct 2003 09:48:23 +0100 (CET)

On Sat, 25 Oct 2003, Linda W. wrote:

> I'm slightly confused -- do you mean reiserfs is best out of the
> journalled fs's, or best including non-journaled async (ext2? fat32?)
> fs's.

From what I recall it even performed better than ext2, but I recommend
that you find the benchmark results to verify this. The benchmarks run by
Joe Cooper are somewhere on the swelltech.com web site.

> Doing benchmarks right is fairly difficult. So many variables. So many
> parameters can affect things.

Not really. The polymix-4 workload is a standard workload and has only two
variables:

a) The size of your cache

b) The request rate you want to check that the proxy can handle

but it is rather time consuming.

> Like just choice of fs's default allocation unit. If a format prog has
> defaults of a 512-byte allocation block, it might make a big difference
> in a test where another sets up for 16Kb blocks. Defaults could explain
> a difference in performance if most read/writes are >512 bytes and
> <16Kb.

This is all about figuring out which cache server configuration is the
best. Without benchmarking, all one can do is guess.
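
To put rough numbers on the allocation-unit question quoted above, here is
a minimal sketch of the arithmetic. It assumes a simple filesystem that
stores every file in whole fixed-size blocks and ignores features such as
tail packing. The object sizes are made up for illustration and are not
measured Squid data, and the function name is hypothetical.

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Bytes consumed on disk when a file is stored in whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


# Hypothetical cached-object sizes in bytes (illustrative only).
object_sizes = [300, 2_000, 9_000, 40_000]

for block_size in (512, 16 * 1024):
    used = sum(allocated_bytes(size, block_size) for size in object_sizes)
    blocks = used // block_size
    slack = used - sum(object_sizes)
    print(f"{block_size:>6}-byte blocks: {blocks:>4} blocks, "
          f"{used} bytes allocated, {slack} bytes of slack")
```

Small blocks waste less space but need many more blocks per object, while
large blocks do the opposite. Which effect dominates for a real cache is
exactly what a benchmark has to show.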

> Do you know off hand what Reiserfs's default alloc size is?

It works differently. See the Reiserfs documentation for details.

> Aren't ext2 and fat32, ufs, etc....all pretty much async/non-journaled?
> Weren't they (and in many cases, still are) used for decades without
> being "sensitive"?

fat32 is mostly synchronous on most systems built on fat32 (i.e. DOS,
Windows).

In NT fat32 is fully asynchronous.

ext2 in Linux is somewhere in between due to how Linux manages its
buffer/cache, giving most of the benefits of asynchronous filesystem
operations while at the same time providing reasonable crash resistance.

> Bugs happen in journaling fs's too -- all of the files I'd modified in
> the previous day had '0's written throughout them.

Most journaled filesystems do not journal the file contents, only the
filesystem structure (directories, file lengths, block allocations etc).
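
Since the journal only protects the structure, an application that cares
about the contents of a file has to flush them itself. The following is a
minimal sketch of one common pattern (write to a temporary file, fsync it,
then rename it into place), assuming a POSIX system. The function name
durable_write is hypothetical, and the exact guarantees still depend on
the filesystem and its mount options.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data`, flushing the data to disk before the rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to write the data to disk
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Flush the directory entry as well, so the rename itself is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

For a cache this is usually not worth the cost, since lost objects can
simply be fetched again.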

> What other windows file system would one compare NTFS to? BTW, at one
> point, I thought I remember fat32 being synchronous on linux.

You can mount any filesystem synchronously on Linux if you like. This is
commonly recommended for floppy disks, as users have a tendency to remove
the floppy without first unmounting the filesystem.
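
The cost of synchronous operation can be felt without remounting anything,
because a single file can be opened with O_SYNC. This is only a rough
per-file analogue of a synchronous mount, not the same thing. The sketch
below assumes a Unix-like system where os.O_SYNC is available. The function
name, write count and block size are arbitrary choices for illustration.

```python
import os
import time


def timed_writes(path: str, sync: bool, count: int = 50, size: int = 4096) -> float:
    """Write `count` blocks of `size` bytes to `path`; return elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync:
        # Each write returns only after the data has been handed to the device.
        flags |= os.O_SYNC
    fd = os.open(path, flags, 0o600)
    block = b"x" * size
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, block)
    finally:
        os.close(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    print("buffered:   ", timed_writes("buffered.tmp", sync=False))
    print("synchronous:", timed_writes("synchronous.tmp", sync=True))
```

The absolute numbers depend entirely on the disk and filesystem, so treat
the output as a feel for the difference rather than as a benchmark.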

> Theoretically, with no support for access rights, file owner and limited
> time field accuracy, FAT32 should run faster than ext2.

Again it depends.

The ext2 design makes it a lot easier for the filesystem to maintain many
concurrent operations than the fat32 design.

> But -- for a 'temporary internet cache', how much fault tolerance does
> one need? I could see (if memory was cheap enough) of running squid on
> a RAM disk. If your server stays up for a month at a time, I think the
> effects of losing the cache once a month would be negligible compared to
> the benefit of zero ms disk access...

Agreed, but rather expensive if you want a large cache.

> I dunno...the algorithms to store and retrieve data in a database might
> have been given more research bucks to be optimized for speed than the
> squid database on top of a file system delay.

Maybe, but instead you get the significant overhead of having to pass all
information between Squid and the DB server.

Also, I do not think you want a relational database for cache content.
Relational databases are optimized for content where all the records have
the exact same size, while web cache data varies a lot in size.

> What if it is an asynchronous/buffered rewritable CD? :-)

Then you are effectively running a RAM disk the size of one CD.

Regards
Henrik
Received on Sun Oct 26 2003 - 01:48:33 MST
