Re: (reiserfs) Re: files with numeric names (fwd)

From: Kevin Littlejohn <darius@dont-contact.us>
Date: Tue, 07 Sep 1999 09:26:12 +1000

>>> Blue Lang wrote
>
>
> Figured someone here might be interested in some of this thread, so I'm
> cross posting it.. Anyone got any feedback?
>
> --
> Blue Lang
> IBM Global Services
> P: (919) 486-5183 E: wdlang@us.ibm.com
>
> ---------- Forwarded message ----------
> Date: Mon, 6 Sep 1999 14:20:21 +0200
> From: Russell Coker <russell@coker.com.au>
> To: Matthew Kirkwood <weejock@ferret.lmh.ox.ac.uk>
> Cc: reiserfs@devlinux.com
> Subject: (reiserfs) Re: files with numeric names
>
> >> Assertion: Squid and INN have file access patterns that are different
> >> from all other applications.
> >
> >Is this really true? I would have thought that they just pushed the
> >filesystem a lot harder.
>
> Squid: average file size ~13K.
> INN: average file size <4K.
> My root file system (includes a small squid cache): average file size 25K.
>
> I consider that alone to be convincing evidence that INN and Squid have
> different access patterns.

Yup. Not only size of files, but the patterns of access are different too
- both between squid and INN, and between those two and your average
filesystem.
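For anyone wanting to reproduce the average-file-size figures quoted above on their own spool or cache, a minimal sketch follows. The function name `average_file_size` is made up for illustration; it walks a directory tree and counts regular files only, which is an assumption about how the quoted numbers were measured.

```python
import os
import stat


def average_file_size(root):
    """Return (file_count, mean_size_in_bytes) for regular files under root."""
    total = 0
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
                count += 1
    mean = total / count if count else 0.0
    return count, mean
```

Note that this only measures the size distribution; it says nothing about the order or frequency of access, which is the other half of the argument.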

> >> Assertion: a file system could recognise the files used by such
> >> programs by their numeric file names (INN) and hexadecimal file names
> >> (Squid).
> >
> >I think you'd get a surprising number of false positives this way.
>
> Do any examples come to mind?

Hrm. Personally, I'd count this as a bad thing - it'd be a feature I'd
want to turn off. _I_ know which fs is being used by the cache - asking
_me_ to set it up specially is perfectly reasonable. Allowing the fs to
'guess' seems like asking for trouble, for very little win...
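To make the false-positive worry concrete, here is a rough sketch of the kind of name-based guess being proposed. Everything in it is hypothetical: `guess_application` and the two patterns are illustrative only, not how any filesystem, INN, or Squid actually behaves.

```python
import re

# Hypothetical patterns: all-digit names for INN, all-hex names for Squid.
NUMERIC = re.compile(r"[0-9]+")
HEXADECIMAL = re.compile(r"[0-9A-Fa-f]+")


def guess_application(filename):
    """Guess which application owns a file purely from its name."""
    if NUMERIC.fullmatch(filename):
        return "inn"
    if HEXADECIMAL.fullmatch(filename):
        return "squid"
    return None


# Names that have nothing to do with news or web caching still match.
for name in ("12345", "0000A3F1", "1999", "cafe", "README"):
    print(name, guess_application(name))
```

An ordinary file called `1999` is classified as INN and one called `cafe` as Squid, and a hex name that happens to contain only digits is indistinguishable from a numeric one. Which of these would really turn up in practice is the open question in the thread.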

> >> Theory: a file system could change the way it handles meta-data when
> >> it sees such file names to improve performance of these applications.
> >
> >In what ways would its behaviour change?
>
> Cache directories and other meta-data aggressively to the exclusion of caching
> file data.
> ATIME resolution changes.
> Probably more possibilities exist, I just can't think of them at the moment.

atime stuff definitely. Metadata etc. maybe - although at the higher
usage end, you'd probably find it still doesn't help enough.

The biggest win (I still maintain) for squid will come not from an adaptive
general purpose filesystem, but from a squid-specific fs. The biggest loss
at the moment is the lookup chain squid index -> filename -> inode -> file -
and with the number of files on disk for your average decent-sized squid,
the metadata/directory caches would have to hold almost all of the metadata
to avoid going to disk for it.

Big thing to remember: You only get about a 40-50% hit rate on files in a
squid cache - so your locality is already out the window. That applies to
metadata cache locality, too :( The best answer is not to improve
caching, but to do away with as much need for metadata as possible.
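One way to picture "doing away with metadata" is a store where the cache index maps straight to an offset in one large file, so there is no filename or per-object inode to look up. The sketch below is purely illustrative and is not Squid's design: `SlotStore`, the 16 KiB `SLOT_SIZE`, and the 4-byte length header are all assumptions made for the example.

```python
import io
import struct

SLOT_SIZE = 16 * 1024          # hypothetical fixed slot size
HEADER = struct.Struct("<I")   # 4-byte little-endian payload length


class SlotStore:
    """Fixed-size slots in one file: slot number -> offset, no names."""

    def __init__(self, fileobj):
        self._f = fileobj

    def put(self, slot, payload):
        # A zero length marks an empty slot, so empty payloads are refused.
        if not 0 < len(payload) <= SLOT_SIZE - HEADER.size:
            raise ValueError("payload does not fit in one slot")
        self._f.seek(slot * SLOT_SIZE)
        self._f.write(HEADER.pack(len(payload)) + payload)

    def get(self, slot):
        self._f.seek(slot * SLOT_SIZE)
        header = self._f.read(HEADER.size)
        if len(header) < HEADER.size:
            return None  # slot lies beyond the end of the file
        (length,) = HEADER.unpack(header)
        if length == 0 or length > SLOT_SIZE - HEADER.size:
            return None  # empty (zero-filled) or corrupt slot
        return self._f.read(length)


store = SlotStore(io.BytesIO())
store.put(3, b"cached object")
print(store.get(3))
```

The trade-off is plain: a lookup costs one seek and no directory or inode traffic, but every object occupies a full slot, and objects larger than a slot need some other scheme that this sketch does not attempt.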

> >Do bear in mind that anybody doing serious squid or INN will have
> >separate filesystem (on separate spindles) for the purpose. You
> >may not think that this should have to be the case, but I think that
> >it's a lot nicer than "magic" behaviour.
>
> True. Maybe some of these things could be done better by mount options.

*nod* Definitely, by mount options in the first case - but be _very_
careful what you optimise for anyway ;)

KevinL
Received on Mon Sep 06 1999 - 17:43:16 MDT
