Re: memory-mapped files in Squid from Kevin Littlejohn on 1999-01-28 (squid-dev)

From: Kevin Littlejohn <darius@dont-contact.us>
Date: Fri, 29 Jan 1999 01:09:05 +1100

>>> Oskar Pearson wrote
> Hi
>
> Reiserfs is optimised for many small files. It doesn't use block
> allocations (if I remember correctly), so fragmentation is not really an
> issue. The two problems are that it's:
>
> 1) complex. We don't really want this in Squid unless we know that it works
> 2) best implemented at the kernel level.
> 3) very processor intensive
>

Well, on the heels of that...

I've been waiting, for part of this week, for a better web site to come
up. It's taking too long, and I really want to get what code I have here
out for people to pick on, so...

http://www.bofh.net.au/~darius/squidfs contains a squidfs.tgz file. This
has the code I've been hacking at. Some caveats:

1) it's not hooked into squid yet - as has previously been pointed out, the
   disk access code in squid is spread through several different places, and
   is not particularly clean. I have already adjusted async-io once to
   hook into this - I'm reasonably happy that you could hook in either with or
   without async-io (with is kinda redundant, but might be interesting
   *shrug*)

2) Its interface needs work. In particular, I've just a few days ago come
   to the opinion that it should have an identical interface to the standard
   fs commands (read, write, open, etc). See point one ;) Anyway, some of
   it is half-way through being switched over.

3) It's probably buggy as all hell. I've been playing with it standalone,
   using a dd'ed blank file as a mock drive. I've only written small blocks,
   but it can at least read and write under 4K of data ;)

4) I'm not a C programmer - the code may be a _tiny_ bit crusty...
   In particular, the types used for fd's and inodes and so forth are
   a bit scrambled at the moment - I'm in the midst of re-organising
   them to be sane...
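
To make caveats 2 and 3 a little more concrete, here is a minimal sketch
(not the code in squidfs.tgz) of what an interface shaped like the standard
open/read/write/close calls could look like on top of a zero-filled file
standing in for a drive. Everything in it is an assumption made up for
illustration: the sfs_* names, the fixed 4K slot per object, and the small
handle table. It says nothing about how the real on-disk layout works.

```c
/* Sketch only: the sfs_* names and the slot layout are hypothetical. */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define SFS_SLOT_SIZE 4096   /* assumed: one object per fixed 4K slot */
#define SFS_MAX_OPEN  16     /* assumed: size of the open-handle table */

static int sfs_drive = -1;   /* fd of the backing "drive" file */
static int sfs_slots = 0;    /* number of slots on the mock drive */
static struct { int used; off_t base; off_t pos; } sfs_tab[SFS_MAX_OPEN];

/* Create a zero-filled backing file, roughly what
 * `dd if=/dev/zero of=path bs=4096 count=slots` would produce. */
int sfs_mkdrive(const char *path, int slots)
{
    static const char zeros[SFS_SLOT_SIZE];
    int i, fd;
    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    for (i = 0; i < slots; i++) {
        if (write(fd, zeros, sizeof zeros) != (ssize_t)sizeof zeros) {
            close(fd);
            return -1;
        }
    }
    if (sfs_drive >= 0) close(sfs_drive);
    sfs_drive = fd;
    sfs_slots = slots;
    return 0;
}

/* Like open(2): returns a small integer handle, or -1 on error. */
int sfs_open(int slot)
{
    int h;
    if (sfs_drive < 0 || slot < 0 || slot >= sfs_slots) return -1;
    for (h = 0; h < SFS_MAX_OPEN; h++) {
        if (!sfs_tab[h].used) {
            sfs_tab[h].used = 1;
            sfs_tab[h].base = (off_t)slot * SFS_SLOT_SIZE;
            sfs_tab[h].pos = 0;
            return h;
        }
    }
    return -1;               /* handle table is full */
}

/* Validate the handle, clamp len to what is left in the slot, and
 * seek the drive to the handle's current position. */
static int sfs_prepare(int h, size_t *len)
{
    size_t room;
    if (h < 0 || h >= SFS_MAX_OPEN || !sfs_tab[h].used) return -1;
    room = (size_t)(SFS_SLOT_SIZE - sfs_tab[h].pos);
    if (*len > room) *len = room;
    if (lseek(sfs_drive, sfs_tab[h].base + sfs_tab[h].pos, SEEK_SET) < 0)
        return -1;
    return 0;
}

/* Like write(2): returns the number of bytes written, or -1. */
ssize_t sfs_write(int h, const void *buf, size_t len)
{
    ssize_t n;
    if (sfs_prepare(h, &len) < 0) return -1;
    n = write(sfs_drive, buf, len);
    if (n > 0) sfs_tab[h].pos += n;
    return n;
}

/* Like read(2): returns the number of bytes read, or -1. */
ssize_t sfs_read(int h, void *buf, size_t len)
{
    ssize_t n;
    if (sfs_prepare(h, &len) < 0) return -1;
    n = read(sfs_drive, buf, len);
    if (n > 0) sfs_tab[h].pos += n;
    return n;
}

/* Like close(2): releases the handle; the data stays on the drive. */
int sfs_close(int h)
{
    if (h < 0 || h >= SFS_MAX_OPEN || !sfs_tab[h].used) return -1;
    sfs_tab[h].used = 0;
    return 0;
}
```

A caller would run sfs_mkdrive() once on a scratch file, then sfs_open() a
slot, sfs_write() a small buffer, sfs_close(), and reopen the same slot to
sfs_read() it back. Because the calls have the same shape as the standard
ones, swapping them in and out for measurement should need little more than
a rename.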

That out of the way, I'm putting this forward because I'd love to see
something done specific to squid, and I'm constantly running out of time
to do it. There are numerous comments through the code about design
decisions made - hopefully it's not too bizarre. My big hope, tho, is
not that this is a 'perfect squid fs', but that this gives a good framework,
both easily adaptable and easily measurable, for doing some serious work
on squid-specific fs'es. From the various numbers that have been bandied
around, I suspect every second cache is running under very different
conditions. Given that, then getting something out there that can be
heavily tweaked is probably worthwhile.

(Having said that, Stew and I spent a couple of weeks beating up on each
other's fs ideas, and I'm reasonably confident that this system, as it was
designed if not as it's implemented, will provide good performance, with
(hopefully) not too much fragmentation. I think it also addresses the
other big points well - it's not complex to a ridiculous extent, and it
is very portable - needs threads, and that's about it.)

Ah well, anyway, if it's useful to people, good - if not, I don't lose
anything by putting it in public (other than having put ugly code out for
people to see... ;)

Oh, one other thing - that machine is on the far end of a 64K link - please
be gentle on it. I'm in the process of getting a better-connected site
happening... (he says like people will be mobbing it ;)

All feedback gratefully accepted... ;)

KevinL
(extremely nervous about throwing this code into public - it's nowhere near
'finished' :(

--------------- qnevhf@obsu.arg.nh ---------------
Kevin Littlejohn,
Technical Architect, Connect.com.au
Don't anthropomorphise computers - they hate that.
Received on Thu Jan 28 1999 - 15:10:33 MST
