Re: filesystems

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Thu, 29 Mar 2001 20:15:57 +0200

Kevin Littlejohn wrote:

> Agreed with the rest of the post, and I'd figured that much out. However, my
> concern with the above is I'm not clearly seeing how moving the async handling
> in aufs _down_, into shared code, would be a good thing - aufs is the only
> thing that uses the request_queue/request_queue2 structure, and to my mind
> _should be_ the only thing that does - otherwise we risk making diskd and ufs
> rely on pthreads, don't we?

aufs is the only one that uses any part of async-io.

disk.c is not part of async-io. Not a bit of it is used.

The request queues are in async-io, not in aufs.

If the FS code is to be made generic, then how asynchronous I/O is
implemented must be abstract to the FS code. The current way aufs
manages asynchronous opens is not very abstract; other async
implementations (like diskd) may well want to do it differently.

> Surely the request queues are specific to aufs, but most of the rest of the
> code is shareable - the actual mechanisms for talking to disk, for instance,
> or the creation of request objects/callback data, and the eventual tracking of
> what objects exist in the fs?

The request queues are exactly as shareable as aio_do_read(): both sit
at a VERY LOW level in the implementation.

> I can't wrap my head around why we'd want a queueing mechanism in ufs beyond
> what it already has, is basically the problem - and what it already has could
> be subsumed into aufs as "if (!async) { }" type blocks anyway, it's that
> trivial - especially if the _do_read() _do_write() etc. calls were common
> (blocking) code.

Note that I am saying that the whole of disk.c is obsolete in the first
place. It is only used by "ufs", and "ufs" does not actually need any
queues at all. The only reason it is the way it is is historical.

> I kind of envisage something like:
>
> storeRead() gets called to read data - it builds a request object, and either
> calls ufs_read() and then the appropriate callback in the case of ufs, or
> queues the request object and returns in the case of aufs. If aufs, then when
> the request is taken off the request queue to be handled, it would call
> ufs_read(), and then any appropriate callbacks.

So how do you envision this being done in diskd, or some other approach
where the actual reading is performed in a completely different process
on completely different filehandles?

The queuing methods for diskd and async-io are completely different;
the two approaches to I/O have very little in common.

I think you may have got the layering in the wrong order here. Currently
the layering is:

store -> fs -> I/O

where the I/O layer can be implemented asynchronously using OS
primitives (diskd or async-io).

The fs layer is asynchronous using Squid primitives, just as the rest
of Squid processing.

ufs, aufs and diskd share most of the fs code by duplication today, but
very little of the I/O code. There are some minor differences in the FS
code due to I/O layer differences, and as long as those differences
remain, merging the three FS implementations into one common
implementation with different I/O backends is a bit hard..

Other FS:es might want to have the same I/O models available to
implement asynchronous I/O in their storage format, but as long as the
I/O models are a direct part of their FS, sharing the I/O model with
another FS is not that easy either.

So what I proposed is to make a library of the common "ufs" functions,
capable of being used together with any of the three I/O
implementations. As part of this, the APIs of the I/O implementations
need to be homogenized.

Now some buts:

a) By doing generalizations like this, several possibilities for large
optimizations are closed off. It is a fact that "aufs" is long overdue
for a major overhaul to optimize the type and frequency of I/O calls,
and the overhead involved in each such call. Such an optimization will
require changes in the "fs" code of "aufs" as well as in the I/O code,
making it differ more from "ufs" than it does today. The same class of
optimizations is possible in diskd.

b) Once the above is done, the "async-io" layer is most likely less
suitable for sharing with another FS implementation that has different
requirements on the I/O layer. The same applies to "diskd".

What async-io/diskd provide is no more than asynchronous
open/read/write/close system calls. Neither cares what type of Squid
"FS" is being used, or for what.

The layering currently in place in async-io/diskd is perhaps not the
best suited for asynchronous FS implementations in Squid. For better
operation, the "ufs" class of stores needs larger macro operations like
"store this finite amount of data into file XXXX", which would open the
file, store the contents and then close the file as one large async
operation. This avoids too much switching of control between the main
thread and the worker thread to complete a simple operation.

/Henrik
Received on Thu Mar 29 2001 - 11:32:27 MST