Re: small patch for async writes

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Tue, 15 Mar 2005 02:37:09 +0100 (CET)

On Mon, 14 Mar 2005 swilton@q-net.net.au wrote:

> I tried enabling the ASYNC_WRITE define in src/fs/aufs/store_asyncufs.h
> and found that the the cache log was filling up with "this be aioCancel.
> Danger ahead!" errors. The squid-users list says that these errors are
> harmless, but I found that they were not, as objects were not being fully
> written to disk. The attached patch seems to fix this by fixing what
> seems to be a logic flaw.

Indeed.

And when revisiting the code I even at first thought both the ASYNC_WRITE
and blocking cases should be like in the (corrected) ASYNC_WRITE case, but
the loop_detect gave a hint to why the blocking case looks the way it
does. The loop seen there collapses the recursion when a lot of data has
been queued while creating the file, preventing a stack overflow.

But here is a problem in the blocking code path should a write error
occur. In such cases the SIO will be terminated on the write error and
then the control returns here and true danger is ahead.

> The only thing I'm not sure about is what will happen if another request
> wants to open this file while flags.close_request is set to 1. This
> would be a problem, the solution to this would be to get the process to
> block until all io has finished on this filedescriptor.

Hmm.. no this is fine. close_request is unrelated to writing as such, it
just indicates all writes have been queued and the object is complete once
all of them has been written.

At first I thought there was a generic race on ASYNC_WRITE where reads may
be attempted by the core before the writes had completed but this is
guarded by the offset only increased when the write has completed so this
is safe.

As a second glance there appeared to be another small race in aufs from
the fact that the file_callback is not used. This could make the core
think that it can attempt opening the file even before the create
operation has completed. But then again this is also protected by the
offset mentioned above as no new clients will attempt to open the file for
reading until there is confirmed data stored in it which guarantees the
create has succeeded.

Please note that ASYNC_WRITE is deliberately turned off by default as the
writes are effectively absorbed by the OS cache provided the disk
subsystem is not totally saturated, while reads blocks for a significant
amount of time even on an idle drive..

The overhead of scheduling an async i/o operation in aufs is quite
substantial so I try to not schedule async I/O operation when it can be
predicted the operation most likely won't block.

On OS:es where meta updates is not asyncronous you probably want to enable
ASYNC_CLOSE however. But as was for ASYNC_WRITE this code path has not
been tested in quite some time and is in fact not even implemented at
all.. (always ASYNC.. most likely not what I intended...)

Regards
Henrik
Received on Mon Mar 14 2005 - 18:37:12 MST

This archive was generated by hypermail pre-2.1.9 : Fri Apr 01 2005 - 12:00:04 MST