Re: Race condition in storeClose()

From: Brian Degenhardt <bmd@dont-contact.us>
Date: Wed, 22 Nov 2000 13:26:54 -0800

On Wed, Nov 22, 2000 at 08:19:37PM +0200, Andres Kroonmaa wrote:
> On 22 Nov 2000, at 23:36, Adrian Chadd <adrian@creative.net.au> wrote:
>
> > If there's a problem with a client FD, it will destroy the connection.
> > Eventually it hits storeUnregister(), but it doesn't check if there are
> > pending disk IOs (and close them if they DO exist.)
> >
> > I think this is a bad thing, and could be causing our problems with
> > random memory trashings, since a pending read would complete, and
> > you'd overwrite the memory if you're not careful.
>
> It is causing the problem. To chase it I flagged freed mempool
> objects as read-only vm-pages and watched who was writing to them. I
> consistently got crashes from within aioread copying back to the user
> buffer.
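
For anyone who wants to try the same trick, here is a minimal sketch of
the general idea, assuming POSIX mmap()/mprotect() and one pool object
per page-aligned mapping. The sketch_* names are made up for
illustration; this is not the actual mempool code or the patch Andres
used.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round a size up to a whole number of pages (mprotect works per page). */
static size_t
sketch_round_to_pages(size_t size)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    return (size + page - 1) / page * page;
}

/* Hypothetical allocator: every object gets its own page-aligned mapping. */
void *
sketch_pool_alloc(size_t size)
{
    void *p = mmap(NULL, sketch_round_to_pages(size),
        PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical "free": instead of recycling the object, mark its pages
 * read-only. A later store through a stale pointer then faults, and the
 * core dump shows who did the write. Returns 0 on success, -1 on error.
 */
int
sketch_pool_free(void *p, size_t size)
{
    assert(p != NULL);
    return mprotect(p, sketch_round_to_pages(size), PROT_READ);
}
```

Reads of a "freed" object still work, so only the late writer trips.
The pages are never reused, so this is for debugging runs only.
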
>
> > So, ideas here? Should the FS handle killing pending disk IOs on a
> > storeIOState on a storeClose() / storeUnlink() ? Or should we
> > implement a storeCancel() call to cancel an IO, which is called from
> > storeUnregister() .. ?
>
> Logically, when we write()+close(), we expect the data to reach the
> disk, but when we close() before a read() completes, we expect the
> read to be canceled. So I think storeClose() should do it. It just
> needs to distinguish between reads and writes.
>
> > People running squid-HEAD / squid-2.4 with aufs who can replicate
> > the crashes, please try this patch:
> >
> >      if (storeAufsSomethingPending(sio)) {
> >          aiostate->flags.close_request = 1;
> > -        return;
> > +        /* return; */
> >      }
> >      storeAufsIOCallback(sio, DISK_OK);
>
> Another option is to call aioClose() before the return, or just
> aioCancel(). The current code simply asks aufs to close the fd after
> its pending op completes. With canceled reads this isn't what we want.
>
> Just one issue with this: do we actually need to allow pending IO to
> complete? Perhaps we don't want to cancel aiowrite but want to append
> a close command into the aio queue.
> If we simply cancel pending io above, could we possibly have partially
> written objects in store?

I'm getting lots of these:

2000/11/22 13:24:02| WARNING: failed to unpack meta data
2000/11/22 13:24:11| WARNING: failed to unpack meta data
2000/11/22 13:24:13| WARNING: failed to unpack meta data

Isn't that the sign of a partially written object?
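
If it is, then the close path probably has to treat the two cases
differently, along the lines Andres describes above: cancel a pending
read, but let a pending write drain before the fd is closed. Below is a
rough sketch of that policy. Every name in it is made up (none of these
types or functions exist in aufs), and it only models the decision; it
does no real I/O.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a store I/O state; not the real storeIOState. */
typedef enum { OP_NONE, OP_READ, OP_WRITE } sketch_op_t;
typedef enum { CLOSE_NOW, CLOSE_AFTER_CANCEL, CLOSE_AFTER_DRAIN } sketch_close_t;

typedef struct {
    sketch_op_t pending;        /* operation still queued, if any */
    int close_request;          /* close deferred until the write finishes */
    int cancelled;              /* a pending read was cancelled */
    int closed;                 /* fd has been closed */
} sketch_io_state;

/* Decide what a close should do based on what is still pending. */
sketch_close_t
sketch_store_close(sketch_io_state * sio)
{
    assert(sio != NULL);
    switch (sio->pending) {
    case OP_READ:
        /* Nobody wants the data any more: cancel, so a late completion
         * cannot copy into a buffer that has already been freed. */
        sio->cancelled = 1;
        sio->pending = OP_NONE;
        sio->closed = 1;
        return CLOSE_AFTER_CANCEL;
    case OP_WRITE:
        /* Let the write finish so the object is not left half-written;
         * only remember that a close is wanted. */
        sio->close_request = 1;
        return CLOSE_AFTER_DRAIN;
    default:
        sio->closed = 1;
        return CLOSE_NOW;
    }
}

/* Called when the queued write completes; performs the deferred close. */
void
sketch_write_done(sketch_io_state * sio)
{
    assert(sio != NULL);
    sio->pending = OP_NONE;
    if (sio->close_request) {
        sio->close_request = 0;
        sio->closed = 1;
    }
}
```

With something like this, a close during a read returns right away with
the read cancelled, while a close during a write only sets
close_request and the fd is closed from sketch_write_done().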

> Yet we have to cancel a pending read immediately, I believe.
>
>
> ------------------------------------
> Andres Kroonmaa <andre@online.ee>
> Delfi Online
> Tel: 6501 731, Fax: 6501 708
> Pärnu mnt. 158, Tallinn,
> 11317 Estonia
>
Received on Wed Nov 22 2000 - 14:27:01 MST