Re: [adrian@creative.net.au: [bugzilla-daemon@squid-cache.org: [Bug 451] - Occasional corruption of objects]]

From: Robert Collins <robertc@dont-contact.us>
Date: 04 Nov 2002 10:30:36 +1100

On Mon, 2002-11-04 at 10:21, Adrian Chadd wrote:
>
> Guys,
>
> I've had another person poke me about this, also involving mass
> apt-get upgrades.
>
> I'm not entirely sure how to go about fixing this - I thought about
> setting up a http server which will _only return_ chargen-style
> content files, and having squid enforce this particular data body.
>
> I think this is a "drop everything and fix" bug, and it will almost
> certainly take quite a bit of effort to find. :(
>
> Any suggestions on where to look in the 2.5 branch?

I've not yet reviewed in detail the attached patch, but it looks good to
me at first glance.

Rob

attached mail follows:


After weeks of poking around in the code, I've finally figured out what was causing download corruption in Squid 2.4 and 2.5.

Note that the corruption does not occur in UFS due to its synchronous file access, so if you don't mind slow performance, stick with that store format. For the performance minded who use AUFS, read on.

The scenario:

1) user makes request
2) squid sets up read request in AIO queues:

2002/10/28 14:04:31| storeAufsOpen: fileno 00007DFE
2002/10/28 14:04:31| storeAufsRead: queueing read because FD < 0

3) comm_poll runs again, and sees that client hit 'Stop' on browser:

2002/10/28 14:04:31| clientReadRequest: FD 66: reading request...
2002/10/28 14:04:31| commSetSelect: FD 66 type 1
2002/10/28 14:04:31| clientReadRequest: FD 66: (104) Connection reset by peer
2002/10/28 14:04:31| comm_close: FD 66

4) storeAufsClose marks the flags->close_request = 1 on the sio:

2002/10/28 14:04:31| storeAufsClose: dirno 0, fileno 00007DFE, FD -1

5) squidaio_poll_done hasn't heard about any cancellation, so it keeps going:

opens:
2002/10/28 14:04:31| squidaio_poll_done: 0x86d7878 type=1 result=0x86d8a3c
2002/10/28 14:04:31| OPEN of /content/00/2D/00007DFE to FD 17
2002/10/28 14:04:31| DONE: 17 -> 0

***reads (and writes results to memory areas which were freed above!!!!!)***
2002/10/28 14:04:31| squidaio_poll_done: 0x86d7878 type=2 result=0x86d55fc
2002/10/28 14:04:31| READ on fd: 17
2002/10/28 14:04:31| DONE: 2132 -> 0

closes:
2002/10/28 14:04:31| squidaio_poll_done: 0x86d7878 type=4 result=0x86d8a3c
2002/10/28 14:04:31| CLOSE of fd: 17
2002/10/28 14:04:31| DONE: 0 -> 0

At this point, the request has been written out to a random memory area. Unfortunately, it occasionally ends up in someone else's download. Thus, the corruption.

The fix is simply to move the check for close_request a little higher up in store_io_aufs.c so that it occurs after the OPEN and before the READ).

I was able to replicate the corruption at least once a minute in my testing. The patch below has been running solid for the past few hours with ZERO corruption.

Although I never heard back from the squid developers as to whether they were working on this, hopefully they will have something to say now that it has (ostensibly) been solved. Hello...anybody there?

The standard disclaimers apply of course: Works for Me, YMMV, etc.

-Phil Oester
(not a squid developer, but I did sleep at a Holiday Inn express last night)
 

--- original/src/fs/aufs/store_io_aufs.c Sun Aug 11 16:14:35 2002
+++ fixed/src/fs/aufs/store_io_aufs.c Mon Oct 28 14:39:32 2002
@@ -286,6 +286,10 @@
     aiostate->fd = fd;
     commSetCloseOnExec(fd);
     fd_open(fd, FD_FILE, storeAufsDirFullPath(INDEXSD(sio->swap_dirn), sio->swap_filen, NULL));
+ if (aiostate->flags.close_request) {
+ storeAufsIOCallback(sio, errflag);
+ return;
+ }
     if (FILE_MODE(sio->mode) == O_WRONLY) {
        if (storeAufsKickWriteQueue(sio))
            return;
@@ -293,8 +297,6 @@
        if (storeAufsKickReadQueue(sio))
            return;
     }
- if (aiostate->flags.close_request)
- storeAufsIOCallback(sio, errflag);
     debug(79, 3) ("storeAufsOpenDone: exiting\n");
 }

Received on Sun Nov 03 2002 - 16:30:39 MST

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:18:36 MST