[squid-users] [PATCH] 2.5s1 Download corruption

From: Phil Oester <kernel@dont-contact.us>
Date: Fri, 11 Oct 2002 13:30:19 -0700

I have been digging a bit deeper into the file corruption bug that I and others (see bug #451) have been experiencing. After liberal use of 'squid -k debug', I came up with the following scenario for the corruption.

1) Client A makes a request; Squid retrieves it and starts sending data to the client (for ease of testing, a large request is preferred)

2) Client B makes request:

2002/10/10 11:30:37| comm_poll: FD 66 ready for reading
2002/10/10 11:30:37| clientReadRequest: FD 66: reading request...
2002/10/10 11:30:37| commSetSelect: FD 66 type 1
2002/10/10 11:30:37| parseHttpRequest: Method is 'GET'
2002/10/10 11:30:37| parseHttpRequest: URI is '/images/ddm3h.gif'

Squid retrieves Client B's request from disk; meanwhile, writes for Client A continue:

2002/10/10 11:30:37| comm_poll: FD 273 ready for writing
2002/10/10 11:30:37| commHandleWrite: FD 273: off 0, sz 4096.
2002/10/10 11:30:37| commHandleWrite: write() returns 4096
2002/10/10 11:30:37| clientWriteComplete: FD 273, sz 4096, err 0, off 3469312, len -1

3) comm_poll runs again:

2002/10/10 11:30:37| comm_poll: 2+0 FDs ready
2002/10/10 11:30:37| comm_poll: FD 66 ready for reading

Odd...why is FD 66 ready for reading? Should be writing the data...

2002/10/10 11:30:37| clientReadRequest: FD 66: reading request...
2002/10/10 11:30:37| commSetSelect: FD 66 type 1
2002/10/10 11:30:37| clientReadRequest: FD 66: (104) Connection reset by peer

Ahh...received ECONNRESET from client! In that case, close up shop on this FD

        comm_close(fd);

Of course, that comm_poll showed 2 FDs ready.

2002/10/10 11:30:37| comm_poll: FD 66 ready for writing

Hmm...that could be a problem - we just closed that FD. More on that later...
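
To make the suspected sequence concrete, here is a minimal toy model of one poll pass, not Squid's actual code. The names fd_entry, toy_close, dispatch_events and run_scenario are all hypothetical, and the toy close deliberately leaves the write handler in place, which may or may not match what comm_close really does. It only shows that the write branch is evaluated against revents captured before the read handler ran, and that an open-flag check skips it.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

#define MAX_FD 8

typedef void handler_fn(int fd);

/* Hypothetical, simplified stand-in for Squid's per-FD state. */
struct fd_entry {
    int open;                    /* 1 while the descriptor is in use */
    handler_fn *read_handler;
    handler_fn *write_handler;
};

static struct fd_entry fd_table[MAX_FD];
static int writes_after_close;   /* write handlers run on a closed FD */

/* Toy close (assumption): marks the entry closed, keeps the handlers. */
static void toy_close(int fd) { fd_table[fd].open = 0; }

/* Read handler that acts as if read() had failed with ECONNRESET. */
static void read_then_reset(int fd) { toy_close(fd); }

/* Write handler that only records whether its FD was already closed. */
static void count_write(int fd) {
    if (!fd_table[fd].open)
        writes_after_close++;
}

/* Dispatch one FD's revents, which were captured before any handler ran. */
static void dispatch_events(int fd, short revents, int guard_open) {
    struct fd_entry *F;
    handler_fn *hdl;
    assert(fd >= 0 && fd < MAX_FD);
    F = &fd_table[fd];
    if (revents & (POLLIN | POLLHUP | POLLERR)) {
        if ((hdl = F->read_handler)) {
            F->read_handler = NULL;
            hdl(fd);             /* may close the FD */
        }
    }
    /* With guard_open set, a closed FD never reaches the write branch. */
    if ((!guard_open || F->open) &&
        (revents & (POLLOUT | POLLHUP | POLLERR))) {
        if ((hdl = F->write_handler)) {
            F->write_handler = NULL;
            hdl(fd);
        }
    }
}

/* Run the scenario once; returns how many writes hit a closed FD. */
static int run_scenario(int guard_open) {
    int fd = 3;
    writes_after_close = 0;
    fd_table[fd].open = 1;
    fd_table[fd].read_handler = read_then_reset;
    fd_table[fd].write_handler = count_write;
    dispatch_events(fd, POLLIN | POLLOUT, guard_open);
    return writes_after_close;
}
```

In this model, run_scenario(0) returns 1 (the write handler runs on the closed FD) and run_scenario(1) returns 0 (the open check skips it).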

4) Now, go about business as usual on the request from Client A

2002/10/10 11:30:37| comm_poll: FD 273 ready for writing
2002/10/10 11:30:37| commHandleWrite: FD 273: off 0, sz 4096.
2002/10/10 11:30:37| commHandleWrite: write() returns 4096
2002/10/10 11:30:37| cbdataValid: 0x23570fe0
2002/10/10 11:30:37| clientWriteComplete: FD 273, sz 4096, err 0, off 3473408, len -1

However, the aborted request from Client B somehow shows up in the Client A download at approximately this offset (off 3473408 in the log above).

This is the point where I wish I understood the code well enough to figure out where the memory is getting corrupted. After spending a day or two on it, the best I can come up with is the attached patch, which I originally thought would solve the problem.

As illustrated above, comm_poll shows the aborted FD as ready for read _and_ write, so I figured the attempted write to a closed FD was causing the corruption. Unfortunately, while I still suspect my patch is the right thing to do, it does not solve the corruption.
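
For reference, at the system-call level a write() on a descriptor that has been closed (and not yet reused) simply fails with EBADF. The sketch below checks that with a pipe; the helper name errno_of_write_after_close is made up for illustration, and it says nothing about what Squid's own buffers or callbacks do in that situation.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Returns the errno set by write() on a just-closed descriptor,
 * 0 if the write unexpectedly succeeded, or -1 if pipe() failed.
 */
static int errno_of_write_after_close(void) {
    int fds[2];
    ssize_t n;
    int saved;

    if (pipe(fds) != 0)
        return -1;
    assert(fds[1] >= 0);

    close(fds[1]);                /* close the write end */
    errno = 0;
    n = write(fds[1], "x", 1);    /* stale descriptor number */
    saved = errno;

    close(fds[0]);
    return (n == -1) ? saved : 0;
}
```

On a POSIX system the helper should return EBADF, which is consistent with the write itself being rejected by the kernel rather than landing in another connection, at least as long as the FD number has not been reused in between.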

So at this point, I submit this to you Squid programming gurus out there...please help!!!

-Phil Oester

--- squid-2.5.STABLE1-orig/src/comm_select.c Sat Apr 27 01:48:42 2002
+++ squid-2.5.STABLE1/src/comm_select.c Wed Oct 9 16:10:29 2002
@@ -452,7 +452,7 @@
                        comm_poll_http_incoming();
                }
            }
-           if (revents & (POLLWRNORM | POLLOUT | POLLHUP | POLLERR)) {
+           if (F->flags.open && (revents & (POLLWRNORM | POLLOUT | POLLHUP | POLLERR))) {
                debug(5, 5) ("comm_poll: FD %d ready for writing\n", fd);
                if ((hdl = F->write_handler)) {
                    F->write_handler = NULL;
Received on Fri Oct 11 2002 - 14:30:22 MDT
