Re: HTTP client disconnection vs ICAP disconnection from John Craws on 2011-02-15 (squid-dev)

From: John Craws <john.craws_at_gmail.com>
Date: Tue, 15 Feb 2011 10:55:18 -0500

Alex,

Thank you so much!

I'll start testing with trunk and gathering the information you requested.

John

On Tue, Feb 15, 2011 at 12:19 AM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 02/09/2011 02:49 PM, John Craws wrote:
>> I am including logs of what happens when wget is cancelled.
>>
>> FD 8 = wget
>> FD 9 = origin
>> FD 12 = ICAP
>
> John,
>
> I recall fixing a similar bug not too long ago, but I do not
> remember where it happened. If it is not too difficult, please try using
> Squid trunk even if the eventual fix needs to be propagated to
> your/other Squid version.
>
>
>> noteInitiatorAborted() is scheduled from the ClientHttpRequest
>> destructor for FD 8 (wget). It doesn't seem to be executed however.
>> Shouldn't it be?
>
> The call is not delivered because the recipient does not exist any more:
>
>> 2011/02/07 12:32:30.381| AsyncCall.cc(42) make: will not call
>> Initiate::noteInitiatorAborted [call49421] because of job gone
>
> This means that the initiatee (one of the ICAP jobs that eventually
> creates an ICAP REQMOD transaction and waits for its results) quit some
> time ago. This is normal and, if I understand your report correctly, you
> do not care about REQMOD. You care about RESPMOD.
>
>
> Your problem is not with the HTTP client-side <-> ICAP REQMOD link. Your
> problem is with the HTTP server-side <-> ICAP RESPMOD link. If HTTP
> server job is never destroyed then the ICAP transaction will run until
> the ICAP transaction is done or will get stuck waiting for the
> server-side to supply or consume response body. More on that below.
>
> I suggest that you run this with debug_options set to ALL,9 and post the
> complete log from the time the request starts to 1 minute after the
> client is gone. It would be better if there are no other transactions
> around. Compress and/or file a bug report as needed. If you do this with
> the latest trunk and CC me, I promise to take a look at the log and will
> try to help :-). As usual, others may find and fix the bug faster.
>
> And, just to clarify:
>
> On 02/08/2011 05:38 PM, Amos Jeffries wrote:
>
>>> 2. -> AsyncJob::deleteThis("HttpStateData::httpStateConnClosed") is not
>>> > called.
>
>> Ambiguous whether this is needed. IIRC at this point doneAll() supposedly
>> should return true which would lead to (3) below.
>
> Amos describes the current AsyncJob design correctly. However, Server
> and HttpStateData classes were written well before that design too shape
> and have never been fully converted. I am sure there are some bugs
> floating around there and it is possibly you found one of them.
> Specifically, I do not think Server/HttpStateData implement doneAll()
> correctly. Instead, they often rely on deleteThis(), quitIfAllDone() and
> other conversion hacks.
>
>
>> The server side seems very twisted and does not appear to implement the
>> Jobs API at the leaf HttpStateData level, doing it at the generic
>> ServerStateData level instead.
>
> IMHO, the server side is far less twisted than the client side, but it
> does require an extra effort to see the whole chain of events when it is
> split between generic Server and HTTP-specific HttpStateData (or FTP)
> objects. Based on your description, it sounds like your problem is
> generic (client disconnect affects all Server kids in similar fashion),
> but I may be wrong.
>
>> 1. Squid downloads the response from the origin server pretty fast.
>> 2. ** The ICAP service needs the complete original response before the
>> modified response may be generated for the client.
>> 3. The download of the original response completes and
>> HttpStateData::processReplyBody() sees a COMPLETE_NONPERSISTENT_MSG
>> (because of persistentConnStatus()).
>> 4. serverComplete() is called.
>> 5. serverComplete() calls closeServer().
>> 6. HttpStateData::closeServer() calls comm_close() on the FD, ** but
>> it first removes the close handler using comm_remove_close_handler()
>> **.
>> 7. comm_close() (_comm_close()) calls commCallCloseHandlers(), but the
>> close handler has been removed.
>
> All of the above sounds good. If Server is done talking to the origin
> server, it should close the socket normally. There is no need for
> emergency/unexpected cleanup provided by the comm_close handler.
>
> You need to figure out why the ICAP transaction waits for something if
> the Server thinks it has nothing more to give. Did the Server forget to
> notify the ICAP side that there will be no more virgin response body?
> This is done by calling stopProducingFor().
>
> BTW, note how we do not talk about the client-side at all in the above.
> Based on your description, it appears that the server side is unaware of
> the client abort. This is almost normal because the server code is
> probably not registered to receive StoreEntry abort notifications. It
> will eventually notice the abort when it tries to write to that entry.
> There is room for optimization here, but your primary problem is
> probably elsewhere (i.e., in the stuck ICAP transaction, waiting for
> something from the server-side, if I understand your report correctly).
>
>
> HTH,
>
> Alex.
>
>
>
>
>
>
>
Received on Tue Feb 15 2011 - 15:55:25 MST

This archive was generated by hypermail 2.2.0 : Tue Feb 15 2011 - 12:00:04 MST