Re: [PATCH] Bug 3643: Connection Auth redesign

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 28 Mar 2013 19:41:46 +1300

On 28/03/2013 7:50 a.m., Alex Rousskov wrote:
> On 03/27/2013 05:38 AM, Amos Jeffries wrote:
>> On 27/03/2013 10:16 a.m., Alex Rousskov wrote:
>>> On 03/25/2013 09:59 PM, Amos Jeffries wrote:
<snip>
>
>>>> + // clobbered with alternative credentials
>>>> + if (aur != credentialsState) {
>>>> + debugs(33, 2, HERE << "ERROR: Closing " << clientConnection << " due to change of connection-auth from " << by);
>>>> + credentialsState->releaseAuthServer();
>>>> + credentialsState = NULL;
>>>> + unpinConnection();
>>>> + // this is a fatal type of problem. Close the connection immediately
>>>> + clientConnection->close(); // XXX: need to send TCP RST packet if we can.
>>>> + stopReceiving("connection-auth change attempted");
>>>> + return;
>>>> + }
>>> IIRC, we should either stopReceiving() or close the connection. Doing
>>> both is conceptually wrong because stopReceiving() implies that we
>>> should keep sending if needed (which connection closure prohibits) and
>>> because connection closure already implies that we will not receive
>>> anything. Doing both may also lead to confusing double-closures.
>> For this case we need to close the connection ASAP and data loss of
>> everything buffered for this connection is the desired outcome.
> Let's call comm_reset_close() then (directly or indirectly, see below).
>
>
>> The stopReceiving() is to both speed up the signalling that the client
>> connection has closed, and produce a nice error message about why.
>> Are you aware of any better way to exit ConnStateData with an error?
> There is no API to do that and we should not abuse stop*() API. Our
> choices are:
>
> 1) Just use comm_reset_close().
> Fastest closure possible but no nice error message.
>
> 2) Add an optional reason to comm_reset_close().
> Fastest closure possible, with an error message.
> Perhaps make this a Connection method?
>
> 3) Add ConnStateData::close(message, reset) or similar method.
> Fastest closure possible, with a context-aware error message.
>
> Please note that we should not use ConnStateData::mustStop() because
> setAuth() may not be under job call protection. We have to close the
> connection to kill our job, unfortunately.

Yes. I'm minded to do (2) and (3) as a followup patch. For now just
using comm_reset_close().

>
>> NP: Comm::Connection::close() can be called any number of times and
>> protects against repeated comm_close() cycles if that was what you mean?
> Yes, I know, but that protection does not make repeated closures less
> confusing.
>
>
>> I am also having second thoughts about whether the unpin and
>> stopReceiving side effects matter when called from swanSong(). That
>> comment and code was written for the mk1 patch which used deleteThis in
>> the abort case (and crashed sometimes as a result). What do you think
>> about just calling setAuth(NULL) there?
> Well, it is difficult to suggest the best action because the need for
> releaseAuthServer() API itself is a sign of a problem -- the release
> should be done automatically (by default) when UserRequest is destroyed.
> However, I am sure it is difficult to fix that.

The UserRequest details are not solely locked by this reference. The
original bug which prompted this cleanup was about these helper locks
staying around after the connection closed simply due to the credentials
staying around in use by other state. Which results in the helpers all
being falsely "in use"/reserved under heavy traffic. None of the other
code releases the

> If a proper solution is too difficult, how about adding a private
> ConnStateData::endAuth() method that will do this:
>
> + auth_->releaseAuthServer();
> + auth_ = NULL;
> + unpinConnection();
>
> and then call if from setAuth() and from swanSong(), as needed? This way
> any changes to that cleanup code will be in one place.

Nevermind. I've got it. The bad side effect is that in the placement of
clearing auth before closing the connections all the connection closures
triggered get the setAuth removal message reportes. If we ensure
setAuth(NULL) is used after the connections are closed that disappears.
I've adjusted swanSong to cleanup connection auth a lot better now.

> BTW,
>
> 1) Why are we unpinning the connection in setAuth()? Seems like that is
> not directly related to authentication. Can that call wait until we
> close the connection and the regular cleanup takes place?

In majority of cases it was caused by the authentication, but so long as
the unpin is actioned on cleanup I think you are right.
All the re-pin and server connection re-use logics in the presence of
pinning are gone now yes? I left this in the patch from last year
because I was't clear on when unpin occured.

> 2) How come ConnStateData::swanSong() does not call unpinConnection()?
I dont know the answer to that one. I agree it should.

> Are we sometimes leaking pinning.peer and pinning.host because of that?

Quite right about the leak, I have applied the patch changing the
swansong cleanup to simply call unpinConnection().
I suspect it was undetected because the event of unpin not already being
called is rare.

New patch attached with the above small changes to conn-auth included.

Amos

Received on Thu Mar 28 2013 - 06:41:53 MDT

This archive was generated by hypermail 2.2.0 : Thu Mar 28 2013 - 12:00:11 MDT