Re: [PATCH] Master transaction state object

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 17 Jun 2013 09:33:40 -0600

On 06/09/2013 06:50 AM, Amos Jeffries wrote:
> New patch attached for review.
>
> This excludes the XXX/TODO and pipeline parts, most of which are already
> merged and the remainder being deferred until a later update.

Thank you for addressing my concerns related to pipelining changes.

Please note that the mk2 patch is missing the new MasterXaction class
being added. The notes below are for the changes actually present in the
patch.

> } flags;
> +
> struct {
> Comm::ConnectionPointer serverConnection; /* pinned server side connection */

Please remove a whitespace change.

> + // Squid listening port details where this connection arrived.
> AnyP::PortCfg *port;
>

Please use doxygen /// comments to describe class data members.

> httpsAccept(const CommAcceptCbParams &params)
> {
> - AnyP::PortCfg *s = static_cast<AnyP::PortCfg *>(params.data);
> + MasterXaction::Pointer xact = params.xaction;
> + const AnyP::PortCfgPointer s = xact->squidPort;
> + assert(s.valid());

Do we have to kill worker when the port becomes invalid while the accept
call is being propagated? If yes, please do that in httpAccept() as
well. If not, please handle the case of an
invalidated-while-we-were-waiting port more gracefully.

BTW, is is probably a bad idea to use cbdata-protected PortCfg. Ideally,
when reconfiguration changes the port, it should not be invalidated,
causing all sorts of assertions for pending transactions using that
port. We should use reference counting instead (with some additional API
to mark no-longer-configured ports), but that change is outside your
project scope, of course.

>>>> === modified file 'src/CommCalls.h'
>>> ...
>>>> + /// Transaction which this call is part of.
>>>> + MasterXaction::Pointer xaction;
>>>
>>> It feels wrong to place that object inside every Comm call. A Comm call
>>> is a very low-level event. What would it do with a Master Transaction?
>>> It would not even know which part of the Master Transaction corresponds
>>> to the call! In fact, there are Comm calls that are not associated with
>>> a master transaction. Let's not try to cram master transactions
>>> everywhere just because it is needed somewhere. It will force us to
>>> either create fake master transactions or deal with possibly nil master
>>> transaction pointers.
>>
>> Right now the TcpAcceptor AsyncJob is using it in AcceptCbParams to
>> pass the new transaction out to its child.

Storing a master transaction pointer may be appropriate for
AcceptCbParams because you want all accepts, even UDP and SMTP ones, to
be associated with some master transaction (an important and not-obvious
design decision on its own!).

>> I am envisioning that the CloseCbParams will need to use it to pass it
>> to the Comm layer close handlers,

Why? I would imagine that any job handling a connection closure should
already have a pointer to the master transaction because that job had to
deal with the corresponding _open_ connection (associated with that
master transaction) at some point.

>> and that the IoCbParms will need to use it to pass to low level I/O handlers

What would the low-level I/O handlers do with a master transaction
pointer? They would not know which part of the master transaction they
are doing I/O for and, hence, would not be able to update any
connection-specific counters.

> - for each of those
>> components to write the stats accounting details into the xaction
>> data. For now it is a NULL pointer on these parameters.

The only stats I can think of they would be able to update are "I/Os per
master transaction" stats, which we do not collect now, and probably for
a good reason (they would be mostly meaningless because I/Os differ too
much depending on which connection socket or disk file descriptor they
are being performed).

Until there is code that actually collects these questionable stats,
let's keep this new master tranasction pointer confined to the accept
callbacks (if you insist on that). We can always move the master
transaction pointer member to the parent callback data class if we
decide it is needed there for one reason or another.

>>> We will need to adjust the description of the MasterXaction class
>>> accordingly. For now, I suggest adding three important pieces of
>>> information:
>>>
>>> * What is a "client transaction"? Should we define that as anything
>>> worth logging into access.log? Or does that extend to ICP clients? SNMP?
>>>
>>> * What event starts/creates a master transaction?
>>
>> I am thinking these conditions create a new transaction:
>> 1. New TCP connection, New UDP connection, New SCTP connection, etc. etc.
>> 2. Parsing a new request on an existing persistent connection of any
>> above type.
>> 3. starting a Squid internally generated request
>>
>> We get into some tricky semantics with ssl-bump and ESI that still
>> need to be worked out.

I could not review the corresponding changes since the MasterXact class
was missing from the mk2 patch. They are very important though IMO.

>>>> ConnStateData();
>>>> + ConnStateData(const MasterXaction::Pointer &xact);
>>> Should the old constructor be removed now?

>> The old constructor is there as a "// do not implement."

Why? It is better to remove the default constructor completely if it is
not needed any more so that we get compile-time errors instead of
linker-time errors.

>> If the state shared between all Jobs is going to be in here logically
>> we will end up placing the transaction in/out buffers in here as well.

A master transaction does not have an in buffer and an out buffer.
Individual protocol/side transactions associated with the master
transaction may have such buffers, but they should not share or expose
their transaction-specific knowledge at the master transaction level.

>> In which case the Comm systems will all be using it for source/sink.
>> And by intention the read/write at least will be adding their stats here.

I do not think master transaction abstraction works that way: There is
no [single] source/sink for a master transaction.

>> It also demonstrates to you and others the mechanism for passing
>> between Jobs.

True, but if feels like the wrong mechanism to me. While it may sound
convenient to make a MasterXaction object a universally-accessible
global (essentially), I think it will create more problems than it will
solve because not all async calls and not all async jobs are related to
a master transaction. This mechanism does not reflect reality well and,
hence, is likely to fail long-term. Let's not introduce it until we
really need it.

>>> Please do not commit all those XXXs/TODOs though. After the
>>> MasterXact skeleton is finished and committed (this thread), just post
>>> the new MasterXact-using code as you have time to work on it.
>>
>> Do you not think it better to mark the missing API alterations for
>> others to be aware of and maybe chose to join in working on?

IIRC, some of those added XXXs and TODOs were incorrect IMO, but I would
prefer not to delay this change by discussing them here and now. Thank
you for removing them.

Sorry for not being able to review this earlier,

Alex.
Received on Mon Jun 17 2013 - 15:34:01 MDT

This archive was generated by hypermail 2.2.0 : Mon Jun 17 2013 - 12:00:13 MDT