Re: [RFC] connections logging

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 24 Jun 2014 11:09:16 -0600

On 06/24/2014 12:33 AM, Amos Jeffries wrote:
> On 24/06/2014 12:26 p.m., Alex Rousskov wrote:
>> On 06/20/2014 03:04 AM, Amos Jeffries wrote:
>>> I am playing with the idea of adding a new log file to record just the
>>> connections handled by Squid (rather than the HTTP request/reply
>>> transactions).
>>>
>>> This log would include connections opened but never used, port details,
>>> connection lifetimes, re-use counts on persistent connections, SSL/TLS
>>> presence/version (or not), and other connection details we find a need
>>> for later.
>>
>>
>>> The driver behind this is to help resolve a growing amount of user
>>> queries regarding "happy eyeballs" idle connections and broken TLS
>>> connections. We are also adding other potentially never-used connections
>>> ourselves with the standby pools.
>>
>> A couple of days ago, I came across another use case for logging
>> connections unrelated to Happy Eyeballs. It sounds like this is going to
>> be a generally useful feature. My use case is still being clarified, but
>> here are the already known non-obvious requirements:
>>
>> 1. Logging when the connection is received (unlike transactions that are
>> logged when the transaction is completed).
>>
>> 2. Logging client-to-Squid connections.
>>
>> 3. Introduction of some kind of unique connection ID that can be
>> associated with HTTP/etc transactions on that connection, correlating
>> access.log and connection.log entries. [ uniqueness scope is being
>> clarified, but I hope it would just be a single worker lifetime ]
>>
>
> For IDs I was thinking of just emitting the MasterXaction::id value sent
> in by TcpAcceptor. Assumign this is defined as the TCP transaction (see
> below).

Yes, MasterXaction ID can be copied to be used as a connection ID, but
there is no good reason for such a reuse IMO. We need a Connection ID.
IMO, it would be better to just add a Connection-dedicated InstanceId
(especially if it satisfies the yet-unknown uniqueness scope requirements).

Yes, an InstanceId-based ID would work for the "worker lifetime"
uniqueness scope, and I hope that is all we need to support.

>> I am not sure what you mean by "if the connection was a transaction".
>
> I mean defining the TCP actions of opening a socket/connnection, its
> entire lifetime, and close() as being one transaction. With the
> TcpAcceptor produced MasterXaction representing that cycle.

I strongly disagree that we should change the MasterXaction definition
like that. We already have a Connection class to represent what you
want, I think. Why drag Master Transaction into this, abandoning its
current role of aggregating information about transactions caused by a
single HTTP request (or equivalent)?

HTTP transactions need a Master Transaction to correlate an HTTP request
received from a client with the corresponding HTTP request(s) sent to
the server, with the corresponding HTTP response(s) received from the
server, with the corresponding ICAP messages, DNS activity, etc. A
Master Transaction usually relates to several Connections as well, but
Connection is not a Master Transaction.

> So we have two new transaction types there. TCP transaction and HTTP
> transaction, to go alongside ICAP and eCAP transaction etc.

It is a bad idea to use the same terminology and class hierarchy for
low-level transport concepts such as TCP and for high level HTTP
transactions executed on top of a transport connection.

We should just use Connection instead of "TCP transaction". If we need
TcpConnection, we can add that. The Connection hierarchy is separate
from the Xaction hierarchy, but Xactions may use Connections, of course.

>> The MasterXaction is usually associated with at least one connection
>> (via the small "t" transaction using that connection), and one
>> connection is usually associated with at least one MasterTransaction,
>> but transactions and connections are completely different concepts at
>> Squid level IMO. I doubt we should try to merge the concept of a
>> transaction and a connection somehow.

> Generating the HTTP transaction MasterXaction object (child of
> MasterXaction?) by cloning it from the TCP connection one represents
> that relationship. The clone could be seeded with the parents id value
> to persist that info for later use (eg your start of HTTP transaction
> log entry).

I am afraid we have a completely different view of what MasterXaction is
and what protocol transactions are (or should be). It is possible that I
am completely misinterpreting your statement about cloning and seeding,
but if you are calling for creation of some abstract MasterXaction class
that will be a parent for various specific FooMasterXaction classes such
as HttpMasterXaction and TcpMasterXaction, then I see no need for (and a
lot of harm from) that level of complexity.

I do not know how to avoid another prolonged discussion here, but
perhaps we can start with definitions of the proposed classes (instead
of discussing cloning and seeding which are just low-level details)? The
classes I am talking are, roughly:

* MasterXaction: [Historical/aggregated information about] a
client-Squid HTTP transaction and related/caused transactions such as
Squid-server HTTP transactions, ICAP transactions, etc.

* HttpXaction: An HTTP request and its final response pair, including
any 1xx control messages between them.

* Connection: A transport connection with another agent. For example, a
TCP connection to Squid's HTTP port.

>> MasterTransaction is created at connection acceptance time because that
>> is when the first HTTP/etc. transaction starts (parsing headers is not
>> the first step) and, hence, that is when the master transaction starts.

> Hmm. We have some issue here satisfying the people who want HTTP
> transation only to represent the time of request+reply versus browser
> happy eyeballs (and now Squid) opening standby connections with long
> durations before first byte. That is one reason supporting separate
> transaction MasterXaction for TCP and HTTP.

I do not see any in-scope design-level issues here. Yes, there are many
timestamps associated with a single Master Transaction or even with a
single HTTP transaction. Different folks will look at different
timestamps stored in a [Master] Transaction or in transactions the
master points to. That multitude of timestamps and measurements is
inevitable and does not represent a design problem in this discussion
scope IMO.

>> We could make the connection acceptance event itself a transaction (an
>> object of some new AcceptXaction class inherited from a generic
>> Transaction class and associated with the corresponding Master
>> Transaction object just like HTTP and ICAP transactions should be). I do
>> not yet see what that would actually buy us except more classes to worry
>> about, but, at the very least, it would not be conceptually wrong.

> I agree accept by itself is probably useless. The longer TCP lifecycle
> on the other hand (see above) has some use for stats and reporting.

And we already have a class where such stats and reporting can be stored
and managed. It is called Connection. What is wrong with using
Connection for connection-related stats and reporting?

Thank you,

Alex.
Received on Tue Jun 24 2014 - 17:09:39 MDT

This archive was generated by hypermail 2.2.0 : Tue Jun 24 2014 - 12:00:17 MDT