Re: [PATCH] %>la for intercepted connections from Alex Rousskov on 2011-08-30 (squid-dev)

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 30 Aug 2011 21:46:05 -0600

On 08/30/2011 09:00 PM, Amos Jeffries wrote:
> On Tue, 30 Aug 2011 15:55:56 -0600, Alex Rousskov wrote:
>> On 08/28/2011 01:10 PM, Amos Jeffries wrote:
>>> On 29/08/11 06:39, Tsantilas Christos wrote:
>>>> On 08/27/2011 08:03 PM, Amos Jeffries wrote:
>>>>> On 28/08/11 02:50, Tsantilas Christos wrote:
>>>>>> %>la for intercepted connections
>>>>>>
>>>>>> This patch adjusts the %>la logformat code handling for intercepted
>>>>>> connections based on the following rules:
>>>>>> - If the corresponding http_port or https_port option has an explicit
>>>>>> listening host name or IP address, then log the IP address.
>>>>>> - Otherwise, log a dash character.
>>>>>>
>>>>>> Also adjusts %>lp logformat code handling for intercepted connections
>>>>>> to always log the port number from the corresponding http_port or
>>>>>> https_port option.
>>>>>
>>>>> +1. Looks fine.
>>>>>
>>>>> Amos
>>>>
>>>> I will commit this patch to trunk if there is not any objection.
>>>>
>>>>
>>>> PS. I forgot to mention that this is a Measurement Factory project.
>>>
>>>
>>> This whole thing itches a worry in the back of my mind. Updating the
>>> release notes about %>la creation today makes me realize what it is.
>>>
>>> We are using ">" on tags to indicate incoming things,
>>
>> I do not think that part is accurate. I will try to provide a better
>> definition below.
>>
>>> usually state
>>> shared with the client's view of the world. This change makes the tag
>>> lose that overlap with the client's world view on intercepted traffic.
>>>
>>> What do you think about resurrecting %la / %lp for this data instead?
>>
>> I think ">" is the right choice here because we are logging the Squid
>> address where the client has connected to:
>>
>> ">" means information related to the client-Squid connection
>> "<" means information related to the Squid-server connection
>>
>
> Yes. And lack of it appears to be consistently representing squid view
> of something regardless of whether it was client or server.
>
> ... Such as the config port a transaction came through. ie "%la"
>
>> "l" means information related to the Squid side of a connection
>
> and _that_ is what this patch breaks. Or rather obfuscates for
> intercepted traffic.
>
>>
>> Thus,
>>
>> ">l" means information related to the Squid side of a client-Squid
>> connection, and that is what we want to log.
>>
>
> Which worries me. I agreed to it earlier on grounds that it was Squid's
> outward view of the connection. But taking a closer look at the concepts
> and documentation vs the patch, the misgivings come back.
>
> The patch changes meaning of that definition from "local address" to
> "listening address".

Yes, for intercepted connections. Listening address is a local address.

> "local address" ("the Squid side of a client-Squid connection") at the
> connection/TCP/IP level is what al->tcpClient contains right now, before
> patching. The actual real client->Squid connection's IP:port.

If we are to go into these low-level details, one could argue that there
is no actual/real client-Squid connection at all because the client does
not think it is talking to Squid.

> Meaning our definition for the "l" is a bit wrong here.
>
> Consider there are two FD involved with each connection and how we
> handle those.
> FD 1 is listening, it has la of ::, and lp of 3129. no remote.
> FD 2 is a connection received on that. It has local=10.0.0.1:80
> remote=192.168.0.52:123
>
> FD 3 is listening, it has la of 192.168.0.1, and lp of 3128. no remote.
> FD 4 is a connection received on that. It has local=192.168.0.1:3128
> remote=192.168.0.52:456
>
> now the details as you describe:
>
>> ">" means information related to the client-Squid connection
>
> ... AIUI that would be FD 2 and FD 4.
>
>> "l" means information related to the Squid side of a connection
>
> ... AIUI that would be from FD 4 : 192.168.0.1 (>la) and 3128 (>lp)
>
> BUT you want FD 3 local and FD 4 remote to log here. Why not also log FD
> 1 local and FD 2 remote on their line? they are the same "the Squid side
> of a client-Squid connection" by that definition.

I do not fully understand your specific examples. I see no relevant
differences between FD1-2 and FD3-4 groups, and I do not understand how
a single connection can have four Squid descriptors associated with it.

> My goal here is consistency and clarity of individual tokens. These are
> about to be used to dynamically generate redirected URLs in deny_info
> and error page texts.
>
>
> I suggested %la / %lp since they seem more fuzzy on where the details
> come from without > or < claims. Seems a perfect fit for local squid
> view of something equally fuzzy. Along the lines of how we use %un for
> "any username we can find" as opposed to the specific sources.
> AND they have the extra benefit of previously being used to log the
> config IP:port by older Squid (under the conditions you want to make >la
> do so). Reviving them with this more consistent definitive content would
> technically just be a policy change on their removal. Keeping the policy
> decision to _move_ origdst over to >la, leaving cases like Linux DNAT
> where both have valid non-identical details.
>
> The alternative that occurs to me is our recent use of %S_ where "S"
> means Squid. Also a perfect fit by the definitions. But not as easily
> backward compatible.

I believe that since the connection is intercepted, it is in a gray area
and many conflicting things will be "kind of true" about it.

If you insist on %la, and Christos is fine with that, let's add %la that
does what Christos implemented for %>la and also log a dash for %>la
when the connection is intercepted.
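
To make that compromise concrete, here is a minimal sketch in Python. It is
for illustration only and is not Squid code: the Port and Conn types and the
function names are hypothetical, and the non-intercepted branch of %lp is an
assumption (the local port of the accepted connection).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Port:
    """Hypothetical stand-in for one http_port or https_port line."""
    port: int
    address: Optional[str] = None   # None: no explicit listening address
    intercepted: bool = False


@dataclass
class Conn:
    """Hypothetical stand-in for an accepted client-Squid connection."""
    local_address: str              # local IP of the accepted socket
    local_port: int                 # local port of the accepted socket
    port: Port                      # the port option that accepted it


def log_la(conn: Conn) -> str:
    """Proposed %la: what the patch currently implements for %>la."""
    if not conn.port.intercepted:
        return conn.local_address
    # Intercepted: log the explicit listening address, else a dash.
    return conn.port.address if conn.port.address else "-"


def log_lp(conn: Conn) -> str:
    """Proposed %lp: always the configured port when intercepted."""
    if conn.port.intercepted:
        return str(conn.port.port)
    return str(conn.local_port)     # assumption for the regular case


def log_client_la(conn: Conn) -> str:
    """Proposed %>la: a dash whenever the connection is intercepted."""
    return "-" if conn.port.intercepted else conn.local_address


# Values borrowed from the FD examples quoted above.
intercepted = Conn("10.0.0.1", 80, Port(3129, None, True))
regular = Conn("192.168.0.1", 3128, Port(3128, "192.168.0.1", False))
```

With these values, the intercepted connection would log a dash for both %la
and %>la and 3129 for %lp, while the regular connection would log
192.168.0.1 and 3128.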

While the above adds more work, what is critical for me, based on user
requests, is that a single logformat option records the actual Squid
address for non-intercepted connections and the specified Squid http_port
address for intercepted connections.

My understanding is that such functionality is needed in environments
where Squid handles regular and intercepted requests on multiple
http_ports and where billing and similar needs require the knowledge of
the port handling each transaction.

Thank you,

Alex.
Received on Wed Aug 31 2011 - 03:46:23 MDT