Re: [PATCH] %>la for intercepted connections

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 31 Aug 2011 15:00:44 +1200

 On Tue, 30 Aug 2011 15:55:56 -0600, Alex Rousskov wrote:
> On 08/28/2011 01:10 PM, Amos Jeffries wrote:
>> On 29/08/11 06:39, Tsantilas Christos wrote:
>>> On 08/27/2011 08:03 PM, Amos Jeffries wrote:
>>>> On 28/08/11 02:50, Tsantilas Christos wrote:
>>>>> %>la for intercepted connections
>>>>>
>>>>> This patch adjusts the %>la logformat code handling for
>>>>> intercepted connections based on the following rules:
>>>>> - If the corresponding http_port or https_port option has an
>>>>> explicit listening host name or IP address, then log the IP
>>>>> address.
>>>>> - Otherwise, log a dash character.
>>>>>
>>>>> Also adjusts %>lp logformat code handling for intercepted
>>>>> connections to always log the port number from the corresponding
>>>>> http_port or https_port option.
>>>>
>>>> +1. Looks fine.
>>>>
>>>> Amos
>>>
>>> I will commit this patch to trunk if there is not any objection.
>>>
>>>
>>> PS. I forgot to mention that this is a Measurement Factory project.
>>
>>
>> This whole thing itches a worry in the back of my mind. Updating the
>> release notes about %>la creation today makes me realize what it is.
>>
>> We are using ">" on tags to indicate incoming things,
>
> I do not think that part is accurate. I will try to provide a better
> definition below.
>
>> usually state
>> shared with the client's view of the world. This change makes the tag
>> lose that overlap with the client's world view on intercepted
>> traffic.
>>
>> What do you think about resurrecting %la / %lp for this data
>> instead?
>
> I think ">" is the right choice here because we are logging the Squid
> address where the client has connected to:
>
> ">" means information related to the client-Squid connection
> "<" means information related to the Squid-server connection
>

 Yes. And the lack of either appears to consistently represent Squid's
 view of something, regardless of whether it was client or server.

 ... Such as the config port a transaction came through, i.e. "%la".

> "l" means information related to the Squid side of a connection

 and _that_ is what this patch breaks. Or rather obfuscates for
 intercepted traffic.

>
> Thus,
>
> ">l" means information related to the Squid side of a client-Squid
> connection, and that is what we want to log.
>

 Which worries me. I agreed to it earlier on the grounds that it was
 Squid's outward view of the connection. But taking a closer look at the
 concepts and documentation versus the patch, the misgivings come back.

 The patch changes the meaning of that definition from "local address"
 to "listening address".
 "local address" ("the Squid side of a client-Squid connection") at the
 connection/TCP/IP level is what al->tcpClient contains right now, before
 patching: the actual client->Squid connection's IP:port.

 Meaning our definition for the "l" is a bit wrong here.

 Consider that there are two FDs involved with each connection, and how
 we handle those.
  FD 1 is listening, it has la of ::, and lp of 3129. no remote.
  FD 2 is a connection received on that. It has local=10.0.0.1:80
 remote=192.168.0.52:123

  FD 3 is listening, it has la of 192.168.0.1, and lp of 3128. no
 remote.
  FD 4 is a connection received on that. It has local=192.168.0.1:3128
 remote=192.168.0.52:456

 Now the details as you describe:

> ">" means information related to the client-Squid connection

 ... AIUI that would be FD 2 and FD 4.

> "l" means information related to the Squid side of a connection

 ... AIUI that would be from FD 4 : 192.168.0.1 (>la) and 3128 (>lp)

 BUT you want FD 3 local and FD 4 remote to log here. Why not also log
 FD 1 local and FD 2 remote on their line? They are the same "the Squid
 side of a client-Squid connection" by that definition.
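
 To make the difference concrete, here is a minimal sketch in Python
 (not Squid code) of the four FDs above and the two readings of ">l".
 The names Socket, local_address_view and listening_address_view are
 hypothetical, invented for this illustration. Treating a wildcard
 listener ("::" or "0.0.0.0") as "no explicit address" is my assumption
 about how the patch rules would apply.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (ip, port)


@dataclass(frozen=True)
class Socket:
    """Hypothetical model of one FD; not a Squid data structure."""
    fd: int
    local: Addr
    remote: Optional[Addr] = None          # None for listening sockets
    listener: Optional["Socket"] = None    # listening FD that accepted this one


def local_address_view(conn: Socket) -> Addr:
    """'local address' reading: Squid side of the accepted connection."""
    return conn.local


def listening_address_view(conn: Socket) -> Addr:
    """'listening address' reading, per the patch description:
    log the listener IP only when it is explicit, otherwise a dash;
    always log the listener port.
    Assumption: a wildcard address counts as 'not explicit'."""
    ip, port = conn.listener.local
    if ip in ("::", "0.0.0.0"):
        ip = "-"
    return (ip, port)


# The four FDs from the example above.
fd1 = Socket(1, ("::", 3129))
fd2 = Socket(2, ("10.0.0.1", 80), ("192.168.0.52", 123), listener=fd1)
fd3 = Socket(3, ("192.168.0.1", 3128))
fd4 = Socket(4, ("192.168.0.1", 3128), ("192.168.0.52", 456), listener=fd3)

for conn in (fd2, fd4):
    print(f"FD {conn.fd}: local view={local_address_view(conn)} "
          f"listening view={listening_address_view(conn)}")
```

 In this sketch the two readings agree for FD 4 but diverge for the
 intercepted FD 2, which is the inconsistency I am worried about.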

 My goal here is consistency and clarity of individual tokens. These are
 about to be used to dynamically generate redirected URLs in deny_info
 and error page texts.

 I suggested %la / %lp since they seem fuzzier about where the details
 come from, without > or < claims. That seems a perfect fit for a local
 Squid view of something equally fuzzy, along the lines of how we use
 %un for "any username we can find" as opposed to the specific sources.
  AND they have the extra benefit of previously being used to log the
 config IP:port by older Squid (under the conditions you want to make >la
 do so). Reviving them with this more consistent definitive content would
 technically just be a policy change on their removal. Keeping the policy
 decision to _move_ origdst over to >la, leaving cases like Linux DNAT
 where both have valid non-identical details.

 The alternative that occurs to me is our recent use of %S_ where "S"
 means Squid. Also a perfect fit by the definitions. But not as easily
 backward compatible.

>
> We could add another logformat code to log the IP address where the
> intercepted client was _trying_ to connect to, but nobody has asked
> to log that information yet, AFAIK.
>

 Right. The legal issues and requirements of legal-intercept are talked
 about but nobody has actually asked outright yet IIRC. What I'm hearing
 more and more about is the need to log the IP+port for use in comparison
 against raw network traffic etc in legal-intercept situations.

 Amos
Received on Wed Aug 31 2011 - 03:00:49 MDT
