Re: [RFC] new Squid status codes

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 11 Mar 2014 10:19:42 +1300

Addressing (hopefully) Alex's concerns inline...

On 2014-03-11 02:22, Kinkie wrote:
> Hi all,
> some of these values are really orthogonal in scope, so I'd love to
> see them logged independently. Doing so would however change the
> standard logging format, so this proposal is the second best choice
> and I support it.
>
> On Sun, Mar 9, 2014 at 6:18 AM, Alex Rousskov wrote:
>> On 03/07/2014 06:26 PM, Amos Jeffries wrote:
>>
>>> TCP_TUNNEL
>>>
>>> - initially for CONNECT requests which Squid serves Direct. Also to be
>>> used in future if Squid accepts an Upgrade request for protocols like
>>> WebSockets
>>>

Okay. Is this better?
  - traffic for which Squid is not able to act as a proxy. The
transaction payload/body was tunneled as raw binary to the server.

NP: While I use CONNECT as the common example, it is not necessarily the
only request we will have to do this for (e.g. my mark-1 HTTP/2 patch).

>>>
>>> TCP_RELAY
>>>
>>> - for requests which Squid serves without even considering the stored
>>> content. ie CONNECT relayed to cache_peer,
>>
>>
>> The above two definitions overlap. Please adjust to make them disjoint
>> if you think both are needed, keeping in mind that:
>>
>> * Whether the request went direct or to a peer is already covered by the
>> "hierarchy code" field and should not be repeated in the "Squid result
>> code" (a.k.a. "Squid request status" and "Squid status code", depending
>> on where one looks) that you propose to expand.
>>
>> * Request method such as CONNECT is also logged separately.
>>

CONNECT could be either tunneled or relayed.

In the case of RELAY:
  - the traffic with server/peer is in a protocol which Squid supports
(HTTP/FTP/Gopher/ICY).
  - the inability to cache is defined by the specification for the
method.
  - note RELAY is not limited to CONNECT, it could be from TRACE/PUT/POST
or any one of the extension methods.

In the case of TUNNEL:
  - the traffic with server/peer is in an unknown protocol, taken verbatim
from inside a CONNECT (or equivalent) message rather than gatewayed like
FTP/Gopher/etc.

So RELAY would be seen if the peer is another proxy and the CONNECT is
passed on as-is; TUNNEL if the peer is an origin and the CONNECT is
delivered there un-wrapped.
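
To make the split concrete, here is a rough Python sketch of the rule as
described above. It is an illustration only, not Squid code: the function
name, its parameters, and the fall-through to plain TCP_MISS are assumptions
made for the example.

```python
# Illustrative only: not Squid code. All names here are hypothetical.
SUPPORTED_PROTOCOLS = {"HTTP", "FTP", "Gopher", "ICY"}


def miss_subtype(server_protocol, cacheable_by_spec):
    """Pick a result tag for a transaction served from the network.

    server_protocol: protocol spoken with the server/peer, or None when the
        payload is opaque bytes taken verbatim from a CONNECT-like message.
    cacheable_by_spec: False when the method's specification rules out
        caching (CONNECT, TRACE, PUT, POST, ...).
    """
    if server_protocol not in SUPPORTED_PROTOCOLS:
        # Squid cannot act as a proxy here; bytes are passed through raw.
        return "TCP_TUNNEL"
    if not cacheable_by_spec:
        # Known protocol, but the stored content is never even considered.
        return "TCP_RELAY"
    # Assumption: everything else stays an ordinary miss.
    return "TCP_MISS"


# CONNECT passed on as-is to a peer proxy over HTTP.
print(miss_subtype("HTTP", False))
# CONNECT un-wrapped and delivered to an origin in an unknown protocol.
print(miss_subtype(None, False))
```

Note the hierarchy code and request method are not inputs here, since both
are already logged in their own fields.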

>> * The _MISS suffix is appropriate in both cases (using the current
>> definition of _MISS at
>> http://wiki.squid-cache.org/SquidFaq/SquidLogs):
>> "The response object delivered was the network response object".
>>

Yes, these are sub-types of MISS. I was thinking RELAY would be used mostly
when the RFC means cache storage is neither relevant nor checked. We can
drop the definition line below...

>>
>>> requests forced to be MISS by "cache deny" or size limits, etc
>>
>>> The key property being that these can never be a HIT so don't worry
>>> about it when trying to reduce MISS.
>>
>> The two examples listed above seem to contradict the "key property"
>> intent stated right after them: If a MISS happens because of "cache deny
>> or size limits" one may actually want to "worry" about that transaction
>> when "trying to reduce MISS". For example, if I accidentally
>> misconfigure my Squid to "cache deny all", then I do want to analyze
>> transactions that missed because of that mistake. I do not want to
>> ignore those transactions, or I will never find the configuration bug.

If you are looking for that kind of misconfiguration then a TCP_FORCED_*
would be useful. Would you like that added?

I'm tempted to propose that and FORCED_HIT, but research into
refresh_pattern, where most of those would be relevant, shows that
implementing it would be difficult at present.

>>
>>> TCP_SHARED_*
>>>
>>> - to indicate collapsed forwarding on the request. Similar in principle
>>> to the TCP_CLIENT_* and TCP_REFRESH_* labels indicating client forced
>>> something to happen or revalidation took place.
>>> Mainly so people can measure the difference between regular HIT/MISS
>>> and SHARED_HIT/MISS to determine if collapsed forwarding is worth it.
>>
>> I agree that _SHARED_ (and adding sharing stats to the Cache Manager)
>> can be useful.
>>
>> Please adjust the definition to clarify that _SHARED_ is going to be
>> used for all transactions that _started_ collapsed and not just those
>> that ended collapsed. For example, a request that was initially
>> collapsed but for which things did not work out (e.g., the master
>> transaction on which we collapsed got an uncachable response, and Squid
>> had to send that request to the origin server in isolation after all)
>> would most likely be logged as TCP_SHARED_MISS.
>>
>> This SHARED tag will _not_ be used for the "initial" transaction that
>> others collapsed on, right? Again, it would be nice if the definition of
>> the _SHARED_ tag made that clear[er].

Yes, that was the idea exactly.

TCP_SHARED_HIT being when request #2 is collapsed into request #1.
TCP_SHARED_MISS being when request #3 is held for collapsing, but is
forced into forwarding with extra lag.

Similar for collapsed revalidations and TCP_SHARED_REFRESH_*.
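
As a rough sketch of that tagging rule (Python, illustration only, not Squid
code; the function and parameter names are made up, and it assumes request
#1 itself ends up as a plain miss):

```python
# Illustrative only: not Squid code. Names are hypothetical.
def result_tag(started_collapsed, outcome):
    """Build a result code from a plain outcome such as "HIT" or "MISS".

    started_collapsed: True for a request that was held to collapse onto an
        earlier one, even if it later had to be forwarded on its own. False
        for the initial request that the others collapsed on.
    """
    prefix = "TCP_SHARED_" if started_collapsed else "TCP_"
    return prefix + outcome


print(result_tag(False, "MISS"))  # request #1, the one others collapse on
print(result_tag(True, "HIT"))    # request #2, collapsed into #1
print(result_tag(True, "MISS"))   # request #3, held, then forwarded alone
print(result_tag(True, "REFRESH_UNMODIFIED"))  # collapsed revalidation
```

The point being that the SHARED marker records how the transaction started,
not how it finished.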

Amos
Received on Mon Mar 10 2014 - 21:19:48 MDT