Re: [PATCH] log received body size

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 03 Aug 2010 21:27:57 +1200

Tsantilas Christos wrote:
> Added %http::<bs and %icap::<bs logformat codes to HTTP and ICAP body
> sizes received from the next HTTP hop or the ICAP server.
>
> Logging "received message body" is useful because Squid may receive a
> lot more or a lot less than it serves to the client or than the original
> resource size, which may happen when handling Range requests and partial
> responses, when adapting bodies, and for other reasons.
>
> For HTTP, we define "received message body" as message body bytes that
> Squid stores, merges, adapts, and/or forwards. In most cases, they are
> the same as body bytes sent by the server to Squid. However, the two
> bodies may differ for reasons such as errors (where the start of the
> body was not found), HTTP transfer encodings (where Squid strips chunked
> encoding to find the message body), and generated FTP directory listings
> (that were received in a completely different format on a control
> connection).
>
> For ICAP, the "received message body" is the Encapsulated sections,
> after the encapsulated HTTP body, if any, is dechunked.
>
> This is a Measurement Factory project.
>
> Regards,
> Christos
>

* It took me a long while to see that the -1 values being used as
defaults were intended to be converted to dash in the long. Please
update all the relevant places to document that fact.

src/Server.cc:

  ServerStateData::addVirginReplyBody(const char *data, ssize_t len)
  {
+ adjustBodyBytesRead(+len);

whats with the +? we don't overload the unary operator anywhere and the
libc default is no-op AIUI. If len is required to be positive here thats
grounds for an abs(), Must() or assert() to prevent data corruption and
highlight the bugs.

src/adaptation/icap/ModXact.cc:

src/cf.data.pre:

The use of [http::] is a bit congruous here. Since the body data is not
necessarily HTTP protocol from the receiver.
   Given that there is no other obvious scope, can you change the docs
to say "HTTP-equivalent body data" from the upstream source? That should
allow then the comment about FTP listings.
  As a corollary what about gopher/whois listings, and other internal
loaded/generated objects such as errors and icons?

* I've been a bit uncomfortable since the beginning about the
de-chunking happening before the body bytes are accounted. This
introduces some loss into the protocol accounting apparently being done.
   Are those lots bytes accounted for as headers? or is the total reply
bytes value going to be larger than the sum of headers+body ?
  Do we stop at de-chunking? what about un-zipping? etc...

It's your implementation but I'm of the opinion this should log the
on-wire data size of the segment used for body. Including the chunking
bytes, and excluding the Squid-added markup bytes on FTP/Gopher listings.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.5
Received on Tue Aug 03 2010 - 09:28:09 MDT

This archive was generated by hypermail 2.2.0 : Wed Aug 04 2010 - 12:00:03 MDT