[PATCH] Limit log field width

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 20 Jan 2010 10:34:55 -0700

Hello,

    The attached patch was motivated by user complaints that “vi”,
“cat”, and even some more sophisticated log analysis tools have trouble
handling long access.log lines. Those long log lines result from long
URLs that do occur in the wild.

Squid places an 8192-character limit on the URL length, but that limit
exceeds (a) the limits of some tools and (b) Squid's own access.log
line buffer limit (once other fields are logged as well).

My solution was to honor the .precision setting in logformat field
specifications. You can use it with %ru or any other text field.

For example, the format code below limits logged URI size to the first
1000 characters.

  logformat xsquid ... %rm %.1000ru %un ...
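
For readers less familiar with printf(3) precision, here is a minimal
sketch of the truncation semantics I have in mind. It is not the code
from the patch: limitField() is a hypothetical helper, and using a
negative precision to mean "not configured" is an assumption made for
this illustration only.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not Squid's actual code: apply a printf(3)-style
// precision to a text field. A negative precision stands for "not
// configured", which corresponds to the no-limit default.
std::string limitField(const std::string &value, int precision)
{
    if (precision < 0)
        return value; // no .precision given: log the field as is

    // substr() clamps the count to the string length, so short values
    // pass through unchanged and long ones keep only their leading part.
    const std::string limited =
        value.substr(0, static_cast<std::string::size_type>(precision));
    assert(limited.size() <= value.size());
    return limited;
}
```

With this helper, limitField(uri, 1000) corresponds to %.1000ru, while
limitField(uri, -1) returns the URI untouched.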

Squid's access log line buffer cannot exceed 8192 characters. If you
want to preserve fields logged after the URL, your logged URL width
limit should be smaller than 8192 to leave space for the other fields
on the log line.
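
The arithmetic behind that advice can be sketched as follows.
maxUrlPrecision() and LogLineLimit are hypothetical names, and the
size of the other fields is the administrator's own estimate. The
sketch does not account for any line terminator, so it is probably
wise to leave some extra slack.

```cpp
#include <algorithm>
#include <cassert>

// 8192 is the access.log line buffer limit mentioned above.
const int LogLineLimit = 8192;

// Hypothetical helper: the largest URL precision that still leaves
// room for an estimated number of characters taken by all the other
// fields on the line, separators included.
int maxUrlPrecision(int otherFieldsEstimate)
{
    assert(otherFieldsEstimate >= 0);
    // Never return a negative width, even for an oversized estimate.
    return std::max(0, LogLineLimit - otherFieldsEstimate);
}
```

For instance, reserving 1000 characters for the other fields gives
7192, which would be configured as %.7192ru.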

There is no width limit by default.

Here is a possible commit message:

---------------------------------
Support maximum field width for string access.log fields.

Some standard command-line and some log processing tools have trouble
handling URLs or other logged fields exceeding 8KB in length. Moreover,
Squid violates its own log line format and truncates the entire log line
if, for example, the URL is 8KB long. By supporting the .precision
format argument, we allow the administrator to limit the logged URL
size and avoid these problems.

Limiting logged field width has no effect on traffic on the wire.

TODO: The name comes from the printf(3) "precision" format part. It may
be a good idea to rename our "precision" into max_width or similar,
especially if we do not support floating point precision logging.

TODO: Old code used chars to store user-configured field width and
precision. That does not work for URLs, headers, and other entries
longer than 255 characters. This patch changes the storage type to int.
The code should probably be polished further to remove unsigned->signed
conversions.
---------------------
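
To illustrate the storage problem from the second TODO, here is a
small sketch of what happens to a configured value when it is kept in
a single byte. It assumes an unsigned char and 8-bit bytes purely for
illustration; storedInChar() is a hypothetical name and not part of
the patch.

```cpp
#include <cassert>

// Illustration only: what a configured width or precision turns into
// when squeezed into a single byte. Using unsigned char here is an
// assumption; the conversion keeps only the low 8 bits of the value.
int storedInChar(int configured)
{
    assert(configured >= 0);
    return static_cast<unsigned char>(configured);
}
```

Under those assumptions, a configured precision of 1000 would be
stored as 232, and 256 would be stored as 0, while values up to 255
survive intact. Storing the value in an int avoids the wraparound.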

Please review.

Thank you,

Alex.

Received on Wed Jan 20 2010 - 17:34:50 MST