Re: [squid-users] logging question from Henrik Nordstrom on 2002-09-04 (squid-users)

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Thu, 5 Sep 2002 01:19:45 +0200

On Wednesday 04 September 2002 22.26, maxwell wrote:

> Regardless of the merits of personal point of view, I am in a
> position that requires me to log this information in order to save
> these employees' jobs. What if they are banking? Then they will be
> fired. What if they are entering private information in a loan or
> credit application? They will be fired.

The question raised was not whether they will get fired or not; it was
the responsibilities of your company towards the person in case you,
by logging this information, accidentally (or intentionally) leak it
to a third person.

> While I thank you for the repeated explanation of your personal
> beliefs and preferences, my task is unchanged. If I cannot log the
> information and use it to vindicate these people, they will be
> fired. The math is quite simple. No logging == fired employees.
> Not in the future, not in the past, but now. These are real
> people, whose jobs are at stake and who will face unemployment
> _without_ benefit of unemployment insurance or continued benefits,
> within about ten working days. The paperwork has already been
> started, and I will have a single opportunity to stop it during a
> review.

Extending Squid to do the required logging should not be too hard, I
think. Any data submitted via GET can be logged in access.log by
disabling strip_query_terms (default on). Any data submitted via
POST/PUT passes via the pumpServerCopy function in pump.c (Squid-2.4)
or the clientReadBody function in client_side.c (Squid-2.5).
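
As an illustration of the access.log route, here is a minimal Python
sketch that pulls the client address and the query terms out of a log
line. It assumes strip_query_terms is off and that access.log is in
Squid's native format (client address in the third field, request
method in the sixth, URL in the seventh). The function name
query_terms is made up for this example.

```python
from urllib.parse import urlsplit, parse_qsl


def query_terms(log_line):
    """Return (client_ip, method, [(name, value), ...]) for one line of a
    native-format access.log, or None if the line has too few fields.

    Assumed field order: time elapsed client code/status bytes method URL ...
    """
    fields = log_line.split()
    if len(fields) < 7:
        return None
    client_ip, method, url = fields[2], fields[5], fields[6]
    # parse_qsl decodes %xx escapes and '+'; keep empty values as well
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return client_ip, method, pairs
```

This only covers data carried in the URL. Request bodies sent with
POST/PUT never reach access.log, so they need one of the other
approaches.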

You could also use a simpler "logging TCP plug" in front of Squid.
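
One possible shape for such a plug, sketched in Python rather than
taken from any existing tool: it accepts client connections, relays
them to Squid, and writes everything the client sends to a log file.
All names here (pump, handle, serve) and the addresses in the usage
note are assumptions made for the illustration.

```python
import socket
import threading

log_lock = threading.Lock()  # keeps log records from different clients apart


def pump(src, dst, log, label):
    """Copy bytes from src to dst until EOF. If log is not None, also
    write each chunk to it, prefixed with label (the client address)."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            if log is not None:
                with log_lock:
                    log.write(b"%s %d bytes\n" % (label, len(data)))
                    log.write(data + b"\n")
                    log.flush()
            dst.sendall(data)
    except OSError:
        pass  # a reset connection simply ends the relay
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the EOF on to the peer
        except OSError:
            pass


def handle(client, client_addr, upstream_addr, log):
    """Relay one client connection to the upstream proxy, logging only
    the client-to-proxy direction."""
    with client, socket.create_connection(upstream_addr) as upstream:
        back = threading.Thread(target=pump, args=(upstream, client, None, b""))
        back.start()
        pump(client, upstream, log, client_addr[0].encode())
        back.join()


def serve(listen_addr, upstream_addr, log_path):
    """Accept connections on listen_addr forever, one thread per client."""
    with open(log_path, "ab") as log, socket.create_server(listen_addr) as srv:
        while True:
            client, addr = srv.accept()
            threading.Thread(target=handle, daemon=True,
                             args=(client, addr, upstream_addr, log)).start()
```

For example, with Squid moved to a hypothetical port 3129 on the same
host, serve(("0.0.0.0", 3128), ("127.0.0.1", 3129), "plug.log") would
let clients keep using port 3128 unchanged. Note that Squid then sees
every request as coming from the plug's address, so the plug's own log
becomes the only record of which client sent what.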

Or use a traffic dump (tcpdump -i eth0 -w traffic.dump -s 1600 host
your.squid.proxy and port 3128) and then inspect it with ngrep or
other similar tools to get a picture of what data people are
submitting.

Making the traffic dumps gets you going now with no delay, but
processing the results is a somewhat daunting task. However, if you
first split on a per-IP basis then it should not be too hard (assuming
your users tend to use one IP per user with a somewhat static IP
assignment, not randomly changing every 5 minutes). It only requires a
small amount of scripting to automate the extraction of the traffic
you are really interested in.
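
The per-IP split could be scripted roughly as follows. This is a
sketch under several assumptions: the dump is a classic pcap file (not
pcapng) captured on Ethernet, the traffic is IPv4, and the proxy
listens on port 3128 as in the tcpdump filter above. The function name
client_payloads is invented for the example. It does no TCP
reassembly; it just concatenates payloads in capture order, which is
usually good enough for a first look.

```python
import struct
from collections import defaultdict


def client_payloads(pcap_bytes, proxy_port=3128):
    """Group TCP payload bytes sent to proxy_port by source IPv4 address."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # file written on a big-endian machine
    else:
        raise ValueError("not a classic pcap file")
    if struct.unpack(endian + "I", pcap_bytes[20:24])[0] != 1:
        raise ValueError("only Ethernet captures are handled")

    per_ip = defaultdict(bytearray)
    offset = 24                       # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured length, wire length
        _, _, incl_len, _ = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        # 14-byte Ethernet header; EtherType 0x0800 means IPv4
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ip = frame[14:]
        ihl = (ip[0] & 0x0F) * 4      # IP header length in bytes
        if ip[0] >> 4 != 4 or ihl < 20 or ip[9] != 6:
            continue                  # not IPv4, or not TCP
        total_len = struct.unpack("!H", ip[2:4])[0]
        tcp = ip[ihl:total_len]       # also trims Ethernet padding
        if len(tcp) < 20:
            continue
        dst_port = struct.unpack("!H", tcp[2:4])[0]
        payload = tcp[(tcp[12] >> 4) * 4:]
        if dst_port == proxy_port and payload:
            src = ".".join(str(octet) for octet in ip[12:16])
            per_ip[src] += payload
    return {ip_addr: bytes(data) for ip_addr, data in per_ip.items()}
```

Reading traffic.dump in binary mode and passing its contents to this
function gives one byte string per client address, which can then be
written to separate files or searched for the submissions of interest.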

Regards
Henrik
Received on Wed Sep 04 2002 - 17:32:58 MDT