[squid-users] Log Daemon Program, input information

From: dweimer <dweimer_at_dweimer.net>
Date: Tue, 11 Mar 2014 15:37:44 -0500

I have written a log daemon application in Python that writes data into
PostgreSQL; however, it periodically fails with:

Invalid byte sequence for encoding "UTF8": 0xe2 0x3f 0x27

Obviously it's receiving some data that it can't encode as UTF-8 and
write to the database, but I can't figure out a way to capture the
incoming data in order to see what it's receiving that it can't
encode.
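
The byte sequence in the error message can be examined on its own. What
follows is only a minimal sketch, assuming Python 3; the function name is
made up for illustration. It reports where strict UTF-8 decoding of a
byte string fails. In UTF-8, 0xe2 opens a three-byte sequence and must be
followed by two continuation bytes in the range 0x80-0xbf; 0x3f is an
ASCII "?", which may mean that something upstream had already replaced
part of a multi-byte character before the data reached the daemon.

```python
def explain_utf8_failure(raw: bytes) -> str:
    """Describe the first position at which raw is not valid UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start and exc.end delimit the bytes the decoder rejected.
        bad = raw[exc.start:exc.end]
        return "offset %d: bytes %s (%s)" % (exc.start, bad.hex(), exc.reason)
    return "valid UTF-8"


# The three bytes quoted in the PostgreSQL error message.
print(explain_utf8_failure(b"\xe2\x3f\x27"))
```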

Every attempt I have made to use Python's built-in try/except mechanism
to catch the error just stops logging entirely when triggered, instead
of performing the except section of code I wrote to output the data to a
text file.
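
One possible way to capture the offending input is to validate each line
before it ever reaches the database. The sketch below assumes Python 3
and that the daemon reads its input as bytes (for example from
sys.stdin.buffer); decode_or_capture, run, handle and the reject file
path are hypothetical names, not part of Squid or of any database driver.
The message 'invalid byte sequence for encoding "UTF8"' comes from the
PostgreSQL server, so the exception raised is likely the driver's own
database error rather than a UnicodeError, and with some drivers a failed
statement leaves the transaction aborted until it is rolled back. Either
could explain why an except clause appears not to run; both are guesses
to verify against the actual code.

```python
def decode_or_capture(raw: bytes, reject_log) -> str:
    """Decode raw as UTF-8; on failure record the raw bytes and escape them."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # repr() of a bytes object is plain ASCII, so this write cannot
        # itself fail with an encoding error.
        reject_log.write(repr(raw) + "\n")
        reject_log.flush()
        # Keep the line loggable: invalid bytes become literal \xNN text.
        return raw.decode("utf-8", errors="backslashreplace")


def run(stream_in, handle, reject_path="/tmp/squid-rejects.log"):
    """Read byte lines from stream_in and pass the decoded text to handle()."""
    with open(reject_path, "a", encoding="ascii") as reject_log:
        for raw in stream_in:
            handle(decode_or_capture(raw.rstrip(b"\r\n"), reject_log))
```

In the daemon this might be called as run(sys.stdin.buffer, insert_row),
where insert_row stands for whatever function performs the database
insert.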

While I continue to try to figure that out, does anyone have more
information about which encodings or character sets Squid can output in
the log data? I am assuming it's a special character in the request URL
field (%ru) that's causing the problem. I just haven't the slightest
clue as to which one, as I haven't been able to trigger it in my test
environment, only in production.

I am using a custom log output format that is basically the default
with the field separators changed to |~|, to make parsing the output
into columns easier.

logformat SQL %ts.%03tu|~|%6tr|~|%>a|~|%Ss|~|%03>Hs|~|%<st|~|%rm|~|%ru|~|%[un|~|%Sh|~|%<a|~|%mt
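
For illustration, splitting one line of this format into columns could
look like the sketch below (Python 3 assumed). The column names are
invented labels for the twelve format codes, not names Squid defines, and
the sample line is made up. The sketch also assumes the line has already
been decoded to text and that any command prefix the log daemon protocol
adds has been stripped. The field count is checked because a URL that
happened to contain |~| would otherwise shift the columns silently.

```python
FIELD_SEP = "|~|"

# Hypothetical column labels, one per format code in the logformat line.
FIELD_NAMES = (
    "timestamp",     # %ts.%03tu
    "response_ms",   # %6tr
    "client_addr",   # %>a
    "squid_status",  # %Ss
    "http_status",   # %03>Hs
    "reply_bytes",   # %<st
    "method",        # %rm
    "url",           # %ru
    "username",      # %[un
    "hier_status",   # %Sh
    "server_addr",   # %<a
    "mime_type",     # %mt
)


def parse_sql_line(line: str) -> dict:
    """Split one log line on the separator and label the fields."""
    # strip() removes the space padding that width specifiers such as %6tr add.
    parts = [p.strip() for p in line.rstrip("\r\n").split(FIELD_SEP)]
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(
            "expected %d fields, got %d" % (len(FIELD_NAMES), len(parts))
        )
    return dict(zip(FIELD_NAMES, parts))


# Made-up sample line in the format above.
sample = (
    "1394570264.123|~|    45|~|192.0.2.1|~|TCP_MISS|~|200|~|1234|~|GET"
    "|~|http://example.com/|~|-|~|HIER_DIRECT|~|198.51.100.7|~|text/html\n"
)
row = parse_sql_line(sample)
```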

-- 
Thanks,
    Dean E. Weimer
    http://www.dweimer.net/
Received on Tue Mar 11 2014 - 20:37:52 MDT
