Fwd: [squid-users] Extracting selected data from logfile

From: David Rodríguez Fernández <davidrf_at_gmail.com>
Date: Fri, 20 Mar 2009 14:16:04 +0100

This doesn't show all the IPs, only those from the last 5000 requests. If
you want the IPs from every request in the access file you must use:

grep -w "403" access.log | awk '{print $1}' | sort | uniq > file.txt
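
Piping through sort before uniq is what catches repeats that are not on
adjacent lines. If you want to be stricter and only match the HTTP status
column (rather than a URL or byte count that happens to contain 403), an
untested variant should work with the combined format quoted below, where
the status ought to be the ninth whitespace-separated field:

awk '$9 == 403 {print $1}' access.log | sort -u > file.txt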

On Fri, Mar 20, 2009 at 11:48 AM, Frog <frog_at_rsf1.net> wrote:
>
> Hello all,
>
> Thank you Chris for the suggestion. It helped enormously. I have extracted the data I was looking for by using the following:
>
> tail -n 5000 access.log | grep "403" | awk '{print $1}' | uniq -d > file.txt
>
> Best regards
> Frog.
>
>
> ----- Original Message -----
> From: "Chris Robertson" <crobertson_at_xxxxxx>
> To: squid-users_at_squid-cache.org
> Sent: Thursday, 19 March 2009 21:37:25 GMT
> Subject: Re: [squid-users] Extracting selected data from logfile
>
> Frog wrote:
> > Hello All,
> >
> > Hopefully someone may be able to assist me.
> >
> > I have Squid setup here as a reverse proxy. I have logging configured using the following settings in squid.conf:
> >
> > logformat combined %>a %ui %un [%tl] "%rm %ru HTTP/%rv" %Hs %<st "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh
> > access_log /var/log/squid/access.log combined
> >
> > To block certain bots and bad user agents I have the following:
> >
> > acl badbrowsers browser "/etc/squid/badbrowsers.conf"
> > http_access deny badbrowsers
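> >
> > (badbrowsers.conf is simply a list of regular expressions, one per
> > line, that are matched against the User-Agent header, e.g. patterns
> > along the lines of:
> >
> > ^curl
> > libwww-perl
> > Wget
> >
> > though the exact contents will obviously vary.)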
> >
> > The http_access deny returns a 403 to a visitor that meets the criteria in badbrowsers.conf and this works perfectly. But I would like to take this one step further: I would like, if possible, to build a blacklist in real time of IP addresses that have been served a 403 error.
> >
> > Unfortunately my knowledge of most of the popular scripting languages is non-existent so I was wondering if something like a redirector could be configured to meet my needs?
> >
> > I have looked at fail2ban however it doesn't seem to parse my log files even if I change the squid log format to common.
> >
> > Basically I am wondering if there is a way to parse the logfile to append to a new file any IP address that was served a 403.
> >
>
> Something like...
>
> tail -n 5000 /path/to/access.log | grep 'HTTP/[^"]*" 403' | awk '{print $1}'
>
> ...run from the command line should (on my GNU/Linux machine) search the
> last 5000 lines (tail -n 5000) of the file at /path/to/access.log for
> the string "HTTP/" followed by any number of characters that are NOT a
> double quote, followed by a double quote, a space, and the string "403"
> (grep "HTTP...).  The first column from any lines with a matching
> pattern will be printed (awk '{print $1}').
>
> This is in no way tested, and obviously does not append to a file or run
> automatically.
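>
> If you did want it appending to a file and running on its own, one
> equally untested sketch (the blacklist path is only a placeholder, and
> with your combined logformat the status should be the ninth
> whitespace-separated field) is to follow the log and let awk do the
> filtering:
>
> # Follow the log as it grows; append each client IP the first time
> # it is served a 403, and flush so the file is usable immediately.
> tail -F /path/to/access.log | awk '$9 == 403 && !seen[$1]++ {
>   print $1 >> "/etc/squid/blacklist.txt"; fflush()
> }'
>
> Started from an init script (or run as a one-shot version of the awk
> filter from cron), that would give you something close to a real-time
> blacklist of addresses that have been served a 403.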
>
> > Thank you in advance for any pointers.
> >
> > Frog..
> >
>
> Chris
>
Received on Fri Mar 20 2009 - 13:16:15 MDT
