[squid-users] porn filtering, blacklists, and squid log file analysis

From: Dave <dmehler26@dont-contact.us>
Date: Sun, 8 Jul 2007 11:53:35 -0400

Hello,
    I'm trying to implement porn filtering, and I'm testing a variety of
setups to see which gives the best results. In every setup I'm running squid
(the 2.6 port on FreeBSD) as a transparent proxy. Setup 1 uses squidGuard
with the MESD blacklist. Once I added the MESD list the situation improved:
a lot of previously accessible sites were now blocked. However, my
volunteer, who has a test machine for this, was still able to use Google to
pull up images (nothing with pornographic-sounding names, but images of that
kind) as well as sites that weren't on the list. I update the blacklist
every night, but I need to write a script that goes through access.log,
finds the accesses from a given machine and where they went, and builds a
list of sites. It would then go through that list, eliminate duplicate
entries, and check which domains still work. Those that do would be added
automatically to a custom squidGuard blacklist, after which squidGuard is
reconfigured and squid reloaded.
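
The last step of that plan (adding domains to a custom list) could be
sketched roughly as follows. This is only an illustration under stated
assumptions: squidGuard domain lists are plain text files with one domain
per line, the function name merge_into_blacklist and whatever path is passed
to it are hypothetical, and rebuilding the squidGuard database and reloading
squid afterwards (typically "squidGuard -C all" and "squid -k reconfigure")
is left to the caller.

```python
from pathlib import Path


def merge_into_blacklist(domains_file, new_domains):
    """Add new_domains to a one-domain-per-line file.

    Returns the sorted list of domains that were not already present.
    The file is only rewritten when something new was added.
    """
    path = Path(domains_file)  # hypothetical path to the custom "domains" file
    existing = set()
    if path.exists():
        existing = {
            line.strip().lower()
            for line in path.read_text().splitlines()
            if line.strip()
        }
    # Normalise case and whitespace so duplicates collapse to one entry.
    candidates = {d.strip().lower() for d in new_domains if d.strip()}
    added = sorted(candidates - existing)
    if added:
        merged = sorted(existing | candidates)
        path.write_text("\n".join(merged) + "\n")
    return added
```

Returning only the newly added domains makes it easy to skip the rebuild
and reload entirely on nights when nothing changed.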
    With that explained, here is what I do now: I use grep on access.log to
pull out only the accesses from the machine I want (my test box) and put
them in another file. I then use cut to extract the URL of the page (I think
it's field 10 or 11) and drop that into a second file. The problem is that
this file contains 9500 entries, so going through it manually isn't an
option. If anyone can help with this, I can put the file somewhere it can be
downloaded.
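
One way to shrink a list like that is to reduce the URLs to unique host
names in a single pass, which replaces the grep and cut steps. The sketch
below assumes squid's native access.log format, where the client address is
the 3rd whitespace-separated field and the URL the 7th (cut with a
single-space delimiter counts the padding spaces, which may be why the field
number looks higher). The function name client_domains and the sample
address are made up for illustration.

```python
from urllib.parse import urlsplit


def client_domains(log_lines, client_ip):
    """Return the sorted, de-duplicated host names requested by client_ip."""
    domains = set()
    for line in log_lines:
        fields = line.split()  # split() collapses runs of padding spaces
        if len(fields) < 7 or fields[2] != client_ip:
            continue
        url = fields[6]
        if "://" not in url:
            # CONNECT entries log only host:port, so add a netloc marker.
            url = "//" + url
        host = urlsplit(url).hostname  # drops the port and lower-cases
        if host:
            domains.add(host)
    return sorted(domains)
```

Called as client_domains(open(path_to_access_log), "192.168.1.50"), this
returns a list that is usually far shorter than the raw URL list, since
many requests hit the same host.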
    On the subject of blacklists: aside from the MESD list, are there any
other lists for squid/squidGuard that are free, or free for noncommercial
purposes?
    My second setup involves DansGuardian. I have two issues with it. First,
the last time I tried it, it worked, but I never stress-tested it to the
extent I'm going for now. Second, it slowed the internet connection down
very noticeably, to the point where everyone was telling me about it. I've
got squid running as a transparent proxy using pf and I'd like to keep that
arrangement; last time I had to change it. If there's an alternative, I'm
open to suggestions.
Thanks.
Dave.
Received on Mon Jul 09 2007 - 00:35:19 MDT
