digging in your squid logs...

From: <po6@dont-contact.us>
Date: Wed, 2 Aug 2000 12:40:04 -0700 (PDT)

Bored?
The "quality" of the stuff that this script digs out of your
cache is quite amazing. Tested with squid 1.1.22 - sorry, I know
it's ancient - anybody want to test this and provide a patch for a
recent squid version?

Make sure your netscape is configured to use the squid cache...
===============8<===========dirtdigger=========>8=========
#!/usr/bin/perl -w
# Dirt Digger for Squid Cachelog
# Optional parameter: log line number to start at (earlier lines are skipped)

use English;
use strict;

my($MATCH,$FORMAT,$MINSIZE,$CMD,$WAITFORKEY);

# Comment out CMD to just list the matches
$CMD = 'netscape -remote openURL\(%s\) 1>/dev/null 2>&1';

$MATCH = "(teen|sex|porn|bizar|fuck|doll|xxx|dirty|taboo|virgin|pussy|nude|\
          centerfold|spread|wet|couple|suck|[^c]lick|cum|babe|\
          tits|erot|girl|chick|celeb|hard|core|lez|lesb|naked|flesh|butt|\
          amateur|young|slut|fick|horny|blow|swallow|love|pam|\
          akn-systems|\\.susy\\.|torrid|hustler|incantrix|\
          geil|fetish|black|orient|asia|orgy|orgie|sins|heels|sinful|swed)";

#$MATCH = "."; # quick way to match everything

my $SKIP = "(score|freepornsite|porncity|ipics)"; # sites that are down or suck
#undef $SKIP;

$FORMAT = "(gif|jpg|jpeg)"; $MINSIZE = 40 * 1024; # for pics
#$FORMAT = "(mov|mpeg|mpg|avi)"; $MINSIZE = 2000 * 1024; # for movies
#$FORMAT = "(mpeg|mpg)"; $MINSIZE = 900 * 1024; # for movies
#$FORMAT = "mp3"; $MINSIZE = 2000 * 1024; # for mp3
#$FORMAT = "zip"; $MINSIZE = 1430 * 1024; # for ZIP

$WAITFORKEY = 1;
my $dummy;

my($CACHELOG) = "/var/cache/log";

##
### End of Config Section
##

$OUTPUT_AUTOFLUSH = 1;
my($STARTLINE) = shift;

open(URLS, $CACHELOG) or die "oh. Failed to open log $CACHELOG: $!";

while(<URLS>) {
        next if defined $STARTLINE && $INPUT_LINE_NUMBER < $STARTLINE;
        next unless /\.$FORMAT/i && /$MATCH/xi;
        next if defined $SKIP && /$SKIP/xi;

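        # isolate size and URL; skip the line if the pattern fails, otherwise
        # $1 and $2 would still hold captures from an earlier match
        next unless /\ (\d+)\ ((ftp|https?):.*)/;
        next unless $1 > $MINSIZE;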
        print "$INPUT_LINE_NUMBER: $2 $1";
        system(sprintf($CMD,$2)) if defined $CMD;
        # wait for return, or print one:
        defined $WAITFORKEY ? $dummy = <STDIN> : print "\n";
}

======end======8<===========dirtdigger=========>8=========
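
On the request above for a newer squid: a minimal sketch, not a tested patch. It assumes the log being read is the native access.log, where each line has the fields "time elapsed client action/code bytes method URL ident hierarchy/peer type". Under that assumption the size is field 5 and the URL is field 7, so both can be picked out by position. The helper name parse_native_line and the sample line are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: native access.log layout, one request per line:
#   time elapsed client action/code bytes method URL ident hierarchy/peer type
# parse_native_line() is a hypothetical helper, not part of dirtdigger.
# Returns (bytes, url) on success, or an empty list if the line does not fit.
sub parse_native_line {
    my ($line) = @_;
    my @f = split ' ', $line;                    # split on runs of whitespace
    return unless @f >= 7;                       # too short for the assumed layout
    my ($bytes, $url) = @f[4, 6];
    return unless $bytes =~ /^\d+$/;             # size field must be numeric
    return unless $url =~ m{^(?:ftp|https?):}i;  # same schemes the script accepts
    return ($bytes, $url);
}

# Made-up sample line, for illustration only
my $sample = '965245204.123 1520 192.168.1.10 TCP_MISS/200 51234 GET '
           . 'http://www.example.com/pics/holiday.jpg - DIRECT/www.example.com image/jpeg';

my ($size, $url) = parse_native_line($sample);
print "$url $size\n" if defined $size;
```

In the loop above, the size/URL regex could then be swapped for a call to this helper. The keyword, format, and skip filters would stay as they are.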
Received on Wed Aug 02 2000 - 13:43:43 MDT
