Re: [squid-users] auto blacklist users

From: ian j hart <ianjhart@dont-contact.us>
Date: Sat, 8 Dec 2007 01:20:24 +0000

On Friday 07 December 2007 23:49:35 Amos Jeffries wrote:

[Apologies in advance if I've misunderstood anything; it's late (early) and
I'm somewhat brain dead. This time zone thing's a killer.]

> ian j hart wrote:
> > On Friday 07 December 2007 00:58:31 Adrian Chadd wrote:
> >> So if I get this right, you'd like to log the acl list that passed or
> >> failed the user?
> >>
> >>
> >>
> >> Adrian
> >
> > Near enough.
> >
> > I want to log the aclname (or custom error page name) and the username.
> > I'll probably want the url in short order, followed by anything else that
> > proves useful.
> >
> > I want to do this for users who are denied access.
> >
> > [The more general solution you state above would probably be okay too. I
> > might need to add DENY/ACCEPT so I can include that in the regexp.]
> >
> > <tangent>
> > Here's an example of how this might be generally useful. I have three
> > different proxy ACLs.
> >
> > A url_regex
> > A dstdomain list harvested from a popular list site
> > A "daily" list gleaned from yesterday's access summary
>
> Problem:
> If a student can get through all day today, what's to stop them?

Nothing. But here's what I hope will happen. (I probably shouldn't reveal
this, but what the hey).

Let's say someone makes 1000 requests while visiting 20 sites. That's 50
requests per site.

If they use a proxy, all 1000 requests accrue to the same site.

When I list the top sites by requests, proxy sites will naturally appear
nearer the top of the list.

[If they read this and stop using a proxy, that's the desired effect, and I
get logs of every site they visit, so that's a win for me anyway].

Not only that but peer pressure demands that they tell their friends, so you
don't need many users for a site to really stand out. [If they read this and
stop telling their friends, that's also a win for me].

So I spend a few minutes every day looking at the summary and add the sites to
the daily list.

The next day, they all hop back on the proxy site and get their first strike.
Muwhahaha!
Now they have a choice. Do I try another proxy and possibly get banned?
Muwhahahahahahahaha!

And if I don't spot it the first day, after a few days it'll be so popular I
can't miss it. Well, that's the plan anyway.

I was thinking of a 10-point limit, with 8 points for a proxy hit. The fine
would be expended at one point per day, so the second proxy hit would get them
a working week's ban. About right, I would say. Trying to bypass the filters
has to be the most serious offence.
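
Roughly what I have in mind, sketched out below. None of this is implemented
yet; the numbers are just the ones above and the function is made up:

  /* Sketch of the strike tally -- 10 point limit, 8 points per proxy hit,
   * the fine expended at one point per day. */
  #include <stdio.h>

  #define POINT_LIMIT   10  /* banned while above this */
  #define PROXY_FINE     8  /* points per proxy hit */
  #define DECAY_PER_DAY  1  /* points expended per day */

  /* New score from the old score, the days since it was last updated,
   * and the number of proxy hits today. */
  static int update_score(int old_score, int days_since, int proxy_hits)
  {
      int score = old_score - days_since * DECAY_PER_DAY;
      if (score < 0)
          score = 0;
      return score + proxy_hits * PROXY_FINE;
  }

  int main(void)
  {
      int score = update_score(0, 0, 1);   /* first strike: 8, still under 10 */
      printf("day 1: %d points\n", score);
      score = update_score(score, 1, 1);   /* next day: 7 + 8 = 15 */
      printf("day 2: %d points, banned for %d days\n",
             score, score - POINT_LIMIT);  /* 5 over the limit ~ a working week */
      return 0;
  }

So the first hit alone doesn't ban anyone; a second hit the next day leaves
them 5 points over, which works off at a point a day.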

If I can get them off proxies then whatever other unsuitable sites they do
visit at least they'll be logged.

> Is the list going to be cumulative over all time? Or just until
> nobody is requesting the particular site?

dstdomains are fast, right? I don't see any reason to remove them once found.

FWIW the harvest list is ~6500 domains, growing by a few every 10 minutes, and
the daily list is ~200, growing by a few a day.
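
For reference, the lists get wired up roughly like this (the file names and
the blockregex/dailyblock names are placeholders; passwd and blockproxies are
the ACL names I've been using, and dstdomain matching is a splay-tree lookup,
so a few thousand entries should be no problem):

  acl passwd proxy_auth REQUIRED
  # ~6500 harvested domains, a few more every 10 minutes
  acl blockproxies dstdomain "/etc/squid/harvest.list"
  # ~200 domains gleaned from the daily summaries
  acl dailyblock dstdomain "/etc/squid/daily.list"
  acl blockregex url_regex -i "/etc/squid/block.regex"
  http_access deny passwd blockregex
  http_access deny passwd blockproxies
  http_access deny passwd dailyblock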

Of course there's always the chance that a teacher might spot them and report
the user, in which case I view the log, see where they went, then add it to
the daily list.

If I'm really bored (or stuck answering the phone) I can always tail the
access log and guess which sites are suspect as they fly by. Quite therapeutic,
actually. fgrep \? helps here.
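
Something along these lines (the log path will vary):

  tail -f /var/log/squid/access.log | fgrep '?'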

>
> > Which one matched? (This is where the url would be nice)
>
> Which, given the current squid code, is why I pointed you at deny_info,
> which runs a script _at the point the request was made_ and will accept
> the ACL and URL.
>
> This gives you three benefits:
>
> 1) you can reset all student access again at the start of the day
> but block again _immediately_ when they start acting up.
> If they learn to obey the rules they will retain their access to okay
> sites. If not they are screwed.

I'll read this later, tired must sleep.

>
> 2) lets you list the newly banned site on the error page itself to
> warn/teach the students which URL will get them in trouble.

The standard error page includes this, doesn't it?

> (they may not know what's newly on the list yet, and this saves you the
> trouble of manually informing people)

I don't want to inform them. THINK EVIL.

If they know which sites are blocked, they'll just avoid them. I want them to
search for proxies and not know if they are safe to use. This is the back
pressure I was on about.

If they search for "myspace unblocker" they know exactly what's on the end of
the link so I feel no remorse in doing it this way. Plus I'm evil.

Cheers

>
> 3) lets you do almost anything you like when setting the allow/block
> state. You control the script entirely.
>
> > You can get this info by raising the log level, but not on one line,
> > which makes parsing evil. And each file is more verbose too.
> >
> > [A "full monty" implementation would be a separate match.log file
> > defaulting to "none"]
> > </tangent>
> >
> > Here's part of client_side.c
> >
> >     if (answer == ACCESS_ALLOWED) {
> >         ...
> >     } else {
> >         int require_auth = (answer == ACCESS_REQ_PROXY_AUTH ...
> >         debug(33, 5) ("Access Denied: %s\n", http->uri);
> > ->      debug(33, 5) ("AclMatchedName = %s\n",
> >             AclMatchedName ? AclMatchedName : "<null>");
> >
> > That's half what I need straight away!
> >
> > The problem is that this is called more than once. e.g.
>
> hmm, I'm guessing at your config here but it looks kinda like:
>
> REQ ...youtube.com
>
> > passwd
> > blockproxies
>
> (HIT http_access deny passwd blockproxies)
> 407 error. auth needed for this site.
>
> REQ ... youtube.com (+ username & password)
>
> > blockproxies
>
> (HIT http_access deny passwd blockproxies)
> 403 error (access denied).
>
> That's two separate requests. Normal sequence for basic auth.
>
> > First one is the auth, second is the url match, and third is the error
> > page (I think).
> >
> > I can easily _not match_ the passwd ACL, but if I'm counting 'strikes' it
> > would be cleaner if blockproxies were logged just the once.
> >
> > And that's where I came in.
> >
> > Is there a better place for this? It should be a one-liner. The error
> > page is only returned once, right? Which is why I thought somewhere near
> > there would be about right. Just need a clue from someone who sees the
> > whole picture.
> >
> > If you read this far, well done :)
> >
> > Thanks

-- 
ian j hart