Re: [squid-users] squidguard not redirecting

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 18 May 2013 20:54:43 +1200

On 18/05/2013 6:23 p.m., Helmut Hullen wrote:
> Hallo, Amos,
>
> Du meintest am 18.05.13:
>
>>>> SG has numerous problems which caused it not to do what it's
>>>> supposed to, including that "emergency" mode thing. Here are some
>>>> things to consider:
>>>> 1) a BIG blacklist is overhyped - when I had a good look at our
>>>> requirements, there was only a small percentage of those websites
>>>> we actually wanted to block, the rest were either squatting
>>>> websites or non-existent, or not relevant. Squid could blacklist
>>>> (eg ACL DENY) those websites natively with a minimum of fuss.
>>> Maybe - it does a good job even with these unnecessary entries.
>> If the list is that badly out of date it will also be *missing* a
>> great deal of entries.
>
> Yes - maybe. But updating the list is a really simple job.
>
>>>> 2) SG has not been updated for 4 or 5 years, if that's your latest
>>>> version, you are still out of date.
>>> I can't see a big need for updating. Software really doesn't need
>>> changes ("updates") every month or so.
>> For regular software yes. But security software which has set itself
>> out as enumerating badness/goodness for a control method needs
>> constant updates.
> Maybe - but "squidguard" does a really simple job: it looks into a list
> of disallowed domains and URLs and then decides whether to allow or to
> deny. That job doesn't need "constant updates".

Unfortunately it does so by forcing all the complications into Squid.

In order for SG to do that "really simple job", Squid is required to (a
typical helper setup is sketched after this list):
* manage a group of sub-processes, including all error handling when
they fail or hang.
* generate and process requests and responses in a protocol to
communicate with those sub-processes
* schedule client request handling around the delay from external
processing, including recovery on SG errors
* clone the HTTP request and perform a sub-request when a redirected-to
URL is presented by SG.
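
For illustration, the wiring that creates all of that work typically
looks something like this in squid.conf (the paths and child count here
are only examples, not a recommendation):

   # hand every request to a group of squidGuard helper sub-processes
   url_rewrite_program /usr/bin/squidGuard -c /etc/squidguard/squidGuard.conf
   # Squid has to start, feed, monitor and restart these helpers
   url_rewrite_children 5

Every request then waits on one of those helpers before Squid can
service it.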

Much better to have Squid doing the simple ACL task and drop all of the
above complications.

Not to mention that Markus fed back a lot of the ufdbGuard improvements
into Squid-3.2 and we now have ACLs which operate reasonably fast over
big lists of regex. Not that using big lists of regex is a great idea
anyway.

>
>>>> More to the point, you will not find much help now, or anyone to
>>>> fix it even if you could prove it's a bug.
>>> "That depends!" - I know many colleagues who use "squidguard" since
>>> years; the program doesn't need much help.
>> During which time a lot of things have progressed. Squid has gained a
>> lot of ACL types, better regex handling, better memory management, and
>> an external ACL helpers interface (which most installations of SG
>> should really be using).
>
>> Which brings me back to my question of what SG was being used for. If
>> it is something which the current Squid is capable of doing without
>> SG then you may be able to gain better traffic performance simply by
>> removing SG from the software chain. As csn233 found, it may be
>> worth it.
> The squidguard job is working with a really big blacklist. And working
> with some specialized ACLs.

Which, apart from the list files, is all based on information sent to
it by Squid.

> I know "squid" can do this job too - and I maintain a schoolserver which
> uses many of these possibilities of "squid". But then some other people
> has to maintain the blacklist. That's no job for the administrator in
> the school.

You are the first to mention that change of job.

The proposal was to:
  * make Squid load the blacklist
  * remove SG from the software chain
  * watch response time improve?

Nowhere in that sequence does it require any change of who is creating
the list.

At most the administrator may need to run a tool to convert from some
strange format to one Squid can load. (FWIW: both squidblacklists.org
and Shalla provide lists which have already been converted to
Squid-compatible formats).
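
As a rough sketch (the file name and ACL name below are invented for
the example), a Squid-loadable domain list is just one domain per line,
with a leading dot wherever sub-domains should match too:

   # /etc/squid/blacklists/porn.domains
   .bad-example.com
   .another-bad-example.net

and it is loaded with a single ACL line:

   acl blacklist_porn dstdomain "/etc/squid/blacklists/porn.domains"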

> "better traffic performance" may be a criteria, but (p.e.) blocking porn
> URLs is (in schools) a criteria too.
> Teachers have to look at "legal protection for children and young
> persons" too.

I'm just talking about shifting the checks to the place where they can
be tested most effectively. Not removing them.

Squid already has the information about user login, IP address, MAC
address and URL. No doubt Squid is already allowing/denying access
based on the login and IP address each user connects with. Making Squid
load the blocklist and use it in the http_access controls is relatively
simple.
  So what is left for SG to do? In most cases you will find the answer
is "nothing".

Note that we have not even got near discussing the content of those
"regex" lists. I've seen many SquidGuard installations where the
rationale for holding onto SG was that squid "can't handle this many
regex". Listing 5 million domain names in a file with some 1% having a
"/something" path tacked on the end does not make it a regex list.
  ** Split the file into domains and domain+path entries. Suddenly you
have a small file of url_regex, a small file of dstdom_regex and a long
list of dstdomain ... which Squid can handle.
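
A minimal sketch of where that split ends up (all file and ACL names
invented for the example):

   acl block_dom   dstdomain    "/etc/squid/blacklists/block.domains"
   acl block_domre dstdom_regex -i "/etc/squid/blacklists/block.domains.regex"
   acl block_url   url_regex    -i "/etc/squid/blacklists/block.urls.regex"

   http_access deny block_dom
   http_access deny block_domre
   http_access deny block_url

The long dstdomain list is cheap to look up; only the two small regex
files pay the regex cost.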

Amos