RE: [squid-users] Extending SquidGuard to work out of a SQL database

From: Rick Matthews <RedHat.Linux@dont-contact.us>
Date: Tue, 4 Mar 2003 10:07:55 -0600

> True - but I am betting large hardware and decision caching will
> compensate.

Don't forget that each cached decision is only valid for one
combination of source, destination, time of day, day of week, and
possibly total usage.
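
To make that concrete, here is a minimal Python sketch of what a
decision cache key would have to carry. All of the names (DecisionKey,
make_key, DecisionCache) are hypothetical, and using the hour as the
time slot is an assumption for illustration only; real time rules may
be finer-grained. Usage-based limits are left out because a quota
changes with every request and cannot be keyed this way.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DecisionKey:
    """Hypothetical cache key: one entry per distinct combination."""
    source: str       # client address or source group
    destination: str  # requested domain
    weekday: int      # 0 = Monday ... 6 = Sunday
    hour: int         # assumed granularity of time-of-day rules


def make_key(source: str, destination: str, when: datetime) -> DecisionKey:
    # Derive the day and time components from the request timestamp.
    return DecisionKey(source, destination, when.weekday(), when.hour)


class DecisionCache:
    """Plain in-memory mapping from key to allow/deny decision."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Returns None when no decision has been cached for this key.
        return self._store.get(key)

    def put(self, key, allowed: bool):
        self._store[key] = allowed
```

The same client asking for the same site at 10:00 and at 22:00
produces two different keys, so the cache grows with every dimension
you add.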

Henrik suggested:
> > I would recommend periodically updating the SquidGuard databases
> > from MySQL for easier configuration and maintenance, but keep
> > SquidGuard as it is.

You responded:
> I have a hard time seeing that scaling well.

I don't understand your scalability concerns, and maybe that's
because I'm not sure I understand your goal. You say that you
"...are doing this for enterprise scalability reasons", but you also
say "many admins in many orgs" and "many different customers with
different needs". Could you clarify your goals here? What problem(s)
are you addressing?

squidGuard is extremely fast and efficient in its current form. The
majority of the problems and complaints that are posted on the
squidGuard mailing list can be categorized into three areas:

Installation
------------
The squidGuard documentation is at least one version behind the code.
Following the installation instructions to the letter will result in
a broken squidGuard.

Configuration
-------------
The documentation plays a part in configuration problems, too. It is
a good base, but it stops short of providing real-life applications
of squidGuard's capabilities and provides little insight into
successful configuration and operation.

You could add significant value here by building in operational best
practices and by insulating the user from the arcane squidGuard.conf
file. Build in diagnostic processes and research tools that give the
user a logical approach to tracking down problems.
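
As a rough illustration of insulating the user from the file, a front
end could keep the settings as structured data and render
squidGuard.conf from it. The sketch below is Python; render_conf and
its arguments are hypothetical names, the paths and addresses are made
up, and it covers only a small subset of the config syntax (dbhome,
src, dest, acl, pass, redirect).

```python
def render_conf(dbhome, sources, dests, acl, redirect):
    """Render a small squidGuard.conf from plain dictionaries.

    sources: {name: [ip or cidr, ...]}
    dests:   {name: path of the domainlist, relative to dbhome}
    acl:     {source name: pass rule, e.g. "!porn any"}
    """
    lines = [f"dbhome {dbhome}", ""]

    # One src block per source group.
    for name, ips in sources.items():
        lines.append(f"src {name} {{")
        lines.extend(f"    ip {ip}" for ip in ips)
        lines += ["}", ""]

    # One dest block per destination category.
    for name, domainlist in dests.items():
        lines += [f"dest {name} {{", f"    domainlist {domainlist}", "}", ""]

    # The acl block: one rule per source, then a deny-all default.
    lines.append("acl {")
    for src, rule in acl.items():
        lines += [f"    {src} {{", f"        pass {rule}", "    }"]
    lines += [
        "    default {",
        "        pass none",
        f"        redirect {redirect}",
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_conf(
        "/var/lib/squidguard/db",
        {"staff": ["192.168.1.0/24"]},
        {"porn": "porn/domains"},
        {"staff": "!porn any"},
        "http://localhost/blocked.html",
    ))
```

A generator like this is also a natural place to validate input
before anything reaches the running redirector.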

Blacklists
----------
The blacklists are probably the single most critical factor in
deciding squidGuard's success, and yet the users are left floundering.
Add value by providing good blacklists. Provide automated blacklist
updates. Build a tool that will read hundreds of thousands of porn
urls (from the blacklist) and create an expressionlist that can
identify new porn urls with a very high degree of accuracy (without
false positives). Provide an easy process for local blocks and
allows, while still protecting the user from himself.
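
One naive way to approach the expressionlist tool, sketched in Python
under stated assumptions: count the word-like tokens in the
blacklisted URLs, keep those that are frequent there and that never
occur anywhere in a list of known-good URLs, and join them into one
alternation. The function names, the four-letter minimum, and the
min_count threshold are all hypothetical choices, and a real tool
would need far more care than this.

```python
import re
from collections import Counter

# Assumption: a "token" is a run of four or more lowercase letters.
TOKEN = re.compile(r"[a-z]{4,}")


def candidate_tokens(bad_urls, good_urls, min_count=2):
    """Tokens seen in at least min_count bad URLs and in no good URL."""
    counts = Counter()
    for url in bad_urls:
        # set() so a token counts once per URL, not once per occurrence.
        counts.update(set(TOKEN.findall(url.lower())))

    good = [url.lower() for url in good_urls]
    return sorted(
        token
        for token, n in counts.items()
        if n >= min_count
        # Substring test mirrors how the final expression will match.
        and not any(token in url for url in good)
    )


def build_expression(tokens):
    """Join the tokens into a single alternation, escaped for safety."""
    return "(" + "|".join(re.escape(t) for t in tokens) + ")"
```

Checking every candidate against a good list is what keeps the false
positives down; the quality of the result depends entirely on how
large and representative both lists are.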

> I have seen some efforts to this end, but we are talking about
> parsing out a SquidGuard config from an sql db - ( not a simplistic
> task I would think )

Are you looking for a "simplistic" solution?

> - aside from Berkeley DB updates. Each time this config changes due
> to new sources/acl/destination all squidguards need to be hupped.
> Now that can be a large hit. As we are trying to take the
> configuration to a highly distributed ( many admins in many orgs)
> level, this would probably make the whole thing fall down.

How will the ABC Corporation be affected when the XYZ Corporation
changes their config file?

If you are simply talking about blacklist subscriptions, you'd
be much better off using squidGuard/Berkeley DB as the engine
and distributing squidGuard-style diff files. Restarting a
squidGuard that is using pre-built Berkeley DB files takes seconds
(my P200 does it in 5 seconds).
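
A minimal sketch of producing such a diff in Python, assuming the
usual squidGuard diff format of one entry per line prefixed with +
(add) or - (remove). The function name make_diff is hypothetical. The
resulting file would sit next to the list it modifies (for example
domains.diff beside domains) and, as far as I know, is applied to the
pre-built db with squidGuard -u.

```python
def make_diff(old_entries, new_entries):
    """Return diff lines that turn the old list into the new one."""
    old, new = set(old_entries), set(new_entries)
    added = sorted(new - old)     # entries the subscriber must add
    removed = sorted(old - new)   # entries the subscriber must drop
    return [f"+{e}" for e in added] + [f"-{e}" for e in removed]


if __name__ == "__main__":
    previous = ["a.example", "b.example"]
    current = ["b.example", "c.example"]
    # Write the diff in the one-entry-per-line form described above.
    print("\n".join(make_diff(previous, current)))
```

Only the changes travel to each subscriber, so the size of an update
tracks the churn in the list rather than the size of the list.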

There are quite a few things that you could do to improve
squidGuard and make it a better commercial solution.

Rick Matthews

> -----Original Message-----
> From: Joe Maimon [mailto:jmaimon@ttec.com]
> Sent: Monday, March 03, 2003 8:09 AM
> To: Henrik Nordstrom
> Cc: squid-users@squid-cache.org
> Subject: Re: [squid-users] Extending SquidGuard to work out of a SQL
> database
>
>
>
>
> Henrik Nordstrom wrote:
>
> >On Mon 2003-03-03 at 13:41, Joe Maimon wrote:
> >
> >
> >
> >>My company is looking to extend the Squid redirector, SquidGuard, to
> >>work in real time out of a SQL database. We are looking to target the
> >>GNU/Linux environment and the MySQL database server.
> >>
> >>
> >
> >Hmm.. I would be a little worried about the latency of using MySQL in
> >the mix.. it is very hard to beat Berkeley DB when it comes to read
> >latency of static data..
> >
> >
> True - but I am betting large hardware and decision caching will compensate.
>
> >I would recommend periodically updating the SquidGuard databases from
> >MySQL for easier configuration and maintenance, but keep SquidGuard as
> >it is.
> >
> I have a hard time seeing that scaling well. I have seen some efforts to
> this end, but we are talking about parsing out a SquidGuard config from
> an sql db - ( not a simplistic task I would think ) - aside from Berkeley
> DB updates. Each time this config changes due to new
> sources/acl/destination all squidguards need to be hupped. Now that can
> be a large hit. As we are trying to take the configuration to a highly
> distributed ( many admins in many orgs) level, this would probably make
> the whole thing fall down. And users hate latency on config changes.
>
> I actually practice this concept for my named.conf files. It wasn't
> simple for that and lost some flexibility there as well.
>
> That being said, your advice merits some consideration.
>
> >
> >Regards
> >Henrik
> >
> >
> >
>
Received on Tue Mar 04 2003 - 09:08:05 MST
