Re: [squid-users] Extending SquidGuard to work out of a SQL database from Joe Maimon on 2003-03-04 (squid-users)

From: Joe Maimon <jmaimon@dont-contact.us>
Date: Tue, 04 Mar 2003 15:34:38 -0500

Rick Matthews wrote:

>The following was also posted to squid-users@squid-cache.org:
>
>
>
>>True - but I am betting large hardware and decision caching will
>>compensate.
>>
>>
>
>Don't forget that those decisions will need to be treated as unique
>to source, destination, time, day, and possibly total usage.
>
>
Good point
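
A rough sketch of what such a cache key could look like, assuming
decisions are bucketed by time of day. The names cache_key and
DecisionCache, and the 15-minute bucket, are illustrative assumptions
and not part of SquidGuard. Rules that depend on total usage would not
fit this scheme and would need separate handling.

```python
from collections import OrderedDict
from datetime import datetime


def cache_key(source, destination, when, bucket_minutes=15):
    """Build a key unique to source, destination, weekday, and time bucket."""
    minute_of_day = when.hour * 60 + when.minute
    return (source, destination, when.weekday(), minute_of_day // bucket_minutes)


class DecisionCache:
    """Small LRU cache of filtering decisions (hypothetical helper)."""

    def __init__(self, max_entries=10000):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, decision):
        self._entries[key] = decision
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

Two requests from the same source to the same destination share a
cached decision only while they fall in the same weekday and time
bucket, so time-based rules are re-evaluated at each bucket boundary.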

>Henrik suggested:
>
>
>>>I would recommend periodically updating the SquidGuard databases
>>>from MySQL for easier configuration and maintenance, but keep
>>>SquidGuard as it is.
>>>
>>>
>
>You responded:
>
>
>>I have a hard time seeing that scaling well.
>>
>>
>
>I don't understand your scalability concerns, and maybe that's
>because I'm not sure I understand your goal. You say that you
>"...are doing this for enterprise scalability reasons", but you also
>say "many admins in many orgs" and "many different customers with
>different needs". Could you clarify your goals here? What problem(s)
>are you addressing?
>
>
We are trying to add value to our service as an ISP. That means we want
to share filtering machines among different users or organizations. We do
not want to maintain the rules or the lists ourselves. We want our users,
or their representatives, to log in to the interface and work with their
own config, and with the globally shared config only where appropriate.

>squidGuard is extremely fast and efficient in its current form. The
>majority of the problems and complaints that are posted on the
>squidGuard mailing list can be categorized into three areas:
>
>Installation
>------------
>The squidGuard documentation is at least one version behind the code.
>Following the installation instructions to the letter will result in
>a broken squidGuard.
>
>
The webmin module also helps.

>Configuration
>-------------
>The documentation plays a part in configuration problems, too. It is
>a good base, but it stops short of providing real-life applications
>of squidGuard's capabilities and provides little insight to successful
>configuration and operation.
>
>You could add significant value here by building in operational best
>practices and by insulating the user from the arcane squidGuard.conf
>file. This would build in diagnostic processes, and provide a logical
>approach to problem research. Build in research tools.
>
We actually want to replace the entire configuration approach with a web
front end and a SQL back end.
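
As a minimal sketch of the SQL side, assuming one table that holds both
per-customer and globally shared entries. The table and column names
and the 'global' owner marker are hypothetical, and sqlite3 stands in
for MySQL only so the example runs on its own.

```python
import sqlite3

SCHEMA = """
CREATE TABLE blocked_domain (
    owner  TEXT NOT NULL,   -- customer name, or 'global' for shared entries
    domain TEXT NOT NULL,
    PRIMARY KEY (owner, domain)
);
"""


def effective_blocklist(conn, customer):
    """Return the customer's own entries merged with the shared ones."""
    rows = conn.execute(
        "SELECT DISTINCT domain FROM blocked_domain "
        "WHERE owner IN (?, 'global') ORDER BY domain",
        (customer,),
    )
    return [domain for (domain,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO blocked_domain (owner, domain) VALUES (?, ?)",
    [("global", "ads.example"), ("abc", "games.example"), ("xyz", "chat.example")],
)
```

Each customer sees only its own rows plus the shared ones, so a change
made by one organization never shows up in another's effective list.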

>
>Blacklists
>----------
>The blacklists are probably the single most critical factor in
>deciding squidGuard's success, and yet the users are left floundering.
>Add value by providing good blacklists. Provide automated blacklist
>updates. Build a tool that will read hundreds of thousands of porn
>urls (from the blacklist) and create an expressionlist that can
>identify new porn urls with a very high degree of accuracy (without
>false positives). Provide an easy process for local blocks and
>allows, while still protecting the user from himself.
>
>
I agree. This will have to be done by somebody eventually, maybe even us.

>
>
>>I have seen some efforts to this end, but we are talking about
>>parsing out a SquidGuard config from an sql db - ( not a simplistic
>>task I would think )
>>
>>
>
>Are you looking for a "simplistic" solution?
>
>
Once we have to get that deep into it, I think a hack - which the above
clearly is - is the wrong approach in principle. What we are looking for
will no doubt be anything but simplistic.

>
>
>>- aside from Berkeley DB updates. Each time this config changes due
>>to new sources/acl/destination all squidguards need to be hupped.
>>Now that can be a large hit. As we are trying to take the
>>configuration to a highly distributed ( many admins in many orgs)
>>level, this would probably make the whole thing fall down.
>>
>>
>
>How will the ABC Corporation be affected when the XYZ Corporation
>changes their config file?
>
>
New sources, new destinations, new rules, new times, new redirects, new
rewrites. These all require the master config to be reparsed and
rebuilt, and the squidguards rehupped.
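
One way to limit that cost is to coalesce changes so that many edits
trigger a single rebuild and hup. The sketch below is only an
illustration: ReloadCoalescer and reload_fn are hypothetical names, and
reload_fn stands for whatever rebuilds the config and signals the
redirectors. The price is exactly the config-change latency that users
dislike.

```python
import time


class ReloadCoalescer:
    """Batch config changes into at most one reload per min_interval seconds."""

    def __init__(self, reload_fn, min_interval=60.0, clock=time.monotonic):
        self._reload_fn = reload_fn
        self._min_interval = min_interval
        self._clock = clock
        self._last_reload = None
        self._dirty = False

    def config_changed(self):
        """Record that some customer changed the config."""
        self._dirty = True

    def tick(self):
        """Call periodically; returns True if a reload was performed."""
        if not self._dirty:
            return False
        now = self._clock()
        if self._last_reload is not None and now - self._last_reload < self._min_interval:
            return False  # too soon since the last reload, keep waiting
        self._reload_fn()
        self._last_reload = now
        self._dirty = False
        return True
```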

>If you are simply talking about blacklist subscriptions, you'd
>be much better off using squidGuard/Berkeley as the engine,
>and distributing squidGuard-style diff files. (Restarting a
>squidGuard that is using pre-built Berkeley db files takes seconds
>[my p200 does it in 5 seconds])
>
>
A diff would work for list maintenance. And yes the hup is fast. But
doing a hup for config changes means that it is not wise to download the
config changes very frequently.
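
For the list-maintenance part, a diff can be derived from two versions
of a domain list. This sketch assumes the squidGuard-style diff format
of one entry per line prefixed with + (add) or - (remove). The function
name make_diff is made up for the example.

```python
def make_diff(old_domains, new_domains):
    """Return diff text: '+domain' for additions, '-domain' for removals."""
    old, new = set(old_domains), set(new_domains)
    lines = ["+" + d for d in sorted(new - old)]
    lines += ["-" + d for d in sorted(old - new)]
    return "\n".join(lines) + ("\n" if lines else "")
```

Only the changed entries travel to each filtering machine, which keeps
list updates small even when the full list is very large.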

>There are quite a few things that you could do to improve
>squidGuard and make it a better commercial solution.
>Rick Matthews
>
That would be a good thing. Anyways, based on the feedback I am getting,
it looks like we may have to go back to the drawing board.

>
>
>
>
>>-----Original Message-----
>>From: Joe Maimon [mailto:jmaimon@ttec.com]
>>Sent: Monday, March 03, 2003 8:09 AM
>>To: Henrik Nordstrom
>>Cc: squid-users@squid-cache.org
>>Subject: Re: [squid-users] Extending SquidGuard to work out of a SQL
>>database
>>
>>
>>
>>
>>Henrik Nordstrom wrote:
>>
>>
>>
>>>Mon 2003-03-03 at 13:41, Joe Maimon wrote:
>>>
>>>
>>>
>>>
>>>
>>>>My company is looking to extend the Squid redirector, SquidGuard, to
>>>>work in real time out of a SQL database. We are looking to target the
>>>>GNU/Linux environment and the MySQL database server.
>>>>
>>>>
>>>>
>>>>
>>>Hmm.. I would be a little worried about the latency of using MySQL in
>>>the mix.. it is very hard to beat Berkeley DB when it comes to read
>>>latency of static data..
>>>
>>>
>>>
>>>
>>True - but I am betting large hardware and decision caching will compensate.
>>
>>
>>
>>>I would recommend periodically updating the SquidGuard databases from
>>>MySQL for easier configuration and maintenance, but keep SquidGuard as
>>>it is.
>>>
>>>
>>>
>>I have a hard time seeing that scaling well. I have seen some efforts to
>>this end, but we are talking about parsing out a SquidGuard config from
>>an sql db - ( not a simplistic task I would think ) - aside from Berkeley
>>DB updates. Each time this config changes due to new
>>sources/acl/destination all squidguards need to be hupped. Now that can
>>be a large hit. As we are trying to take the configuration to a highly
>>distributed ( many admins in many orgs) level, this would probably make
>>the whole thing fall down. And users hate latency on config changes.
>>
>>I actually practice this concept for my named.conf files. It wasn't
>>simple to do there, and it cost some flexibility as well.
>>
>>That being said, your advice merits some consideration.
>>
>>
>>
>>>Regards
>>>Henrik
>>>
>>>
>>>
>>>
>>>
>
>
>
>
Received on Tue Mar 04 2003 - 13:34:49 MST
