Re: [squid-users] open webpage category database proposal

From: Antony Stone <Antony@dont-contact.us>
Date: Sat, 26 Jul 2003 13:47:24 +0100

On Saturday 26 July 2003 12:32 pm, Bgs himself wrote:

> Hi !
>
> I'm thinking about an open db and the squid list seems to be a good
> starting point.
> My plan: create a reverse DNS based web page category database. It would
> be similar to the *RBL systems. The db is a pseudo reverse DNS server.
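
[Illustration: a minimal sketch of how a client might form the lookup name
for such a db, assuming it is queried like a domain-based blocklist (hostname
prepended to the list's zone). The zone "webcat.example" and the function
name are hypothetical; nothing in the proposal fixes them.]

```python
def build_query_name(hostname: str, zone: str = "webcat.example") -> str:
    """Return the DNS name to look up for `hostname` in a DNSBL-style zone.

    Assumption: the category db is queried by prepending the site's
    hostname to the zone, as domain-based blocklists do. The zone name
    here is a placeholder, not a real service.
    """
    # Normalise: drop surrounding whitespace and any trailing root dot,
    # and lower-case, since DNS names are case-insensitive.
    host = hostname.strip().rstrip(".").lower()
    if not host:
        raise ValueError("empty hostname")
    return f"{host}.{zone}"


query = build_query_name("WWW.Something.com.")
```

An A-record lookup on the resulting name would then return the pseudo IP
number described further down.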

RBL works for mail servers because a host is either a mail server or it
isn't. I don't see the idea working so neatly for web servers, because a
single website can have many different types of content in subpages -
just think of www.geocities.com for a fairly extreme example of this.

Also a reverse lookup when processing an incoming email may add a few seconds
to the transmission time - but who cares? The same is not true of website
access - people would care a lot (and complain).

> By querying the db you get back a 32-bit value which contains information
> about category, sub category, rating and category specific custom flags.
>
> Something like:
> www.something.com gives you a.b.c.d pseudo IP number where:
>
> 'a' is the category ID
> 'b' is the sub category ID
> 'c' is the rating
> 'd' is the custom flag
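
[Illustration: a minimal sketch of decoding the a.b.c.d answer into the four
fields listed above. The field meanings come from the proposal; the type
name, function name and the sample address are made up for the example.]

```python
import ipaddress
from typing import NamedTuple


class SiteCategory(NamedTuple):
    category: int      # 'a': category ID
    subcategory: int   # 'b': sub category ID
    rating: int        # 'c': rating
    flags: int         # 'd': category-specific custom flags


def decode_pseudo_ip(answer: str) -> SiteCategory:
    """Split a dotted-quad answer into its four one-byte fields."""
    # IPv4Address validates the string; .packed yields exactly 4 bytes,
    # and unpacking a bytes object gives one int (0-255) per octet.
    packed = ipaddress.IPv4Address(answer).packed
    return SiteCategory(*packed)


# Sample answer; the numeric values carry no agreed meaning here.
info = decode_pseudo_ip("3.12.2.129")
high_flag_set = bool(info.flags & 0x80)  # test one hypothetical flag bit
```

Each field is limited to one byte, so the scheme allows at most 256
categories, 256 sub categories per category, and 8 independent flag bits.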

I think a big difficulty here would be the question of whether your idea of
illegal / undesirable / objectionable / etc content is the same as mine - one
person's humour site may be another person's idea of pornography...

> The db would be user managed: everyone could add (with proper checking of
> course) new sites. After a while this might grow into an up-to-date db.

Email RBLs can be automatically checked - you don't need a person to decide
whether a mail server is an open relay or not, and databases like Razor and
DCC help to check for machines which spew out spam on a regular basis.

I don't see that this is possible with websites - there would be much more
human decision-making involved (if the website contents can be checked
automatically, why not just do it on your own proxy server instead of in a
centralised database?)

So, although it seems like a nice idea, I can't say that I think it's
particularly feasible.

I'd be interested to hear if anyone else thinks otherwise, though, and I'm
happy to discuss off this list if that's preferred.

Antony.

-- 
Software development can be quick, high-quality, or low-cost.
The customer gets to pick any two out of three.
Received on Sat Jul 26 2003 - 06:47:36 MDT
