Re: [squid-users] open webpage category database proposal

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sat, 26 Jul 2003 19:25:42 +0200

On Saturday 26 July 2003 14.47, Antony Stone wrote:

> RBL works for mail servers because a hostname is either a
> mailserver, or it isn't. I don't see the idea working so neatly
> for web servers, because a single website can have many many
> different types of content in subpages - just think of
> www.geocities.com for a fairly extreme example of this.

All you need to make that technically work is to define a DNS
namespace for URLs. DNS as such is very neutral, as long as it is
defined that the namespace is for this purpose and not for looking
up host names. To avoid confusing others who look into the database,
A records should be avoided. There are many other suitable record
types to use.
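
As a rough illustration of such a namespace, here is a minimal sketch
of how a client might turn a URL into a query name. Everything
specific in it is an assumption made for the example and not part of
any existing service: the zone "urlcat.example", the function name
url_to_qname, and the choice to hash the path with SHA-1 so that it
fits into a single DNS label. The sketch only builds the name and
does not perform any lookup.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical zone dedicated to URL classification (an assumption).
ZONE = "urlcat.example"


def url_to_qname(url, zone=ZONE):
    """Map a URL to a query name of the form <path-hash>.<host>.<zone>."""
    parts = urlsplit(url)
    host = (parts.hostname or "").rstrip(".")  # hostname is lowercased
    if not host:
        raise ValueError("URL has no host: %r" % (url,))
    path = parts.path or "/"
    # Paths may be long or contain characters that are awkward in DNS
    # names, so the path is hashed into one 40-character hex label
    # (a DNS label may be at most 63 octets).
    label = hashlib.sha1(path.encode("utf-8")).hexdigest()
    qname = "%s.%s.%s" % (label, host, zone)
    if len(qname) > 253:  # limit on the length of a full DNS name
        raise ValueError("query name too long: %r" % (qname,))
    return qname


if __name__ == "__main__":
    print(url_to_qname("http://www.geocities.com/some/page.html"))
```

A client would then ask for a non-A record type, for example TXT, at
that name, and could fall back to the plain <host>.<zone> name when
nothing is published for the specific page.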

> I think a big difficulty here would be the question of whether your
> idea of illegal / undesirable / objectionable / etc content is the
> same as mine - one person's humour site may be another person's
> idea of pornography...

One of many difficulties, but probably a relatively minor one if the
system works by content classification rather than abstract terms
about how suitable the content is. If you classify the site as a
certain type of web site (porn, email, portal, business, banking,
trading etc) then unless you made an error others are likely to
agree.
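
To make the split between classification and judgement concrete,
here is a small sketch in which the shared database only says what
kind of site something is and each installation decides locally what
to do with that. The policy tables, the decide function and the
allow/deny values are hypothetical and chosen only for illustration.

```python
# Two hypothetical local policies applied to the same neutral
# classification. The shared data never says "good" or "bad".
OFFICE_POLICY = {"email": "allow", "portal": "allow",
                 "banking": "deny", "trading": "deny"}
HOME_POLICY = {"email": "allow", "portal": "allow",
               "banking": "allow", "trading": "allow"}


def decide(categories, policy, default="allow"):
    """Return "deny" if the local policy denies any given category.

    Categories missing from the policy are treated as `default`.
    An empty set of categories is allowed.
    """
    for category in categories:
        if policy.get(category, default) == "deny":
            return "deny"
    return "allow"


if __name__ == "__main__":
    site = {"banking"}
    print(decide(site, OFFICE_POLICY))
    print(decide(site, HOME_POLICY))
```

Both installations agree that the site is a banking site and differ
only in the local decision.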

The big difficulty is in getting a substantial base of users who
classify web sites beyond the most obvious ones.

> I don't see that this is possible with websites - there would be
> much more human decision-making involved (if the website contents
> can be checked automatically, why not just do it on your own proxy
> server instead of in a centralised database?)

Cerberian does something which resembles the technical parts of this
idea quite closely, although using a custom XML query scheme rather
than DNS. Their approach also gives other benefits, allowing them to
rate URLs more accurately.

> So, although it seems like a nice idea, I can't say that I think
> it's particularly feasible.

I would not agree, but I do not think it is feasible on a "free"
basis alone.

Regards
Henrik Nordström
MARA Systems AB, Sweden
Received on Sat Jul 26 2003 - 11:26:06 MDT
