redirecting, and parsing for content

From: David Flynn <Dave@dont-contact.us>
Date: Thu, 30 Mar 2000 00:21:52 +0100

Evening all,
I am working on a school project to replace the current web filtering
system (which is out of our control) with a tailor-made system built to the
school's required specs, which I can then tweak. This would let us change
our ISP (moving to a dedicated line, which requires us to do the filtering
ourselves). I am using Squid as a cache and have written a redirector that
uses ident responses to find out which user is doing what, and then looks
up in an SQL database whether they can access the site. If not, it sends
them to the new URL.
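
For reference, a minimal sketch of a redirector along these lines, in
Python. It assumes each request line from Squid looks like
"URL client-ip/fqdn ident method" and that a blank reply means "leave the
URL alone". The BLOCKED set stands in for the SQL lookup, and every name
and URL in it is made up for illustration.

```python
import sys
from urllib.parse import urlsplit

# Hypothetical stand-ins: a real version would query the SQL database.
BLOCKED = {("jsmith", "banned.example.com")}
BLOCK_PAGE = "http://filter.example.local/denied.html"


def is_allowed(user, url):
    """Placeholder for the database lookup: may this user fetch this URL?"""
    host = urlsplit(url).hostname or ""
    return (user, host) not in BLOCKED


def handle(line, visible=False):
    """Return the reply for one request line ("" means leave the URL alone)."""
    fields = line.split()
    if len(fields) < 4:
        return ""  # malformed line: pass the request through untouched
    url, _client, ident, _method = fields[:4]
    # Assumption: an ident of "-" means Squid had no user name to offer.
    # Failing closed here is a policy choice, not something Squid requires.
    if ident != "-" and is_allowed(ident, url):
        return ""
    # A bare URL makes Squid fetch the replacement itself, so the browser
    # keeps showing the original address; a "302:" prefix asks Squid to
    # send the browser a real HTTP redirect instead.
    return ("302:" if visible else "") + BLOCK_PAGE


def serve(infile, outfile):
    """Answer one line per request, flushing so Squid is never kept waiting."""
    for line in infile:
        outfile.write(handle(line) + "\n")
        outfile.flush()

# Run as a redirector with: serve(sys.stdin, sys.stdout)
```

The flush after every reply matters because Squid waits on each answer
before it carries on with the request.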

Here are some points I am stuck on:
    1. It seems that Squid does not do an ident query for each request from
the same IP: sometimes it remembers the earlier answer and sometimes it
forgets it, which is screwing up my redirector. I am getting around that by
implementing some SMB calls to identify the user on the machine. Is this a
known problem with ident, or could it be my client?

    2. I would like to do some transparent redirecting, i.e. the client is
not told that the page it is receiving is different from the one it asked
for, so the URL the client displays for a banned page does not change to
that of the redirected page. Any ideas? Or am I just being silly putting
the 301: or 302: in the redirect?

    3. As an extension to the project we need to actually parse the document
for "undesired" content, usually implied by rather fruity language and by
the egos of the people writing the pages, who put every conceivable word to
do with the site in the page. There is a simple way I can think of doing it
with Squid and a redirector: get the redirector to look up the page, buffer
the content if it is a text format (HTML, plain text, etc.) and then parse
it. The parsing is "easy", it just requires a lot of brute-force processor
power. The catch is that I would have to download the page twice: once by
the redirector for parsing and then again by Squid to relay it. Is it
possible to get Squid to buffer the output, send that to another process
for examination, and have that process send back the OK or a redirected
URL? Or are there any other utilities in existence that would sit in the
connection somewhere to do this filtering? I think I have explained this
poorly, but if you have anything that could assist me, please do!
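
To make the parsing step in point 3 concrete, here is a rough sketch in
Python of what I mean by "parse it for things". The word list, the
threshold and all the function names are placeholders, and it says nothing
about the double-download problem; it only shows the check that would run
on a buffered page.

```python
import re

# Placeholder word list and content types, purely for illustration.
BANNED_WORDS = ["badword", "worseword"]
TEXT_TYPES = ("text/html", "text/plain")


def is_text(content_type):
    """True for text formats, ignoring parameters such as '; charset=...'."""
    return content_type.split(";", 1)[0].strip().lower() in TEXT_TYPES


def strip_tags(html):
    """Crudely replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]*>", " ", html)


def count_hits(body, words=BANNED_WORDS):
    """Count whole-word, case-insensitive matches of the banned words."""
    alternatives = "|".join(re.escape(word) for word in words)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return len(pattern.findall(strip_tags(body)))


def should_block(content_type, body, threshold=3):
    """Block a text page once it reaches the (arbitrary) hit threshold."""
    return is_text(content_type) and count_hits(body) >= threshold
```

Counting hits against a threshold, rather than blocking on the first
match, is just one way to cut down on false positives from a single stray
word.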

Thanks

David Flynn
Received on Wed Mar 29 2000 - 16:27:56 MST
