[squid-users] saving web page body to file... help needed

From: Siddhesh PaiRaikar <siddhesh.pairaikar@dont-contact.us>
Date: Wed, 17 Jan 2007 19:07:32 +0530

Hi,

We are trying to develop a small enhancement to the existing squidGuard application running with the Squid proxy server, which could later be embedded into Squid itself as an HTML page body scanner for unwanted content.
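For context, squidGuard currently hooks into Squid as a URL redirector helper: Squid hands each request to the helper on stdin, and the helper answers with a replacement URL (to block) or an empty line (to allow). A rough sketch of that protocol in C is below; the block-page URL and the keyword test are only placeholders:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[8192];
        char url[4096];

        while (fgets(line, sizeof line, stdin)) {
            /* Each input line is roughly: URL ip-address/fqdn ident method */
            if (sscanf(line, "%4095s", url) == 1 && strstr(url, "badword"))
                puts("http://proxy.example/blocked.html"); /* redirect = block */
            else
                puts("");                                  /* empty line = allow */
            fflush(stdout); /* Squid expects one reply line per request */
        }
        return 0;
    }

The limitation is that a redirector only ever sees the URL, never the response body, which is why we need to get at the body inside Squid itself.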

The idea is to extract the body of each HTML page and store it in a file, and then do all the work on that file: apply the desired search techniques and allow or disallow the page based on its content.
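To make the scanning step concrete, below is a rough C sketch of what we plan to do once the body is in a file; the file name and banned-word list are only placeholders:

    #include <stdio.h>
    #include <string.h>

    static const char *banned[] = { "badword1", "badword2", NULL };

    /* Return 1 if the saved body contains unwanted content,
     * 0 if it is clean, -1 on error. */
    int scan_body(const char *path)
    {
        char buf[4096];
        int i;
        FILE *fp = fopen(path, "r");

        if (!fp)
            return -1;
        while (fgets(buf, sizeof buf, fp)) {
            for (i = 0; banned[i]; i++) {
                if (strstr(buf, banned[i])) {
                    fclose(fp);
                    return 1;   /* disallow the page */
                }
            }
        }
        fclose(fp);
        return 0;               /* allow the page */
    }

    int main(void)
    {
        int rc = scan_body("/tmp/page-body.html");
        puts(rc == 1 ? "DENY" : rc == 0 ? "ALLOW" : "ERROR");
        return 0;
    }

A real scanner would also have to handle keywords split across read buffers and non-HTML content types, but this is the allow/deny decision we are after.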

For that we need to save the body content of every HTML page received into a temporarily created file and then work on that file as desired. As we are under a tight time constraint, we have not been able to read through the entire code base to find where exactly Squid takes in the web page from the Internet and temporarily stores it before displaying it.

Pretty obviously the server must be doing this somewhere; we tried for a long time to find it in the source files, but to no avail. We are novices in this field of development and would appreciate some assistance.

If we could just know where exactly Squid keeps the web page body before sending it to the browser for display, i.e. the exact source file name, the function name, and the variable in the Squid source code in which the body is held, that would be a great help.

Eagerly awaiting a response.

Thank you.

-Siddhesh Pai Raikar
(India)

P.S. I really did not know whom this should be sent to, and I tried to find out without much success, so please forgive me if you are not the intended recipient.