RE: Using Squid to "Strip" a web site - Is it possible?

From: Armistead, Jason <>
Date: Wed, 14 Jan 1998 18:59:00 -0500


Squid can't do this - it only RESPONDS to requests placed on it (rather
than doing things off its own bat).

I'd suggest using GNU WGET to do this. It can traverse a web site and
download the files to your disk.

Try or any good GNU mirror.

The problem is really one of limiting the traversal to just the "tree"
of the web site you're interested in (sometimes WGET gets a bit carried
away IMHO).
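One way to keep WGET from getting carried away is to combine its recursive
mirroring mode with options that fence in the traversal. A sketch (the host
name, directory, and depth below are just placeholders, not a real site):

```shell
# Mirror one site, but stay at or below the starting directory
# (--no-parent), stay on the one host (--domains), and cap the
# recursion depth (--level).  Host/path/depth are examples only.
wget --mirror \
     --no-parent \
     --domains=www.example.com \
     --level=5 \
     http://www.example.com/docs/
```

Check your WGET version's documentation for the exact option names - they
have shifted a little between releases.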

Plus, if you have active server pages (ASPs), CGI, non-relative
(absolute) URL links to the same site or to other sites (e.g. those
pesky advertising sites), then the "mirror" won't work, unless you're
prepared to hack the HTML files to remove these.
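For the absolute-URL problem, recent WGET versions can rewrite the links in
the pages they save so they point at the local copies, which saves hand-hacking
the HTML. A sketch (again with a placeholder host, and no guarantees for
pages generated by ASP/CGI):

```shell
# --convert-links rewrites the links in downloaded HTML so the
# local mirror is browsable off-line.  Host is an example only.
wget --mirror --convert-links --no-parent http://www.example.com/
```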

Good luck



> ----------
> From: Dave[]
> Sent: Thursday, 15 January 1998 8:41
> To:
> Subject: Using Squid to "Strip" a web site - Is it possible?
> Is there a way to make Squid "strip" a web site of all available
> files?
> I guess I would be looking for a way to "mirror" a web site, and use
> that "mirror" for off-line browsing.
> I am not worried about disk space or network bandwidth.
> If not Squid, is there a shareware/freeware program that I could use
> to access a web site, and pull everything available off of it in an
> unattended fashion?
> Dave
Received on Wed Jan 14 1998 - 16:08:04 MST

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:38:25 MST