Re: [squid-users] Predictive caching?

From: Bob Arctor <curious@dont-contact.us>
Date: Wed, 23 Jul 2003 02:08:47 +0200

squid could just CONNECT to all links available on the page.
Then it could check the available bandwidth and crawl through the site,
fetching all HTML pages (and caching them) while skipping those that are
dynamically created. There should also be a way to tune the bandwidth used
by the crawler, and an option to also fetch files of certain types and
sizes. The recursion depth should be configurable as well...

IMO only the "CONNECT to all links on the page" feature should be
implemented in squid; the rest could be done by a wget-forking proxy.
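A minimal sketch of that wget-forking idea: fetch a page through the cache so it is stored, follow its HTML links to a limited recursion depth, and skip URLs that look dynamically created. The proxy address, depth limit, and the query-string heuristic for "dynamic" pages are all assumptions for illustration.

```python
# Sketch of an external prefetcher for a caching proxy. Assumptions:
# Squid listens on localhost:3128, recursion depth 2, and URLs with a
# query string or a cgi-bin path count as "dynamically created".
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import urllib.request

PROXY = "http://localhost:3128"   # assumed Squid address
MAX_DEPTH = 2                     # tunable recursion depth


class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_dynamic(url):
    """Heuristic: treat query strings and CGI paths as dynamic pages."""
    parsed = urlparse(url)
    return bool(parsed.query) or "cgi-bin" in parsed.path


def extract_links(html, base_url):
    """Return absolute http(s) links found in an HTML page."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links
            if urljoin(base_url, href).startswith("http")]


def prefetch(url, depth=0, seen=None):
    """Fetch url via the proxy (so it lands in the cache), then recurse."""
    seen = set() if seen is None else seen
    if depth > MAX_DEPTH or url in seen or is_dynamic(url):
        return
    seen.add(url)
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": PROXY, "https": PROXY}))
    with opener.open(url, timeout=10) as resp:
        if "text/html" not in resp.headers.get("Content-Type", ""):
            return  # only crawl HTML pages, as suggested above
        body = resp.read().decode("utf-8", errors="replace")
    for link in extract_links(body, url):
        prefetch(link, depth + 1, seen)
```

Bandwidth tuning is not shown; in practice one would throttle between requests or cap the total bytes fetched per crawl.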

On Tuesday 22 July 2003 18:26, Robert Collins wrote:
> On Wed, 2003-07-23 at 01:45, Chris Wilcox wrote:
> > Hi all,
> >
> > Just had a suggestion about a project I'm working on: can we provide
> > predictive caching? I know it's possible to use cron and wget to
> > schedule downloads of pages to keep them in the cache, but is there any
> > way I can get squid to follow links on pages it downloads so they load
> > even quicker when requested by users? There's nothing in the docs so
> > I'm presuming this is a no? In which case, is anyone aware of ways I
> > can get this type of behaviour to work?
>
> Nothing within squid today, and unlikely unless someone contributes or
> sponsors it.
>
> If you research this, be sure to accommodate:
> handling /large/ objects that the user never requests;
> handling the meta-data headers to ensure the pre-fetched object is
> compatible with the Vary: instructions issued by the server;
> handling the link saturation that will likely be caused;
> dealing with synthetic URLs (such as those created by javascript).
>
> Cheers,
> Rob
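On the Vary: point above: a pre-fetched object is only reusable for a later client if every request header named in the response's Vary line matches between the pre-fetch request and the client's request. A minimal sketch of that check, assuming header names are already lowercased and comparing values literally (real caches refine this with value normalization):

```python
# Hypothetical helper: decide whether a cached (pre-fetched) response is
# compatible with a new client request under the response's Vary header.
def vary_compatible(vary, prefetch_headers, client_headers):
    """Return True if the cached response may be served to the client.

    vary             -- the response's Vary header value, e.g. "Accept-Encoding"
    prefetch_headers -- request headers used when pre-fetching (lowercase keys)
    client_headers   -- the new client's request headers (lowercase keys)
    """
    if vary.strip() == "*":
        return False  # "Vary: *" means the response never matches by headers
    for name in (h.strip().lower() for h in vary.split(",") if h.strip()):
        if prefetch_headers.get(name) != client_headers.get(name):
            return False
    return True
```

Since a prefetcher cannot know what headers future clients will send, a mismatch here means the "warm" cache entry is wasted, which is why this concern matters for any prefetching design.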

-- 
Received on Tue Jul 22 2003 - 18:09:27 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 17:18:14 MST