RE: [squid-users] Squid WWW Pre Cache / Pre Fetch

From: Elsen Marc <elsen@dont-contact.us>
Date: Fri, 27 Aug 2004 10:49:02 +0200

 
>
> Hi All
>
> I would appreciate some help if possible with regard to how one would
> achieve the following:
>
> Consider the situation of a school, where a teacher wants to
> do a lesson
> based on the image and PDF heavy content of a website. To run the
> lesson realtime over a 64kbps line would be killer slow and not
> effective at all.
>
> I understand that squid has some sort of prefetch/precache
> ability.

   It hasn't.

> I
> do not mean the teacher browsing the site the day before and
> hoping that the objects will be cached for the lesson the next
> day. Squid has never worked like that for me? Maybe I am
> configuring it wrong?
>

It's not SQUID's problem. If the provider of that info (the PDF document) gives
adequate freshness info, SQUID will cache it according to those parameters.
Suppose SQUID did pre-fetch the object, but the remote webserver says it is only
'fresh' for an hour or less: SQUID could quickly consume a lot of 'local machine'
and network resources for nothing.

Better is that the remote webserver says the object will not 'expire' for one
week; SQUID will then treat the object accordingly. Problem basically solved.
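To make that concrete: the cache's decision follows from the response headers alone. Here is a minimal sketch (not SQUID's actual code) of how a cache might derive an object's freshness lifetime from the standard Cache-Control and Expires headers; the header values are illustrative.

```python
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds implied by the
    response headers, or 0 if the server gave no freshness info.
    Per HTTP/1.1, Cache-Control: max-age takes precedence over
    Expires."""
    cc = headers.get("Cache-Control", "")
    for part in cc.split(","):
        part = part.strip()
        if part.startswith("max-age="):
            return int(part.split("=", 1)[1])
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))
    return 0  # no freshness info: the cache must revalidate

# A one-week Expires, as suggested above:
hdrs = {
    "Date": "Fri, 27 Aug 2004 10:00:00 GMT",
    "Expires": "Fri, 03 Sep 2004 10:00:00 GMT",
}
print(freshness_lifetime(hdrs))  # 604800 seconds = one week
```

A server that sends headers like these keeps the object servable from the school's cache for the whole week, with no pre-fetching needed.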

>
> So I suppose I'm looking for something like a combination
> of an http offline downloader and a www proxy.

  SQUID only does the www proxy part...
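The offline-downloader half can be bolted on separately, though: any client that walks the relevant URLs through the proxy will populate the cache as a side effect (subject, of course, to the freshness rules above). A minimal sketch, where the URL list and the proxy configuration (via the usual http_proxy environment variable) are assumptions:

```python
import urllib.request

def warm_cache(urls, fetch=None):
    """Fetch each URL (normally through the proxy configured in
    the http_proxy environment variable) so that cacheable
    responses land in the cache on the way past. Returns the
    number of URLs successfully fetched."""
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url).read()
    fetched = 0
    for url in urls:
        try:
            fetch(url)
            fetched += 1
        except OSError:
            pass  # skip unreachable objects, keep going
    return fetched

# e.g. warm_cache(open("lesson_urls.txt").read().split())
```

Run the evening before the lesson, over the slow link, this gives the teacher-browses-the-site-in-advance effect deliberately rather than by hope.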

>
> As another example, say an ISP wanted to ensure that as much
> content as
> possible from olympics.org was pre cached before the event actually
> started, how would they achieve this?
>

They would be fools : I would assert that the amount of info changed
daily on that site may well be bigger than the size in MB of the original
site itself.
The time invested in such actions would be futile, as the static copy
would immediately lose its meaning 'the day after'.
Unless the ISP has tachyonic abilities and knows the result of the 100m
before the event has taken place (!)

Let the webmaster of olympics.org tell the world what is long-standing info
and what is not. This is how webcaches work in combination with webservers.

Further related links :

 http://www.mnot.net/cache_docs/

 http://www.ircache.net/cgi-bin/cacheability.py

The moral of the story : http is client (browser) driven. It is the provider
of the info (the remote webserver) who bears the responsibility of providing
adequate freshness info, so that caches can act and work accordingly.
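(For completeness: when a server sends no freshness info at all, SQUID's
refresh_pattern directive lets the local admin apply a heuristic lifetime
instead. A sketch of such a squid.conf fragment; the pattern and the
min/percent/max values below are illustrative only, not a recommendation:)

```
# regex           min(min) percent max(min)
refresh_pattern -i \.pdf$  1440    50%     10080
refresh_pattern .          0       20%     4320
```

This only papers over missing headers on the cache side; it does not change
the point that the webserver is the right place to fix freshness.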

M.
Received on Fri Aug 27 2004 - 02:50:33 MDT

This archive was generated by hypermail pre-2.1.9 : Wed Sep 01 2004 - 12:00:02 MDT