Re: Cache in multiple disks and other (more or less) related stuff

From: Fernando Schapachnik <fpscha@dont-contact.us>
Date: Thu, 18 Sep 97 9:24:08 GMT

I want to thank all the people listed here, and many others, who answered my
questions and shed some light on my ignorance :).

Below is a summary of information I consider relevant.

Kind regards,

Fernando P. Schapachnik

I wrote:
>
> (Please reply to me and I'll summarize, because I'm not on the list.)
>
> Hello:
> I'd like to know how to tell Squid to use X Mbs of space in, say,
> /usr/cache1, and Y Mbs in /usr/cache2 (another disk), where the available
> space in, for example, /usr/cache1 is bigger than X Mbs. In other words:
> I want Squid to use two disks, but I'd like to control how many Mbs in
> each. Do any of you know how to achieve this?
>
> I was also wondering how many connections Squid opens to
> fetch a page and its images. It first gets the HTML, then the images one
> after the other, as opposed to the browser, which opens many connections
> simultaneously, right?
>
> Thanks in advance for your help.
>
>
> Fernando P. Schapachnik
> S&M Internet
>
>

Frank Smith <fsmith@spec.com> replied:

You can't do what you want, directly, as the cache size is a global
variable. To accomplish it, you need to set cache size to the greatest
common factor of your desired sizes and make multiple cache dirs on
each partition so that you use the correct total amount of space.
   This is easy if you want something like 500 meg in cache1 and
1 gig in cache2 - just set max cache size to 500 meg and make one
cache dir in /usr/cache1 and two cache dirs in /usr/cache2. It becomes
much more tedious if you want 300, 500, and 700 meg dirs. Then you can
either make a lot of 100 meg directories, or use 150 or 200 meg dirs
and not use all the space you want.
   I think there is a max number of dirs you can have, but I don't recall
what it is.
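
The arithmetic Frank describes can be sketched roughly as follows. This is
only an illustration: plan_cache_dirs is a made-up helper name, the /a, /b
and /c paths are placeholders, and "1 gig" is taken as 1000 meg. It does not
read or write any Squid configuration; it just computes the greatest common
factor and the number of cache dirs needed on each partition.

```python
from functools import reduce
from math import gcd


def plan_cache_dirs(desired_mb):
    """Return (global cache size in MB, {partition: number of cache dirs}).

    desired_mb maps a partition path to the space wanted there, in MB.
    Hypothetical helper: it only does the arithmetic, it does not touch Squid.
    """
    # The global cache size is the greatest common factor of all wanted sizes.
    unit = reduce(gcd, desired_mb.values())
    # Each partition then needs (wanted size / unit) cache dirs.
    return unit, {path: size // unit for path, size in desired_mb.items()}


# The easy case from the reply: 500 meg and 1 gig (assumed to be 1000 meg).
print(plan_cache_dirs({"/usr/cache1": 500, "/usr/cache2": 1000}))
# The tedious case: 300, 500 and 700 meg (paths are placeholders).
print(plan_cache_dirs({"/a": 300, "/b": 500, "/c": 700}))
```

In the second case the common factor is 100 meg, which is why it ends up as
"a lot of 100 meg directories" (3 + 5 + 7 of them).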

Oskar Pearson <oscar@is.co.za> added:

Basically the problem is as follows:

When Squid gets an object (such as an HTML page), it chooses a random
point on the disk to store it (i.e. it could be in any one of the low level
directories, and in any of the ones further down). Since this is the case,
Squid would have to check the size of each of the directories before
it wrote the object... which is a problem...

> space in, for example, /usr/cache1 is bigger than X Mbs. In other words:
> I want Squid to use two disks, but I'd like to control how many Mbs in
> each. Do any of you know how to achieve this?
Use the RAID 0 stuff; it can handle disks of any size (under Linux).

Henrik Nordstrom <hno@hem.passagen.se> said, regarding the number of
connections Squid opens:

Squid does not care whether it fetches a page or an image; it is all
handled the same. It is up to the browser to detect that a page contains
images and send a request to Squid for those images, which in turn
fetches the images from the server. If your browser uses 30
simultaneous connections, then Squid will do the same (unless the
images are already cached).

I asked Bill Wichers <billw@unix0.waveform.net>:

> Do you know if popular browsers like Netscape or Internet Explorer open
> multiple connections when using a proxy?
> Two more questions: what happens when the user presses "reload", does
> squid "reload" the page? And, when, for example, the link is down, does
> Squid check the net for newer versions of the object?

And he replied:

From observation, Netscape will open up to about 4 connections at once. I
don't know how many MSIE will open at once, but I'm sure it's more than
one. Netscape used to allow the user to adjust the number of simultaneous
connections, but the new 4.x releases are hard coded to (if I remember
right) 4 connections.

Just clicking "Reload" will make Squid do an IMS (If Modified Since)
request for an object. If a new version of the object is on the server,
then squid will fetch it. If the version on the server is not newer than
the cached version then squid just returns the cached version.

Holding <shift> while clicking "Reload" sends a "Pragma: no-cache" reload,
which forces squid to fetch the object from the originating server whether
it has a fresh cached copy or not.
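
As a toy model of the two reload behaviours Bill describes (my own
simplification, not Squid's actual logic; handle_reload and its parameters
are invented names, and modification times are plain numbers):

```python
def handle_reload(cached_mtime, server_mtime, no_cache=False):
    """Toy model of plain Reload versus shift-Reload.

    cached_mtime / server_mtime: last-modified times as comparable numbers.
    no_cache: True for shift-Reload (Pragma: no-cache), False for plain Reload.
    Returns which copy the client ends up with: "cached" or "fetched".
    """
    if no_cache:
        # Forced refetch, regardless of how fresh the cached copy is.
        return "fetched"
    # Plain Reload: If-Modified-Since check against the origin server.
    if server_mtime > cached_mtime:
        return "fetched"
    return "cached"


print(handle_reload(cached_mtime=100, server_mtime=100))
print(handle_reload(cached_mtime=100, server_mtime=200))
print(handle_reload(cached_mtime=100, server_mtime=100, no_cache=True))
```

Only the first call is answered from the cache; the other two end with a
fetch from the originating server.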

If Squid can't get to any other caches or the originating servers, then it
generally times out attempts to fetch objects. I have found that it does
return fresh cached copies when it has them, though.

I asked Henrik Nordström what he meant when he said "if the browser
opens multiple connections, Squid will do the same":

> Of course, when you say that "Squid will do the same" you
> mean it will put the requests in its queue and serve them one
> after the other, because it uses just one thread, don't you?

No. Squid handles a large number of simultaneous requests (several
hundred on almost all platforms, thousands on some), even though it is
not a "multi-threaded" application. Squid uses non-blocking I/O and a
big state machine built around select() to achieve the same goal as
multi-threading: to handle more than one thing at the "same" time. The
only big drawback of not being truly multi-threaded is that it does
not scale across multiple CPUs.
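
To make the select() idea concrete, here is a minimal sketch of one thread
serving several connections at once. It is in no way Squid's code:
serve_all and demo are invented names, socket pairs stand in for real
client connections, and the "state machine" is reduced to a set of
connections still waiting to be served.

```python
import select
import socket


def serve_all(server_ends):
    """Single-threaded loop: echo one upper-cased message per connection.

    server_ends is a list of connected, non-blocking sockets. The loop never
    waits on one particular socket; select() reports which are readable.
    """
    pending = set(server_ends)
    served = 0
    while pending:
        readable, _, _ = select.select(list(pending), [], [], 1.0)
        if not readable:
            break  # nothing arrived within the timeout; give up
        for sock in readable:
            data = sock.recv(1024)
            if data:
                sock.sendall(data.upper())
                served += 1
            pending.discard(sock)  # this connection is done
    return served


def demo(n=5):
    """Open n socket pairs, send one request on each, serve them all."""
    pairs = [socket.socketpair() for _ in range(n)]
    for i, (client, server) in enumerate(pairs):
        server.setblocking(False)
        client.sendall(f"request {i}".encode())
    served = serve_all([server for _, server in pairs])
    replies = [client.recv(1024).decode() for client, _ in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return served, replies


if __name__ == "__main__":
    print(demo(5))
```

All the requests are outstanding at the same time, and the single loop
answers each one as soon as select() says its socket has data.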

Fernando P. Schapachnik
S&M Internet
Received on Thu Sep 18 1997 - 05:58:26 MDT
