[squid-users] Re: cache_dir size v.s. available RAM

From: HillTopsGM <emailgregagain_at_gmail.com>
Date: Sun, 25 Aug 2013 08:38:39 -0700 (PDT)

>> In connection with my last post, I also had this question:
>>
>> Let's say that with my 4GB of RAM I decided to create a total cache
>> storage area that was 650GB; obviously the index would be much larger
>> than could be stored in RAM.
>>
>> If my primary purpose was to 'archive' my windows updates, I'd expect
>> that it would take the system only a couple of seconds to review the
>> index that would spill over to the drive, and then we'd be back in
>> business for the updates - no?
>
> Sort of. This "couple of seconds delay" would happen on *every* HTTP
> request to the proxy.

Hmmmm, I never thought of that, Amos. I guess that makes sense: how would
the system know whether that URL is cached if it didn't check the index? I
guess that is why we wouldn't want the index to spill out of RAM and onto
the disk.

>> I simply want the Proxy to help serve updates of all programs - Windows,
>> browser updates like Firefox, Thunderbird, Adobe Reader, Skype, nVidia
>> driver updates (100's of MB at a crack), etc, etc.
>>
>> I was thinking of creating a rule (maybe someone could help me write it
>> so it makes sense) that all sites would be accessed directly and told
>> NOT TO BE cached.
>
> You seem to have the common misunderstanding about what DIRECT is. HTTP
> permits an arbitrarily long chain of proxies:
>
> client->A->B->C->D->E->F->..... -> origin server
>
> always_direct causes Squid to ignore any cache_peer which you have
> configured and use a DNS lookup to fetch the object DIRECT-ly from the
> origin, giving an error if the DNS produces no results or is not working.
>
> never_direct does the opposite: it forces Squid to ignore DNS for the
> domain being requested and just send to a cache_peer, giving an error if
> the cache_peers are unavailable.

If I had to choose between "always_direct" & "never_direct", I think I'd go
with "*always_direct*".
The error would only happen if the site was down or there was a DNS issue
(as you say), and I'd get those errors in that case regardless of whether
or not I was using a proxy.
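So in config terms, I think the difference would look something like this
(the parent hostname here is just a placeholder, not a real peer):

cache_peer parent.example.com parent 3128 0 no-query

never_direct allow all    # everything must go through the cache_peer above
# - versus -
always_direct allow all   # ignore the cache_peer; resolve DNS and go straight to the origin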

Did I understand this correctly?

=================
=================
Ok, let me try this idea out on you. What if I did this:

As I only have 4GB of RAM, I create two cache directories of 100GB each,
like so:

*cache_dir ufs /var/spool/squid3_cache-1 102400 16 256
cache_dir ufs /var/spool/squid3_cache-2 102400 16 256*

The index should never spill over onto the disk, and so all requests should
still be processed as quickly as possible.
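My back-of-the-envelope math, assuming the rule of thumb I've seen quoted of
roughly 10-14 MB of index RAM per GB of cache_dir holds (the cache_mem value
below is just my guess):

# 2 x 100 GB of cache_dir  =>  roughly 2-3 GB of index held in RAM
# which leaves about 1-2 GB of the 4 GB for the OS, cache_mem, and in-transit objects
cache_mem 256 MB
cache_dir ufs /var/spool/squid3_cache-1 102400 16 256
cache_dir ufs /var/spool/squid3_cache-2 102400 16 256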

As I simply want the Proxy to help serve updates of all programs - Windows,
browser updates like Firefox, Thunderbird, Adobe Reader, Skype, nVidia
driver updates (100's of MB at a crack), etc, etc. - I was thinking of
creating a rule (maybe someone could help me write it so it makes sense)
that all sites would be accessed directly and told NOT TO BE cached. This
would help make sure all the updates I don't want to lose stay in the cache
as long as possible.

For Example:

*STEP 1/4*
acl noproxy dstdomain .com .net .org    <==== etc, etc.
*Would that work (would that be the way to 'wildcard' those domains)?*

always_direct allow noproxy
cache deny noproxy
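(From what I've read of the dstdomain ACL, a leading dot matches the domain
itself plus any subdomain, so I *think* that is the way to wildcard them:

acl noproxy dstdomain .com   # would match example.com, www.example.com, updates.example.com, ...

but someone please correct me if not.)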

*STEP 2/4:*
Then, for each particular site I want cached (like the Windows update
sites), create rules like this:

cache allow windowsupdate

NOTE: I chose 'windowsupdate' as that is what was used for the ACL rules on
the FAQ page here >> http://wiki.squid-cache.org/SquidFaq/WindowsUpdate
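One thing I realize while writing this out: I believe Squid checks the
'cache' rules in order and stops at the first match, so this allow would
have to come *before* the broad deny from STEP 1:

cache allow windowsupdate    # matched first, so the update sites stay cacheable
cache deny noproxy           # everything else (.com .net .org) is not cached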

*STEP 3/4:* Next, I was thinking that I'd have to add ACLs for these:
acl windowsupdate dstdomain microsoft.com
acl windowsupdate dstdomain windowsupdate.com
acl windowsupdate dstdomain my.windowsupdate.website.com

. . . as I see that those domains are part of the refresh rules for the
Windows updates, but not the ACLs.
To be on the safe side, I thought I'd add them to make sure anything that
comes from there would be cached as per the
*cache allow windowsupdate*
rule.
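(One thing I'm unsure of: as I understand dstdomain, without a leading dot
those entries match only that exact host, so to catch the subdomains too
I'd probably write them as:

acl windowsupdate dstdomain .microsoft.com
acl windowsupdate dstdomain .windowsupdate.com

Is that right?)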

I am still wondering why they *WERE NOT* included in the group of ACLs
listed on that page.
Any comments on that?

*STEP 4/4:* Lastly, all I'd have to do is add ACLs for the sites I want
cached in addition to the Windows updates, like so:

acl windowsupdate dstdomain .mozilla.org
acl windowsupdate dstdomain .adobe.com
acl windowsupdate dstdomain .java.com
acl windowsupdate dstdomain .nvidia.com
etc, etc, etc,

. . . Would THIS setup not cache ONLY the sites that I list, and by doing
so ensure that all the Windows updates never get overwritten by the 'daily
caching' of stuff I don't care about?

Does that make sense?
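Putting all four steps together, in the order I think they'd have to appear
in squid.conf (given the first-match behaviour I mentioned in STEP 2, and
with the domains written with leading dots per my question in STEP 3):

acl noproxy dstdomain .com .net .org
acl windowsupdate dstdomain .microsoft.com
acl windowsupdate dstdomain .windowsupdate.com
acl windowsupdate dstdomain .mozilla.org
acl windowsupdate dstdomain .adobe.com
acl windowsupdate dstdomain .java.com
acl windowsupdate dstdomain .nvidia.com

always_direct allow noproxy   # go straight to the origin (I have no cache_peer configured)

cache allow windowsupdate     # cache the update sites...
cache deny noproxy            # ...but nothing else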

*Last Question for this post:*

Is there a way to tell the system to 'dump/overwrite' the "daily caching"
of objects yet give priority to keeping the Windows updates longer - other
than the 4 steps I listed at the beginning of this post?
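(The closest knobs I've come across myself are the cache replacement policy
and refresh_pattern - a rough sketch of what I mean, with the pattern
loosely adapted from that WindowsUpdate FAQ page:

# heap LFUDA favours frequently-hit objects (like update files) over one-shot daily traffic
# (note: this line applies to the cache_dir lines that come after it)
cache_replacement_policy heap LFUDA

# keep update files fresh for a long time: 4320 min = 3 days, 43200 min = 30 days
refresh_pattern -i windowsupdate.com/.*\.(cab|exe|msi|msu|zip) 4320 80% 43200 reload-into-ims

. . . but I don't know if that is the intended way to do it.)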

--
View this message in context: http://squid-web-proxy-cache.1019090.n4.nabble.com/cache-dir-size-v-s-available-RAM-tp4661705p4661762.html
Sent from the Squid - Users mailing list archive at Nabble.com.