Re: squid disk cache

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 09 Oct 2011 18:28:34 +1300

On 09/10/11 16:46, pavi wrote:
> Moving cached objects inside the cluster is achieved successfully by using
> FTP. The problem is how to change the meta data of squid disk cache,

Exactly like Henrik said. HTTP is the easy way. Moving data files
externally to Squid will only work on UFS (Unix File System) caches and
you already found that the index is not altered by simply moving the
disk file.
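
A minimal sketch of what the HTTP route could look like, assuming the goal is simply to get a known list of URLs into the destination Squid: request each URL through that proxy and let Squid store and index the object itself. The proxy address and the helper names are hypothetical, and whether an object is actually stored still depends on its cacheability and on your squid.conf.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Return an opener that sends every http:// request through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def warm_cache(urls, proxy_url, timeout=30):
    """Fetch each URL through the destination Squid so it stores the object itself.

    Returns a dict mapping each URL to an HTTP status code, or to the
    exception raised while fetching it.
    """
    opener = make_proxy_opener(proxy_url)
    results = {}
    for url in urls:
        try:
            with opener.open(url, timeout=timeout) as resp:
                resp.read()  # drain the body so the proxy receives the whole object
                results[url] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[url] = exc
    return results
```

Something like warm_cache(list_of_urls, "http://new-proxy.example:3128") would then be run against the destination box; no disk files or index entries are touched by hand.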

  The disk files are a minor part of the system. Squid itself is the
other part.

>
> As for the understanding, there should be a single reference file to match
> cached objects, requested URL and/or originating server URL. this file
> should be the primary contact for users who connect to squid proxy server to
> access internet. then we can change its entries after moving objects among
> the servers.
> *If this type of file exists,
> 1. What is it ?
> 2. Where it is located ?
> 3. How to understand, or change that file ?
>
> if not, are there any global reference that directly access by user requests
> ? *

Why do you require there must be only one?

You need to avoid the concept of "file" when thinking about caches. HTTP
has no such concept. Squid only uses "files" when it has to interact
with a "filesystem" disk service (aka UFS).

There are several tree-structured collections in RAM. The UFS disk
cache_dir index for the one you altered contributes one or two nodes to
the URL index hash/tree per disk file.
  http://wiki.squid-cache.org/SquidFaq/SquidMemory
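
To illustrate why a moved disk file is invisible to Squid, here is a toy model of a URL index. It is not Squid's real data structure or key derivation (the class, the SHA-256 key and the file numbers are all assumptions for illustration); it only shows that lookups go through the in-RAM index and never through the filesystem.

```python
import hashlib


class ToyCacheIndex:
    """Toy in-RAM index: request key -> swap file number."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def key_for(method, url):
        # Hypothetical key: a digest over method and URL.
        return hashlib.sha256(f"{method} {url}".encode("utf-8")).hexdigest()

    def add(self, method, url, file_number):
        """Record that the object for (method, url) lives in file_number."""
        self._entries[self.key_for(method, url)] = file_number

    def lookup(self, method, url):
        """Return the file number for a HIT, or None for a MISS."""
        return self._entries.get(self.key_for(method, url))


index = ToyCacheIndex()
# A file copied into the cache directory by FTP adds nothing here,
# so the request is still a MISS until the index learns about it.
miss = index.lookup("GET", "http://example.com/a")
index.add("GET", "http://example.com/a", 0x1A)
hit = index.lookup("GET", "http://example.com/a")
```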

There is a swap.state journal which preserves changes to the index. It is
loaded on startup, then erased and replaced on shutdown. If it is missing,
wrong, or corrupted it gets discarded and Squid begins a (DIRTY) re-scan of
the entire cache on next restart, file by file. This process will pick
up the meta data inside the moved object file, add it to the in-RAM
index and record it to swap.state.
  For UFS this requires a full restart of Squid and can take minutes,
hours, or even days, during which traffic cannot be served from that cache.
  If you want to work on a tool which does that scan outside of Squid
and rebuilds a clean and valid swap-state journal for loading by Squid
on next startup it would be welcomed. But it does not appear to be a
reasonable approach for real-time alterations and mirroring.
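
As a starting point for such a tool, the first step (enumerating the object files) might be sketched as below. This assumes the usual UFS layout of object files named with 8 hex digits inside nested subdirectories of the cache_dir; the function name is hypothetical. It deliberately does not parse the meta data inside each file or write swap.state, which are the hard parts.

```python
import os
import re

# Assumption: UFS object files are named with exactly 8 hex digits.
HEX_NAME = re.compile(r"[0-9A-Fa-f]{8}")


def scan_cache_dir(root):
    """Yield (file_number, path) for each candidate object file under root.

    Anything not named like an object file (swap.state, for instance)
    is skipped.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if HEX_NAME.fullmatch(name):
                yield int(name, 16), os.path.join(dirpath, name)
```

Even this walk touches every file in the cache, which gives a feel for why the DIRTY rebuild is slow on large caches.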

Some people do mirror whole caches between Squid instances. But this is only
safely done at present when both are shut down. For example when
replacing old hardware under Squid and wanting to keep the cache
relatively current when it starts up again on the new box. This shares a
lot with the mirroring of virtual machines between hardware.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.15
   Beta testers wanted for 3.2.0.12
Received on Sun Oct 09 2011 - 05:28:42 MDT
