Re[2]: Do not make a compatible squid file system

From: wang_daqing <wang_daqing@dont-contact.us>
Date: Thu, 3 Feb 2000 12:53:03 +0800

Hello Henrik,

Henrik wrote:
HN> Perhaps. I haven't looked at the results yet (didn't know the results
HN> were available yet..)

Sorry, my mistake. I took the first bake-off results to be this year's
bake-off results.

HN> A bit. However, to turn Squid into a really high performance proxy more
HN> changes than only the object store are required. There are also very
HN> obvious bottlenecks in the networking and data forwarding.

I don't know much about the bottlenecks in the networking and data
forwarding, but I think the disk is always the major bottleneck in an
I/O-intensive application like a cache.

HN> I think you may have misunderstood Duane there.

HN> Squid will continue to have a file based object store as one option in
HN> the foreseeable future, but there will be other options not relying on
HN> having a filesystem or directory structures.

Wessels said: "However, we are actively working on a filesystem
API for Squid that will allow us to experiment with new types of storage
for Squid (e.g. "SquidFS"), while still remaining compatible with the Unix
filesystem."

My English is not very good and I was not careful. I read it as saying
"SquidFS" would still remain compatible..., but that is wrong. Now I see
he said "a filesystem API" would still remain compatible...

Just as Kevin Littlejohn told me:
KL> Duane, Henrik, and others have taken
KL> great pains to create an abstraction layer between squid and the filesystem
KL> that allows support for both a standard *nix filesystem, and for these
KL> new options.

But I do not agree with this either. If you create a new system that is
not like other filesystems, why would you want to operate on it the same
way? I suggest abstracting at the object-store layer, not at the
filesystem layer.
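
For example, an object-store-level interface could look something like
this (only a sketch to show what I mean; these names are not Squid's
real store API):

#include <stddef.h>

/* Illustrative only: an object store addressed by hash key, not by path. */
typedef struct {
    unsigned char key[16];              /* e.g. the MD5 hash of the URL */
} store_key_t;

typedef struct object_store object_store_t;   /* opaque backend handle */

/* Each backend (UFS directories, raw partition, "SquidFS", ...) would
 * supply its own implementation of these operations. */
typedef struct {
    int (*open_read)(object_store_t *store, const store_key_t *key);
    int (*create)(object_store_t *store, const store_key_t *key,
                  size_t size_hint);
    int (*remove)(object_store_t *store, const store_key_t *key);
} object_store_ops_t;

The point is that no caller above this layer ever builds a pathname; the
key is the only name an object has.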

HN> What makes you say that it isn't similar to any existing filesystem? In
HN> the lower layers of the UFS family of filesystems the file is named by
HN> its inode which is an abstract number. The directories and filenames are
HN> a way to index the inodes.

First, I mean there is no filename, just the hash key, so you cannot
create a file or directory by name or path; you can only create or open
an object by its hash key. And the hash key is not the same as an inode
number or similar, because it is not a sequential index. Second, I mean
there is no separate inode, because you do not access the file randomly.
So if a node is needed when the file is stored in pieces, store it
together with the file.
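
Roughly, the on-disk layout I have in mind is something like this (the
field names and sizes are only examples, not an existing format):

/* Illustrative only: no separate inode; if the object does not fit in
 * one piece, a small node table stored right before the data points to
 * the remaining pieces. */
struct disk_object_header {
    unsigned char hash_key[16];   /* the only "name" the object has */
    unsigned int  total_size;     /* object size in bytes */
    unsigned char contiguous;     /* 1 = stored in a single piece */
    unsigned char piece_count;    /* only used when contiguous == 0 */
    /* when contiguous == 0, piece_count entries follow:
     *   struct { unsigned int block; unsigned int length; } pieces[];
     * and then the object data itself */
};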

>> There will be fewer pieces on disk. You don't need an i-node for
>> files, just a pointer to the data and a flag (indicating whether it
>> is stored in one piece); if it is stored in several pieces, add a
>> node table just before the file (cache object) to point to the rest
>> of the pieces.

HN> As you say you will need some kind of file node, for consistency
HN> validation reasons if nothing else.

I don't know what you are thinking or how you feel about inodes. I think
a bit FAT (one bit per disk block) is enough for fsck.
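
Something like this is what I mean by a bit FAT (again only a sketch,
not real Squid code):

#include <stdlib.h>

/* Illustrative only: one bit per disk block, 1 = in use.  This is all
 * the on-disk allocation state an fsck-style check would need. */
static unsigned char *block_map;
static unsigned long  n_blocks;

static int map_init(unsigned long blocks)
{
    n_blocks = blocks;
    block_map = calloc((blocks + 7) / 8, 1);
    return block_map ? 0 : -1;
}

static void map_set(unsigned long blk, int used)
{
    if (used)
        block_map[blk / 8] |= (unsigned char)(1 << (blk % 8));
    else
        block_map[blk / 8] &= (unsigned char)~(1 << (blk % 8));
}

static int map_get(unsigned long blk)
{
    return (block_map[blk / 8] >> (blk % 8)) & 1;
}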

HN> I have a large can of ideas on how to make a cache object store. The
HN> cyclic one is only one of the designs. In its purest circular form it
HN> would require a lot of disk to maintain a good hit ratio, however with
HN> some minor modifications and wasting some disk space you can still
HN> preserve most of the properties of LRU, which should make it more
HN> economic in terms of disk. The disk storage is cyclic, but object
HN> lifetime more like LRU.

I have not had time to read all the discussion of the cyclic design (it
is difficult for me). But I guess it wastes disk, i.e. money. I guess the
current problem for a backbone cache is not having enough disk, or not
having enough memory to match the disks. For the same money, a hierarchy
(using PCs) may get better performance.

My suggestion aims for fewer disk reads (one read including validation)
and less memory usage (no separate metadata, directories or inodes, and
compressed as far as possible). If the cyclic design has benefits, it
need not conflict with this; it is just an allocation policy. You can use
a weight to decide between allocating an object in a single piece where
possible and allocating new objects next to each other where possible,
and strike a balance between writes and reads.
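
As a sketch of what I mean by such a weight (the names and the policy
itself are only an example, nothing that exists in Squid):

/* Illustrative only: prefer one contiguous extent for an object (cheaper
 * reads later), or fall back to appending at the current write pointer
 * (cheaper writes now), depending on a tunable weight. */
struct alloc_state {
    const unsigned char *map;    /* one bit per block, 1 = in use */
    unsigned long nblocks;
    unsigned long write_ptr;     /* next block for append-style writes */
};

static int block_used(const struct alloc_state *a, unsigned long b)
{
    return (a->map[b / 8] >> (b % 8)) & 1;
}

/* First block of a free contiguous run of `need` blocks, or -1 if none. */
static long find_extent(const struct alloc_state *a, unsigned long need)
{
    unsigned long run = 0, b;
    for (b = 0; b < a->nblocks; b++) {
        run = block_used(a, b) ? 0 : run + 1;
        if (run == need)
            return (long)(b - need + 1);
    }
    return -1;
}

static long place_object(struct alloc_state *a, unsigned long need,
                         double weight)  /* 0 = favour writes, 1 = favour reads */
{
    long blk = -1;
    if (weight >= 0.5)
        blk = find_extent(a, need);
    if (blk < 0) {                        /* append, possibly in pieces */
        blk = (long)a->write_ptr;
        a->write_ptr = (a->write_ptr + need) % a->nblocks;
    }
    return blk;
}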

HN> A quick summary of the storage ideas buzzing around in my head:

HN> * Cyclic storage with LRU modifications
HN> * Chunked storage
HN> * Chunked cyclic storage
HN> * To look into the ideas of purely log based filesystems
HN> * On-disk hash based indexing without a memory index (i.e. like what you
HN> proposed)

That is just what Novell BorderManager does.

HN> * and a couple of other things

HN> We could sure make use of more programmers or designers in this area.
HN> You are more than welcome to join the work of making a better disk
HN> storage system if you want to.

I would be very glad to join your work, although my English is not very
good and maybe I cannot write even a line of code for you. Tell me what
to do.

The part below is what I know about Novell BorderManager:

In 1997 I set up a Novell BorderManager 2 FastCache server on an old
Compaq 486 machine. (Now I am using Squid, because BorderManager 2 is
not flexible to configure.) I don't know BorderManager 3 or ICS.

In BorderManager 2 they also use a directory tree to store cache
objects. But unlike Squid, the filename is the hash key, not a serial
number. The main difference is that there is no file like Squid's
swap.state containing all the cached objects' metadata. So I think the
process uses the hash key to determine which directory, and which file,
to visit.
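
So I imagine the lookup is roughly like this (the fan-out and naming
here are my guesses, not BorderManager's actual layout):

#include <stdio.h>

#define L1_DIRS 256          /* first-level directory fan-out (a guess) */
#define L2_DIRS 256          /* second-level directory fan-out (a guess) */

/* Illustrative only: the hash key alone selects the directory and the
 * file name, so no swap.state-style index is needed in memory. */
static void key_to_path(const unsigned char key[16], char *buf, size_t buflen)
{
    unsigned int d1 = key[0] % L1_DIRS;
    unsigned int d2 = key[1] % L2_DIRS;
    int i, n;

    n = snprintf(buf, buflen, "/cache/%02X/%02X/", d1, d2);
    for (i = 0; i < 16 && n < (int)buflen - 2; i++)
        n += snprintf(buf + n, buflen - n, "%02X", key[i]);
}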

The advantage of this structure is that you don't need much memory to
run. Of course, if there is not enough memory to cache all the
directories, performance will suffer. Why doesn't Squid use a structure
like this? I think the issue is that NetWare can easily adjust how much
memory is used for the directory cache and how much for everything else,
but on UNIX this is difficult or impossible (I don't know). If the file
cache takes more memory than the directories, you cannot get better
performance at all, because the chance of reusing a file is very low
while the chance of reusing a directory is high.

I think Novell's way is better if you can control the filesystem. You
don't have to choose all-or-nothing when the disk is big but memory is
scarce. Another part of my suggestion, storing the metadata in the
directory entries, is exactly for this. (Just make them the same, and
eliminate fields that are useless to Squid, such as attributes, ACLs and
file dates, to save directory-cache memory.)
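
For example, a directory entry could be cut down to roughly this (the
fields are only an example):

/* Illustrative only: keep just the metadata the cache needs in each
 * directory entry, so the directory cache holds far more entries per
 * megabyte of memory.  No owner, attributes, ACL or file dates. */
struct cache_dirent {
    unsigned char hash_key[16];   /* replaces the file name */
    unsigned int  first_block;    /* where the object starts on disk */
    unsigned int  object_size;    /* size in bytes */
    unsigned int  timestamp;      /* e.g. expiry / validation time */
};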

Best regards,
 Wang_daqing mailto:wang_daqing@163.net
Received on Wed Feb 02 2000 - 21:53:58 MST
