Re: The store interface

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sun, 13 Jun 1999 21:54:31 +0000

Alex Rousskov wrote:

> OK, based on your definition, swapfileno does not belong to StoreEntry's
> main part at all and should be moved to FS-specific part (fs-union). Some
> file systems may not have any directories and/or internal swap file names.
> Some may need more space than a single integer to represent those two.

There is no FS-specific part in StoreEntry (not in the versions I have
seen anyway). I think you are confusing things with storeIOState, which
is an internal filehandle to an on-disk object (much like how FILE is an
internal filehandle for stdio functions). storeIOState has a union with
FS-specific parts ONLY used by the FS-specific code to maintain internal
state while reading/writing the FS object. Actually the whole
storeIOState is private to the fs code, but some properties (and basic
functionality) are shared by all implementations.

sfileno is very different from this. It is the index key to the on-disk
object. The StoreEntry manager needs to be aware of this, as it is the
"name" used when accessing on-disk objects by calls to the FS-dependent
code.

The only thing I am objecting to is how and when sfileno is assigned.
Currently the high level StoreEntry manager assigns this using a
file_map bitmap for each cache_dir, but in my opinion the responsibility
for finding the FS-dependent name of an object lies with the FS-dependent
code and not with the generic StoreEntry manager. The main goal of this
objection is to limit the number of object directories (object name
directories/indexes, not filesystem directories) needed in Squid to one,
namely the StoreEntry directory. I don't see the need in maintaining two
levels of object directories, one for cache_key and one for sfileno.

I think it is reasonable to expect that the FS-dependent code should be
able to give each stored object a unique integer and find some method to
map this to an on-disk location (filename for UFS, disk block for a
block-based filesystem like COSS, something else for another kind of
filesystem).
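
To make the idea concrete, here is a rough sketch of two such mappings.
The function names, the two-level directory layout and the block
arithmetic are assumptions made for illustration only; they are not
taken from the Squid sources.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical UFS-style mapping: file number -> path in a two-level
 * directory tree with l1 first-level and l2 second-level directories.
 * Returns what snprintf() returns. */
static int
ufs_path(char *buf, size_t len, const char *root,
    unsigned l1, unsigned l2, unsigned filn)
{
    return snprintf(buf, len, "%s/%02X/%02X/%08X",
        root, (filn / l2 / l2) % l1, (filn / l2) % l2, filn);
}

/* Hypothetical block-based mapping: the file number is taken to be the
 * index of the object's first block, so no lookup is needed at all. */
static long long
block_offset(unsigned filn, unsigned block_size)
{
    return (long long) filn * block_size;
}
```

In both cases the integer alone is enough to locate the object; only
the FS-dependent code knows which of the two interpretations applies.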

To do this cleanly sfileno should be separated into its two components:
cache_dir and filenumber. The high level code selects the cache_dir
component, and the low level FS-dependent code selects the filenumber
component.
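
As a sketch of what the separation could look like: the struct name and
the bit layout below (cache_dir index in the high 8 bits, file number
in the low 24 bits) are assumptions chosen for the example, not a
statement about the current code.

```c
#include <stdint.h>

/* Assumed packing, for illustration only. */
#define DIR_SHIFT 24
#define FILE_MASK 0x00FFFFFFu

/* Hypothetical replacement for a single packed sfileno. */
typedef struct {
    int dirn;       /* selected by the high level StoreEntry manager */
    uint32_t filn;  /* selected by the low level FS-dependent code */
} swap_location;

/* Split a packed value into its two independent components. */
static swap_location
split_sfileno(uint32_t sfileno)
{
    swap_location loc;
    loc.dirn = (int) (sfileno >> DIR_SHIFT);
    loc.filn = sfileno & FILE_MASK;
    return loc;
}
```

Once the two fields are stored separately, the generic code never has
to interpret filn, and the FS-dependent code never has to look at dirn.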

Oh, yes there is yet one more change I feel is needed to the store-io
interface. The "methods" should know which StoreDir they are being called
on, preferably by being passed a reference to it ("this" when using
C++).

Let me repeat and summarize what I have been trying to say:

The filenumber component of swap_file_number should be assigned by the
FS-dependent code, to allow it to make use of the StoreEntry directory
as the directory structure. The FS-dependent code should not be required
to maintain a directory of its own with the sole purpose of mapping
sfileno to an on-disk object.

In UFS you have this second-level object directory in the UFS directory
files, which map to inodes, which are the true on-disk objects. If you
look back into the early Squid-FS discussions, these directory lookups
were one of the things we wanted to eliminate when building a custom
Squid-FS. The current design with centrally and FS-independently
assigned swapfileno does not allow us to skip this directory lookup.

I then propose the following Squid-FS interface changes (with
corresponding minimal changes to StoreEntry):

* swap_file_number (sfileno) should be broken up into its two
independent parts. One which is used by the StoreEntry manager to
maintain in which cache_dir the object is located, and one which tells
the FS-dependent code where in this cache_dir the object is located.

* The FS-dependent part should be assigned by the FS-dependent code to
allow it to share the StoreDir object directory and not need to maintain
its own directory of objects.

* Since not all filesystem models can determine the on-disk index key
until some or all of the data is actually written, assignment of the
FS-dependent filenumber should be allowed to be made asynchronously.
Since FS-dependent code should not need to know about StoreEntries this
needs to be done as a callback to the calling module (StoreEntry
manager). Also, since this filenumber is needed to be able to open the
object for reading, the call can also be seen as an indication that
reading from the object is now possible.

* Not all filesystems can support readers while writing an object,
especially not during the early phases of development. Support for this
should be optional, and can be handled by either reverting to the VM
object model if the filesystem code does not support readers while
writing, or denying additional readers during writing by keeping the
object private until readers can be supported (the latter is preferred
by me; the first requires fewer changes).

* When sfileno has been separated into its two separate entities, the
store-io interface should change to one more like a traditional
object-oriented interface, i.e. instead of receiving a sfileno which
(amongst other things) indicates cache_dir number it should receive a
reference to the CacheDir the call operates on (preferably as first
argument to all methods, to mimic the C++ call interface). This further
isolates the FS-dependent code from the rest of Squid as it now does not
need to know where Squid keeps its configuration details.
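
A minimal sketch of how the proposed interface might look in C. Every
name here (cache_dir, cache_dir_ops, filn_assigned_cb, the toy_*
functions, entry_stub) is hypothetical and only meant to show the
shape: each method takes the cache_dir as its first argument, and the
file number reaches the caller through a callback.

```c
#include <stddef.h>

typedef struct cache_dir cache_dir;

/* Called by the FS code once the file number is known. The owner is
 * opaque to the FS code, so it never needs to know about StoreEntry.
 * The call also signals that the object can now be opened for reading. */
typedef void filn_assigned_cb(void *owner, int filn);

/* Per-filesystem method table; "self" plays the role of C++ "this". */
typedef struct {
    int (*create)(cache_dir *self, void *owner, filn_assigned_cb *cb);
    int (*open)(cache_dir *self, int filn);
} cache_dir_ops;

struct cache_dir {
    const cache_dir_ops *ops;
    int index;      /* which cache_dir this is */
    int next_filn;  /* toy allocator state, private to this FS */
};

/* Toy filesystem: assigns the number at once. A real one might defer
 * the callback until some or all of the data has been written. */
static int
toy_create(cache_dir *self, void *owner, filn_assigned_cb *cb)
{
    int filn = self->next_filn++;
    cb(owner, filn);
    return 0;
}

/* Only file numbers that have been handed out can be opened. */
static int
toy_open(cache_dir *self, int filn)
{
    return (filn >= 0 && filn < self->next_filn) ? 0 : -1;
}

static const cache_dir_ops toy_ops = { toy_create, toy_open };

/* Stand-in for the StoreEntry side of the interface. */
typedef struct {
    int dirn;
    int filn;  /* -1 until the FS code reports a number */
} entry_stub;

static void
entry_filn_assigned(void *owner, int filn)
{
    entry_stub *e = owner;
    e->filn = filn;
}
```

The caller would invoke dir->ops->create(dir, entry,
entry_filn_assigned) and treat the entry as unreadable from disk (for
example by keeping it private) until the callback has fired.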

Ps. As you may have suspected by now I didn't get the time I hoped for
during my train travels to make a clearer picture of my vision of the
future modularised store manager, but I am working on finding time for
it. The vision is simple (three modules with well defined borders and
interfaces: object access, disk access, object management), but I need
to work on the details of the responsibility division and flow between
the modules before giving a more in-depth explanation.

/Henrik
Received on Tue Jul 29 2003 - 13:15:59 MDT
