RE: moving back to 32-bit swap_filens ?

From: Chemolli Francesco (USI) <ChemolliF@dont-contact.us>
Date: Mon, 20 Aug 2001 15:32:49 +0200

> On Tue, Aug 14, 2001, Henrik Nordstrom wrote:
> > The intention from the start was to make the file and dir-number
> > field sizes adjustable at build time.
> >
> > Regarding structure sizes: the rules are mostly the same on all
> > platforms. It is not as platform dependent as it may seem. It is
> > caused by a combination of alignment requirements and malloc
> > overhead. Currently StoreEntry is at the edge (or at least was the
> > last time I looked at it). Increasing it by one byte is almost the
> > same as increasing it by 16 bytes, which unfortunately is quite a
> > lot in this context.
>
> Are you sure it's alignment? I'm dead sure it's malloc overhead.
> 16 bytes is a bit much for struct alignment. Also, when you
> set it to 32 bits, did you drop the sdirno type down to a char
> when you did the size comparison?

It's not unheard of. Some CPU architectures (mainly Alpha, IIRC) really
do insist on strict alignment.

Another thing: I think COSS might be a wonderful test of Linux's
O_DIRECT open flag. The only caveat is that you have to read in
PAGE_SIZE blocks, at PAGE_SIZE boundaries. But it could increase
performance, at least that's what Andrea Arcangeli promises, since it
uses zero-copy DMA from disk to userspace (IIRC) and bypasses all
kernel caches.

> > I have a branch attempting to shrink the overhead of StoreEntry
> > quite a bit. See compactsentry. Unfortunately there is currently
> > some bug left there, and due to my other workloads the branch has
> > been idle for quite a while now.
> >
> > Changing the sizes while working with COSS is perfectly fine.
> > However, I don't think these adjustments should be in HEAD until we
> > commit a good working COSS there.
>
> COSS works quite fine with a 2gb spool.

We could leave this as a limit and use multiple spools...
Also, I think it would be a good idea to add documentation saying
that it's best to dd from /dev/zero, possibly right after mkfs, to
reduce fragmentation...
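Something like this in the docs would do (the path and the 16 MB size
are made up; a real spool would be sized to match its cache_dir line):

```shell
# Preallocate the COSS spool file with zeros so its blocks get laid
# out contiguously up front, instead of the file growing (and
# fragmenting) on demand. Path and size are illustrative only.
dd if=/dev/zero of=/tmp/coss-spool bs=1024k count=16
```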

> If I get some time out here I'm going to work on improving the
> read IO - I have some old work in the commloops branch which
> removes the CLIENT_SOCK_BUF - the storeClientCopy() callback
> can pass a NULL pointer over, which results in the data not
> being copied. This is useful for allowing larger (than SM_PAGE_SIZE)
> reads to come back from storeRead().
>
> I'd work on the >2gb code, but my travel laptop currently doesn't
> have enough free space. Maybe next week when I've FreeBSD'ed it.

I don't think it's a high priority. Having 10 coss storedirs should be
no problem. If we can get as high as 256, then there's plenty of space
for a while.
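For instance, a hypothetical squid.conf sketch along these lines (paths
and sizes are invented, and the exact cache_dir option syntax should be
checked against the shipped squid.conf.default):

```
# Several COSS spools, each kept under the 2gb per-spool limit
cache_dir coss /cache0/coss 2000
cache_dir coss /cache1/coss 2000
cache_dir coss /cache2/coss 2000
```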

> My COSS todo list looks like this:
>
> * finish up async_io.c (add open/close wrappers which handle flushing
> pending IO, just in case)
> * Fix up the read code to read back the entire object in one
> system read,
> rather than 4k chunks

Uhm.. I'm not very convinced of this. It can increase memory pressure
considerably. Maybe you could make it 100k chunks or something, but
I'd prefer to keep the chunking.

> - This involves changing storeRead() to return a cbdataFree'able
>   buf, rather than taking a buffer to copy into
> * Up the COSS FS size to be >2gb
> * Look at adding the swaplog stuff to the beginning of each stripe
>
> The last point is a little sticky - how do we guarantee FS
> consistency when we're using a buffered file/disk dev? I was
> thinking of calling fsync(), but that might screw up any pending IO.
> I'd think of using a raw disk device, but squid doesn't currently
> "cache" the data coming from a disk-based store client.

Can't we just use two files? That way we can fsync() only the metadata
or something.

> So, I'll leave the last step until the rest of them are done, and then
> work on it. That might require the most squid internal reworking.
>
> I'm getting there. :-)
>
> So, here's a question: should we put the slab allocator into
> squid-HEAD
> now?

Uhm... I'm supposed to deploy 2.5 today. Can't you wait a sec before
breaking things?
Pretty please?

-- 
	/kinkie 
Received on Mon Aug 20 2001 - 07:23:53 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:14:13 MST