Re: Async I/O from Henrik Nordstrom on 1998-09-18 (squid-dev)

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sat, 19 Sep 1998 02:29:54 +0200

Alex Rousskov wrote:

> > * Main thread knows about which writes are being made, and
> > when they complete. About the only thing we need to add
> > locking to is the object data.
>
> The problem is that communication between main thread and children
> may eat all the benefits of combining IOs. Locking the object data
> does not appeal to me either.

There is no added communication between the threads. The only
communication between the threads are main->thread "do this",
thread->main "it's done".

> > * If done by the threads then more locking would be needed,
> > making it both inefficient and complex.
>
> If so, yes. But I still _hope_ there is a cheap,
> virtually-no-locking solution. I do not have one though.

Sounds like you would like to find a "do nothing, gain everything"
solution ;-).

Fortunately there may be one. I'll return on this subject tomorrow. To
tired to spell it out now (I will most likely get it wrong if I try to).

> Complexity (here) to me is not the amount of CPU work (that's
> cost), but rather my ability to predict the side effects of
> the given algorithm.

That is close to my definition as well, but I include code complexity.
More complex code is more prune to bugs. My definition of complexity
(here most other places) is number of branches, states,
interdependencies and how easy it is to understand the effects of it.

> From the first glance, I cannot understand what the side effects
> (if any) your algorithm has, and I've seen huge effects from
> (good or bad) load balancing. So I will step aside and let others
> decide...

Only measuring can determine the real effects. Without measuring we can
only speculate on the expected effects.

Note that the "average" is not really average here, there are some
important slowness in the formula that makes it keep a long range memory
that decades by time. The fraction is a weight that says how quickly it
reacts to changes.

But you are right. Fast fluctuating values may cause the system to
oscillate, which if it happens makes it somewhat unpredictable. That is
why I would measure max queue length of a period of time, and not every
queuelength. max queue length / second (or whatever time interval that
is suitable) should not fluctuate to heavily.

Expected properties of 2 pass size+queue based disk selection with
limited threadpools per disk:
1. Searchorder is available disk space, to try to first load files on
the disk with most unused space unless the load is to high.
2. Thread availability quickly pushes swapouts to the next disk when
load creeps up higher than the disk can sustain without "delays".
3. If there is no disk that is classified as "idle", then "average"
request queue is used to select the disk with less work.

Hmm.. thinking about this some further I realise that the algorithm is
probably a bit to space centric. I'll return with more thoughts tomorrow
as my idea does require some redesign. Now I have to get some sleep.

/Henrik
Received on Tue Jul 29 2003 - 13:15:54 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:11:55 MST