Re: async-io

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Mon, 24 Apr 2000 23:59:46 +0200

Eric Stern wrote:

> I think that elevator sorting really could be a big benefit. I did some
> quick benchmarking about a month ago..wrote a quick program that just
> randomly seeks/reads x many times around the disk, and times the results. I
> got about 30 ops/second. Then, I generated a list of those requests, sorted
> them, and ran them through and ended up with about 60 ops/second. My
> conclusions were:
> 1) sorting (i.e. reducing seek distance) is a HUGE benefit
> 2) HDs are MUCH slower than my instincts indicated

The figures I have seen are more in the range of 100 seeks/s for
today's drives. Heck, even my old 1GB IDE drive handles ~56 ops/s, or
~63 ops/s with an elevator sort of 20 requests per cycle. Larger
elevator cycles than this are unrealistic due to time constraints: as
you can see, an elevator cycle of 20 requests already takes about 1/3
of a second (20 ops out of 63 in total).
That is an increase of roughly 13% compared to unsorted order.
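
For illustration, a minimal sketch of what such an elevator pass over
a batch of pending reads could look like (the request struct and the
helper names are made up for the example, not taken from any real
code):

   #include <stdlib.h>
   #include <unistd.h>

   /* Hypothetical pending read request; the disk offset is the sort key. */
   struct io_req {
       off_t offset;
       size_t len;
       char *buf;
   };

   /* Sort ascending by offset so the head sweeps in one direction. */
   static int by_offset(const void *a, const void *b)
   {
       const struct io_req *x = a, *y = b;
       return (x->offset > y->offset) - (x->offset < y->offset);
   }

   /* One elevator cycle: sort the pending batch, then issue the reads
    * in disk order. */
   static void elevator_cycle(int fd, struct io_req *reqs, int n)
   {
       int i;
       qsort(reqs, n, sizeof(reqs[0]), by_offset);
       for (i = 0; i < n; i++)
           pread(fd, reqs[i].buf, reqs[i].len, reqs[i].offset);
   }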

However, my point is that for the UFS store we do not have a usable
sort key to perform elevator sorting on. COSS or other "raw" stores
are a different business, but for elevator sorting to be beneficial
there need to be enough requests to sort, or the decrease in the
average seek distance will be quite minimal. And for a cache the I/O
queue should be kept relatively small to keep the service time down,
which all but vaporizes any window for elevator sorting benefits. I
would say it is of more importance to make proper use of pre-reading
to avoid seeks in the middle of small to medium sized objects (<32K
or something like that) in the first place.
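
A minimal sketch of that idea, assuming a UFS-style store where each
object lives in its own file (the helper name and the exact cutoff
are only illustrative):

   #include <fcntl.h>
   #include <stdlib.h>
   #include <unistd.h>
   #include <sys/stat.h>

   #define PREREAD_MAX 32768  /* illustrative cutoff for "small" objects */

   /* Fetch a small object with one request instead of several
    * page-sized ones, so the disk seeks once per object rather than
    * once per chunk. */
   static ssize_t read_object_whole(const char *path, char **out)
   {
       struct stat st;
       ssize_t n;
       int fd = open(path, O_RDONLY);

       if (fd < 0 || fstat(fd, &st) < 0 || st.st_size > PREREAD_MAX) {
           if (fd >= 0)
               close(fd);
           return -1;  /* large objects keep the normal chunked path */
       }
       if ((*out = malloc(st.st_size)) == NULL) {
           close(fd);
           return -1;
       }
       n = read(fd, *out, st.st_size);  /* one seek, one transfer */
       close(fd);
       return n;
   }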

Or, put in harsher words: there is not much of a point in elevator
sorting reads. There is a point for writes, since they can be done
asynchronously to the service provided, but we already do that at a
much higher level, and again, for the UFS based store we don't have a
usable key to sort on.

A simple and dumb test program is attached. In the test above I ran
it as:

Unsorted:
   measureops -r -n -s 2048 -l 20 -t 1000000000 /dev/hdc
Elevator:
   measureops -e -r -n -s 2048 -l 20 -t 1000000000 /dev/hdc
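
As a rough sketch, the unsorted measurement presumably boils down to
something like the following (only the -s and -t flag meanings are
taken from the command lines above; the rest is guesswork, not the
attached program):

   #include <stdio.h>
   #include <stdlib.h>
   #include <fcntl.h>
   #include <unistd.h>
   #include <sys/time.h>

   /* Time n random reads of 'size' bytes within the first 'span'
    * bytes of the device, then report ops/s and bytes/s. */
   int main(int argc, char **argv)
   {
       size_t size = 2048;        /* -s */
       off_t span = 1000000000;   /* -t */
       int i, n = 1000;
       int fd = open(argv[argc - 1], O_RDONLY);
       char *buf = malloc(size);
       struct timeval t0, t1;
       double secs;

       gettimeofday(&t0, NULL);
       for (i = 0; i < n; i++) {
           off_t where = ((off_t)rand() % (span / size)) * size;
           pread(fd, buf, size, where);
       }
       gettimeofday(&t1, NULL);

       secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
       printf("%lu bytes: %.1f ops/s, %.0f bytes/s\n",
              (unsigned long)size, n / secs, (double)n * size / secs);
       return 0;
   }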

A more interesting measurement is ops/s and bytes/s as a function of
read size.

On my old, slow IDE drive:

  size  ops/s  bytes/s
  1024     57    58374
  2048     56   115157
  4096     55   226833
  8192     52   432020
 16384     48   791516
 32768     40  1339778
 65536     31  2077754
131072     23  3045777

As you can see, the potential here is far greater than anything you
can ever achieve by playing around with small elevator sorts. And
there is a quite obvious tradeoff between service time and
throughput.

/Henrik