Re: StorEntry-less disk store

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Fri, 03 Nov 2000 22:26:00 +0100

Alex Rousskov wrote:

> I would think they would want to use a cache digest or similar small,
> lossy hash to check for potential hits with high-enough probability...

That was the first thing on my mind ;-)

Attached is some "old" mail on the subject...

/Henrik

Message-ID: <39B34AE3.38971307@hem.passagen.se>
Date: Mon, 04 Sep 2000 09:10:27 +0200
From: Henrik Nordstrom <hno@hem.passagen.se>
To: Joe Cooper <joe@swelltech.com>
CC: squidng@cacheboy.net
Subject: Re: reiserfs_raw update
References: <20000903222902.A10665@hole.botik.ru> <39B30E44.8B979381@swelltech.com>

Joe Cooper wrote:
>
> Ok, it's up and running very solid here. Hit rate under load is great,
> but the load it's handling is a bit low. I'm one hour into a run at 50
> reqs/sec and response times are slowly but steadily climbing...(keep in
> mind 80 or 90 is what a box of this capacity will handle with the old
> Squid+hno patches). I tried a run at 70 but the response times climbed
unacceptably high very quickly (within 10 minutes).

Sounds like disk I/O queues are building up.

How many I/O threads do you have running?

How much I/O is being reported by vmstat?

It might also be the storetree code, which creates some additional
reads. Sizif, it might be a good idea to speed up the lookups by
keeping a compact in-core hit index, for example in the form of a
cache-digest-like bloom filter for the store_dir. It is possible to
do deletions in such compact bitmap hashes by storing one bit more
than needed per key and, on delete, clearing only that last bit;
lookups then tolerate the loss of a single bit in any position but
the last. The index is preferably managed by the FS, and mmap()ed
and locked in memory by the application.

To make the bloom filter work you need a reasonably secure hashing
function, or else you risk too much bit clustering, making the
filter useless.
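
For concreteness, here is a minimal C sketch of that scheme. It is
illustrative only: the bf_* names, the sizes (NBITS, NHASH), and the
FNV-style mixing hash are hypothetical stand-ins, not Squid code.

    #include <stdint.h>

    #define NBITS  (1u << 20)  /* filter size in bits (hypothetical) */
    #define NHASH  4           /* bits "needed" per key; one extra is stored */

    static uint8_t filter[NBITS / 8];

    /* Illustrative mixing hash.  A real filter wants a well-distributed
     * hash, or the bit clustering mentioned above ruins the filter. */
    static uint32_t hash(const char *key, uint32_t seed)
    {
        uint32_t h = 2166136261u ^ (seed * 2654435761u);
        while (*key)
            h = (h ^ (uint8_t)*key++) * 16777619u;
        return h % NBITS;
    }

    static void set_bit(uint32_t i)   { filter[i / 8] |= (uint8_t)(1u << (i % 8)); }
    static void clear_bit(uint32_t i) { filter[i / 8] &= (uint8_t)~(1u << (i % 8)); }
    static int  test_bit(uint32_t i)  { return (filter[i / 8] >> (i % 8)) & 1; }

    void bf_insert(const char *key)
    {
        uint32_t s;
        for (s = 0; s <= NHASH; s++)    /* sets NHASH + 1 bits in total */
            set_bit(hash(key, s));
    }

    void bf_delete(const char *key)
    {
        clear_bit(hash(key, NHASH));    /* clears only the last bit */
    }

    /* Nonzero if the key is possibly present: the last bit must be set,
     * and at most one of the first NHASH bits may have been lost to a
     * colliding delete of some other key. */
    int bf_lookup(const char *key)
    {
        uint32_t s;
        int lost = 0;

        if (!test_bit(hash(key, NHASH)))
            return 0;
        for (s = 0; s < NHASH; s++)
            if (!test_bit(hash(key, s)) && ++lost > 1)
                return 0;
        return 1;
    }

In the mmap()ed variant, the filter array would live in a file-backed,
mlock()ed mapping maintained alongside the store_dir rather than in
static storage.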

/Henrik

Message-ID: <39B352E1.9071021D@swelltech.com>
Date: Mon, 04 Sep 2000 02:44:33 -0500
From: Joe Cooper <joe@swelltech.com>
To: squidng@cacheboy.net
Subject: Re: reiserfs_raw update
References: <20000903222902.A10665@hole.botik.ru> <39B30E44.8B979381@swelltech.com> <39B34AE3.38971307@hem.passagen.se>

Henrik Nordstrom wrote:
>
> Joe Cooper wrote:
> >
> > Ok, it's up and running very solid here. Hit rate under load is great,
> > but the load it's handling is a bit low. I'm one hour into a run at 50
> > reqs/sec and response times are slowly but steadily climbing...(keep in
> > mind 80 or 90 is what a box of this capacity will handle with the old
> > Squid+hno patches). I tried a run at 70 but the response times climbed
> > unacceptably high very quickly (within 10 minutes).
>
> Sounds like disk I/O queues are building up.
>
> How many I/O threads do you have running?

I see 17 squid-owned processes when running under load; 16 of those
are kaiod threads. So, 16 threads. Where is that defined, and should
I consider raising it or lowering it for this system? (450 MHz CPU,
dual 7200 RPM drives, 384 MB RAM.) I was using 30 threads in the old
async Squid, which was pretty much the sweet spot for this hardware.

> How much I/O is being reported by vmstat?

Don't know about before the lockup, but here are some numbers from 5
minutes into a test:

   procs                 memory     swap       io    system       cpu
 r  b  w   swpd   free   buff  cache  si  so  bi   bo   in   cs us sy id
 0  0  0      0 139012 154048  51704   0   0  10  290 1055 1047  9 22 69
 1  0  0      0 136444 154448  53676   0   0  10    1  901  983  9 16 75
 0  0  0      0 133408 154888  56044   0   0  11  190 1050 1034 10 16 74
 1  0  0      0 128268 157336  58472   0   0   9  120 1007 1051  8 19 73
 0  0  0      0 125320 157724  60804   0   0  10  213 1046 1063 10 18 72

I'll forward some more numbers in an hour or so once it's been running
for a while.
 
> It might also be the storetree code, which creates some additional
> reads. Sizif, it might be a good idea to speed up the lookups by
> keeping a compact in-core hit index, for example in the form of a
> cache-digest-like bloom filter for the store_dir. It is possible to
> do deletions in such compact bitmap hashes by storing one bit more
> than needed per key and, on delete, clearing only that last bit;
> lookups then tolerate the loss of a single bit in any position but
> the last. The index is preferably managed by the FS, and mmap()ed
> and locked in memory by the application.

That sounds good. We're definitely not using more than our fair share
of memory now (I think we can afford more than 20 MB on a nearly
half-GB box!). So it probably is a good idea to keep track of hits
and misses a little more in RAM.

> To make the bloom filter work you need a reasonably secure hashing
> function, or else you risk too much bit clustering, making the
> filter useless.
                                  --
                     Joe Cooper <joe@swelltech.com>
                 Affordable Web Caching Proxy Appliances
                        http://www.swelltech.com

Date: Mon, 4 Sep 2000 17:52:22 +0400
From: Yury Shevchuk <sizif@botik.ru>
To: Henrik Nordstrom <hno@hem.passagen.se>
Cc: squidng@cacheboy.net
Subject: Re: reiserfs_raw update
Message-ID: <20000904175222.C15882@hole.botik.ru>
References: <20000903222902.A10665@hole.botik.ru> <39B30E44.8B979381@swelltech.com> <39B34AE3.38971307@hem.passagen.se>
Organization: Program Systems Inst., Pereslavl-Zalessky, Russia

On Mon, Sep 04, 2000 at 09:10:27AM +0200, Henrik Nordstrom wrote:
> Joe Cooper wrote:
> >
> > Ok, it's up and running very solid here. Hit rate under load is great,
> > but the load it's handling is a bit low. I'm one hour into a run at 50
> > reqs/sec and response times are slowly but steadily climbing...(keep in
> > mind 80 or 90 is what a box of this capacity will handle with the old
> > Squid+hno patches). I tried a run at 70 but the response times climbed
> > unacceptably high very quickly (within 10 minutes).
>
> It might also be the storetree code, which creates some additional
> reads.

Storetree introduces the overhead of one extra open(2) (well, RAWOPEN)
per cache miss. (These open attempts should be fast in reiserfs,
according to some early benchmarks.) Storetree performs no additional
actions on a cache hit compared to traditional Squid (except possibly
writing back the lastref change).

So the fill phase of benchmarks is the most unfavorable for storetree,
as 95% of requests are misses. Polymix-3 allows specifying a lowered
rate for the fill phase; perhaps we should use this.

> Sizif, it might be a good idea to speed up the lookups by keeping
> a compact in-core hit index, for example in the form of a
> cache-digest-like bloom filter for the store_dir. It is possible to
> do deletions in such compact bitmap hashes by storing one bit more
> than needed per key and, on delete, clearing only that last bit;
> lookups then tolerate the loss of a single bit in any position but
> the last. The index is preferably managed by the FS, and mmap()ed
> and locked in memory by the application.

This seems to be a considerable complication. But if we don't find
other fixable bottlenecks, it will probably be the only way to raise
the request rate.

Thanks,

-- sizif
Received on Fri Nov 03 2000 - 15:19:06 MST
