Re: async-io for 2.4

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Fri, 03 Nov 2000 23:12:13 +0100

Joe Cooper wrote:

> Squid gives these errors in cache.log:
>
> 2000/11/03 01:19:46| storeAufsOpenDone: (2) No such file or directory
> 2000/11/03 01:19:46| /cache1/01/5F/00015F6A
> 2000/11/03 01:20:15| comm_poll: poll failure: (12) Cannot allocate
> memory
> 2000/11/03 01:20:15| Select loop Error. Retry 1

Smells like the kernel ran out of memory.

I have seen this message reported by other users well before these
changes, in some cases even without async-io, I think.

async-io pushes the buffer/cache memory substantially harder than a
non-threaded Squid. The more efficient the async-io implementation is,
the harder the buffer/cache will be pushed.

> ReiserFS also reports memory allocation failures, and triggers a
> deadlock condition (which I'll report to the ReiserFS guys...Chris Mason
> over there is a hero on these kinds of problems).

A further indication of the above.

> Now...memory is not filled when this condition occurs. There is
> some stuff in swap (136MB worth, actually) but it was put there
> much earlier and didn't cause problems.

What we are talking about here is not virtual memory, but the physical
memory (RAM).
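
To make the RAM versus swap distinction concrete, here is a minimal
sketch that separates the two when reading /proc/meminfo on Linux. It
assumes the "Name: value kB" line format and the field names MemFree,
Buffers, Cached, SwapTotal and SwapFree; the function names are made up
for illustration and nothing here is part of Squid.

```python
def parse_meminfo(text):
    """Return {field: kB} for 'Name:  value kB' lines of /proc/meminfo text."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name: <integer> kB"; skip anything else
        # (for example the multi-column summary lines older kernels print).
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def summarize(info):
    """Report physical RAM figures separately from swap usage (all in kB)."""
    return {
        # RAM that is free right now.
        "ram_free_kb": info.get("MemFree", 0),
        # RAM held by the buffer/cache, which async-io pushes hard.
        "ram_buffers_cache_kb": info.get("Buffers", 0) + info.get("Cached", 0),
        # Swap in use; this can be large without RAM being short, and vice versa.
        "swap_used_kb": info.get("SwapTotal", 0) - info.get("SwapFree", 0),
    }
```

Feeding it the contents of /proc/meminfo, e.g.
summarize(parse_meminfo(open("/proc/meminfo").read())), gives three
numbers to watch. A large swap_used_kb says little by itself; it is
ram_free_kb dropping towards zero under load that would fit the errors
above.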

> CPU saturation seems to be the culprit here.

Might be, or it is a side effect. Only long-run statistics showing the
CPU usage up to the point where it hangs can tell. There is no point in
looking at the CPU usage after it has failed.
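
One way to collect such statistics is to sample the aggregate "cpu" line
of /proc/stat at a fixed interval and log the busy fraction between
samples. The sketch below is only an illustration: it assumes the first
four counters are user, nice, system and idle, treats everything except
idle as busy, and uses hypothetical function names.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of counters."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line with >= 4 counters")
    return [int(x) for x in fields[1:]]


def busy_fraction(prev, cur):
    """Fraction of time the CPU was not idle between two samples."""
    deltas = [c - p for p, c in zip(prev, cur)]
    total = sum(deltas)
    if total <= 0:
        # No ticks elapsed between the samples; report idle rather than divide by 0.
        return 0.0
    idle = deltas[3]  # assumption: fourth counter is idle time
    return (total - idle) / total
```

Calling busy_fraction() on consecutive samples, say every ten seconds,
and appending the result with a timestamp to a file gives a history that
can be read back after the hang, showing whether the CPU was already
saturated beforehand or only jumped when the failure hit.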

If you have plenty of CPU left and it then abruptly eats all the CPU
when the problem occurs, then the change in CPU usage is most likely a
side effect of the real problem. My guess here is again a memory
shortage, which in turn triggers errors in the code that is supposed to
recover from memory shortage. The Linux kernel is not known for being
forgiving when it runs out of free RAM, ReiserFS even less so.

/Henrik
Received on Fri Nov 03 2000 - 15:18:52 MST
