Re: [squid-users] Too many open files

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 28 Jul 2013 21:24:16 +1200

On 28/07/2013 6:19 p.m., Peter Retief wrote:
>>> Peter:
>>> Do you mean you've patched the source code, and if so, how do I get
>>> that patch? Do I have to move from the stable trunk?
>> Amos:
>> Sorry yes that is what I meant and it can now be found here:
>>
>> http://www.squid-cache.org/Versions/v3/3.HEAD/changesets/squid-3-12957.patch
>> It should apply on the stable 3.3 easily, although I have not tested that.
>> NP: if you rebuild please go with the 3.3.8 security update release.
> I have patched the file as documented and recompiled with the 3.3.8 branch.
>
>>> Peter:
>>> The first log occurrences are:
>>> 2013/07/23 08:26:13 kid2| Attempt to open socket for EUI retrieval failed: (24) Too many open files
>>> 2013/07/23 08:26:13 kid2| comm_open: socket failure: (24) Too many open files
>>> 2013/07/23 08:26:13 kid2| Reserved FD adjusted from 100 to 15394 due to failures
>> Amos:
>> So this worker #2 got errors after reaching about 990 open FD (16K - 15394). Ouch.
>
>> Note that all these socket opening operations are failing with the "Too
>> many open files" error the OS sends back when limiting Squid to 990 or so
>> FD. This has confirmed that Squid is not mis-calculating where its limit
>> is, but something in the OS is actually causing it to limit the worker. The
>> first one to hit was a socket, but also a disk file access is getting them
>> soon after, so it is likely the global OS limit rather than a particular
>> FD type limit. That 990 usable FD is also suspiciously close to 1024 with
>> a few % held spare for emergency use (as Squid does when calculating its
>> reservation value).
>
> Amos, I don't understand how you deduced the 990 open FD from the error
> messages above ("adjusted from 100 to 15394")?

Squid starts with 16K FD, of which 100 are reserved. When it adjusts that,
the 16K limit is still the total, but the reserved count is raised so that
N of those FD are treated as reserved/unavailable.

So 16384 - 15394 = 990 FD safe to use after adjustments caused by the error.
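
To make that subtraction explicit, here is a minimal sketch in Python. It only uses the numbers from the log line quoted above; the variable names are mine and are not Squid identifiers.

```python
# Figures taken from the "Reserved FD adjusted from 100 to 15394" log line.
TOTAL_FD = 16 * 1024       # configured maximum (16K = 16384)
reserved_before = 100      # reserved count at startup
reserved_after = 15394     # reserved count after the failures

# Usable ("safe") FD = total minus reserved.
usable_before = TOTAL_FD - reserved_before
usable_after = TOTAL_FD - reserved_after

print("usable before adjustment:", usable_before)
print("usable after adjustment: ", usable_after)
```

So the total never changes; only the reserved part grows, and the usable part shrinks to match.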

> I would have deduced that
> there was some internal limit of 100 (not 1000) FD's, and that squid was
> re-adjusting to the maximum currently allowed (16K)?

Yes, that is correct. However it is the "reserved" limit being raised.

Reserved is the number of FD which are configured as available but
determined to be unusable. This can be thought of as the cordon around a
danger zone for FD: if Squid strays into using that many sockets again it
can expect errors. Raising that count reduces Squid's operational FD
resources by the amount raised.
Squid may still try to use some of them under peak load conditions, but
will do so only if there is no other way to free up the safe in-use FD.
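
As an illustration only, the cordon idea can be sketched as a toy model in Python. This is not Squid's code: the class `FdBudget`, its fields, and the `can_free_safe_fd` flag are all hypothetical names invented for this example, and the real decision logic inside Squid may differ.

```python
from dataclasses import dataclass


@dataclass
class FdBudget:
    """Toy model of a total FD limit with a reserved 'danger zone'."""
    total: int        # configured maximum, e.g. 16384
    reserved: int     # FD treated as unusable
    in_use: int = 0   # FD currently open

    @property
    def safe_limit(self) -> int:
        # FD that can be used without crossing the cordon.
        return self.total - self.reserved

    def raise_reserved(self, new_reserved: int) -> None:
        # The reserved count only ever grows in this model.
        self.reserved = max(self.reserved, new_reserved)

    def may_open(self, can_free_safe_fd: bool) -> bool:
        if self.in_use < self.safe_limit:
            return True
        # Past the cordon: allowed only as a last resort, when no safe
        # in-use FD can be freed and the hard total is not yet reached.
        return (not can_free_safe_fd) and self.in_use < self.total


budget = FdBudget(total=16384, reserved=100)
budget.raise_reserved(15394)   # the adjustment seen in the log
budget.in_use = budget.safe_limit
print(budget.safe_limit, budget.may_open(True), budget.may_open(False))
```

The point of the model is just that raising `reserved` shrinks the safe range while leaving the total alone.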

Because of that emergency usage case, when Squid sets the reserved limit
it does not set it exactly on the FD number which got the error. It sets
it 2-3% into the "safe" FD count. So rounding 990 up by that slight amount
we get about 1024, which is a highly suspicious value.
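
A rough check of that estimate, again as a Python sketch. It takes the 2-3% figure above at face value and simply compares the result with 1024; the names are illustrative, not Squid's.

```python
usable_after = 990     # safe FD count after the adjustment
suspected_limit = 1024 # common default per-process open file limit

# Scale 990 up by the 2% and 3% margins mentioned above.
low_estimate = usable_after * 1.02
high_estimate = usable_after * 1.03

# How far 990 actually sits below 1024, in FD and as a percentage of 1024.
gap = suspected_limit - usable_after
gap_percent = gap / suspected_limit * 100

print(f"990 plus 2-3%: {low_estimate:.1f} to {high_estimate:.1f}")
print(f"gap to 1024: {gap} FD ({gap_percent:.1f}% of 1024)")
```

The numbers do not land exactly on 1024, but they are close enough to point at a 1024 FD limit being applied to the worker by the OS.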

Amos
Received on Sun Jul 28 2013 - 09:24:26 MDT
