Re: AW: [squid-users] squid 3.4. uses 100% cpu with ntlm_auth

From: Carlos Defoe <carlosdefoe_at_gmail.com>
Date: Thu, 24 Apr 2014 18:57:46 -0300

Just an update: I tried one more time, with squid 3.4.4.

Same thing: 100% CPU after a few minutes, with a few hundred users.
It doesn't stay pinned at 100% the way an infinite loop would; it
fluctuates between roughly 90% and 100%.

I'm almost sure it has something to do with auth helper handling. My
line in squid.conf is the following:

"auth_param negotiate program
/usr/local/squid/libexec/negotiate_wrapper_auth --ntlm
/usr/bin/ntlm_auth --diagnostics --helper-protocol=squid-2.5-ntlmssp
--domain=EXAMPLE --kerberos
/usr/local/squid/libexec/negotiate_kerberos_auth -s GSS_C_NO_NAME"

The majority of authentications are handled by negotiate_kerberos_auth.

I also attached strace (strace -p pid) to the squid process for a
minute while it was at 100%. The result is at the following link; I
don't know if it is useful. The only thing I found odd is the
"Broken Pipes" lines.

http://goo.gl/WhssSh
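
A trace like this can be summarized with a small script. The following
is a minimal sketch in Python, assuming the trace was saved in strace's
default one-call-per-line format; the function name tally_strace and
the sample lines in the usage note are hypothetical. It counts how often
each syscall appears and which calls fail with which errno, so that a
busy loop or a run of EPIPE ("Broken pipe") errors stands out.

```python
import re
from collections import Counter

# Optional "[pid 1234] " or "1234 " prefix (seen with strace -f), then "name(".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(?:\d+\s+)?(\w+)\(")
# A failed call ends like: ") = -1 EPIPE (Broken pipe)".
ERRNO_RE = re.compile(r"\)\s+=\s+-1\s+([A-Z][A-Z0-9]*)\s+\(")


def tally_strace(lines):
    """Return (calls, errors) counters for an iterable of strace lines.

    calls maps syscall name to count; errors maps (name, errno) to count.
    Lines that do not start with a syscall (signals, exit notices) are skipped.
    """
    calls = Counter()
    errors = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if not match:
            continue
        name = match.group(1)
        calls[name] += 1
        failed = ERRNO_RE.search(line)
        if failed:
            errors[(name, failed.group(1))] += 1
    return calls, errors
```

Reading a file written with "strace -p pid -o trace.txt" line by line
into tally_strace and printing calls.most_common(10) would show which
calls dominate. strace's own -c option prints a comparable per-syscall
summary without any script.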

The Squid 3.3 series is fine; I'm running 3.3.12.

bye guys

On Wed, Feb 5, 2014 at 7:33 AM, Rietzler, Markus (RZF, SG 324 /
<RIETZLER_SOFTWARE>) <markus.rietzler_at_fv.nrw.de> wrote:
> that's not bad to hear. I have seen the new version. at the moment the only way to "test" is to use this new version in production and see what happens. very annoying.
>
> after switching back to 3.3.x or 3.2.x everything works perfectly!
>
> no, it is not ntlm/Kerberos.
>
> see my posting from January:
>
>> -----Original Message-----
>> From: Rietzler, Markus (RZF, SG 324 / <RIETZLER_SOFTWARE>)
>> Sent: Wednesday, 8 January 2014 10:48
>> To: 'Eliezer Croitoru'
>> Subject: AW: AW: AW: [squid-users] Squid 3.4 sends Windows username
>> without backslash to external wbinfo_group helper
>>
>> just a quick answer:
>>
>> yesterday we switched to ntlm fakeauth to eliminate any problems with
>> the samba/winbind protocol talking to the DC (every now and then winbind
>> loses its trust with the DC, so with fakeauth we can be sure there is no
>> influence from tcp connections and talking to the DC).
>>
>> but even with fakeauth we can see cpu usage rising. we then enabled
>> 2 workers and this seems to reduce the problem somewhat. the rise is not
>> that fast...
>
> ... but also happens in the end!
>
>
>> -----Original Message-----
>> From: Carlos Defoe [mailto:carlosdefoe_at_gmail.com]
>> Sent: Tuesday, 4 February 2014 20:38
>> To: squid-users
>> Subject: Re: AW: [squid-users] squid 3.4. uses 100% cpu with ntlm_auth
>>
>> For me, version 3.4.3 has the same behavior. It uses 100% CPU (on
>> one core, the others are normal). For the users, browsing just
>> slows down. As soon as I change back to 3.3.8, everything
>> works fine.
>>
>> Actually I'm not sure the problem is caused by ntlm or kerberos or
>> external_acl_type or anything related to authentication. But I can't
>> disable it to be sure.
>>
>> This time I will leave one server running with 3.4.3 and try to debug.
>> I have already tried to increase the debug level on every auth helper,
>> but I couldn't see anything wrong. I'll try debug_options ALL,9
>> tomorrow.
>>
>> With strace, what should I look for? System calls that squid makes all
>> the time...
>>
>>
>> On Sun, Jan 26, 2014 at 11:47 PM, Alan <lameventanas_at_gmail.com> wrote:
>> > On Wed, Jan 8, 2014 at 1:05 PM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
>> >> On 7/01/2014 10:21 p.m., Rietzler, Markus (RZF, SG 324 /
>> >> <RIETZLER_SOFTWARE>) wrote:
>> >>> thanks,
>> >>>
>> >>> our assumption is that it is related to helper management. with 3.4
>> >>> there is a "new helper protocol", right?
>> >>
>> >> Right. That is the big user-visible bit in 3.4.
>> >>
>> >> But there are other background changes involving TCP connection
>> >> management, authentication management, ACL behaviours and some things
>> >> in 3.3 series also potentially affecting NTLM.
>> >>
>> >> The feature changes just give us a direction to look in. We still have
>> >> to diagnose each new bug in detail to be sure. There are others already
>> >> using NTLM in older 3.3/3.4 versions without seeing this problem, for
>> >> example.
>> >>
>> >>> our environment worked with 3.2 without problems. now with the jump to
>> >>> 3.4 it does not work anymore. so the number of requests is somehow
>> >>> important, but as it worked in the past...
>> >>>
>> >>> if we go without ntlm_auth we can't see any high cpu load. so the
>> >>> first suspicion, ACL and e.g. regex problems, can be discarded. maybe
>> >>> there are some cross influences. but we think it lies somewhere in
>> >>> helpers/auth.
>> >>
>> >> Did you get any better cache.log trace with the debug_options 29,9 84,9?
>> >>
>> >> Amos
>> >>
>> >
>> > I have the same problem here, I noticed it when I went from 3.3.8 to
>> > 3.4.2.
>> > I assumed the problem was introduced with 3.4.x, so I went back to
>> > 3.3.11 and it is working fine.
>> > I'm using aufs, negotiate_kerberos_auth and a custom external acl
>> > helper.
>> >
>> > Unfortunately these are production servers, so I can't strace or
>> > increase logging as suggested.
Received on Thu Apr 24 2014 - 21:57:58 MDT