Re: [squid-users] Fwd: rock store: a bug or ...?

From: Henrik Lidström <freebsd_at_lidstrom.eu>
Date: Mon, 03 Feb 2014 12:39:43 +0100

On 01/29/14 22:10, Nikolai Gorchilov wrote:
> Can somebody reproduce the problem? Or I'm the only one who have it?
>
> Best,
> Niki
>
> P.S. Made some additional testing with diskd storage - no problems
> detected. IPC seems to work as expected in diskd mode.
>
> On Wed, Jan 29, 2014 at 12:26 PM, Nikolai Gorchilov <niki_at_x3me.net> wrote:
>> I just made a new discovery - the problem seemingly disappears when
>> running Squid with -N, thus removing the disker process out of the
>> equation.
>>
>> Could it be a worker-disker IPC related issue, instead of rock store
>> specific problem?
>>
>> Best,
>> Niki
>>
>> On Wed, Jan 29, 2014 at 12:00 PM, Nikolai Gorchilov <niki_at_x3me.net> wrote:
>>> Dear Amos
>>>
>>> On Wed, Jan 29, 2014 at 5:18 AM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
>>>>>
>>>>> The simple steps to reproduce:
>>>>>
>>>>> 1. Empty store dir and recreate swap with -z
>>>>
>>>> Did you leave sufficient time after doing this to ensure that the -z
>>>> operation was completed successfully and all the "squid -z" spawned
>>>> processes stopped?
>>>
>>> Yes.
>>>
>>>>> 2. Start Squid with the above config
>>>>> 3. Open a new private browser session (or clear browser cache) and
>>>>> request www.w3.org. Most of the 35+ requested items will get cached.
>>>>> 4. Once the page loads fully, request full reload Ctrl-Shift-R/Cmd-Shift-R
>>>>
>>>> Step 4 operation is blocked by the "reload_into_ims on" config directive
>>>> which turns these force-reload (Cache-Control:no-cache) requests into
>>>> If-Modified-Since (IMS) requests.
>>>>
>>>> Under correct protocol operation they should be TCP_MISS responses.
>>>
>>> Exactly. This problem exists only when delivering TCP_HITs.
>>> TCP_REFRESH_UNMODIFIED for instance works fine.
>>>
>>> As mentioned in conf comments reload_into_ims is used only to make the
>>> problem reproduction easier - having multiple TCP_HITs in short period
>>> of time somehow trigger it.
>>>
>>> Same scenario works perfectly fine when using aufs.
>>>
>>>>>> During this (and every next) reload few (1-3-5) random requests out of
>>>>> all 35+ objects will enter into this strange state, until the user
>>>>> aborts the connection.
>>>>>
>>>>> Access.log excerpts:
>>>>> rock: http://pastebin.com/QWTgqRTD
>>>>> aufs: http://pastebin.com/0WGdHPP9
>>>>>
>>>>> Pay attention to '===[]===' marks in the logs, marking user's actions.
>>>>>
>>>>> Please help! Can you reproduce the problem or it's somehow related to
>>>>> my environment?
>>>>
>>>>
>>>> I note that you are only waiting 5-10 seconds between teh refresh and
>>>> escape. While this may be long enough, does the trace show any
>>>> difference if you leave it for a much longer time such as 30 or 60 seconds?
>>>
>>> I have seen connections in that state waiting for hours (say 1 night).
>>> Just did 30 minutes test:
>>>
>>> 1390989136.532 1910610 172.16.101.252 TCP_HIT_ABORTED/000 0 GET
>>> http://www.w3.org/2008/site/css/minimum - HIER_NONE/- -
>>>
>>> Checked active_connections for this same connection just before
>>> hitting ESC. Here's what Squid reports:
>>>
>>> Connection: 0x28e8be8
>>> FD 22, read 322, wrote 0
>>> FD desc: Reading next request
>>> in: buf 0x28eb9b0, offset 0, size 4096
>>> remote: 172.16.101.252:58477
>>> local: 192.168.1.10:3128
>>> nrequests: 1
>>> uri http://www.w3.org/2008/site/css/minimum
>>> logType TCP_HIT
>>> out.offset 0, out.size 0
>>> req_sz 322
>>> entry 0x2935d50/DC7BF6E1F61DA763BBCBD0043D4281A4
>>> start 1390987225.923229 (1886.649296 seconds ago)
>>> username -
>>>
>>> Hope this helps!
>>>
>>> Best,
>>> Niki

Hi there.
Noticed the same thing when testing 3.HEAD-20140127-r13248 with large
rock support this weekend at home. Connections stalling forever after a
while , until ESC (stop browser) as you say. This is on FreeBSD
9.2-STABLE running aufs usually, so no IPC/SHM extra configuration had
been done. Reverted to aufs, cause no time to do further analysis.

ex:
1391183222.450 293652 192.168.2.107 TCP_HIT_ABORTED/000 0 GET
http://s.ytimg.com/yts/jsbin/www-vflZWeAO9/www_common_mod.js - HIER_NONE/- -

/Henrik
Received on Mon Feb 03 2014 - 11:40:16 MST

This archive was generated by hypermail 2.2.0 : Tue Feb 04 2014 - 12:00:04 MST