Re: [squid-users] Fwd: rock store: a bug or ...?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 04 Feb 2014 10:00:24 +1300

On 2014-02-04 00:39, Henrik Lidström wrote:
> On 01/29/14 22:10, Nikolai Gorchilov wrote:
>> Can somebody reproduce the problem? Or I'm the only one who have it?
>>
>> Best,
>> Niki
>>
>> P.S. Made some additional testing with diskd storage - no problems
>> detected. IPC seems to work as expected in diskd mode.
>>
>> On Wed, Jan 29, 2014 at 12:26 PM, Nikolai Gorchilov <niki_at_x3me.net>
>> wrote:
>>> I just made a new discovery - the problem seemingly disappears when
>>> running Squid with -N, thus removing the disker process out of the
>>> equation.
>>>
>>> Could it be a worker-disker IPC related issue, instead of rock store
>>> specific problem?
>>>
>>> Best,
>>> Niki
>>>
>>> On Wed, Jan 29, 2014 at 12:00 PM, Nikolai Gorchilov <niki_at_x3me.net>
>>> wrote:
>>>> Dear Amos
>>>>
>>>> On Wed, Jan 29, 2014 at 5:18 AM, Amos Jeffries
>>>> <squid3_at_treenet.co.nz> wrote:
>>>>>>
>>>>>> The simple steps to reproduce:
>>>>>>
>>>>>> 1. Empty store dir and recreate swap with -z
>>>>>
>>>>> Did you leave sufficient time after doing this to ensure that the
>>>>> -z
>>>>> operation was completed successfully and all the "squid -z" spawned
>>>>> processes stopped?
>>>>
>>>> Yes.
>>>>
>>>>>> 2. Start Squid with the above config
>>>>>> 3. Open a new private browser session (or clear browser cache) and
>>>>>> request www.w3.org. Most of the 35+ requested items will get
>>>>>> cached.
>>>>>> 4. Once the page loads fully, request full reload
>>>>>> Ctrl-Shift-R/Cmd-Shift-R
>>>>>
>>>>> Step 4 operation is blocked by the "reload_into_ims on" config
>>>>> directive
>>>>> which turns these force-reload (Cache-Control:no-cache) requests
>>>>> into
>>>>> If-Modified-Since (IMS) requests.
>>>>>
>>>>> Under correct protocol operation they should be TCP_MISS responses.
>>>>
>>>> Exactly. This problem exists only when delivering TCP_HITs.
>>>> TCP_REFRESH_UNMODIFIED for instance works fine.
>>>>
>>>> As mentioned in conf comments reload_into_ims is used only to make
>>>> the
>>>> problem reproduction easier - having multiple TCP_HITs in short
>>>> period
>>>> of time somehow trigger it.
>>>>
>>>> Same scenario works perfectly fine when using aufs.
>>>>
>>>>>>> During this (and every next) reload few (1-3-5) random requests
>>>>>>> out of
>>>>>> all 35+ objects will enter into this strange state, until the user
>>>>>> aborts the connection.
>>>>>>
>>>>>> Access.log excerpts:
>>>>>> rock: http://pastebin.com/QWTgqRTD
>>>>>> aufs: http://pastebin.com/0WGdHPP9
>>>>>>
>>>>>> Pay attention to '===[]===' marks in the logs, marking user's
>>>>>> actions.
>>>>>>
>>>>>> Please help! Can you reproduce the problem or it's somehow related
>>>>>> to
>>>>>> my environment?
>>>>>
>>>>>
>>>>> I note that you are only waiting 5-10 seconds between teh refresh
>>>>> and
>>>>> escape. While this may be long enough, does the trace show any
>>>>> difference if you leave it for a much longer time such as 30 or 60
>>>>> seconds?
>>>>
>>>> I have seen connections in that state waiting for hours (say 1
>>>> night).
>>>> Just did 30 minutes test:
>>>>
>>>> 1390989136.532 1910610 172.16.101.252 TCP_HIT_ABORTED/000 0 GET
>>>> http://www.w3.org/2008/site/css/minimum - HIER_NONE/- -
>>>>
>>>> Checked active_connections for this same connection just before
>>>> hitting ESC. Here's what Squid reports:
>>>>
>>>> Connection: 0x28e8be8
>>>> FD 22, read 322, wrote 0
>>>> FD desc: Reading next request
>>>> in: buf 0x28eb9b0, offset 0, size 4096
>>>> remote: 172.16.101.252:58477
>>>> local: 192.168.1.10:3128
>>>> nrequests: 1
>>>> uri http://www.w3.org/2008/site/css/minimum
>>>> logType TCP_HIT
>>>> out.offset 0, out.size 0
>>>> req_sz 322
>>>> entry 0x2935d50/DC7BF6E1F61DA763BBCBD0043D4281A4
>>>> start 1390987225.923229 (1886.649296 seconds ago)
>>>> username -
>>>>
>>>> Hope this helps!
>>>>
>>>> Best,
>>>> Niki
>
> Hi there.
> Noticed the same thing when testing 3.HEAD-20140127-r13248 with large
> rock support this weekend at home. Connections stalling forever after a
> while , until ESC (stop browser) as you say. This is on FreeBSD
> 9.2-STABLE running aufs usually, so no IPC/SHM extra configuration had
> been done. Reverted to aufs, cause no time to do further analysis.
>
> ex:
> 1391183222.450 293652 192.168.2.107 TCP_HIT_ABORTED/000 0 GET
> http://s.ytimg.com/yts/jsbin/www-vflZWeAO9/www_common_mod.js -
> HIER_NONE/- -
>
> /Henrik

This is being discussed in squid-dev mailing list thread "Re: Regression
introduced in trunk r13201 (large rock merge)" (I have cc'd you on my
reply there).

PS. Problems with "alpha" release code (AKA. 3.HEAD, trunk, and other
devel bzr branches) are actually off-topic here. Please followup in the
squid-dev thread.

Amos
Received on Mon Feb 03 2014 - 21:00:28 MST

This archive was generated by hypermail 2.2.0 : Tue Feb 04 2014 - 12:00:04 MST