ICAP communication stalling

From: Guy Helmer <guy.helmer_at_palisadesystems.com>
Date: Tue, 1 May 2012 16:03:04 -0500

I have Squid (sync'ed with trunk as of -r12082 (2012-03-07 v3.2.0.16+)) running at a site where it is configured to use ICAP respmod to analyze several binary/executable content types, and as a result Squid was sending many large Netflix HTTP responses to the icap service.

Under what seems to be a heavy load at the time, about 25 of the connections from squid to icap stalled in the midst of the respmod request. Based on packet traces, icap received roughly 50% to 90% of the encapsulated HTTP response before the connection went quiet. The stalled requests all look similar to this:

  4 13:32:27.149901 127.0.0.1 -> 127.0.0.1 ICAP RESPMOD icap://127.0.0.1:1344/mw ICAP/1.0

0000 02 00 00 00 45 00 04 48 ec c1 40 00 40 06 00 00 ....E..H..@.@...
0010 7f 00 00 01 7f 00 00 01 2d 83 05 40 24 13 1b 0b ........-..@$...
0020 4e 06 72 f8 80 18 23 00 02 3d 00 00 01 01 08 0a N.r...#..=......
0030 0f 7e 72 cd d5 08 7b 9a 52 45 53 50 4d 4f 44 20 .~r...{.RESPMOD
0040 69 63 61 70 3a 2f 2f 31 32 37 2e 30 2e 30 2e 31 icap://127.0.0.1
0050 3a 31 33 34 34 2f 6d 77 20 49 43 41 50 2f 31 2e :1344/mw ICAP/1.
0060 30 0d 0a 48 6f 73 74 3a 20 31 32 37 2e 30 2e 30 0..Host: 127.0.0
0070 2e 31 3a 31 33 34 34 0d 0a 44 61 74 65 3a 20 4d .1:1344..Date: M
0080 6f 6e 2c 20 33 30 20 41 70 72 20 32 30 31 32 20 on, 30 Apr 2012
0090 31 37 3a 33 32 3a 32 37 20 47 4d 54 0d 0a 45 6e 17:32:27 GMT..En
00a0 63 61 70 73 75 6c 61 74 65 64 3a 20 72 65 71 2d capsulated: req-
00b0 68 64 72 3d 30 2c 20 72 65 73 2d 68 64 72 3d 35 hdr=0, res-hdr=5
00c0 33 34 2c 20 72 65 73 2d 62 6f 64 79 3d 38 36 33 34, res-body=863
00d0 0d 0a 58 2d 43 6c 69 65 6e 74 2d 49 50 3a 20 31 ..X-Client-IP: 1
00e0 30 2e 31 30 2e 30 2e 36 37 0d 0a 0d 0a 47 45 54 0.10.0.67....GET
00f0 20 68 74 74 70 3a 2f 2f 6e 66 6c 78 2e 69 2e 61 http://nflx.i.a
0100 33 39 39 66 30 30 33 2e 78 2e 6c 63 64 6e 2e 6e 399f003.x.lcdn.n
0110 66 6c 78 69 6d 67 2e 63 6f 6d 2f 31 38 32 2f 31 flximg.com/182/1
0120 30 33 38 37 33 34 31 38 32 2e 69 73 6d 76 2f 72 038734182.ismv/r
0130 61 6e 67 65 2f 36 38 30 38 36 39 37 32 34 2d 36 ange/680869724-6
0140 38 31 36 39 38 31 37 31 3f 65 74 69 6d 65 3d 32 81698171?etime=2
0150 30 31 32 30 35 30 31 30 31 32 32 30 38 26 6d 6f 0120501012208&mo
0160 76 69 65 48 61 73 68 3d 31 33 33 26 65 6e 63 6f vieHash=133&enco
0170 64 65 64 3d 30 37 33 61 63 37 65 32 31 39 31 38 ded=073ac7e21918
0180 39 33 32 38 37 33 64 32 31 26 72 61 6e 64 6f 6d 932873d21&random
0190 3d 31 32 39 36 33 30 33 35 36 36 20 48 54 54 50 =1296303566 HTTP
01a0 2f 31 2e 30 0d 0a 41 63 63 65 70 74 3a 20 2a 2f /1.0..Accept: */
01b0 2a 0d 0a 52 65 66 65 72 65 72 3a 20 68 74 74 70 *..Referer: http
01c0 3a 2f 2f 6d 6f 76 69 65 73 2e 6e 65 74 66 6c 69 ://movies.netfli
01d0 78 2e 63 6f 6d 2f 6c 61 79 6f 75 74 2f 73 69 6c x.com/layout/sil
01e0 76 65 72 6c 69 67 68 74 2f 53 4c 50 6c 61 79 65 verlight/SLPlaye
01f0 72 34 2e 78 61 70 3f 76 3d 32 2e 31 33 33 32 2e r4.xap?v=2.1332.
0200 35 32 30 2e 31 2d 32 0d 0a 41 63 63 65 70 74 2d 520.1-2..Accept-
0210 4c 61 6e 67 75 61 67 65 3a 20 65 6e 2d 75 73 0d Language: en-us.
0220 0a 55 73 65 72 2d 41 67 65 6e 74 3a 20 4d 6f 7a .User-Agent: Moz
0230 69 6c 6c 61 2f 34 2e 30 20 28 63 6f 6d 70 61 74 illa/4.0 (compat
0240 69 62 6c 65 3b 20 4d 53 49 45 20 38 2e 30 3b 20 ible; MSIE 8.0;
0250 57 69 6e 64 6f 77 73 20 4e 54 20 35 2e 31 3b 20 Windows NT 5.1;
0260 54 72 69 64 65 6e 74 2f 34 2e 30 3b 20 2e 4e 45 Trident/4.0; .NE
0270 54 20 43 4c 52 20 31 2e 31 2e 34 33 32 32 3b 20 T CLR 1.1.4322;
0280 2e 4e 45 54 20 43 4c 52 20 32 2e 30 2e 35 30 37 .NET CLR 2.0.507
0290 32 37 3b 20 2e 4e 45 54 20 43 4c 52 20 33 2e 30 27; .NET CLR 3.0
02a0 2e 34 35 30 36 2e 32 31 35 32 3b 20 2e 4e 45 54 .4506.2152; .NET
02b0 20 43 4c 52 20 33 2e 35 2e 33 30 37 32 39 3b 20 CLR 3.5.30729;
02c0 2e 4e 45 54 34 2e 30 43 3b 20 49 6e 66 6f 50 61 .NET4.0C; InfoPa
02d0 74 68 2e 32 29 0d 0a 48 6f 73 74 3a 20 6e 66 6c th.2)..Host: nfl
02e0 78 2e 69 2e 61 33 39 39 66 30 30 33 2e 78 2e 6c x.i.a399f003.x.l
02f0 63 64 6e 2e 6e 66 6c 78 69 6d 67 2e 63 6f 6d 0d cdn.nflximg.com.
0300 0a 0d 0a 48 54 54 50 2f 31 2e 31 20 32 30 30 20 ...HTTP/1.1 200
0310 4f 4b 0d 0a 43 61 63 68 65 2d 43 6f 6e 74 72 6f OK..Cache-Contro
0320 6c 3a 20 6e 6f 2d 73 74 6f 72 65 0d 0a 43 6f 6e l: no-store..Con
0330 74 65 6e 74 2d 54 79 70 65 3a 20 61 70 70 6c 69 tent-Type: appli
0340 63 61 74 69 6f 6e 2f 6f 63 74 65 74 2d 73 74 72 cation/octet-str
0350 65 61 6d 0d 0a 45 54 61 67 3a 20 22 65 64 30 39 eam..ETag: "ed09
0360 34 64 61 2d 32 63 31 33 39 61 62 37 2d 34 61 61 4da-2c139ab7-4aa
0370 62 31 33 65 30 66 64 31 38 30 22 0d 0a 45 78 70 b13e0fd180"..Exp
0380 69 72 65 73 3a 20 57 65 64 2c 20 31 37 20 41 75 ires: Wed, 17 Au
0390 67 20 32 30 31 31 20 31 37 3a 33 38 3a 30 31 20 g 2011 17:38:01
03a0 47 4d 54 0d 0a 4c 61 73 74 2d 4d 6f 64 69 66 69 GMT..Last-Modifi
03b0 65 64 3a 20 57 65 64 2c 20 31 37 20 41 75 67 20 ed: Wed, 17 Aug
03c0 32 30 31 31 20 31 30 3a 34 38 3a 35 34 20 47 4d 2011 10:48:54 GM
03d0 54 0d 0a 41 63 63 65 70 74 2d 52 61 6e 67 65 73 T..Accept-Ranges
03e0 3a 20 62 79 74 65 73 0d 0a 53 65 72 76 65 72 3a : bytes..Server:
03f0 20 4c 65 76 65 6c 2d 33 20 4f 72 69 67 69 6e 20 Level-3 Origin
0400 53 74 6f 72 61 67 65 2f 31 2e 35 0d 0a 44 61 74 Storage/1.5..Dat
0410 65 3a 20 4d 6f 6e 2c 20 33 30 20 41 70 72 20 32 e: Mon, 30 Apr 2
0420 30 31 32 20 31 37 3a 33 32 3a 32 37 20 47 4d 54 012 17:32:27 GMT
0430 0d 0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 ..Content-Length
0440 3a 20 38 32 38 34 34 38 0d 0a 0d 0a : 828448….

Followed by data, and then seven seconds into the request:
508 7.963216 127.0.0.1 -> 127.0.0.1 ICAP Continuation
509 8.063217 127.0.0.1 -> 127.0.0.1 TCP icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574111001 TSER=259953128
510 67.964832 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
511 67.964832 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260013128 TSER=3574111001
512 127.966453 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
513 127.966453 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
514 127.966453 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574230901 TSER=260013128
515 127.966453 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 512#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260073128 TSER=3574111001
516 187.968078 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
517 187.968078 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
518 187.968078 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574290901 TSER=260073128
519 187.968078 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 516#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260133128 TSER=3574230901
520 247.969707 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
521 247.969707 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
522 247.969707 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574350901 TSER=260133128
523 247.969707 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 520#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260193128 TSER=3574290901
524 307.971339 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
525 307.971339 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
526 307.971339 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574410901 TSER=260193128
527 307.971339 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 524#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260253128 TSER=3574350901
528 367.972974 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
529 367.972974 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
530 367.972974 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574470901 TSER=260253128
531 367.972974 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 528#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260313128 TSER=3574410901
532 427.974613 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
533 427.974613 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
534 427.974613 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574530901 TSER=260313128
535 427.974613 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 532#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260373128 TSER=3574470901
536 487.976255 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
537 487.976255 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
538 487.976255 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574590901 TSER=260373128
539 487.976255 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 536#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260433128 TSER=3574530901
540 547.977901 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] 11651 > icap [ACK] Seq=825599 Ack=1 Win=71680 Len=0
541 547.977901 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive] icap > 11651 [ACK] Seq=0 Ack=825600 Win=71680 Len=0
542 547.977901 127.0.0.1 -> 127.0.0.1 TCP [TCP Keep-Alive ACK] icap > 11651 [ACK] Seq=1 Ack=825600 Win=71680 Len=0 TSV=3574650901 TSER=260433128
543 547.977901 127.0.0.1 -> 127.0.0.1 TCP [TCP Dup ACK 540#1] 11651 > icap [ACK] Seq=825600 Ack=1 Win=71680 Len=0 TSV=260493128 TSER=3574590901

Then squid reported:
2012/04/30 13:41:21 kid1| suspending ICAP service for too many failures
2012/04/30 13:41:21 kid1| essential ICAP service is suspended: icap://127.0.0.1:1344/mw[down,susp,fail11]

But squid + icap handled a number of other ICAP REQMOD and RESPMOD requests in this period 13:32:27 to 13:41:21, including 28 or more RESPMOD requests to the mw service. We didn't see any other error reports about the mw icap service immediately prior to this error.

Does this sound similar to Squid Bug #3191 http://bugs.squid-cache.org/show_bug.cgi?id=3191? Otherwise, any ideas on how to debug this? This installation is at a very busy site, so I'm not sure a debugging log is feasible...

Thanks,
Guy--------
This message has been scanned by ComplianceSafe, powered by Palisade's PacketSure.
Received on Tue May 01 2012 - 21:03:20 MDT

This archive was generated by hypermail 2.2.0 : Tue May 08 2012 - 12:00:10 MDT