Re: [squid-users] squid 3.2.1 under solaris dying with "segment violation"

From: Jose-Marcio Martins da Cruz <Jose-Marcio.Martins_at_mines-paristech.fr>
Date: Mon, 17 Sep 2012 08:54:15 +0200

Hello,

Can someone help me?

Should I send this to bugs_at_squid-cache.org?

Judging from the core files I posted here, squid dies every 1-3 minutes somewhere
in memory allocation or free, apparently because of a bad ipcache entry.

Squid stopped dying on Friday evening and started dying again on Monday morning,
probably because some client is asking for some "strange URL". This doesn't happen
with squid 3.1.19.

Is there any other information I should post?

Thanks

José-Marcio

Jose-Marcio Martins da Cruz wrote:
>
> Hello,
>
> No answers about this problem so far... I finally managed to get some core files.
>
> I have a number of core files; all of them can be summarized by these two,
> both related to ipcache management: ipcacheFreeEntry or ipcacheCreateEntry.
>
> I also got some results with truss, showing that both crashes are related to
> ipcache entries.
>
> Any help would be welcome.
>
> Regards,
>
> José-Marcio
>
> ***********************************************************************
> ...
> Core was generated by `(squid-1)'.
> Program terminated with signal 6, Aborted.
> #0 0xfed59315 in _lwp_kill () from /lib/libc.so.1
> (gdb) bt
> #0 0xfed59315 in _lwp_kill () from /lib/libc.so.1
> #1 0xfed54188 in thr_kill () from /lib/libc.so.1
> #2 0xfed01d73 in raise () from /lib/libc.so.1
> #3 0xfece1bbd in abort () from /lib/libc.so.1
> #4 0x082288a1 in death (sig=11) at tools.cc:388
> #5 0xfed55dbf in __sighndlr () from /lib/libc.so.1
> #6 0xfed4bab5 in call_user_handler () from /lib/libc.so.1
> #7 <signal handler called>
> #8 0x081dcb3b in memFree (p=0x87f4af8, type=184436712) at mem.cc:214
> #9 0x081d6bd0 in ipcacheFreeEntry (data=0xafe47e8) at ipcache.cc:1140
> #10 0x081d7342 in ipcacheRelease (dofree=true, i=0xafe47e8) at ipcache.cc:203
> #11 ipcacheRelease (i=0xafe47e8, dofree=true) at ipcache.cc:186
> #12 0x081d8aba in ipcache_nbgethostbyname (name=0xbdd4440 <error reading variable>,
> handler=0x81f34a0 <peerSelectDnsResults(ipcache_addrs const*,
> DnsLookupDetails const&, void*)>, handlerData=0xa841940)
> at ipcache.cc:670
> #13 0x081f0dd8 in peerSelectDnsPaths (psstate=0xa841940) at peer_select.cc:268
> #14 0x081f1a1d in peerSelectFoo (ps=0xa841940) at peer_select.cc:511
> #15 0x0822e324 in tunnelStart (http=0xbdd1e38, size_ptr=0xbdd1e50,
> status_ptr=0xbdd1ef4) at tunnel.cc:665
> #16 0x08161bea in ClientHttpRequest::processRequest (this=0xbdd1e38) at
> client_side_request.cc:1340
> #17 0x081623dd in ClientHttpRequest::doCallouts (this=0xbdd1e38) at
> client_side_request.cc:1604
> #18 0x08164545 in ClientRequestContext::clientAccessCheckDone (this=0xa5d1b48,
> answer=@0x80469cc: ACCESS_ALLOWED)
> at client_side_request.cc:842
> ...
>
> *********************************************************
>
> Core was generated by `(squid-1)'.
> Program terminated with signal 6, Aborted.
> #0 0xfed59315 in _lwp_kill () from /lib/libc.so.1
> (gdb) bt
> #0 0xfed59315 in _lwp_kill () from /lib/libc.so.1
> #1 0xfed54188 in thr_kill () from /lib/libc.so.1
> #2 0xfed01d73 in raise () from /lib/libc.so.1
> #3 0xfece1bbd in abort () from /lib/libc.so.1
> #4 0x082288a1 in death (sig=11) at tools.cc:388
> #5 0xfed55dbf in __sighndlr () from /lib/libc.so.1
> #6 0xfed4bab5 in call_user_handler () from /lib/libc.so.1
> #7 <signal handler called>
> #8 0x081dcaca in memAllocate (type=142559992) at mem.cc:206
> #9 0x081d6bfa in ipcacheCreateEntry (name=0xc042a10 <error reading variable>)
> at ipcache.cc:302
> #10 0x081d8adf in ipcache_nbgethostbyname (name=0xc042a10 <error reading variable>,
> handler=0x81f34a0 <peerSelectDnsResults(ipcache_addrs const*,
> DnsLookupDetails const&, void*)>, handlerData=0xa5d2de0)
> at ipcache.cc:692
> #11 0x081f0dd8 in peerSelectDnsPaths (psstate=0xa5d2de0) at peer_select.cc:268
> #12 0x081f1a1d in peerSelectFoo (ps=0xa5d2de0) at peer_select.cc:511
> #13 0x0818b83f in FwdState::start (this=0xa6061d0, aSelf=...) at forward.cc:144
> #14 0x0818fa4d in FwdState::Start (clientConn=..., entry=0xc08f0e8,
> request=0xc042930, al=...) at forward.cc:318
> #15 0x0815c3a1 in clientReplyContext::processMiss (this=0xc041768) at
> client_side_reply.cc:665
> #16 0x0815cab0 in clientReplyContext::doGetMoreData (this=0xc041768) at
> client_side_reply.cc:1754
> #17 0x0815cf6b in clientReplyContext::identifyStoreObject (this=0xc041768) at
> client_side_reply.cc:1535
> #18 0x08160974 in ClientHttpRequest::httpStart (this=0xc040308) at
> client_side_request.cc:1358
> #19 0x081623dd in ClientHttpRequest::doCallouts (this=0xc040308) at
> client_side_request.cc:1604
> #20 0x081635d5 in checkNoCacheDoneWrapper (answer=ACCESS_ALLOWED,
> data=0xa5d1b48) at client_side_request.cc:1271
> #21 0x08284655 in ACLChecklist::checkCallback (this=0xa5d25a0,
> answer=ACCESS_ALLOWED) at Checklist.cc:194
> #22 0x082514b3 in ACLFilledChecklist::checkCallback (this=0xa5d25a0,
> answer=ACCESS_ALLOWED) at FilledChecklist.cc:37
> #23 0x08285403 in ACLChecklist::matchNonBlocking (this=0xa5d25a0) at
> Checklist.cc:126
> #24 0x08285739 in ACLChecklist::nonBlockingCheck (this=0xa5d25a0,
> callback_=0x81635b0 <checkNoCacheDoneWrapper(allow_t, void*)>,
> callback_data_=0xa5d1b48) at Checklist.cc:325
> #25 0x08163589 in ClientRequestContext::checkNoCache (this=0xa5d1b48) at
> client_side_request.cc:1256
> #26 0x08162371 in ClientHttpRequest::doCallouts (this=0xc040308) at
> client_side_request.cc:1557
> #27 0x08164545 in ClientRequestContext::clientAccessCheckDone (this=0xa5d1b48,
> answer=@0x80469cc: ACCESS_ALLOWED)
> at client_side_request.cc:842
> #28 0x081647b9 in ClientRequestContext::clientAccessCheck2 (this=0xa5d1b48) at
> client_side_request.cc:735
> #29 0x08161e35 in ClientHttpRequest::doCallouts (this=0xc040308) at
> client_side_request.cc:1542
> #30 0x08164545 in ClientRequestContext::clientAccessCheckDone (this=0xa5d1b48,
> answer=@0x8046d70: ACCESS_ALLOWED)
> at client_side_request.cc:842
> #31 0x081648e5 in clientAccessCheckDoneWrapper (answer=ACCESS_ALLOWED,
> data=0xa5d1b48) at client_side_request.cc:747
> #32 0x08284655 in ACLChecklist::checkCallback (this=0xa5d2790,
> answer=ACCESS_ALLOWED) at Checklist.cc:194
> #33 0x082514b3 in ACLFilledChecklist::checkCallback (this=0xa5d2790,
> answer=ACCESS_ALLOWED) at FilledChecklist.cc:37
> #34 0x082854b1 in ACLChecklist::matchNonBlocking (this=0xa5d2790) at
> Checklist.cc:105
> #35 0x08285739 in ACLChecklist::nonBlockingCheck (this=0xa5d2790,
> callback_=0x81648c0 <clientAccessCheckDoneWrapper(allow_t, void*)>,
> callback_data_=0xa5d1b48) at Checklist.cc:325
> #36 0x08164a17 in ClientRequestContext::clientAccessCheck (this=0xa5d1b48) at
> client_side_request.cc:715
> #37 0x08161cc9 in ClientHttpRequest::doCallouts (this=0xc040308) at
> client_side_request.cc:1513
> #38 0x08165ba9 in ClientRequestContext::hostHeaderVerify (this=0xa5d1b48) at
> client_side_request.cc:663
> #39 0x08161c97 in ClientHttpRequest::doCallouts (this=0xc040308) at
> client_side_request.cc:1506
> #40 0x08151348 in clientProcessRequest (conn=0xb88e568, hp=0xc042900,
> context=<optimized out>, method=..., http_ver=...)
> at client_side.cc:2691
> #41 0x08152329 in ConnStateData::clientParseRequests (this=0xb88e568) at
> client_side.cc:2788
>
>
>
> Jose-Marcio Martins da Cruz wrote:
>>
>> Hello,
>>
>> I'm beginning a new thread.
>>
>> As a reminder, I'm trying to migrate a squid server from 3.1.19 to 3.2.1. The old
>> one works fine, but with the new version, squid dies every 1 to 3 minutes with
>> "segment violation".
>>
>> I'm not able to get core files under Solaris, despite correct settings
>> (ulimit, ...).
>>
>> So I set debug_options to "ALL,8" and "ALL,9", and I wrote a simple script
>> which prints the last 25 lines logged before each SEGFAULT received.
>>
>> The result is at http://www.j-chkmail.org/users/squid
>>
>> The contents of the three files are:
>>
>> * cores-4.txt - debug_options ALL,8 - squid doing cache
>> * cores-5.txt - debug_options ALL,9 - squid doing cache
>> * cores-9.txt - debug_options ALL,8 - cache_dir option commented
>>
>> (If you want, I can send you the files.)
>>
>> The third test was to be sure that the problem didn't come from the cache
>> partition on a zfs filesystem.
>>
>>
>> Before each "segfault" there is a line indicating the destination hostname. Most
>> of the time, but not always, the hostname doesn't resolve or it's a CNAME.
>>
>> Thanks for your help,
>>
>> José-Marcio
>>
>>
>>
>>
>>
>>
>
>
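
For anyone who wants to reproduce the log extraction described in the quoted
message (the last 25 lines before each SEGFAULT), here is a minimal sketch of
how such a script could look. It is not the original script. The function name
contexts_before_marker and the marker text "Segment Violation" are assumptions
made for illustration; adjust the marker to whatever your cache.log actually
prints when squid dies.

```python
import sys
from collections import deque


def contexts_before_marker(lines, marker="Segment Violation", context=25):
    """Yield, for each line containing `marker`, the `context` lines that
    preceded it followed by the marker line itself.

    `marker` is an assumed substring, not necessarily squid's exact wording.
    """
    recent = deque(maxlen=context)  # ring buffer of the most recent lines
    for line in lines:
        if marker in line:
            yield list(recent) + [line]
            recent.clear()  # start a fresh window after each crash
        else:
            recent.append(line)


# Only runs when a log file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        for block in contexts_before_marker(log):
            sys.stdout.writelines(block)
            print("-" * 60)
```

Run it with the path to cache.log as the only argument. The ring buffer keeps
memory use constant even with debug_options at ALL,9.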

-- 
  Sent from my typewriter.
  ---------------------------------------------------------------
   Spam : Classement statistique de messages électroniques -
          Une approche pragmatique
   At Amazon.fr: http://amzn.to/LEscRu or http://bit.ly/SpamJM
  ---------------------------------------------------------------
  Jose Marcio MARTINS DA CRUZ            http://www.j-chkmail.org
  Ecole des Mines de Paris                   http://bit.ly/SpamJM
  60, bd Saint Michel                      75272 - PARIS CEDEX 06
  mailto:Jose-Marcio.Martins_at_mines-paristech.fr
Received on Mon Sep 17 2012 - 06:54:23 MDT
