[squid-users] Re: SMP-Rock-frequent FATAL: Received Segment Violation...dying."only on kid3" from Dr.x on 2013-11-21 (squid-users)

From: Dr.x <ahmed.zaeem_at_netstream.ps>
Date: Thu, 21 Nov 2013 02:37:34 -0800 (PST)

Eliezer Croitoru-2 wrote
> Hey Dr,
>
> (notes inside)
>
> On 19/11/13 00:54, Amos Jeffries wrote:
>> Either event may have corrupted it slightly. Squid is supposed to
>> contain sufficient checksum protection in rock to cope with most forms
>> of corruption, but nobody's perfect.
>>
>> So, please try to get a core dump, or stack trace of the problem before
>> going any further. This will help us to isolate where the problem is
>> occurring. If it is corruption related we will be needing to try and add
>> better protection for that case.
>>
>> *after* that, please try:
>>
>> * shutting down your Squid by any means necessary to ensure there are 0
>> processes running.
> "pgrep squid"
>
>> * *move* the caches to somewhere they can be analysed later if necessary.
>
>> * rebuild the configured cache_dir with squid -z
>
>> * wait until -z process completed *AND* there are 0 processes still
>> running in the background
> And no traffic at all on the server.
>
>> * restart the main Squid
>>
>> This entire process should not take more than a minute.
>>
>> If the problem remains after doing that you will have successfully
>> eliminated cache corruption as a cause and we go back to needing a
>> backtrace to figure it out.
>
> The same result/test can be achieved by running the service in "RAM
> only" cache mode.
>
> If these crashes happen frequently, you will probably need only a
> very short run time to reproduce them.
>
> I assume that letting the service run in "RAM only" mode will allow
> it to keep serving clients smoothly for more than 24-48 hours, with
> 0 problems in cache.log.
> In case you still have trouble in RAM only mode, we can be
> more than 90% sure that the cause was related to the DISK cache.
> I would not rush to say that rock was at fault.
>
> If you can attach the related disk/partition details from fstab, it
> might help more.
> (Feel free to send all the technical data in a PM, since it can take
> time to process, especially core dumps.)
>
> Regards,
> Eliezer

Hi Eliezer, Amos,

After monitoring, I found that Squid no longer receives any "fatal signals".

What I did:
I removed all cache contents and let Squid start caching from scratch.
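
The rebuild procedure Amos describes in the quoted text (stop Squid, move the old cache aside, run squid -z, wait, restart) can be sketched as an ordered command plan. This is only a sketch under assumptions, not a tested script: both paths are hypothetical placeholders, `squid -k shutdown` is assumed as the way to stop Squid, and ownership or permissions of the cache directory are not handled.

```python
import shlex


def rebuild_plan(cache_dir, backup_dir, squid_bin="squid"):
    """Return the rebuild steps as a list of argv lists, in order.

    cache_dir and backup_dir are hypothetical paths chosen by the caller;
    nothing is executed here, the plan is only built.
    """
    return [
        [squid_bin, "-k", "shutdown"],  # ask the running Squid to stop
        ["pgrep", "squid"],             # repeat until it finds no process
        ["mv", cache_dir, backup_dir],  # keep the old cache for later analysis
        [squid_bin, "-z"],              # rebuild the configured cache_dir
        ["pgrep", "squid"],             # repeat until the -z run has finished
        [squid_bin],                    # restart the main Squid
    ]


if __name__ == "__main__":
    # Print the plan for review instead of running it.
    for argv in rebuild_plan("/path/to/rock", "/path/to/rock.old"):
        print(shlex.join(argv))
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere. The two `pgrep` steps stand for the "0 processes running" checks and would need to be repeated until nothing matches.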
=====================
here is general runtime info :
Squid Object Cache: Version 3.3.9

Start Time: Tue, 19 Nov 2013 07:37:00 GMT
Current Time: Thu, 21 Nov 2013 10:18:47 GMT

Connection information for squid:
        Number of clients accessing cache: 799
        Number of HTTP requests received: 7735174
        Number of ICP messages received: 0
        Number of ICP messages sent: 0
        Number of queued ICP replies: 0
        Number of HTCP messages received: 0
        Number of HTCP messages sent: 0
        Request failure ratio: 0.00
        Average HTTP requests per minute since start: 2543.0
        Average ICP messages per minute since start: 0.0
        Select loop called: 1340794089 times, 0.870 ms avg
Cache information for squid:
        Hits as % of all requests: 5min: 10.8%, 60min: 9.4%
        Hits as % of bytes sent: 5min: -0.9%, 60min: -0.7%
        Memory hits as % of hit requests: 5min: 21.6%, 60min: 20.5%
        Disk hits as % of hit requests: 5min: 10.3%, 60min: 10.9%
        Storage Swap size: 20837408 KB
        Storage Swap capacity: 50.9% used, 49.1% free
        Storage Mem size: 1024000 KB
        Storage Mem capacity: 100.0% used, 0.0% free
        Mean Object Size: 32.00 KB
        Requests given to unlinkd: 0
Median Service Times (seconds) 5 min 60 min:
        HTTP Requests (All): 0.08333 0.07964
        Cache Misses: 0.10273 0.09542
        Cache Hits: 0.00000 0.00000
        Near Hits: 0.03772 0.03772
        Not-Modified Replies: 0.00000 0.00000
        DNS Lookups: 0.00076 0.00172
        ICP Queries: 0.00000 0.00000
Resource usage for squid:
        UP Time: 182506.341 seconds
        CPU Time: 15893.487 seconds
        CPU Usage: 8.71%
        CPU Usage, 5 minute avg: 9.62%
        CPU Usage, 60 minute avg: 10.29%
        Process Data Segment Size via sbrk(): 445960 KB
        Maximum Resident Size: 15585152 KB
        *Page faults with physical i/o: 1*
Memory usage for squid via mallinfo():
        Total space in arena: 446620 KB
        Ordinary blocks: 388282 KB 6719 blks
        Small blocks: 0 KB 0 blks
        Holding blocks: 365972 KB 40 blks
        Free Small blocks: 0 KB
        Free Ordinary blocks: 58338 KB
        Total in use: 58338 KB 7%
        Total free: 58338 KB 7%
        Total size: 812592 KB
Memory accounted for:
        Total accounted: 60183 KB 7%
        memPool accounted: 60183 KB 7%
        memPool unaccounted: 752409 KB 93%
        memPoolAlloc calls: 80
        memPoolFree calls: 1998262623
File descriptor usage for squid:
        Maximum number of file descriptors: 655360
        Largest file desc currently in use: 1643
        Number of file desc currently in use: 2677
        Files queued for open: 0
        Available number of file descriptors: 652683
        Reserved number of file descriptors: 500
        Store Disk files open: 2
Internal Data Structures:
          2194 StoreEntries
          1029 StoreEntries with MemObjects
         32000 Hot Object Cache Items
        651168 on-disk objects
============================================================
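
As a cross-check of the report above, the averaged figures can be recomputed from the raw counters. The sketch below copies a few lines of the output into a string and derives requests per minute and CPU usage from them. The helper names `field` and `derived` are made up for illustration and are not part of Squid.

```python
import re

# A few lines copied from the report above.
REPORT = """\
Number of HTTP requests received: 7735174
Average HTTP requests per minute since start: 2543.0
UP Time: 182506.341 seconds
CPU Time: 15893.487 seconds
CPU Usage: 8.71%
"""


def field(report, label):
    """Return the first number that follows 'label:' in the report text."""
    match = re.search(re.escape(label) + r":\s*([-\d.]+)", report)
    if match is None:
        raise KeyError(label)
    return float(match.group(1))


def derived(report):
    """Recompute the averaged figures from the raw counters."""
    requests = field(report, "Number of HTTP requests received")
    uptime = field(report, "UP Time")
    cpu = field(report, "CPU Time")
    return {
        "requests_per_minute": requests / (uptime / 60.0),
        "cpu_usage_percent": 100.0 * cpu / uptime,
    }


if __name__ == "__main__":
    for name, value in derived(REPORT).items():
        print(f"{name}: {value:.2f}")
```

Rounded to the precision shown in the report, the recomputed values agree with the reported 2543.0 requests per minute and 8.71% CPU usage, so the counters are internally consistent.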

But I noticed that there is a count of 1 for:

*Page faults with physical i/o: 1*

Is that normal?
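
To put that counter in proportion, it can be expressed as a rate over the reported uptime. The sketch below assumes that "Page faults with physical i/o" corresponds to the major-fault counter (`ru_majflt`) that Unix processes obtain from getrusage(2). That mapping is an assumption and is not confirmed in this thread. `major_faults` reads the same counter for the Python process itself, only to show where such a number comes from.

```python
import resource  # Unix-only standard library module


def major_faults():
    """Major page faults (those that required disk I/O) of this process."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_majflt


def faults_per_day(faults, uptime_seconds):
    """Scale a fault count to a per-day rate."""
    return faults * 86400.0 / uptime_seconds


if __name__ == "__main__":
    # Values taken from the report: 1 fault, UP Time 182506.341 seconds.
    print(f"{faults_per_day(1, 182506.341):.2f} faults per day")
    print(f"this process so far: {major_faults()} major faults")
```

With the values from the report, one fault over about 50 hours of uptime works out to less than one fault per day.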

regards

-----
Dr.x

--
View this message in context: http://squid-web-proxy-cache.1019090.n4.nabble.com/SMP-Rock-frequent-FATAL-Received-Segment-Violation-dying-only-on-kid3-tp4663349p4663428.html

Received on Thu Nov 21 2013 - 10:38:19 MST