Re: [squid-users] SMP-Rock-frequent FATAL: Received Segment Violation...dying."only on kid3"

From: Eliezer Croitoru <eliezer_at_ngtech.co.il>
Date: Tue, 19 Nov 2013 01:18:48 +0200

Hey Dr,

(notes inside)

On 19/11/13 00:54, Amos Jeffries wrote:
> Either event may have corrupted it slightly. Squid is supposed to
> contain sufficient checksum protection in rock to cope with most forms
> of corruption, but nobody's perfect.
>
> So, please try to get a core dump, or stack trace of the problem before
> going any further. This will help us to isolate where the problem is
> occurring. If it is corruption related we will be needing to try and add
> better protection for that case.
>
> *after* that, please try:
>
> * shutting down your Squid by any means necessary to ensure there are 0
> processes running.
"pgrep squid"

> * *move* the caches to somewhere they can be analysed later if necessary.

> * rebuild the configured cache_dir with squid -z

> * wait until -z process completed *AND* there are 0 processes still
> running in the background
And no traffic at all on the server.

> * restart the main Squid
>
> This entire process should not take more than a minute.
>
> If the problem remains after doing that you will have successfully
> eliminated cache corruption as a cause and we go back to needing a
> backtrace to figure it out.
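
The steps quoted above could be scripted roughly as follows. This is only a sketch: the cache path, the backup path and the process name are assumptions, not taken from the poster's setup, and on many systems an init script is the proper way to stop and start Squid.

```shell
#!/bin/sh
# Sketch only: all three values below are assumptions; adjust to the local setup.
PROC_NAME="${PROC_NAME:-squid}"                        # assumed process name
CACHE_DIR="${CACHE_DIR:-/var/cache/squid/rock}"        # hypothetical cache_dir path
BACKUP_DIR="${BACKUP_DIR:-/var/cache/squid-rock.old}"  # hypothetical backup location

# Succeeds while at least one process with exactly this name exists.
procs_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Poll once per second until no matching process remains.
# Returns 1 if processes are still present after $2 seconds.
wait_for_exit() {
    name=$1
    timeout=$2
    waited=0
    while procs_running "$name"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

rebuild_cache() {
    squid -k shutdown
    wait_for_exit "$PROC_NAME" 120 || { echo "squid still running, stop it manually" >&2; return 1; }
    # Move (do not delete) the old cache so it can be analysed later.
    mv "$CACHE_DIR" "$BACKUP_DIR" || return 1
    mkdir -p "$CACHE_DIR" || return 1   # ownership may need to match the Squid user
    squid -z || return 1
    # -z may leave background processes; wait until all of them are gone.
    wait_for_exit "$PROC_NAME" 120 || return 1
    squid
}
```

Nothing runs until rebuild_cache is called, so the file can be sourced and the helper functions tried first. The script does nothing about client traffic, which should be stopped separately while the rebuild runs.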

The same test can be done by running the service in "RAM only" cache
mode, that is, with no cache_dir configured.
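
One way to try this, as a sketch: comment out every cache_dir line in a copy of squid.conf so that only the memory cache (cache_mem) remains. The function name and the file paths in the comments are hypothetical.

```shell
# Print a copy of the given squid.conf with every cache_dir line commented out.
# Lines that are already comments, or other directives, are left untouched.
ram_only_conf() {
    sed 's/^[[:space:]]*cache_dir[[:space:]]/# &/' "$1"
}

# Example use (paths are assumptions):
#   ram_only_conf /etc/squid/squid.conf > /tmp/squid-ram-only.conf
#   squid -k parse -f /tmp/squid-ram-only.conf   # check the syntax before using it
```

Working on a copy keeps the original configuration intact, so switching back to the disk cache later is just a matter of restoring the old file.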

If these crashes happen often, reproducing the problem will probably
need only a short amount of run time.

I assume that letting the service run in "RAM only" mode will allow it
to keep serving clients for more than 24-48 hours smoothly, with 0
problems in cache.log.
If you have no troubles in RAM only mode, we can be more than 90% sure
that the cause was related to the DISK cache.
Even then I would not rush to say that rock itself was at fault.
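
To check cache.log during such a test run, something like the sketch below can count the fatal lines per kid. It assumes the usual SMP log layout, "date time kidN| message"; the function name and the log path in the comment are hypothetical.

```shell
# Count "Received Segment Violation" lines in a cache.log, grouped by kid.
# Assumes the third whitespace-separated field looks like "kid3|";
# lines without such a field are counted under "no-kid".
segv_by_kid() {
    awk '/FATAL: Received Segment Violation/ {
             kid = ($3 ~ /^kid[0-9]+[|]$/) ? substr($3, 1, length($3) - 1) : "no-kid"
             n[kid]++
         }
         END { for (k in n) print k, n[k] }' "$1" | sort
}

# Example use (path is an assumption):
#   segv_by_kid /var/log/squid/cache.log
```

Empty output after a day or two in RAM only mode would be the "0 problems" result described above.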

If you can attach the details of the related disks/partitions from
fstab, it might help more.
(Feel free to send all the technical data in a PM, though it can take
time to process, especially core dumps.)

Regards,
Eliezer
Received on Mon Nov 18 2013 - 23:19:01 MST