Re: [squid-users] Re: is there any thing wrong from cache manager logs ?!!

From: Eliezer Croitoru <eliezer@ngtech.co.il>
Date: Fri, 08 Nov 2013 00:37:20 +0200

OK, so after Amos did the calculations (thanks!) I assume that using lsof
will give us more clues about it.
The first thing to do is to start an SSH session to the server with the
setting:
ServerAliveInterval 30
added to /etc/ssh/ssh_config (on the client side).
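For example (SQUID_SERVER_IP is a placeholder; the -o form lets you test
the setting without editing the file):

##START
# /etc/ssh/ssh_config (or ~/.ssh/config) on the client:
Host *
    ServerAliveInterval 30

# or as a one-off on the command line:
ssh -o ServerAliveInterval=30 root@SQUID_SERVER_IP
##END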

Once you have this session running you won't have any trouble running
top or any other basic tests on the server while the degradation is
happening.

Now, this is the command you will need in order to narrow down the
suspects and maybe find something that can help you:
"lsof -u squid -a -i 4 -n -P"
(squid is the default username for the proxy user on CentOS)

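For reference, a commented copy of the same command (nothing changed,
just what each switch does):

##START
# -u squid : only files owned by the "squid" user
# -a       : AND the selections together instead of ORing them
# -i 4     : only IPv4 network files (sockets)
# -n       : don't resolve IP addresses to hostnames (no extra DNS load)
# -P       : don't resolve port numbers to service names
lsof -u squid -a -i 4 -n -P
##END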
Don't run this command out of the blue just like that, since the output
can be more than 60k lines long.

Redirect it into a file in the tmp dir instead, so this:
"lsof -u squid -a -i 4 -n -P >/tmp/tmp_lsof.1"
should be safe.
The next thing is to find out how many FDs there are in total and how
many are ESTABLISHED etc., so run these:
##START
lsof -u squid -a -i 4 -n -P >/tmp/tmp_lsof.1
wc -l </tmp/tmp_lsof.1
grep -c UDP /tmp/tmp_lsof.1
grep -c ESTABLISHED /tmp/tmp_lsof.1
grep -c TIME_WAIT /tmp/tmp_lsof.1
grep -c CLOSE_WAIT /tmp/tmp_lsof.1
grep -c LISTEN /tmp/tmp_lsof.1
grep -c ":53" /tmp/tmp_lsof.1
grep -c ":TPROXYPORT" /tmp/tmp_lsof.1
##END
(TPROXYPORT is a placeholder for the tproxy port from squid.conf)
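If you prefer, the TCP state counts can also be taken in one pass over
the file; a small sketch (it tallies the last column, where lsof prints
the state in parentheses):

##START
# count every TCP state in one pass; the state is the last field,
# e.g. "(ESTABLISHED)"; NR>1 skips the lsof header line
awk 'NR>1 && $NF ~ /^\(/ {s[$NF]++} END {for (k in s) print k, s[k]}' /tmp/tmp_lsof.1
##END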

Once you have all the above results from both before and after the
degradation, we might have a clue about the source of the problem and
whether it comes from too many FDs that are not being used but are
forcing the system to loop through lots of them.
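If you are not sure you will be at the keyboard when the degradation
starts, a rough sketch like this can take numbered snapshots for you
(the 60-second interval is my arbitrary choice):

##START
# take a numbered lsof snapshot every 60 seconds; stop with Ctrl-C
i=0
while true; do
    i=$((i+1))
    lsof -u squid -a -i 4 -n -P >/tmp/tmp_lsof.$i
    echo "snapshot $i: $(wc -l </tmp/tmp_lsof.$i) lines at $(date)"
    sleep 60
done
##END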

Eliezer

On 11/08/2013 12:16 AM, Dr.x wrote:
> Hey there,
>
>> It's a bit weird...
>> Let's try it from another angle, one step at a time.
>> First use the basic 256MB memory cache, which is less than 512MB.
>> Remove any cache_dir that does exist.
>
>>>> *I started before without a cache_dir, but I had a lot of "Vary object
>>>> loop" logs in cache.log;
> I added a cache_dir because I don't want to see any suspicious logs in
> cache.log.*
>
>
>> This way we have pushed squid into CPU and memory land.
>> Having low memory is nothing to be afraid of.
>> The main issue is why you would get into a position where a squid that
>> can pump lots of user traffic is not responding to you very well.
>> If the server is unresponsive even to SSH, the next thing is basic
>> ping tests (example commands after this list):
>> ARPING
>> PING
>> TCPING
>> SSH_TCP_PING (telnet or nc to port 22)
>> HTTPING
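>> For example (SERVER_IP is a placeholder; arping usually needs root,
>> and httping may need installing first):
>> arping -c 4 SERVER_IP              # layer-2 reachability on the local segment
>> ping -c 4 SERVER_IP                # basic ICMP echo
>> nc -zv SERVER_IP 22                # TCP "ping" against the SSH port
>> httping -c 4 -g http://SERVER_IP/  # HTTP-level latency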
>
> *>>> again, the server in this state is very slow, and I could hardly
> get access to the last cache manager logs due to the slow response of my
> machine*
>
>
>
>
>> maximum FD in use will be about 1k (1024), which should not be too much
>> of a problem for any tiny proxy.
> *
>>>> I raised it because I'm planning to put a lot of users on it when I
>>>> carry my work over to the Dell R720 machine
> *
>
>> And now again: what OS are you using? Why do you need 500k FDs allowed
>> on a system that should use about 16k FDs just for a test? 65k FDs
>> should be enough for most basic small ISP setups.
>
> *>> I will try it; I'm using CentOS 6.4 with the kernel rebuilt to
> support TPROXY,*
>
>
>
>> So: only one instance of squid, no workers at all, no rock storage;
>> adding users should show you how the service works.
>> (I can reassure you that I have taken up and down a server that serves
>> more than 1k live users.)
>
> *>>> well, I will run another test*
>
>
>> Indeed named can put load on the server, but since this one serves only
>> squid I assume you have another local DNS server in place, so point to it.
>> Using the 8.8.8.8 public DNS will not solve your problems but rather
>> make them worse, unless you get a 1-30ms response time from it.
>
> *>>> I know that, but I just ran a test to rule DNS out and keep it away
> from my current issue!*
>
> You can also do a DNS cache warm-up; a rough sketch follows.
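> Something like this (assuming squid's default access.log location; it
> resolves the 100 most-requested hosts once so the resolver has them
> cached):
> ##START
> awk '{print $7}' /var/log/squid/access.log | awk -F/ '$3!="" {print $3}' \
>   | sort | uniq -c | sort -rn | head -100 | awk '{print $2}' \
>   | while read h; do dig +short "$h" >/dev/null; done
> ##END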
>
>
>> Again, what is the degradation you are talking about?
>> Try to download a static file from a couple of random mirrors of, let's
>> say, some Linux distro, or a mirror of another file you have in mind,
>> like Microsoft's.
>
> *
>> the degradation is as below:
> when squid starts, watching YouTube and browsing sites show no delay and
> YouTube is excellent;
> after some time, browsing is very slow, YouTube keeps interrupting, and
> videos no longer buffer!*
>
>
>> I have looked at the graph and I do not understand what the problem is
>> while there is a degradation.
>
> *sorry, I've modified the post above, it was a display problem; here is
> the graph, re-uploaded:
> http://www2.0zz0.com/2013/11/07/21/928332022.png
> as you see, the traffic should be about 55-60M, but after some time the
> traffic drops to 30M, which means slowness and degradation occurred!*
>
>
>> Just a tiny test I want to add to all the above:
>> start any local HTTP server that you like and prefer, like apache,
>> nginx, lighttpd, GoAhead-Webserver, micro_httpd or any other, put it on
>> the "lan" or "network segment" with the squid server, and try to
>> download files of 512 bytes, 1KB, 1MB, 9MB, 15MB, 100MB and up.
>> Also try to access the cache-mgr page using:
>> http://MY_CACHE_IP:zzzzzz/squid-internal-mgr/mem
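>> e.g. from the squid box itself (keeping the same placeholders, and
>> assuming the manager ACL allows localhost):
>> curl -s http://MY_CACHE_IP:zzzzzz/squid-internal-mgr/mem | head -40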
>
>>> with 1 user there is no problem and squid is very nice,
>
> also try it at the same time there is a problem, and after\before.
>
>
>> The graph that you supplied doesn't really explain the problem in any
>> of the network measurements, at least as far as I can understand.
>
> *> look here: http://www2.0zz0.com/2013/11/07/21/928332022.png*
>
>
>
> Try to take a quick look at:
> http://wiki.squid-cache.org/ConfigExamples/UbuntuTproxy4Wccp2
>
>> which uses the GRE tunnel method rather than the MAC-address-rewriting
>> (Layer 2) WCCP forwarding method.
>> (I am merely saying that I know it works very well for me.)
>> What device does the WCCP interception?
>
> it is a Cisco MLS 76xx
>
> There are bugs out there in squid, but I would like to see this bug.
> I am not sure, but what does "free -m" show on this server while
> squid is not running at all?
>>> I will run another test and tell you
>
>
>> I am still trying to think of a test that will show and explain the
>> problem at hand.
>> There is an issue that I remember about an FD limit being forced by the
>> bash\startup script on the squid master instance, but I am not sure if
>> this is the source of the issue or something else.
>> Please add "ulimit -a" to the squid init.d\startup script and post the
>> output of that (a sketch follows).
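>> For example, something like this near the top of the init script
>> (/etc/init.d/squid on CentOS), before the daemon is launched:
>> ##START
>> # record the limits the startup script actually runs squid under
>> ulimit -a > /tmp/squid_startup_ulimit.txt 2>&1
>> ##END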
>
> [root@squid ~]# ulimit -a
> core file size (blocks, -c) 0
> data seg size (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size (blocks, -f) unlimited
> pending signals (-i) 63772
> max locked memory (kbytes, -l) 64
> max memory size (kbytes, -m) unlimited
> open files (-n) 131072
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority (-r) 0
> stack size (kbytes, -s) 8192
> cpu time (seconds, -t) unlimited
> max user processes (-u) 63772
> virtual memory (kbytes, -v) unlimited
> file locks (-x) unlimited
> [root@squid ~]# ulimit -n
> 131072
> [root@squid ~]#
>
> Are there any clues in the cache.log?
> at first, no logs, no errors; after some time I get "closing ... due to
> lifetime timeout" messages on YouTube videos
>
> Best Regards,
> Eliezer
>
>
>
>
>
> -----
> Dr.x
>