Re: [squid-users] Re: deep understanding of squid smp with disker and worker

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 16 Feb 2014 02:03:20 +1300

On 15/02/2014 11:42 p.m., Dr.x wrote:
> Alex Rousskov wrote
>> On 02/13/2014 12:45 PM, Dr.x wrote:
>>> now the limitation of 32kB is on disk objects only ??
>>> or
>>> both (memory or disk )
>>
>>
>> Shared memory cache prior to Large Rock support has the 32KB limit.
>> Shared memory cache after Large Rock support does not have that limit.
>> Non-shared memory cache (any version) does not have that limit.
>>
>> Alex.
>
> hi Alex, thanks a lot for the explanation.
> Now how do I enable large rock store? From the wiki I found it in
> Squid 3.5, but the latest version is 3.4.x. Can I enable it with my
> version, 3.4.3?

It is a massive patch with dependencies on a number of changes in the
non-3.4 development code.

The existing code which will one day be part of 3.5 beta can be
downloaded from http://www.squid-cache.org/Versions/v3/3.HEAD/ .

NP: Right now, be aware that we are tracking down a bug in r13270+
which causes Squid-3.HEAD to hang randomly. So grab a tarball with an
r# from before that revision if it is going anywhere near production.
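For example, something like this (a sketch only; the snapshot filename
below is hypothetical, check the index page for a real tarball with an
r# below 13270):

  # pick a snapshot from http://www.squid-cache.org/Versions/v3/3.HEAD/
  # (hypothetical filename, substitute a real pre-r13270 one)
  wget http://www.squid-cache.org/Versions/v3/3.HEAD/squid-3.HEAD-20140210-r13260.tar.gz
  tar xzf squid-3.HEAD-20140210-r13260.tar.gz
  cd squid-3.HEAD-20140210-r13260
  ./configure && make && make install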

> ===================================
>
> another question i have:
> I've implemented squid with both aufs and rock.
> I have 3 aufs hard disks.
> I created 6 workers,
> mapped workers 2, 4, 6 to the aufs hard disks aufs1, aufs2, aufs3,
> and also mapped workers 2, 4, 6 to CPU cores 8, 10, 12.
>
> Now what I noted is:
> the workers are not balanced!
> I mean the HTTP traffic is not balanced from the OS to the workers,
> so one worker will get more than the others, and so on.

"Balancing" of connections to workers is done by and depends on the OS.
No OS kernel we know of balances well, although some workarounds have
been done in Squid to trick the kernel into spreading traffic out over
CPU cores better.

Also there is no relationship between a connections TCP SYN packet and
the amount of TCP data packets on that connection. Connection balancing
in all forms is done by SYN packet alone.

PS. Also, please do not get caught up in the myth that perfect "balance"
in CPU consumption is important. Under extreme load (approaching 100%
CPU) a worker is so busy processing data packets that it has less
ability to pick up new SYNs / connections. So a certain amount of load
reduction happens naturally right at the point where it really
matters - no configuration required.
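For reference, the worker-to-core pinning you describe is done with the
workers and cpu_affinity_map directives. A minimal sketch (the process
and core numbers are illustrative, matching your description, not a
recommendation):

  # squid.conf
  workers 6
  # pin workers 2, 4 and 6 to CPU cores 8, 10 and 12
  cpu_affinity_map process_numbers=2,4,6 cores=8,10,12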

>
> What I mean can be summarized by disk utilization over time.
> As I said, I have 3 workers mapped to 3 aufs hard disks.
> Now when I use df -h, I have:
> [root_at_squid ~]# df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1       117G   14G   98G  13% /
> tmpfs            16G  2.0G   14G  13% /dev/shm
> shm              16G  2.0G   14G  13% /dev/shm
> *
> /dev/sdb1       117G   13G   99G  12% /ssd1
> /dev/sdc1       117G   19G   93G  17% /ssd2
> /dev/sdd1       235G  2.7G  220G   2% /ssd3
> *
> /dev/sde1       275G  772M  260G   1% /sata1   ========> rock
> /dev/sdf1        99G  765M   93G   1% /sata2   ========> rock
>
> Note the lines between the asterisks; those are my 3 aufs hard disks.
> You will see that:
> /dev/sdb1       117G   13G   99G  12% /ssd1
> /dev/sdc1       117G   19G   93G  17% /ssd2   ===> most requests received
> /dev/sdd1       235G  2.7G  220G   2% /ssd3   ===> fewest requests received

NOTE: request count has nothing to do with disk space used.

  sdc1 may have received 1,000,000 requests of size 20KB.
  sdd1 may have received 1,000,000 requests of size 3KB.

Same number of requests. Different disk space required.

This also has an *inverse* relationship to the amount of CPU required ...

* Squid CPU consumption mostly comes from processing HTTP headers.

* Requests for 3KB objects with 300-byte headers have ~10% header
processing per byte of traffic.

* Requests for 30KB objects with 300-byte headers have ~1% header
processing per byte of traffic.

* There can be ~10x as many 3.3KB requests as 30.3KB requests in the
same traffic bandwidth.

So ... 10x the request count means 10x the header processing, which is
roughly 10x the CPU requirement for the same bandwidth.
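As a rough worked example (approximate figures, treating CPU cost as
proportional to header bytes processed):

  300B / 3.3KB   --> ~9% of small-object traffic is header bytes
  300B / 30.3KB  --> ~1% of large-object traffic is header bytes
  1GB of traffic --> ~300,000 small requests vs ~33,000 large ones
  ==> ~10x the header-processing work for the same bandwidth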

>
> Another question:
>
> can I monitor the cache.log for each aufs instance?

You can configure a worker-specific cache.log using the syntax:
  cache_log /path/to/cache${process_number}.log

Alternatively, you can just grep the shared cache.log for lines with the
specific "kid" label of the worker being monitored.

> I configured access.log and cache.log for each aufs instance, but only
> access.log worked!
> I can't monitor cache.log per aufs instance, yet I can monitor
> cache.log for the rock store?

When SMP is enabled, rock store debug output should be printed by the
Disker processes. Find which ones they are and use the same method for
monitoring those kids as described above for the AUFS worker processes.
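A sketch, assuming the usual SMP kid numbering (workers first, then one
disker per rock cache_dir, then the coordinator): with your 6 workers
and 2 rock cache_dirs the diskers would be kid7 and kid8:

  # watch the disker kids in the shared cache.log
  grep -E "kid[78]\|" /var/log/squid/cache.log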

>
> Can you guide me to the best mount options and the best aufs
> subdirectory settings, so that I get the best byte ratio and save more
> of my bandwidth?

Mount options have nothing to do with byte ratio!!

Mount options are 100% related to the _response speed_ Squid provides on
disk HITs.
* disable atime
* disable journalling
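For example, assuming ext4 filesystems (a sketch; the device and mount
point are taken from your df output above, and the journal can only be
removed while the filesystem is unmounted):

  # drop the ext4 journal (run on the unmounted filesystem)
  tune2fs -O ^has_journal /dev/sdb1

  # then mount with atime updates disabled
  mount -o noatime,nodiratime /dev/sdb1 /ssd1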

Byte HIT ratio and count HIT ratio both depend on what the HTTP headers
say about the requested objects in your traffic, and on the total cached
data size.

>
> Again, I have 24 cores in my server and need to get the benefit of them.

"need" really? you have that much traffic?

Amos