Re: [squid-users] TCP_HIT/504 problem with small Squid cluster

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 22 Oct 2009 00:27:56 +1300

Robert Knepp wrote:
> Hi - first time poster so be gentle.
>
>
> Some general info regarding my setup:
>
> 0) Running Squid 2.7 in reverse proxy mode
> 1) Each Squid is configured to use its local webserver on 127.0.0.1
> as the origin server and the other servers in the farm as siblings
> 2) This Squid cache is transparent to the end-user (although I do pass
> along a select few cache controls such as if-none-match).

Bad word! Bad word! Squid in this context is a "reverse proxy". To all
intents and purposes, as far as the client can see, these Squid ARE the
web server.

The various common meanings of "transparent" have nothing to do with it.

> 3) It is protected behind local AUTH applications which perform
> complex access checks before passing the request onto Squid

You might be able to reduce the overhead on each server box by merging
that into a Squid auth helper. Whether that is a big job depends on what
the AUTH applications actually do; see the sketch below.
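
A minimal sketch of that approach, assuming a hypothetical helper at
/usr/local/bin/auth-check that reads "client-IP URI" lines on stdin and
answers OK or ERR for each request (the name, options and format tags
here are illustrative, not taken from your setup):

   # helper answers OK/ERR for each "%SRC %URI" line it receives
   external_acl_type webauth ttl=60 children=10 %SRC %URI /usr/local/bin/auth-check
   acl authed external webauth
   http_access allow authed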

> 4) All documents will be requested and cached as
> [http://127.0.0.1/URL] so Squid is really only serving a single domain
>
> ************************************************************************
> Transparent Proxy Cluster
>
>                               [user agent]
>                                     |
>                                     v
>                              [Load Balancer]
>                                     |
>          +-----------------+--------+--------+-----------------+
>          |                 |                 |                 |
>          v                 v                 v                 v
>     [WEB1-AUTH]       [WEB2-AUTH]       [WEB3-AUTH]       [WEB4-AUTH]
>          |                 |                 |                 |
>          v                 v                 v                 v
>      [SQUID1]   (icp)  [SQUID2]   (icp)  [SQUID3]   (icp)  [SQUID4]
>          |                 |                 |                 |
>          v                 v                 v                 v
>     [WEB1-ORIG]       [WEB2-ORIG]       [WEB3-ORIG]       [WEB4-ORIG]
>
> ************************************************************************
>
>
> Here is a simplified squid.conf from the first server (all others have
> the same settings except the sibling list is shifted).
>
> #------
> http_port 3128 act-as-origin accel vhost http11

The 'vhost' option here forces Squid to parse the Host: header and to
cache URLs using its content as the domain name.

To meet criterion (4) "All documents ... cached as [http://127.0.0.1/URL]"

You need to be using:

   http_port 80 act-as-origin accel http11 defaultsite=127.0.0.1

> icp_port 3130
> cache_dir ufs /cache/data 2048 16 256

aufs please. Plain 'ufs' blocks Squid on every disk I/O; 'aufs' does the
disk work in asynchronous threads.
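
For example, keeping your existing size and L1/L2 values:

   cache_dir aufs /cache/data 2048 16 256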

> cache_mem 8 GB
> request_timeout 5 seconds
> persistent_request_timeout 5 seconds
> refresh_pattern . 0 20% 4320
> negative_ttl 0
>
> acl all src all
> acl localhost src 127.0.0.1/xx

WTF? Why fudge out a mask value that is only relevant to the sealed,
machine-local loopback address?
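
There is nothing secret there; the usual form is simply something like:

   acl localhost src 127.0.0.1/255.255.255.255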

> acl localnet src 127.0.0.1/xx
> acl localnet src xxxxxxxxxxxxx
> acl Safe_ports port 3128
> acl Safe_ports port 80
> http_access allow localhost
> http_access deny !Safe_ports
> http_access allow localnet
> http_access deny all
> icp_access allow localnet
> icp_access deny all
>
> ## Origin server
> cache_peer 127.0.0.1 parent 80 0 name=localweb max-conn=250 no-query
> no-netdb-exchange originserver http11
> cache_peer_access localweb allow localnet
> cache_peer_access localweb deny all
> ## Sibling Caches
> # cache_peer [IP_OF_SIBLING_1] sibling 3128 3130 proxy-only
> cache_peer [IP_OF_SIBLING_2] sibling 3128 3130 proxy-only
> cache_peer [IP_OF_SIBLING_3] sibling 3128 3130 proxy-only
> cache_peer [IP_OF_SIBLING_4] sibling 3128 3130 proxy-only
>

#1 rule of reverse proxies:
    If the reverse-proxy access rules are not placed above the generic
forward-proxy rules, clients risk being served false error pages. A
sketch of that ordering follows.
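
Something like this, where 'our_site' is only an illustrative ACL name
for the accelerated domain:

   # reverse-proxy rules first: admit requests for the accelerated site
   acl our_site dstdomain 127.0.0.1
   http_access allow our_site
   # ...then the generic rules
   http_access deny !Safe_ports
   http_access deny all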

>
> ************************************************************************************************
<snip duplicate paste>

> ************************************************************************************************
>
> So...... I have a 'few' questions regarding my setup and how I might
> be able to improve on it.
>
> - Does the ICP sibling setup make sense or will it limit the number
> of servers in the cluster? Or should this be redesigned to work with
> multiple parent caches instead of siblings? Or perhaps multicast ICP?
> Or I could try digests?

You want it to be scalable AND fast? multicast ICP or cache digests.

You want to maximize the bandwidth savings? cache digests or CARP.
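
For example, cache digests need little more than switching off the
per-request ICP queries (this assumes Squid was built with
--enable-cache-digests; the peer addresses are your own placeholders):

   digest_generation on
   cache_peer [IP_OF_SIBLING_2] sibling 3128 3130 proxy-only no-query
   cache_peer [IP_OF_SIBLING_3] sibling 3128 3130 proxy-only no-query
   cache_peer [IP_OF_SIBLING_4] sibling 3128 3130 proxy-only no-query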

>
> - Would using 'icp_hit_stale' and 'allow-miss' improve hit-ratios
> between the shards? Is there a way to force a given Squid server to be
> the ONLY server storing a cached document (stale, fresh, or
> otherwise)?

icp_hit_stale allows peers to say "I have it!" when what they really
have is an old stale copy. Useful if the peer is close and the object
can be served stale while a fresher copy is fetched. Bad if it causes
the spreading of non-cacheable objects.

allow-miss lets a peer that sent the "I have it" message for a stale
object go and fetch a fresh copy from its own fast source when it is
asked for the full thing, instead of the request being limited to
only-if-cached. Thus the object gets refreshed in two caches instead of
just one, mitigating the total cost of that one fetch being extra slow.
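
The combination being discussed would look something like this (a
sketch only, shown for one sibling):

   icp_hit_stale on
   cache_peer [IP_OF_SIBLING_2] sibling 3128 3130 proxy-only allow-miss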

>
> - I have been using this basic setup for about a month now and I am
> getting strange squid access.log entries when the load goes up:
>
> 2009-04-04 11:13:47 504 GET "http://127.0.0.1:3128/[URL]" TCP_HIT NONE
> 3018 0 "127.0.0.1" "127.0.0.1:3128" "-" "-"

This is due to your website being hosted on 127.0.0.1 port 3128.

The Host: header contains domain:port unless the port is the http
default port 80.
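
For example:

   Host: 127.0.0.1:3128    (request for http://127.0.0.1:3128/URL)
   Host: 127.0.0.1         (request for http://127.0.0.1/URL)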

The new http_port line I gave you above should fix this as a by-product.

>
> The gateway timeout lines appear during high load and are usually
> (but not always) close to UDP_HIT entries on the same URI. In most
> cases like this the document gets returned to the user with a status
> of 200. It confused me since I thought TCP_HIT meant a cached
> object was found locally and is being served.
> Or maybe it is related to a false UDP_HIT? Could this be network
> related? Or can a slow response from the origin server cause this?
>

UDP_HIT - a sibling requested the object via ICP and was sent a positive
answer that the object is stored in cache.

TCP_HIT - a client requested an object and was provided an object from
cache.

http://wiki.squid-cache.org/SquidFaq/SquidLogs#Hierarchy_Codes

> - Is 8 GB cache memory (out of 32GB in each box) going to cause
> problems for Squid? And what happens if it fills up quickly?
>

Objects get pushed out of the memory cache as it fills: dropped if too
old, or moved to the disk cache if still potentially usable.
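
If you want to influence that, these are the relevant knobs (the values
shown are only placeholders, not recommendations):

   cache_mem 8 GB
   maximum_object_size_in_memory 512 KB
   memory_replacement_policy lru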

>
> Anyway, I just wanted to throw out these questions for the experts.
> I'll likely be trying some of these changes just to see the effects.
>
>
> Feel free to chime in on any of this stuff.
> Thanks in advance,
> Rob

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE7 or 3.0.STABLE19
   Current Beta Squid 3.1.0.14