[squid-users] DNS round robin / httpd_accel, again...

From: <sean.upton@dont-contact.us>
Date: Mon, 04 Jun 2001 15:15:34 -0700

Greetings,
I know this may be a newbie question, but after many hours of experimenting,
searching the mailing list and Usenet, and reading all the docs, I can't
find an answer. Sorry this post is so long, but I figured I might as well be
thorough in describing my problem. I'm setting up the poor-person's load
balancer; more specifically, I am using an L4 switch to balance 2 squid http
accelerator boxes that will each cache content and use a round-robin
approach, picking out of /etc/hosts, to distribute requests (cache misses)
to 2+ internal web servers. Right now, though, I am just trying to get a
single box set up as a squid http accelerator that distributes requests to
2 web servers.

I know this has been brought up before (though the posts in the mailing list
archives don't seem to address my confusion)... My problem is that squid
doesn't distribute requests across both servers; it uses the first address
returned by dnsserver for all requests.

  I'm setting up squid in httpd_accel mode, with multiple back-end servers
all serving the same content, and I am attempting to get a round-robin
approach set up. Squid is (I believe) built correctly for that
(--disable-internal-dns), and dnsserver seems to work correctly, AFAIK. Once
in a while, after clearing my cache and restarting squid, I seem to be able
to get squid to serve from both boxes briefly after startup, but my
impression is that it is caching IP lookups. I tried to follow a path
similar to the instructions in the following list archive message:
http://list.cineca.it/cgi-bin/wa?A2=ind0004&L=squid&P=R21462

dnsserver & dnsserver -D both return:
nodes.example.com
$addr 0 172.16.2.5 172.16.2.6
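
For anyone wanting to check this kind of reply from a script, here is a
minimal Python sketch that splits the line shown above into its parts. It
assumes the second field is a TTL and everything after it is an address;
that reading is an assumption based on the output above, and the names
AddrReply and parse_addr_reply are made up for this illustration.

```python
from typing import List, NamedTuple


class AddrReply(NamedTuple):
    ttl: int              # assumed meaning of the second field
    addresses: List[str]  # every remaining field, in the order returned


def parse_addr_reply(line: str) -> AddrReply:
    """Parse a '$addr <ttl> <ip> [<ip> ...]' line as printed by dnsserver."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "$addr":
        raise ValueError(f"not an $addr reply: {line!r}")
    return AddrReply(ttl=int(fields[1]), addresses=fields[2:])


if __name__ == "__main__":
    reply = parse_addr_reply("$addr 0 172.16.2.5 172.16.2.6")
    print(reply.ttl, reply.addresses)
```

The point is only that both addresses are present in the reply, so the
selection of one of them happens later, inside squid.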

I assume that something in ipcache.c / fqdncache.c doesn't like my config
and is caching a single IP address (more specifically, the first of the 2
IPs returned by dnsserver; I can switch the order in /etc/hosts to get a
different result), then using it for as long as it is cached. My impression
from the debug output (level 1, at least) is that this is the case, though
I'm not sure why it would do this, since my ipcache_size, fqdncache_size,
and positive_dns_ttl are all set to 0.
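
To make the expected behavior concrete, here is a minimal Python sketch of
the rotation I am hoping for: each cache miss goes to the next address in
the list instead of always the first. This is only an illustration of the
desired selection, not squid's ipcache code, and the RoundRobin class is a
hypothetical name.

```python
from typing import List, Sequence


class RoundRobin:
    """Hand out addresses from a fixed list in rotating order."""

    def __init__(self, addresses: Sequence[str]) -> None:
        if not addresses:
            raise ValueError("need at least one address")
        self._addresses: List[str] = list(addresses)
        self._next = 0  # index of the address to hand out next

    def pick(self) -> str:
        addr = self._addresses[self._next]
        # Advance and wrap around so the following call gets the next one.
        self._next = (self._next + 1) % len(self._addresses)
        return addr


if __name__ == "__main__":
    backends = RoundRobin(["172.16.2.5", "172.16.2.6"])
    for _ in range(4):
        print(backends.pick())
```

What I observe instead looks like every request getting the first entry of
the list.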

I'm also under the impression that this will work without a redirector,
though I plan to deploy one eventually, as certain URLs will be mapped to a
different TCP port on the same servers. (The squid boxes will then direct
traffic to either nodes.example.com:80 or nodes.example.com:8080, depending
on the URL, still maintaining a 1:1 mapping so that caching works.) That is
beyond the scope of my current issue, which is how squid selects from
multiple addresses in a lookup returned by dnsserver... I assume something
in my config is wrong.
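
For reference, the port mapping mentioned above could be sketched as a
redirector along these lines. It assumes the usual redirector convention of
one request per line on stdin, with the URL as the first field, and one
reply line per request on stdout (a blank line meaning "leave the URL
unchanged"). The "/app/" prefix is purely hypothetical; I have not decided
which URLs will go to port 8080.

```python
import sys
from urllib.parse import urlsplit, urlunsplit

# Hypothetical: URL paths starting with these prefixes go to the other port.
ALT_PORT_PREFIXES = ("/app/",)
ALT_PORT = 8080


def rewrite(request_line: str) -> str:
    """Return the rewritten URL, or '' to leave the request unchanged."""
    fields = request_line.split()
    if not fields:
        return ""
    parts = urlsplit(fields[0])
    if parts.scheme != "http" or not parts.hostname:
        return ""
    if not parts.path.startswith(ALT_PORT_PREFIXES):
        return ""
    # Same host, different port, so each URL still maps to one target.
    netloc = f"{parts.hostname}:{ALT_PORT}"
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))


def serve(lines=sys.stdin, out=sys.stdout) -> None:
    """Answer one line per request, flushing so the caller never waits."""
    for line in lines:
        out.write(rewrite(line) + "\n")
        out.flush()


if __name__ == "__main__":
    # Demo on a sample request line; a real redirector would call serve().
    print(rewrite("http://nodes.example.com/app/report 10.0.0.1/- - GET"))
```

Because the rewrite depends only on the URL, a given URL always ends up at
the same host and port, which is the 1:1 mapping that caching needs.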

There are a bunch of details of my config below this message; I guess I'm
confused because (I think) in theory what I've set up is supposed to work.
Any thoughts on what the problem might be, or pointers on how to further
debug the problem?

Sean

=================================================
Lots of details on my config are below this line:
-------------------------------------------------
  [----->MY EVENTUAL SETUP<-----]

                   |
            o->[ROUTER]<-o
_______    /       |       \
          /   [L4 SWITCH]   \     Intel 7140 - Treats below caches at a single VIP
Virtual  |      /     \      |    uses DST/OPR spoofed packets to return responses
Web      |     /       \     |    direct through router; 7140 IP is also the VIP
Server   |    v         v    |
on      [CACHE1]____  [CACHE2]    Squid http accelerator caches:
VIP        |   ____\__/   |       use the single VIP for both caches on lo:0
           |  /       \   |       caches will eventually peer via ICP.
_______ [NODE1]      [NODE2]      Web Server Nodes: 172.16.2.5 and 172.16.2.6
              \       /           set up in caching boxes /etc/hosts file
         [File/DB Servers]        as nodes.example.com

I'm only trying to address a portion of this now, by first setting up one
cache proxying to 2 nodes, distributing the load using a round-robin
technique. Once this is set up, the peer cache will be set up, then the L4
switch and the cache nodes will be configured to support balancing with
out-of-path packet return using one virtual IP (squid will be bound to the
VIP).

---------------------------------------------------

Squid: squid-2.4.STABLE1 source tarball
platform: Debian GNU/Linux unstable (SPARC) / glibc 2.2.3 /
                gcc 2.95.4 / Linux 2.2.17 (default 1CPU Debian kernel) /
                on Sun E250-SMP/1GB

Build info: ./configure --sysconfdir=/etc --enable-dlmalloc \
--enable-gnuregex --with-pthreads --enable-storeio=ufs,aufs,diskd \
--enable-removal-policies=lru,heap \
--disable-wccp --enable-cache-digests --disable-ident-lookups \
--disable-internal-dns --enable-underscores sparc-debian-linux

from /etc/hosts:
172.16.2.5 nodes.example.com
172.16.2.6 nodes.example.com

from /etc/nsswitch.conf:
hosts: files dns

results from dnsserver:
cache2:/usr/local/squid/libexec/squid# ./dnsserver
nodes.example.com
$addr 0 172.16.2.5 172.16.2.6
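
As a cross-check independent of dnsserver, the system resolver can be asked
directly for every address of a name. This is a minimal Python sketch using
the standard socket module; resolve_all is a made-up helper name, and the
default of "localhost" is only there so the script runs anywhere.

```python
import socket
import sys
from typing import List


def resolve_all(hostname: str) -> List[str]:
    """Return every IPv4 address the system resolver reports for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    try:
        for addr in resolve_all(host):
            print(addr)
    except socket.gaierror as err:
        print(f"lookup failed for {host}: {err}")
```

Run on the cache box with nodes.example.com as the argument, it should list
the same addresses that dnsserver prints if /etc/hosts and nsswitch.conf are
being honored.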

abbreviated squid.conf:
#note, to test the round robin approach, I have disabled caching of http
# objects with an acl...
http_port 80
hierarchy_stoplist cgi-bin manage_
acl QUERY urlpath_regex cgi-bin manage_
no_cache deny QUERY
cache_mem 128 MB
maximum_object_size 8192 KB
maximum_object_size_in_memory 128 KB
ipcache_size 0
fqdncache_size 0
cache_replacement_policy heap LFUDA
cache_dir diskd /cache 2000 16 512
cache_access_log /cache/log/access.log
cache_log /cache/log/cache.log
cache_store_log /cache/log/store.log
cache_dns_program /usr/lib/squid/libexec/squid/dnsserver
dns_children 16
#note, tried dns_defnames both ways, now just using default...
positive_dns_ttl 0
#note, tried positive_dns_ttl 0 and 1
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl SSL_ports port 443 563
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 563 # https, snews
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
acl PURGE method PURGE
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
#allow all until you can think of something more specific ...
http_access allow all
http_access allow localhost PURGE
http_access deny PURGE
http_access deny all
icp_access deny all
httpd_accel_host nodes.uniontrib.com
httpd_accel_port 80
#tried using httpd_accel_uses_host_header both ways, now using default

-----------------------------------------
Running squid:
squid -NCd1 output, right after removing the cache files and running
squid -z:
cache2:/cache# /usr/local/squid/bin/squid -NCd1
2001/06/04 13:08:10| Starting Squid Cache version 2.4.STABLE1 for sparc-debian-linux-gnu...
2001/06/04 13:08:10| Process ID 10279
2001/06/04 13:08:10| With 1024 file descriptors available
2001/06/04 13:08:10| Performing DNS Tests...
2001/06/04 13:08:10| Successful DNS name lookup tests...
2001/06/04 13:08:10| helperOpenServers: Starting 16 'dnsserver' processes
2001/06/04 13:08:10| Unlinkd pipe opened on FD 24
2001/06/04 13:08:10| Swap maxSize 2048000 KB, estimated 157538 objects
2001/06/04 13:08:10| Target number of buckets: 7876
2001/06/04 13:08:10| Using 8192 Store buckets
2001/06/04 13:08:10| Max Mem size: 131072 KB
2001/06/04 13:08:10| Max Swap size: 2048000 KB
2001/06/04 13:08:10| Local cache digest enabled; rebuild/rewrite every 3600/3600 sec
2001/06/04 13:08:10| Rebuilding storage in /cache (DIRTY)
2001/06/04 13:08:10| Using Least Load store dir selection
2001/06/04 13:08:10| Set Current Directory to /cache
2001/06/04 13:08:10| Loaded Icons.
2001/06/04 13:08:10| Accepting HTTP connections at 0.0.0.0, port 80, FD 25.
2001/06/04 13:08:10| Ready to serve requests.
2001/06/04 13:08:11| Done scanning /cache swaplog (0 entries)
2001/06/04 13:08:11| Finished rebuilding storage from disk.
2001/06/04 13:08:11| 0 Entries scanned
2001/06/04 13:08:11| 0 Invalid entries.
2001/06/04 13:08:11| 0 With invalid flags.
2001/06/04 13:08:11| 0 Objects loaded.
2001/06/04 13:08:11| 0 Objects expired.
2001/06/04 13:08:11| 0 Objects cancelled.
2001/06/04 13:08:11| 0 Duplicate URLs purged.
2001/06/04 13:08:11| 0 Swapfile clashes avoided.
2001/06/04 13:08:11| Took 1.1 seconds ( 0.0 objects/sec).
2001/06/04 13:08:11| Beginning Validation Procedure
2001/06/04 13:08:11| Completed Validation Procedure
2001/06/04 13:08:11| Validated 0 Entries
2001/06/04 13:08:11| store_swap_size = 64k
...
After shutting this down with Ctrl+C and then starting it again, there is
one difference:
  (snip)
2001/06/04 13:17:27| 1 Entries scanned
  (snip)
2001/06/04 13:17:27| 1 Objects loaded.

I can only assume that the object loaded is the IP address for
nodes.uniontrib.com, cached.
-------------------------------------------

=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================
Received on Mon Jun 04 2001 - 16:11:57 MDT
