Re: [squid-users] squid 3.2.1: workers not working

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 11 Oct 2012 10:58:34 +1300

On 11.10.2012 03:42, Rietzler, Markus (RZF, SG 324 /
<RIETZLER_SOFTWARE>) wrote:
> we are trying to get squid using SMP workers. tried with workers 2,
> setting up different cache_dirs for each worker
>
> workers 2
> http_port 8080
> cache_dir aufs $SQUID_CACHE_ROOT/${process_number} 32000 16 256
>

Seems okay. But note the 32GB cache size.
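
A minimal sketch, in Python, of what that cache_dir line adds up to. It
assumes ${process_number} expands to the kid number starting at 1 (the
log below shows kid1 using .../squid/1) and that the size argument is in
MB. The function name and the example root path are hypothetical.

```python
def expand_cache_dirs(template, workers, size_mb):
    """Expand ${process_number} for each worker and total the disk budget."""
    # Workers are assumed to be numbered 1..N, as kid1 and kid2 are in the log.
    dirs = [
        template.replace("${process_number}", str(n))
        for n in range(1, workers + 1)
    ]
    # Each worker gets its own cache_dir of size_mb, so the budgets add up.
    return dirs, workers * size_mb


if __name__ == "__main__":
    dirs, total_mb = expand_cache_dirs("/cache/root/${process_number}", 2, 32000)
    print(dirs, total_mb)
```

With "workers 2" that is one 32000 MB directory per worker, so the disk
has to hold both.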

> when we start squid we get:
>
> 2012/10/10 16:26:16.663 kid3| IoCallback.cc(107) finish: called for
> local=[::] remote=[::] FD 13 flags=1 (0, 0)
> 2012/10/10 16:26:16.663 kid1| fd_open() FD 19
> /rzf/db/www/squid/1/swap.state.clean
> 2012/10/10 16:26:16.663 kid3| comm_read_try: FD 11, size 4328, retval
> 4112, errno 0
> 2012/10/10 16:26:16.663 kid1| storeDirWriteCleanLogs: opened
> /rzf/db/www/squid/1/swap.state.clean, FD 19
> 2012/10/10 16:26:16.663 kid3| IoCallback.cc(107) finish: called for
> local=[::] remote=[::] FD 11 flags=1 (0, 0)
> 2012/10/10 16:26:16.663 kid3| comm_close: start closing FD 13
> 2012/10/10 16:26:16.663 kid3| comm.cc(735) commUnsetFdTimeout: Remove
> timeout for FD 13
> 2012/10/10 16:26:16.663 kid1| fd_close FD 15
> /rzf/db/www/squid/1/swap.state
> 2012/10/10 16:26:16.663 kid3| The AsyncCall comm_close_complete
> constructed, this=0x9e88e0 [call144]
> 2012/10/10 16:26:16.663 kid1| Cache Dir #0 log closed on FD 15
> 2012/10/10 16:26:16.663 kid3| comm.cc(1154) will call
> comm_close_complete(FD 13) [call144]
> 2012/10/10 16:26:16.663 kid1| xrename: renaming
> /rzf/db/www/squid/1/swap.state.clean to
> /rzf/db/www/squid/1/swap.state
> 2012/10/10 16:26:16.663 kid3| Coordinator.cc(146)
> handleSharedListenRequest: kid1 needs shared listen FD for
> 130.11.6.5:8080
> 2012/10/10 16:26:16.664 kid3| Coordinator.cc(154)
> handleSharedListenRequest: sending shared listen
> local=130.11.6.5:8080
> remote=[::] FD 15 flags=9 for 130.11.6.5:8080 to kid1 mapId=0
> 2012/10/10 16:26:16.664 kid3| entering comm_close_complete(FD 13)
> 2012/10/10 16:26:16.664 kid3| AsyncCall.cc(34) make: make call
> comm_close_complete [call144]
> 2012/10/10 16:26:16.664 kid1| fd_open() FD 15
> /rzf/db/www/squid/1/swap.state.last-clean
> 2012/10/10 16:26:16.664 kid3| fd_close FD 13
> 2012/10/10 16:26:16.664 kid1| fd_close FD 15
> /rzf/db/www/squid/1/swap.state.last-clean
> 2012/10/10 16:26:16.664 kid3| leaving comm_close_complete(FD 13)
> 2012/10/10 16:26:16.664 kid1| fd_close FD 19
> /rzf/db/www/squid/1/swap.state.clean
> 2012/10/10 16:26:16.664 kid3| comm.cc(2116) comm_open_uds: Attempt
> open socket for: /rzf/produkte/www/squid/var/run/squid/kid-1.ipc
> 2012/10/10 16:26:16.664 kid1| Finished. Wrote 0 entries.
> 2012/10/10 16:26:16.664 kid1| Took 0.00 seconds ( 0.00
> entries/sec).
> 2012/10/10 16:26:16.664 kid3| comm.cc(2134) comm_open_uds: Opened UDS
> FD 13 : family=1, type=2, protocol=0
> 2012/10/10 16:26:16.664 kid3| fd_open() FD 13
> 2012/10/10 16:26:16.664 kid3| comm.cc(748) commSetConnTimeout:
> local=[::] remote=[::] FD 13 flags=1 timeout 10
> FATAL: kid1 registration timed out
> FATAL: kid2 registration timed out
>
> and later:
>
> 2012/10/10 16:26:19.675 kid2| leave_suid: PID 11031 called
> 2012/10/10 16:26:19.675 kid2| leave_suid: PID 11031 giving up root,
> becoming 'www'
> 2012/10/10 16:26:19.675 kid2| leave_suid: PID 11031 called
> FATAL: Ipc::Mem::Segment::open failed to
> shm_open(/squid-squid-page-pool.shm): (2) No such file or directory
>

This is kid2 attempting to open its UDS connection to register with the
coordinator (kid3).

> but I think this comes from the first FATAL.

It's related to the "FATAL: kid2" but this one is caused by the UDS
path not existing. The shm_open() is supposed to create one if it does
not exist already.
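
To illustrate where "(2) No such file or directory" comes from: on POSIX
systems, opening a named shared memory segment that does not exist,
without asking for it to be created, fails with ENOENT (errno 2). The
sketch below shows that with Python's multiprocessing.shared_memory
module. It is not Squid's code, and the segment name is hypothetical.

```python
import errno
import os
from multiprocessing import shared_memory


def try_attach(name):
    """Attach to an existing segment; return 0 on success, errno on failure."""
    try:
        seg = shared_memory.SharedMemory(name=name, create=False)
    except FileNotFoundError as exc:
        return exc.errno
    seg.close()
    return 0


def demo():
    # Hypothetical segment name, made unique per process.
    name = f"demo-page-pool-{os.getpid()}"
    before = try_attach(name)  # nothing has created the segment yet
    owner = shared_memory.SharedMemory(name=name, create=True, size=4096)
    try:
        after = try_attach(name)  # the segment exists now
    finally:
        owner.close()
        owner.unlink()  # remove the segment again
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(before == errno.ENOENT, after)
```

The first attach fails because no process has created the segment; the
second succeeds once one has.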

Start with an upgrade to 3.2.2; we fixed some SHM-related bugs there.
Then check bugzilla for more info: shm_open() has a few known
OS-specific problems and path problems.

>
> why are the kids not registered? squid (coordinator) is running but
> no other squid process and so it does not listen on port 8080.
>

The registration is done by sending packets to their UDS sockets (the
SHM path which is failing to open).

To put it in more familiar terms: what is happening is roughly
equivalent to a network socket() creation failing in the worker, which
prevents connect() being done, so the server/coordinator waiting for
the connect() SYN packet never receives one.
  (Bit more complex than that, but essentially similar.)
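
The same analogy as a small runnable sketch: a coordinator waits on a
Unix datagram socket (the log's "family=1, type=2" corresponds to
AF_UNIX/SOCK_DGRAM on Linux) and gives up after a timeout if the worker
never gets far enough to send anything. The message text, socket file
name and function name are hypothetical; this is not Squid's IPC
protocol.

```python
import os
import socket
import tempfile


def wait_for_registration(worker_starts, timeout=0.2):
    """Return the worker's message, or a timeout notice if none arrives."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "coordinator.ipc")  # hypothetical socket path
        coordinator = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        coordinator.bind(path)
        coordinator.settimeout(timeout)
        try:
            if worker_starts:
                # A healthy worker sends its registration datagram.
                worker = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
                worker.sendto(b"register kid1", path)
                worker.close()
            # A worker that died during startup sends nothing at all.
            try:
                data, _ = coordinator.recvfrom(1024)
                return data.decode()
            except socket.timeout:
                return "kid1 registration timed out"
        finally:
            coordinator.close()


if __name__ == "__main__":
    print(wait_for_registration(True))
    print(wait_for_registration(False))
```

The coordinator cannot tell why the worker stayed silent. It only sees
that nothing arrived in time.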

> how can we adjust the path where squid stores the ipc files for the
> coordinator and kids? it is /path/to/squid/var/run/squid. this is
> because we used --prefix="/path/to/squid". but although we have used
> --prefix we want to have the files in other dirs. with (nearly) all of
> them we can adjust the locations in squid.conf and use the full/other path,

The OS determines where/what the SHM path descriptor has to be. We like
to follow the FHS specification since these are special networking
*socket* descriptors not "files". That may or may not permit your
--prefix to apply on the path, but we do not allow localization.

>
> here the output of squid -v with compile options. do we have to set
> one option to have workers working?

No. SMP has no ./configure options. It is always built when supported
by the system. The "workers" config file directive is how you turn it
on/off (off being 'workers 1' or absent from the file).

>
> Squid Cache: Version 3.2.1
> configure options: '--enable-basic-auth-modules=MSNT,SMB,NEGOTIATE'

--enable-basic-auth-modules does not exist. Never has, AFAICT.

I assume you mean: --enable-auth-basic="MSNT SMB NEGOTIATE"

* I am also assuming you have a local basic auth module called
"NEGOTIATE" which you are patching into the Squid sources, since we
don't publish any such helper.

> '--enable-external-acl-helpers=ldap_group'
> '--enable-auth-basic'

see above.

> '--enable-auth-ntlm'
> '--enable-auth-negotiate=squid_kerb_auth'

--enable-auth-negotiate=kerberos

otherwise the negotiate_kerberos_auth binary will not build.

> '--enable-ntlm-fail-open'

--enable-ntlm-fail-open is obsolete. It has not worked since *before*
squid-2.5.

> '--enable-delay-pools'
> '--enable-follow-x-forwarded-for'
> '--with-maxfd=4095'

--with-maxfd was an experiment by RHEL. The squid option is
--with-filedescriptors=4095

> '--enable-removal-policies=lru,heap'
> '--with-winbind'

--with-winbind does not exist.

> '--with-async-io'
> '--enable-storeio=ufs,aufs'
> '--disable-ident-lookups'
> '--prefix=/path/to/squid'
> '--enable-err-language=German'

--enable-err-language is obsolete.
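
The verdicts above can be collected into a small lookup for checking a
"squid -v" option list. This is only a sketch: the table holds nothing
beyond what is stated in this review, an option missing from it is not
thereby confirmed valid, and the function name is hypothetical.

```python
# Verdicts copied from the review above; nothing else is asserted.
REVIEW = {
    "--enable-basic-auth-modules":
        'does not exist; probably meant --enable-auth-basic="MSNT SMB NEGOTIATE"',
    "--enable-auth-negotiate=squid_kerb_auth":
        "use --enable-auth-negotiate=kerberos",
    "--enable-ntlm-fail-open": "obsolete",
    "--with-maxfd": "use --with-filedescriptors=4095",
    "--with-winbind": "does not exist",
    "--enable-err-language": "obsolete",
}


def review_options(options):
    """Return (option, verdict) pairs for the options the review flagged."""
    findings = []
    for opt in options:
        name = opt.split("=", 1)[0]
        # Match the full option=value first, then the bare option name.
        verdict = REVIEW.get(opt) or REVIEW.get(name)
        if verdict:
            findings.append((opt, verdict))
    return findings


if __name__ == "__main__":
    for opt, verdict in review_options(
        ["--with-maxfd=4095", "--enable-delay-pools", "--with-winbind"]
    ):
        print(f"{opt}: {verdict}")
```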

> '--enable-underscores'
> '--with-large-files'
> '--enable-dlmalloc'
> --enable-ltdl-convenience
>
> mfg
>
> Markus Rietzler
> <rietzler_software/>
> Rechenzentrum der Finanzverwaltung
>
> Tel: 0211/4572-2130
Received on Wed Oct 10 2012 - 21:58:41 MDT