Re: [squid-users] Squid->DG->Squid

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 27 Jul 2011 18:15:09 +1200

On 27/07/11 00:12, Andy Rogers wrote:
> Hi
>
> Over the past month I have been setting up and implementing a Squid3
> Setup which uses SSO against Windows DC's, after originally following
> the excellent guide from
> http://howtoforge.com/debian-squeeze-squid-kerberos-ldap-authentication-active-directory-integration-and-cyfin-reporter.
>
> I then decided to take this one step further and introduce
> DansGuardian into the loop for Content Filtering.

See that word "loop". It's important.

>
> The Setup Iam working with is SquidProxy->DG->SquidProxy.

The well known "sandwich" setup is designed for use with three proxies:
   two Squid and one DG.

You are hitting the equally well known problem when you try to perform
it with only two proxies: one squid, one DG.

>
> The First is Squid to Authorise against AD and listen on Port 8080,
> then pass onto DG for Content Filtering on Port 8081, and then DG to
> pass back to the original single instance of Squid running back to
> Port 3128.
>
> This does work very well, but after studying some logs I don't think
> much is being pulled from the Squid Cache, most of my squid access.log
> is like this
>
> ---
> 1311675442.422 228 localhost TCP_MISS/200 2093 GET
> http://static.howtoforge.com/images/teaser/ubuntu.gif -
> DIRECT/178.63.27.110 image/gif
> 1311675442.423 268 192.168.22.107 TCP_MISS/200 2499 GET
> http://static.howtoforge.com/images/teaser/ubuntu.gif user@MY.LOCAL
> FIRST_UP_PARENT/127.0.0.1 image/gif
> 1311675442.473 134 localhost TCP_MISS/200 4103 GET
> http://static.howtoforge.com/themes/htf_glass/images/star_vmware_image_red.png
> - DIRECT/178.63.27.110 image/png
> 1311675442.473 166 192.168.22.107 TCP_MISS/200 4509 GET
> http://static.howtoforge.com/themes/htf_glass/images/star_vmware_image_red.png
> user_at_MY.LOCAL FIRST_UP_PARENT/127.0.0.1 image/png
> 1311675442.540 131 localhost TCP_MISS/200 1313 GET
> http://howtoforge.com/themes/htf_glass/images/next_page.gif -
> DIRECT/188.40.16.205 image/gif
> 1311675442.540 161 192.168.22.107 TCP_MISS/200 1719 GET
> http://howtoforge.com/themes/htf_glass/images/next_page.gif
> user_at_MY.LOCAL FIRST_UP_PARENT/127.0.0.1 image/gif
> ---
>
>
> And the corresponding DG access.log shows
>
> ---
> 2011.7.26 11:17:18 user@my.local 192.168.22.107 http://howtoforge.com
> *SCANNED* GET 63490 -320 1 200 text/html -v
> ---
>
> Also in my cache.log I have spotted on almost all website's iam
> getting this "WARNING: Forwarding loop detected for:" :-
>
> ---
> 2011/07/26 11:11:51| WARNING: Forwarding loop detected for:
> GET /debian-squeeze-squid-kerberos-ldap-authentication-active-directory-integration-and-cyfin-reporter
> HTTP/1.0
<snip,snip>
> Via: 1.1 squid.my.local (squid/3.1.14)
> X-Forwarded-For: 192.168.22.107
> X-Forwarded-For: 192.168.22.107
> ---

192.168.22.107 just connected in,
  forwarding for 192.168.22.107 via squid.my.local which is
  forwarding for 192.168.22.107 ...

How was Squid to know "..." wasn't going to be an infinite series of
192.168.22.107 via itself?

>
> Is their anything incorrect or need adding/changing to my squid.conf
> so I can get squid to get a better hit rate with the Cache stored&
> also to remove the "WARNING: Forwarding loop detected for" message in
> my cache.log.
> Would I need to setup 2 separate squid instances instead of
> Squid1->DG->Squid2 instead?

This is the sandwich configuration looping on itself. You have several
choices:

  * configure two instances of Squid
  * configure client->DG->Squid
  * configure client->Squid->DG
  * disable the Via header with "via off" and cross your fingers there is
never any actual infinite loop. You will need a Squid built with HTTP
violations enabled to do that. Loop protection is REQUIRED by the HTTP
standards.

This only solves the loop issue though. Cache MISS is separate...

> If I do, how would I need to alter my
> current squid.conf into 2 separate files?

copy-n-paste the /etc/init.d/squid3 file and make each load a different
config file into squid with the -f startup option.

Then make each config file do the job you want its Squid to do.
It doesn't matter which instance is which, as long as each config is
clear about what it connects up to.
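
As a rough sketch of what the two files might contain (not a tested
setup): the file names, hostnames, log, PID and cache_dir paths below are
all made up for illustration, and it assumes every request from the front
instance is sent on to DG. Each instance needs its own ports, PID file,
logs and hostname so the two do not collide or mistake each other's Via
header for their own.

```conf
# --- /etc/squid3/squid-front.conf (hypothetical name): auth, then hand to DG ---
http_port 8080
visible_hostname squid-front.my.local
pid_filename /var/run/squid3-front.pid
access_log /var/log/squid3/front-access.log squid
cache_log /var/log/squid3/front-cache.log
# caching is left to the back instance
cache deny all
# ... existing auth_param / external_acl_type / acl / http_access lines go here ...
cache_peer 127.0.0.1 parent 8081 0 no-query no-digest login=PASS
never_direct allow all

# --- /etc/squid3/squid-back.conf (hypothetical name): fetch and cache for DG ---
http_port 127.0.0.1:3128
visible_hostname squid-back.my.local
pid_filename /var/run/squid3-back.pid
access_log /var/log/squid3/back-access.log squid
cache_log /var/log/squid3/back-cache.log
cache_dir ufs /var/spool/squid3-back 1000 16 256
acl localhost src 127.0.0.1/32 ::1
http_access allow localhost
http_access deny all
```

Each copied init script then starts its instance with "-f" pointing at
one of these files.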

On the whole, since you have requests that skip DG, and some get cached
you are not in a position to only have caching on the second loop.
Things entering the first loop WILL be found in the cache even if you
wanted them to go through DG first.

  I think you are in a safe position to simply choose the
client->Squid->DG option: Squid does the auth and caching, and only the
requests which need to go through DG do so.
  OR, pick the client->DG->Squid option for just the clients which need
to go through DG. Others can go straight client->Squid.

>
> My squid.conf
>
> ---
> ####### /etc/squid3/squid.conf Configuration File #######
> ####### cache manager
> cache_mgr squid_at_mycompany.co.uk
>
> ####### kerberos authentication
> auth_param negotiate program /usr/lib/squid3/squid_kerb_auth -d -s
> HTTP/proxy.my.local
> auth_param negotiate children 10
> auth_param negotiate keep_alive on
>
> ####### provide access via ldap for clients not authenticated via kerberos
> auth_param basic program /usr/lib/squid3/squid_ldap_auth -R \
> -b "dc=my,dc=local" \
> -D squid_at_my.local \
> -w "password" \
> -f sAMAccountName=%s \
> -h dc.my.local
> auth_param basic children 10
> auth_param basic realm Internet Proxy
> auth_param basic credentialsttl 1 minute
> ####### ldap authorizations
> # restricted proxy access logged
> external_acl_type internet_users %LOGIN /usr/lib/squid3/squid_ldap_group -R -K \
> -b "dc=my,dc=local" \
> -D squid_at_my.local \
> -w "password" \
> -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=Internet
> Users,ou=Internet Groups,dc=my,dc=local))" \
> -h dc.tg.local
> # full proxy access no logging
> external_acl_type internet_users_full_nolog %LOGIN
> /usr/lib/squid3/squid_ldap_group -R -K \
> -b "dc=my,dc=local" \
> -D squid_at_my.local \
> -w "password" \
> -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=Internet
> Users Full NoLog,ou=Internet Groups,dc=my,dc=local))" \
> -h dc.tg.local
> # full proxy access logged
> external_acl_type internet_users_full_log %LOGIN
> /usr/lib/squid3/squid_ldap_group -R -K \
> -b "dc=my,dc=local" \
> -D squid_at_my.local \
> -w "password" \
> -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=Internet
> Users Full Log,ou=Internet Groups,dc=my,dc=local))" \
> -h dc.tg.local
>
> ####### acl for proxy auth and ldap authorizations
> acl auth proxy_auth REQUIRED
>
> # format "acl, aclname, acltype, acltypename, activedirectorygroup"
> acl RestrictedAccessLog external internet_users Internet\ Users
> acl FullAccessNoLog external internet_users_full_nolog Internet\
> Users\ Full\ NoLog
> acl FullAccessLog external internet_users_full_log Internet\ Users\ Full\ Log

Couple of notes:
  1) \-escape syntax does not work here. Whitespace needs to be
url-encoded if the backend accepts that. squid_ldap_group is not one of
those though.

The solution is to load the group name(s) from a file into the acl line.
And in the file /etc/squid3/IUFullLog exactly one line:
    Internet Users Full Log

When this is done you will find the line from that file gets passed
as the group value to the external_acl_type helper.
Which means you no longer have to hard-code the group name into six
different places in the config. One external_acl_type which makes use of
the group value "%g" can be used for all the external "acl" lines.

Example:

   external_acl_type group %LOGIN \
     /usr/lib/squid3/squid_ldap_group -R -K \
        -b "dc=my,dc=local" \
        -D squid_at_my.local \
        -w "password" \
        -f "(&(objectclass=person)(sAMAccountName=%v)(memberof=cn=%g,dc=my,dc=local))" \
        -h dc.tg.local

   acl FullAccessLog external group "/etc/squid3/IUFullLog"
   acl FullAccessNoLog external group "/etc/squid3/IUFullNoLog"
   ...

> acl whitelistsites url_regex -i "/etc/squid3/whitelistsites.txt"
> acl blockedsites url_regex -i "/etc/squid3/blockedsites.txt"
>
> ####### squid defaults
> acl manager proto cache_object
> acl localhost src 127.0.0.1/32 ::1
> acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 ::1
> acl SSL_ports port 443
> acl Safe_ports port 80-81 # http
> acl Safe_ports port 21 # ftp
> acl Safe_ports port 443 # https
> acl Safe_ports port 70 # gopher
> acl Safe_ports port 210 # wais
> acl Safe_ports port 1025-65535 # unregistered ports
> acl Safe_ports port 280 # http-mgmt
> acl Safe_ports port 488 # gss-http
> acl Safe_ports port 591 # filemaker
> acl Safe_ports port 777 # multiling http
> acl CONNECT method CONNECT
> http_access allow manager localhost
> http_access deny manager
> http_access deny !Safe_ports
> http_access deny CONNECT !SSL_ports
> http_access allow localhost
>
> ####### enforce auth: order of rules is important for authorization levels
> no_cache deny whitelistsites

The "no_" bit there is a bad typo. Remove it and what this line actually
does will become clear.

I think usually you want to accept whitelisted things. So that should be
a blacklist, or the whole line should be erased.
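
For reference, a sketch of the same line with the "no_" removed. As far
as I recall "cache" is simply the current name of the old "no_cache"
directive, so the behaviour is unchanged and only the spelling differs:

```conf
# Equivalent to the quoted "no_cache deny whitelistsites":
# responses for URLs matching whitelistsites are never stored in the cache.
cache deny whitelistsites
```

Written that way it is easier to see that the whitelisted sites are the
ones being kept out of the cache.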

> http_access allow whitelistsites
> http_access allow FullAccessNoLog auth
> http_access allow FullAccessLog auth
> http_access deny blockedsites
> http_access allow RestrictedAccessLog auth
>
> ####### logging
> # don't log FullAccessNoLog
> access_log /var/log/squid3/access.log squid !FullAccessNoLog
>
> ####### squid defaults
> http_access deny all
>
> #Log Connecting Client DNS Names instead on IP Names.
> log_fqdn on
>
> http_port 127.0.0.1:3128
> http_port 8080
>
> ##Push Traffic Through DansGuradian for Content Filtering
> cache_peer 127.0.0.1 parent 8081 0 no-query proxy-only no-delay
> no-netdb-exchange no-digest connect-timeout=15 login=*:password

Why not just "login=PASS" (exact text) to relay the user's credentials?

When these new altered details loop back to the current Squid it may see
a user logged in twice with two different passwords from two different
sources. Could be bad. Particularly since you decide whether to divert
through DG based on the group name instead of on whether the request
has already been through this Squid once.

I think you want to make this the top of the peer access controls:
  # already passed to DG once. Don't do it twice.
  cache_peer_access 127.0.0.1 deny localhost

> ##FullInteret Users UnFiltered will only go through Squid& will not
> go through Dansguardian).
> cache_peer_access 127.0.0.1 allow RestrictedAccessLog
> cache_peer_access 127.0.0.1 deny all
>
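
Putting the two suggestions above together, the peer section might look
like the sketch below. The cache_peer options other than login= are
copied unchanged from the quoted config. It relies on the "localhost" and
"RestrictedAccessLog" ACLs already defined earlier in that config. Order
matters, since the first matching cache_peer_access line decides.

```conf
# Send traffic through DansGuardian, relaying the user's own credentials
cache_peer 127.0.0.1 parent 8081 0 no-query proxy-only no-delay no-netdb-exchange no-digest connect-timeout=15 login=PASS

# already passed to DG once. Don't do it twice.
cache_peer_access 127.0.0.1 deny localhost
# only the restricted group is filtered by DG
cache_peer_access 127.0.0.1 allow RestrictedAccessLog
cache_peer_access 127.0.0.1 deny all
```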

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.14
   Beta testers wanted for 3.2.0.10
Received on Wed Jul 27 2011 - 06:15:17 MDT
