Re: [squid-users] Caching issue with http_port when running in transparent mode

From: Hans Musil <hans.musil_at_gmx.de>
Date: Tue, 05 Jun 2012 19:54:12 +0200

Amos Jeffries wrote:

> On 29/05/2012 6:12 p.m., Hans Musil wrote:
> > Amos Jeffries wrote:
> >> On 29.05.2012 08:13, Eliezer Croitoru wrote:
> >>> hey there Hans,
> >>>
> >>> are you serving squid on the same machine as the gateway? (wasn't
> >>> sure about the DNAT).
> >>> your problem is not directly related to squid but to the way that tcp
> >>> and browsers work.
> >>> for every connection that the client browser uses there exists a tcp
> >>> window that stays alive for a period of time after the page was served.
> >>> this causes all the connections that were served using port
> >>> 3128 to still exist for, i think, 5 to 10 more minutes, or whatever
> >>> your tcp stack settings are.
> >>
> >> While that is true for the TCP details I think HTTP connection
> >> behaviour is why that matters. For the TCP timeouts closure to start
> >> happening HTTP has to first stop using the connection.
> >>
> >> iptables NAT only affects SYN packets (ie new connections). So any
> >> existing TCP connections made by HTTP WILL continue to operate
> >> despite any changes to NAT rules.
> >>
> >> HTTP persistent connections, CONNECT tunnels and HTTP
> >> "streaming"/large objects have no fixed lifetime and several minutes
> >> for idle timeout. It is quite common to see client TCP connections
> >> lasting whole hours or days with HTTP traffic flow throughout.
> >>
> >>>
> >>> On 28/05/2012 22:34, Hans Musil wrote:
> >>>> Hi,
> >>>>
> >>>> my box is running on Debian Squeeze, which uses SQUID version
> >>>> 2.7.STABLE9, but my problem also seems to affect SQUID version 3.1.
> >>>>
> >>>> These are the important lines from my squid.conf:
> >>>>
> >>>> http_port 3128 transparent
> >>>> http_port 3129 transparent
> >>>> url_rewrite_program /etc/squid/url_rewrite.php
> >>>>
> >>>>
> >>>> First, I did configure my Linux iptables like this:
> >>>>
> >>>> # Generated by iptables-save v1.4.8 on Mon May 28 21:04:09 2012
> >>>> *nat
> >>>> :PREROUTING ACCEPT [0:0]
> >>>> :POSTROUTING ACCEPT [0:0]
> >>>> :OUTPUT ACCEPT [0:0]
> >>>> -A PREROUTING -i eth1 -p tcp -m tcp --dport 80 -j DNAT --to-destination 10.17.0.1:3128
> >>>> COMMIT
> >>>>
> >>>> and everything works fine.
> >>>>
> >>>> But when I change the redirect port in the iptables settings from
> >>>> 3128 to 3129, Squid behaves strangely: My URL rewrite program still
> >>>> gets sent myport=3128, although there is definitely no more
> >>>> request on this port, but only on 3129. This only affects HTTP
> >>>> domains that already have been requested before, i.e. with
> >>>> redirection to port 3128, and it works fine again when I do a
> >>>> force-reload on my browser. Also, things turn well when waiting
> >>>> some minutes.
> >>>>
> >>>> I suppose there is some strange caching inside Squid that maps the
> >>>> HTTP domain to an incoming port.
> >>
> >> No. There is only an active TCP connection. Multiple HTTP requests can
> >> arrive on the connection long after you start sending unrelated new
> >> connections+requests through other ports.
> >>
> >>
> >> What your helper was passed is the details about the request Squid
> >> received. It arrived on a TCP connection which was accepted through
> >> Squid port 3128. The fact that you changed the kernel settings after
> >> that connection was setup and operating is irrelevant.
> >>
> >>
> >> URL-rewriting is a form of NAT on the URL, but with far worse
> >> side-effects than IP-layer NAT and is often a sign of major design
> >> mistakes somewhere in the network. Why do you have to re-write in the
> >> first place? Perhaps we could point you at a simpler more standards
> >> compliant setup.
> >>
> >> Amos
> >>
> > Thanks Amos. This makes things even clearer. Actually, I'd say that my
> > problem is solved with the help of both of you. But well, let's have a
> > look on my design.
> >
> > My goal is to build up an access control mechanism for my client
> > machines to the internet. As long as a user has not yet logged in, his
> > client box should be completely cut off the internet, not only HTTP.
> >
> > The login is done by a web interface. This is where I redirect the URL
> > rewriting for any web traffic. After the user has logged in, the
> > client's HTTP packets will be DNATed to the other squid port in order
> > to be regularly proxied. I need the HTTP proxy for logging my users'
> > HTTP requests.
> >
> > Since the users' client machines are out of my control, it is
> > important for me that they don't need any special configuration.
> > That's why the squid must run in transparent mode.
>
> Okay. As expected a design problem. The huge problem with transparent
> intercept is that the browser is 100% unaware that the proxy exists. As
> far as it is concerned the re-written splash page or redirect response
> is the actual response to somebody else's domain name (google or your
> bank for example). It has zero reason to think that a new TCP connection
> is needed for followup requests. Just because the server of that page
> replied Connection:close is no reason to expect Squid to pass the
> closure on to the client (quite the reverse, Squid will go out of its
> way to keep client connections open and re-used).
>
>
> To fit in with your existing config that would be:
>
> acl port3128 myportname 3128
> deny_info http://your-login.example.com/ port3128
> http_access deny port3128
>
> The full details and some other tricks can be found at
> http://wiki.squid-cache.org/ConfigExamples/Portal/Splash
>
> This still hits the DNAT problems. I would suggest finding an
> external_acl_type helper that accesses whatever database your login
> script is recording client logins with. Using that as the ACL to deny /
> bounce new clients to the login page. With that design you can authorize
> a client on their initial request and continue using the connection
> afterwards.
>
> NP: I recently posted to the list a version of the external_acl_type
> helper I use myself for exactly this type of portal setup.
>
> Amos

Amos, I'm back. Thanks for your last posting.

Your trick with acl, deny_info and http_access was a big help.

As far as I understand, the external_acl_type helper needs to decide every few seconds whether a client is logged in or not. With some hundreds of clients, this means hundreds of database lookups per second. That's what I wanted to avoid by flipping the squid port when a user logs in or out. This way, I only have one iptables rule instead of multiple DB lookups.
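
For reference, the helper side of that approach would be small. Below is a rough, untested sketch in Python of a helper speaking the simple line-based external_acl_type protocol (one request per line, answered with OK or ERR), assuming the helper is configured with %SRC as its only format field. The LOGGED_IN set and the function names are made up for illustration and stand in for the real login database. How often Squid actually asks the helper is governed by the ttl= and negative_ttl= options of external_acl_type, so the lookup rate may be lower than I fear.

```python
import sys

# Hypothetical stand-in for the login database; a real helper would
# query whatever store the login script writes to.
LOGGED_IN = {"10.17.0.42"}


def decide(line, logged_in):
    """Answer one helper request line.

    Assumes the first whitespace-separated field is the client IP (%SRC).
    Returns "OK" for a logged-in client, "ERR" otherwise.
    """
    fields = line.split()
    if not fields:
        return "ERR"
    return "OK" if fields[0] in logged_in else "ERR"


def serve(stdin=sys.stdin, stdout=sys.stdout, logged_in=LOGGED_IN):
    """Read requests line by line and write one answer per request."""
    for line in stdin:
        stdout.write(decide(line, logged_in) + "\n")
        # Squid waits for each answer, so do not let output sit in a buffer.
        stdout.flush()
```

A real helper script would end by calling serve().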

As for the DNAT problem, I am considering simply running "conntrack -D" with appropriate -s and -d options from my login/logout script.
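
Roughly what I have in mind for the login/logout script, as an untested sketch in Python: build a "conntrack -D" call that removes the tracked port-80 connections of one client, so that the client's existing connections are no longer tied to the old DNAT mapping. The function names are my own, and restricting the match to TCP port 80 with -p and --dport is an assumption on top of the -s/-d filtering. The command needs root and the conntrack tool installed.

```python
import subprocess


def conntrack_delete_argv(client_ip, dst_ip=None, dport=80):
    """Build the argument list for deleting a client's tracked connections.

    client_ip: original source address of the connections (-s).
    dst_ip:    optional original destination address (-d).
    dport:     original destination port, 80 for intercepted HTTP.
    """
    argv = ["conntrack", "-D", "-s", client_ip]
    if dst_ip is not None:
        argv += ["-d", dst_ip]
    argv += ["-p", "tcp", "--dport", str(dport)]
    return argv


def flush_client(client_ip):
    """Run the delete and hand the exit status back to the caller.

    The status is returned rather than raised so the login script can
    decide for itself how to treat a run that deleted nothing.
    """
    return subprocess.run(conntrack_delete_argv(client_ip)).returncode
```

The login script would then call flush_client() with the client's address right after swapping the iptables rule.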

Hans

Received on Tue Jun 05 2012 - 17:54:21 MDT