[squid-users] Mixing cached and non-cached access of same URLs by session-id

From: Schermuly-Koch, Achim <a.schermuly-koch_at_cassini.de>
Date: Thu, 20 Aug 2009 16:14:25 +0200

Hi there,

I am trying to configure squid for the following use case:

We are using squid as a reverse-proxy cache to speed up our website. A large area of the website is public, but there is also a personalized area. When a user logs into his personal site, we maintain a session for that user (using the standard Tomcat mechanism: a JSESSIONID cookie with optional URL rewriting).

I can easily tell the private and public areas apart by examining the URL, so configuring caching for the private area is no problem.
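
For illustration, the private area can be excluded along these lines. This is only a sketch: the /private/ path prefix is a hypothetical placeholder, not our real URL layout.

  # Hypothetical: the personalized area lives under /private/
  acl PRIVATE urlpath_regex ^/private/
  cache deny PRIVATE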

However, the pages in the public area have a small caveat: if the user has logged into the private area, we maintain the "logged-in" state and reflect it on the public pages as well (outputting "Welcome John Doe" in a small box).
Of course we must not cache these pages.

  # Recognizes mysite
  acl MYSITE url_regex ^http://[^.]*\.mysite\.de

  # Don't cache pages, if user sends or gets a cookie
  acl JSESSIONID1 req_header Cookie -i jsessionid
  cache deny MYSITE JSESSIONID1

  acl JSESSIONID2 rep_header Set-Cookie -i jsessionid
  cache deny MYSITE JSESSIONID2

This seemed to work fine, until I ran a JMeter test mixing requests with and without session-id cookies. It seems that if I request an already cached URL with a session cookie, the cached document is flushed. This is correct from a security point of view: we are not leaking any private data. But a subsequent request without a cookie to the very same URL reports a cache miss (judging by the response headers), which is not so good for performance. The hit rate degrades with the number of requests carrying cookies.
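
Schematically, the sequence that shows the problem looks like this (the /page path is a placeholder, and HIT/MISS is what I read from the response headers):

  1. GET /page  (no cookie)            -> HIT   (page is already cached)
  2. GET /page  (Cookie: JSESSIONID)   -> MISS  (correct, nothing leaked)
  3. GET /page  (no cookie)            -> MISS  (cached copy is gone)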

The next idea was to use "always_direct allow" instead of "cache deny" for private access to public pages. My understanding was that squid then bypasses all other internal processing:

  # Recognizes mysite
  acl MYSITE url_regex ^http://[^.]*\.mysite\.de

  # Don't cache pages, if user sends or gets a cookie
  acl JSESSIONID1 req_header Cookie -i jsessionid
  always_direct allow MYSITE JSESSIONID1

  acl JSESSIONID2 rep_header Set-Cookie -i jsessionid
  always_direct allow MYSITE JSESSIONID2

Big surprise: even requests without a cookie are always cache misses. OK, there is no explicit "cache allow" rule, but there wasn't one in the former example either. So what happened? Anyway, I added an explicit rule at the end, hoping it would be used if all "always_direct" rules evaluated to "false":

  cache allow MYSITE

Even bigger surprise:

Now the requests containing a session cookie are also served from the cache (indicated by a Cache-Hit header and the missing "Welcome" box). This is not acceptable, because we might leak private data.

One last effort (now stabbing in the dark), adding both directives:

  always_direct allow MYSITE JSESSIONID1
  cache deny MYSITE JSESSIONID1

Now the result is the same as with the very first configuration, as if always_direct weren't used at all.

Can anyone please help? Is my problem solvable at all? Can someone shed some light on what "always_direct" is meant for (I have read something about cache hierarchies...)?

Regards

achim
Received on Thu Aug 20 2009 - 14:14:40 MDT
