Re: [squid-users] Squid mitigation of advanced persistent tracking

From: Amos Jeffries <>
Date: Wed, 17 Aug 2011 14:35:53 +1200

 On Tue, 16 Aug 2011 18:16:38 -0700 (PDT), John Hardin wrote:
> On Wed, 3 Aug 2011, Amos Jeffries wrote:
>> On Tue, 2 Aug 2011 13:39:51 -0700 (PDT), John Hardin wrote:
>>> The analysis of the APT techniques used by Kissmetrics (at
>>> is
>>> interesting if thin, and suggests one way that Squid might be
>>> leveraged to interfere with such tracking: deleting the "Etag:"
>>> header
>>> from request replies.
> /me bows head in shame
>>> Comments?
>> All they are doing is a server-side browsing session. But unlike
>> Cookies, ETag are usually shared between many clients simultaneously.
>> Middleware like Squid is able to reply to them instead of contacting
>> the origin site. Even creates new ones the origin is not aware of when
>> compressing on the fly.
> Some more details are available in the more-academic paper:
> One example in that paper:
> GET /i.js HTTP/1.1
> Host:
> Etag: "Z9iGGN1n1-zeVqbgzrlKkl39hiY"
> Expires: Sun, 12 Dec 2038 01:19:31 GMT
> Last-Modified: Wed, 27 Jul 2011 00:19:31 GMT
> Set-Cookie: _km_cid=Z9iGGN1n1-zeVqbgzrlKkl39hiY;
> expires=Sun, 12 Dec 2038 01:19:31 GMT;path=/;
> ...has the possibly useful signature of the Etag value appearing in a
> cookie being set. Any comments on the utility of writing an eCAP
> filter to block _that_ (to either strip the cookie or block the
> entire
> response)?
> "Give up" isn't helpful. :)

 Could be useful. Up to you.
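
 For what it is worth, the signature you describe (the ETag value
 reappearing inside a Set-Cookie value) is easy to test for. A minimal
 sketch in Python rather than eCAP, assuming the response headers are
 available as a list of (name, value) pairs; the function names are
 made up for illustration and are not part of any Squid or eCAP API:

```python
def etag_token(value):
    """Strip the weak-validator prefix and surrounding quotes from an ETag."""
    value = value.strip()
    if value.startswith("W/"):
        value = value[2:]
    return value.strip('"')


def etag_in_cookie(headers):
    """Return True if an ETag's opaque value also appears in a cookie value.

    `headers` is a list of (name, value) tuples (hypothetical input shape).
    """
    etags = [etag_token(v) for k, v in headers if k.lower() == "etag"]
    cookies = [v for k, v in headers if k.lower() == "set-cookie"]
    for tag in etags:
        if not tag:
            continue
        for cookie in cookies:
            # The first "name=value" pair of Set-Cookie is the cookie itself;
            # everything after the first ';' is attributes (expires, path...).
            pair = cookie.split(";", 1)[0]
            _, _, cookie_value = pair.partition("=")
            if tag in cookie_value.strip().strip('"'):
                return True
    return False
```

 Very short ETag values would produce false positives with a plain
 substring match, so a real filter would probably want a minimum length.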

 This particular case comes under "Middleware like Squid is able to
 reply to them instead of contacting the origin site".
  ** The object stays fresh until its Expires date, so there is no need
 to contact the origin (or tracker) until 2038, unless the client
 request explicitly contains "no-cache" or "max-age=0" to force
 immediate revalidation.
  ** Nothing indicates that the response was customized. Therefore a
 cache may send it to arbitrary clients requesting the same object _by
 URL alone_. It may also be sent in response to client revalidations
 carrying _any_ older ETag value.
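
 A rough sketch of the freshness reasoning in the first point, in
 Python. This is a deliberate simplification of HTTP caching, not
 Squid's actual logic: the function names are made up for illustration,
 and only Expires plus the two request directives mentioned above are
 considered.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def directives(cache_control):
    """Split a Cache-Control header value into a set of lowercase directives."""
    return {d.strip().lower() for d in cache_control.split(",") if d.strip()}


def can_serve_from_cache(expires, request_cache_control="", now=None):
    """Return True if a stored response may be reused without asking the origin."""
    now = now or datetime.now(timezone.utc)
    try:
        expiry = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        # Unparseable Expires: treat the object as already stale.
        return False
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    if expiry <= now:
        return False
    # The client can still force revalidation explicitly.
    forced = directives(request_cache_control) & {"no-cache", "max-age=0"}
    return not forced
```

 With the Expires value from the example, this stays true for every
 ordinary request until 2038, so the tracker never sees those clients.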

 If that is actually being used in practice I would seriously doubt any
 claims the tracker makes about their data accuracy. Particularly
 regarding Asia-Pacific regions data where cache farms are popular speed

 For the tracking to work reliably, the response would need an Expires
 date in the past or Cache-Control values that prevent caching. In that
 case the ETag is safe to drop along with the Cookie. :)

 If you want to go the route of creating a filter, IMO it would be most
 effective to calculate the MD5 or SHA1 of the body instance (skipping
 range request responses, since the body there is not the whole object
 instance), then record an index of object hash versus ETag values. If
 you see non-identical bodies using one ETag, or vice versa, the origin
 is broken (this type of tracking is regarded as such).
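
 The hash-versus-ETag index could be sketched like this in Python. The
 class and method names are hypothetical, the index is keyed per URL as
 an assumption of mine, and anything other than a plain 200 response is
 skipped so that partial (range) bodies are never hashed:

```python
import hashlib
from collections import defaultdict


class ETagIndex:
    """Track which body hashes and ETag values have been seen together."""

    def __init__(self):
        self.hashes_by_etag = defaultdict(set)  # (url, etag) -> body hashes
        self.etags_by_hash = defaultdict(set)   # (url, hash) -> ETag values

    def observe(self, url, etag, body, status=200):
        """Record one response; return True if the origin looks broken.

        Returns None for non-200 responses (e.g. 206 Partial Content),
        where the body is not the full object instance.
        """
        if status != 200:
            return None
        digest = hashlib.sha1(body).hexdigest()
        bodies = self.hashes_by_etag[(url, etag)]
        etags = self.etags_by_hash[(url, digest)]
        bodies.add(digest)
        etags.add(etag)
        # One ETag covering several bodies, or one body carrying several
        # ETags, both count as a mismatch.
        return len(bodies) > 1 or len(etags) > 1
```

 A tracker handing out a unique ETag per visitor for the same script
 would trip the second condition on the second response seen.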

 Looks like KISSmetrics have officially given up the arms race anyway.
 As of 29th July.

Received on Wed Aug 17 2011 - 02:35:58 MDT
