X-Vary-Options patch

From: Tim Starling <tstarling@dont-contact.us>
Date: Fri, 08 Feb 2008 16:26:47 +1100

There are two major sources of suboptimal hit rate on Wikipedia which
relate to the Vary header:

* In Accept-Encoding, we only care whether "gzip" is present or not, but
IE and Firefox use different whitespace conventions and so each get
separate entries in the cache
* We only care whether the user is logged in or not. Other cookies, such
as pure-JavaScript cookies used by client-side code to store
preferences, unnecessarily degrade our hit rate.

There have been other patches related to this problem, but as far as I'm
aware, they're all special-case, site-specific hacks. My patch adds an
X-Vary-Options response header (hereafter XVO), and thus gives the
origin server fine control over cache variance. In the patch, the XVO
header overrides the Vary header, so the Vary header can still be sent
as usual for compatibility with caches that don't support this feature.

The format of the XVO header is inspired by the format of the Accept
header. As in Vary, XVO is separated by commas into parts which relate
to different request headers. Then those parts are further separated by
semicolons. The first semicolon-separated part is the request header
name, and subsequent parts give name/value pairs separated by equals
signs, defining options relating to the variance of that header.

Two option names are currently defined:

  list-contains: splits the request header into comma-separated parts
and varies depending on whether the resulting list contains the option value
  string-contains: performs a simple search of the request header and
varies depending on whether it matches.

Multiple such options per header are allowed.

So for example:

X-Vary-Options: Cookie; string-contains=UserID;
string-contains=_session, Accept-Encoding; list-contains=gzip

This would vary the cache on three tests:
* whether the Cookie header contains the string "UserID"
* whether the Cookie header contains the string "_session"
* whether the Accept-Encoding header, interpreted as a comma-separated
list, contains the item "gzip"

The patch refactors all references to the Vary and X-Accelerator-Vary
headers into the functions httpHeaderHasVary() and httpHeaderGetVary()
in HttpHeader.c. It then adds X-Vary-Options to these functions,
interpreting it as a string rather than a list to avoid inappropriate
splitting on whitespace. It puts the resulting combined header into
X-Vary-Options instead of Vary in the base vary marker object, again to
avoid inappropriate list-style interpretation. httpMakeVaryMark() then
interprets this combined header in the way described above.

The added features of the patch are conditional, and are enabled by the
configure option --enable-vary-options. Autoconf and automake will need
to be run after applying this patch.

The interpretation of some non-standards-compliant Vary headers (those
containing semicolons) is changed slightly by this patch regardless of
--enable-vary-options.

The patch is attached and also available at:
http://noc.wikimedia.org/~tstarling/patches/vary_options_upstream.patch

For your review and consideration.

-- Tim Starling

diff -Xdiffx -ru squid-2.6.18.orig/configure.in squid-2.6.18/configure.in
--- squid-2.6.18.orig/configure.in 2008-01-10 23:34:23.000000000 +1100
+++ squid-2.6.18/configure.in 2008-02-07 19:43:23.000000000 +1100
@@ -1507,6 +1507,16 @@
   fi
 ])
 
+dnl Enable vary options
+AC_ARG_ENABLE(vary_options,
+[ --enable-vary-options
+ Enable support for the X-Vary-Options header.],
+[ if test "$enableval" = "yes" ; then
+ echo "Enabling support for vary options"
+ AC_DEFINE(VARY_OPTIONS, 1, [Enable support for the X-Vary-Options header])
+ fi
+])
+
 AC_ARG_ENABLE(follow-x-forwarded-for,
 [ --enable-follow-x-forwarded-for
                          Enable support for following the X-Forwarded-For
diff -Xdiffx -ru squid-2.6.18.orig/src/client_side.c squid-2.6.18/src/client_side.c
--- squid-2.6.18.orig/src/client_side.c 2008-02-07 19:28:38.000000000 +1100
+++ squid-2.6.18/src/client_side.c 2008-02-08 14:39:38.000000000 +1100
@@ -735,10 +735,7 @@
             request_t *request = http->request;
             const char *etag = httpHeaderGetStr(&mem->reply->header, HDR_ETAG);
             const char *vary = request->vary_headers;
- int has_vary = httpHeaderHas(&entry->mem_obj->reply->header, HDR_VARY);
-#if X_ACCELERATOR_VARY
- has_vary |= httpHeaderHas(&entry->mem_obj->reply->header, HDR_X_ACCELERATOR_VARY);
-#endif
+ int has_vary = httpHeaderHasVary(&entry->mem_obj->reply->header);
             if (has_vary)
                 vary = httpMakeVaryMark(request, mem->reply);
 
@@ -4948,10 +4945,7 @@
 varyEvaluateMatch(StoreEntry * entry, request_t * request)
 {
     const char *vary = request->vary_headers;
- int has_vary = httpHeaderHas(&entry->mem_obj->reply->header, HDR_VARY);
-#if X_ACCELERATOR_VARY
- has_vary |= httpHeaderHas(&entry->mem_obj->reply->header, HDR_X_ACCELERATOR_VARY);
-#endif
+ int has_vary = httpHeaderHasVary(&entry->mem_obj->reply->header);
     if (!has_vary || !entry->mem_obj->vary_headers) {
         if (vary) {
             /* Oops... something odd is going on here.. */
diff -Xdiffx -ru squid-2.6.18.orig/src/enums.h squid-2.6.18/src/enums.h
--- squid-2.6.18.orig/src/enums.h 2008-02-07 19:28:38.000000000 +1100
+++ squid-2.6.18/src/enums.h 2008-02-07 21:35:18.000000000 +1100
@@ -256,6 +256,9 @@
 #if X_ACCELERATOR_VARY
     HDR_X_ACCELERATOR_VARY,
 #endif
+#if VARY_OPTIONS
+ HDR_X_VARY_OPTIONS,
+#endif
     HDR_X_ERROR_URL, /* errormap, requested URL */
     HDR_X_ERROR_STATUS, /* errormap, received HTTP status line */
     HDR_FRONT_END_HTTPS,
diff -Xdiffx -ru squid-2.6.18.orig/src/http.c squid-2.6.18/src/http.c
--- squid-2.6.18.orig/src/http.c 2008-02-07 19:28:38.000000000 +1100
+++ squid-2.6.18/src/http.c 2008-02-08 14:48:44.000000000 +1100
@@ -353,20 +353,29 @@
     String vstr = StringNull;
 
     stringClean(&vstr);
- hdr = httpHeaderGetList(&reply->header, HDR_VARY);
- if (strBuf(hdr))
- strListAdd(&vary, strBuf(hdr), ',');
- stringClean(&hdr);
-#if X_ACCELERATOR_VARY
- hdr = httpHeaderGetList(&reply->header, HDR_X_ACCELERATOR_VARY);
- if (strBuf(hdr))
- strListAdd(&vary, strBuf(hdr), ',');
- stringClean(&hdr);
-#endif
+ vary = httpHeaderGetVary(&reply->header);
+ debug(11,3) ("httpMakeVaryMark: Vary: %s\n", strBuf(vary));
+
     while (strListGetItem(&vary, ',', &item, &ilen, &pos)) {
- char *name = xmalloc(ilen + 1);
- xstrncpy(name, item, ilen + 1);
- Tolower(name);
+ const char *sc_item, *sc_pos = NULL;
+ int sc_ilen;
+ String str_item;
+ char *name = NULL;
+ String value_spec = StringNull;
+ int need_value = 1;
+
+ stringLimitInit(&str_item, item, ilen);
+
+ /* Get the header name */
+ if (strListGetItem(&str_item, ';', &sc_item, &sc_ilen, &sc_pos)) {
+ name = xmalloc(sc_ilen + 1);
+ xstrncpy(name, sc_item, sc_ilen + 1);
+ Tolower(name);
+ } else {
+ name = xmalloc(1);
+ *name = '\0';
+ }
+
         if (strcmp(name, "accept-encoding") == 0) {
             aclCheck_t checklist;
             memset(&checklist, 0, sizeof(checklist));
@@ -381,22 +390,76 @@
         if (strcmp(name, "*") == 0) {
             /* Can not handle "Vary: *" efficiently, bail out making the response not cached */
             safe_free(name);
+ stringClean(&str_item);
             stringClean(&vary);
             stringClean(&vstr);
             break;
         }
- strListAdd(&vstr, name, ',');
+
+ /* Fetch the header string */
         hdr = httpHeaderGetByName(&request->header, name);
- safe_free(name);
- value = strBuf(hdr);
- if (value) {
- value = rfc1738_escape_part(value);
- stringAppend(&vstr, "=\"", 2);
- stringAppend(&vstr, value, strlen(value));
- stringAppend(&vstr, "\"", 1);
+
+ /* Process the semicolon-separated options */
+#ifdef VARY_OPTIONS
+ while (strListGetItem(&str_item, ';', &sc_item, &sc_ilen, &sc_pos)) {
+ char *opt_name = xmalloc(sc_ilen + 1);
+ char *opt_value;
+ char *eqpos;
+ xstrncpy(opt_name, sc_item, sc_ilen + 1);
+ eqpos = strchr(opt_name, '=');
+ if (!eqpos) {
+ opt_value = NULL;
+ } else {
+ *eqpos = '\0';
+ opt_value = eqpos + 1;
+ }
+ Tolower(opt_name);
+
+ if (strcmp(opt_name, "list-contains") == 0 && opt_value) {
+ if (strBuf(hdr) && strListIsMember(&hdr, opt_value, ',')) {
+ opt_value = rfc1738_escape_part(opt_value);
+ strListAdd(&value_spec, "list-contains[\"", ';');
+ stringAppend(&value_spec, opt_value, strlen(opt_value));
+ stringAppend(&value_spec, "\"]", 2);
+ }
+ need_value = 0;
+ } else if (strcmp(opt_name, "string-contains") == 0 && opt_value) {
+ if (strBuf(hdr) && strIsSubstr(&hdr, opt_value)) {
+ opt_value = rfc1738_escape_part(opt_value);
+ strListAdd(&value_spec, "string-contains[\"", ';');
+ stringAppend(&value_spec, opt_value, strlen(opt_value));
+ stringAppend(&value_spec, "\"]", 2);
+ }
+ need_value = 0;
+ } else {
+ debug(11,3) ("httpMakeVaryMark: unrecognised vary option: %s\n", opt_name);
+ }
+ safe_free(opt_name);
         }
+#endif
+
+ if (need_value) {
+ value = strBuf(hdr);
+ if (value) {
+ value = rfc1738_escape_part(value);
+ strListAdd(&value_spec, "\"", ';');
+ stringAppend(&value_spec, value, strlen(value));
+ stringAppend(&value_spec, "\"", 1);
+ }
+ }
+
+ strListAdd(&vstr, name, ',');
+ stringAppend(&vstr, "=", 1);
+ if (strBuf(value_spec)) {
+ stringAppend(&vstr, strBuf(value_spec), strLen(value_spec));
+ }
+
         stringClean(&hdr);
+ stringClean(&value_spec);
+ stringClean(&str_item);
+ safe_free(name);
     }
+
     safe_free(request->vary_hdr);
     safe_free(request->vary_headers);
     if (strBuf(vary) && strBuf(vstr)) {
@@ -514,11 +577,7 @@
         /* non-chunked. Handle as one single big chunk (-1 if terminated by EOF) */
         httpState->chunk_size = httpReplyBodySize(httpState->orig_request->method, reply);
     }
- if (httpHeaderHas(&reply->header, HDR_VARY)
-#if X_ACCELERATOR_VARY
- || httpHeaderHas(&reply->header, HDR_X_ACCELERATOR_VARY)
-#endif
- ) {
+ if (httpHeaderHasVary(&reply->header)) {
         const char *vary = NULL;
         if (Config.onoff.cache_vary)
             vary = httpMakeVaryMark(httpState->orig_request, reply);
diff -Xdiffx -ru squid-2.6.18.orig/src/HttpHeader.c squid-2.6.18/src/HttpHeader.c
--- squid-2.6.18.orig/src/HttpHeader.c 2007-12-21 20:56:53.000000000 +1100
+++ squid-2.6.18/src/HttpHeader.c 2008-02-08 14:49:24.000000000 +1100
@@ -133,6 +133,9 @@
 #if X_ACCELERATOR_VARY
     {"X-Accelerator-Vary", HDR_X_ACCELERATOR_VARY, ftStr},
 #endif
+#if VARY_OPTIONS
+ {"X-Vary-Options", HDR_X_VARY_OPTIONS, ftStr},
+#endif
     {"X-Error-URL", HDR_X_ERROR_URL, ftStr},
     {"X-Error-Status", HDR_X_ERROR_STATUS, ftInt},
     {"Front-End-Https", HDR_FRONT_END_HTTPS, ftStr},
@@ -210,6 +213,9 @@
 #if X_ACCELERATOR_VARY
     HDR_X_ACCELERATOR_VARY,
 #endif
+#if VARY_OPTIONS
+ HDR_X_VARY_OPTIONS,
+#endif
     HDR_X_SQUID_ERROR
 };
 
@@ -1185,6 +1191,54 @@
     return tot;
 }
 
+/* Get the combined Vary headers as a String
+ * Returns StringNull if there are no vary headers
+ */
+String httpHeaderGetVary(const HttpHeader * hdr)
+{
+ String hdrString = StringNull;
+#if VARY_OPTIONS
+ HttpHeaderEntry *e;
+ if ((e = httpHeaderFindEntry(hdr, HDR_X_VARY_OPTIONS))) {
+ stringInit(&hdrString, strBuf(e->value));
+ return hdrString;
+ }
+#endif
+
+ hdrString = httpHeaderGetList(hdr, HDR_VARY);
+#if X_ACCELERATOR_VARY
+ {
+ String xavString = StringNull;
+ xavString = httpHeaderGetList(hdr, HDR_X_ACCELERATOR_VARY);
+ if (strBuf(xavString))
+ strListAdd(&hdrString, strBuf(xavString), ',');
+ stringClean(&xavString);
+ }
+#endif
+ return hdrString;
+}
+
+/*
+ * Returns TRUE if at least one of the vary headers are present
+ */
+int httpHeaderHasVary(const HttpHeader * hdr)
+{
+#if VARY_OPTIONS
+ if (httpHeaderHas(hdr, HDR_X_VARY_OPTIONS)) {
+ return TRUE;
+ }
+#endif
+#if X_ACCELERATOR_VARY
+ if (httpHeaderHas(hdr, HDR_X_ACCELERATOR_VARY)) {
+ return TRUE;
+ }
+#endif
+ if (httpHeaderHas(hdr, HDR_VARY)) {
+ return TRUE;
+ }
+ return FALSE;
+}
+
 /*
  * HttpHeaderEntry
  */
@@ -1438,3 +1492,5 @@
     assert(id >= 0 && id < HDR_ENUM_END);
     return strBuf(Headers[id].name);
 }
+
+
diff -Xdiffx -ru squid-2.6.18.orig/src/HttpReply.c squid-2.6.18/src/HttpReply.c
--- squid-2.6.18.orig/src/HttpReply.c 2006-06-11 10:28:19.000000000 +1000
+++ squid-2.6.18/src/HttpReply.c 2008-02-08 14:42:04.000000000 +1100
@@ -325,8 +325,7 @@
                 return squid_curtime;
         }
     }
- if (Config.onoff.vary_ignore_expire &&
- httpHeaderHas(&rep->header, HDR_VARY)) {
+ if (Config.onoff.vary_ignore_expire && httpHeaderHasVary(&rep->header)) {
         const time_t d = httpHeaderGetTime(&rep->header, HDR_DATE);
         const time_t e = httpHeaderGetTime(&rep->header, HDR_EXPIRES);
         if (d == e)
diff -Xdiffx -ru squid-2.6.18.orig/src/protos.h squid-2.6.18/src/protos.h
--- squid-2.6.18.orig/src/protos.h 2008-02-07 19:28:38.000000000 +1100
+++ squid-2.6.18/src/protos.h 2008-02-08 14:46:21.000000000 +1100
@@ -444,6 +444,8 @@
 extern squid_off_t httpHeaderGetSize(const HttpHeader * hdr, http_hdr_type id);
 extern time_t httpHeaderGetTime(const HttpHeader * hdr, http_hdr_type id);
 extern TimeOrTag httpHeaderGetTimeOrTag(const HttpHeader * hdr, http_hdr_type id);
+extern String httpHeaderGetVary(const HttpHeader * hdr);
+extern int httpHeaderHasVary(const HttpHeader * hdr);
 extern HttpHdrCc *httpHeaderGetCc(const HttpHeader * hdr);
 extern HttpHdrRange *httpHeaderGetRange(const HttpHeader * hdr);
 extern HttpHdrContRange *httpHeaderGetContRange(const HttpHeader * hdr);
diff -Xdiffx -ru squid-2.6.18.orig/src/store.c squid-2.6.18/src/store.c
--- squid-2.6.18.orig/src/store.c 2008-02-07 19:28:38.000000000 +1100
+++ squid-2.6.18/src/store.c 2008-02-08 14:55:06.000000000 +1100
@@ -721,7 +721,12 @@
     state->e = storeCreateEntry(url, log_url, flags, method);
     httpBuildVersion(&version, 1, 0);
     httpReplySetHeaders(state->e->mem_obj->reply, version, HTTP_OK, "Internal marker object", "x-squid-internal/vary", -1, -1, squid_curtime + 100000);
+#if VARY_OPTIONS
+ /* Can't put a string into a list header */
+ httpHeaderPutStr(&state->e->mem_obj->reply->header, HDR_X_VARY_OPTIONS, vary);
+#else
     httpHeaderPutStr(&state->e->mem_obj->reply->header, HDR_VARY, vary);
+#endif
     storeSetPublicKey(state->e);
     if (!state->oe) {
         /* New entry, create new unique ID */
@@ -1039,20 +1044,8 @@
         }
         newkey = storeKeyPublicByRequest(mem->request);
         if (mem->vary_headers && !EBIT_TEST(e->flags, KEY_EARLY_PUBLIC)) {
- String vary = StringNull;
             vary_id_t vary_id;
- String varyhdr;
- varyhdr = httpHeaderGetList(&mem->reply->header, HDR_VARY);
- if (strBuf(varyhdr))
- strListAdd(&vary, strBuf(varyhdr), ',');
- stringClean(&varyhdr);
-#if X_ACCELERATOR_VARY
- /* This needs to match the order in http.c:httpMakeVaryMark */
- varyhdr = httpHeaderGetList(&mem->reply->header, HDR_X_ACCELERATOR_VARY);
- if (strBuf(varyhdr))
- strListAdd(&vary, strBuf(varyhdr), ',');
- stringClean(&varyhdr);
-#endif
+ String vary = httpHeaderGetVary(&mem->reply->header);
             /* Create or update the vary object */
             vary_id = storeAddVary(mem->url, mem->log_url, mem->method, newkey, httpHeaderGetStr(&mem->reply->header, HDR_ETAG), strBuf(vary), mem->vary_headers, mem->vary_encoding);
             if (vary_id.create_time) {
Received on Thu Feb 07 2008 - 22:46:45 MST

This archive was generated by hypermail pre-2.1.9 : Sat Mar 01 2008 - 12:00:09 MST