Re: [RFC] bandwidth savings via header eliding

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Sat, 19 Jul 2014 11:35:22 -0600

On 07/18/2014 11:33 PM, Amos Jeffries wrote:
> On 19/07/2014 2:07 a.m., Eliezer Croitoru wrote:
>> This caught my eye, but I am not reading all IETF httpbis mails and I
>> would like to get a reference for this thread, please.

> There are two types of removable headers:
> a) headers which exist purely to bypass security
> b) headers which exist due to intermediaries breaking them
>
> The post describing why the (b) group occur is here:
> http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/0132.html

The above email is talking about a "nnCoection: close" header which
appears to be a result of a bug in some 15-year-old software.
Identifying that rare header would be harmful overall -- Squid would
spend more resources on detecting that header's presence than it would
save by removing that header when it is found.

> One of the posts which is making me think we could benefit from doing
> something is:
> http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1220.html

> This lists the existing headers found in the data sets being analysed by
> IETF as representative of HTTP web traffic.

> What I can see in that listing is the following headers (by type above).
>
> (A) group:
>
> x-powered-by / x-aspnet-version / x-aspnetmvc-version / x-pb-mii -
> exist to bypass server security measures applied to the Server: header.

Sounds like those headers exist to implement some above-HTTP
functionality deemed useful to those who send and/or receive them. What
will Squid break by removing those valid HTTP headers? Why is breaking
that functionality a universally good thing that justifies making it the
default behavior in an HTTP proxy?

> x-served-by - same as X-powered-by but also crossing over to contain
> X-forwarded-for: and Forwarded: header contents (but without the
> security protections applied for them).
>
> x-host / x-forwarded-host - exist to bypass Browser same-origin
> security measures.
>
> x-li-uuid - tracking cookie created to bypass Cookie header security
> and legislative restrictions.
>
> x-fs-uuid - header for distributing the UUID of the server hard drive
> out to the public network (seriously, what could go wrong with that huh?)
>
> x-radid - seems to be another disk drive tracking ID method.

Same questions apply here. Please correct me if I am wrong, but it
sounds like you are dividing HTTP-compliant agents into two categories:
those that use HTTP the way you want HTTP to be used and all others. The
division appears to be based not on some HTTP MUSTs, but on your view of
which "security" model must be defended.

IMO, Squid should strive to support all HTTP-compliant agents by
default. We should not be the internet police because policing traffic
requires making judgments about who is the "bad" guy, which is outside
of software developers' competence. Folks that want to enforce a
particular security model may propose optional features and
configuration excerpts that do so, of course.

There are some gray areas like defense against request smuggling, but
even there extreme care should be taken to avoid harming valid HTTP
traffic. It is certainly not the area of "delete all bad headers in the
parser" solutions.

> (b) group worries me for the reasons given below:
>
> nncoection / cneonction / x-cnection - reason described in the above
> email. I am a little bit worried that in HTTP/1.1 these may have
> actually contained lists of headers which were to be dropped by the
> earlier intermediary. By obscuring the "Connection:" name we are
> potentially transmitting headers like Upgrade: or ones with private
> details that should be elided.

I do not see why honoring _and_ then dropping what we think is a former
Connection header helps more than it hurts (by default). In fact, that
sounds like a useful smuggling attack vector to me -- "we know Squid
will drop these headers but others will pass them on, so let's use that
for our evil needs".

> ntcoent-length / cteonnt-length - Given the reason behind 16-bit rotate
> on header name any of the mandatory HTTP/1.1->1.0 and connection:close
> addition required to make this safe will alter the checksum. So will
> content adaptation if that was the point.

I do not understand how header changes affect content checksums. Those
checksums do not include headers.

> I am left with assuming that this is done to smuggle messages in a
> pipeline through the receiving server as a single request/reply.

Your assumption seems to contradict what we know for a fact is going on
in many (probably most!) cases of such header name adaptations --
converting standard header names into extension header names to avoid
buffer copies.

> There are also a bunch of other headers which can best be called
> "garbage". Relatively harmless though.
>
> Old HTTP features and mechanisms which are now not supposed to be sent:
>
> pragma:close - dead HTTP/1.0 feature. Not to be emitted by HTTP/1.1
> software.
> p3p - dead standard, removed from service due to privacy violations.
> x-pad - supposedly an HTTPS-only feature for "fixing"

IETF does not have the power to make something "dead" (thankfully!). If
some old software uses an old feature, we should default to supporting
it (all other factors being equal). Again, it is perfectly fine to offer
an "only good modern agents are proxied" feature/configuration in Squid,
but we are discussing

> proxy-connection - dead non-standard. we already drop this one

Dropping hop-by-hop headers (from old or new standards, does not matter)
is a requirement we should follow, of course. Is it a good idea to drop
Proxy-Connection in the parser, without an opportunity to honor it (in
some cases)? I am pretty sure that will break some installations.
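
To illustrate the distinction, here is a minimal Python sketch (not Squid
code; the function name, the dict-based header representation, and the
fixed header set are all made up for this example) of handling
hop-by-hop headers after parsing. The proxy first gets a chance to look
at them and only then strips them before forwarding. It assumes header
names are already lowercased and ignores repeated header fields.

```python
# Hypothetical sketch, not Squid code. Headers are modeled as a dict of
# lowercase name -> value; repeated header fields are ignored for brevity.

# Illustrative set chosen for this sketch; a real proxy's list may differ.
# Proxy-Connection is non-standard but is included because it is the
# header under discussion.
ALWAYS_HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-connection", "te",
    "transfer-encoding", "upgrade",
}


def split_hop_by_hop(headers):
    """Return (hop_by_hop, end_to_end) without discarding anything early."""
    # Header names listed in the Connection value are hop-by-hop as well.
    listed = {
        token.strip().lower()
        for token in headers.get("connection", "").split(",")
        if token.strip()
    }
    drop = ALWAYS_HOP_BY_HOP | listed
    hop_by_hop = {k: v for k, v in headers.items() if k in drop}
    end_to_end = {k: v for k, v in headers.items() if k not in drop}
    return hop_by_hop, end_to_end
```

The caller can inspect the first dict (to honor Proxy-Connection where
an installation needs that, for example) and forward only the second.
Dropping the header inside the parser removes that choice.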

> debug headers that are mostly useless (we could help clean this up by
> only enabling our x-cache headers based on a debug config option)
>
> x-cache / x-cache-lookup / x-cache-action / x-cache-hits / x-cache-age
> / x-fb-debug / x-mii-cache-hit / bk-server

I agree that these should be _emitted_ by Squid only if Squid is
configured to do so. We can discuss the right configuration option and
its default setting. However, I disagree that we should drop them by
default in the parser. Doing so will break installations that rely on
what you consider "mostly useless" headers.

Finally, your RFC is about bandwidth savings. I bet that deleting all of
the "security-related" headers and the vast majority of other headers
you listed will not give you noticeable bandwidth savings. You may
adjust your RFC to focus on security or other aspects, of course, but if
bandwidth savings remain the goal, then the examples of rare
"security-bypass" and "dead" headers are not convincing at all!
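
For anybody who wants to put numbers on this, a rough Python sketch of
the estimate follows. The function name and all inputs are hypothetical;
real values would have to come from the data sets referenced above. It
counts each header as its name, its value, and the ": " and CRLF
framing, weighted by the fraction of messages that carry it.

```python
# Hypothetical back-of-the-envelope estimate; no figures here come from
# real traffic.

def estimated_savings_fraction(headers, avg_message_bytes):
    """headers: iterable of (name, typical_value_len, fraction_of_messages).

    Returns the expected fraction of bytes saved per message if every
    listed header were removed.
    """
    saved = 0.0
    for name, value_len, fraction in headers:
        # name + ": " + value + CRLF
        line_bytes = len(name) + 2 + value_len + 2
        saved += line_bytes * fraction
    return saved / avg_message_bytes
```

If the result stays a small fraction of a percent for realistic inputs,
the parser cost of matching those names is hard to justify on bandwidth
grounds alone.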

Cheers,

Alex.

>> On 07/18/2014 10:32 AM, Amos Jeffries wrote:
>>> Some of the statistics being brought up in the IETF HTTP/2 discussions
>>> are highlighting certain garbage headers which are unfortunately quite
>>> common.
>>>
>>> I have wondered about creating a registry of known garbage and simply
>>> dropping those headers on arrival in the parser. This would be in
>>> addition to the header registry lookup and masking process we have for
>>> hop-by-hop headers.
>>>
>>> Any other thoughts on this?
>>>
>>> Amos
Received on Sat Jul 19 2014 - 17:35:49 MDT
