Re: [PATCH] Support PROXY protocol

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 14 Jul 2014 10:25:37 -0600

On 07/12/2014 10:45 PM, Amos Jeffries wrote:

> +bool
> +ConnStateData::findProxyProtocolMagic()
> +{
> + // http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt
> +
> + // detect and parse PROXY protocol version 1 header
> + if (in.buf.length() > Proxy10magic.length() && in.buf.startsWith(Proxy10magic)) {
> + return parseProxy10();
> +
> + // detect and parse PROXY protocol version 2 header
> + } else if (in.buf.length() > Proxy20magic.length() && in.buf.startsWith(Proxy20magic)) {
> + return parseProxy20();
> +
> + // detect and terminate other protocols
> + } else if (in.buf.length() >= Proxy20magic.length()) {
> + // input other than the PROXY header is a protocol error
> + return proxyProtocolError("PROXY protocol error: invalid header");
> + }
> +

I know you disagree, but the above looks much clearer than the earlier
code to me. Thank you.

The "// detect and ..." comments are pretty much not needed now because
the code is self-documenting! The "else"s are also not needed. Your call
on whether to remove that redundancy.

Consider adding

  // TODO: detect short non-magic prefixes earlier to avoid
  // waiting for more data which may never come

but this is not a big deal because most non-malicious clients will not
send valid non-PROXY requests that is 12 bytes or less, I guess.

> + // XXX: should do this in start(), but SSL/TLS operations begin before start() is called

I agree that this should be done in start(). Fortunately, the code this
XXX refers to does not rely on job protection (AFAICT) so it is not a
big deal. The XXX is really about the SSL code that makes the move
difficult.

>>> -NAME: follow_x_forwarded_for
>>> +NAME: proxy_forwarded_access follow_x_forwarded_for

>> The new name sounds worse than the old one. Hopefully this can be left
>> as is or renamed to something better after the "proxy-surrogate" issue
>> is resolved.

> Is "forwarded_access" obectionable ?

IMO, it is misleading/awkward. Follow_x_forwarded_for was pretty good.
We can use something like follow_forwarded_client_info or
trust_relayed_client_info, but let's wait for the primary naming issue
to be resolved first. That resolution might help us here.

>>> +bool
>>> +ConnStateData::parseProxyProtocolMagic()
>>
>> This appears to parse a lot more than just the magic characters. Please
>> rename to parseProxyProtocolHeader() or similar.

> The entire PROXY protocol is "magic" connection header.

Not really. In this context, "magic" is, roughly, a "rare" string
constant typically used as a prefix to detect/identify the following
structure or message. The PROXY protocol header contains both "magical"
and "regular" parts.

> +/**
> + * Test the connection read buffer for PROXY protocol header.
> + * Version 1 and 2 header currently supported.
> + */
> +bool
> +ConnStateData::findProxyProtocolMagic()

This method does not just "test". It actually parses. IMO,
findProxyProtocolMagic() should be called parseProxyProtocolHeader().

No, it does not matter that the actual non-magic parsing code is in
other parsing methods. The method description and name should reflect
what the method does from the caller point of view. The method internals
are not that important when you are naming and describing the interface.

>>> + needProxyProtocolHeader_ = xact->squidPort->flags.proxySurrogate;
>>> + if (needProxyProtocolHeader_)
>>> + proxyProtocolValidateClient(); // will close the connection on failure
>>
>> Please do not place things that require job protection in a class
>> constructor. Calling things like stopSending() and stopReceiving() does
>> not (or will not) work well when we are just constructing a
>> ConnStateData object. Nobody may notice (at the right time) that you
>> "stopped" something because nothing has been started yet.
>
> Did not require job protection until converting
> stopSending()/stopReceiving() into mustStop() calls.

It only looked that way, but sooner or later that code would break
because it was essentially stopping the job inside the job's constructor.

> Also, note that ConnStateData is not a true AsyncJob. It never started
> with AsynJob::Start() and cleanup is equally nasty as this setup
> constructor (prior to any changes I am adding).

Yes, I know that ConnStateData is an AsyncJob that already violates a
lot of job API principles. That is not your fault, of course. However,
adding more violations into the job constructor makes things worse
(somebody would have to rewrite that later).

> I have done as suggested and created the AsyncJob::Start() functionality
> for it.

Thank you. Hopefully, you would be able to keep that change despite the
extra work it requires.

> However, this means that the PROXY protocol is no longer capable of
> being used on https_port. The PROXY protocol header comes before TLS
> negotiation on the wire, but ConnStateData::start() is only called by
> our code after SSL/TLS negotiation and SSL-bump operations complete.

I agree that there is a problem with https_port support, but the "moving
PROXY ACL check into start() brakes https_port support for PROXY"
argument does not work for me: I agree that the SSL code in trunk does a
lot of things between creating the ConnStateData job and calling
readSomeData(). However, that was true for the original (mk1) patch as
well! Moreover, the original (mk1) patch could close the connection in
the ConnStateData constructor and then proceed with negotiating SSL for
that connection. Thus, I do not think fixing ConnStateData constructor
is the source of the htts_port support problems. That support was
already broken in mk1.

I see two correct ways to address the https_port problem:

1. Refuse PROXY support on https_port for now. As any support
limitation, this is unfortunate, but I think it should be accepted. The
configuration code should be adjusted to reject Squid configurations
that use proxy-surrogate with an https_port.

2. Support PROXY on https_port. As you have mentioned, this probably
requires moving SSL negotiations inside the ConnStateData job so that
the job can be Start()ed when it is created instead of passing the job
as a POD-like structure during SSL negotiations.

> +<p><em>Known Issue:</em> Due to design issues HTTPS traffic is not yet accepted
> + over this protocol. So use of <em>proxy-surrogate</em> on <em>https_port</em>
> + is not supported.

If you continue going with #1, please do not blame mysterious "design
issues" (Squid design? PROXY protocol design? OpenSSL design? Internet
design?) but simply say that PROXY for https_port is not yet supported.
There is no design issue here AFAICT. Https_port support just needs more
development work. This is just an implementation limitation.

BTW, when receiving on an https_port, will the PROXY header be
encrypted? If yes, why not postpone the ACL check until the connection
is decrypted? And where does the PROXY draft explain/define whether the
PROXY header should be encrypted?

> +<p>Squid currently supports receiving HTTP via version 1 or 2 of the protocol.
...

"receiving HTTP via [PROXY] protocol" sounds awkward to me. The PROXY
protocol does not envelop or embed the HTTP protocol that follows the
PROXY header IMO. It just starts a connection with a small header and
does not have a notion of "PROXY message body".

If "version 1 or 2" is the PROXY protocol version (rather than the HTTP
version), then the above is inconsistent with the version 1.5 documented
later:

> +proxy-protocol.txt
> + Documents Proxy Protocol 1.5, for communicating original client IP
> + details between consenting proxies and servers even when
> + transparent interception is taking place.

I think the above ought to say "Documents PROXY Protocol versions 1 and
2", just like the draft title.

> +<p>PROXY protocol provides a simple way for proxies and tunnels of any kind to
> + relay the original client source details ...

> +proxy-protocol.txt
> + Documents Proxy Protocol 1.5, for communicating original client IP
> + details ...

AFAICT, the PROXY protocol supports a lot more than client IP details.
It also communicates client source port, transport protocol, and the
destination address details.

> +<sect1>Support PROXY protocol

We only support receiving PROXY protocol; Squid does not support being a
PROXY protocol client, right? Can we be less ambitious in the above
claim then? Something like "PROXY protocol support (receiving)" or
"Support for receiving PROXY protocol header" would work better IMO.

> + HTTP message Forwarded header, or
> + HTTP message X-Forwarded-For header, or
> + PROXY protocol connection header.

> + Allowing or Denying the X-Forwarded-For or Forwarded headers to
> + be followed to find the original source of a request. Or permitting
> + a client proxy to connect using PROXY protocol.

What happens when the incoming traffic has both an allowed PROXY header
and an allowed HTTP X-Forwarded-For header? Which takes priority? What
if X-Forwarded-For header changes client information in the middle of a
connection? Please document these cases. I could not find that info in
the PROXY protocol draft, but perhaps I missed it.

Please adjust the documentation to make it clear whether the features
are exclusive (either/or) or can co-exist (and/or).

> + proxy-surrogate
> + Support for PROXY protocol version 1 or 2 connections.
> + The proxy_forwarded_access is required to whitelist
> + downstream proxies which can be trusted.
> +

What happens to the destination information in the PROXY header? Is it
used in any way? Should we mention how it is used (or the fact that it
is not used)?

Consider s/downstream/client/.

> + /** marks ports receiving PROXY protocol traffic
> + *
> + * Indicating the following are required:
> + * - PROXY protocol magic header
> + * - src/dst IP retrieved from magic PROXY header
> + * - reverse-proxy traffic prohibited
> + * - intercepted traffic prohibited
> + */
> + bool proxySurrogate;

Why is reverse-proxy traffic prohibited? The PROXY protocol draft
mentions reverse proxies being used, but it is not clear to me whether
they are only supported as PROXY protocol clients. I do not know all the
PROXY details, but it seems to me that reverse proxies should be allowed
as servers (AFAICT, the explicit configuration requirement is the key
here, not the reverse/forward role of the proxy!).

>>> + debugs(33, 5, "PROXY protocol on connection " << clientConnection);
>>> + clientConnection->local = originalDest;
>>> + clientConnection->remote = originalClient;
>>> + debugs(33, 5, "PROXY upgrade: " << clientConnection);
>>
>> We use this kind of address resetting code in many places, right? Please
>> encapsulate it (together with the debugging) into a
>> Connection::resetAddrs() or a similar method.
>
> Two. PROXY/1.0 and PROXY/2.0 parsers.

I found more, including:

> ./log/TcpLogger.cc: futureConn->remote = remote;
> ./log/TcpLogger.cc- futureConn->local.setAnyAddr();

> ./ftp.cc- conn->local = ftpState->ctrl.conn->local;
> ./ftp.cc- conn->local.port(0);
> ./ftp.cc: conn->remote = ftpState->ctrl.conn->remote;
> ./ftp.cc- conn->remote.port(port);

> ./ftp.cc- conn->local = ftpState->ctrl.conn->local;
> ./ftp.cc- conn->local.port(0);
> ./ftp.cc: conn->remote = ipaddr;
> ./ftp.cc- conn->remote.port(port);

> ./dns_internal.cc- conn->local = Config.Addrs.udp_incoming;
> ./dns_internal.cc-
> ./dns_internal.cc: conn->remote = nameservers[nsv].S;

>> * When, in a misconfigured setup, somebody sends a PROXY header to a
>> regular Squid HTTP port, does the Squid error look obvious/clear enough?
>> Or will the admin have a hard time understanding why things do not work
>> in that case?
>>
>
> Trunk will die with a 400 error quoting the PROXY header as the
> "URL" or buffer content.

By "die", you mean respond with an HTTP 400 error, right?

> It seems clear enough to
> me not to need new code for that type of config error case.

I disagree because admins do not normally see 400 errors (until users
start complaining). Most see log messages though. Please note that in a
multi-port setup, it may not be obvious that something is misconfigured
because there may be few or no 400 errors when Squid starts serving
requests.

There are two kinds of likely misconfigurations:

1) Client sends PROXY. Squid is not configured to receive one. We could
handle this better, but I do not insist on adding code to detect and
warn about such cases because the required code would be relatively
complex, and because it would make it even easier to spam cache.log with
error messages when there is no misconfiguration at all (unless we add
even more code).

2) Client does not send PROXY. Squid is configured to require one. I
think we should announce such cases in cache.log. There is already code
to detect this problem. AFAICT, we just need to a logging line.

Thank you,

Alex.
Received on Mon Jul 14 2014 - 16:25:57 MDT

This archive was generated by hypermail 2.2.0 : Sat Jul 26 2014 - 12:00:12 MDT