Re: [PATCH] Support PROXY protocol from Amos Jeffries on 2014-08-12 (squid-dev)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 13 Aug 2014 04:17:32 +1200

On 11/08/2014 4:32 p.m., Alex Rousskov wrote:
> On 08/05/2014 08:31 PM, Amos Jeffries wrote:
>
>> I am adding proxy_protocol_access as the first access control, reverting
>> follow_x_forwarded_for for the second.
>
> Great. I think this is a much simpler/cleaner design.
>
>
>> + } else if (strcmp(token, "require-proxy-header") == 0) {
>> + s->flags.proxySurrogate = true;
>> + debugs(3, DBG_IMPORTANT, "Disabling TPROXY Spoofing on port " << s->s << " (require-proxy-header enabled)");
>> +
>
> The IMPORTANT message should probably be printed only if TPROXY spoofing
> on that port was actually configured/requested.
>
> And, at that point, I would actually make the apparently illegal
> combination a fatal misconfiguration error, but I suspect you do not
> like that approach, and I will not insist on it.
>

Done.

>> But retaining the new description texts.
>
>> -DEFAULT_DOC: X-Forwarded-For header will be ignored.
>> +DEFAULT_DOC: indirect client IP will not be accepted.
>
> The old documentation line was much more clear IMO. If it was incorrect,
> can you rephrase the correction please? Perhaps you meant to say
> something like "Any client IP specified in X-Forwarded-For header will
> be ignored"? If yes, the current version still sounds better to me.
> Adding Forwarded header to those description is a good idea if correct,
> of course.
>

Done.

>> +DEFAULT_DOC: all TCP connections will be denied
>
> I would clarify that with either
>
> * "all TCP connections to ports with require-proxy-header will be denied" or
>
> * "proxy_protocol_access deny all"
>

Done

>> + Any host for which we accept client IP details can place
>> + incorrect information in the relevant header, and Squid
>
> I would s/for/from/ but perhaps I misunderstand how this works.
>
>
>> + tok.skip(Proxy1p0magic);
>
> We already know the magic is there. If you want to optimize this, then
> skip() in ConnStateData::parseProxyProtocolHeader() and pass the
> Tokenizer object to the version-specific parsers. I do not insist on
> this change, but you may want to add a corresponding TODO if you do not
> want to optimize this.
>

This is done with the parser method Tokenizer so that if we happen not
to receive the whole header in one read for any reason we can easily
undo and retry on the following read(2).
Only if the Tokenizer parses completely and fully is the I/O buffer
updated.

We could generate a temporary SBuf and setup the Tokenizer with that.
But it's simpler just to setup the Tokenizer with the existing buffer
and unconditionally skip the magic we already know exists.

>
>> + if (!tok.prefix(tcpVersion, CharacterSet::ALPHA+CharacterSet::DIGIT))
>
> That dynamic creation, addition, and destruction of a 256-element vector
> is a significant performance penalty. Please create a static version of
> that alphanumeric(?) CharacterSet instead of computing it at least once
> for every PROXY connection.
>
>
>> + const CharacterSet ipChars = CharacterSet("IP Address",".:") + CharacterSet::HEXDIG;
>
> Same here, made worse by adding creation of an intermediate set.
>
> Please check for other occurrences.

Fixed, and checked.

>> + if (!tok.prefix(tcpVersion, CharacterSet::ALPHA+CharacterSet::DIGIT))
>> + return proxyProtocolError(tok.atEnd()?"PROXY/1.0 error: invalid protocol family":NULL);
>
> I recommend removing that code and using
>
> if (tok.skip("UNKNOWN")) { ... your UNKNOWN code here ...}
> else if (tok.skip("TCP")) { ... your TCP code here ...}
>
> instead.
>
>
>> + if (!tcpVersion.cmp("UNKNOWN")) {
>> + } else if (!tcpVersion.cmp("TCP",3)) {
>
> Please test for the more likely/common condition first.
>

Agreed. Done.

>> + // skip to first LF (assumes it is part of CRLF)
>> + const SBuf::size_type pos = in.buf.findFirstOf(CharacterSet::LF);
>> + if (pos != SBuf::npos) {
>> + if (in.buf[pos-1] != '\r')
>
> It would be better to use Tokenizer for this IMO. If you insist on
> manual parsing, consider at least skipping the magic header in the
> findFirstOf() call.
>

Waiting on the FTP change that makes it easy to settle in trunk.

>
>> + return proxyProtocolError(tok.atEnd()?"PROXY/1.0 error: missing SP":NULL);
>
> Please surround the ? and : parts of the operator with spaces for
> readability, especially since the string itself contains ':'.
>

Done.

>
>> + if (!tok.prefix(ipa, ipChars) || !tok.skip(' ') ||
>> + !tok.prefix(ipb, ipChars) || !tok.skip(' ') ||
>> + !tok.int64(porta) || !tok.skip(' ') ||
>> + !tok.int64(portb) || !tok.skip('\r') || !tok.skip('\n'))
>
> This code would be a lot clearer if you rewrite it in a positive way:
>
> const bool parsedAll = tok.prefix() && tok.skip() && ...
> if (!parsedAll)
> ...
>
> I do not insist on this change.
>

Makes some sense. Done.

>
>> + // parse src-IP SP dst-IP SP src-port SP dst-port CRLF
>> + if (!tok.prefix(ipa, ipChars) || !tok.skip(' ') ||
>> + !tok.prefix(ipb, ipChars) || !tok.skip(' ') ||
>> + !tok.int64(porta) || !tok.skip(' ') ||
>> + !tok.int64(portb) || !tok.skip('\r') || !tok.skip('\n'))
>> + return proxyProtocolError(!tok.atEnd()?"PROXY/1.0 error: invalid syntax":NULL);
>
> AFAICT, atEnd() may be false just because we have not received the whole
> PROXY header yet, not because we received something invalid. For
> example, int64() returns false not just on failures but on "insufficient
> data" (as it should!).
>
> [N.B. prefix() should behave similarly on insufficient data, but it is
> currently broken; I can fix after the FTP Relay project is done.]

I am not worried about those failure cases particularly, allowing for
sender emitting one field per packet is a tolerance optimization to
begin with. The entire header is designed to be sent in one packet and
the spec explicitly requires it to be sent in a single write(2).

>
>> + if ((in.buf[0] & 0xF0) != 0x20) // version == 2 is mandatory
>> + return proxyProtocolError("PROXY/2.0 error: invalid version");
>> +
>> + const char command = (in.buf[0] & 0x0F);
>> + if ((command & 0xFE) != 0x00) // values other than 0x0-0x1 are invalid
>> + return proxyProtocolError("PROXY/2.0 error: invalid command");
>
> These and similar v2 checks do not make sense to me because we know for
> sure that in.buf starts with Proxy2p0magic. Why test parts of a
> hard-coded constant? I am guessing you meant to skip magic before these
> tests...
>
> Which forces me to ask whether the PROXY v2 path was tested recently?

Willy reported it working before the audit, but is waiting for a release
now for further testing. I've added the magic offset which was missing
from these indexes (and related size pre-check).

>
>> + const char *clen = in.buf.rawContent() + Proxy2p0magic.length() + 2;
>
> How do we know that in.buf has at least Proxy2p0magic.length() + 2 bytes?
>

Fixed.

>
>> + const char *clen = in.buf.rawContent() + Proxy2p0magic.length() + 2;
>> + const uint16_t len = ntohs(*(reinterpret_cast<const uint16_t *>(clen)));
>
> Could this cast violate integer alignment rules on some platforms? If
> yes, we may have to memcpy those two bytes first. I do not think we can
> guarantee any specific alignment for in.buf.rawContent() and, hence, we
> cannot guarantee any specific alignment for clen.
>

Unsure. The uint16_t should be compiled as 8-bit aligned AFAIK. In
theory we only encounter that type of problem with the more complex
types (like pax below).

I dont like casting anyway, so if you have any better way to do it
without a double data copy I am interested. Note that ntohs() is
effectively one data copy.

>
>> + const pax *ipu = reinterpret_cast<const pax*>(extra.rawContent());
>
> Similar alignment concerns here, exacerbated by the complexity of the
> "pax" structure. I think we have to memcpy that.
>
>> + typedef union proxy_addr {
>> + } pax;
>
> This looks like we are trying to declare the same type twice. Can we
> drop "typedef" and replace unused "proxy_addr" with "pax"?

It seems to be accepted without copious "union pax". But this is where I
really start to worry about alignment on the cast. IIRC, the typedef
usually aggregates the compiler knowledge about type size etc into a new
type record for compile optimizations. Leaving it out the compiler has
to manage two varying size details about union contents.

Done for now. Will see how it turns out in usage.

>
>> + bool proxyProtocolError(const char *reason = NULL);
>
> I recommend removing the NULL default because it is unused, and the
> parameter is actually very important in the current code (it is not
> longer used for debugging).

Done.

Updated patch to follow.

Amos
Received on Tue Aug 12 2014 - 16:17:52 MDT

This archive was generated by hypermail 2.2.0 : Wed Aug 13 2014 - 12:00:12 MDT