Re: [PATCH] Native FTP Relay

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Mon, 11 Aug 2014 00:11:57 +1200

On 10/08/2014 2:44 p.m., Alex Rousskov wrote:
> On 08/08/2014 11:13 AM, Amos Jeffries wrote:
>> On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
>>> On 08/08/2014 09:48 AM, Amos Jeffries wrote:
>>>> Audit results (part 1):
>>>> * Please apply CharacterSet updates separately.
>>>> * Please apply Tokenizer API updates separately.
>>>> * please apply src/HttpHdrCc.h changes separately.
>
>>> Why? If the branch is merged, they will be applied with their own/
>>> existing commit messages.
>
>> They are updates on previous feature changesets unrelated to FTP which
>> will block future parser alterations (also unrelated to FTP) being
>> backported if this project takes long to stabilize.
>>
>> If they have their own commit messages then cherrypicking out of the
>> branch should be relatively easy.
>
> Thank you for explaining your rationale. To minimize overheads, let's
> come back to this issue if the branch is not merged soon.
>
>
>>>> * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
>>>> AnyP::ProtocolVersion parameters)
>
>> Okay, leave as-is but please document that as the reason for using that
>> version number
>
> Done. It was actually documented in the corresponding commit message. To
> provide _source code_ documentation in one place, I added
> Ftp::ProtocolVersion() and used that everywhere instead of hard-coded
> "1,1". It will be easier to track down all "1,1" users that way if
> somebody wants to change or polish this further later. See ftp/Elements.*

Thank you.
NP: I found two related spots that need converting as well. Highlighted
below in this round of audit notes.

>
> The resulting code required more changes than one may expect and is
> still a little awkward because Http::ProtocolVersion should have been
> just a function (not a class/type), and because PortCfg::setTransport
> internals did not belong to AnyP where they were placed. I did my best
> to avoid cascading changes required to fix all that by leaving
> Http::ProtocolVersion as is and moving PortCfg::setTransport() internals
> into cache_cf.cc where they can access FTP-specific code.
>
>
>> and a TODO about cleaning up the message processing to
>> stop assuming HTTP versions.
>
> Done in one of the many client_side.cc places where we still assume
> HTTP. Here is the new TODO:
>
>> /* RFC 2616 section 10.5.6 : handle unsupported HTTP major versions cleanly. */
>> /* We currently only support 0.9, 1.0, 1.1 properly */
>> /* TODO: move HTTP-specific processing into servers/HttpServer and such */
>> if ( (http_ver.major == 0 && http_ver.minor != 9) ||
>> (http_ver.major > 1) ) {
>
>
> This part of the problem will be gone when HTTP-specific code will be
> moved to HttpServer, but the ICAP compatibility part will not go away
> regardless of any cleanup.

The ICAP compatibility issue is a bug in ICAP or specific ICAP servers
that needs fixing there. HTTP itself can already (or shortly will) have
present "HTTP/2.0", "h2" or "h2-NN" message versions to be encapsulated.

>
>> in src/Server.cc:
>> * this change looks semantically wrong:
>> - if (!doneWithServer()) {
>> + if (mayReadVirginReplyBody()) {
>> closeServer();
>
> The change itself does not make the code look semantically wrong. On the
> semantics level, the code looks about the same before and after the change:
>
> Was: If we are not done with the server, close it.
> Now: If we are still reading from the server, close it.

It is now *apparently* skipping all the possible non-reply-reading
reasons for closing the server connection. That is semantic change...

>
>> in src/Server.h:
>> * I'm not sure the call order of doneWithServer() and
>> mayReadVirginReplyBody() is the right way around.
>
> Sorry, I am not sure what you mean by "call order" in src/Server.h
> context. That file does not determine the order of those two calls.

... but mayReadVirginReplyBody() is calling doneWithServer() in the HTTP
case. You clarified why its done below and I am certain now that it is
indeed wrong, although necessary as a temporary workaround until the
non-reply-reading cases are checked and fixed.

>
>> It seems like "being
>> done completely with a server" includes "whether more reply is coming"
>> as one sub-condition of several (see above), not the reverse.
>
> Correct. As the commit message quoted above tries to explain, FTP needed
> to distinguish two cases that the old code merged together:
>
> * The transaction may still be receiving the virgin response (the new
> mayReadVirginReplyBody method). This is when FTP and HTTP code has to
> end communication with the server. No other solution here right now (as
> the source code comment explains).
>
> * The transaction stopped receiving the virgin response but may still be
> able to communicate with the origin server. In this case, the FTP code
> wants to keep the [pinned] _control_ connection to the origin server
> open and, hence, does not want to call closeServer().
>
> The code remains the same for HTTP because the new
> mayReadVirginReplyBody() method just calls the old doneWithServer()
> method in HTTP case.
>

Okay, I think I understand the cases now. So the old HTTP code was
broken by not considering the request and Trailers cases as okay even
when ICAP closed. You are just not fixing that bug.

Would you mind adding a TODO to the new HTTP mayReadVirginReplyBody()
that it should be only considering reply body reads, not overall server use?
 The cases of reply Trailers, chunked encoding Trailers, and ongoing
server writes should not be making it return true. The other non-HTTP
Server child classes may need it the way it is though.

>
>> in src/clientStreamForward.h:
>> * why is enums.h included? please document.
>
> Done. It is used for clientStream_status_t returned by CSS(). I do not
> see other enum.h #includers justifying their enums.h use. A new policy?

Relatively. For enums, typedefs, defines, and globals since we are
trying to remove their usage ASAP. Mentioning the type name assists
identifying obsolete includes when the enum gets shuffled.

>> * please remove all "\ingroup ServerProtocolFTPInternal" this grouping
>> is replaced by the namespace already.
>
> Done reluctantly as this change caused lots of comment-only changes in
> the old FTP code.
>

Thank you.

Audit below with changes in the c* patch:

in src/cache_cf.cc:
* the new parsePortProtocol() FTP case should return
Ftp::ProtocolVersion(). Which removes one more instance of FTP 1,1 numbers.

in src/cf.data.pre:
* mentions of "ftp-track-dirs=on|off" need updating

in src/ftp/Elements.cc:
* Ftp::HttpReplyWrapper() has unnecessary:
  httpVersion = Http::ProtocolVersion(Ftp::ProtocolVersion().major,
Ftp::ProtocolVersion().minor);

 - the current default Http::ProtocolVersion() constructor sets the
right values for HTTP messages generated by Squid. The old code using
Http::ProtocolVersion(1, 1) was incorrect/outdated.

 - same for Ftp::Server::parseOneRequest() in src/servers/FtpServer.cc

Audit below with changes in the r* patch:

in src/clients/Makefile.am:
* no need for forward.h in the SOURCES list to be separated from the
other files.
 - same in src/servers/Makefile.am

* please include $(top_srcdir)/src/TestHeaders.am to run the .h header
unit tests
 - same in src/ftp/Makefile.am and src/servers/Makefile.am

in src/servers/forward.h:
* missing pre-define for class ConnStateData
 - this would have been caught by the .h unit tests mentioned above.

in src/servers/FtpServer.cc:

+ Ftp::Server::parseOneRequest() for all the following until "++"

* please reference RFC 959 section 3.1.1.5.2 for the unusual definition
of WhiteSpace CharacterSet.

* please use existing CharacterSet::LF instead of re-defining as local
NewLine.

* please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
instead of BlackSpace. Then explicit delimiter check to validate the input.
 - RFC 959 is clear:
  "The command codes themselves are alphabetic characters terminated
   by the character <SP> (Space) if parameters follow and Telnet-EOL
   otherwise."
 - NP: if the input has LF terminated lines instead of CRLF the current
parser will consume multiple \n into the "Blackspace" cmd string then
demand more. This can happen on manual input, EBCDIC mode, or broken
(malicious?) hand-crafted clients.

* the right-trimming of params only trims "\r\t " instead of the defined
WhiteSpace special set.
 - the left-trim crops all custom Whitespace characters.
 - since you are altering the other string parser API significantly
anyway, may as well add SBuf::trim version using CharacterSet and use it
here with FTP custom Whitespace variable. That also gets rid of the TODO.

* regarding "// interception cases do not need USER to calculate the uri"
 - can you explain that a bit. Where does the URI detail come from
instead of USER ?

* please use () to make this line easier to read:
  const SBuf *path = params.length() && CommandHasPathParameter(cmd) ?
                      &params : NULL;

++

* please omit StoreIOBuffer "data" parameter when its unused in methods

* Ftp::Server::wroteReplyData() please avoid wrap on debugs()

* Ftp::Server::handleEpsvReply() comment "we combine remote server
address with" is wrong/stale
 - looks like a cut-n-paste error from what handlePasvReply() does
differently.

* debug section 11 is HTTP protocol.
 - FTP protocol info debugs are supposed to be section 9.

* Ftp::Server::createDataConnection() is not using the new
Comm::Connection::setAddrs() you had me add a few days back.

in src/servers/FtpServer.h:

* please make Ftp::MasterState RefCountable and use a Pointer for
Ftp::Server::master
 - this will make is easily linked to MasterXaction later (or sooner)
without altering lots of FTP code in a non-FTP patch.

* please fix Ftp::MasterState definitions ordering (constructor).
  - coding guidelines specify methods before variables.

* please rename private member variables with _ suffix as per coding
guidelines.
 - the 1-word members in particular are easily confused with local
variables in the .cc method code.

PS. you know how we have been discussing reasons for patches
encountering audit problems and wasting peoples time ...

1) earlier I highlighted the parse classes API changes which were not
particularly FTP related. They could have been audited separately and
merged earlier. With possible benefits to the ongoing parser project.

2) this patch is also explicitly forcing the ConnStateData inheritence
model for Http::Server and Ftp::Server despite previous weeks of
argument reaching a deadlock about separation of duties between
ConnStateData or ClientSocketContext for that class hierarchy - and
indeed whether it should be a hierarchy at all.
 *without any discussion to resolve that deadlock*.
 Luckily I have learnt more of the client-side functionality since then
and am okay with this change. But since we have not communicated that a
lack of discussion here could very easily have resulted in more
days/weeks wasted.

Cheers
Amos
Received on Sun Aug 10 2014 - 12:12:23 MDT

This archive was generated by hypermail 2.2.0 : Mon Aug 11 2014 - 12:00:16 MDT