Re: [PATCH] Use SBuf for tunnel.cc I/O

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 12 May 2014 16:46:36 -0600

On 05/11/2014 11:16 AM, Amos Jeffries wrote:

> This patch adds a Comm::Write API for accepting SBuf output buffers.

> Unlike the char* API the buffer can be appended to before write(2) is
> completed and the appended bytes will be handled as part of the original
> write request.

That approach scares me because the caller has to be extra careful not
to leave the being-updated-area of the buffer in inconsistent state
across async calls. However, I doubt we have such complex callers today,
and the approach also reduces the number of usually pointless write
callback calls, so I am willing to experiment.

> All other behaviour is identical between the three Comm::Write APIs.

You forgot to mention the fact that the new Write API, unlike the
existing two, relies on the caller to keep the buffer area alive. That
requirement, IMO, is a show-stopper. I strongly object to that change.

The buf2 hack has already been added during unaudited "Convert the
ConnStateData input buffer to SBuf" commit (r13324), so there is little
I can do about it until it causes a bug, but I do object to the use of
such raw/unprotected links between Comm and its callers in any code I am
lucky to review before the commit.

If others overrule this objection, please at least change the
Comm::Write() sb argument type from reference to a pointer. Passing an
argument by reference (implying temporary use) and then taking an
address of that argument for long term object storage is wrong.

> /// Buffer to store read(2) into when set.
> // This is a pointer to the Jobs buffer rather than an SBuf using
> // the same store since we cannot know when or how the Job will
> // alter its SBuf while we are reading.
> SBuf *buf2;
>
> // Legacy c-string buffers used when buf2 is unset.
> char *buf;

Another side-effect is that "I wrote this, what's next?" notifications
may now be delayed for a very long time (potentially forever) if the
caller keeps adding content to the buffer. I suspect that most writers
do not really need such notifications. Thank you for documenting this
aspect (although it deserves to be mentioned in the commit message as
well IMO).

> - static void ReadClient(const Comm::ConnectionPointer &, char *buf, size_t len, comm_err_t errcode, int xerrno, void *data);
> + static void ReadClient(const Comm::ConnectionPointer &, char *, size_t len, comm_err_t errcode, int xerrno, void *data);

Please avoid this and similar non-changes to reduce merge conflicts. It
is OK to remove the parameter name in the method _definition_ where it
was used and is no longer needed(**), but there is no need to update
methods that did not use the parameter or the declarations. These
methods should be refactored/polished when TunnelStateData becomes a
job, but that is outside your patch scope.

(**) although the existing code like TunnelStateData::readServer()
implies that such changes are not required either.

> - void bytesIn(int const &);
> -#if USE_DELAY_POOLS
> -
> - void setDelayId(DelayId const &);
> -#endif
> -
> void error(int const xerrno);
> int debugLevelForError(int const xerrno) const;
> /// handles a non-I/O error associated with this Connection
> void logicError(const char *errMsg);
> void closeIfOpen();
> void dataSent (size_t amount);
...
> int64_t *size_ptr; /* pointer to size in an ConnStateData for logging */
>
> Comm::ConnectionPointer conn; ///< The currently connected connection.
>
> - private:
> #if USE_DELAY_POOLS
> -
> DelayId delayId;
> #endif
> -

> - server.bytesIn(len);
> +#if USE_DELAY_POOLS
> + server.delayId.bytesIn(len);
> +#endif

> - tunnelState->server.setDelayId(DelayId());
> + tunnelState->server.delayId = DelayId();

Please avoid these and similar changes if they are unrelated to your
patch scope.

> if (rep.hdr_sz < connectRespBuf->contentSize()) {
> // preserve bytes that the server already sent after the CONNECT response
> - server.len = connectRespBuf->contentSize() - rep.hdr_sz;
> - memcpy(server.buf, connectRespBuf->content()+rep.hdr_sz, server.len);
> - } else {
> - // reset; delay pools were using this field to throttle CONNECT response
> - server.len = 0;
> + server.buf.append(connectRespBuf->content()+rep.hdr_sz, connectRespBuf->contentSize() - rep.hdr_sz);
> }

Can you explain why the "else" clause is no longer needed? Or was it
bogus to start with?

> - if (!tunnelState->server.len) {
> + if (tunnelState->server.buf.isEmpty()) {

and

> - if (!tunnelState->client.len) {
> + if (tunnelState->client.buf.isEmpty()) {

but

> - if (!tunnelState->server.len)
> + if (!tunnelState->server.buf.length())

Please be consistent: isEmpty() is probably slightly better than
!length(), but it is your call which method to use.

> - debugs(26, DBG_DATA, "Tunnel server PUSH Payload: \n" << Raw("", tunnelState->server.buf, tunnelState->server.len) << "\n----------");
> + debugs(26, DBG_DATA, "Tunnel server PUSH Payload: \n" << tunnelState->server.buf << "\n----------");

Please do not remove Raw() protection for dumping raw data.

Thank you,

Alex.
Received on Mon May 12 2014 - 22:46:49 MDT

This archive was generated by hypermail 2.2.0 : Tue May 13 2014 - 12:00:13 MDT