Re: [PATCH] base-64 encoder upgrade

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Fri, 29 Apr 2011 10:39:11 -0600

On 04/29/2011 08:18 AM, Amos Jeffries wrote:

> Are you aware of any convention about src,dst or dst,src pairing and
> which goes first? All I can think of is the snprintf(out, out_max,
> ...) example.

I would mimic memcpy() and memmove(): dst, src.

However, std::copy() uses src,dst so you cannot go wrong either way :-)!

> I'm tempted to alter the API to be (outBuf, outBufsz, inBuf, inBufSz)
> before we get too many more callers for the new functions.

Good idea, IMO.

> === modified file 'src/HttpHeader.cc'
> --- src/HttpHeader.cc 2011-03-22 12:23:25 +0000
> +++ src/HttpHeader.cc 2011-04-29 13:13:00 +0000
> @@ -1414,7 +1414,10 @@
> if (!*field) /* no authorization cookie */
> return NULL;
>
> - return base64_decode(field);
> + static char decodedAuthToken[8192];
> + int decodedLen = base64_decode(decodedAuthToken, fields, sizeof(decodedAuthToken));
> + decodedAuthToken[decodedLen] = '\0';
> + return decodedAuthToken;
> }

s/fields/field/ ?

The above may crash if decodedLen is sizeof(decodedAuthToken). I think
there are similar places elsewhere in the patch. Search for '\0' to find
more.

Looking at this and other caller changes, it feels like you need
base64_decode_and_terminate() that will handle termination safely. If
you add that, please check that 64_decode_len() and/or its documentation
is adjusted accordingly.

It feels wrong that we ignore all decoding errors, but fixing that may
be outside your patch scope.

s/int decodedLen/const int decodedLen/

> === modified file 'helpers/ntlm_auth/smb_lm/ntlm_smb_lm_auth.cc'
> + int decodeLen = base64_decode(decoded, buf+3, sizeof(decoded), ch-buf+3);
>
> - if (!decoded) { /* decoding failure, return error */
> + if (!*decoded) { /* decoding failure, return error */

base64_decode() does not 0-terminate, so the above check looks wrong.

> Which reminds me, you notice how I've made a point of mentioning the
> altered source file where each audit comment related to?

No, I have not, sorry. I just search for a string in the patch -- if
there are multiple occurrences, there could be multiple similar bugs...

> Please make a
> habit to doing that too. It's much easier to find and fix the problems.

OK. And please post diffs with larger context :-)

> === modified file 'lib/base64.c'
> +int
> +base64_decode_len(const char *data)
> +{
> + if (!data || !*data)
> + return 0;
> +
> + int terminatorLen = 0;
> + int dataLen = strlen(data);

s/int dataLen/const int dataLen/

The !*data check is probably not needed.

BTW, I am not sure, but it is possible that base64_decode_len() can be
implemented more efficiently and/or more elegantly using strcspn(data,
"="), avoiding searching for \0 and then back-searching for \=. Not a
big deal though, even if strcspn() is indeed faster.

> === modified file 'lib/base64.c'
> +int
> +base64_encode_str(char *result, const char *data, int result_size, int data_size)
> +{
> + int used = base64_encode(result, data, result_size, data_size);
> + /* terminate */
> + if (used >= result_size) {
> + result[result_size - 1] = '\0';
> + return result_size;
> + } else {
> + result[used++] = '\0';
> + }
> + return used;
> +}

The above does not handle zero result_size correctly. Perhaps zero
result_size should be illegal for base64_encode_str()?

> === modified file 'tools/cachemgr.cc'
> - str64 = base64_encode(buf);
> + int encodedLen = base64_encode_len(bufLen);
> + char *str64 = static_cast<char*>(xmalloc(encodedLen));
> + if (base64_encode_str(str64, buf, encodedLen, bufLen) == 0)
> + return "";

As currently defined, base64_encode_str() cannot return zero because it
has to terminate.

And if you leave the adjusted if-statement, please note that it causes a
str64 memory leak.

> === modified file 'tools/cachemgr.cc'
> --- tools/cachemgr.cc 2011-04-12 11:33:32 +0000
> +++ tools/cachemgr.cc 2011-04-29 13:37:29 +0000
> @@ -1053,16 +1053,17 @@
> return;
>
> /* host | time | user | passwd */
> - snprintf(buf, sizeof(buf), "%s|%d|%s|%s",
> - req->hostname,
> - (int) now,
> - req->user_name ? req->user_name : "",
> - req->passwd);
> -
> + int bufLen = snprintf(buf, sizeof(buf), "%s|%d|%s|%s",
> + req->hostname,
> + (int) now,
> + req->user_name ? req->user_name : "",
> + req->passwd);
> debug("cmgr: pre-encoded for pub: %s\n", buf);
> - debug("cmgr: encoded: '%s'\n", base64_encode(buf));
>
> - req->pub_auth = xstrdup(base64_encode(buf));
> + int encodedLen = base64_encode_len(bufLen);

Please check whether both bugLen and encodedLen can be declared const.
There are probably similar places I have missed. While I do not expect a
noticeable performance improvement from const-related compiler
optimizations, const does help when reading the code -- you know there
are no funny hidden adjustments to worry about. Believe it or not, there
are even programming languages where everything is const!

Thank you,

Alex.
Received on Fri Apr 29 2011 - 16:39:42 MDT

This archive was generated by hypermail 2.2.0 : Fri Apr 29 2011 - 12:00:07 MDT