Re: [PATCH] Helper callback params from Alex Rousskov on 2012-07-05 (squid-dev)

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 05 Jul 2012 10:23:44 -0600

On 07/05/2012 12:31 AM, Amos Jeffries wrote:
> This patch updates the helper reponse callback API from using char*
> buffer to a HelperReply object.
>
> * the helper I/O read handler is updated to parse the result code off
> the start of the helper response as is currently done for channel-ID.
> The callback handlers are altered to use the HelperReply::status instead
> of parsing it off themselves individually.
>
> * the remaining I/O read buffer is stored in a MemBuf and callbacks are
> updated to use it via the method other().
>
> * the responding helper-server is stored into the HelperReply object and
> stateful helper callbacks are combined into the same API as stateless.
> The callback handlers are updated to use HelperReply::lastserver instead
> of function parameter.
>
> After this patch the helper response format is: [channel-ID] [result]
> [blob] <terminator>
>
> The behavour changes expected from this is that all helpers are now able
> to send OK/ERR/BH states. Although the handlers for some helpers will
> deal with the new states as unknown response. None of the bundled
> helpers have been altered to make use of this changed potential.
>
> TODO:
> * implement key=value parser for the blob area of the format, and update
> handlers to use the HelperReply API to retrieve them.

> + HelperReply(); // not defined

Do not declare it if you do not need it. However, I think it would be
best to actually define it and then move parsing from a constructor to a
method so that helper-specific reply code can be called during parsing
(via virtual methods).

> + HelperReply(const HelperReply &); // not defined

Either

* Make private so that its usage is caught at compile time rather than
at linking time. Please also add an undefined assignment operator.

* Remove copy constructor and do not add an undefined assignment
operator. AFAICT, the only thing that prevents us from copying
HelperReplies is that current MemBuf prohibits MemBuf copying. This will
be automatically propagated to HelperReply so there is no need to
explicitly prohibit it. If/when MemBuf is improved to allow copying, the
HelperReply class will become copyable as well.

> +class HelperReply
> +{
> +public:
...
> + CBDATA_CLASS2(HelperReply);
> +};

Why do we need to add cbdata here? I do not see dynamically allocated
HelperReply objects that can be passed as pointers to async calls or
some such. Keep it simple?

> + enum Result {
> + Unknown,
> + OK,
> + ERR,
> + BrokenHelper,
...
> + TT,
> + AF,
> + NA,
> + LD

It may be a good idea to make these names more unique to avoid clashes
with poorly written code that places names like "OK" and "ERR" in the
global namespace. Adding an "hrr" (helper reply result) prefix would help.

If BrokenHelper is spelled out, perhaps "TT", "AF", and similar other
results names should be too? Or, vice versa, BrokenHelper should match
its BH string.

It would be nice to spell out those acronyms(?) in comments if we use
them. For example:

BH, // Broken Helper

> + enum Result {
...
> + } status;

Both should be named either [rResult] or [sS]tatus to stay consistent IMO.

The "status" member is also missing a Doxygen description.

> + /// for stateful replies the last server needs to be preserved across callbacks
> + CbcPointer<helper_stateful_server> lastserver;

Why is this the "last" server? From the helper reply point of view, it
is the one and only server. Please consider renaming to "server" or
"helper" or "source".

> + // some result codes for backward compatibility with NTLM/Negotiate
> + // TODO: migrate these into variants of the above results with key-pair parameters
> + TT,
> + AF,
> + NA,
> + LD

and

> + case HelperReply::Unknown: // Squid 3.2 and older the digest helper only returns a HA1 hash (no "OK")
> + case HelperReply::OK:

and

> + // The reply passed to us is constant, so we need to duplicate it and relay the duplicate onward.
> + // The helper is using old format (result 'Unknown') so everything we need is inside the 'res' variable
> + HelperReply newRes(res,(res?strlen(res):0));

The code above already warrants addition of helper-specific reply
classes that can hide helper specifics from the general helper code and
hide backward compatibility conversions from the helper-specific code
that should focus on other things.

> + if (reply.other().hasContent())
> + digest_request->setDenyMessage(reply.other().content());
> + }
> + break;
> +

Something funny is going on here with formatting: The "break" position
looks odd.

> + case HelperReply::Unknown:
> + if (reply.status == HelperReply::Unknown) {
> + debugs(29, DBG_IMPORTANT, "ERROR: NTLM Authentication Helper '" << reply.lastserver << "' crashed!.");
> + blob = "Internal error";
> + }
> +

and

> + case HelperReply::Unknown:
> + debugs(29, DBG_IMPORTANT, "ERROR: Negotiate Authentication Helper '" << reply.lastserver << "' crashed!.");
> + blob = "Internal error";
> +
> + case HelperReply::BrokenHelper:

Please add something like /* continue to the next case */ to emphasize
that "break" is not missing here.

> - arg = strchr(blob + 1, ' ');
> - } else {
> - arg = NULL;
> + const char *blob = reply.other().content();
> + char *arg = NULL;
> + if (blob && *blob != '\0') {
> + const char *temp = strchr(blob + 1, ' ');
> + arg = xstrdup(++temp);
> }

and

> + if (reply.other().hasContent()) {
> + char *temp = xstrdup(reply.other().content());
> + char *token = strwordtok(temp, &t);

Are these performance regressions? We did not do an xstrdup() before,
did we? If these are minor performance regressions desirable for
non-performance reasons, please disclose in commit message. Just like
XXXs in the code, disclosing is "good practice" and might help
identify/explain regressions later during performance testing or
investigations.

> + debugs(29, 8, HERE << "helper: '" << reply.lastserver << "' sent us {" << reply << "}");

HelperReply already adds curly braces when dumped to ostream.

> - if (reply_message.parse(reply, strlen(reply)) != Ssl::CrtdMessage::OK) {
> + if (reply_message.parse(reply.other().content(), reply.other().contentSize()) != Ssl::CrtdMessage::OK) {
> debugs(33, 5, HERE << "Reply from ssl_crtd for " << sslHostName << " is incorrect");
> } else {
> - if (reply_message.getCode() != "OK") {
> + if (reply.status != HelperReply::OK) {

This looks strange to me. We used to extract and analyze the code from
"reply". Now we are extracting that code from reply.other() but then
analyzing reply.status which is not a part of reply.other().

If the old "reply" is different from the new "reply.other()", then we
are parsing the wrong thing now. And if the old "reply" is the same as
the new "reply.other()", then we are looking at the wrong status/code now.

This confusion (even if the new code is correct!) is one more reason why
helper-specific parsing code should be moved to helper-specific
HelperReply kids as a part of this change IMO.

> + // NULL-terminate so the old helper callback handlers do not buffer-overrun
> + other_.terminate();

Null-terminations is required for most helpers callbacks I saw in the
patch. Perhaps we should remove "old "?

> + if (reply.other().hasContent()) {
> + /* 2012-06-28: This cast is due to urlParse() truncating too-long URLs itself.
> + * At this point altering the helper buffer in that way is not harmful, but annoying.
> + * When Bug 1961 is resolved and urlParse has a const API, this needs to die.
> + */
> + char * result = const_cast<char*>(reply.other().content());

Please add the following method and use it here instead of the
const_cast<>. It would make tracking the "At this point [modifying blob]
is not harmful" assumption much easier/safer.

    /// access to modifiable blob, required by redirectHandleReply()
    /// and by urlParse() in ClientRequestContext::clientRedirectDone()
    MemBuf &HelperReply::modifiableOther() const { return other_; }

The method implementation may need casting itself, but try without it
first (and it is not a problem if it does need it).

You can drop "modifiable" prefix from the method name, but it is easier
to track this that way, and since it is an exception and not a commonly
used method, I think it is better that way. Your call though.

> + // BACKWARD COMPATIBILITY 2012-06-15:
> + // Some nasty old helpers send back the entire input line including extra format keys.
> + // This is especially bad for simple perl search-replace filter scripts.
> + //
> + // * trim all but the first word off the response.
> + // * warn once every 50 responses that this will stop being fixed-up soon.
> + //
> + if (char * res = const_cast<char*>(reply.other().content())) {

Same comments apply here.

BTW, most other helper callbacks just dupe the the other().content()
string. Why do these two one prefer to avoid that and cast instead?

> + const MemBuf &other() const { return other_; };

Extra semicolon at the end.

> - http->redirect.status = status;
> - http->redirect.location = xstrdup(t + 1);
> + if (!strcmp(http->redirect.location, http->uri)) {
> + // prevent redirector setting up an infinite loop of 302.
> + xfree(http->redirect.location);
> + debugs(85, DBG_CRITICAL, "ERROR: URL-rewriter redirects the client back to " << http->uri);
> + } else {

Looks like new functionality unrelated to your patch scope. Please
commit separately if possible. BTW, have you seen redirect-to-same-URI
cause infinite 302 loops? I thought browsers can detect these (and we
may misdetect something if the two URLs are not actually the same from
client point of view). Could be another policing problem that will
require workarounds. If there are real cases were this is needed, let's
add it, of course (but separately).

> + debugs(85, DBG_CRITICAL, "ERROR: URL-rewriter redirects the client back to " << http->uri);
...
> + debugs(85, (!(temp++%50)?DBG_CRITICAL:2), "ERROR: URL-rewriter redirects the client back to the same URL. Ignoring reply " << reply);

Can you make these two cases identical in behavior,
too-many-console-messages avoidance logic, and ERROR message text?

> + static int temp = 0;

There is nothing "temporary" about this variable. Consider calling it
errorCount or similar.

> + debugs(85, (!(temp++%50)?DBG_CRITICAL:2), "ERROR: URL-rewriter redirects the client back to the same URL. Ignoring reply " << reply);

Please use something like the following trick to report the number of
ignored replies (so that it is easy to confirm why the messages have
disappeared after a while):

++temp;
debugs(.... Ignoring looping reply #" << temp << ' ' << reply);

> if (cbdataReferenceValid(state->def)) {
> - if (reply)
> + // only cache OK and ERR results.
> + if (reply.status == HelperReply::OK || reply.status == HelperReply::ERR)
> entry = external_acl_cache_add(state->def, state->key, entryData);
> else {
> external_acl_entry *oldentry = (external_acl_entry *)hash_lookup(state->def->cache, state->key);

An undisclosed functionality change unrelated to this patch scope?
Please commit separately.

> + HelperReply nulReply(NULL,0);

s/nul/null/ or s/nul/nil/

There are several such objects; search for nulReply. This is minor, of
course, but keep things more consistent and might make it easier for
some to comprehend the intent.

HTH,

Alex.
Received on Thu Jul 05 2012 - 16:23:50 MDT

This archive was generated by hypermail 2.2.0 : Mon Jul 23 2012 - 12:00:04 MDT