Re: [PATCH] HelperReply upgrade stage 3

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 01 Dec 2012 00:28:02 +1300

Updated patch attached for review.

On 27/11/2012 9:10 a.m., Alex Rousskov wrote:
> On 11/24/2012 08:27 AM, Amos Jeffries wrote:
>
>> This stage of the helper reply protocol adds kv-pair support to the
>> url_rewrite_helper interfacefor URL redirect and rewriteoperations.
>>
>> It uses the new Notes objects and kv-pair field added by the stage 2
>> helper protocol instead of parsing the 'other' field.
>>
>>
>> The response syntax for URL helpers becomes:
>> [channel-ID SP ] result [ SP kv-pair ... ] [ SP other] EOL
>>
>> * 'other' field is now deprecated and will be ignored/discarded on any
>> response containing a result code field.
>>
>>
>> When result code "OK" is presented by the helper several kv-pairs are
>> reserved to control Squid actions:
>>
>> * "rewrite-url=" is added to return a re-written URL.
>> - When this key is presented the URL is re-written to the new one by
>> Squid without client interaction.
>> - The 'url' keys presence will override this key.
>>
>> * "url=" is added to return the redirected-to URL.
>> - When this key name is presented an HTTP redirect response is
>> generated for the client.
>> - This keys presence overrides the 'rewrite-url' key actions.
>>
>> * "status=" is added to hold the HTTP status code for use in redirect
>> operations.
>> - This field is optional and status is no longer required for marking
>> redirect actions.
>> - If no redirect status is provided Squid will assign one (currently
>> the default is 302, that may change in the future).
>> - This key is only relevant when 'url' key is also presented. In all
>> other uses it is currently ignored.
>> When result codes BH or ERR are presented by the helper no redirect or
>> rewrite action is performed and no kv-pair key names are reserved for
>> use at this time.
> but any key=value pairs in a BH or ERR response are still parsed into
> notes and can be logged, right?

Right.

> Can we tell folks which key names are guaranteed not to clash with
> future Squid key names? For example, we can say that no Squid-supported
> key name will end with "_". Or we could recommend using helper-specific
> prefixes for custom key names instead.

Done with stage-2 in the wiki protocol documentation.

> N.B. We should not disclaim using X-* names because some of those (e.g.,
> X-Client-Username) are a de facto standard and might be used by Squid
> here eventually.
>
>
> Please note that helper annotations will _not_ be sent to adaptation
> services.

K. Added to the commit message.

>
>> Any other keys MAY be sent on any response. The URL helper interface
>> makes no other use of them, but this patch does pass them on to the ALE
>> object for logging as transaction Notes. All kv-pairs returned by the
>> helper (including the url, stauts rewrite-url keys) are available for
>> logging via the %{...}note log format option.
> What about the "note" ACL? Can it be used to match against
> helper-provided annotations? Or will the helper annotations be
> inaccessible to it? Please document one way or another.
>
> If the note ACL will not work for helper annotations, we may need
> "stage4". It may be better to fix that now instead, but that is your
> call provided you will make it work.

ACLs can access them from the HttpRequest object at any point after
being added by the helper response. 'note' ACL is no exception when you
get around to adding it.

>
>> + switch(reply.result) {
>> + case HelperReply::Unknown:
>> + case HelperReply::TT:
>> + debugs(85, DBG_IMPORTANT, "ERROR: URL rewrite helper returned invalid result code. Wrong helper? " << reply);
>> + break;
>> +
> Please add a debugs() line to print reply.result before the above switch
> unless we print that elsewhere already. Not all cases inside the switch
> have debugs() lines so it would be nice to know what the switch value was.
The full reply object {result,notes{...},other()} is being displayed at
level 85,5 at the top of the function.

>
>> + debugs(85, DBG_IMPORTANT, "ERROR: URL rewrite helper: " << reply);
> Please add "redirect_fail_count" value to the above debug line or add a
> new debug line for it. It seems critical in understanding what would
> happen after the error.

Done. Suffixed ", attempt 1 of 2" / ", attempt 2 of 2"

>
>> +// XXX update to handle the new format responses...
>> +// #1: redirect with a specific status code OK status=NNN url="..."
>> +// #2: redirect with a default status code OK url="..."
>> +// #3: re-write the URL OK rewrite-url="..."
> based on the patch description, the above is already implemented, no?
Ah yes. Moved into OK case as documentation there.

>
>> + char * result = const_cast<char*>(statusNote->values[0]->value.termedBuf());
>> + status = (http_status) atoi(result);
> Please make result constant if possible.
Done.

>
>> + char * result = const_cast<char*>(reply.other().content());
>> + http_status status = (http_status) atoi(result);
> Please make both result and status constant if possible.
>
Done.

>
>> + char * result = const_cast<char*>(statusNote->values[0]->value.termedBuf());
>> + http->redirect.location = xstrdup(urlNote->values[0]->value.termedBuf());
>> + if (urlNote != NULL && strcmp(urlNote->values[0]->value.termedBuf(), http->uri)) {
> xstrdup() calls exit() on NULL strings. strcmp() might dump core. The
> above lines may assume that termedBuf() never returns NULL. Please
> double check that "value" member cannot be an "undefined" String in
> these contexts. It is probably prudent to check for that centrally, when
> parsing annotations, if we are going to rely on that -- refuse to add
> undefined Strings to annotation values.
Fixed with use of the stage-2 firstValue() accessor which guarantees ""
at minimum.

>
>> + Notes metaNotes; // collection of meta notes associated with this request.
> I find it ironic that you insisted on renaming "meta headers" but are
> now adding "meta notes" :-). Please consider using "notes" or
> "annotations" instead. The annotations are already "meta".
This linking of notes into HttpRequest was one of the reasons I insisted
on removing the 'headers' part of the naming. It would have been hell
tracking the difference between mimeHeaders and metaHeaders and
virginHeaders and adaptedHeaders.
I never had anything against the 'meta' part other than it looked weird
on the other type name choices.

Renamed to "helperNotes" for now to clarify they are notes from the helpers.

> The description seem to imply that metaNotes contains all annotations
> associated with this request, but I do not think this is true because
> some annotations are stored elsewhere. This is messy so let's document
> exactly which annotations are using this particular storage (I think it
> is currently used just for helper annotations, right?).
It *will* contain all the notes from all the helpers - when I get around
to altering the other helpers. But this sponsor had specific target
requirements and a short timeframe I did not get to make the same update
on the other interfaces code (yet).

> Finally, I am concerned that we are adding Notes to all HttpRequests
> when most HttpRequest objects do not need annotations. Currently, Notes
> are implemented using Vector<> which has low performance overhead when
> unused. I am worried that if we later replace Vector<> with std::vector
> or similar, the inclusion of Notes will cause a performance hit. What do
> you think? Should we use a pointer to Notes here instead?
>

I was hoping to avoide all the extra checking pointers need and the
duplicated if-NULL-allocate each helepr interface will need to perform.
But I see your point. Changed to a Notes*.

>
>> + * Adds a set of notes from another notes list to this set.
>> + * Create any new keys needed.
>> + * If the key name already exists in list, add the given value to its set of values.
>> + */
>> + void add(const Notes &src);
> Please polish the description to indicate that pointers to src notes
> and/or pointers to individual note values are added. Modifying src after
> add() may change the added notes as well!

And vice versa.
I have dropped the new-key import so that future merges of other helper
values for the key does not alter the original src objects key. But the
values inside the keys seem to be safe enough to leave as references and
save a bit on memory. More on that below...

>
>> + uint8_t redirect_fail_count;
> New member missing a doxygen description.

Added.

>> --- src/HttpRequest.cc 2012-10-27 00:13:19 +0000
>> +++ src/HttpRequest.cc 2012-11-24 14:55:43 +0000
>> @@ -114,6 +114,7 @@
>> peer_domain = NULL; // not allocated/deallocated by this class
>> vary_headers = NULL;
>> myportname = null_string;
>> + metaNotes.clean();
>> tag = null_string;
> HttpRequest::init() does not need to call clean() but
> HttpRequest::clean() does!
Resolved the particular issue with move to pointer. Adding a delete to
HttpRequest::clean() and NULL-initialize to the constructors to resolve
the logic flaw.

>
>> @@ -221,6 +222,7 @@
>> // XXX: what to do with copy->peer_domain?
>>
>> copy->myportname = myportname;
>> + copy->metaNotes.notes = metaNotes.notes;
>> copy->tag = tag;
>> #if USE_AUTH
>> copy->extacl_user = extacl_user;
> Hm.. The generated Notes assignment operator will not really clone
> individual notes inside the "notes" vector/array. It will just copy
> pointers to them. Is that OK because we never modify a Note? I am not so
> sure. Perhaps an older Note will be modified when the "note" directive
> is checked, affecting cloned requests that should be independent from
> it? Should we prohibit Notes assignment and add an explicit
> Notes::clone() method that is guaranteed to clone Notes rather than copy
> their pointers?

If you think there is a case against retaining this memory linking I can
add a Notes::clone()? But AFAICS there is no existing need for it (maybe
in future ...).

Notes are read/append-only by design. No write/alter for existing
values. So I don't see any issue here. The most I expect we will do is
erase Note::Pointer linkages to a particular request object. And not
even doing that specifically yet.

The model is that each HttpRequest down the transaction chain inherits
from the one it was cloned from a set of read/append-only notes. Any
helper or ACLs called may read some notes for their parameters and
append new notes until the last post-adaptation HttpRequest object at
the end of the transaction moves the whole set into ALE for logging.

The keys might have been altered later to add new values, I have
replaced that copy/reference with new(). But the values themselves are
read-only, so seem to be sfae enough and I have left as a refcounted
pointer copy.

>
>> + char *t = NULL;
>> +
>> + if ((t = strchr(result, ':')) != NULL) {
> This can be merged into one line and simplified. You may also be able to
> declare "t" constant.
>
> if (const char * t = strchr(result, ':')) { ...

Done.

Amos

Received on Fri Nov 30 2012 - 11:28:25 MST

This archive was generated by hypermail 2.2.0 : Fri Nov 30 2012 - 12:00:18 MST