Re: [PREVIEW] Error page format codes upgrade

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 19 Apr 2013 23:44:53 +1200

On 19/04/2013 5:41 a.m., Alex Rousskov wrote:
> On 04/06/2013 11:48 PM, Amos Jeffries wrote:
>
>> What this does is convert ErrorState object to using the generic
>> libformat.la parser and macro expansions instead of its own rather
>> limited custom ones for error pages and deny_info URL creation.
> This is a very nice improvement IMO, thank you.
>
>
>> class AccessLogEntry: public RefCountable
>> {
> ...
>> +
>> + /// details of the error page presented to the client (if any)
>> + ErrorState *errorDetails;
>> };
> Please do not use a general name like "errorDetails" to store
> information dedicated to a very specific use by internal ErrorPage code.
> Besides, we already have a whole class called ErrorDetail and it has
> nothing to do with the above member. I suggest calling this new member
> "errorPage".

Done.

> Please also add a comment that explains that this pointer is managed
> internally by ErrorState and should not be used outside ErrorState code.
> You have documented that in your email, but I think it is worth
> documenting that in the code. More on the lifetime of this pointer below.
>
>
>> + MemBuf *res = ConvertText(m);
>> +
>> + // yuck! steal the res buffer for our output
>> + result.buf = res->buf;
>> + result.size = res->size;
>> + result.capacity = res->capacity;
>> + result.max_capacity = res->max_capacity;
>> + res->stolen = true; // prevent delete erasing the buffer we stole.
>> + delete res;
> Please rewrite without stealing. One way to do that would be to pass
> MemBuf pointer to ConvertText. If the passed pointer is NULL,
> ConvertText() will create a new buffer.

Done now. Thanks for the reminder.

>
>> + // XXX: if request exists, we might use ALE from request->lientConnectionManager->http.al ??
> It sounds like a great idea because error page ALE is missing
> information that the http.al has. On the other hand, if we set
> http.al.errorDetails to our page, it is not obvious that another error
> page for the same transaction will never do that as well, overwriting
> our pointer and confusing things. I wonder if we can do better:
>
> As far as I can tell, the useful lifetime of the new errorDetails
> pointer is the fmt.assemble() call. If that is true, let's document that
> fact and rewrite the code to set and clear the pointer just before and
> just after that call. You may even assert that the ale_->errorDetails
> pointer is nil in the error page destructor.
>
> If we can do the above, then we should use http.al when it is available.
> Refcounting should make that easy.

This is all done now.

>
>> *b) Doing a whole parse cycle for every error response is very costly.
>> We are already loading the text page files on every error response. This
>> adds the un-optimized libformat parser lag on top of that.
> Have you tested the performance impact of these changes? Perhaps they
> are not as bad as they sound because the old code was not super
> optimized either?

Not yet. However the old code was just doing a parse scan with multiple
append() operations at each macro site. The new code is doing the same
on top of a format tree structure allocate/deallocate cycle. So for now
this is guaranteeing yet another step towards bad performance.

>
>> *c) For now AccessLogEntry equired by the libformat API is generated by
>> ErrorState just before use. Ideally the code creating the ErrorState
>> should pass it an already filled out ALE object for the transaction.
> Unfortunately, a lot of code creating error pages is in no better
> position than ErrorState to find that valuable ALE object. This is the
> old problem of a missing "master transaction" object: HttpRequest is
> available nearly everywhere but is declared (probably correctly!)
> unsuitable for logging in favor of ALE which is not easily available in
> many contexts.
>
> I suspect that the only reasonable mid-term solution is to carefully add
> a pointer to the "master transaction" ALE to HttpRequest.
>
>> - Ip::Address src_addr;
>> + Ip::Address src_addr; // client Comm::Connection ??
> Please see if you can delete this member. You may have removed the only
> useful code that was using it. The dumping code is not very useful and
> can can probably be rewritten to get this info from elsewhere (since the
> ALE formatting code does support logging of the client source address
> and does not have access to this member).

This one is not quite so easy. It is supposed to be used for errors
early in parsing when there is no HttpRequest object.
I have taken advantage of the change though and enacted that "??" / TODO
about making it a connection pointer.
That has caused a few build issues around the place. New patch to follow
when it builds again.

>
>> - case '%':
>> - p = "%";
>> - break;
>> -
>> default:
>> - mb.Printf("%%%c", token);
>> - do_quote = 0;
>> break;
>> }
> After your changes, if an old error page had %x text, where "x" was not
> a supported error page macro name, is it possible that the finalized
> error page (as seen by users) may change? If so, this should be
> documented in the change log, I think.

Yes it changes. It has been remarked as a bug several times that new
macros are visible in old releases.

>
>> + // parse the text using Format:: parser
>> + Format::Format fmt("errorPageTemporary");
>> + fmt.parse(text);
> If possible, please create fmt later, when it is actually needed for the
> assemble() call.

Done.

Amos
Received on Fri Apr 19 2013 - 11:45:03 MDT

This archive was generated by hypermail 2.2.0 : Sat Apr 20 2013 - 12:00:16 MDT