Re: [PATCH] URL handling bugs

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 18 May 2011 09:36:34 -0600

On 05/18/2011 08:52 AM, Amos Jeffries wrote:
> This patch includes two bug fixes in URL handling which were uncovered
> during testing of the URL logging update:
>
> * URL re-write handling was not correctly creating its adapted request
> copy. The code here is much reduced by using the clone() method. Still
> not completely satisfactory (marked with XXX) since on errors some
> cycles are wasted on cloning.

Wasting a few cycles on errors would be OK, I guess, but there is also
at least a possibility of a memory leak: HttpRequest::clone() always
produces a new request, but AFAICT urlParse() sometimes returns NULL
instead of the request, without deleting the request parameter. Thus, we
must not do the following:

    urlParse(..., old_request->clone())

but have to do something like

    new_request = old_request->clone();
    if (!urlParse(new_request->method, result, new_request)) {
        debugs(...);
        delete new_request;
        new_request = NULL;
    }

which still looks terrible but at least avoids memory leaks.

Eventually, this would become an HttpRequest::resetUrl() or reparseUrl()
method, I guess:

    new_request = old_request->clone();
    if (!new_request->resetUrl(result)) {
        ...
    }

(assuming there are good reasons for urlParse() to require access to
request details), but such changes are probably outside your patch scope.
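
To illustrate, such a method could be little more than a wrapper plus the
cache reset. A rough sketch only; the resetUrl() name and the safe_free()
cleanup of the canonical field are my assumptions, not existing code:

    // hypothetical member: re-parse the given URI into this request and
    // drop the stale cached canonical form so it is rebuilt on demand
    bool
    HttpRequest::resetUrl(char *uri)
    {
        if (!urlParse(method, uri, this))
            return false;
        safe_free(canonical); // next urlCanonical() call recomputes it
        return true;
    }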

> + // XXX: why does this not get copied by clone()?
> + new_request->content_length = old_request->content_length;

Because it is set in HttpHdr::hdrCacheInit() called by
HttpRequest::clone(). You should be able to remove the above two lines.

It may be more efficient to add HttpHdr::clone() instead of initializing
its cache from scratch, of course, but that optimization is probably
outside your patch scope.
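
For the record, the shape of that optimization would be roughly the
following. Purely illustrative; the method name and the set of members
copied are not existing Squid code:

    // illustrative sketch: copy the already-computed header-derived cache
    // from the original request instead of re-parsing the cloned headers
    void
    HttpRequest::copyHdrCacheFrom(const HttpRequest &other)
    {
        content_length = other.content_length;
        // ... the remaining hdrCacheInit()-derived members go here ...
    }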

> - copy->canonical = canonical ? xstrdup(canonical) : NULL;
> + urlCanonical(copy); // does a re-build from the above copied details.

This change seems like a step backwards to me. We should be able to just
clone the canonical field when cloning the request. Why do we need to
rebuild it from scratch? The clone() method is within its rights to copy
such low-level bits directly. We should treat it as a copy constructor,
essentially (in fact, it should probably eventually be implemented using
a private copy constructor).

If the canonical field of the request being cloned is not valid (and
hence should not be cloned), then the bug is elsewhere.

The urlCanonical(request) global should be converted into a constant
HttpRequest method, but that may be outside your patch scope. In any
case, urlCanonical() is not designed for cloning and, unlike the HttpHdr
cache discussed above, we already have a simple xstrdup() cloning
solution here that we should use.
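
In other words, clone() should keep doing what the removed line did:

    // a faithful clone copies the cached canonical form verbatim
    copy->canonical = canonical ? xstrdup(canonical) : NULL;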

> * URL parsing needs to set the canonical field to unset whenever the
> URI is re-parsed into a request. This field is an optimization for later
> display speed-ups. This has been causing bad URL display throughout
> re-written request handling

Can you rephrase using a word other than "display"? I think I agree with
the above, but it is not clear what you mean by "display". Perhaps you
can say that stale canonical field values were used throughout Squid
after a request was rewritten?
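
If I understand the intent correctly, the fix boils down to something
like this wherever a URI is re-parsed into an existing request (a sketch;
I am assuming the usual safe_free() helper here):

    // the cached canonical form is now stale; clear it so that the next
    // urlCanonical() call rebuilds it from the freshly parsed details
    safe_free(request->canonical);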

> and when the clone() was corrected to use
> urlCanonical() caused asserts in the server-side.

I disagree that it was a correction. Please see above for specifics.

> This canonical change has been discussed previously. There is one other
> incorrect setting of HttpRequest::canonical in icap/ModXact.cc, but that
> must handle a const input, which the urlCanonical() function does not
> yet support.

To fix that, urlCanonical(request) should become a constant HttpRequest
method. Is there an XXX about this in icap/ModXact.cc? If not, please add.
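
The conversion itself should be mechanical; roughly, a declaration along
these lines (the exact name and return type are a sketch, not existing
code):

    // in HttpRequest.h: constant method replacing the urlCanonical() global
    const char *canonicalUrl() const;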

Thank you,

Alex.