Re: [PATCH 5/8] reconfiguration leaks: objects tied to http_port

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 16 Jun 2014 15:15:21 -0600

On 06/15/2014 12:02 AM, Amos Jeffries wrote:
> On 15/06/2014 4:44 a.m., Alex Rousskov wrote:
>> On 06/13/2014 07:25 PM, Amos Jeffries wrote:
>>> On 14/06/2014 8:07 a.m., Alex Rousskov wrote:
>>>> On 04/25/2014 02:59 AM, Amos Jeffries wrote:
>>>>> On 25/04/2014 12:55 p.m., Alex Rousskov wrote:
>>>>>> Do not leak [SSL] objects tied to http_port and https_port on reconfigure.
>>>>>>
>>>>>> PortCfg objects were not destroyed at all (no delete call) and were
>>>>>> incorrectly stored (excessive cbdata locking). This change adds
>>>>>> destruction and removes excessive locking to allow the destructed
>>>>>> object to be freed. It also cleans up forgotten(?) clientca and crlfile
>>>>>> PortCfg members.
>>>>>>
>>>>>> This change fixes a serious leak but also carries an elevated risk:
>>>>>> There is a lot of code throughout Squid that does not check the pointers
>>>>>> to the objects that are now properly destroyed. It is possible that some
>>>>>> of that code will crash some time after reconfigure. It is not possible
>>>>>> to ensure that this does not happen without rewriting/fixing the
>>>>>> offending code to use refcounting. Such a rewrite would be a relatively
>>>>>> large change outside this patch scope. We may decide that it is better
>>>>>> to leak than to take this additional risk.

>>>>> -0.
>>>>>
>>>>> I have a patch moving the SSL config options into a standalone
>>>>> ref-counted object. That can be polished up and references added to each
>>>>> ConnStateData fairly easily.

>>>> Amos, what is the status of that patch? Any ETA?

>>> Stalled behind the larger works. If it is urgent I can did it out and
>>> polish it up.
>>>
>>> [...] The design is a new
>>> Ref-Countable class to hold all the SSL options (and generated state)
>>> leaving just a Pointer to it in the main config class.
>>> * Ports which needed a clone operation took a copy of the pointer and
>>> share the context.
>>> * client/server context initialization functions take a Pointer to the
>>> class and update its state content.

>> Why treat SSL options differently from all other port options? Should
>> not the whole PortCfg object be refcounted instead?

> The other options are POD-style options. They can be reset relatively
> okay by the parsers memset() and are filled out by the config parse
> routines.

AFAICT, not all other options are "POD-style options that can be reset
relatively okay by memset()" unless "relatively OK" means "small leaks
are OK". Some non-SSL port options require specialized cleanup just like
some SSL port options do (PortCfg::name and defaultsite are two examples
with listenConn arguably being the third). Thus, the requirement for
specialized cleanup is not a reason to treat SSL options differently
from all other port options.

> Also, the same set of config settings are needed by *_port, cache_peer,
> and sslproxy_* directives. It makes sense to have one class type for SSL
> config which is used by all three.

While it may indeed be a good idea to group related options together,
such grouping is irrelevant for the purpose of this discussion. Whether
we group SSL-related options or not, they and other port options will
continue to leak. They leak not because they are not grouped but because
Squid leaks the Port structure they belong to (grouped or not).

> On the other side AnyP::PortCfgPointer already exists to reference count
> the port objects. We could spend time converting the raw pointers to use
> it instead so your patch here is not dangerous.

PortCfgPointer is not a reference counting pointer. It is a
cbdata-protected pointer. The proposed patch _is_ using/adding the
associated cdbdata protections in/to some users (but more work may be
needed to make all users safe, as disclosed). Simply converting all raw
port pointers to use PortCfgPointer is not going to make us much safer
(it will just make triaging crashes easier). The callers need to be
changed to check that the pointer is still valid before dereferencing it.

Unlike refcounting pointers, cbdata pointers do not protect naive code
from crashing. The cbdata interface is also asymmetric -- there supposed
to be only one object owner. Cbdata pointers are like std::weak_ptr and
refcounting pointers are like std::shared_pointer. Very different APIs,
offering different trade-offs.

Most likely, we should use the refcounting API for port pointers. Until
that (or a better) solution is implemented, we should either

* stop port leaks or

* acknowledge that we would rather see Squid leaking than increasing the
probability of a Squid crash after reconfigure.

Both approaches are reasonable. The proposed patch does the former. I do
not want to commit that patch because there is only one vote and it is "-0".

What bothers me here is that the justification for the -0 vote was
(AFAICT) a promise of a better patch, but it now appears that the better
patch is not coming soon, has problems of its own, and may not even
remove the leak. I cannot really combat those obstacles, and it is
difficult for me to explain/justify them to others. May I re-interpret
your -0 vote justification as "the proposed fix is too risky"?

Thank you,

Alex.
Received on Mon Jun 16 2014 - 21:15:53 MDT

This archive was generated by hypermail 2.2.0 : Sun Jun 22 2014 - 12:00:11 MDT