Re: bugs in HEAD from Henrik Nordstrom on 2002-05-19 (squid-dev)

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sun, 19 May 2002 18:21:07 +0200

On Friday 17 May 2002 10:17, Andres Kroonmaa wrote:

> > The pointer returned by cbdataAlloc has an implicit lock. The
> > same pointer should be used for cbdataFree() later on.
>
> What you mean by implicit lock? cbdataAlloc only sets valid bit.

There may be two kinds of references to a cbdata allocated memory:

1. The main pointer, managed by the "owner". Created by cbdataAlloc().

2. cbdata references for use in functions doing callbacks to the
caller, created by cbdataReference(). (for example storeClientCopy())

The main pointer do not need to be locked/unlocked, except for the
special case where one temporarily holds a copy of the pointer while
explicitly calling cbdataFree and you want to protect the pointer
from yourself, allowing references after calling cbdataFree().

> How do you distinguish between pointer and reference? Suppose a
> func is called, once with a pointer and once with a reference.

By use.

In principle a function is always called with a pointer, and it is
the callers responsibility to ensure this pointer is valid. If the
function needs to store a reference to this pointer for a longer time
period then it must create a cbdata reference, and later use
cbdataReferenceValid (or DoneValid) to check the validity of the
reference.

> How can a func determine whether this is a reference, and if so
> if it is valid?

A function should never need to determine if a cbdata type argument
is invalid. If the argument isn't valid then the function should not
be called. Period.

> By your definition it must know that beforehand, or you must not
> call the same func with different pointers.
> Then perhaps we should enforce that to avoid abuse of cbdata?

Perhaps. I once thought about making cbdataReference() return a void
* pointer to better illustrate that it is a reference not really
meant to be used for any other purpose than as a argument to a
callback, but the drawback of doing so is that you loose the type
checking of the compiler.

> peerDigestFetchedEnough seems to need know whether fetch struct is
> valid or not. Removing this validation from there would need quite
> alot of rewrite of that func that I wouldn't like to do because of
> lack of understanding.

It should not require that much of a rewrite.. If it does then there
is more bugs that needs to be addressed.

As said earlier, no other module will invoke a callback into
peer_digest.c if fetch is invalid, and any sections inside
peer_digest.c that may cause fetch to be become invalid must be
protected by cbdataInternalLock() segments, so it is basically just
to change the cbdataInternalLock() segments to make sure they abort
proper if fetch becomes invalid.

> For now I'd only try to make sure fetch
> validity checks is done under locks as it was before.
>
> diff -u -r1.13 peer_digest.c
> --- peer_digest.c 10 May 2002 20:58:28 -0000 1.13
> +++ peer_digest.c 17 May 2002 07:03:22 -0000
> @@ -356,16 +356,16 @@
>
> assert(fetch->bufofs <= SM_PAGE_SIZE);
>
> + /* Lock our data to protect us from ourselves */
> + cbdataInternalLock(fetch);
> +
> /* If we've fetched enough, return */
> if (peerDigestFetchedEnough(fetch, fetch->buf, fetch->bufofs,
> "peerDigestHandleReply")) - return;
> + goto finish;
>
> /* Call the right function based on the state */
> /* (Those functions will update the state if needed) */
>
> - /* Lock our data to protect us from ourselves */
> - cbdataInternalLock(fetch);
> -
> /* Repeat this loop until we're out of data OR the state
> changes */ /* (So keep going if the state has changed and we still
> have data */ do {
> ----------------
> finish:
> /* Unlock our data - we've finished with it for now */
> cbdataInternalUnlock(fetch);
> }

This is fine as a patch to the existing code who is already abusing
cbdata, but not the best way.

> Ok, but why is just innocent "is this ptr valid?" question
> causing "wrong question man, boom, we shot you"? conspiracy? ;)

Because it is truing to enforce a reasonable use.

You cannot check validity on a pointer where there is no references,
as if there is no references then you won't know if the memory has
been freed and then reused.

> Besides, this could be a reference, just at this point in time
> released reference. cbdataReferenceDone() should have NULL'fied
> the pointer? btw, why would cbdataReferenceValid() return valid
> on NULL pointer?

traditionally cbdataValid() returned true on NULL pointers, to allow
one to pass NULL arguments to callbacks, but this is a fact I
disagree with, as it makes it close to impossible to determine if the
pointer is NULL because we never got any, or because whe have already
used it and got rid of the reference..

Once the code has been audited and any uses of NULL callback
arguments have been removed, then NULL cbdata pointers will be
considered invalid.

> > a) You called cbdataReferenceDone() where actually cbdataFree()
> > was intended.
>
> cbdataReferenceDone() should assert locks>0

It does. See cbdataInternalUnlock().

However, there may be active cbdata references delaying where this
is detected. This kind of error will then be detected when the last
reference is destroyed, which might be another module than where the
error occurred.

> > b) incorrect storage of cbdataReference() references, causing
> > the same cbdataReferenceDone() to be called more than once for
> > the same reference.
>
> cbdataReferenceDone() should assert locks>0

What I meant here is code that copies a cbdataReference() pointer
manually in an incorrect manner.

And yes, cbdataReferenceDone() will trigger on the error..

> > c) you managed to screw up the link count by incorrect use of
> > cbdataInternalLock/Unlock.
>
> cbdataInternalUnLock() should assert locks>0

It does.

> I feel there is a lack of instrumentation with cbdata.

Not sure about this.

> cbdataAlloc'ed pointer can be freed under a lock, and there is
> no way to check for that otherwise than with ReferenceValid.

Correct.

But keep in mind that you are only allowed to call cbdataFree() on
the "main" pointer, not on any cbdataReference() pointer.

The scope of cbdata is to allow for safe memory management in
precense of callbacks where events may occur asynchronously, not as a
general tool to address reuse of freed pointers in traditional code.

To address the traditional problem of reuse of pointers after free in
normal sequential code, normal methods should be used.

> Locked cbdata in indistinguishable from its Reference.

Yes, and if you need to lock your cbdata main pointer then you are
pretty much on your own when it comes to verifying that the use is
correct, as you are tinkering with internal aspects of cbdata.

If you look into the cbdata programmers manual, you will find that
the cbdataInternalLock/Unlock functions is undocumented, and this is
for a reason. It is also intentional that they are named
cbdataInternal...

> We need then cbdataValid() call for that or something.
> As I understand you insist that the two should be kept separate
> strictly: cbdata itself and ref to it. Then we need something
> more than just lock count for that, imho.
> Its pretty confusing now.

peer_digest.c is confusing because it tries to use cbdata for stuff
cbdata was not and is not intended for.

cbdata was and is designed to allow for callbacks in a safe manner,
cancelling the callback if the argument to the callback has been
destroyed without the callback being unregistered.

peer_digest.c uses it as a method to abort the local execution when
locally calling cbdataFree().

You should not really be looking at peer_digest.c if you want to make
sense of cbdata.

> Intuitively, if you "ask" if cbdataReferenceValid you expect to
> get yes or no, instead you can get a bummer.
> One could always assert(cbdataReferenceValid) to get the bummer.

You get a bummer in case there is an internal inconsistency telling
that your memory usage has been corrupted.

> Also, are there any tricks with pointer types to let compiler
> catch misuses of references inplace of cbdata?

There is no such thing as references in C...

I have considered using "void *" or a generic "cbdata" type for
callback references to signify the difference between a
cbdataReference and the main pointer. In the end we might end up
there. Note: Today almost all cbdata references are "void *", which
is the natural type for most uses in the intended scope where a there
is a need to store a callback argument, to later be passed to the
indicated callback function.

Regards
Henrik
Received on Sun May 19 2002 - 10:22:00 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:15:28 MST