In boost.thread's start function, the source code is something like this:
bool thread::start_thread_noexcept()
{
    uintptr_t const new_thread = _beginthreadex(
        0,
        0,
        &thread_start_function,
        thread_info.get(),
        CREATE_SUSPENDED,
        &thread_info->id);
    if (!new_thread)
    {
        return false;
    }
    // why call this line?
    intrusive_ptr_add_ref(thread_info.get());
    thread_info->thread_handle = (detail::win32::handle)(new_thread);
    ResumeThread(thread_info->thread_handle);
    return true;
}
thread_info is an intrusive smart pointer that points to the thread information data. Before the call to intrusive_ptr_add_ref, the count is already 1, so I don't know why intrusive_ptr_add_ref is called manually here. I thought an intrusive smart pointer's job was to call intrusive_ptr_add_ref and intrusive_ptr_release automatically.
I've tried to step through the source code but didn't find any clue.
Can anyone tell me
1. why call intrusive_ptr_add_ref manually here?
2. In what condition when using the intrusive_ptr, I should call intrusive_ptr_add_ref manually?
Thanks, Sincerely.
why call intrusive_ptr_add_ref manually here?
To represent the sharing of ownership of the pointer.
_beginthreadex was passed thread_info.get() as a parameter. This parameter will be passed to thread_start_function when the thread starts. And this function expects the pointer to remain valid until that happens.
Now, _beginthreadex is a simple function. It's not a variadic template that can take arbitrary parameters or anything. It takes exactly and only a naked pointer, and passes exactly that to the start function.
It is very possible for the person creating the boost::thread to call thread::detach before thread_start_function ever gets called. And if that happened, then the thread_info intrusive pointer would be destroyed, thus causing the destruction of its contained object.
And that leaves _beginthreadex with a destroyed pointer. That's bad.
What _beginthreadex needs to do is claim ownership of the intrusive pointer. But since the API doesn't take a boost::intrusive_ptr, how do you do that?
By bumping the reference count. The reference count increase is how _beginthreadex claims ownership of the object.
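To make the hand-off concrete, here is a minimal single-threaded sketch of the same idea. It is not boost's code: ThreadInfo, add_ref, release, start_function and PendingStart are hypothetical stand-ins, and the "thread" is simulated by a callback that runs later. Real code would use an atomic counter.

```cpp
#include <cassert>

// Hypothetical stand-in for the thread data object; not boost's real type.
struct ThreadInfo {
    int refs = 0;      // plain int: this sketch is single-threaded
    bool ran = false;
};

int g_destroyed = 0;   // counts deletions so the example can be checked

void add_ref(ThreadInfo* p) { ++p->refs; }

void release(ThreadInfo* p) {
    if (--p->refs == 0) {
        ++g_destroyed;
        delete p;
    }
}

// C-style entry point: like thread_start_function, it only receives a void*.
void start_function(void* raw) {
    ThreadInfo* info = static_cast<ThreadInfo*>(raw);
    info->ran = true;  // safe only because a reference was taken on our behalf
    release(info);     // give that reference back when done
}

// Stand-in for the C API: it remembers the callback and the raw pointer.
struct PendingStart {
    void (*fn)(void*);
    void* arg;
};

// Returns how many times the object was destroyed.
int run_handoff() {
    ThreadInfo* info = new ThreadInfo;
    add_ref(info);                               // creator's reference, count == 1
    PendingStart pending{&start_function, info};
    add_ref(info);                               // reference for the callee, count == 2

    release(info);                               // creator lets go early, count == 1
    assert(g_destroyed == 0);                    // object is still alive for the callee

    pending.fn(pending.arg);                     // callee runs, then releases, count == 0
    return g_destroyed;
}
```

Without the second add_ref, the early release would delete the object and start_function would be handed a dangling pointer.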
I am writing code that utilizes COM interfaces. I am basing my code on examples that I have found online. I do not want to utilize smart pointers in this case because I want to understand the basics of COM and not just have a smart pointer class do all of the work for me.
In order to frame my questions, let's assume I have a class similar to the following:
class TestClass
{
private:
    IUnknown *m_pUnknown;
public:
    TestClass();
    void AssignValue();
};
TestClass::TestClass()
{
    m_pUnknown = NULL;
}
void TestClass::AssignValue()
{
    IUnknown *pUnknown = NULL;
    //Assign value to pUnknown here - not relevant to my questions
    m_pUnknown = pUnknown;
    pUnknown->Release();
}
Now on to my specific questions.
1) The examples I've seen do not use AddRef() when initializing a value, such as in the class constructor. Does the AddRef() happen "automatically" behind the scenes when a COM pointer is first assigned a value?
2) Although my code example does not show it, it is my understanding that in the AssignValue() method, when you assign a second value to overwrite the value of pUnknown (originally set in the class constructor), Release() is automatically called. After assigning the new value to pUnknown its reference count stands at zero. I need to call pUnknown->AddRef() immediately after the reassignment. Is my understanding correct?
Notes: I assume we are ignoring exceptions for simplicity here. If this was for real, you would want to use smart pointers to help keep things straight in the presence of exceptions. Similarly, I am not worrying about proper copying or destruction of instances of your example class or multi-threading. (Your raw pointers cannot be used from different threads as simply as you might assume.)
First, you need to make any necessary calls to COM yourself. The only way anything might happen "automatically" behind the scenes would be if you were using smart pointers to make those calls for you.
1) The examples you refer to have to be getting their COM interface pointers from somewhere. This would be by making COM calls, e.g., CoCreateInstance() and QueryInterface(). These calls are passed the address of your raw pointer and set that raw pointer to the appropriate value. If they weren't also implicitly AddRef'ed, the reference count might be 0 and COM could delete the associated COM object before your program could do anything about it. So such COM calls must include an implicit AddRef() on your behalf. You are responsible for a Release() to match this implicit AddRef() that you instigated with one of these other calls.
2a) Raw pointers are raw pointers. Their value is garbage until you arrange for them to be set to something valid. In particular, assigning a value to one will NOT auto-magically call a function. Assigning to a raw pointer to an interface does not call Release() - you need to do that at the appropriate time. In your post, it appears that you are "overwriting" a raw pointer that had previously been set to NULL, hence there was no existing COM interface instance in the picture. There could not have been an AddRef() on something that doesn't exist, and there must not be a Release() on something that isn't there.
2b) Some of the code you indicated by a comment in your example is very relevant, but can easily be inferred. You have a local raw pointer variable, pUnknown. In the absent code, you presumably use a COM call that obtains an interface pointer, implicitly AddRefs it, and fills in your raw pointer with the proper value to use it. This gives you the responsibility for one corresponding Release() when you are done with it.
Next, you set a member raw pointer variable (m_pUnknown) with this same value. Depending on the previous use of this member variable, you might have needed to call Release() with its former value before doing this.
You now have 2 raw pointers set to the value to work with this COM interface instance and responsibility for one Release() due to 1 implicit AddRef() call. There are two ways to deal with this, but neither is quite what you have in your sample.
The first, most straightforward, and proper approach (which others have correctly pointed out and which I skipped past in the first version of this answer) is one AddRef() and one Release() per pointer. Your code is missing this for m_pUnknown. This requires adding m_pUnknown->AddRef() immediately after the assignment to m_pUnknown and one corresponding call to Release() "someplace else" when you are done using the current interface pointer from m_pUnknown. One usual candidate for this "someplace else" in your code is the class destructor.
The second approach is more efficient, but less obvious. Even if you decide not to use it, you may see it, so should at least be aware of it. Following the first approach you would have the code sequence:
m_pUnknown = pUnknown;
m_pUnknown->AddRef();
pUnknown->Release();
Since pUnknown and m_pUnknown are set the same here, the Release() is immediately undoing the AddRef(). In this circumstance, eliding this AddRef/Release pair is reference count neutral and saves 2 round trips into COM. My mental model for this is a transfer of the interface and reference count from one pointer to the other. (With smart pointers it would look like newPtr.Attach( oldPtr.Detach() ); ) This approach leaves you with the original/not shown implicit AddRef() and needing to add the same m_pUnknown->Release() "someplace else" as in the first alternative.
In either approach, you exactly match AddRefs (implicit or explicit) with Releases for each interface and never go to a 0 reference count until you are done with the interface. Once you do hit 0, you do not attempt to use the value in the pointer.
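As a rough, portable illustration of the first approach, here is a sketch that swaps the real COM types for a hypothetical FakeUnknown class and a hypothetical CreateFake() factory (neither is a COM API), so the counts can be followed without Windows headers:

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object; real code would use IUnknown.
class FakeUnknown {
public:
    unsigned long AddRef() { return ++refs_; }
    unsigned long Release() {
        unsigned long left = --refs_;
        if (left == 0) delete this;   // the object frees itself at zero
        return left;
    }
    unsigned long Count() const { return refs_; }
private:
    unsigned long refs_ = 0;
};

// Mimics a COM call such as CoCreateInstance(): the pointer it hands back
// has already been AddRef'ed on the caller's behalf.
void CreateFake(FakeUnknown** out) {
    *out = new FakeUnknown;
    (*out)->AddRef();
}

class TestClass {
public:
    TestClass() : m_pUnknown(nullptr) {}
    ~TestClass() { if (m_pUnknown) m_pUnknown->Release(); }  // the "someplace else"
    TestClass(const TestClass&) = delete;
    TestClass& operator=(const TestClass&) = delete;

    void AssignValue() {
        FakeUnknown* pUnknown = nullptr;
        CreateFake(&pUnknown);                  // count == 1 (implicit AddRef)
        assert(pUnknown != nullptr);
        if (m_pUnknown) m_pUnknown->Release();  // let go of any previous object
        m_pUnknown = pUnknown;
        m_pUnknown->AddRef();                   // count == 2, member now owns it
        pUnknown->Release();                    // count == 1, local gives it up
    }
    unsigned long Count() const { return m_pUnknown ? m_pUnknown->Count() : 0; }
private:
    FakeUnknown* m_pUnknown;
};
```

Dropping both the m_pUnknown->AddRef() and the pUnknown->Release() lines gives the second approach: the count simply stays at 1 and ownership moves from the local to the member.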
Avi Berger already posted a great answer, but here is the same thing stated another way in case it helps with understanding.
In COM, reference counting is done within the COM object. The COM runtime will destruct and free an object whose reference count reaches 0. (This might be delayed by some time from the point of the count hitting 0).
Everything else is a convention. The usual convention amongst C++ COM programmers is that raw interface pointers should be treated as owning pointers. This concept means that any time a pointer points to a COM object, the pointer owns that object.
Using this terminology, the object may have multiple owners at any one time, and the object will be destroyed when nobody owns it.
However, raw pointers in C++ don't have ownership semantics built in. So you have to implement it yourself by making function calls:
Call AddRef on an interface pointer when that pointer takes ownership of an object. (You'll need to be aware of which Windows API functions or other library functions already do this, to avoid you doing it twice)
Call Release on an interface pointer when that pointer is about to stop owning an object.
The benefit of smart pointers is that they make it impossible for you to forget to call Release when an interface pointer stops owning an object. This includes the following cases:
Pointer goes out of scope.
Pointer is made to stop pointing to the object, by using assignment operator.
So, looking at your sample code. You have the pointer m_pUnknown. You want this pointer to take ownership of the object, so the code should be:
m_pUnknown = pUnknown;
m_pUnknown->AddRef();
You will also need to add code to your class destructor and your class assignment operator to call m_pUnknown->Release(). I would very strongly recommend wrapping these calls in the smallest class possible (that is, write your own smart pointer and make TestClass have that smart pointer as a member variable). Assuming of course you don't want to use an existing COM smart pointer class for pedagogical reasons.
The call pUnknown->Release(); is correct because pUnknown currently owns the object, and the pointer is about to stop owning the object due to the fact that it will be destroyed when the function block ends.
You may observe that it would be possible to remove both of the lines m_pUnknown->AddRef() and pUnknown->Release(). The code will behave exactly the same. However, it is better to follow the convention outlined above. Sticking to a convention helps you avoid errors, and it also helps other coders understand your code.
To put it another way, the usual convention is to think of the pointer as having a reference count of either 0 or 1, even though the reference counting is not actually implemented that way.
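If you do go the "smallest class possible" route, it might look roughly like this. RefHolder and Counted are made-up names for this sketch, and Counted only imitates the AddRef/Release part of a COM object; an existing class such as CComPtr does considerably more.

```cpp
#include <cassert>
#include <utility>

// Hypothetical fake object: starts with one reference, as if a factory
// call had already AddRef'ed it for the caller.
struct Counted {
    unsigned long refs = 1;
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() {
        unsigned long left = --refs;
        if (left == 0) delete this;
        return left;
    }
};

// Minimal owning wrapper: every live RefHolder accounts for one reference.
template <typename T>
class RefHolder {
public:
    RefHolder() = default;

    // Adopts a pointer that already carries a reference for us (no AddRef).
    static RefHolder adopt(T* p) {
        RefHolder h;
        h.p_ = p;
        return h;
    }

    // Copying means a second owner, so take a second reference.
    RefHolder(const RefHolder& other) : p_(other.p_) {
        if (p_) p_->AddRef();
    }

    // Copy-and-swap: the old value is released when 'other' is destroyed.
    RefHolder& operator=(RefHolder other) {
        std::swap(p_, other.p_);
        return *this;
    }

    // Going out of scope means this owner is done.
    ~RefHolder() {
        if (p_) p_->Release();
    }

    T* get() const { return p_; }

private:
    T* p_ = nullptr;
};
```

With a RefHolder member in TestClass, the destructor and assignment operator of TestClass need no hand-written Release() calls.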
First, my apologies. My attempt to simplify my code for the sake of clarity turned out to be misguided. However, I believe my questions were answered. If I may, I will summarize.
1) Any COM pointer that is assigned a value other than NULL needs to be immediately followed by AddRef() unless the AddRef() was implicitly handled (as is the case with some Windows API calls).
2) Any reassignment of value to a COM pointer, assuming that the "before" value is not NULL, must be immediately preceded by Release(). AddRef() would then be needed as mentioned in #1.
3) Any COM variable whose value needs to be preserved beyond its current scope requires that it have a reference count of at least 1 upon exiting its said scope. This may mean that an AddRef() is required.
Would this be a fair summary? Did I miss anything?
I'm trying to start a new thread using CreateThread and pass the pointee of a shared_ptr as an argument. If I were to use good old ATL, it would call CComPtr's operator (), increase the ref count, and eventually retrieve the pointer.
I'm trying to do the same using the standard library's shared_ptr, but I'm getting an error saying that no conversion from std::shared_ptr to LPVOID exists.
std::shared_ptr<X> m_pSettings = std::make_shared<X>(....);
if (nullptr == (m_hBackgroundThread = ::CreateThread(
nullptr, // default security attributes
0, // use default stack size
ThreadProc, // thread function name
m_pSettings, // argument to thread function
0, // use default creation flags
0)))
{
LOG_ERROR(L"Failed to create thread");
return false;
}
The correct answer, as you have already been told, is to use std::thread:
// Hey, look: No magic nullptrs and zeros nobody ever changes to non-defaults anyway!
std::thread t(ThreadProc, m_pSettings);
// Don't forget to
t.detach();
// if you let the std::thread object destruct without first joining it and making sure
// the thread finishes executing
Not only does std::thread solve your problem, it has a multitude of other benefits: it allows you to pass multiple parameters without fuss, it copies/moves/does-the-right-thing with the arguments you pass, and it doesn't require you to resort to hacks that are supposedly clever but actually are not. If you use std::async instead of std::thread, you can even get, in the caller thread, the exception that caused the thread function to unwind, if there was one.
The alternative of passing CreateThread the address of your shared_ptr and then dereferencing it inside the ThreadProc might work, if you're extremely careful - at the very least you have to make sure the caller's shared_ptr doesn't go away until the ThreadProc is done with it - but that's just asking for bad things to happen.
Sure, if you had no alternatives that's what you do, but given std::thread and friends doing anything else is wrong and a waste of time.
Also, your comparison to ATL seems to be founded on a misunderstanding. The difference is not that ATL's smart pointers are extra clever and can magically keep the reference count even when giving you the raw pointer (though I am unaware of a CComPtr::operator()), while the C++ standard library smart pointers are lame and cannot. The difference is between special objects that keep their own reference count and manage their own lifetimes, provided you declare when you use them (i.e. call AddRef), and arbitrary resources.
A smart pointer for COM (be it CComPtr or _com_ptr_t or any half-baked lame class you find online) is basically a wrapper around the COM object's innate capability to manage its own lifetime.
You could just as well have added AddRef, Release, etc. (with the associated semantics, obviously) to your class X and wrapped it in a COM smart pointer...
You could pass shared_ptr object by pointer like this: &m_pSettings
Then dereference it in your ThreadProc:
std::shared_ptr<X> pSettings = *static_cast<std::shared_ptr<X>*>(lpParam);
You will probably need to copy your shared_ptr before passing a pointer to it into CreateThread, so that the copy stays in scope while ThreadProc is running.
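One way to make that copy outlive the caller is to put it on the heap and let the thread procedure take it over. The sketch below is only an outline under assumptions: Settings stands in for X, ThreadProcLike has the shape of a Win32 thread procedure but uses portable types, and the procedure is called directly instead of going through CreateThread.

```cpp
#include <cassert>
#include <memory>

struct Settings { int verbosity = 3; };  // hypothetical stand-in for X

long g_seen_use_count = 0;  // recorded so the example can be checked

// Same shape as a Win32 thread procedure, minus the Windows-specific types.
unsigned long ThreadProcLike(void* param) {
    // Take ownership of the heap-allocated shared_ptr copy; it is deleted
    // when 'holder' goes out of scope, dropping that reference.
    std::unique_ptr<std::shared_ptr<Settings>> holder(
        static_cast<std::shared_ptr<Settings>*>(param));
    std::shared_ptr<Settings> settings = *holder;
    g_seen_use_count = settings.use_count();
    return static_cast<unsigned long>(settings->verbosity);
}

unsigned long launch(const std::shared_ptr<Settings>& settings) {
    // The heap copy keeps the object alive independently of the caller's pointer.
    auto* heapCopy = new std::shared_ptr<Settings>(settings);
    // A real program would pass heapCopy as the LPVOID argument of CreateThread
    // and delete it here if thread creation fails. This sketch calls directly.
    return ThreadProcLike(heapCopy);
}
```

The heap copy is what plays the role of the extra reference: the caller's shared_ptr can be reset or destroyed at any time without pulling the object out from under the thread.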
I routinely use the following primitive in some internal tables.
X const* find(Key const& key);
If the key is found, it returns a pointer to the found element; if not, it returns null.
I would like to do something similar with shared_ptr instead of naked pointer.
No problem, it works the same way more or less. shared_ptr has a default constructor which makes a "null" pointer, and it also has an operator which lets you evaluate the shared_ptr in a boolean context like an if condition. So when you have nothing to return, just say:
return shared_ptr<X>();
And to test it:
if (shared_ptr<X> ptr = myFunc()) {
// do something with *ptr
}
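Put together, a lookup along those lines could look like the following sketch. The Table class, its add() helper, and the use of std::string as the key type are assumptions made up for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct X { int value; };

// Hypothetical table that hands out shared ownership of its elements.
class Table {
public:
    void add(const std::string& key, int value) {
        items_[key] = std::make_shared<X>(X{value});
    }

    // Returns a null shared_ptr when the key is absent, mirroring the
    // raw-pointer version that returns null.
    std::shared_ptr<const X> find(const std::string& key) const {
        auto it = items_.find(key);
        if (it == items_.end()) {
            return std::shared_ptr<const X>();
        }
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<const X>> items_;
};
```

A caller can then write if (auto ptr = table.find(key)) { ... } exactly as in the test above.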
I don't know why you're insisting on returning a shared_ptr. shared_ptrs are tools for managing memory, and you can use them inside your function. However, it won't make any difference to the caller of your function whether you return a shared_ptr or a reference/raw pointer. (In an asynchronous context, there are many pitfalls.)
Also, shared_ptrs are based on a reference-counting mechanism, i.e. the object is deleted only when it is no longer referenced by anyone. So if you return one, make sure you are not storing it permanently, which would prevent its reference count from ever reaching 0.
I was wondering if it is standard practice in COM libraries to call AddRef on a COM interface that is returned from a function. For instance:
IXMLDOMElement* domElement = NULL;
document_->get_documentElement(&domElement); // does get_documentElement() call Addref on domElement?
// ...
// do something with domElement
// ..
domElement->Release(); // correct?
// (btw. member variable document_ is of type CComPtr<IXMLDOMDocument2>)
or with a smart pointer:
CComPtr<IXMLDOMElement> domElement;
document_->get_documentElement(&domElement);
Btw. I found that in the docs of MSXML for "Windows media 9 series" it says that Addref is called: http://msdn.microsoft.com/en-us/library/ms751196(v=vs.85).aspx
But in the official documentation nothing is mentioned about it:
http://msdn.microsoft.com/en-us/library/ms759095(v=vs.85).aspx
The function that returns an interface pointer must call AddRef() on it before exiting, not the function that is receiving the object. The function that receives the interface pointer must use it as-is and then call Release() on it. Which means that get_documentElement() will call AddRef(), so do not call it yourself.
The rules for who - the caller or the callee - is responsible for doing what in regards to reference counting and memory management in COM are clearly defined in COM's documentation on MSDN:
The Rules of the Component Object Model
Reference Counting Rules
Yes, you are supposed to AddRef before returning a COM object, as the caller is going to have a new interface pointer referencing the object, so the reference count needs to be increased by one. This is the rule, not the exception.
Documenting the internal AddRef is the exception, however, as reference counting is one of the fundamentals of COM. The documentation was probably written at a time when a lot of callers of this method did not know the rule and caused too many memory leaks.
When you, as a caller, no longer need the received object, you need to call Release directly or indirectly (e.g. through a class destructor), and stop using the reference pointer (many people set the pointer to null to prevent dangling pointers).
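Both sides of that rule can be sketched without MSXML. FakeDocument, FakeElement and get_root() below are invented for illustration and only imitate the shape of get_documentElement(); they are not real COM interfaces.

```cpp
#include <cassert>

// Hypothetical reference-counted element, created with one reference.
class FakeElement {
public:
    unsigned long AddRef() { return ++refs_; }
    unsigned long Release() {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
private:
    unsigned long refs_ = 1;  // the creator holds the first reference
};

// Hypothetical document that owns one reference to its root element.
class FakeDocument {
public:
    FakeDocument() : root_(new FakeElement) {}
    ~FakeDocument() { root_->Release(); }
    FakeDocument(const FakeDocument&) = delete;
    FakeDocument& operator=(const FakeDocument&) = delete;

    // Callee side of the rule: AddRef before handing the pointer out.
    void get_root(FakeElement** out) {
        root_->AddRef();
        *out = root_;
    }
private:
    FakeElement* root_;
};
```

On the caller side, the pointer received through get_root() is used as-is and then released exactly once; the document's own reference keeps the element alive afterwards.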
Consider the following example code which I have recently seen in our code base:
void ClassA::ExportAnimation(auto_ptr<CAnimation> animation)
{
    // ... does something
}
// calling method:
void classB::someMethod()
{
    auto_ptr<CAnimation> animation (new CAnimation(1,2));
    ClassA classAInstance;
    classAInstance.ExportAnimation(animation);
    // ... do some more stuff
}
I don't like this - and would rather write it so:
void ClassA::ExportAnimation(CAnimation* animation)
{
    // ... does something
}
// calling method:
void classB::someMethod()
{
    auto_ptr<CAnimation> animation (new CAnimation(1,2));
    ClassA classAInstance;
    classAInstance.ExportAnimation(animation.get());
    // ... do some more stuff
}
But is it really a problem?
It all depends on what ExportAnimation is and how it is implemented.
Does it only use the object for the duration of the call and then leaves it?
Then convert to a reference and pass a real reference. There is no need to pass ownership and the argument is not optional, so void ExportAnimation( CAnimation const & ) suffices. The advantage is that it is clear from the interface that there are no memory management issues with the method: it will just use the passed object and leave it as such. In this case, passing a raw pointer (as in your proposed code) is much worse than passing a reference, in that it is not clear whether ExportAnimation is or is not responsible for deleting the passed-in object.
Does it keep the object for later use?
This could be the case if the function starts a thread to export the animation in the background. In this case, it has to be clear that the lifetime of the argument must extend beyond the duration of the call. This can be solved by using shared_ptr, both in the function and outside of it, since it conveys that the object is shared and will be kept alive as long as required. Or else you can actually transfer ownership.
In the latter case, if transfer of ownership is performed, then the initial code is fine: the signature is explicit about the ownership transfer. Otherwise you can opt to document the behavior, change to a raw pointer, and make the transfer explicit by calling ExportAnimation( myAnimation.release() ).
You have added some concerns as a comment to another answer:
can I really see that object no longer exists after the method call?
The caller's auto_ptr is reset to 0 in the call, so any dereference will be an error and will be flagged in the first test you try.
I would need to look at the header file to see that the parameter type is an auto_ptr and not a normal pointer.
You do not need to look at the header... just try passing a raw pointer and the compiler will tell you that it requires an auto_ptr<> --There is no implicit conversion from raw pointer to auto_ptr.
I would expect the object to exist until the auto_ptr goes out of scope.
The standard auto_ptr, unlike boost::scoped_ptr, does not have those semantics. The ownership of the object can be released or passed to another auto_ptr, so the assumption that an object held in an auto_ptr lives for the whole scope of the auto_ptr is bad in itself.
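For what it's worth, the two cases can be sketched with std::unique_ptr, the standard replacement for auto_ptr (auto_ptr was deprecated in C++11 and removed in C++17). The free functions and the CAnimation members below are placeholders invented for the example, not the code base under discussion.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical animation type with two parameters, as in CAnimation(1,2).
struct CAnimation {
    int first;
    int second;
    CAnimation(int a, int b) : first(a), second(b) {}
};

// Case 1: the function only uses the object during the call.
// A const reference says "no ownership involved, argument not optional".
int ExportByReference(const CAnimation& animation) {
    return animation.first + animation.second;
}

// Case 2: the function takes over the object. The by-value unique_ptr
// parameter makes the transfer visible in the signature, and the caller
// has to write std::move to opt in.
int ExportTakingOwnership(std::unique_ptr<CAnimation> animation) {
    return animation->first + animation->second;
    // 'animation' is destroyed here unless it is handed on elsewhere
}
```

Unlike auto_ptr, unique_ptr cannot be copied silently, so the caller's pointer only becomes null after an explicit std::move at the call site.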
The auto_ptr unambiguously declares that the ownership of the pointer is passed on. The plain pointer isn't self-documenting.
What is the point of an auto_ptr if you only use its internals as a storage location?
Yes, pass it to the function. Or do away with it entirely, if you really don't want it. Presumably the function needs it to pass along ownership to something else.
It sounds like maybe the alternative you're looking for is much simpler:
void ClassA::ExportAnimation(CAnimation &animation) // no pointer
// calling method:
void classB::someMethod()
{
    CAnimation animation(1,2); // no pointer
    ClassA classAInstance;
    classAInstance.ExportAnimation(animation); // no ownership transfer
    // ... do some more stuff
    // object dies here, no earlier, no later
}
Passing the smart pointer to ExportAnimation clearly documents, and enforces, that ownership has been passed to the function, and there is no need for the caller to delete the animation. The function will also not need to explicitly delete the object, just let the pointer go out of scope.
Your suggestion leaves that ambiguous: should ExportAnimation delete the object you've passed via raw pointer? You'd need to check the function's documentation to know what the caller should do, and also check the implementation to make sure it's actually implemented as documented.
I would always recommend using smart pointers (and other RAII idioms) to make object lifetime explicit and automatic.