The question is strictly about std::function and not boost::function. See the Update section at the bottom of this question for more details, especially the part about it not being possible to compare non-empty std::function objects per the C++11 standard.
The C++11 std::function class template is great for maintaining a collection of callbacks. One can store them in a vector, for example, and invoke them when needed. However, maintaining these objects and allowing for unregistration seems to be impossible.
Let me be specific, imagine this class:
class Invoker
{
public:
    void Register(std::function<void()> f);
    void Unregister(std::function<void()> f);
    void InvokeAll();
private:
    // Some container that holds the function objects passed to Register()
};
Sample usage scenario:
void foo()
{
}
int main()
{
    std::function<void()> f1{foo};
    std::function<void()> f2{[] {std::cout << "Hello\n";} };
    Invoker inv;
    // The easy part
    // Register callbacks
    inv.Register(f1);
    inv.Register(f2);
    // Invoke them
    inv.InvokeAll();
    // The seemingly impossible part. How can Unregister() be implemented to actually
    // locate the correct object to unregister (i.e., remove from its container)?
    inv.Unregister(f2);
    inv.Unregister(f1);
}
It is fairly clear how the Register() function can be implemented. However, how would one go about implementing Unregister()? Let's say that the container that holds the function objects is vector<std::function<void()>>. How would you find a particular function object that is passed to the Unregister() call? std::function does supply an overloaded operator==, but that only tests for an empty function object (i.e., it cannot be used to compare two non-empty function objects to see if they both refer to the same actual invocation).
I would appreciate any ideas.
Update:
Ideas so far mainly consist of the addition of a cookie to be associated with each std::function object that can be used to unregister it. I was hoping for something that is not exogenous to the std::function object itself. Also, there seems to be much confusion between std::function and boost::function. The question is strictly about std::function objects, and not boost::function objects.
Also, you cannot compare two non-empty std::function objects for equality. The standard only provides operator== against nullptr, so comparing two std::function objects with == does not compile. So, links in the comments to solutions that do just that (and use boost::function objects to boot) are patently wrong in the context of this question.
Since you can't test for element identity in the container, it's probably best to use a container (such as std::list) whose iterators do not invalidate when the container is modified, and return iterators back to registering callers that can be used to unregister.
If you really want to use vector (or deque), you could return the integral index into the vector/deque when the callback is added. This strategy would naturally require you to make sure indexes are usable in this fashion to identify the function's position in the sequence. If callbacks and/or unregistration is rare, this could simply mean not reusing spots. Or, you could implement a free list to reuse empty slots. Or, only reclaim empty slots from the ends of the sequence and maintain a base index offset that is increased when slots are reclaimed off the beginning.
If your callback access pattern doesn't require random access traversal, storing the callbacks in a std::list and using raw iterators to unregister seems simplest to me.
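To make the std::list suggestion concrete, here is a minimal sketch. It reuses the Invoker name from the question, but the Handle alias and the changed signatures (Register() returning a handle, Unregister() taking one) are my own assumptions, not something the question specifies.

```cpp
#include <functional>
#include <list>
#include <utility>

// Hypothetical variant of the question's Invoker: Register() hands back a
// handle, and Unregister() takes that handle instead of a std::function.
class Invoker
{
public:
    using Callback = std::function<void()>;
    // A std::list iterator stays valid until its own element is erased.
    using Handle = std::list<Callback>::iterator;

    Handle Register(Callback f)
    {
        return callbacks_.insert(callbacks_.end(), std::move(f));
    }

    void Unregister(Handle h) { callbacks_.erase(h); }

    void InvokeAll()
    {
        for (auto& f : callbacks_)
            f();
    }

private:
    std::list<Callback> callbacks_;
};

// Usage: register two callbacks, remove the first, invoke again.
int demoListHandles()
{
    int total = 0;
    Invoker inv;
    auto h1 = inv.Register([&total] { total += 1; });
    inv.Register([&total] { total += 10; });
    inv.InvokeAll();     // both callbacks run: total == 11
    inv.Unregister(h1);
    inv.InvokeAll();     // only the second runs: total == 21
    return total;
}
```

Note that this sketch does not handle a callback that unregisters itself while InvokeAll() is iterating; that would need extra care.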
I have an idea for this.
Store the callbacks as std::weak_ptr<std::function<void(argtype1, argtype1)>>. Then the caller is responsible for keeping the corresponding std::shared_ptr alive, and all the caller has to do to unregister the callback is destroy all active std::shared_ptrs to the callback function.
When invoking callbacks, the code has to be careful to check for lock failures on the std::weak_ptr<>s it is using. When it runs across these it can remove them from its container of registered callbacks.
Note that this does not give complete thread safety, as the callback invoker can lock the std::weak_ptr and make a temporarily newly active std::shared_ptr of the callback function that can stay alive after the caller's std::shared_ptr goes out of scope.
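A rough sketch of the weak_ptr idea, assuming a void() signature for brevity. WeakInvoker and its member names are made up for illustration, and this single-threaded version ignores the thread-safety caveat mentioned above.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical invoker that stores only weak references to the callbacks.
class WeakInvoker
{
public:
    using Callback = std::function<void()>;

    // The caller keeps the shared_ptr; we only remember a weak_ptr to it.
    void Register(const std::shared_ptr<Callback>& f) { callbacks_.push_back(f); }

    void InvokeAll()
    {
        auto it = callbacks_.begin();
        while (it != callbacks_.end()) {
            if (auto f = it->lock()) {      // owner still holds the callback
                (*f)();
                ++it;
            } else {                        // lock failed: prune the entry
                it = callbacks_.erase(it);
            }
        }
    }

    std::size_t size() const { return callbacks_.size(); }

private:
    std::vector<std::weak_ptr<Callback>> callbacks_;
};

// Usage: dropping the shared_ptr is all it takes to unregister.
int demoWeakCallbacks()
{
    int calls = 0;
    WeakInvoker inv;
    auto keep = std::make_shared<WeakInvoker::Callback>([&calls] { ++calls; });
    auto drop = std::make_shared<WeakInvoker::Callback>([&calls] { calls += 100; });
    inv.Register(keep);
    inv.Register(drop);
    drop.reset();        // "unregisters" the second callback
    inv.InvokeAll();     // runs only the first callback and prunes the dead entry
    inv.InvokeAll();
    return calls;        // the first callback ran twice
}
```

A callback that registers new callbacks during InvokeAll() would invalidate the vector iterators here, so a real implementation would probably iterate over a copy.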
Related
I'm attempting to write an observer pattern in C++,
so I have a map that contains eventname -> vector of callback functions
The callback functions are stored in a vector as
std::function<void(void *)>
so the map looks like
std::unordered_map<std::string, std::vector<std::function<void(void *)>>>
I can add listeners to the vector and receive and respond to event notifications. My problem is with implementing detach.
std::function objects can't be compared, so erase/remove is out, and I wanted to search the vector and compare manually.
I found that this question had success using std::function::target with getting access to underlying pointers, but I can't use this, since I'm using std::bind to initialize the callback:
std::function<void(void *)> fnCallback = std::bind(&wdog::onBark, this, std::placeholders::_1);
I just want to compare the underlying member function ptrs, or even comparison with the underlying object ptr to which the member function is associated. Is there any way?
I'd like to avoid wrapping the member fn ptr in an object that contains a hash, although it looks like I might have to go that way...
When I do this, I return a token to the listening code.
My usual pattern is to have a weak_ptr<function<sig>> in the broadcaster, and the token is a shared_ptr<void> (to the stored weak ptr).
When broadcasting, I filter out dead targets, then broadcast. Targets deregister by simply clearing, destroying or otherwise discarding their token. A vector of tokens in their instance is a reasonable way if they want to never deregister.
If you never broadcast, this can lead to old dead resources hanging around needlessly, so in a public framework I might want something with better guarantees. But otherwise it is lightweight, easy and fast.
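Roughly, the token pattern could look like the sketch below. Broadcaster, listen(), broadcast() and the Listener struct are names I picked for illustration; only the weak_ptr<function<sig>> storage and the shared_ptr<void> token come from the description above.

```cpp
#include <algorithm>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// The token is an opaque owner of the stored callback.
using Token = std::shared_ptr<void>;

// Hypothetical broadcaster holding weak_ptr<function<sig>> entries.
template <typename... Args>
class Broadcaster
{
public:
    using Callback = std::function<void(Args...)>;

    Token listen(Callback cb)
    {
        auto owned = std::make_shared<Callback>(std::move(cb));
        targets_.push_back(owned);   // stored as weak_ptr<Callback>
        return owned;                // converts to shared_ptr<void>
    }

    void broadcast(Args... args)
    {
        // Filter out targets whose token has been discarded.
        targets_.erase(
            std::remove_if(targets_.begin(), targets_.end(),
                           [](const std::weak_ptr<Callback>& w) { return w.expired(); }),
            targets_.end());
        // Iterate over a copy so listeners may call listen() while we broadcast.
        auto snapshot = targets_;
        for (auto& w : snapshot)
            if (auto cb = w.lock())
                (*cb)(args...);
    }

private:
    std::vector<std::weak_ptr<Callback>> targets_;
};

// Hypothetical listener, bound with std::bind as in the question.
struct Listener
{
    int barks = 0;
    void onBark(void*) { ++barks; }
};

int demoTokens()
{
    Broadcaster<void*> bark;
    Listener a, b;
    Token ta = bark.listen(std::bind(&Listener::onBark, &a, std::placeholders::_1));
    Token tb = bark.listen(std::bind(&Listener::onBark, &b, std::placeholders::_1));
    bark.broadcast(nullptr);   // both listeners are called
    tb.reset();                // b detaches by discarding its token
    bark.broadcast(nullptr);   // only a is called
    return a.barks * 10 + b.barks;
}
```

Because the token type is shared_ptr<void>, listening code does not need to know the callback signature to store or discard it.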
I am currently experimenting with writing an event queue in C++11. I am using std::bind to obtain std::function objects which are called when certain events happen. The code for this roughly looks like this:
class A
{
public:
    void handle();
};
class B { ... };
// Later on, somewhere else...
std::vector< std::function< void() > > functions;
A a;
B b;
functions.push_back( std::bind( &A::handle, &a ) );
functions.push_back( std::bind( &B::handle, &b ) );
// Even later:
for( auto&& f : functions )
    f(); // <--- How do I know whether f is still "valid"?
Is there any way to guarantee the validity of the function object so that I can avoid stumbling over undefined behaviour here?
I have already taken a look at this question here, std::function to member function of object and lifetime of object, but it only discussed whether deleting a pointer to a bound object raises undefined behaviour. I am more interested in how to handle the destruction of such an object. Is there any way to detect this?
EDIT: To clarify, I know that I cannot guarantee a lifetime for non-static, non-global objects. It would be sufficient to be notified about their destruction so that the invalid function objects can be removed.
As @Joachim has stated, no lifetime is associated with the member function (it's code, not data). So you're asking whether there is a way to know if the object still exists before executing the callback.
You have to build a sort of framework in which the object's destructor notifies the container when the object is destroyed, so that the container can remove it from its "observers" (the vector containing all the objects). To do that, the object must store a pointer to the container in its instance.
UPDATE
@Jason talks about the use of shared_ptr. It's okay to use them, but in this case that does not address HOW to destroy an object that is linked in another object's notification list. shared_ptr postpones the destruction of an instance until all "managed" references to it are deleted. But if you need to destroy object A AND delete all references to it because that object MUST be deleted, you have to look into all containers that store a shared_ptr to it and remove it, which is a very painful activity. The simplest solution (whether you use raw pointers or shared_ptr, if you can use them, is irrelevant) is a two-way link between the observer and the observed, so that each one can notify the other of its destruction. How to store this information? There are many ways to accomplish it: hash tables, slots in the observer, etc.
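A minimal sketch of such a two-way link, using raw pointers. EventQueue, Handler and all member names are assumptions for illustration; the queue stores object pointers rather than std::function objects to keep the example short.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

class EventQueue;

// Hypothetical observer: registers itself on construction and
// unregisters itself in its destructor.
class Handler
{
public:
    explicit Handler(EventQueue& q);
    ~Handler();
    Handler(const Handler&) = delete;
    Handler& operator=(const Handler&) = delete;

    void handle() { ++handled; }
    // Called by the queue when the queue itself is destroyed first.
    void queueDestroyed() { queue_ = nullptr; }

    int handled = 0;

private:
    EventQueue* queue_;
};

// Hypothetical observed side: knows its handlers and tells them when it dies.
class EventQueue
{
public:
    ~EventQueue()
    {
        for (Handler* h : handlers_)
            h->queueDestroyed();
    }
    void add(Handler* h) { handlers_.push_back(h); }
    void remove(Handler* h)
    {
        handlers_.erase(std::remove(handlers_.begin(), handlers_.end(), h),
                        handlers_.end());
    }
    void dispatch()
    {
        for (Handler* h : handlers_)
            h->handle();
    }
    std::size_t size() const { return handlers_.size(); }

private:
    std::vector<Handler*> handlers_;
};

inline Handler::Handler(EventQueue& q) : queue_(&q) { q.add(this); }
inline Handler::~Handler() { if (queue_) queue_->remove(this); }

int demoTwoWayLink()
{
    EventQueue queue;
    Handler a(queue);
    {
        Handler b(queue);
        queue.dispatch();   // calls a and b
    }                       // b's destructor removes it from the queue
    queue.dispatch();       // calls only a; no dangling pointer is touched
    return a.handled * 10 + static_cast<int>(queue.size());
}
```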
One hack/workaround that achieves the desired result would be to use a parameter of type std::shared_ptr. When the bind is destructed, so is the shared pointer - which will do the right thing when it is the last reference. However this involves changes to the signature used. To make it slightly less awkward, you can use static methods that take in a std::shared_ptr this - sort of like the self parameter concept in python, if you are familiar.
Or if you are fine with C++11, you can just use a lambda capture of the shared pointer.
You'd need to dynamically allocate the instances to use this method.
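Here is a small sketch of the lambda-capture variant. The struct A mirrors the question's class; the handled counter and the weak_ptr used to observe the lifetime are additions of mine for demonstration only.

```cpp
#include <functional>
#include <memory>
#include <vector>

// Stand-in for the question's class A, with a counter added for the demo.
struct A
{
    int handled = 0;
    void handle() { ++handled; }
};

int demoSharedCapture()
{
    std::vector<std::function<void()>> functions;
    std::weak_ptr<A> observer;   // only used to observe the object's lifetime
    {
        auto a = std::make_shared<A>();
        observer = a;
        // The lambda owns a copy of the shared_ptr, so the A instance lives
        // at least as long as the stored std::function.
        functions.push_back([a] { a->handle(); });
    }   // the local shared_ptr is gone, but the object is still alive

    for (auto& f : functions)
        f();                     // safe: the captured pointer keeps A valid

    int handled = observer.lock()->handled;
    functions.clear();           // destroying the callbacks releases the object
    return observer.expired() ? handled : -1;
}
```

The trade-off is that the object cannot be destroyed while a callback referring to it is still registered, which may or may not be what you want.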
With respect to smart pointers and new C++11/14 features, I am wondering what the best-practice return values and function parameter types would be for classes that have these facilities:
A factory function (outside of the class) that creates objects and returns them to users of the class. (For example opening a document and returning an object that can be used to access the content.)
Utility functions that accept objects from the factory functions, use them, but do not take ownership. (For example a function that counts the number of words in the document.)
Functions that keep a reference to the object after they return (like a UI component that takes a copy of the object so it can draw the content on the screen as needed.)
What would the best return type be for the factory function?
If it's a raw pointer the user will have to delete it correctly which is problematic.
If it returns a unique_ptr<> then the user can't share it if they want to.
If it's a shared_ptr<> then will I have to pass around shared_ptr<> types everywhere? This is what I'm doing now and it's causing problems as I'm getting cyclic references, preventing objects from being destroyed automatically.
What is the best parameter type for the utility function?
I imagine passing by reference will avoid incrementing a smart pointer reference count unnecessarily, but are there any drawbacks of this? The main one that comes to mind is that it prevents me from passing derived classes to functions taking parameters of the base-class type.
Is there some way that I can make it clear to the caller that it will NOT copy the object? (Ideally so that the code will not compile if the function body does try to copy the object.)
Is there a way to make it independent of the type of smart pointer in use? (Maybe taking a raw pointer?)
Is it possible to have a const parameter to make it clear the function will not modify the object, without breaking smart pointer compatibility?
What is the best parameter type for the function that keeps a reference to the object?
I'm guessing shared_ptr<> is the only option here, which probably means the factory class must return a shared_ptr<> also, right?
Here is some code that compiles and hopefully illustrates the main points.
#include <iostream>
#include <memory>
#include <string>
struct Document {
    std::string content;
};
struct UI {
    std::shared_ptr<Document> doc;
    // This function is not copying the object, but holding a
    // reference to it to make sure it doesn't get destroyed.
    void setDocument(std::shared_ptr<Document> newDoc) {
        this->doc = newDoc;
    }
    void redraw() {
        // do something with this->doc
    }
};
// This function does not need to take a copy of the Document, so it
// should access it as efficiently as possible. At the moment it
// creates a whole new shared_ptr object which I feel is inefficient,
// but passing by reference does not work.
// It should also take a const parameter as it isn't modifying the
// object.
int charCount(std::shared_ptr<Document> doc)
{
    // I realise this should be a member function inside Document, but
    // this is for illustrative purposes.
    return doc->content.length();
}
// This function is the same as charCount() but it does modify the
// object.
void appendText(std::shared_ptr<Document> doc)
{
    doc->content.append("hello");
    return;
}
// Create a derived type that the code above does not know about.
struct TextDocument: public Document {};
std::shared_ptr<TextDocument> createTextDocument()
{
    return std::shared_ptr<TextDocument>(new TextDocument());
}
int main(void)
{
    UI display;
    // Use the factory function to create an instance. As a user of
    // this class I don't want to have to worry about deleting the
    // instance, but I don't really care what type it is, as long as
    // it doesn't stop me from using it the way I need to.
    auto doc = createTextDocument();
    // Share the instance with the UI, which takes a copy of it for
    // later use.
    display.setDocument(doc);
    // Use a free function which modifies the object.
    appendText(doc);
    // Use a free function which doesn't modify the object.
    std::cout << "Your document has " << charCount(doc)
              << " characters.\n";
    return 0;
}
I'll answer in reverse order so to begin with the simple cases.
Utility functions that accept objects from the factory functions, use them, but do not take ownership. (For example a function that counts the number of words in the document.)
If you are calling a factory function, you are always taking ownership of the created object by the very definition of a factory function. I think what you mean is that some other client first obtains an object from the factory and then wishes to pass it to the utility function that does not take ownership itself.
In this case, the utility function should not care at all how ownership of the object it operates on is managed. It should simply accept a (probably const) reference or – if “no object” is a valid condition – a non-owning raw pointer. This will minimize the coupling between your interfaces and make the utility function most flexible.
Functions that keep a reference to the object after they return (like a UI component that takes a copy of the object so it can draw the content on the screen as needed.)
These should take a std::shared_ptr by value. This makes it clear from the function's signature that they take shared ownership of the argument.
Sometimes, it can also be meaningful to have a function that takes unique ownership of its argument (constructors come to mind). Those should take a std::unique_ptr by value (or by rvalue reference) which will also make the semantics clear from the signature.
A factory function (outside of the class) that creates objects and returns them to users of the class. (For example opening a document and returning an object that can be used to access the content.)
This is the difficult one, as there are good arguments for both std::unique_ptr and std::shared_ptr. The only thing that is clear is that returning an owning raw pointer is no good.
Returning a std::unique_ptr is lightweight (no overhead compared to returning a raw pointer) and conveys the correct semantics of a factory function. Whoever called the function obtains exclusive ownership over the fabricated object. If needed, the client can construct a std::shared_ptr out of a std::unique_ptr at the cost of a dynamic memory allocation.
On the other hand, if the client is going to need a std::shared_ptr anyway, it would be more efficient to have the factory use std::make_shared to avoid the additional dynamic memory allocation. Also, there are situations where you simply must use a std::shared_ptr, for example if the destructor of the managed object is non-virtual and the smart pointer is to be converted to a smart pointer to a base class. But a std::shared_ptr has more overhead than a std::unique_ptr, so if the latter is sufficient, we would rather avoid the former.
So in conclusion, I'd come up with the following guideline:
If you need a custom deleter, return a std::shared_ptr.
Else, if you think that most of your clients are going to need a std::shared_ptr anyway, utilize the optimization potential of std::make_shared.
Else, return a std::unique_ptr.
Of course, you could avoid the problem by providing two factory functions, one that returns a std::unique_ptr and one that returns a std::shared_ptr so each client can use what best fits its needs. If you need this frequently, I guess you can abstract most of the redundancy away with some clever template meta-programming.
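To illustrate the two factory flavours and the unique_ptr to shared_ptr conversion, here is a short sketch based on the Document struct from the question. The function names openDocument and openSharedDocument are invented for this example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

struct Document
{
    std::string content;
};

// Hypothetical factory returning exclusive ownership.
std::unique_ptr<Document> openDocument(std::string text)
{
    // Plain new, because std::make_unique only arrived in C++14.
    std::unique_ptr<Document> doc(new Document());
    doc->content = std::move(text);
    return doc;
}

// Hypothetical factory for clients that will share the object anyway;
// make_shared allocates the object and its control block together.
std::shared_ptr<Document> openSharedDocument(std::string text)
{
    auto doc = std::make_shared<Document>();
    doc->content = std::move(text);
    return doc;
}

// Utility function: does not care how the Document is owned.
std::size_t charCount(const Document& doc) { return doc.content.length(); }

std::size_t demoFactory()
{
    auto unique = openDocument("hello");
    // Ownership moves into the shared_ptr; 'unique' is left empty.
    std::shared_ptr<Document> shared = std::move(unique);
    auto other = openSharedDocument("hi");
    return charCount(*shared) + charCount(*other) + (unique ? 100 : 0);
}
```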
What would the best return type be for the factory function?
unique_ptr would be best. It prevents accidental leaks, and the user can release ownership from the pointer, or transfer ownership to a shared_ptr (which has a constructor for that very purpose), if they want to use a different ownership scheme.
What is the best parameter type for the utility function?
A reference, unless the program flow is so convoluted that the object might be destroyed during the function call, in which case shared_ptr or weak_ptr. (In either case, it can refer to a base class, and add const qualifiers, if you want that.)
What is the best parameter type for the function that keeps a reference to the object?
shared_ptr or unique_ptr, if you want it to take responsibility for the object's lifetime and not otherwise worry about it. A raw pointer or reference, if you can (simply and reliably) arrange for the object to outlive everything that uses it.
Most of the other answers cover this, but @T.C. linked to a few really good guidelines which I'd like to summarise here:
Factory function
A factory that produces a reference type should return a unique_ptr by default, or a shared_ptr if ownership is to be shared with the factory.
-- GotW #90
As others have pointed out, you as the recipient of the unique_ptr can convert it to a shared_ptr if you wish.
Function parameters
Don’t pass a smart pointer as a function parameter unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership.
Prefer passing objects by value, *, or &, not by smart pointer.
-- GotW #91
This is because when you pass by smart pointer, you increment the reference counter at the start of the function, and decrement it at the end. These are atomic operations, which require synchronisation across multiple threads/processors, so in heavily multithreaded code the speed penalty can be quite high.
When you're in the function the object is not going to disappear because the caller still holds a reference to it (and can't do anything with the object until your function returns) so incrementing the reference count is pointless if you're not going to keep a copy of the object after the function returns.
For functions that don't take ownership of the object:
Use a * if you need to express null (no object), otherwise prefer to use a &; and if the object is input-only, write const widget* or const widget&.
-- GotW #91
This doesn't force your caller to use a particular smart pointer type - any smart pointer can be converted into a normal pointer or a reference. So if your function doesn't need to keep a copy of the object or take ownership of it, use a raw pointer. As above, the object won't disappear in the middle of your function because the caller is still holding on to it (except in special circumstances, which you would already be aware of if this is an issue for you.)
For functions that do take ownership of the object:
Express a “sink” function using a by-value unique_ptr parameter.
void f( unique_ptr<widget> );
-- GotW #91
This makes it clear the function takes ownership of the object, and raw pointers that you might have from legacy code can still be passed to it by explicitly wrapping them in a unique_ptr.
For functions that take shared ownership of the object:
Express that a function will store and share ownership of a heap object using a by-value shared_ptr parameter.
-- GotW #91
I think these guidelines are very useful. Read the pages the quotes came from for more background and in-depth explanation, it's worth it.
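As a quick illustration of how the quoted guidelines translate into signatures, here is a sketch. Widget, Display and all function names are hypothetical and not taken from the GotW articles.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

struct Widget
{
    std::string name;
};

// Non-owning, input-only: const reference.
std::size_t nameLength(const Widget& w) { return w.name.size(); }

// Non-owning, "no object" allowed: raw pointer that may be null.
std::size_t nameLengthOrZero(const Widget* w) { return w ? w->name.size() : 0; }

// Sink: takes unique ownership, the widget is destroyed when this returns.
std::string consume(std::unique_ptr<Widget> w) { return w->name; }

// Shared ownership: takes a shared_ptr by value and keeps it.
struct Display
{
    std::shared_ptr<Widget> current;
    void show(std::shared_ptr<Widget> w) { current = std::move(w); }
};

int demoParameters()
{
    std::unique_ptr<Widget> owned(new Widget{"abc"});
    auto shared = std::make_shared<Widget>();
    shared->name = "de";

    int total = 0;
    total += static_cast<int>(nameLength(*owned));               // works with any owner type
    total += static_cast<int>(nameLengthOrZero(shared.get()));   // raw pointer from a smart pointer
    total += static_cast<int>(nameLengthOrZero(nullptr));        // "no object" case

    Display display;
    display.show(shared);                                        // display now co-owns the widget
    total += static_cast<int>(shared.use_count());               // local 'shared' plus display.current

    total += static_cast<int>(consume(std::move(owned)).size()); // ownership handed over explicitly
    return total;
}
```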
I would return a unique_ptr by value in most situations. Most resources shouldn't be shared, since that makes it hard to reason about their lifetimes. You can usually write your code in such a way to avoid shared ownership. In any case, you can make a shared_ptr from the unique_ptr, so it's not like you're limiting your options.
I have my own multi-threaded service which handles some commands. The service consists of a command parser, worker threads with queues, and some caches. I don't want to keep an eye on each object's life-cycle, so I use shared_ptr's very extensively. Every component uses shared_ptr's in its own way:
command parser creates shared_ptr's and stores them in cache;
worker binds shared_ptr's to functors and puts them to queue.
cache temporarily or permanently holds some shared_ptr's.
the data that is referenced by shared_ptr can also hold some other shared_ptr's.
And there is another underlying service (for example, command receiver and sender) that has the same structure, but uses its own cache, workers and shared_ptr's. It's independent from my service and is maintained by another developer.
It's a complete nightmare, when I try to track all shared_ptr dependencies to prevent cross-references.
Is there a way to specify some shared_ptr "interface" or "policy", so that I know which shared_ptr's I can safely pass to the underlying service without inspecting its code or talking to its developer? The policy should describe the shared_ptr ownership period: for example, the worker holds the functor with the bound shared_ptr from the dispatch() call until some other function call, while the cache holds the shared_ptr from the cache's constructor call until the cache's destructor call.
I'm especially curious about the shutdown situation, when the application may freeze while waiting for the threads to join.
There is no silver bullet... and shared_ptr certainly is not one.
My first question would be: do you need all those shared pointers ?
The best way to avoid cyclic references is to define the lifetime policy of each object and make sure they are compatible. This can be easily documented:
you pass me a reference, I expect the object to live throughout the function call, but no more
you pass me a unique_ptr, I am now responsible for the object
you pass me a shared_ptr, I expect to be able to keep a handle to the object myself without adversely affecting you
Now, there are rare situations where the use of shared_ptr is indeed necessary. The mention of caches leads me to think that this might be your case, at least for some uses.
In this case, you can (at least informally) enforce a layering approach.
Define a number of layers, from 0 (the base) to infinity
Each type of object is ascribed to a layer, several types may share the same layer
An object of type A might only hold a shared_ptr to an object of type B if, and only if, Layer(A) > Layer(B)
Note that we expressly forbid sibling relationships. With this scheme, no circle of references can ever be formed. Indeed, we obtain a DAG (Directed Acyclic Graph).
Now, when a type is created, it must be ascribed a layer number, and this must be documented (preferably in the code).
An object may change layer, however:
if its layer number decreases, then you must reexamine the references it holds (easy)
if its layer number increases, then you must reexamine all the references to it (hard)
Note: by convention, types of objects which cannot hold any reference are usually in the layer 0.
Note 2: I first stumbled upon this convention in an article by Herb Sutter, where he applied it to mutexes to prevent deadlock. This is an adaptation to the current issue.
This can be enforced a bit more automatically (by the compiler) as long as you are ready to rework your existing code base.
We create a new SharedPtr class aware of our layering scheme:
template <typename T>
constexpr unsigned getLayer() { return T::Layer; }
template <typename T, unsigned L>
class SharedPtrImpl {
public:
    explicit SharedPtrImpl(T* t): _p(t)
    {
        // std::declval cannot be used in an evaluated context, so the
        // layer is read from the type alone.
        static_assert(L > getLayer<T>(), "Layering Violation");
    }
    T* get() const { return _p.get(); }
    T& operator*() const { return *this->get(); }
    T* operator->() const { return this->get(); }
private:
    std::shared_ptr<T> _p;
};
Each type that may be held in such a SharedPtr is given its layer statically, and we use a base class to help us out:
template <unsigned L>
struct LayerMember {
    static unsigned const Layer = L;
    template <typename T>
    using SharedPtr = SharedPtrImpl<T, L>;
};
And now, we can easily use it:
class Foo: public LayerMember<3> {
public:
private:
    SharedPtr<Bar> _bar; // statically checked!
};
However, this coding approach is a little more involved; I think that the convention may well be sufficient ;)
You should look at weak_ptr. It complements shared_ptr but does not keep objects alive, so is very useful when you might have circular references.
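For example, a back-reference can be made weak so that the two objects no longer keep each other alive. The sketch below borrows the Cache and Worker names from the question, but the members, the Tracked helper and the liveObjects counter are assumptions added only to make the effect observable.

```cpp
#include <memory>

// Counts live objects so the demo can check that nothing leaked.
int liveObjects = 0;

struct Tracked
{
    Tracked() { ++liveObjects; }
    ~Tracked() { --liveObjects; }
};

struct Cache;

struct Worker : Tracked
{
    // Non-owning back-reference: does not keep the Cache alive.
    std::weak_ptr<Cache> cache;
};

struct Cache : Tracked
{
    // Owning reference: the Cache keeps its Worker alive.
    std::shared_ptr<Worker> worker;
};

int demoWeakBackReference()
{
    bool reachable = false;
    {
        auto cache = std::make_shared<Cache>();
        cache->worker = std::make_shared<Worker>();
        cache->worker->cache = cache;   // would form a cycle if this were a shared_ptr

        // While the Cache is alive, the Worker can still reach it via lock().
        reachable = (cache->worker->cache.lock() == cache);
    }   // both objects are destroyed here because there is no ownership cycle
    return reachable ? liveObjects : -1;
}
```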
I often come across the problem that I have a class with a pair of Register/Unregister-style methods, e.g.:
class Log {
public:
    void AddSink( ostream & Sink );
    void RemoveSink( ostream & Sink );
};
This applies to several different cases, like the Observer pattern or related stuff. My concern is: how safe is that? From a previous question I know that I cannot safely derive object identity from that reference. This approach returns an iterator to the caller, which they have to pass to the unregister method, but this exposes implementation details (the iterator type), so I don't like it. I could return an integer handle, but that would require a lot of extra internal management (what is the smallest free handle?). How do you go about this?
You are safe unless the client object has two derivations of ostream without using virtual inheritance.
In short, that is the fault of the user -- they should not be multiply inheriting an interface class twice in two different ways.
Use the address and be done with it. In these cases, I take a pointer argument rather than a reference to make it explicit that I will store the address. It also prevents implicit conversions that might kick in if you decided to take a const reference.
class Log {
public:
    void AddSink( ostream* Sink );
    void RemoveSink( ostream* Sink );
};
You can create an RAII object that calls AddSink in the constructor, and RemoveSink in the destructor to make this pattern exception-safe.
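Something along these lines, as a sketch. The Log bodies, the Write() method and the ScopedSink class are my own assumptions; only the AddSink/RemoveSink pointer signatures come from the class above.

```cpp
#include <algorithm>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Assumed implementation of the Log interface shown above; sinks are
// identified purely by their address.
class Log
{
public:
    void AddSink(std::ostream* sink) { sinks_.push_back(sink); }
    void RemoveSink(std::ostream* sink)
    {
        sinks_.erase(std::remove(sinks_.begin(), sinks_.end(), sink), sinks_.end());
    }
    // Hypothetical method, added only so the demo has something to observe.
    void Write(const std::string& msg)
    {
        for (std::ostream* s : sinks_)
            *s << msg;
    }

private:
    std::vector<std::ostream*> sinks_;
};

// RAII guard: registers the sink on construction, removes it on destruction,
// including when the scope is left through an exception.
class ScopedSink
{
public:
    ScopedSink(Log& log, std::ostream& sink) : log_(log), sink_(&sink)
    {
        log_.AddSink(sink_);
    }
    ~ScopedSink() { log_.RemoveSink(sink_); }
    ScopedSink(const ScopedSink&) = delete;
    ScopedSink& operator=(const ScopedSink&) = delete;

private:
    Log& log_;
    std::ostream* sink_;
};

std::string demoScopedSink()
{
    Log log;
    std::ostringstream out;
    {
        ScopedSink guard(log, out);
        log.Write("a");      // reaches 'out'
    }                        // guard removes the sink here
    log.Write("b");          // no sinks registered any more
    return out.str();
}
```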
You could manage your objects using smart pointers and compare the pointers for equality inside your register / deregister functions.
If you only have stack-allocated objects that are never copied between a register and a deregister call, you could also pass a pointer instead of the reference.
You could also do:
typedef iterator handle_t;
and hide the fact that you're giving out internal iterators, if exposing internal data structures worries you.
In your previous question, Konrad Rudolph posted an answer (that you did not accept but has the highest score), saying that everything should be fine if you use base class pointers, which you appear to do.