I am writing a program, the bulk of which is platform-independent code compiled as a static library. I am then writing a platform-specific program, using that platform's APIs, which links against the static library. Calling functions in the library from the actual program is obviously easy; however, I am unsure of the best way (most efficient / cleanest implementation) for the library to communicate back to the program.
I have looked into a signals / slots approach, but I am worried about its performance (ideally some draw calls would go through this path). Otherwise, the only other method I can think of is some form of callback using a functor implementation, which should be quicker, but is this the best design?
EDIT: My main aims / requirements are performance and ease of implementation. The library is written in C++ and already uses Boost, so Boost signals are a possibility, but is their performance acceptable?
Here are your options from (roughly) most flexible to least:
Signals & slots
Several signals and slots implementations are listed here (notably Boost.Signals). These are useful for implementing the Observer design pattern, where more than one object is interested in receiving notifications.
Boost.Function
You can register a boost::function callback. boost::function is a wrapper around any callable entity: free function, static function, member function, or function object. To wrap a member function, you use boost::bind as shown in this example. Usage example:
#include <iostream>
#include <boost/function.hpp>
#include <boost/bind.hpp>
typedef boost::function<void (void)> MouseCallback;
class Mouse
{
public:
void registerCallback(MouseCallback callback) {callback_ = callback;}
void notifyClicked() {if (callback_) callback_();}
private:
MouseCallback callback_;
};
class Foo
{
public:
void mouseClicked() {std::cout << "Mouse clicked!";}
};
int main()
{
Mouse mouse;
Foo foo;
mouse.registerCallback(boost::bind(&Foo::mouseClicked, &foo));
mouse.notifyClicked();
}
Fast Delegate
There is a delegate implementation, called FastDelegate that is faster than boost::function. It uses an "ugly hack" that is not supported by the C++ standard, but is supported by practically all compilers.
There is also The Impossibly Fast C++ Delegates that is supported by the standard, but not by all compilers.
"Listener" interfaces (abstract classes)
As c-smile suggested, you can register a pointer to an object derived from a callback interface (abstract class). This is the traditional Java way of doing callbacks. Example:
class MouseInputListener
{
public:
virtual ~MouseInputListener() {} // so deleting through the interface is safe
virtual void mouseClicked() = 0;
virtual void mouseReleased() = 0;
};
class Mouse
{
public:
Mouse() : listener_(0) {}
void registerListener(MouseInputListener* listener) {listener_ = listener;}
void notifyClicked() {if (listener_) listener_->mouseClicked();}
void notifyReleased() {if (listener_) listener_->mouseReleased();}
private:
MouseInputListener* listener_;
};
class Foo : public MouseInputListener
{
public:
virtual void mouseClicked() {std::cout << "Mouse clicked!";}
virtual void mouseReleased() {std::cout << "Mouse released!";}
};
C-style callbacks
You register a pointer to a callback free function, plus an additional "context" void pointer. In the callback function, you cast the void* to the object type that will handle the event, and invoke the proper method. For example:
typedef void (*MouseCallback)(void* context); // Callback function pointer type
class Mouse
{
public:
Mouse() : callback_(0), context_(0) {}
void registerCallback(MouseCallback callback, void* context = 0)
{callback_ = callback; context_ = context;}
void notifyClicked() {if (callback_) callback_(context_);}
private:
MouseCallback callback_;
void* context_;
};
class Foo
{
public:
void mouseClicked() {std::cout << "Mouse clicked!";}
static void callback(void* context)
{static_cast<Foo*>(context)->mouseClicked();}
};
int main()
{
Mouse mouse;
Foo foo;
mouse.registerCallback(&Foo::callback, &foo);
mouse.notifyClicked();
}
Benchmarks
I have found some performance benchmarks:
http://www.ce.unipr.it/~medici/boostbenchmark.html
http://www.kbasm.com/cpp-callback-benchmark.html
They should give you an idea of what callback mechanism is appropriate.
As you can see from the numbers, you have to be invoking Boost signals 10,000 to 100,000 times per second before performance even becomes an issue.
My Recommendation
If callbacks will not be invoked at a blazingly high rate (10-100 thousand times per second), then use boost::signal for maximum flexibility and automatic connection lifetime management.
If callbacks will be invoked at an extremely high rate, then start with boost::function for maximum flexibility and portability. If that's still too slow, then go with FastDelegate, or C-style callbacks.
Interfaces (abstract classes). The library exposes interfaces, and accepts callback interfaces to be invoked from library code. A time-proven classic. Why invent anything else?
Related
Is there a way to use boost or std bind() so I could use a result as a callback in C API?
Here's sample code I use:
#include <iostream>
#include <boost/function.hpp>
#include <boost/bind/bind.hpp>
typedef void (*CallbackType)();
void CStyleFunction(CallbackType functionPointer)
{
functionPointer();
}
class Class_w_callback
{
public:
Class_w_callback()
{
//This would not work
CStyleFunction(boost::bind(&Class_w_callback::Callback, this));
}
void Callback(){std::cout<<"I got here!\n";};
};
Thanks!
No, there is no way to do that. The problem is that a C function pointer is fundamentally nothing more than an instruction address: "go to this address, and execute the instructions you find". Any state you want to bring into the function has to either be global, or passed as parameters.
That is why most C callback APIs have a "context" parameter, typically a void pointer, which exists precisely so you can pass in whatever data you need.
You cannot do this in portable C++. However, there are libraries out there that enable creation of C functions that resemble closures. These libraries include assembly code in their implementation and require manual porting to new platforms, but if they support architectures you care about, they work fine.
For example, using the trampoline library by Bruno Haible, you would write the code like this:
extern "C" {
#include <trampoline.h>
}
#include <iostream>
typedef int (*callback_type)();
class CallbackDemo {
static CallbackDemo* saved_this;
public:
callback_type make_callback() {
return reinterpret_cast<callback_type>(
alloc_trampoline(invoke, &saved_this, this));
}
void free_callback(callback_type cb) {
free_trampoline(reinterpret_cast<int (*)(...)>(cb));
}
void target(){
std::cout << "I got here, " << this << '\n';
};
static int invoke(...) {
CallbackDemo& me = *saved_this;
me.target();
return 0;
}
};
CallbackDemo *CallbackDemo::saved_this;
int main() {
CallbackDemo x1, x2;
callback_type cb1 = x1.make_callback();
callback_type cb2 = x2.make_callback();
cb1();
cb2();
}
Note that, despite the use of a static member, the trampolines created by alloc_trampoline are reentrant: when the returned callback is invoked, it first copies the pointer to the designated address, and then invokes the original function with original arguments. If the code must also be thread-safe, saved_this should be made thread-local.
This won't work.
The problem is that bind returns a functor, that is, a C++ class with an operator() member function. This will not bind to a C function pointer. What you need is a static or non-member function that retrieves the this pointer from a global or static variable. Granted, finding the right this pointer for the current callback can be a non-trivial task.
Globals
As mentioned by the others, you need a global (a static data member is just a global hidden inside a class), and of course if you need multiple objects to use different parameters in said callback, it won't work.
Context Parameters in Callback
A C library may offer a void * or some similar context. In that case use that feature.
For example, the ffmpeg library supports a callback to read data which is defined like so:
int(*read_packet)(void *opaque, uint8_t *buf, int buf_size);
The opaque parameter can be set to this. Within your callback, just cast it back to your type (name of your class).
Library Context Parameter in Callback
A C library may call your callback with its object (struct pointer). Say you have a library named example which offers a type named example_t and defines callbacks like this:
callback(example_t *e, int param);
Then you may be able to place your context (a.k.a. this pointer) in that example_t structure and retrieve it back out in your callback.
Serial Calls
Assuming you have only one thread using that specific C library, and that the callback can only be triggered when you call a function in the library (i.e. you do not get events triggered at some random point in time), you could still use a global variable. What you have to do is save your current object in the global before each call. Something like this:
object_i_am_working_with = this;
make_a_call_to_that_library();
This way, inside the callback you can always access the object_i_am_working_with pointer. This does not work in a multithreaded application or when the library automatically generates events in the background (i.e. a key press, a packet from the network, a timer, etc.)
One Thread Per Object (since C++11)
This is an interesting solution in a multi-threaded environment. When none of the previous solutions are available to you, you may be able to resolve the problem using threads.
In C++11, there is a new special specifier named thread_local. In the old days, you had to handle that by hand which would be specific to each thread implementation... now you can just do this:
thread_local Class_w_callback * callback_context = nullptr;
Then when in your callback you can use the callback_context as the pointer back to your Class_w_callback class.
This, of course, means you need to create one thread per object you create. This may not be feasible in your environment. In my case, I have components which are all running their own loop and thus each have their own thread_local environment.
Note that if the library automatically generates events you probably can't do that either.
Old Way with Threads (And C solution)
As I mentioned above, in the old days you would have to manage the local thread environment yourself. With pthread (Linux based), you have the thread specific data accessed through pthread_getspecific():
void *pthread_getspecific(pthread_key_t key);
int pthread_setspecific(pthread_key_t key, const void *value);
This makes use of dynamically allocated memory. This is probably how the thread_local is implemented in g++ under Linux.
Under MS-Windows, you probably would use the TlsAlloc function.
Learning C++ and trying to get familiar with some patterns. The Signals2 documentation clearly has a vast array of things I can do with slots and signals. What I don't understand is what types of applications (use cases) I should use it for.
I'm thinking along the lines of a state machine dispatching change events. Coming from a managed-language background (C#, Java, etc.), you'd use an event dispatcher, a static reference, or a callback.
Are there difficulties in C++ with cross-class callbacks? Is that essentially why Signals2 exists?
One of the example cases is a document/view. How is this pattern better suited than, say, using a vector of functions and calling each one in a loop, or a lambda that calls state changes in registered listening class instances?
#include <boost/signals2.hpp>
#include <string>
class Document
{
public:
typedef boost::signals2::signal<void ()> signal_t;
public:
Document()
{}
/* Connect a slot to the signal which will be emitted whenever
text is appended to the document. */
boost::signals2::connection connect(const signal_t::slot_type &subscriber)
{
return m_sig.connect(subscriber);
}
void append(const char* s)
{
m_text += s;
m_sig();
}
const std::string& getText() const
{
return m_text;
}
private:
signal_t m_sig;
std::string m_text;
};
and
#include <boost/bind.hpp>
#include <iostream>
class TextView
{
public:
TextView(Document& doc): m_document(doc)
{
m_connection = m_document.connect(boost::bind(&TextView::refresh, this));
}
~TextView()
{
m_connection.disconnect();
}
void refresh() const
{
std::cout << "TextView: " << m_document.getText() << std::endl;
}
private:
Document& m_document;
boost::signals2::connection m_connection;
};
Boost.Signals2 is not just "an array of callbacks", it has a lot of added value. IMO, the most important points are:
Thread-safety: several threads may connect/disconnect/invoke the same signal concurrently, without introducing race conditions. This is especially useful when communicating with an asynchronous subsystem, like an Active Object running in its own thread.
connection and scoped_connection handles that allow disconnection without having direct access to the signal. Note that this is the only way to disconnect incomparable slots, like boost::function (or std::function).
Temporary slot blocking. Provides a clean way to temporarily disable a listening module (eg. when a user requests to pause receiving messages in a view).
Automatic slot lifespan tracking: a signal disconnects automatically from "expired" slots. Consider the situation when a slot is a binder referencing a non-copyable object managed by shared_ptrs:
shared_ptr<listener> l = listener::create();
auto slot = bind(&listener::listen, l.get()); // we don't want aSignal_ to affect `listener` lifespan
aSignal_.connect(your_signal_type::slot_type(slot).track(l)); // but do want to disconnect automatically when it gets destroyed
Certainly, you could re-implement all of the above functionality yourself, "using a vector of functions and calling each one in a loop" and so on, but the question is how that would be better than Boost.Signals2. Reinventing the wheel is rarely a good idea.
I’ve got an OOP / design problem that I have run into and am desperately hoping that someone can steer me in a direction that doesn’t require a complete re-write.
The system is essentially a Windows Service that has ~9 secondary threads responsible for specific tasks. All the threads share some common functionality (for example, the ability to send and receive messages internally). Because of this, I defined an abstract base class from which all the threads inherit.
However, four of the threads also make use of an Inter-Process communication system based on a 3rd-party IPC system (madshi’s CreateIpcQueue ). To save replicating all the same code in these four threads, I defined an additional class to support this:
TThread <-TBaseThread<-TIPCBaseThread<- Four IPC threads
^- All other threads.
The mechanics of the IPC system is that you define a Callback function and then call the CreateIpcQueue passing it this Callback. In my TIPCBaseThread I loosely did something like this:
// TIPCBaseThread.h
class TIPCBaseThread : public TBaseThread
{
private:
static TIPCBaseThread *pThis;
// defines the callback to use with the IPC queue
static void CALLBACK IPCQueue(char *cName, void *pMsgBuf, unsigned int iMsgLen,
void *pRtnBuf, unsigned int iRtnLen);
protected:
// virtual method, to be defined in derived classes, to handle IPC message
virtual void ProcessIPCMsg(char *cName, void *pMsgBuf, unsigned int iMsgLen, void *pRtnBuf,
unsigned int iRtnLen) = 0;
public:
CRITICAL_SECTION csIPCCritSect;
…
// TIPCBaseThread.cpp
TIPCBaseThread* TIPCBaseThread::pThis = 0;
__fastcall TIPCBaseThread::TIPCBaseThread(…) : TBaseThread(…)
{
pThis = this;
InitializeCriticalSectionAndSpinCount(&csIPCCritSect, 1000);
CreateIpcQueueEx("SomeQueueName", IPCQueue, 1, 0x1000);
//^Callback Queue
…
}
void CALLBACK TIPCBaseThread::IPCQueue(char *cName, void *pMsgBuf, unsigned int iMsgLen,
void *pRtnBuf, unsigned int iRtnLen)
{
EnterCriticalSection(&pThis->csIPCCritSect);
pThis->ProcessIPCMsg(cName, pMsgBuf, iMsgLen, pRtnBuf, iRtnLen);
LeaveCriticalSection(&pThis->csIPCCritSect);
}
My general thinking was that the TIPCBaseThread would effectively take care of creating and managing the IPC channel and then call the ProcessIPCMsg() in the various derived classes.
Now, when I test the system and send a message to any of the IPC channels, the message is received in the TIPCBaseThread callback but is dispatched to the last derived class to be created, not the class that should receive it. I'm assuming it has something to do with the
static TIPCBaseThread *pThis
member being overwritten each time a derived class is instantiated (but I confess I'm not 100% sure)?
Could anyone steer me in the right direction here? Obviously I would like to know exactly what is causing the problem, but ideally I would also like to know if there is a workaround that avoids completely re-writing the whole object inheritance; there is obviously a bit more going on under the hood than I have shown, and I'm going to have serious problems if I have to abandon this design completely.
Many thanks in advance,
Mike Collins
I think you should change the callback to take the instance as an argument:
static void CALLBACK IPCQueue(TIPCBaseThread *instance,
char *cName, void *pMsgBuf, unsigned int iMsgLen,
void *pRtnBuf, unsigned int iRtnLen);
...
void CALLBACK TIPCBaseThread::IPCQueue(TIPCBaseThread *instance,
char *cName, void *pMsgBuf, unsigned int iMsgLen,
void *pRtnBuf, unsigned int iRtnLen)
{
...
instance->ProcessIPCMsg(cName, pMsgBuf, iMsgLen, pRtnBuf, iRtnLen);
...
}
There is one very strange thing: pThis = this; combined with static TIPCBaseThread *pThis;
It means that at any point in time only the most recently constructed instance of TIPCBaseThread is accessible via pThis (all previous assignments having been overwritten); and of course this global value is not protected by any kind of synchronisation (mutex, atomics, ...).
It's unfortunate, but this static TIPCBaseThread *pThis; is simply a horrible idea that cannot possibly work.
I found a threadpool which doesn't seem to be in boost yet, but I may be able to use it for now (unless there is a better solution).
I have several million small tasks that I want to execute concurrently and I wanted to use a threadpool to schedule the execution of the tasks. The documentation of the threadpool provides (roughly) this example:
#include "threadpool.hpp"
using namespace boost::threadpool;
// A short task
void task()
{
// do some work
}
void execute_with_threadpool(int poolSize, int numTasks)
{
// Create a thread pool.
pool tp(poolSize);
for(int i = 0; i < numTasks; i++)
{
// Add some tasks to the pool.
tp.schedule(&task);
}
// Leave this function and wait until all tasks are finished.
}
However, the example only allows me to schedule non-member functions (or tasks). Is there a way that I can schedule a member function for execution?
Update:
OK, supposedly the library allows you to schedule a Runnable for execution, but I can't figure out where the Runnable class that I'm supposed to inherit from is defined.
template<typename Pool, typename Runnable>
bool schedule(Pool& pool, shared_ptr<Runnable> const & obj);
Update2:
I think I found out what I need to do: I have to create a runnable that takes whatever parameters are necessary (including a reference to the object whose function will be called), and then use the static schedule function to schedule the runnable on the given threadpool:
class Runnable
{
private:
MyClass* _target;
Data* _data;
public:
Runnable(MyClass* target, Data* data)
{
_target = target;
_data = data;
}
~Runnable(){}
void run()
{
_target->doWork(_data);
}
};
Here is how I schedule it within MyClass:
void MyClass::doWork(Data* data)
{
// do the work
}
void MyClass::produce()
{
boost::threadpool::schedule(myThreadPool, boost::shared_ptr<Runnable>(new Runnable(myTarget, new Data())));
}
However, the adaptor from the library has a bug in it:
template<typename Pool, typename Runnable>
bool schedule(Pool& pool, shared_ptr<Runnable> const & obj)
{
return pool->schedule(bind(&Runnable::run, obj));
}
Note that it takes a reference to a Pool but it tries to call it as if it was a pointer to a Pool, so I had to fix that too (just changing the -> to a .).
To schedule any function or member function, use Boost.Bind or Boost.Lambda (in that order). Also, you can consider special libraries for your situation. I can recommend Intel Threading Building Blocks or, in case you use VC2010, the Microsoft Parallel Patterns Library.
EDIT:
I've never used this library nor heard anything bad about it, but it is old enough and still has not been included in Boost. I would check why.
EDIT 2:
Another option - Boost.Asio. It's primarily a networking library, but it has a scheduler that you can use. I would use this multithreading approach. Just instead of using asynchronous network operations schedule your tasks by boost::asio::io_service::post().
However, as it turns out, I can't use that boost thread pool because I am mixing native C++ (DLL), C++/CLI (DLL) and .NET code: I have a C++/CLI library that wraps a native C++ library which in turn uses boost::thread. Unfortunately, that results in a BadImageFormatException at runtime (which has previously been discussed by other people):
The problem is that the static boost thread library tries to hook the
native win32 PE TLS callbacks in order to ensure that the thread-local
data used by boost thread is cleaned up correctly. This is not
compatible with a C++/CLI executable.
This solution is what I was able to implement using the information at http://think-async.com/Asio/Recipes. I tried implementing this recipe and found that the code worked on Windows but not on Linux. I was unable to figure out the problem, but searching the internet turned up the key: make the work object a scoped pointer inside the code block. I've included the void task() that the user wanted in my example. I was able to create a convenience function and pass pointers into the function that does the work. For my case, I create a thread pool sized by boost::thread::hardware_concurrency() to get the possible number of threads. I've used the recipe below with as many as 80 tasks on 15 threads.
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/thread.hpp>
#include <boost/scoped_ptr.hpp>
// A short task
void task()
{
// do some work
}
void execute_with_threadpool( int numTasks,
int poolSize = boost::thread::hardware_concurrency() )
{
boost::asio::io_service io_service;
boost::thread_group threads;
{
boost::scoped_ptr< boost::asio::io_service::work > work( new boost::asio::io_service::work(io_service) );
for(int t = 0; t < poolSize; t++)
{
threads.create_thread(boost::bind(&boost::asio::io_service::run, &io_service));
}
for( int t = 0; t < numTasks; t++ )
{
// Post each task to the io_service; the worker threads pick them up.
io_service.post( boost::bind(task) );
}
}
threads.join_all();
}
Figured it out: you must have a run() method defined; this is the easiest way:
class Command
{
public:
Command() {}
~Command() {}
void run() {}
};
In main(), tp is your threadpool:
shared_ptr<Command> pc(new Command());
tp.schedule(bind(&Command::run, pc));
Done.
I'm designing classes for my application (a network tool). I have this base class:
class Descriptor
{
// ...
public:
virtual void data_read (void);
virtual void data_write (void);
virtual void data_error (void);
protected:
int socket_descriptor;
// ...
};
class TcpClient :
public Descriptor
{
// ...
};
Many classes are based on Descriptor. I monitor sockets' events using epoll. When I want to look for events on TcpClient object I add the object's socket and pointer to this object to epoll, code:
epoll_event descriptor_events;
descriptor_events.events = EPOLLIN;
descriptor_events.data.fd = this->socket_descriptor;
descriptor_events.data.ptr = this;
epoll_ctl (epoll_descriptor, EPOLL_CTL_ADD, this->socket_descriptor, &descriptor_events);
I dispatch epoll events in separate thread in this way:
Descriptor *descriptor (NULL);
// ...
return_value = epoll_wait (epoll_descriptor, events, 64, -1);
while (i < return_value)
{
descriptor = (Descriptor *) (events [i]).data.ptr;
if ((events [i]).events & EPOLLOUT)
descriptor->data_write ();
if ((events [i]).events & EPOLLIN)
descriptor->data_read ();
if ((events [i]).events & EPOLLERR)
descriptor->data_error ();
i++;
}
The program is going to handle a lot of data in the epoll thread, which means that virtual functions will be called many times there. I'm wondering about the runtime cost of this solution.
I'm also thinking about two other implementations (however, I'm not sure if they're much faster):
typedef void (*EventFunction) (Descriptor *);
class Descriptor
{
// ...
public:
EventFunction data_read;
EventFunction data_write;
EventFunction data_error;
protected:
Descriptor (EventFunction data_read,
EventFunction data_write,
EventFunction data_error);
int socket_descriptor;
// ...
};
or use CRTP.
Maybe you have other idea of implementing this?
Unless proven otherwise, your original design looks fine to me.
The first rule of optimization is to measure first, then fix only hotspots that really exist. You'll be surprised where your code spends its time. Dwelling on the distinction between virtual functions and function pointers is almost certainly premature optimization. In both cases the compiler will generate code to jump to a function pointer, though with virtual functions the compiler will have to look up the vtable first. Write idiomatic C++ code to do what you want, then profile it if you have performance problems.
(I do have one comment about your class Descriptor: Unless you're planning on having generic data_read(), data_write(), and data_error() methods I'd recommend making them pure virtual methods.)
Honestly, your best bet for optimizing this code is probably to completely replace it with Boost ASIO. As described, you're essentially re-implementing the heavily scrutinized and well tested ASIO library. Unless you're absolutely certain you must roll your own I/O library, you'll probably save yourself a tremendous amount of development & optimization time by just using an existing solution.
Under the "don't reinvent the wheel" umbrella, I'd suggest looking at Boost.Asio, since it offers most of the functionality you have described in your sample code.