Porting an existing class structure to smart pointers - c++

I know this question is rather long, but I was not sure how to explain my problem in a shorter way. The question itself is about class hierarchy design and, especially, how to port an existing hierarchy based on pointers to one using smart pointers. If anyone can come up with some way to simplify my explanation and, thus, make this question more generic, please let me know. In that way, it might be useful for more SO readers.
I am designing a C++ application for handling a system that allows me to read some sensors. The system is composed of remotes machines from where I collect the measurements. This application must actually work with two different subsystems:
Aggregated system: this type of system contains several components from where I collect measurements. All the communication goes through the aggregated system which will redirect the data to the specific component if needed (global commands sent to the aggregated system itself do not need to be transferred to individual components).
Standalone system: in this case there is just a single system and all the communication (including global commands) is sent to that system.
Next you can see the class diagram I came up with:
The standalone system inherits both from ConnMgr and MeasurementDevice. On the other hand, an aggregated system splits its functionality between AggrSystem and Component.
Basically, as a user what I want to have is a MeasurementDevice object and transparently send data to corresponding endpoint, be it an aggregated system or a standalone one.
CURRENT IMPLEMENTATION
This is my current implementation. First, the two base abstract classes:
class MeasurementDevice {
public:
virtual ~MeasurementDevice() {}
virtual void send_data(const std::vector<char>& data) = 0;
};
class ConnMgr {
public:
ConnMgr(const std::string& addr) : addr_(addr) {}
virtual ~ConnMgr() {}
virtual void connect() = 0;
virtual void disconnect() = 0;
protected:
std::string addr_;
};
These are the classes for an aggregated system:
class Component : public MeasurementDevice {
public:
Component(AggrSystem& as, int slot) : aggr_sys_(as), slot_(slot) {}
void send_data(const std::vector<char>& data) {
aggr_sys_.send_data(slot_, data);
}
private:
AggrSystem& aggr_sys_;
int slot_;
};
class AggrSystem : public ConnMgr {
public:
AggrSystem(const std::string& addr) : ConnMgr(addr) {}
~AggrSystem() { for (auto& entry : components_) delete entry.second; }
// overridden virtual functions omitted (not using smart pointers)
MeasurementDevice* get_measurement_device(int slot) {
if (!is_slot_used(slot)) throw std::runtime_error("Empty slot");
return components_.find(slot)->second;
}
private:
std::map<int, Component*> components_;
bool is_slot_used(int slot) const {
return components_.find(slot) != components_.end();
}
void add_component(int slot) {
if (is_slot_used(slot)) throw std::runtime_error("Slot already used");
components_.insert(std::make_pair(slot, new Component(*this, slot)));
}
};
This is the code for a standalone system:
class StandAloneSystem : public ConnMgr, public MeasurementDevice {
public:
StandAloneSystem(const std::string& addr) : ConnMgr(addr) {}
// overridden virtual functions omitted (not using smart pointers)
MeasurementDevice* get_measurement_device() {
return this;
}
};
These are factory-like functions responsible for creating ConnMgr and MeasurementDevice objects:
typedef std::map<std::string, boost::any> Config;
ConnMgr* create_conn_mgr(const Config& cfg) {
const std::string& type =
boost::any_cast<std::string>(cfg.find("type")->second);
const std::string& addr =
boost::any_cast<std::string>(cfg.find("addr")->second);
ConnMgr* ep;
if (type == "aggregated") ep = new AggrSystem(addr);
else if (type == "standalone") ep = new StandAloneSystem(addr);
else throw std::runtime_error("Unknown type");
return ep;
}
MeasurementDevice* get_measurement_device(ConnMgr* ep, const Config& cfg) {
const std::string& type =
boost::any_cast<std::string>(cfg.find("type")->second);
if (type == "aggregated") {
int slot = boost::any_cast<int>(cfg.find("slot")->second);
AggrSystem* aggr_sys = dynamic_cast<AggrSystem*>(ep);
return aggr_sys->get_measurement_device(slot);
}
else if (type == "standalone") return dynamic_cast<StandAloneSystem*>(ep);
else throw std::runtime_error("Unknown type");
}
And finally here it is main(), showing a very simple usage case:
#define USE_AGGR
int main() {
Config config = {
{ "addr", boost::any(std::string("192.168.1.10")) },
#ifdef USE_AGGR
{ "type", boost::any(std::string("aggregated")) },
{ "slot", boost::any(1) },
#else
{ "type", boost::any(std::string("standalone")) },
#endif
};
ConnMgr* ep = create_conn_mgr(config);
ep->connect();
MeasurementDevice* dev = get_measurement_device(ep, config);
std::vector<char> data; // in real life data should contain something
dev->send_data(data);
ep->disconnect();
delete ep;
return 0;
}
PROPOSED CHANGES
First of all, I wonder whether there is a way to avoid the dynamic_cast in get_measurement_device. Since AggrSystem::get_measurement_device(int slot) and StandAloneSystem::get_measurement_device() have different signatures, it is not possible to create a common virtual method in the base class. I was thinking to add a common method accepting a map containing the options (e.g., the slot). In that case, I would not need to do the dynamic casting. Is this second approach preferable in terms of a cleaner design?
In order to port the class hierarchy to smart pointers I used unique_ptr. First I changed the map of components in AggrSystem to:
std::map<int, std::unique_ptr<Component> > components_;
The addition of a new Component now looks like:
void AggrSystem::add_component(int slot) {
if (is_slot_used(slot)) throw std::runtime_error("Slot already used");
components_.insert(std::make_pair(slot,
std::unique_ptr<Component>(new Component(*this, slot))));
}
For returning a Component I decided to return a raw pointer since the lifetime of a Component object is defined by the lifetime of an AggrSystem object:
MeasurementDevice* AggrSystem::get_measurement_device(int slot) {
if (!is_slot_used(slot)) throw std::runtime_error("Empty slot");
return components_.find(slot)->second.get();
}
Is returning a raw pointer a correct decision? If I use a shared_ptr, however, then I run into problems with the implementation for the standalone system:
MeasurementDevice* StandAloneSystem::get_measurement_device() {
return this;
}
In this case I cannot return a shared_ptr using this. I guess I could create one extra level of indirection and have something like StandAloneConnMgr and StandAloneMeasurementDevice, where the first class would hold a shared_ptr to an instance of the second.
So, overall, I wanted to ask whether this a good approach when using smart pointers. Would it be preferable to use a map of shared_ptr and return a shared_ptr too, or is it better the current approach based on using unique_ptr for ownership and raw pointer for accessing?
P.S: create_conn_mgr and main are changed as well so that instead of using a raw pointer (ConnMgr*) now I use unique_ptr<ConnMgr>. I did not add the code since the question was already long enough.

First of all, I wonder whether there is a way to avoid the
dynamic_cast in get_measurement_device.
I would attempt to unify the get_measurement_device signatures so that you can make this a virtual function in the base class.
So, overall, I wanted to ask whether this a good approach when using
smart pointers.
I think you've done a good job. You've basically converted your "single ownership" news and deletes to unique_ptr in a fairly mechanical fashion. This is exactly the right first (and perhaps last) step.
I also think you made the right decision in returning raw pointers from get_measurement_device because in your original code the clients of this function did not take ownership of this pointer. Dealing with raw pointers when you do not intend to share or transfer ownership is a good pattern that most programmers will recognize.
In summary, you've correctly translated your existing design to use smart pointers without changing the semantics of your design.
From here if you want to study the possibility of changing your design to one involving shared ownership, that is a perfectly valid next step. My own preference is to prefer unique ownership designs until a use case or circumstance demands shared ownership.
Unique ownership is not only more efficient, it is also easier to reason about. That ease in reasoning typically leads to fewer accidental cyclic memory ownership patters (cyclic memory ownership == leaked memory). Coders who just slap down shared_ptr every time they see a pointer are far more likely to end up with memory ownership cycles.
That being said, cyclic memory ownership is also possible using only unique_ptr. And if it happens, you need weak_ptr to break the cycle, and weak_ptr only works with shared_ptr. So the introduction of an ownership cycle is another good reason to migrate to shared_ptr.

Related

Library managing lifetime of objects, use smart pointers or raw?

Rolling my own GUI library for a side-project. Refactoring to use smart pointers; however, I ran into an issue.
I'm aware that you do not want to to use smart pointers across DLL boundaries for obvious reasons. But I feel dirty using 'new' in application code. See below:
// MYFINANCEAPP.H
class MyFinanceApp : public Application
{
MyFinanceApp() : mMainWindow(make_unique<Window>())
{
mMainWindow->AddControl(*(new Button("testButton")));
}
private:
std::unique_ptr<Window> mMainWindow;
};
// WINDOW.H
class Window
{
public:
void AddControl(Control& control) //QUESTION: HOW DO I GET SMART POINTER HERE???
{
mControls.emplace_back(&control)
}
private:
std::vector<std::unique_ptr<Control>> mControls; //Want to use smart pointers so I am not responsible for managing...
};
Am I better to use C++98 style and semantics and handle it myself. Obviously I do not want to pass smart pointers across the interface boundary (i.e. AddControl), but I don't want to be responsible for handling the lifetime of the controls.
Additionally, I feel really dirty using new Button("testButton").
A reminder that ABI issues can be circumvented if you simply make a guarantee that you won't release a DLL compiled on a different compiler/version/platform than the primary executable.
At any rate, regarding your interface design:
void AddControl(Control& control) {
mControls.emplace_back(&control)
}
You've got a problem here, because
Control is Polymorphic (or at least seems to be based on what code you provided) which means you have to pass by reference or by pointer to get a "complete" object, but
You don't want to expose an interface where the user has to maintain raw pointers, even briefly, before passing them along to your application.
This is how I would design around this problem:
class Window {
public:
void AddControl(std::unique_ptr<Control> control) {//Note we're passing by value!
mControls.emplace_back(std::move(control));
}
private:
std::vector<std::unique_ptr<Control>> mControls;
};
Then, in your application:
class MyFinanceApp : public Application {
public:
MyFinanceApp() : mMainWindow(make_unique<Window>()) {
mMainWindow->AddControl(std::make_unique<Button>("testButton"));
}
private:
std::unique_ptr<Window> mMainWindow;
};
Note that this doesn't necessarily stop your users from doing something stupid, like
std::unique_ptr<Window> window_ptr = std::make_Unique<Window>();
Button * button = new Button("This won't be properly deleted!");
window_ptr->AddControl(std::unique_ptr<Button>{button});
delete button; //Whoops!
... But there wasn't necessarily anything stopping them from doing that in the first place.
An alternative is to have a "Factory" associated with the Controls. We'd first need to make a modification to AddControl:
Control & AddControl(std::unique_ptr<Control> control) {
mControls.emplace_back(std::move(control));
return *mControls.back();
}
struct ControlFactory {
static Button & create_button(Window & window, std::string button_text) {
std::unique_ptr<Button> button_ptr = std::make_unique<Button>(button_text);
Button & ref = *button_ptr;
window.AddControl(std::move(button_ptr));
//ref will NOT be invalidated, because the object will still exist in memory,
//in the same location in memory as before. Only the unique_ptr will have changed.
return ref;
}
};
And then you'd simply need to change the access modifiers on all your Control subclasses to not allow direct access to their constructors by end-user programmers. Something like this would probably suffice:
class Button : public Control {
/*...*/
protected:
Button() {/*...*/}
Button(std::string text) {/*...*/}
friend class ControlFactory; //Allows ControlFactory to access, even though access is protected.
};
Then your users would be storing references on their end, which is safer than pointers, though it does mean you need to guarantee that those references never outlive the application itself.

Practical use of dynamic_cast?

I have a pretty simple question about the dynamic_cast operator. I know this is used for run time type identification, i.e., to know about the object type at run time. But from your programming experience, can you please give a real scenario where you had to use this operator? What were the difficulties without using it?
Toy example
Noah's ark shall function as a container for different types of animals. As the ark itself is not concerned about the difference between monkeys, penguins, and mosquitoes, you define a class Animal, derive the classes Monkey, Penguin, and Mosquito from it, and store each of them as an Animal in the ark.
Once the flood is over, Noah wants to distribute animals across earth to the places where they belong and hence needs additional knowledge about the generic animals stored in his ark. As one example, he can now try to dynamic_cast<> each animal to a Penguin in order to figure out which of the animals are penguins to be released in the Antarctic and which are not.
Real life example
We implemented an event monitoring framework, where an application would store runtime-generated events in a list. Event monitors would go through this list and examine those specific events they were interested in. Event types were OS-level things such as SYSCALL, FUNCTIONCALL, and INTERRUPT.
Here, we stored all our specific events in a generic list of Event instances. Monitors would then iterate over this list and dynamic_cast<> the events they saw to those types they were interested in. All others (those that raise an exception) are ignored.
Question: Why can't you have a separate list for each event type?
Answer: You can do this, but it makes extending the system with new events as well as new monitors (aggregating multiple event types) harder, because everyone needs to be aware of the respective lists to check for.
A typical use case is the visitor pattern:
struct Element
{
virtual ~Element() { }
void accept(Visitor & v)
{
v.visit(this);
}
};
struct Visitor
{
virtual void visit(Element * e) = 0;
virtual ~Visitor() { }
};
struct RedElement : Element { };
struct BlueElement : Element { };
struct FifthElement : Element { };
struct MyVisitor : Visitor
{
virtual void visit(Element * e)
{
if (RedElement * p = dynamic_cast<RedElement*>(e))
{
// do things specific to Red
}
else if (BlueElement * p = dynamic_cast<BlueElement*>(e))
{
// do things specific to Blue
}
else
{
// error: visitor doesn't know what to do with this element
}
}
};
Now if you have some Element & e;, you can make MyVisitor v; and say e.accept(v).
The key design feature is that if you modify your Element hierarchy, you only have to edit your visitors. The pattern is still fairly complex, and only recommended if you have a very stable class hierarchy of Elements.
Imagine this situation: You have a C++ program that reads and displays HTML. You have a base class HTMLElement which has a pure virtual method displayOnScreen. You also have a function called renderHTMLToBitmap, which draws the HTML to a bitmap. If each HTMLElement has a vector<HTMLElement*> children;, you can just pass the HTMLElement representing the element <html>. But what if a few of the subclasses need special treatment, like <link> for adding CSS. You need a way to know if an element is a LinkElement so you can give it to the CSS functions. To find that out, you'd use dynamic_cast.
The problem with dynamic_cast and polymorphism in general is that it's not terribly efficient. When you add vtables into the mix, it only get's worse.
When you add virtual functions to a base class, when they are called, you end up actually going through quite a few layers of function pointers and memory areas. That will never be more efficient than something like the ASM call instruction.
Edit: In response to Andrew's comment bellow, here's a new approach: Instead of dynamic casting to the specific element type (LinkElement), instead you have another abstract subclass of HTMLElement called ActionElement that overrides displayOnScreen with a function that displays nothing, and creates a new pure virtual function: virtual void doAction() const = 0. The dynamic_cast is changed to test for ActionElement and just calls doAction(). You'd have the same kind of subclass for GraphicalElement with a virtual method displayOnScreen().
Edit 2: Here's what a "rendering" method might look like:
void render(HTMLElement root) {
for(vector<HTLMElement*>::iterator i = root.children.begin(); i != root.children.end(); i++) {
if(dynamic_cast<ActionElement*>(*i) != NULL) //Is an ActionElement
{
ActionElement* ae = dynamic_cast<ActionElement*>(*i);
ae->doAction();
render(ae);
}
else if(dynamic_cast<GraphicalElement*>(*i) != NULL) //Is a GraphicalElement
{
GraphicalElement* ge = dynamic_cast<GraphicalElement*>(*i);
ge->displayToScreen();
render(ge);
}
else
{
//Error
}
}
}
Operator dynamic_cast solves the same problem as dynamic dispatch (virtual functions, visitor pattern, etc): it allows you to perform different actions based on the runtime type of an object.
However, you should always prefer dynamic dispatch, except perhaps when the number of dynamic_cast you'd need will never grow.
Eg. you should never do:
if (auto v = dynamic_cast<Dog*>(animal)) { ... }
else if (auto v = dynamic_cast<Cat*>(animal)) { ... }
...
for maintainability and performance reasons, but you can do eg.
for (MenuItem* item: items)
{
if (auto submenu = dynamic_cast<Submenu*>(item))
{
auto items = submenu->items();
draw(context, items, position); // Recursion
...
}
else
{
item->draw_icon();
item->setup_accelerator();
...
}
}
which I've found quite useful in this exact situation: you have one very particular subhierarchy that must be handled separately, this is where dynamic_cast shines. But real world examples are quite rare (the menu example is something I had to deal with).
dynamic_cast is not intended as an alternative to virtual functions.
dynamic_cast has a non-trivial performance overhead (or so I think) since the whole class hierarchy has to be walked through.
dynamic_cast is similar to the 'is' operator of C# and the QueryInterface of good old COM.
So far I have found one real use of dynamic_cast:
(*) You have multiple inheritance and to locate the target of the cast the compiler has to walk the class hierarchy up and down to locate the target (or down and up if you prefer). This means that the target of the cast is in a parallel branch in relation to where the source of the cast is in the hierarchy. I think there is NO other way to do such a cast.
In all other cases, you just use some base class virtual to tell you what type of object you have and ONLY THEN you dynamic_cast it to the target class so you can use some of it's non-virtual functionality. Ideally there should be no non-virtual functionality, but what the heck, we live in the real world.
Doing things like:
if (v = dynamic_cast(...)){} else if (v = dynamic_cast(...)){} else if ...
is a performance waste.
Casting should be avoided when possible, because it is basically saying to the compiler that you know better and it is usually a sign of some weaker design decission.
However, you might come in situations where the abstraction level was a bit too high for 1 or 2 sub-classes, where you have the choice to change your design or solve it by checking the subclass with dynamic_cast and handle it in a seperate branch. The trade-of is between adding extra time and risk now against extra maintenance issues later.
In most situations where you are writing code in which you know the type of the entity you're working with, you just use static_cast as it's more efficient.
Situations where you need dynamic cast typically arrive (in my experience) from lack of foresight in design - typically where the designer fails to provide an enumeration or id that allows you to determine the type later in the code.
For example, I've seen this situation in more than one project already:
You may use a factory where the internal logic decides which derived class the user wants rather than the user explicitly selecting one. That factory, in a perfect world, returns an enumeration which will help you identify the type of returned object, but if it doesn't you may need to test what type of object it gave you with a dynamic_cast.
Your follow-up question would obviously be: Why would you need to know the type of object that you're using in code using a factory?
In a perfect world, you wouldn't - the interface provided by the base class would be sufficient for managing all of the factories' returned objects to all required extents. People don't design perfectly though. For example, if your factory creates abstract connection objects, you may suddenly realize that you need to access the UseSSL flag on your socket connection object, but the factory base doesn't support that and it's not relevant to any of the other classes using the interface. So, maybe you would check to see if you're using that type of derived class in your logic, and cast/set the flag directly if you are.
It's ugly, but it's not a perfect world, and sometimes you don't have time to refactor an imperfect design fully in the real world under work pressure.
The dynamic_cast operator is very useful to me.
I especially use it with the Observer pattern for event management:
#include <vector>
#include <iostream>
using namespace std;
class Subject; class Observer; class Event;
class Event { public: virtual ~Event() {}; };
class Observer { public: virtual void onEvent(Subject& s, const Event& e) = 0; };
class Subject {
private:
vector<Observer*> m_obs;
public:
void attach(Observer& obs) { m_obs.push_back(& obs); }
public:
void notifyEvent(const Event& evt) {
for (vector<Observer*>::iterator it = m_obs.begin(); it != m_obs.end(); it++) {
if (Observer* const obs = *it) {
obs->onEvent(*this, evt);
}
}
}
};
// Define a model with events that contain data.
class MyModel : public Subject {
public:
class Evt1 : public Event { public: int a; string s; };
class Evt2 : public Event { public: float f; };
};
// Define a first service that processes both events with their data.
class MyService1 : public Observer {
public:
virtual void onEvent(Subject& s, const Event& e) {
if (const MyModel::Evt1* const e1 = dynamic_cast<const MyModel::Evt1*>(& e)) {
cout << "Service1 - event Evt1 received: a = " << e1->a << ", s = " << e1->s << endl;
}
if (const MyModel::Evt2* const e2 = dynamic_cast<const MyModel::Evt2*>(& e)) {
cout << "Service1 - event Evt2 received: f = " << e2->f << endl;
}
}
};
// Define a second service that only deals with the second event.
class MyService2 : public Observer {
public:
virtual void onEvent(Subject& s, const Event& e) {
// Nothing to do with Evt1 in Service2
if (const MyModel::Evt2* const e2 = dynamic_cast<const MyModel::Evt2*>(& e)) {
cout << "Service2 - event Evt2 received: f = " << e2->f << endl;
}
}
};
int main(void) {
MyModel m; MyService1 s1; MyService2 s2;
m.attach(s1); m.attach(s2);
MyModel::Evt1 e1; e1.a = 2; e1.s = "two"; m.notifyEvent(e1);
MyModel::Evt2 e2; e2.f = .2f; m.notifyEvent(e2);
}
Contract Programming and RTTI shows how you can use dynamic_cast to allow objects to advertise what interfaces they implement. We used it in my shop to replace a rather opaque metaobject system. Now we can clearly describe the functionality of objects, even if the objects are introduced by a new module several weeks/months after the platform was 'baked' (though of course the contracts need to have been decided on up front).

Most effective method of executing functions an in unknown order

Let's say I have a large, between 50 and 200, pool of individual functions whose job it is to operate on a single object and modify it. The pool of functions is selectively put into a single array and arranged in an arbitrary order.
The functions themselves take no arguments outside of the values present within the object it is modifying, and in this way the object's behavior is determined only by which functions are executed and in what order.
A way I have tentatively used so far is this, which might explain better what my goal is:
class Behavior{
public:
virtual void act(Object * obj) = 0;
};
class SpecificBehavior : public Behavior{
// many classes like this exist
public:
void act(Object * obj){ /* do something specific with obj*/ };
};
class Object{
public:
std::list<Behavior*> behavior;
void behave(){
std::list<Behavior*>::iterator iter = behavior.front();
while(iter != behavior.end()){
iter->act(this);
++iter;
};
};
};
My Question is, what is the most efficient way in C++ of organizing such a pool of functions, in terms of performance and maintainability. This is for some A.I research I am doing, and this methodology is what most closely matches what I am trying to achieve.
edits: The array itself can be changed at any time by any other part of the code not listed here, but it's guaranteed to never change during the call to behave(). The array it is stored in needs to be able to change and expand to any size
If the behaviour functions have no state and only take one Object argument, then I'd go with a container of function objects:
#include <functional>
#include <vector>
typedef std::function<void(Object &)> BehaveFun;
typedef std::vector<BehaveFun> BehaviourCollection;
class Object {
BehaviourCollection b;
void behave() {
for (auto it = b.cbegin(); it != b.cend(); ++it) *it(*this);
}
};
Now you just need to load all your functions into the collection.
if the main thing you will be doing with this collection is iterating over it, you'll probably want to use a vector as dereferencing and incrementing your iterators will equate to simple pointer arithmetic.
If you want to use all your cores, and your operations do not share any state, you might want to have a look at a library like Intel's TBB (see the parallel_for example)
I'd keep it exactly as you have it.
Perofmance should be OK (there may be an extra indirection due to the vtable look up but that shouldn't matter.)
My reasons for keeping it as is are:
You might be able to lift common sub-behaviour into an intermediate class between Behaviour and your implementation classes. This is not as easy using function pointers.
struct AlsoWaveArmsBase : public Behaviour
{
void act( Object * obj )
{
start_waving_arms(obj); // Concrete call
do_other_action(obj); // Abstract call
end_waving_arms(obj); // Concrete call
}
void start_waving_arms(Object*obj);
void end_waving_arms(Object*obj);
virtual void do_other_actions(Object * obj)=0;
};
struct WaveAndWalk : public AlsoWaveArmsBase
{
void do_other_actions(Object * obj) { walk(obj); }
};
struct WaveAndDance : pubic AlsoWaveArmsBase
{
void do_other_actions(Object * obj) { walk(obj); }
}
You might want to use state in your behaviour
struct Count : public Behavior
{
Behaviour() : i(0) {}
int i;
void act(Object * obj)
{
count(obj,i);
++i;
}
}
You might want to add helper functions e.g. you might want to add a can_act like this:
void Object::behave(){
std::list<Behavior*>::iterator iter = behavior.front();
while(iter != behavior.end()){
if( iter->can_act(this) ){
iter->act(this);
}
++iter;
};
};
IMO, these flexibilities outweigh the benefits of moving to a pure function approach.
For maintainability, your current approach is the best (virtual functions). You might get a tiny little gain from using free function pointers, but I doubt it's measurable, and even if so, I don't think it is worth the trouble. The current OO approach is fast enough and maintainable. The little gain I'm talking about comes from the fact that you are dereferencing a pointer to an object and then (behind the scenes) dereferencing a pointer to a function (which happening as the implementation of calling a virtual function).
I wouldn't use std::function, because it's not very performant (though that might differ between implementations). See this and this. Function pointers are as fast as it gets when you need this kind of dynamism at runtime.
If you need to improve the performance, I suggest to look into improving the algorithm, not this implementation.

Factory method anti-if implementation

I'm applying the Factory design pattern in my C++ project, and below you can see how I am doing it. I try to improve my code by following the "anti-if" campaign, thus want to remove the if statements that I am having. Any idea how can I do it?
typedef std::map<std::string, Chip*> ChipList;
Chip* ChipFactory::createChip(const std::string& type) {
MCList::iterator existing = Chips.find(type);
if (existing != Chips.end()) {
return (existing->second);
}
if (type == "R500") {
return Chips[type] = new ChipR500();
}
if (type == "PIC32F42") {
return Chips[type] = new ChipPIC32F42();
}
if (type == "34HC22") {
return Chips[type] = new Chip34HC22();
}
return 0;
}
I would imagine creating a map, with string as the key, and the constructor (or something to create the object). After that, I can just get the constructor from the map using the type (type are strings) and create my object without any if. (I know I'm being a bit paranoid, but I want to know if it can be done or not.)
You are right, you should use a map from key to creation-function.
In your case it would be
typedef Chip* tCreationFunc();
std::map<std::string, tCreationFunc*> microcontrollers;
for each new chip-drived class ChipXXX add a static function:
static Chip* CreateInstance()
{
return new ChipXXX();
}
and also register this function into the map.
Your factory function should be somethink like this:
Chip* ChipFactory::createChip(std::string& type)
{
ChipList::iterator existing = microcontrollers.find(type);
if (existing != microcontrollers.end())
return existing->second();
return NULL;
}
Note that copy constructor is not needed, as in your example.
The point of the factory is not to get rid of the ifs, but to put them in a separate place of your real business logic code and not to pollute it. It is just a separation of concerns.
If you're desperate, you could write a jump table/clone() combo that would do this job with no if statements.
class Factory {
struct ChipFunctorBase {
virtual Chip* Create();
};
template<typename T> struct CreateChipFunctor : ChipFunctorBase {
Chip* Create() { return new T; }
};
std::unordered_map<std::string, std::unique_ptr<ChipFunctorBase>> jumptable;
Factory() {
jumptable["R500"] = new CreateChipFunctor<ChipR500>();
jumptable["PIC32F42"] = new CreateChipFunctor<ChipPIC32F42>();
jumptable["34HC22"] = new CreateChipFunctor<Chip34HC22>();
}
Chip* CreateNewChip(const std::string& type) {
if(jumptable[type].get())
return jumptable[type]->Create();
else
return null;
}
};
However, this kind of approach only becomes valuable when you have large numbers of different Chip types. For just a few, it's more useful just to write a couple of ifs.
Quick note: I've used std::unordered_map and std::unique_ptr, which may not be part of your STL, depending on how new your compiler is. Replace with std::map/boost::unordered_map, and std::/boost::shared_ptr.
No you cannot get rid of the ifs. the createChip method creats a new instance depending on constant (type name )you pass as argument.
but you may optimaze yuor code a little removing those 2 line out of if statment.
microcontrollers[type] = newController;
return microcontrollers[type];
To answer your question: Yes, you should make a factory with a map to functions that construct the objects you want. The objects constructed should supply and register that function with the factory themselves.
There is some reading on the subject in several other SO questions as well, so I'll let you read that instead of explaining it all here.
Generic factory in C++
Is there a way to instantiate objects from a string holding their class name?
You can have ifs in a factory - just don't have them littered throughout your code.
struct Chip{
};
struct ChipR500 : Chip{};
struct PIC32F42 : Chip{};
struct ChipCreator{
virtual Chip *make() = 0;
};
struct ChipR500Creator : ChipCreator{
Chip *make(){return new ChipR500();}
};
struct PIC32F42Creator : ChipCreator{
Chip *make(){return new PIC32F42();}
};
int main(){
ChipR500Creator m; // client code knows only the factory method interface, not the actuall concrete products
Chip *p = m.make();
}
What you are asking for, essentially, is called Virtual Construction, ie the ability the build an object whose type is only known at runtime.
Of course C++ doesn't allow constructors to be virtual, so this requires a bit of trickery. The common OO-approach is to use the Prototype pattern:
class Chip
{
public:
virtual Chip* clone() const = 0;
};
class ChipA: public Chip
{
public:
virtual ChipA* clone() const { return new ChipA(*this); }
};
And then instantiate a map of these prototypes and use it to build your objects (std::map<std::string,Chip*>). Typically, the map is instantiated as a singleton.
The other approach, as has been illustrated so far, is similar and consists in registering directly methods rather than an object. It might or might not be your personal preference, but it's generally slightly faster (not much, you just avoid a virtual dispatch) and the memory is easier to handle (you don't have to do delete on pointers to functions).
What you should pay attention however is the memory management aspect. You don't want to go leaking so make sure to use RAII idioms.

Reconciling classes, inheritance, and C callbacks

In my C++ project, I've chosen to use a C library. In my zeal to have a well-abstracted and simple design, I've ended up doing a bit of a kludge. Part of my design requirement is that I can easily support multiple APIs and libraries for a given task (due, primarily, to my requirement for cross-platform support). So, I chose to create an abstract base class which would uniformly handle a given selection of libraries.
Consider this simplification of my design:
class BaseClass
{
public:
BaseClass() {}
~BaseClass() {}
bool init() { return doInit(); }
bool run() { return doWork(); }
void shutdown() { destroy(); }
private:
virtual bool doInit() = 0;
virtual bool doWork() = 0;
virtual void destroy() = 0;
};
And a class that inherits from it:
class LibrarySupportClass : public BaseClass
{
public:
LibrarySupportClass()
: BaseClass(), state_manager(new SomeOtherClass()) {}
int callbackA(int a, int b);
private:
virtual bool doInit();
virtual bool doWork();
virtual void destroy();
SomeOtherClass* state_manager;
};
// LSC.cpp:
bool LibrarySupportClass::doInit()
{
if (!libraryInit()) return false;
// the issue is that I can't do this:
libraryCallbackA(&LibrarySupportClass::callbackA);
return true;
}
// ... and so on
The problem I've run into is that because this is a C library, I'm required to provide a C-compatible callback of the form int (*)(int, int), but the library doesn't support an extra userdata pointer for these callbacks. I would prefer doing all of these callbacks within the class because the class carries a state object.
What I ended up doing is...
static LibrarySupportClass* _inst_ptr = NULL;
static int callbackADispatch(int a, int b)
{
_inst_ptr->callbackA(a, b);
}
bool LibrarySupportClass::doInit()
{
_inst_ptr = this;
if (!libraryInit()) return false;
// the issue is that I can't do this:
libraryCallbackA(&callbackADispatch);
return true;
}
This will clearly do Bad Things(TM) if LibrarySupportClass is instantiated more than once, so I considered using the singleton design, but for this one reason, I can't justify that choice.
Is there a better way?
You can justify that choice: your justification is that the C library only supports one callback instance.
Singletons scare me: It's not clear how to correctly destroy a singleton, and inheritance just complicates matters. I'll take another look at this approach.
Here's how I'd do it.
LibrarySupportClass.h
class LibrarySupportClass : public BaseClass
{
public:
LibrarySupportClass();
~LibrarySupportClass();
static int static_callbackA(int a, int b);
int callbackA(int a, int b);
private:
//copy and assignment are rivate and not implemented
LibrarySupportClass(const LibrarySupportClass&);
LibrarySupportClass& operator=(const LibrarySupportClass&);
private:
static LibrarySupportClass* singleton_instance;
};
LibrarySupportClass.cpp
LibrarySupportClass* LibrarySupportClass::singleton_instance = 0;
int LibrarySupportClass::static_callbackA(int a, int b)
{
if (!singleton_instance)
{
WHAT? unexpected callback while no instance exists
}
else
{
return singleton_instance->callback(a, b);
}
}
LibrarySupportClass::LibrarySupportClass()
{
if (singleton_instance)
{
WHAT? unexpected creation of a second concurrent instance
throw some kind of exception here
}
singleton_instance = this;
}
LibrarySupportClass::~LibrarySupportClass()
{
singleton_instance = 0;
}
My point is that you don't need to give it the external interface of a canonical 'singleton' (which e.g. makes it difficult to destroy).
Instead, the fact that there is only one of it can be a private implementation detail, and enforced by a private implementation detail (e.g. by the throw statement in the constructor) ... assuming that the application code is already such that it will not try to create more than one instance of this class.
Having an API like this (instead of the more canonical 'singleton' API) means that you can for example create an instance of this class on the stack if you want to (provided you don't try to create more than one of it).
The external constraint of the c library dictates that when your callback is called you don't have the identification of the "owning" instance of the callback. Therefore I think that your approach is correct.
I would suggest to declare the callbackDispatch method a static member of the class, and make the class itself a singleton (there are lots of examples of how to implement a singleton). This will let you implement similar classes for other libraries.
Dani beat me to the answer, but one other idea is that you could have a messaging system where the call back function dispatch the results to all or some of the instances of your class. If there isn't a clean way to figure out which instance is supposed to get the results, then just let the ones that don't need it ignore the results.
Of course this has the problem of performance if you have a lot of instances, and you have to iterate through the entire list.
The problem the way I see it is that because your method is not static, you can very easily end up having an internal state in a function that isn't supposed to have one, which, because there's a single instance on the top of the file, can be carried over between invocations, which is a -really- bad thing (tm). At the very least, as Dani suggested above, whatever methods you're calling from inside your C callback would have to be static so that you guarantee no residual state is left from an invocation of your callback.
The above assumes you have static LibrarySupportClass* _inst_ptr declared at the very top. As an alternative, consider having a factory function which will create working copies of your LibrarySupportClass on demand from a pool. These copies can then return to the pool after you're done with them and be recycled, so that you don't go around creating an instance every time you need that functionality.
This way you can have your objects keep state during a single callback invocation, since there's going to be a clear point where your instance is released and gets a green light to be reused. You will also be in a much better position for a multi-threaded environment, in which case each thread gets its own LibrarySupportClass instance.
The problem I've run into is that because this is a C library, I'm required to provide a C-compatible callback of the form int (*)(int, int), but the library doesn't support an extra userdata pointer for these callbacks
Can you elaborate? Is choosing a callback type based on userdata a problem?
Could your callback choose an instance based on a and/or b? If so, then register your library support classes in a global/static map and then have callbackADispatch() look up the correct instance in the map.
Serializing access to the map with a mutex would be a reasonable way to make this thread-safe, but beware: if the library holds any locks when it invokes your callback, then you may have to do something more clever to avoid deadlocks, depending on your lock hierarchy.