Is it possible to make a factory in C++ that complies with the open/closed principle? - c++

In a project I'm working on in C++, I need to create objects for messages as they come in over the wire. I'm currently using the factory method pattern to hide the creation of objects:
// very pseudo-codey
Message* MessageFactory::CreateMessage(InputStream& stream)
{
    char header = stream.ReadByte();
    switch (header) {
    case MessageOne::Header:
        return new MessageOne(stream);
    case MessageTwo::Header:
        return new MessageTwo(stream);
    // etc.
    default:
        return nullptr; // unknown header
    }
}
The problem I have with this is that I'm lazy and don't like writing the names of the classes in two places!
In C# I would do this with some reflection on first use of the factory (bonus question: that's an OK use of reflection, right?) but since C++ lacks reflection, this is off the table. I thought about using a registry of some sort so that the messages would register themselves with the factory at startup, but this is hampered by the non-deterministic (or at least implementation-specific) static initialization order problem.
So the question is, is it possible to implement this type of factory in C++ while respecting the open/closed principle, and how?
EDIT: Apparently I'm overthinking this. I intended this question to be a "how would you do this in C++" since it's really easy to do with reflection in other languages.

I think that the open/closed approach and DRY are good principles. But they are not sacred. The goal should be making the code reliable and maintainable. If you have to perform unnatural acts to adhere to O/C or DRY, then you may simply be making your code needlessly more complex with no material benefit.
Here is something I wrote a few years ago on how I make these judgment calls.

You do not need to make your code follow every possible principle simultaneously. The aim should be to stick to as many of those paradigms as possible and no more. Do not over-engineer your solution -- you are likely to end up with spaghetti code otherwise.

I have answered another SO question about C++ factories. Please see there if a flexible factory is of interest. I try to describe an old approach from ET++ that uses macros and has worked great for me.
The method is macro-based and easily extensible.
ET++ was a project to port the old MacApp to C++ and X11. In the course of it, Erich Gamma and others started to think about design patterns.

You could convert the classes that create messages (MessageOne, MessageTwo, ...) into message factories and register them with a top-level MessageFactory on initialization.
The message factory could hold a map from MessageX::Header to an instance of MessageXFactory.
In CreateMessage you would look up the MessageXFactory for the message header and then call its method that returns an instance of the actual MessageX.
With new messages you no longer have to modify the 'switch'; you just need to add an instance of the new MessageXFactory to the TopMessageFactory.
Example:
#include <iostream>
#include <map>
#include <string>
using namespace std;
struct Message
{
static const int id = 99;
virtual ~Message() {}
virtual int msgId() { return id; }
};
struct NullMessage : public Message
{
static const int id = 0;
virtual int msgId() { return id; }
};
struct MessageOne : public Message
{
static const int id = 1;
virtual int msgId() { return id; }
};
struct MessageTwo : public Message
{
static const int id = 2;
virtual int msgId() { return id; }
};
struct MessageThree : public Message
{
static const int id = 3;
virtual int msgId() { return id; }
};
struct IMessageFactory
{
virtual ~IMessageFactory() {}
virtual Message * createMessage() = 0;
};
struct MessageOneFactory : public IMessageFactory
{
MessageOne * createMessage()
{
return new MessageOne();
}
};
struct MessageTwoFactory : public IMessageFactory
{
MessageTwo * createMessage()
{
return new MessageTwo();
}
};
struct TopMessageFactory
{
Message * createMessage(const string& data)
{
map<string, IMessageFactory*>::iterator it = msgFactories.find(data);
if (it == msgFactories.end()) return new NullMessage();
return (*it).second->createMessage();
}
bool registerFactory(const string& msgId, IMessageFactory * factory)
{
if (!factory) return false;
msgFactories[msgId] = factory;
return true;
}
map<string, IMessageFactory*> msgFactories;
};
int main()
{
TopMessageFactory factory;
// The factories live on the stack, so nothing has to be freed for them.
MessageOneFactory mof;
MessageTwoFactory mtf;
factory.registerFactory("one", &mof);
factory.registerFactory("two", &mtf);
Message * msg = factory.createMessage("two");
cout << msg->msgId() << endl;
delete msg; // the caller owns the created message
msg = factory.createMessage("one");
cout << msg->msgId() << endl;
delete msg;
}

First, your system is not so open-ended, since you switch on an 8-bit char, so your message type count won't exceed 256 ;-)
Joking apart, this is a situation where I'd use a little templated factory class (stateless if you put your char message type in a non-type template argument, or with just that char as state) that accepts your stream& and does the new on its T template argument, passing the stream& and returning the result. You'll need a little registrar class to declare as a static with global scope, which registers the concrete T-instantiated factory (via an abstract base class pointer) with a manager (we have a generic one that takes a "factory domain" key). In your case I wouldn't use a map but directly a 256-"slot" array to put the factory_base* in.
Once you have the factory framework in place, it's easy and reusable. --DD
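A minimal sketch of the templated factory plus registrar idea described above, under some assumptions: all names (FactoryManager, Factory, Registrar, the message types and their header values) are made up for illustration, and the stream argument is left out to keep it short. The manager is reached through a function-local static, which is one common way to avoid depending on static initialization order across translation units.

```cpp
#include <array>
#include <cassert>
#include <memory>

// Hypothetical message hierarchy, for illustration only.
struct Message {
    virtual ~Message() = default;
    virtual int id() const = 0;
};

// Abstract factory interface stored in the manager's slots.
struct FactoryBase {
    virtual ~FactoryBase() = default;
    virtual std::unique_ptr<Message> create() const = 0;
};

// One slot per possible 8-bit header value.
class FactoryManager {
public:
    // Function-local static: constructed on first use, so registrars in
    // other translation units can call this safely during startup.
    static FactoryManager& instance() {
        static FactoryManager manager;
        return manager;
    }
    void add(unsigned char header, const FactoryBase* factory) {
        assert(slots_[header] == nullptr && "header registered twice");
        slots_[header] = factory;
    }
    // Returns nullptr when no factory is registered for the header.
    std::unique_ptr<Message> create(unsigned char header) const {
        const FactoryBase* factory = slots_[header];
        return factory ? factory->create() : nullptr;
    }
private:
    std::array<const FactoryBase*, 256> slots_{};
};

// Stateless templated factory: the header is a non-type template argument.
template <typename T, unsigned char Header>
struct Factory : FactoryBase {
    std::unique_ptr<Message> create() const override {
        return std::make_unique<T>();
    }
};

// Registrar: declaring one static instance per message type registers it.
template <typename T, unsigned char Header>
struct Registrar {
    Registrar() {
        static const Factory<T, Header> factory{};
        FactoryManager::instance().add(Header, &factory);
    }
};

// Adding a message touches only its own code: no central switch to edit.
struct MessageOne : Message { int id() const override { return 1; } };
static const Registrar<MessageOne, 0x01> registerMessageOne;

struct MessageTwo : Message { int id() const override { return 2; } };
static const Registrar<MessageTwo, 0x02> registerMessageTwo;
```

With this in place, FactoryManager::instance().create(header) replaces the switch. One caveat: if the message classes live in a static library, the linker may drop object files that nothing references, and their registrars then never run.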

Related

How to hold a List<> of opaque handles in C++/CLI?

I am writing a C++/CLI wrapper for a C library in order to use it in C#. I only have access to the C header file and the .lib of the C library, not the source code.
Some of the functions I am trying to wrap are returning opaque handles, such as:
typedef struct SanEvent_s *SanEvent;
typedef struct SanValue_s *SanValue;
Returning objects of this type on the C# end seems like trouble to me, as I don't know the implementation of the struct (I tried returning the SanEvent type in the C++ wrapper, but on the C# end that type is not accessible due to its "protection level", or whatever the error said). My plan at the moment is therefore to write some helper functions that instead just return an integer representing, for example, a SanEvent in a list or similar. The list would be kept in the managed C++ wrapper, where I can actually manage the SanEvent type.
My problem is, I don't really know how to do this with this type of type.
This:
using System::Collections::Generic::List;
namespace Wrapper {
public ref class Analytics
{
private:
static List<SanEvent^>^ events = gcnew List<SanEvent^>();
}
}
Gives me the errors: handle to handle, pointer, or reference is not allowed
The right hand side also complains about expected type specifier + the same error as above.
Can anyone give me some tips on how I could tackle this issue neatly and efficiently? My List implementation is not carved in stone, and I am open to better suggestions.
Let's imagine the following SanEvent declaration:
struct SanEvent_s
{
int test;
};
typedef SanEvent_s *SanEvent;
And the following C++ API to work with such an event:
SanEvent GetEvent()
{
auto e = new SanEvent_s();
e->test=42;
return e;
}
int UseEvent(SanEvent pEvent)
{
return pEvent->test;
}
All this code is contained in a static library project (fully native, no CLR).
Then we have a C++/CLI project to wrap this static lib.
Here we have a wrapper for the event itself:
#include "./../CppLib/SanEvent_s.h"
public ref class SanEventWrapper: Microsoft::Win32::SafeHandles::SafeHandleZeroOrMinusOneIsInvalid
{
public:
static SanEventWrapper^ GetWrapper()
{
return gcnew SanEventWrapper(GetEvent());
}
internal:
SanEventWrapper(SanEvent event):SafeHandleZeroOrMinusOneIsInvalid(true)
{
this->e = event;
this->handle = System::IntPtr(event);
}
int UseWrapper()
{
return ::UseEvent(this->e);
}
protected:
bool ReleaseHandle() override
{
//todo: release wrapped event
return true;
}
private:
SanEvent e;
};
And another class which uses such a wrapper
public ref class SanEventConsumer
{
public:
int ConsumeEvent(SanEventWrapper^ wrapper)
{
return wrapper->UseWrapper();
}
};
And finally, how to use all this from C#:
var wrapper = SanEventWrapper.GetWrapper();
var consumer = new SanEventConsumer();
var res = consumer.ConsumeEvent(wrapper);
Console.WriteLine(res);
This should print 42.
Notes:
this is a very simplified sample. It should be adapted in accordance with the semantics of the 'SanEvent' struct as well as with the requirements of the SafeHandle documentation (https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.safehandle?view=netframework-4.8 and https://learn.microsoft.com/en-us/dotnet/api/microsoft.win32.safehandles.safehandlezeroorminusoneisinvalid?view=netframework-4.8)
you should decide whether your wrapper will own the SanEvent object or not, and implement ReleaseHandle and Dispose accordingly.
you may consider using another base class from this list https://learn.microsoft.com/en-us/dotnet/api/microsoft.win32.safehandles?view=netframework-4.8 instead of 'SafeHandleZeroOrMinusOneIsInvalid', or even inheriting directly from SafeHandle.
you can even think about dropping the SafeHandle-related stuff altogether and writing a simple wrapper of your own, but that can give some surprises in connection with the GC.
depending on the semantics of the SanEvent, you may also need to implement a factory to guarantee that you always return the same wrapper instance to managed code for equal values of the raw native pointer.
Here's something similar to what @Serg has above, but it explicitly goes with the idea that you have NO IDEA in the C# world what's inside the object.
So if you have a C++/CLI library made in VS, you get this in the .h file:
#pragma once
#include <cstdint>
using namespace System;
namespace CppCliLibrary {
public ref class Class1
{
public:
static IntPtr getOpaqueInstance(int32_t argument);
static void useOpaqueInstance(IntPtr obj);
static void freeOpaqueInstance(IntPtr obj);
};
}
Like above, using IntPtr to represent a pointer to "whatever". The corresponding .cpp file is this:
#include "pch.h"
#include "CppCliLibrary.h"
#include <string>
#include <iostream>
namespace CppCliLibrary
{
class OpaqueCppClass
{
public:
OpaqueCppClass(int32_t arg)
: m_int(arg) { }
int32_t m_int;
};
}
IntPtr CppCliLibrary::Class1::getOpaqueInstance(int32_t argument)
{
return IntPtr(new OpaqueCppClass(argument));
}
void CppCliLibrary::Class1::useOpaqueInstance(IntPtr obj)
{
CppCliLibrary::OpaqueCppClass* deref = reinterpret_cast<CppCliLibrary::OpaqueCppClass *>(obj.ToPointer());
std::cout << "Contents of class are: " << deref->m_int << std::endl;
}
void CppCliLibrary::Class1::freeOpaqueInstance(IntPtr obj)
{
CppCliLibrary::OpaqueCppClass* deref = reinterpret_cast<CppCliLibrary::OpaqueCppClass*>(obj.ToPointer());
std::cout << "Deleting class with contents: " << deref->m_int << std::endl;
delete deref;
}
Then in the C# file you have this:
namespace CsCoreConsole
{
class Program
{
static void Main(string[] args)
{
// Get an instance
var instance = CppCliLibrary.Class1.getOpaqueInstance(52);
// Use it
Console.WriteLine("Got an instance we're using");
CppCliLibrary.Class1.useOpaqueInstance(instance);
Console.WriteLine("Freeing it");
CppCliLibrary.Class1.freeOpaqueInstance(instance);
// Add a bunch to a list
List<IntPtr> opaqueInstances = new List<IntPtr>();
for(int i = 0; i < 5; i++)
{
opaqueInstances.Add(CppCliLibrary.Class1.getOpaqueInstance(i * 10));
}
// Use them all
foreach(var cur in opaqueInstances)
{
CppCliLibrary.Class1.useOpaqueInstance(cur);
}
// Delete them all
foreach (var cur in opaqueInstances)
{
CppCliLibrary.Class1.freeOpaqueInstance(cur);
}
}
}
}
Of course the C# project needs to reference the C++/CLI one, but you get the idea here. The C++/CLI is a factory (nothing more, nothing less) for IntPtr and it can use it as well, because to C# it's opaque. C# knows of nothing more than IntPtr.
The idea from Serg is to wrap it more, in a type-safe way. Sure, that can work, but this is the "even more raw" variant, if you want to put it "directly" into a List<>
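For the integer-handle plan mentioned in the question, here is a rough sketch in standard C++ (not C++/CLI) of a table that maps small integer ids to opaque pointers. It assumes the table would live in the native part of the wrapper, with only the ids crossing into C#. HandleTable and its members are hypothetical names, and SanEvent_s is given a dummy layout purely so the example compiles.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Stand-in for the library's opaque type; the real layout is unknown.
struct SanEvent_s { int test; };
typedef SanEvent_s* SanEvent;

// Maps small integer ids to opaque native handles. Only the ids would
// cross the managed boundary; the pointers stay on the native side.
class HandleTable {
public:
    // Stores the handle and returns a fresh id (ids are never reused).
    int add(SanEvent handle) {
        assert(handle != nullptr);
        const int id = next_id_++;
        handles_[id] = handle;
        return id;
    }
    // Returns the handle for an id, or nullptr if the id is unknown.
    SanEvent find(int id) const {
        auto it = handles_.find(id);
        return it == handles_.end() ? nullptr : it->second;
    }
    // Forgets the id and hands the pointer back so the caller can
    // release it with whatever function the C library provides.
    SanEvent remove(int id) {
        SanEvent handle = find(id);
        handles_.erase(id);
        return handle;
    }
    std::size_t size() const { return handles_.size(); }
private:
    int next_id_ = 1;  // 0 is left free to mean "no handle"
    std::unordered_map<int, SanEvent> handles_;
};
```

Compared with handing out IntPtr directly, a stale or made-up id simply fails the lookup instead of being dereferenced as a pointer.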

Mechanism for Save/Load+Undo/Redo with minimum boilerplate

I want to make an app where a user can edit a diagram (for example), which would provide the standard mechanisms of: Save, Load, Undo, and Redo.
A simple way to do it is to have classes for the diagram and for the various shapes in it, which implement serialization via save and load methods, and where all methods to edit them return UndoableActions that can be added to an UndoManager which calls their perform method and adds them to an undo stack.
The problem with the simple way described above is that it requires a lot of error-prone boilerplate work.
I know that the serialization (save/load) part of the work can be solved by using something like Google's Protocol Buffers or Apache Thrift, which generates the boiler-plate serialization code for you, but it doesn't solve the undo+redo problem. I know that for Objective C and Swift, Apple provides Core Data which does solve serialization + undo, but I'm not familiar with anything similar for C++.
Is there a good, non-error-prone way to solve save+load+undo+redo with little boilerplate?
The problem with the simple way described above is that it requires a lot of error-prone boilerplate work.
I am not convinced that this is the case. Your approach sounds reasonable and using Modern C++ features and abstractions you can implement a safe and elegant interface for it.
For starters, you could use std::variant as a sum type for "undoable actions" - this will give you a type-safe tagged union for every action. (Consider using boost::variant or other implementations that can be easily found on Google if you do not have access to C++17). Example:
namespace action
{
// User dragged the shape to a separate position.
struct move_shape
{
shape_id _id;
offset _offset;
};
// User changed the color of a shape.
struct change_shape_color
{
shape_id _id;
color _previous;
color _new;
};
// ...more actions...
}
using undoable_action = std::variant<
action::move_shape,
action::change_shape_color,
// ...
>;
Now that you have a sum type for all your possible "undoable actions", you can define undo behavior by using pattern matching. I wrote two articles on variant "pattern matching" by overloading lambdas that you could find interesting:
"visiting variants using lambdas - part 1"
"visiting variants using lambdas - part 2"
Here's an example of what your undo function could look like:
void undo()
{
auto action = undo_stack.pop_and_get();
match(action, [&shapes](const move_shape& y)
{
// Revert shape movement.
shapes[y._id].move(-y._offset);
},
[&shapes](const change_shape_color& y)
{
// Revert shape color change.
shapes[y._id].set_color(y._previous);
},
[](auto)
{
// Produce a compile-time error.
struct undo_not_implemented;
undo_not_implemented{};
});
}
If every branch of match gets large, it could be moved to its own function for readability. Trying to instantiate undo_not_implemented or using a dependent static_assert is also a good idea: a compile-time error will be produced if you forget to implement behavior for a specific "undoable action".
That's pretty much it! If you want to save the undo_stack so that the history of actions is preserved in saved documents, you can implement an auto serialize(const undoable_action&) that, again, uses pattern matching to serialize the various actions. You could then implement a deserialize function that repopulates the undo_stack on file load.
If you find implementing serialization/deserialization for every action too tedious, consider using BOOST_HANA_DEFINE_STRUCT or similar solutions to automatically generate serialization/deserialization code.
Since you're concerned about battery and performance, I would also like to mention that using std::variant or similar tagged union constructs is on average faster and more lightweight compared to polymorphic hierarchies, as heap allocation is not required and as there is no run-time virtual dispatch.
About redo functionality: you could have a redo_stack and implement an auto invert(const undoable_action&) function that inverts the behavior of an action. Example:
void undo()
{
auto action = undo_stack.pop_and_get();
match(action, [&](const move_shape& y)
{
// Revert shape movement.
shapes[y._id].move(-y._offset);
redo_stack.push(invert(y));
},
// ...
auto invert(const undoable_action& x)
{
return match(x, [&](move_shape y)
{
y._offset *= -1;
return y;
},
// ...
If you follow this pattern, you can implement redo in terms of undo! Simply call undo by popping from the redo_stack instead of the undo_stack: since you "inverted" the actions it will perform the desired operation.
EDIT: here's a minimal wandbox example that implements a match function that takes in a variant and returns a variant.
The example uses boost::hana::overload to generate the visitor.
The visitor is wrapped in a lambda f that unifies the return type to the type of the variant: this is necessary as std::visit requires that the visitor always returns the same type.
If returning a type which is different from the variant is desirable, std::common_type_t could be used, otherwise the user could explicitly specify it as the first template parameter of match.
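To make the variant approach concrete, here is a small self-contained sketch that uses plain std::visit with the usual "overloaded" helper in place of the match function and boost::hana::overload mentioned above. The shape model, the integer ids and the document struct are assumptions made up for the example. Redo is implemented in terms of undo by storing inverted actions, as described.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical minimal model: a shape is just an x position and a color.
struct shape { int x = 0; int color = 0; };

struct move_shape { int id; int offset; };
struct change_shape_color { int id; int previous; int next; };
using undoable_action = std::variant<move_shape, change_shape_color>;

// The usual C++17 helper for building a visitor out of lambdas.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// Returns the action that cancels `action`.
inline undoable_action invert(const undoable_action& action) {
    return std::visit(overloaded{
        [](move_shape m) -> undoable_action { m.offset = -m.offset; return m; },
        [](change_shape_color c) -> undoable_action {
            std::swap(c.previous, c.next);
            return c;
        }
    }, action);
}

struct document {
    std::map<int, shape> shapes;
    std::vector<undoable_action> undo_stack;
    std::vector<undoable_action> redo_stack;

    // Applies an action to the shapes without touching either stack.
    void apply(const undoable_action& action) {
        std::visit(overloaded{
            [this](const move_shape& m) { lookup(m.id).x += m.offset; },
            [this](const change_shape_color& c) { lookup(c.id).color = c.next; }
        }, action);
    }
    void perform(const undoable_action& action) {
        apply(action);
        undo_stack.push_back(action);
        redo_stack.clear();  // a new edit invalidates the redo history
    }
    void undo() { revert_from(undo_stack, redo_stack); }
    void redo() { revert_from(redo_stack, undo_stack); }

private:
    shape& lookup(int id) {
        auto it = shapes.find(id);
        assert(it != shapes.end() && "action refers to an unknown shape");
        return it->second;
    }
    // Pops the latest action, applies its inverse, and stores that inverse
    // on the other stack so that reverting it later restores the original.
    void revert_from(std::vector<undoable_action>& from,
                     std::vector<undoable_action>& to) {
        if (from.empty()) return;
        const undoable_action inverse = invert(from.back());
        from.pop_back();
        apply(inverse);
        to.push_back(inverse);
    }
};
```

Because std::visit requires the visitor to accept every alternative, adding a new action type to undoable_action without handling it in apply and invert fails to compile.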
Two reasonable approaches to this problem are implemented in the frameworks Flip and ODB.
Code-generation / ODB
With ODB you need to add #pragma declarations to your code and have its tool generate methods that you use for save/load and for editing the model, like so:
#pragma db object
class person
{
public:
void setName (string);
string getName();
...
private:
friend class odb::access;
person () {}
#pragma db id
string email_;
string name_;
};
Where the accessors declared in the class are auto-generated by ODB so that all changes to the model can get captured and undo-transactions may be made for them.
Reflection with minimal boilerplate / Flip
Unlike ODB, Flip doesn't generate C++ code for you, but rather it requires your program to call Model::declare to re-declare your structures like so:
class Song : public flip::Object
{
public:
static void declare ();
flip::Float tempo;
flip::Array <Track> tracks;
};
void Song::declare ()
{
Model::declare <Song> ()
.name ("acme.product.Song")
.member <flip::Float, &Song::tempo> ("tempo")
.member <flip::Array <Track>, &Song::tracks> ("tracks");
}
int main()
{
Song::declare();
...
}
With the structure declared like so, flip::Object's constructor can initialize all the fields so that they point to the undo stack and all edits to them are recorded. It also has a list of all the members, so flip::Object can implement the serialization for you.
The problem with the simple way described above is that it requires a lot of error-prone boilerplate work.
I would say that the actual problem is that your undo/redo logic is part of a component that should carry only a bunch of data, such as a position, some content and so on.
A common OOP way to decouple the undo/redo logic from the data is the command design pattern.
The basic idea is that all user interactions are converted to commands, and those commands are executed on the diagram itself. They contain all the information required to perform an operation and to roll it back, as long as you maintain an ordered list of commands and undo/redo them in order (which is usually what the user expects).
Another common OOP pattern that can help you either to design a custom serialization utility or to use the most common ones is the visitor design pattern.
Here the basic idea is that your diagram should not care about the kind of components it contains. Whenever you want to serialize it, you provide a serializer and the components promote themselves to the right type when queried (see double dispatching for further details on this technique).
That being said, a minimal example is worth more than a thousand words:
#include <memory>
#include <stack>
#include <vector>
#include <utility>
#include <iostream>
#include <algorithm>
#include <string>
struct Serializer;
struct Part {
virtual ~Part() = default; // polymorphic base: make destruction safe
virtual void accept(Serializer &) = 0;
virtual void draw() = 0;
};
struct Node: Part {
void accept(Serializer &serializer) override;
void draw() override;
std::string label;
unsigned int x;
unsigned int y;
};
struct Link: Part {
void accept(Serializer &serializer) override;
void draw() override;
std::weak_ptr<Node> from;
std::weak_ptr<Node> to;
};
struct Serializer {
void visit(Node &node) {
std::cout << "serializing node " << node.label << " - x: " << node.x << ", y: " << node.y << std::endl;
}
void visit(Link &link) {
auto pfrom = link.from.lock();
auto pto = link.to.lock();
std::cout << "serializing link between " << (pfrom ? pfrom->label : "-none-") << " and " << (pto ? pto->label : "-none-") << std::endl;
}
};
void Node::accept(Serializer &serializer) {
serializer.visit(*this);
}
void Node::draw() {
std::cout << "drawing node " << label << " - x: " << x << ", y: " << y << std::endl;
}
void Link::accept(Serializer &serializer) {
serializer.visit(*this);
}
void Link::draw() {
auto pfrom = from.lock();
auto pto = to.lock();
std::cout << "drawing link between " << (pfrom ? pfrom->label : "-none-") << " and " << (pto ? pto->label : "-none-") << std::endl;
}
struct TreeDiagram;
struct Command {
virtual ~Command() = default; // commands are deleted through unique_ptr<Command>
virtual void execute(TreeDiagram &) = 0;
virtual void undo(TreeDiagram &) = 0;
};
struct TreeDiagram {
std::vector<std::shared_ptr<Part>> parts;
std::stack<std::unique_ptr<Command>> commands;
void execute(std::unique_ptr<Command> command) {
command->execute(*this);
commands.push(std::move(command));
}
void undo() {
if(!commands.empty()) {
commands.top()->undo(*this);
commands.pop();
}
}
void draw() {
std::cout << "draw..." << std::endl;
for(auto &part: parts) {
part->draw();
}
}
void serialize(Serializer &serializer) {
std::cout << "serialize..." << std::endl;
for(auto &part: parts) {
part->accept(serializer);
}
}
};
struct AddNode: Command {
AddNode(std::string label, unsigned int x, unsigned int y):
label{label}, x{x}, y{y}, node{std::make_shared<Node>()}
{
node->label = label;
node->x = x;
node->y = y;
}
void execute(TreeDiagram &diagram) override {
diagram.parts.push_back(node);
}
void undo(TreeDiagram &diagram) override {
auto &parts = diagram.parts;
parts.erase(std::remove(parts.begin(), parts.end(), node), parts.end());
}
std::string label;
unsigned int x;
unsigned int y;
std::shared_ptr<Node> node;
};
struct AddLink: Command {
AddLink(std::shared_ptr<Node> from, std::shared_ptr<Node> to):
link{std::make_shared<Link>()}
{
link->from = from;
link->to = to;
}
void execute(TreeDiagram &diagram) override {
diagram.parts.push_back(link);
}
void undo(TreeDiagram &diagram) override {
auto &parts = diagram.parts;
parts.erase(std::remove(parts.begin(), parts.end(), link), parts.end());
}
std::shared_ptr<Link> link;
};
struct MoveNode: Command {
MoveNode(unsigned int x, unsigned int y, std::shared_ptr<Node> node):
px{node->x}, py{node->y}, x{x}, y{y}, node{node}
{}
void execute(TreeDiagram &) override {
node->x = x;
node->y = y;
}
void undo(TreeDiagram &) override {
node->x = px;
node->y = py;
}
unsigned int px;
unsigned int py;
unsigned int x;
unsigned int y;
std::shared_ptr<Node> node;
};
int main() {
TreeDiagram diagram;
Serializer serializer;
auto addNode1 = std::make_unique<AddNode>("foo", 0, 0);
auto addNode2 = std::make_unique<AddNode>("bar", 100, 50);
auto moveNode2 = std::make_unique<MoveNode>(10, 10, addNode2->node);
auto addLink = std::make_unique<AddLink>(addNode1->node, addNode2->node);
diagram.serialize(serializer);
diagram.execute(std::move(addNode1));
diagram.execute(std::move(addNode2));
diagram.execute(std::move(addLink));
diagram.serialize(serializer);
diagram.execute(std::move(moveNode2));
diagram.draw();
diagram.undo();
diagram.undo();
diagram.serialize(serializer);
}
I've not implemented the redo action and the code is far from being a production-ready piece of software, but it acts quite well as a starting point from which to create something more complex.
As you can see, the goal is to create a tree diagram that contains both nodes and links. A component contains a bunch of data and knows how to draw itself. Moreover, as anticipated, a component accepts a serializer in case you want to write it to a file or elsewhere.
All the logic is contained in the so-called commands. In the example there are three commands: add node, add link and move node. Neither the diagram nor the components know anything about what's going on under the hood. All the diagram knows is that it's executing a set of commands and that those commands can be undone one step at a time.
A more complex undo/redo system can contain a circular buffer of commands and a few indexes that indicate the one to substitute with the next one, the one valid when going forth and the one valid when going back.
It's quite easy to implement indeed.
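As a rough illustration of the redo part that the example leaves out, here is a sketch of a command history with a cursor. It swaps the circular buffer for a plain growing std::vector to stay short, and it represents a command as a pair of closures instead of the class hierarchy above. History, its members and this Command struct are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical command: a pair of closures instead of a class hierarchy.
struct Command {
    std::function<void()> execute;
    std::function<void()> undo;
};

class History {
public:
    // Runs the command and records it, discarding any undone commands
    // that sat beyond the cursor.
    void execute(Command command) {
        assert(command.execute && command.undo);
        command.execute();
        commands_.resize(cursor_);  // only ever shrinks the vector here
        commands_.push_back(std::move(command));
        cursor_ = commands_.size();
    }
    bool can_undo() const { return cursor_ > 0; }
    bool can_redo() const { return cursor_ < commands_.size(); }
    // Steps back one command; returns false if there is nothing to undo.
    bool undo() {
        if (!can_undo()) return false;
        commands_[--cursor_].undo();
        return true;
    }
    // Replays the command just after the cursor; returns false if none.
    bool redo() {
        if (!can_redo()) return false;
        commands_[cursor_++].execute();
        return true;
    }
private:
    std::vector<Command> commands_;
    std::size_t cursor_ = 0;  // commands_[0, cursor_) are currently applied
};
```

Turning this into a bounded circular buffer would mostly mean wrapping the indexes and dropping the oldest entry when the buffer is full.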
This approach will help you decouple the logic from the data, and it's quite common when dealing with user interfaces.
To be honest, it's not something that suddenly came to mind. I found something similar while looking at how open-source software solved the issue, and I used it a few years ago in a piece of software of mine. The resulting code is really easy to maintain.
Another approach you might want to consider is working with immutable data structures and objects. Then, the undo/redo stack can be implemented as a stack of versions of the scene/diagram/document. Undo() replaces the current version with an older version from the stack, and so on. Because all data is immutable, you can keep references instead of copies, so it is fast and (relatively) cheap.
Pros:
simple undo/redo
multithread-friendly
clean separation of "structure" and transient state (e.g. current selection)
may simplify serialization
caching/memoization/precomputation-friendly (e.g. bounding-box, gpu buffers)
Cons:
consumes a bit more memory
forces separation of "structure" and transient state
probably more difficult: for example, for a typical tree-like scenegraph, to change a node you would also need to change all the nodes along the path to the root; the old and new versions can share the rest of the nodes
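A minimal sketch of the version-stack idea, assuming a deliberately tiny document model (a flat list of labels) so that the sharing is easy to see. Diagram, Version, with_label and VersionHistory are names invented for this example. A real scene graph would need the path-copying mentioned in the last point above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical immutable document: once published, a Diagram never changes.
struct Diagram {
    std::vector<std::shared_ptr<const std::string>> labels;
};
using Version = std::shared_ptr<const Diagram>;

// "Editing" builds a new version that shares every untouched label.
inline Version with_label(const Version& base, std::size_t index, std::string text) {
    assert(index < base->labels.size());
    auto next = std::make_shared<Diagram>(*base);  // copies pointers, not strings
    next->labels[index] = std::make_shared<std::string>(std::move(text));
    return next;
}

class VersionHistory {
public:
    explicit VersionHistory(Version initial) : versions_{std::move(initial)} {}
    const Version& current() const { return versions_[cursor_]; }
    void commit(Version next) {
        versions_.resize(cursor_ + 1);  // drop versions that were undone
        versions_.push_back(std::move(next));
        ++cursor_;
    }
    void undo() { if (cursor_ > 0) --cursor_; }
    void redo() { if (cursor_ + 1 < versions_.size()) ++cursor_; }
private:
    std::vector<Version> versions_;
    std::size_t cursor_ = 0;  // index of the version currently shown
};
```

Undo and redo only move an index, and an old version stays valid for as long as anything (for example a background save) still holds a reference to it.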
Assuming that you're calling save() on a temporary file for each edit of the diagram (even if the user doesn't explicitly invoke the save action) and that you undo only the latest action, you can do as follows:
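#include <fstream>
#include <stdexcept>
#include <string>
// LastDiagram is assumed to be your default-constructible diagram type.
LastDiagram load(const std::string &path)
{
    /* Check for valid path (boost::filesystem would also work here) */
    std::ifstream in{path};
    if(!in)
    {
        throw std::runtime_error{"No diagram found"};
    }
    LastDiagram diagram;
    // read the diagram from `in` here
    return diagram;
}
LastDiagram undoLastAction()
{
    return load("/tmp/tmp_diagram_file");
}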
and in your main app you handle the exception if it is thrown. If you want to allow more undos, you should consider a solution like sqlite or a tmp file with more entries.
If time and space performance are issues due to large diagrams, consider a strategy such as keeping an incremental difference for each element of a diagram in a std::vector (limit it to 3/5 if objects are big) and calling the renderer with the current state. I'm not an OpenGL expert, but I think that's the way it's done there. You could actually 'steal' this strategy from game development best practices, or from graphics-related ones in general.
One of those strategies could be something like this:
A structure for efficient update, incremental redisplay and undo in graphical editors

Practical use of dynamic_cast?

I have a pretty simple question about the dynamic_cast operator. I know this is used for run time type identification, i.e., to know about the object type at run time. But from your programming experience, can you please give a real scenario where you had to use this operator? What were the difficulties without using it?
Toy example
Noah's ark shall function as a container for different types of animals. As the ark itself is not concerned about the difference between monkeys, penguins, and mosquitoes, you define a class Animal, derive the classes Monkey, Penguin, and Mosquito from it, and store each of them as an Animal in the ark.
Once the flood is over, Noah wants to distribute animals across earth to the places where they belong and hence needs additional knowledge about the generic animals stored in his ark. As one example, he can now try to dynamic_cast<> each animal to a Penguin in order to figure out which of the animals are penguins to be released in the Antarctic and which are not.
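The toy example could be sketched roughly like this. The class names follow the text, while Ark and penguins_in are made up for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Animal { virtual ~Animal() = default; };  // polymorphic base
struct Monkey : Animal {};
struct Penguin : Animal {};
struct Mosquito : Animal {};

using Ark = std::vector<std::unique_ptr<Animal>>;

// Collects the penguins; dynamic_cast yields nullptr for everything else.
inline std::vector<Penguin*> penguins_in(const Ark& ark) {
    std::vector<Penguin*> found;
    for (const auto& animal : ark) {
        assert(animal != nullptr);  // the ark holds no empty slots
        if (auto* penguin = dynamic_cast<Penguin*>(animal.get())) {
            found.push_back(penguin);
        }
    }
    return found;
}
```

Note that Animal needs at least one virtual function (here the destructor), otherwise dynamic_cast from Animal* to Penguin* does not compile.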
Real life example
We implemented an event monitoring framework, where an application would store runtime-generated events in a list. Event monitors would go through this list and examine those specific events they were interested in. Event types were OS-level things such as SYSCALL, FUNCTIONCALL, and INTERRUPT.
Here, we stored all our specific events in a generic list of Event instances. Monitors would then iterate over this list and dynamic_cast<> the events they saw to those types they were interested in. All others (those for which the cast fails, either by returning a null pointer or, for reference casts, by throwing std::bad_cast) are ignored.
Question: Why can't you have a separate list for each event type?
Answer: You can do this, but it makes extending the system with new events as well as new monitors (aggregating multiple event types) harder, because everyone needs to be aware of the respective lists to check for.
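As a small illustration of the monitor idea, and of how a failed cast shows up in each form of dynamic_cast, here is a sketch with made-up event types and helper names (the framework described above is not shown here, so none of this is its actual code):

```cpp
#include <cassert>
#include <typeinfo>
#include <vector>

struct Event { virtual ~Event() = default; };
struct Syscall : Event { int number = 0; };
struct Interrupt : Event { int line = 0; };

// Pointer form: a failed cast yields nullptr, so a monitor can simply
// skip the events it is not interested in.
inline int count_syscalls(const std::vector<const Event*>& events) {
    int count = 0;
    for (const Event* event : events) {
        assert(event != nullptr);
        if (dynamic_cast<const Syscall*>(event)) ++count;
    }
    return count;
}

// Reference form: a failed cast throws std::bad_cast instead.
inline bool is_interrupt(const Event& event) {
    try {
        (void)dynamic_cast<const Interrupt&>(event);
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

The pointer form is usually the better fit for filtering, since it avoids using exceptions for the common "not my event" case.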
A typical use case is the visitor pattern:
struct Element;
// Visitor is defined first so that Element::accept can call visit().
struct Visitor
{
virtual void visit(Element * e) = 0;
virtual ~Visitor() { }
};
struct Element
{
virtual ~Element() { }
void accept(Visitor & v)
{
v.visit(this);
}
};
struct RedElement : Element { };
struct BlueElement : Element { };
struct FifthElement : Element { };
struct MyVisitor : Visitor
{
virtual void visit(Element * e)
{
if (RedElement * p = dynamic_cast<RedElement*>(e))
{
// do things specific to Red
}
else if (BlueElement * p = dynamic_cast<BlueElement*>(e))
{
// do things specific to Blue
}
else
{
// error: visitor doesn't know what to do with this element
}
}
};
Now if you have some Element & e;, you can make MyVisitor v; and say e.accept(v).
The key design feature is that if you modify your Element hierarchy, you only have to edit your visitors. The pattern is still fairly complex, and only recommended if you have a very stable class hierarchy of Elements.
Imagine this situation: You have a C++ program that reads and displays HTML. You have a base class HTMLElement which has a pure virtual method displayOnScreen. You also have a function called renderHTMLToBitmap, which draws the HTML to a bitmap. If each HTMLElement has a vector<HTMLElement*> children;, you can just pass the HTMLElement representing the element <html>. But what if a few of the subclasses need special treatment, like <link> for adding CSS. You need a way to know if an element is a LinkElement so you can give it to the CSS functions. To find that out, you'd use dynamic_cast.
The problem with dynamic_cast and polymorphism in general is that it's not terribly efficient. When you add vtables into the mix, it only gets worse.
When you add virtual functions to a base class, calling them actually means going through quite a few layers of function pointers and memory areas. That will never be more efficient than something like the ASM call instruction.
Edit: In response to Andrew's comment below, here's a new approach: instead of dynamic casting to the specific element type (LinkElement), you have another abstract subclass of HTMLElement called ActionElement that overrides displayOnScreen with a function that displays nothing and declares a new pure virtual function: virtual void doAction() const = 0. The dynamic_cast is changed to test for ActionElement and just calls doAction(). You'd have the same kind of subclass, GraphicalElement, with a virtual method displayOnScreen().
Edit 2: Here's what a "rendering" method might look like:
// Take the root by reference: passing an abstract HTMLElement by value would not compile (and would slice).
void render(HTMLElement& root) {
for(vector<HTMLElement*>::iterator i = root.children.begin(); i != root.children.end(); ++i) {
if(ActionElement* ae = dynamic_cast<ActionElement*>(*i)) //Is an ActionElement
{
ae->doAction();
render(*ae); // recurse into this element's children
}
else if(GraphicalElement* ge = dynamic_cast<GraphicalElement*>(*i)) //Is a GraphicalElement
{
ge->displayOnScreen();
render(*ge); // recurse into this element's children
}
else
{
//Error
}
}
}
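For reference, here is a minimal self-contained sketch of the hierarchy described in the edit above. Only HTMLElement, ActionElement, GraphicalElement, LinkElement, displayOnScreen, doAction and children come from the text; ParagraphElement and the log vector are assumptions added so the traversal order can be inspected. Because ActionElement overrides displayOnScreen to do nothing, this variant needs only one dynamic_cast per child.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct HTMLElement {
    std::vector<HTMLElement*> children;   // non-owning pointers, for brevity
    virtual ~HTMLElement() {}
    virtual void displayOnScreen() const = 0;
};

// Elements that act instead of drawing, e.g. <link>.
struct ActionElement : HTMLElement {
    void displayOnScreen() const {}       // displays nothing
    virtual void doAction() const = 0;
};

// Elements that draw something; still abstract.
struct GraphicalElement : HTMLElement {};

struct LinkElement : ActionElement {
    std::vector<std::string>* log;        // hypothetical: records what happened
    explicit LinkElement(std::vector<std::string>* l) : log(l) {}
    void doAction() const { log->push_back("link: load CSS"); }
};

struct ParagraphElement : GraphicalElement {   // hypothetical concrete element
    std::vector<std::string>* log;
    explicit ParagraphElement(std::vector<std::string>* l) : log(l) {}
    void displayOnScreen() const { log->push_back("p: draw"); }
};

// Visits the children of root depth-first. Action elements run doAction(),
// everything else is drawn through the ordinary virtual call.
void render(const HTMLElement& root) {
    for (std::size_t i = 0; i < root.children.size(); ++i) {
        const HTMLElement* child = root.children[i];
        if (const ActionElement* ae = dynamic_cast<const ActionElement*>(child)) {
            ae->doAction();
        } else {
            child->displayOnScreen();
        }
        render(*child);
    }
}
```

Calling render on an element whose children are a LinkElement followed by a ParagraphElement records the link action first and the paragraph draw second.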
Operator dynamic_cast solves the same problem as dynamic dispatch (virtual functions, visitor pattern, etc): it allows you to perform different actions based on the runtime type of an object.
However, you should always prefer dynamic dispatch, except perhaps when the number of dynamic_cast you'd need will never grow.
Eg. you should never do:
if (auto v = dynamic_cast<Dog*>(animal)) { ... }
else if (auto v = dynamic_cast<Cat*>(animal)) { ... }
...
for maintainability and performance reasons, but you can do eg.
for (MenuItem* item: items)
{
if (auto submenu = dynamic_cast<Submenu*>(item))
{
auto items = submenu->items();
draw(context, items, position); // Recursion
...
}
else
{
item->draw_icon();
item->setup_accelerator();
...
}
}
which I've found quite useful in this exact situation: you have one very particular subhierarchy that must be handled separately, this is where dynamic_cast shines. But real world examples are quite rare (the menu example is something I had to deal with).
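To make the Dog/Cat contrast concrete, here is a small sketch of both styles side by side. The speak() method and the two helper functions are hypothetical names chosen for illustration; the point is only that the cast chain has to be edited for every new subclass while the virtual call does not.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() {}
    virtual std::string speak() const = 0;   // hypothetical behaviour
};
struct Dog : Animal { std::string speak() const { return "Woof"; } };
struct Cat : Animal { std::string speak() const { return "Meow"; } };

// Chain of casts: every new Animal subclass needs another branch here.
std::string speakWithCasts(const Animal* animal) {
    if (dynamic_cast<const Dog*>(animal)) return "Woof";
    else if (dynamic_cast<const Cat*>(animal)) return "Meow";
    return "";
}

// Dynamic dispatch: new subclasses work without touching this function.
std::string speakWithDispatch(const Animal& animal) {
    return animal.speak();
}
```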
dynamic_cast is not intended as an alternative to virtual functions.
dynamic_cast has a non-trivial performance overhead (or so I think) since the whole class hierarchy has to be walked through.
dynamic_cast is similar to the 'is' operator of C# and the QueryInterface of good old COM.
So far I have found one real use of dynamic_cast:
(*) You have multiple inheritance and to locate the target of the cast the compiler has to walk the class hierarchy up and down to locate the target (or down and up if you prefer). This means that the target of the cast is in a parallel branch in relation to where the source of the cast is in the hierarchy. I think there is NO other way to do such a cast.
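A minimal sketch of such a cross-cast, assuming two unrelated hypothetical interfaces (Readable and Writable) and a class that inherits from both. A static_cast between the two interfaces does not compile because they are unrelated; dynamic_cast finds the sibling branch at runtime or returns a null pointer.

```cpp
#include <string>

struct Readable {
    virtual ~Readable() {}
    virtual std::string read() const = 0;
};
struct Writable {
    virtual ~Writable() {}
    virtual void write(const std::string& text) = 0;
};

// Sits at the bottom of both branches.
struct Buffer : Readable, Writable {
    std::string data;
    std::string read() const { return data; }
    void write(const std::string& text) { data = text; }
};

// Only in the Readable branch.
struct Constant : Readable {
    std::string read() const { return "fixed"; }
};

// Cross-cast from one branch to the parallel one.
// Returns a null pointer when the object does not implement Writable.
Writable* asWritable(Readable* r) {
    return dynamic_cast<Writable*>(r);
}
```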
In all other cases, you just use some base class virtual to tell you what type of object you have and ONLY THEN you dynamic_cast it to the target class so you can use some of its non-virtual functionality. Ideally there should be no non-virtual functionality, but what the heck, we live in the real world.
Doing things like:
if (v = dynamic_cast(...)){} else if (v = dynamic_cast(...)){} else if ...
is a performance waste.
Casting should be avoided when possible, because it is basically saying to the compiler that you know better, and it is usually a sign of a weak design decision.
However, you might run into situations where the abstraction level was a bit too high for one or two subclasses, where you have the choice to change your design or solve it by checking the subclass with dynamic_cast and handling it in a separate branch. The trade-off is between adding extra time and risk now against extra maintenance issues later.
In most situations where you are writing code in which you know the type of the entity you're working with, you just use static_cast as it's more efficient.
Situations where you need dynamic cast typically arrive (in my experience) from lack of foresight in design - typically where the designer fails to provide an enumeration or id that allows you to determine the type later in the code.
For example, I've seen this situation in more than one project already:
You may use a factory where the internal logic decides which derived class the user wants rather than the user explicitly selecting one. That factory, in a perfect world, returns an enumeration which will help you identify the type of returned object, but if it doesn't you may need to test what type of object it gave you with a dynamic_cast.
Your follow-up question would obviously be: Why would you need to know the type of object that you're using in code using a factory?
In a perfect world, you wouldn't - the interface provided by the base class would be sufficient for managing all of the factories' returned objects to all required extents. People don't design perfectly though. For example, if your factory creates abstract connection objects, you may suddenly realize that you need to access the UseSSL flag on your socket connection object, but the factory base doesn't support that and it's not relevant to any of the other classes using the interface. So, maybe you would check to see if you're using that type of derived class in your logic, and cast/set the flag directly if you are.
It's ugly, but it's not a perfect world, and sometimes you don't have time to refactor an imperfect design fully in the real world under work pressure.
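As a rough sketch of that connection scenario: only the UseSSL flag and the idea of a factory returning abstract connection objects come from the text above. The class names, makeConnection, and the "tcp://" prefix rule are all made up for illustration.

```cpp
#include <memory>
#include <string>

struct Connection {
    virtual ~Connection() {}
    virtual std::string describe() const = 0;
};
struct PipeConnection : Connection {
    std::string describe() const { return "pipe"; }
};
struct SocketConnection : Connection {
    bool UseSSL;                          // not part of the base interface
    SocketConnection() : UseSSL(false) {}
    std::string describe() const { return UseSSL ? "socket+ssl" : "socket"; }
};

// Hypothetical factory: it decides internally which derived class to build
// and does not tell the caller which one it picked.
std::unique_ptr<Connection> makeConnection(const std::string& address) {
    if (address.compare(0, 6, "tcp://") == 0)
        return std::unique_ptr<Connection>(new SocketConnection());
    return std::unique_ptr<Connection>(new PipeConnection());
}

// The workaround: test for the one derived class that has the flag.
// Returns true if the flag was applied.
bool enableSSLIfSupported(Connection& c) {
    if (SocketConnection* s = dynamic_cast<SocketConnection*>(&c)) {
        s->UseSSL = true;
        return true;
    }
    return false;
}
```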
The dynamic_cast operator is very useful to me.
I especially use it with the Observer pattern for event management:
#include <vector>
#include <string>
#include <iostream>
using namespace std;
class Subject; class Observer; class Event;
class Event { public: virtual ~Event() {}; };
class Observer { public: virtual void onEvent(Subject& s, const Event& e) = 0; };
class Subject {
private:
vector<Observer*> m_obs;
public:
void attach(Observer& obs) { m_obs.push_back(& obs); }
public:
void notifyEvent(const Event& evt) {
for (vector<Observer*>::iterator it = m_obs.begin(); it != m_obs.end(); it++) {
if (Observer* const obs = *it) {
obs->onEvent(*this, evt);
}
}
}
};
// Define a model with events that contain data.
class MyModel : public Subject {
public:
class Evt1 : public Event { public: int a; string s; };
class Evt2 : public Event { public: float f; };
};
// Define a first service that processes both events with their data.
class MyService1 : public Observer {
public:
virtual void onEvent(Subject& s, const Event& e) {
if (const MyModel::Evt1* const e1 = dynamic_cast<const MyModel::Evt1*>(& e)) {
cout << "Service1 - event Evt1 received: a = " << e1->a << ", s = " << e1->s << endl;
}
if (const MyModel::Evt2* const e2 = dynamic_cast<const MyModel::Evt2*>(& e)) {
cout << "Service1 - event Evt2 received: f = " << e2->f << endl;
}
}
};
// Define a second service that only deals with the second event.
class MyService2 : public Observer {
public:
virtual void onEvent(Subject& s, const Event& e) {
// Nothing to do with Evt1 in Service2
if (const MyModel::Evt2* const e2 = dynamic_cast<const MyModel::Evt2*>(& e)) {
cout << "Service2 - event Evt2 received: f = " << e2->f << endl;
}
}
};
int main(void) {
MyModel m; MyService1 s1; MyService2 s2;
m.attach(s1); m.attach(s2);
MyModel::Evt1 e1; e1.a = 2; e1.s = "two"; m.notifyEvent(e1);
MyModel::Evt2 e2; e2.f = .2f; m.notifyEvent(e2);
}
Contract Programming and RTTI shows how you can use dynamic_cast to allow objects to advertise what interfaces they implement. We used it in my shop to replace a rather opaque metaobject system. Now we can clearly describe the functionality of objects, even if the objects are introduced by a new module several weeks/months after the platform was 'baked' (though of course the contracts need to have been decided on up front).

Factory method anti-if implementation

I'm applying the Factory design pattern in my C++ project, and below you can see how I am doing it. I'm trying to improve my code by following the "anti-if" campaign, and thus want to remove the if statements that I have. Any idea how I can do it?
typedef std::map<std::string, Chip*> ChipList;
Chip* ChipFactory::createChip(const std::string& type) {
ChipList::iterator existing = Chips.find(type);
if (existing != Chips.end()) {
return (existing->second);
}
if (type == "R500") {
return Chips[type] = new ChipR500();
}
if (type == "PIC32F42") {
return Chips[type] = new ChipPIC32F42();
}
if (type == "34HC22") {
return Chips[type] = new Chip34HC22();
}
return 0;
}
I would imagine creating a map, with string as the key, and the constructor (or something to create the object). After that, I can just get the constructor from the map using the type (type are strings) and create my object without any if. (I know I'm being a bit paranoid, but I want to know if it can be done or not.)
You are right, you should use a map from key to creation-function.
In your case it would be
typedef Chip* tCreationFunc();
std::map<std::string, tCreationFunc*> microcontrollers;
For each new Chip-derived class ChipXXX, add a static function:
static Chip* CreateInstance()
{
return new ChipXXX();
}
and also register this function into the map.
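Your factory function should be something like this:
Chip* ChipFactory::createChip(const std::string& type)
{
// the map stores creation functions, so the iterator type must match that map
std::map<std::string, tCreationFunc*>::iterator existing = microcontrollers.find(type);
if (existing != microcontrollers.end())
return existing->second(); // call the registered creation function
return NULL;
}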
Note that copy constructor is not needed, as in your example.
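Putting the pieces of this answer together, a minimal sketch could look like the following. The registerChip method and the name() accessor are assumptions added for illustration; the map of creation functions and the static CreateInstance come from the snippets above.

```cpp
#include <map>
#include <string>

struct Chip {
    virtual ~Chip() {}
    virtual std::string name() const = 0;   // hypothetical accessor
};
struct ChipR500 : Chip {
    std::string name() const { return "R500"; }
    static Chip* CreateInstance() { return new ChipR500(); }
};
struct ChipPIC32F42 : Chip {
    std::string name() const { return "PIC32F42"; }
    static Chip* CreateInstance() { return new ChipPIC32F42(); }
};

typedef Chip* tCreationFunc();

class ChipFactory {
public:
    // Hypothetical registration hook: maps a type string to a creation function.
    void registerChip(const std::string& type, tCreationFunc* create) {
        microcontrollers[type] = create;
    }
    // Returns a new chip owned by the caller, or 0 for an unknown type.
    Chip* createChip(const std::string& type) const {
        std::map<std::string, tCreationFunc*>::const_iterator existing =
            microcontrollers.find(type);
        if (existing != microcontrollers.end())
            return existing->second();
        return 0;
    }
private:
    std::map<std::string, tCreationFunc*> microcontrollers;
};
```

Adding a new chip then means writing the class and one registerChip call; createChip itself stays untouched.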
The point of the factory is not to get rid of the ifs, but to put them in a separate place of your real business logic code and not to pollute it. It is just a separation of concerns.
If you're desperate, you could write a jump table/clone() combo that would do this job with no if statements.
class Factory {
struct ChipFunctorBase {
virtual ~ChipFunctorBase() {} // deleted through a base pointer by unique_ptr
virtual Chip* Create() = 0;
};
template<typename T> struct CreateChipFunctor : ChipFunctorBase {
Chip* Create() { return new T; }
};
std::unordered_map<std::string, std::unique_ptr<ChipFunctorBase>> jumptable;
public:
Factory() {
// unique_ptr cannot be assigned from a raw pointer, so use reset()
jumptable["R500"].reset(new CreateChipFunctor<ChipR500>());
jumptable["PIC32F42"].reset(new CreateChipFunctor<ChipPIC32F42>());
jumptable["34HC22"].reset(new CreateChipFunctor<Chip34HC22>());
}
Chip* CreateNewChip(const std::string& type) {
// find() does not insert an empty entry for unknown types, unlike operator[]
auto it = jumptable.find(type);
if(it != jumptable.end())
return it->second->Create();
else
return nullptr;
}
};
However, this kind of approach only becomes valuable when you have large numbers of different Chip types. For just a few, it's more useful just to write a couple of ifs.
Quick note: I've used std::unordered_map and std::unique_ptr, which may not be part of your STL, depending on how new your compiler is. Replace with std::map/boost::unordered_map, and std::/boost::shared_ptr.
No, you cannot get rid of the ifs. The createChip method creates a new instance depending on the constant (type name) you pass as an argument.
But you may optimize your code a little by moving these two lines out of the if statements:
microcontrollers[type] = newController;
return microcontrollers[type];
To answer your question: Yes, you should make a factory with a map to functions that construct the objects you want. The objects constructed should supply and register that function with the factory themselves.
There is some reading on the subject in several other SO questions as well, so I'll let you read that instead of explaining it all here.
Generic factory in C++
Is there a way to instantiate objects from a string holding their class name?
You can have ifs in a factory - just don't have them littered throughout your code.
struct Chip{
};
struct ChipR500 : Chip{};
struct PIC32F42 : Chip{};
struct ChipCreator{
virtual Chip *make() = 0;
};
struct ChipR500Creator : ChipCreator{
Chip *make(){return new ChipR500();}
};
struct PIC32F42Creator : ChipCreator{
Chip *make(){return new PIC32F42();}
};
int main(){
ChipR500Creator m; // client code knows only the factory method interface, not the actual concrete products
Chip *p = m.make();
}
What you are asking for, essentially, is called Virtual Construction, i.e. the ability to build an object whose type is only known at runtime.
Of course C++ doesn't allow constructors to be virtual, so this requires a bit of trickery. The common OO-approach is to use the Prototype pattern:
class Chip
{
public:
virtual Chip* clone() const = 0;
};
class ChipA: public Chip
{
public:
virtual ChipA* clone() const { return new ChipA(*this); }
};
And then instantiate a map of these prototypes and use it to build your objects (std::map<std::string,Chip*>). Typically, the map is instantiated as a singleton.
The other approach, as has been illustrated so far, is similar and consists in registering directly methods rather than an object. It might or might not be your personal preference, but it's generally slightly faster (not much, you just avoid a virtual dispatch) and the memory is easier to handle (you don't have to do delete on pointers to functions).
What you should pay attention to, however, is the memory management aspect. You don't want to go leaking, so make sure to use RAII idioms.
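Here is one way the prototype map and the RAII advice could fit together, as a sketch only. PrototypeRegistry, add, create and name() are hypothetical names. Note that clone() returns std::unique_ptr<Chip> here, which gives up the covariant return type used in the snippet above in exchange for automatic cleanup.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

class Chip {
public:
    virtual ~Chip() {}
    virtual std::unique_ptr<Chip> clone() const = 0;
    virtual std::string name() const = 0;   // hypothetical accessor
};

class ChipA : public Chip {
public:
    std::unique_ptr<Chip> clone() const {
        return std::unique_ptr<Chip>(new ChipA(*this));
    }
    std::string name() const { return "ChipA"; }
};

// Owns one prototype per key; the map's unique_ptrs free them automatically.
class PrototypeRegistry {
public:
    void add(const std::string& key, std::unique_ptr<Chip> prototype) {
        prototypes_[key] = std::move(prototype);
    }
    // Returns a fresh copy of the registered prototype, or an empty pointer
    // when the key is unknown.
    std::unique_ptr<Chip> create(const std::string& key) const {
        std::map<std::string, std::unique_ptr<Chip> >::const_iterator it =
            prototypes_.find(key);
        if (it == prototypes_.end())
            return std::unique_ptr<Chip>();
        return it->second->clone();
    }
private:
    std::map<std::string, std::unique_ptr<Chip> > prototypes_;
};
```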

How can I keep track of (enumerate) all classes that implement an interface

I have a situation where I have an interface that defines how a certain class behaves in order to fill a certain role in my program, but at this point in time I'm not 100% sure how many classes I will write to fill that role. However, at the same time, I know that I want the user to be able to select, from a GUI combo/list box, which concrete class implementing the interface that they want to use to fill a certain role. I want the GUI to be able to enumerate all available classes, but I would prefer not to have to go back and change old code whenever I decide to implement a new class to fill that role (which may be months from now)
Some things I've considered:
using an enumeration
Pros:
I know how to do it
Cons
I will have to update the enumeration when I add a new class
ugly to iterate through
using some kind of static list object in the interface, and adding a new element from within the definition file of the implementing class
Pros:
Won't have to change old code
Cons:
Not even sure if this is possible
Not sure what kind of information to store so that a factory method can choose the proper constructor ( maybe a map between a string and a function pointer that returns a pointer to an object of the interface )
I'm guessing this is a problem (or similar to a problem) that more experienced programmers have probably come across before (and often), and there is probably a common solution to this kind of problem, which is almost certainly better than anything I'm capable of coming up with. So, how do I do it?
(P.S. I searched, but all I found was this, and it's not the same: How do I enumerate all items that implement a generic interface?. It appears he already knows how to solve the problem I'm trying to figure out.)
Edit: I renamed the title to "How can I keep track of..." rather than just "How can I enumerate..." because the original question sounded like I was more interested in examining the runtime environment, whereas what I'm really interested in is compile-time bookkeeping.
Create a singleton where you can register your classes with a pointer to a creator function.
In the cpp files of the concrete classes you register each class.
Something like this:
class Interface;
typedef boost::function<Interface* ()> Creator;
class InterfaceRegistration
{
typedef map<string, Creator> CreatorMap;
public:
// static, so it can be called without an existing instance
static InterfaceRegistration& instance() {
static InterfaceRegistration interfaceRegistration;
return interfaceRegistration;
}
bool registerInterface( const string& name, Creator creator )
{
m_interfaces[name] = creator;
return !creator.empty();
}
list<string> names() const
{
list<string> nameList;
// collect the map keys with a plain loop (select1st is a non-standard extension)
for( CreatorMap::const_iterator it = m_interfaces.begin(); it != m_interfaces.end(); ++it )
{
nameList.push_back(it->first);
}
return nameList;
}
Interface* create( const string& name ) const
{
const CreatorMap::const_iterator it
= m_interfaces.find(name);
if( it!=m_interfaces.end() && it->second )
{
return (it->second)();
}
// throw exception ...
return 0;
}
private:
CreatorMap m_interfaces;
};
// in your concrete classes cpp files
namespace {
bool registerClassX = InterfaceRegistration::instance().registerInterface("ClassX", boost::lambda::new_ptr<ClassX>() );
}
ClassX::ClassX() : Interface()
{
//....
}
// in your concrete class Y cpp files
namespace {
bool registerClassY = InterfaceRegistration::instance().registerInterface("ClassY", boost::lambda::new_ptr<ClassY>() );
}
ClassY::ClassY() : Interface()
{
//....
}
I vaguely remember doing something similar to this many years ago. Your option (2) is pretty much what I did. In that case it was a std::map of std::string to std::type_info. In each .cpp file I registered the class like this:
static bool dummy = registerClass (typeid (MyNewClass));
registerClass takes a type_info object and simply returns true. You have to initialize a variable to ensure that registerClass is called during startup time. Simply calling registerClass in the global namespace is an error. And making dummy static allows you to reuse the name across compilation units without a name collision.
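A minimal sketch of what such a registry might look like. The details are reconstructed from memory of the idea, not the original code: the explicit key parameter is an assumption (type_info::name() is implementation-defined, so a readable key is passed separately), and the map stores pointers because type_info objects cannot be copied. The function-local static avoids depending on the initialization order of globals in different files.

```cpp
#include <map>
#include <string>
#include <typeinfo>

typedef std::map<std::string, const std::type_info*> ClassRegistry;

// Constructed on first use, so it is ready even when a registration in
// another .cpp file runs before this file's own globals are initialized.
ClassRegistry& classRegistry() {
    static ClassRegistry registry;
    return registry;
}

// Stores the type under the given key and returns true, so the call can be
// used as the initializer of a dummy variable.
bool registerClass(const std::string& key, const std::type_info& info) {
    classRegistry()[key] = &info;
    return true;
}

class MyNewClass {};

// Runs during startup, before main() starts using the registry.
static bool dummy = registerClass("MyNewClass", typeid(MyNewClass));
```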
I referred to this article to implement a self-registering class factory similar to the one described in TimW's answer, but it has the nice trick of using a templated factory proxy class to handle the object registration. Well worth a look :)
Self-Registering Objects in C++ -> http://www.ddj.com/184410633
Edit
Here's the test app I did (tidied up a little ;):
object_factory.h
#include <string>
#include <vector>
// Forward declare the base object class
class Object;
// Interface that the factory uses to communicate with the object proxies
class IObjectProxy {
public:
virtual Object* CreateObject() = 0;
virtual std::string GetObjectInfo() = 0;
};
// Object factory, retrieves object info from the global proxy objects
class ObjectFactory {
public:
static ObjectFactory& Instance() {
static ObjectFactory instance;
return instance;
}
// proxies add themselves to the factory here
void AddObject(IObjectProxy* object) {
objects_.push_back(object);
}
size_t NumberOfObjects() {
return objects_.size();
}
Object* CreateObject(size_t index) {
return objects_[index]->CreateObject();
}
std::string GetObjectInfo(size_t index) {
return objects_[index]->GetObjectInfo();
}
private:
std::vector<IObjectProxy*> objects_;
};
// This is the factory proxy template class
template<typename T>
class ObjectProxy : public IObjectProxy {
public:
ObjectProxy() {
ObjectFactory::Instance().AddObject(this);
}
Object* CreateObject() {
return new T;
}
virtual std::string GetObjectInfo() {
return T::TalkToMe();
};
};
objects.h
#include <iostream>
#include "object_factory.h"
// Base object class
class Object {
public:
virtual ~Object() {}
};
class ClassA : public Object {
public:
ClassA() { std::cout << "ClassA Constructor" << std::endl; }
~ClassA() { std::cout << "ClassA Destructor" << std::endl; }
static std::string TalkToMe() { return "This is ClassA"; }
};
class ClassB : public Object {
public:
ClassB() { std::cout << "ClassB Constructor" << std::endl; }
~ClassB() { std::cout << "ClassB Destructor" << std::endl; }
static std::string TalkToMe() { return "This is ClassB"; }
};
objects.cpp
#include "objects.h"
// Objects get registered here
ObjectProxy<ClassA> gClassAProxy;
ObjectProxy<ClassB> gClassBProxy;
main.cpp
#include "objects.h"
int main (int argc, char * const argv[]) {
ObjectFactory& factory = ObjectFactory::Instance();
for (size_t i = 0; i < factory.NumberOfObjects(); ++i) {
std::cout << factory.GetObjectInfo(i) << std::endl;
Object* object = factory.CreateObject(i);
delete object;
}
return 0;
}
output:
This is ClassA
ClassA Constructor
ClassA Destructor
This is ClassB
ClassB Constructor
ClassB Destructor
If you're on Windows, and using C++/CLI, this becomes fairly easy. The .NET framework provides this capability via reflection, and it works very cleanly in managed code.
In native C++, this gets a little bit trickier, as there's no simple way to query the library or application for runtime information. There are many frameworks that provide this (just look for IoC, DI, or plugin frameworks), but the simplest means of doing it yourself is to have some form of configuration through which each implementation registers a factory method that returns an implementation of your specific base class. You'd just need to implement loading a DLL and registering the factory method; once you have that, it's fairly easy.
Something you can consider is an object counter. This way you don't need to change every place you allocate but just implementation definition. It's an alternative to the factory solution. Consider pros/cons.
An elegant way to do that is to use the CRTP : Curiously recurring template pattern.
The main example is such a counter :)
This way you just have to add in your concrete class implementation :
class X; // your interface
class MyConcreteX : public counter<X>
{
// whatever
};
Of course, it is not applicable if you use external implementations you do not master.
EDIT:
To handle the exact problem you need to have a counter that counts only the first instance.
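For readers who have not seen the CRTP counter, a common form of it looks roughly like this. This sketch counts live instances per concrete class and parameterizes counter on the concrete class itself, which is an assumption and differs slightly from the counter<X> line above. The alive() accessor is a hypothetical name.

```cpp
#include <cstddef>

// Each instantiation counter<Derived> has its own static count.
template <typename Derived>
class counter {
public:
    static std::size_t alive() { return count_; }
protected:
    counter() { ++count_; }
    counter(const counter&) { ++count_; }
    ~counter() { --count_; }
private:
    static std::size_t count_;
};

template <typename Derived>
std::size_t counter<Derived>::count_ = 0;

class X {                                  // your interface
public:
    virtual ~X() {}
};

// Concrete classes opt in by inheriting from counter<themselves>.
class MyConcreteX : public X, public counter<MyConcreteX> {};
class OtherX : public X, public counter<OtherX> {};
```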
my 2 cents
There is no way to query the subclasses of a class in (native) C++.
How do you create the instances? Consider using a Factory Method allowing you to iterate over all subclasses you are working with. When you create an instance like this, it won't be possible to forget adding a new subclass later.