C++ way of dependency injection - Templates or virtual methods?

I wonder what the C++ way of doing dependency injection is. Is it using templates or polymorphic classes? Consider the following code,
class AbstractReader
{
public:
virtual void Read() = 0;
};
class XMLReader : public AbstractReader
{
public:
void Read() { std::cout << "Reading with a XML reader" << std::endl; }
};
class TextFileReader : public AbstractReader
{
public:
void Read() { std::cout << "Reading with a Text file reader" << std::endl; }
};
class Parser
{
public:
Parser(AbstractReader* p_reader) : reader(p_reader) { }
void StartParsing()
{
reader->Read();
// other parsing logic
}
private:
AbstractReader* reader;
};
template<class T>
class GenericParser
{
public:
GenericParser(T* p_reader) : reader(p_reader) { }
void StartParsing()
{
reader->Read();
}
private:
T* reader;
};
1 - Which is the better approach, GenericParser or Parser? I know that with GenericParser the inheritance can be removed.
2 - If templates are the way to go, is it OK to write all the code in header files? I have seen that many template classes put all of their code in header files rather than a .h/.cpp combination. Are there any problems in doing so, such as unwanted inlining?
Any thoughts?

You don't have a free choice in this, based on how you want to structure your code or header files. The answer is dictated to you by the requirements of your application.
It depends on whether the coupling can be decided at compile time or must be delayed until runtime.
If the coupling between a component and its dependencies is decided permanently at compile time, you can use templates. The compiler will then be able to perform inlining.
If however the coupling needs to be decided at runtime (e.g. the user chooses which other component will supply the dependency, perhaps through a configuration file) then you can't use templates for that, and you must use a runtime polymorphic mechanism. If so, your choices include virtual functions, function pointers or std::function.
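As a minimal illustration of the runtime case (my own sketch, not part of the original answer; it reuses the XMLReader and TextFileReader classes from the question and requires C++11 for std::function and lambdas):
#include <functional>
#include <string>

class FunctionParser
{
public:
    explicit FunctionParser(std::function<void()> p_read) : read(p_read) { }
    void StartParsing() { read(); /* other parsing logic */ }
private:
    std::function<void()> read;
};

int main()
{
    std::string choice = "xml"; // imagine this comes from a configuration file
    XMLReader xml;
    TextFileReader text;
    FunctionParser parser(choice == "xml"
        ? std::function<void()>([&xml] { xml.Read(); })
        : std::function<void()>([&text] { text.Read(); }));
    parser.StartParsing(); // the coupling is decided at runtime
}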

I personally prefer the template solution if I know the type of the reader at compile time, because then there is no runtime decision to be taken and polymorphism would be of no use. As far as writing the templates in header files is concerned, you have to do that to avoid linker errors: if you write the template methods in a .cpp file, the compiler cannot instantiate them for the types used in other translation units, and the linker reports undefined symbols. A couple of workarounds exist, but most template code is written in header files.
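One such workaround is explicit instantiation, sketched below under the assumption that GenericParser's member definitions are moved into a .cpp file and that you know every reader type the template will be used with (the Readers.h header name is hypothetical):
// GenericParser.cpp
#include "GenericParser.h"
#include "Readers.h" // hypothetical header providing XMLReader and TextFileReader

template<class T>
void GenericParser<T>::StartParsing()
{
    reader->Read();
}

// Explicit instantiation: emit the member definitions for every reader type
// used elsewhere in the program so the linker can resolve them.
template class GenericParser<XMLReader>;
template class GenericParser<TextFileReader>;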

As to 1. "Best" is relative. Both methods have their pluses and minuses. Templates offer raw speed, but more code is inevitably inlined (yielding more coupling), and the error messages are hard to read. Inheritance is slower and makes objects larger, but it doesn't require inlining (less coupled). It also has relatively better error messages.
For a small library, coupling matters less, and templates can be a good choice. However, as complexity of your library increases, you need to move towards a less coupled approach. If you are unsure of how large your library will grow, or don't need the speed templating would provide (or don't want to deal with the error messages), go with inheritance.
My answer to 2 follows from 1. Template code has to be visible wherever it is instantiated, which in practice means placing it in the header, where it is also a candidate for inlining. It's a question of coupling: inlining increases coupling between components and can drastically increase compile times; avoid it unless you want the speed and are sure your library will remain small.

GenericParser or Parser?
It depends on the rest of the code; the problem with the generic parser is that the class you inject it into also has to be a template.
But there is a third, more generic way: boost::function and boost::lambda. All you have to ask for is a callable with the correct (from the point of view of the user of the class) return type and parameters, e.g. boost::function<void()> readFn = boost::bind(&TextFileReader::Read, &someReader); where someReader is a TextFileReader instance.
Now the user class is independent of the reader class and doesn't have to be a template.
class User
{
boost::function<void()> reader;
public:
void setReader( const boost::function<void()>& p_reader )
{
reader = p_reader;
}
};
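A usage sketch of this approach (my example, reusing the TextFileReader class from the question):
#include <boost/function.hpp>
#include <boost/bind.hpp>

void wireUp()
{
    TextFileReader textReader;
    User user;
    user.setReader(boost::bind(&TextFileReader::Read, &textReader));
    // From now on, invoking the stored reader() inside User calls
    // textReader.Read(), and User knows nothing about TextFileReader.
}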
Writing all the code in header files rather than a .h/.cpp combination.
Keeping template definitions out of the header is called the separation model, and there is only one compiler that supports it (the Comeau compiler). Start by reading "Export" Restrictions, Part 1 and "Export" Restrictions, Part 2.
@CiscoIPPhone, commenting on: "the problem with the generic parser is that the class you inject it into also has to be a template."
template<class T>
class GenericParser
{
public:
GenericParser(T* p_reader) : reader(p_reader) { }
void StartParsing()
{
reader->Read();
}
private:
T* reader;
};
// Now you have a GenericParser interface but your Parser is only usable with
// TextFileReader
class Parser
{
public:
Parser( GenericParser<TextFileReader> p_reader) : reader(p_reader) { }
void StartParsing() {
reader->Read();
}
private:
GenericParser<TextFileReader> reader;
};
//Solution is to make Parser also a template class
template<class T>
class Parser
{
public:
Parser( GenericParser<T> p_reader) : reader(p_reader) { }
void StartParsing() {
reader->Read();
}
private:
GenericParser<T> reader;
};

Related

How can I connect two classes (which don't know each other) through a public interface (C++)

I'm currently working on a project where everything is horribly mixed with everything else. Every file includes some others, etc.
I want to focus on separating part of this spaghetti code into a library which has to be completely independent from the rest of the code.
The current problem is that some functions FunctionInternal of my library use some functions FunctionExternal declared somewhere else, hence my library includes some other files contained in the project, which does not conform to the requirement "independent from the rest of the code".
It goes without saying that I can't move FunctionExternal into my library.
My first idea to tackle this problem was to implement a public interface such as the one described below:
But I can't get it to work. Is my overall pattern a viable way to implement this, or is there another way to interface two functions without including one file in another and causing an unwanted dependency?
How could I abstract my ExternalClass so my library would still be independent of the rest of my code?
Edit 1:
External.h
#include "lib/InterfaceInternal.h"
class External : public InterfaceInternal {
private:
void ExternalFunction() {};
public:
virtual void InterfaceInternal_foo() override {
ExternalFunction();
};
};
Internal.h
#pragma once
#include "InterfaceInternal.h"
class Internal {
// how can I receive here the InterfaceInternal_foo overridden in External.h?
};
InterfaceInternal.h
#pragma once
class InterfaceInternal {
public:
virtual void InterfaceInternal_foo() = 0;
};
You can do it like you suggested: override the internal interface in your external code. Then, for
// how can I receive here the InterfaceInternal_foo overridden in External.h?
just pass a pointer/reference to your class External, which extends class InterfaceInternal. Of course your class Internal needs to have methods that accept InterfaceInternal*.
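For example, the library side could look like this (a sketch; the member and method names are illustrative, not from the question):
// Internal.h
#pragma once
#include "InterfaceInternal.h"

class Internal {
public:
    // The library only knows the interface, never External itself.
    void setCallback(InterfaceInternal* cb) { callback = cb; }
    void doWork() {
        if (callback)
            callback->InterfaceInternal_foo(); // dispatches to External::InterfaceInternal_foo
    }
private:
    InterfaceInternal* callback = nullptr;
};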
Or you can just pass the function to your internal interface as an argument. Something like:
class InterfaceInternal {
public:
void InterfaceInternal_foo(std::function<void()> f);
};
or more generic:
class InterfaceInternal {
public:
template <typename F> // + maybe some SFINAE magic, or C++20 concept to make sure it's actually callable
void InterfaceInternal_foo(F f);
};
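A call site for the std::function variant could then look like this (my sketch; it assumes InterfaceInternal_foo is given a body that simply invokes the callable):
#include <functional>
#include <iostream>

class InterfaceInternal {
public:
    void InterfaceInternal_foo(std::function<void()> f) { f(); }
};

int main()
{
    InterfaceInternal internal;
    // The library-side code never sees the external function's definition.
    internal.InterfaceInternal_foo([] { std::cout << "external behaviour\n"; });
}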

Generic Messaging when concrete Messages are autogenerated C++ classes from XML

Background
I have auto-generated concrete message types from an XML -> C++ generator.
GenMsg1, GenMsg2, ... , GenMsgN
All of these generated classes are from an XML schema. Technically I can edit their cpp and hpp files, but I would prefer not to touch them as much as possible. They all have guaranteed functions that I would like to be able to call generically.
NOTE: I cannot get away from the above situation as this is a design limitation from another project. Also, I just used raw pointers in this simple example. I understand this is not best practice; it's just for showing the general idea.
Goal
I am looking to process the above generated messages generically on my side.
Idea 1 and 2
My first idea was to just create a general "Message" class that was templated to hold one of the above types, with a simple enum for identifying what type of message it is. The problem with this is that I cannot just pass around a pointer to Message because it needs the template type parameter, so this is obviously a no-go.
My next thought was to use the Curiously Recurring Template Pattern, but that has the same issue as above.
Idea 3
After a lot of reading on messaging frameworks my next thought was that std::variant might be an option.
I have the following example, which works, but it uses double pointers and templated functions for access. If the wrong data type is used it throws an exception at runtime (which makes the problem quite clear), but I could see this being annoying down the line as far as tracking the source of the throw.
I keep trying to read up on std::visit, but it does not make a whole lot of sense to me. I do not really want to implement a separate visitor class with a bunch of functions by hand when all of the functions in the generated classes are autogenerated already (like foo in the example below) and are ready to be called once the type is known. Additionally, they are guaranteed to exist. So it would be nice to be able to call foo() on Message and have it dive into the internal representation and call its foo.
I have a MsgType enum in there that I could use as well. When the internal representation is set, I could set that and use it for deducing the type. But this seems like it just duplicates effort already done by std::variant, so I scrapped its use but kept it in the code below in case someone here has a new idea where something like that could be useful.
Any ideas on the design moving forward? This seems like the most promising route, but I am open to ideas. Also, given my reality of having to conform to other people's design decisions, I realize that this code will "smell" a bit no matter what. I am just trying to make it as clean as possible on my end.
Idea 3 Code
#include <iostream>
#include <variant>
enum class MsgType { NOTYPE = 0, GenMessage1 = 1, GenMessage2 = 2, GenMessage3 = 3 };
class GenMessage1
{
public:
void foo() {std::cout << "Msg 1" << std::endl;}
};
class GenMessage2
{
public:
void foo() { std::cout << "Msg 2" << std::endl; }
};
class GenMessage3
{
public:
void foo() { std::cout << "Msg 3" << std::endl; }
};
class Message
{
private:
MsgType msgType;
std::string xmlStrRep;
std::variant<GenMessage1*, GenMessage2*, GenMessage3*> internalRep;
public:
Message()
{
this->msgType = MsgType::NOTYPE;
this->xmlStrRep = "";
}
template <typename T>
void setInternalRep(T* internalRep)
{
this->internalRep = internalRep;
}
template <typename T>
void getInternalRep(T retrieved)
{
*retrieved = getInternalRepHelper(*retrieved);
}
template <typename T>
T getInternalRepHelper(T retrieved)
{
return std::get<T>(this->internalRep);
}
void foo()
{
// call into the internal representation and call its foo
}
};
int main()
{
Message* msg = new Message();
GenMessage3* incomingMsg = new GenMessage3();
GenMessage3* retrievedMsg;
msg->setInternalRep(incomingMsg);
msg->getInternalRep(&retrievedMsg);
retrievedMsg->foo();
return 0;
}
Outputs:
Msg 3
I think std::visit is, as you suspected, what you need. You can implement your foo() function like this:
void foo()
{
std::visit([](auto* message) {message->foo();}, this->internalRep);
}
Using a generic lambda (taking auto), it can be thought of as a template function, where the lambda's argument message has whatever pointer type is actually stored in the variant, and you can use it directly. Provided all the messages have the same interface that you want to use, you can do this with all the interface functions.
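With foo() implemented that way, the main() from the question can dispatch without retrieving the concrete type first (a sketch reusing the question's classes):
int main()
{
    Message msg;
    GenMessage3 incoming;
    msg.setInternalRep(&incoming);
    msg.foo(); // prints "Msg 3"; no template argument needed at the call site
    return 0;
}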

Efficient configuration of class hierarchy at compile-time

This question is specifically about C++ architecture on embedded, hard real-time systems. This implies that large parts of the data-structures as well as the exact program-flow are given at compile-time, performance is important and a lot of code can be inlined. Solutions preferably use C++03 only, but C++11 inputs are also welcome.
I am looking for established design-patterns and solutions to the architectural problem where the same code-base should be re-used for several, closely related products, while some parts (e.g. the hardware-abstraction) will necessarily be different.
I will likely end up with a hierarchical structure of modules encapsulated in classes that might then look somehow like this, assuming 4 layers:
Product A Product B
Toplevel_A Toplevel_B (different for A and B, but with common parts)
Middle_generic Middle_generic (same for A and B)
Sub_generic Sub_generic (same for A and B)
Hardware_A Hardware_B (different for A and B)
Here, some classes inherit from a common base class (e.g. Toplevel_A from Toplevel_base) while others do not need to be specialized at all (e.g. Middle_generic).
Currently I can think of the following approaches:
(A): If this was a regular desktop-application, I would use virtual inheritance and create the instances at run-time, using e.g. an Abstract Factory.
Drawback: However the *_B classes will never be used in product A and hence the dereferencing of all the virtual function calls and members not linked to an address at run-time will lead to quite some overhead.
(B) Using template specialization as inheritance mechanism (e.g. CRTP)
template<class Derived>
class Toplevel { /* generic stuff ... */ };
class Toplevel_A : public Toplevel<Toplevel_A> { /* specific stuff ... */ };
Drawback: Hard to understand.
(C): Use different sets of matching files and let the build-scripts include the right one
// common/toplevel_base.h
class Toplevel_base { /* ... */ };
// product_A/toplevel.h
class Toplevel : Toplevel_base { /* ... */ };
// product_B/toplevel.h
class Toplevel : Toplevel_base { /* ... */ };
// build_script.A
compiler -Icommon -Iproduct_A
Drawback: Confusing, tricky to maintain and test.
(D): One big typedef (or #define) file
//typedef_A.h
typedef Toplevel_A Toplevel_to_be_used;
typedef Hardware_A Hardware_to_be_used;
// etc.
// sub_generic.h
class sub_generic {
Hardware_to_be_used the_hardware;
// etc.
};
Drawback: One file to be included everywhere, and still the need for another mechanism to actually switch between different configurations.
(E): A similar, "Policy based" configuration, e.g.
template <class Policy>
class Toplevel {
Middle_generic<Policy> the_middle;
// ...
};
// ...
template <class Policy>
class Sub_generic {
typename Policy::Hardware_to_be_used the_hardware;
// ...
};
// used as
class Policy_A {
typedef Hardware_A Hardware_to_be_used;
};
Toplevel<Policy_A> the_toplevel;
Drawback: Everything is a template now; a lot of code needs to be re-compiled every time.
(F): Compiler switch and preprocessor
// sub_generic.h
class Sub_generic {
#if PRODUCT_IS_A
Hardware_A _hardware;
#endif
#if PRODUCT_IS_B
Hardware_B _hardware;
#endif
};
Drawback: Brrr..., only if all else fails.
Is there any (other) established design-pattern or a better solution to this problem, such that the compiler can statically allocate as many objects as possible and inline large parts of the code, knowing which product is being built and which classes are going to be used?
I'd go for A. Until it's PROVEN that this is not good enough, go for the same decisions as for desktop (well, of course, it may be "obvious" that allocating several kilobytes on the stack, or using global variables that are many megabytes large, is not going to work). Yes, there is SOME overhead in calling virtual functions, but I would go for the most obvious and natural C++ solution FIRST, then redesign if it's not "good enough" (obviously, try to determine performance and such early on, and use tools like a sampling profiler to determine where you are spending time, rather than "guessing" - humans are proven pretty poor guessers).
I'd then move to option B if A is proven to not work. This is indeed not entirely obvious, but it is, roughly, how LLVM/Clang solves this problem for combinations of hardware and OS, see:
https://github.com/llvm-mirror/clang/blob/master/lib/Basic/Targets.cpp
First I would like to point out that you basically answered your own question in the question :-)
Next I would like to point out that in C++
the exact program-flow are given at compile-time, performance is
important and a lot of code can be inlined
is called templates. The other approaches that leverage language features as opposed to build system features will serve only as a logical way of structuring the code in your project to the benefit of developers.
Further, as noted in other answers, C is more common than C++ for hard real-time systems, and in C it is customary to rely on macros to make this kind of optimization at compile time.
Finally, you have noted under your B solution above that template specialization is hard to understand. I would argue that this depends on how you do it and also on how much experience your team has on C++/templates. I find many "template ridden" projects to be extremely hard to read and the error messages they produce to be unholy at best, but I still manage to make effective use of templates in my own projects because I respect the KISS principle while doing it.
So my answer to you is: go with B, or ditch C++ for C.
I understand that you have two important requirements:
Data types are known at compile time
Program-flow is known at compile time
The CRTP wouldn't really address the problem you are trying to solve as it would allow the HardwareLayer to call methods on the Sub_generic, Middle_generic or TopLevel and I don't believe it is what you are looking for.
Both of your requirements can be met using the Trait pattern (another reference). Here is an example proving both requirements are met. First, we define empty shells representing two Hardwares you might want to support.
class Hardware_A {};
class Hardware_B {};
Then let's consider a class that describes a general case which corresponds to Hardware_A.
template <typename Hardware>
class HardwareLayer
{
public:
typedef long int64_t;
static int64_t getCPUSerialNumber() {return 0;}
};
Now let's see a specialization for Hardware_B :
template <>
class HardwareLayer<Hardware_B>
{
public:
typedef int int64_t;
static int64_t getCPUSerialNumber() {return 1;}
};
Now, here is a usage example within the Sub_generic layer :
template <typename Hardware>
class Sub_generic
{
public:
typedef HardwareLayer<Hardware> HwLayer;
typedef typename HwLayer::int64_t int64_t;
int64_t doSomething() {return HwLayer::getCPUSerialNumber();}
};
And finally, a short main that executes both code paths and use both data types :
int main(int argc, const char * argv[]) {
std::cout << "Hardware_A : " << Sub_generic<Hardware_A>().doSomething() << std::endl;
std::cout << "Hardware_B : " << Sub_generic<Hardware_B>().doSomething() << std::endl;
}
Now, if your HardwareLayer needs to maintain state, here is another way to implement the HardwareLayer and Sub_generic classes.
template <typename Hardware>
class HardwareLayer
{
public:
typedef long hwint64_t;
hwint64_t getCPUSerialNumber() {return mySerial;}
private:
hwint64_t mySerial = 0;
};
template <>
class HardwareLayer<Hardware_B>
{
public:
typedef int hwint64_t;
hwint64_t getCPUSerialNumber() {return mySerial;}
private:
hwint64_t mySerial = 1;
};
template <typename Hardware>
class Sub_generic : public HardwareLayer<Hardware>
{
public:
typedef HardwareLayer<Hardware> HwLayer;
typedef typename HwLayer::hwint64_t hwint64_t;
hwint64_t doSomething() {return HwLayer::getCPUSerialNumber();}
};
And here is a last variant where only the Sub_generic implementation changes :
template <typename Hardware>
class Sub_generic
{
public:
typedef HardwareLayer<Hardware> HwLayer;
typedef typename HwLayer::hwint64_t hwint64_t;
hwint64_t doSomething() {return hw.getCPUSerialNumber();}
private:
HwLayer hw;
};
On a similar train of thought to F, you could just have a directory layout like this:
Hardware/
common/inc/hardware.h
hardware1/src/hardware.cpp
hardware2/src/hardware.cpp
Simplify the interface to only assume a single hardware exists:
// sub_generic.h
class Sub_generic {
Hardware _hardware;
};
And then only compile the folder that contains the .cpp files for the hardware for that platform.
The benefits to this approach are:
It's simple to understand what's happening and to add a hardware3
hardware.h still serves as your API
It takes away the abstraction from the compiler (for your speed concerns)
Compiler 1 doesn't need to compile hardware2.cpp or hardware3.cpp which may contain things Compiler 1 can't do (like inline assembly, or some other specific Compiler 2 thing)
hardware3 might be much more complicated for some reason you haven't considered yet, so giving it a whole directory structure encapsulates it.
Since this is for a hard real-time embedded system, usually you would go for a C type of solution, not C++.
With modern compilers I'd say that the overhead of C++ is not that great, so it's not entirely a matter of performance, but embedded systems tend to prefer C instead of C++.
What you are trying to build would resemble a classic device-driver library (like the one for FTDI chips).
The approach there would be (since it's written in C) something similar to your F, but with no compile-time options - you would specialize the code, at runtime, based on something like PID, VID, SN, etc...
Now if you want to use C++ for this, templates should probably be your last option (code readability usually ranks higher than any advantage templates bring to the table). So you would probably go for something similar to A: a basic class inheritance scheme, but no particularly fancy design pattern is required.
Hope this helps...
I am going to assume that these classes only need to be created a single time, and that their instances persist throughout the entire program run time.
In this case I would recommend using the Object Factory pattern since the factory will only get run one time to create the class. From that point on the specialized classes are all a known type.
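A minimal sketch of that idea, reusing the class names from the question; the PRODUCT_A macro and the createToplevel() function are hypothetical:
// Built and run once at start-up; afterwards the instance is used only
// through the Toplevel_base interface.
Toplevel_base* createToplevel()
{
#if defined(PRODUCT_A)
    return new Toplevel_A(); // product A build
#else
    return new Toplevel_B(); // product B build
#endif
}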

C++ handling specific impl - #ifdef vs private inheritance vs tag dispatch

I have some classes implementing some computations which I have to optimize for different SIMD implementations, e.g. Altivec and SSE. I don't want to pollute the code with #ifdef ... #endif blocks for each method I have to optimize, so I tried a couple of other approaches, but unfortunately I'm not very satisfied with how they turned out, for reasons I'll try to clarify. So I'm looking for some advice on how I could improve what I have already done.
1.Different implementation files with crude includes
I have the same header file describing the class interface with different
"pseudo" implementation files for plain C++, Altivec and SSE only for the
relevant methods:
// Algo.h
#ifndef ALGO_H_INCLUDED_
#define ALGO_H_INCLUDED_
class Algo
{
public:
Algo();
~Algo();
void process();
protected:
void computeSome();
void computeMore();
};
#endif
// Algo.cpp
#include "Algo.h"
Algo::Algo() { }
Algo::~Algo() { }
void Algo::process()
{
computeSome();
computeMore();
}
#if defined(ALTIVEC)
#include "Algo_Altivec.cpp"
#elif defined(SSE)
#include "Algo_SSE.cpp"
#else
#include "Algo_Scalar.cpp"
#endif
// Algo_Altivec.cpp
void Algo::computeSome()
{
}
void Algo::computeMore()
{
}
... same for the other implementation files
Pros:
the split is quite straightforward and easy to do
there is no "overhead"(don't know how to say it better) to objects of my class
by which I mean no extra inheritance, no addition of member variables etc.
much cleaner than #ifdef-ing all over the place
Cons:
I have three additional files to maintain; I could put the Scalar implementation in the Algo.cpp file though and end up with just two, but the inclusion part will look and feel a bit dirtier
they are not compilable units per-se and have to be excluded from the
project structure
if I do not have the specific optimized implementation yet for, let's say, SSE, I would have to duplicate some code from the plain (Scalar) C++ implementation file
I cannot fall back to the plain C++ implementation if needed; is it even possible to do that in the described scenario?
I do not feel any structural cohesion in the approach
2.Different implementation files with private inheritance
// Algo.h
class Algo : private AlgoImpl
{
... as before
}
// AlgoImpl.h
#ifndef ALGOIMPL_H_INCLUDED_
#define ALGOIMPL_H_INCLUDED_
class AlgoImpl
{
protected:
AlgoImpl();
~AlgoImpl();
void computeSomeImpl();
void computeMoreImpl();
};
#endif
// Algo.cpp
...
void Algo::computeSome()
{
computeSomeImpl();
}
void Algo::computeMore()
{
computeMoreImpl();
}
// Algo_SSE.cpp
AlgoImpl::AlgoImpl()
{
}
AlgoImpl::~AlgoImpl()
{
}
void AlgoImpl::computeSomeImpl()
{
}
void AlgoImpl::computeMoreImpl()
{
}
Pros:
the split is quite straightforward and easy to do
much cleaner than #ifdef-ing all over the place
still there is no "overhead" to my class - EBCO should kick in
the semantics of the class are much cleaner, at least compared to the above;
that is, private inheritance == "is implemented in terms of"
the different files are compilable, can be included in the project
and selected via the build system
Cons:
I have three additional files to maintain
if I do not have the specific optimized implementation yet for, let's say, SSE, I would have to duplicate some code from the plain (Scalar) C++ implementation file
I cannot fall back to the plain C++ implementation if needed
3.Is basically method 2 but with virtual functions in the AlgoImpl class. That would allow me to overcome the duplicate implementation of plain C++ code if needed, by providing an empty implementation in the base class and overriding it in the derived class, although I will have to disable that behavior when I actually implement the optimized version. Also, the virtual functions will bring some "overhead" to objects of my class.
4.A form of tag dispatching via enable_if<>
Pros:
the split is quite straightforward and easy to do
much cleaner than #ifdef ing all over the place
still there is no "overhead" to my class
will eliminate the need for different files for different implementations
Cons:
templates will be a bit more "cryptic" and seem to bring an unnecessary
overhead(at least for some people in some contexts)
if I do not have the specific optimized implementation yet for, let's say, SSE, I would have to duplicate some code from the plain (Scalar) C++ implementation
I cannot fall back to the plain C++ implementation if needed
What I couldn't figure out yet, for any of the variants, is how to properly and cleanly fall back to the plain C++ implementation.
Also, I don't want to over-engineer things, and in that respect the first variant seems the most "KISS"-like, even considering the disadvantages.
You could use a policy-based approach with templates, kind of like the way the standard library does for allocators, comparators and the like. Each implementation has a policy class which defines computeSome() and computeMore(). Your Algo class takes a policy as a parameter and defers to its implementation.
template <class policy_t>
class algo_with_policy_t {
policy_t policy_;
public:
algo_with_policy_t() { }
~algo_with_policy_t() { }
void process()
{
policy_.computeSome();
policy_.computeMore();
}
};
struct altivec_policy_t {
void computeSome();
void computeMore();
};
struct sse_policy_t {
void computeSome();
void computeMore();
};
struct scalar_policy_t {
void computeSome();
void computeMore();
};
// let user select exact implementation
typedef algo_with_policy_t<altivec_policy_t> algo_altivec_t;
typedef algo_with_policy_t<sse_policy_t> algo_sse_t;
typedef algo_with_policy_t<scalar_policy_t> algo_scalar_t;
// let user have default implementation
typedef
#if defined(ALTIVEC)
algo_altivec_t
#elif defined(SSE)
algo_sse_t
#else
algo_scalar_t
#endif
algo_default_t;
This lets you have all the different implementations defined within the same file (like solution 1) and compiled into the same program (unlike solution 1). It has no performance overheads (unlike virtual functions). You can either select the implementation at run time or get a default implementation chosen by the compile time configuration.
template <class algo_t>
void use_algo(algo_t algo)
{
algo.process();
}
void select_algo(bool use_scalar)
{
if (!use_scalar) {
use_algo(algo_default_t());
} else {
use_algo(algo_scalar_t());
}
}
As requested in the comments, here's a summary of what I did:
Set up policy_list helper template utility
This maintains a list of policies, and gives each a "runtime check" call before calling the first suitable implementation.
#include <cassert>
template <typename P, typename N=void>
struct policy_list {
static void apply() {
if (P::runtime_check()) {
P::impl();
}
else {
N::apply();
}
}
};
template <typename P>
struct policy_list<P,void> {
static void apply() {
assert(P::runtime_check());
P::impl();
}
};
Set up specific policies
These policies implement both a runtime test and an actual implementation of the algorithm in question. For my actual problem, impl took another template parameter specifying what exactly it was they were implementing; here the example assumes there is only one thing to be implemented. The runtime tests are cached in a static bool because for some (e.g. the Altivec one I used) the test was really slow. For others (e.g. the OpenCL one) the test is actually "is this function pointer NULL?" after one attempt at setting it with dlsym().
#include <iostream>
// runtime SSE detection (That's another question!)
extern bool have_sse();
struct sse_policy {
static void impl() {
std::cout << "SSE" << std::endl;
}
static bool runtime_check() {
static bool result = have_sse();
// have_sse lives in another TU and does some cpuid asm stuff
return result;
}
};
// Runtime OpenCL detection
extern bool have_opencl();
struct opencl_policy {
static void impl() {
std::cout << "OpenCL" << std::endl;
}
static bool runtime_check() {
static bool result = have_opencl();
// have_opencl lives in another TU and does some LoadLibrary or dlopen()
return result;
}
};
struct basic_policy {
static void impl() {
std::cout << "Standard C++ policy" << std::endl;
}
static bool runtime_check() { return true; } // All implementations do this
};
Set per architecture policy_list
This trivial example sets one of two possible lists based on the ARCH_HAS_SSE preprocessor macro. You might generate this from your build script, use a series of typedefs, or hack in support for "holes" in the policy_list that might be void on some architectures, skipping straight to the next one without trying to check for support. GCC sets some preprocessor macros for you that might help, e.g. __SSE2__.
#ifdef ARCH_HAS_SSE
typedef policy_list<opencl_policy,
policy_list<sse_policy,
policy_list<basic_policy
> > > active_policy;
#else
typedef policy_list<opencl_policy,
policy_list<basic_policy
> > active_policy;
#endif
You can use this to compile multiple variants on the same platform too, e.g. an SSE and a no-SSE binary on x86.
Use the policy list
Fairly straightforward, call the apply() static method on the policy_list. Trust that it will call the impl() method on the first policy that passes the runtime test.
int main() {
active_policy::apply();
}
If you take the "per operation template" approach I mentioned earlier it might be something more like:
int main() {
Matrix m1, m2;
Vector v1;
active_policy::apply<matrix_mult_t>(m1, m2);
active_policy::apply<vector_mult_t>(m1, v1);
}
In that case you end up making your Matrix and Vector types aware of the policy_list in order that they can decide how/where to store the data. You can also use heuristics for this too, e.g. "small vector/matrix lives in main memory no matter what" and make the runtime_check() or another function test the appropriateness of a particular approach to a given implementation for a specific instance.
I also had a custom allocator for containers, which always produced suitably aligned memory on any SSE/Altivec-enabled build, regardless of whether the specific machine had support for Altivec. It was just easier that way, although it could be a typedef in a given policy, in which case you always assume that the highest-priority policy has the strictest allocator needs.
Example have_altivec():
I've included a sample have_altivec() implementation for completeness, simply because it's the shortest and therefore most appropriate for posting here. The x86/x86_64 CPUID one is messy because you have to support the compiler specific ways of writing inline ASM. The OpenCL one is messy because we check some of the implementation limits and extensions too.
#if HAVE_SETJMP && !(defined(__APPLE__) && defined(__MACH__))
jmp_buf jmpbuf;
void illegal_instruction(int sig) {
// Bad in general - https://www.securecoding.cert.org/confluence/display/seccode/SIG32-C.+Do+not+call+longjmp%28%29+from+inside+a+signal+handler
// But actually Ok on this platform in this scenario
longjmp(jmpbuf, 1);
}
#endif
bool have_altivec()
{
volatile sig_atomic_t altivec = 0;
#ifdef __APPLE__
int selectors[2] = { CTL_HW, HW_VECTORUNIT };
int hasVectorUnit = 0;
size_t length = sizeof(hasVectorUnit);
int error = sysctl(selectors, 2, &hasVectorUnit, &length, NULL, 0);
if (0 == error)
altivec = (hasVectorUnit != 0);
#elif HAVE_SETJMP_H
void (*handler) (int sig);
handler = signal(SIGILL, illegal_instruction);
if (setjmp(jmpbuf) == 0) {
asm volatile ("mtspr 256, %0\n\t" "vand %%v0, %%v0, %%v0"::"r" (-1));
altivec = 1;
}
signal(SIGILL, handler);
#endif
return altivec;
}
Conclusion
Basically you pay no penalty for platforms that can never support an implementation (the compiler generates no code for them) and only a small penalty (potentially just a test/jmp pair that the CPU can predict very well, if your compiler is half-decent at optimising) for platforms that could support something but don't. You pay no extra cost on platforms where the first-choice implementation runs. The details of the runtime tests vary with the technology in question.
If the virtual function overhead is acceptable, option 3 plus a few ifdefs seems a good compromise IMO. There are two variations that you could consider: one with abstract base class, and the other with the plain C implementation as the base class.
Having the C implementation as the base class lets you gradually add the vector-optimized versions, falling back on the non-vectorized versions as you please; using an abstract interface would be a little cleaner to read.
Also, having separate C++ and vectorized versions of your class lets you easily write unit tests that
Ensure that the vectorized code is giving the right result (easy to mess this up, and vector floating registers can have different precision than FPU, causing different results)
Compare the performance of the C++ vs the vectorized. It's often good to make sure the vectorized code is actually doing you any good. Compilers can generate very tight C++ code that sometimes does as well or better than vectorized code.
Here's one with the plain-c++ implementations as the base class. Adding an abstract interface would just add a common base class to all three of these:
// Algo.h:
class Algo_Impl // Default Plain C++ implementation
{
public:
virtual void ComputeSome();
virtual void ComputeSomeMore();
...
};
// Algo_SSE.h:
class Algo_Impl_SSE : public Algo_Impl // SSE
{
public:
virtual void ComputeSome();
virtual void ComputeSomeMore();
...
};
// Algo_Altivec.h:
class Algo_Impl_Altivec : public Algo_Impl // Altivec implementation
{
public:
virtual void ComputeSome();
virtual void ComputeSomeMore();
...
};
// Client.cpp:
Algo_Impl *myAlgo = 0;
#ifdef SSE
myAlgo = new Algo_Impl_SSE;
#elif defined(ALTIVEC)
myAlgo = new Algo_Impl_Altivec;
#else
myAlgo = new Algo_Impl; // the plain C++ default
#endif
...
You may consider employing the Adapter pattern. There are a few types of adapters, and it's quite an extensible concept. Here is an interesting article, Structural Patterns: Adapter and Façade,
that discusses a matter very similar to the one in your question - the Accelerate framework as an example of the Adapter pattern.
I think it is a good idea to discuss a solution on the level of design patterns without focusing on implementation details like the C++ language. Once you decide that the Adapter is the right solution for you, you can look for variants specific to your implementation. For example, in the C++ world there is a known adapter variant called the generic adapter pattern.
This isn't really a whole answer: just a variant on one of your existing options. In option 1 you've assumed that you include algo_altivec.cpp &c. into algo.cpp, but you don't have to do this. You could omit algo.cpp entirely, and have your build system decide which of algo_altivec.cpp, algo_sse.cpp, &c. to build. You'd have to do something like this anyway whichever option you use, since each platform can't compile every implementation; my suggestion is only that whichever option you choose, instead of having #if ALTIVEC_ENABLED everywhere in the source, where ALTIVEC_ENABLED is set from the build system, you just have the build system decide directly whether to compile algo_altivec.cpp .
This is a bit trickier to achieve in MSVC than in make, scons, &c., but still possible. It's commonplace to switch in a whole directory rather than individual source files; that is, instead of algo_altivec.cpp and friends, you'd have platform/altivec/algo.cpp, platform/sse/algo.cpp, and so on. This way, when you have a second algorithm you need platform-specific implementations for, you can just add the extra source file to each directory.
Although my suggestion's mainly intended to be a variant of option 1, you can combine this with any of your options, to let you decide in the build system and at runtime which options to offer. In that case, though, you'll probably need implementation-specific header files too.
In order to hide the implementation details you may just use an abstract interface with a static creator and provide three implementation classes:
// --------------------- Algo.h ---------------------
#pragma once
typedef boost::shared_ptr<class Algo> AlgoPtr;
class Algo
{
public:
static AlgoPtr Create(std::string type);
~Algo();
void process();
protected:
virtual void computeSome() = 0;
virtual void computeMore() = 0;
};
// --------------------- Algo.cpp ---------------------
class PlainAlgo: public Algo { ... };
class AltivecAlgo: public Algo { ... };
class SSEAlgo: public Algo { ... };
AlgoPtr Algo::Create(std::string type) { /* Factory implementation */ }
Please note that since the PlainAlgo, AltivecAlgo and SSEAlgo classes are defined in Algo.cpp, they are only seen from this compilation unit, and therefore the implementation details are hidden from the outside world.
Here is how one can use your class then:
AlgoPtr algo = Algo::Create("SSE");
algo->process();
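The elided /* Factory implementation */ could look roughly like this (my sketch, not the answer's actual code):
AlgoPtr Algo::Create(std::string type)
{
    if (type == "Altivec")
        return AlgoPtr(new AltivecAlgo());
    if (type == "SSE")
        return AlgoPtr(new SSEAlgo());
    return AlgoPtr(new PlainAlgo()); // plain C++ fallback
}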
It seems to me that your first strategy, with separate C++ files and #including the specific implementation, is the simplest and cleanest. I would only add some comments to your Algo.cpp indicating which methods are in the #included files.
e.g.
// Algo.cpp
#include "Algo.h"
Algo::Algo() { }
Algo::~Algo() { }
void Algo::process()
{
computeSome();
computeMore();
}
// The following methods are implemented in separate,
// platform-specific files.
// void Algo::computeSome()
// void Algo::computeMore()
#if defined(ALTIVEC)
#include "Algo_Altivec.cpp"
#elif defined(SSE)
#include "Algo_SSE.cpp"
#else
#include "Algo_Scalar.cpp"
#endif
Policy-like templates (mixins) are fine until you hit the requirement to fall back to the default implementation. That is a runtime operation and should be handled by runtime polymorphism. The Strategy pattern can handle this fine.
There's one drawback to this approach: a Strategy-like algorithm implementation cannot be inlined. Such inlining can provide a reasonable performance improvement in rare cases. If this is an issue, you'll need to move the Strategy boundary up to cover higher-level logic.
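A minimal Strategy sketch for that fallback (my example; the class and method names mirror the question's Algo but are otherwise illustrative):
#include <iostream>

class ComputeStrategy
{
public:
    virtual ~ComputeStrategy() {}
    virtual void computeSome() = 0;
    virtual void computeMore() = 0;
};

class ScalarStrategy : public ComputeStrategy
{
public:
    void computeSome() { std::cout << "scalar some\n"; }
    void computeMore() { std::cout << "scalar more\n"; }
};

class SseStrategy : public ComputeStrategy
{
public:
    void computeSome() { std::cout << "SSE some\n"; }
    void computeMore() { std::cout << "SSE more\n"; }
};

class Algo
{
public:
    // The strategy is chosen at runtime, e.g. after a CPU-feature check,
    // and can always fall back to ScalarStrategy.
    explicit Algo(ComputeStrategy* s) : impl(s) { }
    void process() { impl->computeSome(); impl->computeMore(); }
private:
    ComputeStrategy* impl;
};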

Dealing with functions in a class that should be broken down into functions for clarity?

How is this situation usually dealt with? For example, an object may need to do very specific things:
class Human
{
public:
void eat(Food food);
void drink(Liquid liquid);
String talkTo(Human human);
};
Say that this is what this class is supposed to do, but actually doing these things might result in functions that are well over 10,000 lines. So you would break them down. The problem is that many of those helper functions should not be called by anything other than the function they serve. This makes the code confusing in a way. For example, chew(Food food) would be called by eat() but should not be called by a user of the class, and probably should not be called anywhere else.
How are these cases dealt with generally. I was looking at some classes from a real video game that looked like this:
class CHeli (7 variables, 19 functions)
Variables list
CatalinaHasBeenShotDown
CatalinaHeliOn
NumScriptHelis
NumRandomHelis
TestForNewRandomHelisTimer
ScriptHeliOn
pHelis
Functions list
FindPointerToCatalinasHeli (void)
GenerateHeli (b)
CatalinaTakeOff (void)
ActivateHeli (b)
MakeCatalinaHeliFlyAway (void)
HasCatalinaBeenShotDown (void)
InitHelis (void)
UpdateHelis (void)
TestRocketCollision (P7CVector)
TestBulletCollision (P7CVectorP7CVectorP7CVector)
SpecialHeliPreRender (void)
SpawnFlyingComponent (i)
StartCatalinaFlyBy (void)
RemoveCatalinaHeli (void)
Render (void)
SetModelIndex (Ui)
PreRenderAlways (void)
ProcessControl (void)
PreRender (void)
All of these look like fairly high-level functions, which means their source code must be pretty lengthy. What is good about this is that at a glance it is very clear what this class can do, and the class looks easy to use. However, the code for these functions might be quite large.
What should a programmer do in these cases; what is proper practice for these types of situations?
For example, chew(Food food); would be called by eat() but should not be called by a user of the class and probably should not be called anywhere else.
Then either make chew a private or protected member function, or a freestanding function in an anonymous namespace inside the eat implementation module:
// eat.cc
// details of digestion
namespace {
void chew(Human &subject, Food &food)
{
while (!food.mushy())
subject.move_jaws();
}
}
void Human::eat(Food &food)
{
chew(*this, food);
swallow(*this, food);
}
The benefit of this approach compared to private member functions is that the implementation of eat can be changed without the header changing (which would require recompilation of client code). The drawbacks are that the function cannot be called by any function outside of its module, so it can't be shared by multiple member functions unless they share an implementation file, and that it can't access private parts of the class directly.
The drawback compared to protected member functions is that derived classes can't call chew directly.
The implementation of one member function is allowed to be split in whatever way you want.
A popular option is to use private member functions:
struct Human
{
void eat();
private:
void chew(...);
void eat_spinach();
...
};
or to use the Pimpl idiom:
struct Human
{
void eat();
private:
struct impl;
std::unique_ptr<impl> p_impl;
};
struct Human::impl { ... };
However, as soon as the complexity of eat goes up, you surely don't want a collection of private methods accumulating (be it inside a Pimpl class or inside a private section).
So you want to break down the behavior. You can use classes:
struct SpinachEater
{
void eat_spinach();
private:
// Helpers for eating spinach
};
...
void Human::eat(Aliment* e)
{
if (e->isSpinach()) // Use your favorite dispatch method here
// Factories, or some sort of polymorphism
// are possible ideas.
{
SpinachEater eater;
eater.eat_spinach();
}
...
}
with the basic principles:
Keep it simple
One class one responsibility
Never duplicate code
Edit: A slightly better illustration, showing a possible split into classes:
struct Aliment;
struct Human
{
void eat(Aliment* e);
private:
void process(Aliment* e);
void chew();
void swallow();
void throw_up();
};
// Everything below is in an implementation file
// As the code grows, it can of course be split into several
// implementation files.
struct AlimentProcessor
{
virtual ~AlimentProcessor() {}
virtual void process() {}
};
struct VegetableProcessor : AlimentProcessor
{
private:
virtual void process() { std::cout << "Eeek\n"; }
};
struct MeatProcessor : AlimentProcessor
{
private:
virtual void process() { std::cout << "Hmmm\n"; }
};
// Use your favorite dispatch method here.
// There are many ways to escape the use of dynamic_cast,
// especially if the number of aliments is expected to grow.
std::unique_ptr<AlimentProcessor> Factory(Aliment* e)
{
typedef std::unique_ptr<AlimentProcessor> Handle;
if (dynamic_cast<Vegetable*>(e))
return Handle(new VegetableProcessor);
else if (dynamic_cast<Meat*>(e))
return Handle(new MeatProcessor);
else
return Handle(new AlimentProcessor);
};
void Human::eat(Aliment* e)
{
this->process(e);
this->chew();
if (e->isGood()) this->swallow();
else this->throw_up();
}
void Human::process(Aliment* e)
{
Factory(e)->process();
}
One possibility is to (perhaps privately) compose the Human of smaller objects that each do a smaller part of the work. So, you might have a Stomach object. Human::eat(Food food) would delegate to this->stomach.digest(food), returning a DigestedFood object, which the Human::eat(Food food) function would then process further.
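A sketch of that composition (Stomach, DigestedFood and absorb are illustrative names; Food is the type from the question):
class DigestedFood { /* nutrients, etc. */ };

class Stomach
{
public:
    DigestedFood digest(const Food& food)
    {
        DigestedFood result;
        // ... break the food down ...
        return result;
    }
};

class Human
{
public:
    void eat(Food food)
    {
        DigestedFood digested = stomach.digest(food);
        absorb(digested); // further processing stays inside Human
    }
private:
    void absorb(const DigestedFood&) { /* ... */ }
    Stomach stomach; // private sub-object doing part of the work
};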
Function decomposition is something that is learnt from experience, and it usually implies type decomposition at the same time. If your functions become too large there are different things that can be done; which one is best for a particular case depends on the problem at hand.
separate functionality into private functions
This makes more sense when the functions have to access quite a bit of state from the object, and if they can be used as building blocks for one or more of the public functions
decompose the class into different subclasses that have different responsibilities
In some cases a part of the work falls naturally into its own little subproblem; the higher-level functions can then be implemented in terms of calls to internal subobjects (usually members of the type).
Because the domain that you are trying to model can be interpreted in quite a number of different ways, I am wary of trying to provide a sensible breakdown, but you could imagine that you had a mouth subobject in Human that you could use to ingest food or drink. Inside the mouth subobject you could have functions open, chew, swallow... (see the sketch below).
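Purely as an illustration of that last idea (none of these names come from the question except Food and Liquid):
class Mouth
{
public:
    void open() { /* ... */ }
    void chew() { /* ... */ }
    void swallow() { /* ... */ }
};

class Human
{
public:
    void eat(Food food)
    {
        mouth.open();
        mouth.chew();
        mouth.swallow();
    }
    void drink(Liquid liquid)
    {
        mouth.open();
        mouth.swallow();
    }
private:
    Mouth mouth; // both eat() and drink() delegate to the same sub-object
};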