C++ function argument safety

In a function that takes several arguments of the same type, how can we guarantee that the caller doesn't mess up the ordering?
For example
void allocate_things(int num_buffers, int pages_per_buffer, int default_value ...
and later
// uhmm.. lets see which was which uhh..
allocate_things(40,22,80,...

A typical solution is to put the parameters in a structure, with named fields.
AllocateParams p;
p.num_buffers = 1;
p.pages_per_buffer = 10;
p.default_value = 93;
allocate_things(p);
You don't have to use fields, of course. You can use member functions or whatever you like.
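For reference, a minimal sketch of what such a parameter struct might look like (the struct and field names follow the snippet above; the default values are made up and the C++11 in-class initializers are optional):
struct AllocateParams
{
    int num_buffers = 0;
    int pages_per_buffer = 0;
    int default_value = 0;
};

void allocate_things(const AllocateParams& p);   // one named bundle instead of three bare ints
In older C++ you would give the struct a constructor instead, or document that every field must be set before the call.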

If you have a C++11 compiler, you could use user-defined literals in combination with user-defined types. Here is a naive approach:
struct num_buffers_t {
constexpr num_buffers_t(int n) : n(n) {} // constexpr constructor requires C++14
int n;
};
struct pages_per_buffer_t {
constexpr pages_per_buffer_t(int n) : n(n) {}
int n;
};
constexpr num_buffers_t operator"" _buffers(unsigned long long int n) {
return num_buffers_t(n);
}
constexpr pages_per_buffer_t operator"" _pages_per_buffer(unsigned long long int n) {
return pages_per_buffer_t(n);
}
void allocate_things(num_buffers_t num_buffers, pages_per_buffer_t pages_per_buffer) {
// do stuff...
}
template <typename S, typename T>
void allocate_things(S, T) = delete; // forbid calling with other types, eg. integer literals
int main() {
// now we see which is which ...
allocate_things(40_buffers, 22_pages_per_buffer);
// the following does not compile (see the 'deleted' function):
// allocate_things(40, 22);
// allocate_things(40, 22_pages_per_buffer);
// allocate_things(22_pages_per_buffer, 40_buffers);
}

Two good answers so far, one more: another approach would be to try to leverage the type system wherever possible, and to create strong typedefs. For instance, using boost strong typedef (http://www.boost.org/doc/libs/1_61_0/libs/serialization/doc/strong_typedef.html).
BOOST_STRONG_TYPEDEF(int , num_buffers);
BOOST_STRONG_TYPEDEF(int , num_pages);
void func(num_buffers b, num_pages p);
Calling func with arguments in the wrong order would now be a compile error.
A couple of notes on this. First, boost's strong typedef is rather dated in its approach; you can do much nicer things with variadic CRTP and avoid macros completely. Second, obviously this introduces some overhead as you often have to explicitly convert. So generally you don't want to overuse it. It's really nice for things that come up over and over again in your library. Not so good for things that come up as a one-off. So for instance, if you are writing a GPS library, you should have a strong double typedef for distances in metres, a strong int64 typedef for time past epoch in nanoseconds, and so on.
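If you do not want the Boost macro, a hand-rolled tagged wrapper is only a few lines. This is a deliberately minimal sketch, not the variadic-CRTP version mentioned above; the class and tag names are made up:
template <typename T, typename Tag>
struct strong_typedef
{
    explicit strong_typedef(T v) : value(v) {}
    T value;
};

struct num_buffers_tag {};
struct num_pages_tag {};
using num_buffers = strong_typedef<int, num_buffers_tag>;
using num_pages   = strong_typedef<int, num_pages_tag>;

void func(num_buffers b, num_pages p);

// func(num_buffers(40), num_pages(22));   // ok
// func(num_pages(22), num_buffers(40));   // compile error: wrong types
The explicit constructor is what forces the caller to name the unit at the call site.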

(Note: the post was originally tagged 'C')
C99 onwards allows an extension of @Dietrich Epp's idea: compound literals.
struct things {
int num_buffers;
int pages_per_buffer;
int default_value;
};
void allocate_things(struct things);
// Use a compound literal
allocate_things((struct things){.default_value=80, .num_buffers=40, .pages_per_buffer=22});
Could even pass the address of the structure.
void allocate_things(struct things *);
// Use a compound literal
allocate_things(&((struct things){.default_value=80,.num_buffers=40,.pages_per_buffer=22}));

You can't. That's why it is recommended to have as few function arguments as possible.
In your example you could have separate functions like set_num_buffers(int num_buffers), set_pages_per_buffer(int pages_per_buffer) etc.
You have probably noticed yourself that allocate_things is not a good name, because it doesn't express what the function is actually doing. In particular, I would not expect it to set a default value.
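A rough sketch of that idea (the class and member names here are invented for illustration, not taken from the question):
class ThingAllocator
{
public:
    void set_num_buffers(int n)      { num_buffers = n; }
    void set_pages_per_buffer(int n) { pages_per_buffer = n; }
    void set_default_value(int v)    { default_value = v; }
    void allocate();   // uses whatever was set above
private:
    int num_buffers = 0;
    int pages_per_buffer = 0;
    int default_value = 0;
};
Every value is now named at the call site, at the price of having to make sure all setters have been called before allocate().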

Just for completeness, you could use named arguments, so that your call becomes:
allocate_things(num_buffers=20, pages_per_buffer=40, default_value=20);
// or equivalently
allocate_things(pages_per_buffer=40, default_value=20, num_buffers=20);
However, with current C++ this requires quite a bit of code to be implemented (in the header file declaring allocate_things(), which must also declare appropriate external objects num_buffers etc. providing an operator= that returns a unique suitable object).
---------- working example (for sergej)
#include <iostream>
struct a_t { int x=0; a_t(int i): x(i){} };
struct b_t { int x=0; b_t(int i): x(i){} };
struct c_t { int x=0; c_t(int i): x(i){} };
// implement using all possible permutations of the arguments.
// for many more arguments, better use a variadic template.
void func(a_t a, b_t b, c_t c)
{ std::cout<<"a="<<a.x<<" b="<<b.x<<" c="<<c.x<<std::endl; }
inline void func(b_t b, c_t c, a_t a) { func(a,b,c); }
inline void func(c_t c, a_t a, b_t b) { func(a,b,c); }
inline void func(a_t a, c_t c, b_t b) { func(a,b,c); }
inline void func(c_t c, b_t b, a_t a) { func(a,b,c); }
inline void func(b_t b, a_t a, c_t c) { func(a,b,c); }
struct make_a { a_t operator=(int i) { return {i}; } } a;
struct make_b { b_t operator=(int i) { return {i}; } } b;
struct make_c { c_t operator=(int i) { return {i}; } } c;
int main()
{
func(b=2, c=10, a=42);
}

Are you really going to try to QA all the combinations of arbitrary integers? And throw in all the checks for negative/zero values etc?
Just create two enum types: one for minimum, medium and maximum number of buffers, and one for small, medium and large buffer sizes. Then let the compiler do the work and let your QA folks take an afternoon off:
allocate_things(MINIMUM_BUFFER_CONFIGURATION, LARGE_BUFFER_SIZE, 42);
Then you only have to test a limited number of combinations and you'll have 100% coverage. The people working on your code 5 years from now will only need to know what they want to achieve and not have to guess the numbers they might need or which values have actually been tested in the field.
It does make the code slightly harder to extend, but it sounds like the parameters are for low-level performance tuning, so twiddling the values should not be perceived as cheap/trivial/not needing thorough testing. A code review of a change from
allocate_something(25, 25, 25);
...to
allocate_something(30, 80, 42);
...will likely just get a shrug and be blown off, but a code review of a new enum value EXTRA_LARGE_BUFFERS will likely trigger all the right discussions about memory use, documentation, performance testing etc.
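For illustration, a minimal sketch of the enums and signature behind the call above (the enumerator names mirror the example; the concrete sizes behind them are whatever QA signs off on):
enum BufferConfiguration { MINIMUM_BUFFER_CONFIGURATION, MEDIUM_BUFFER_CONFIGURATION, MAXIMUM_BUFFER_CONFIGURATION };
enum BufferSize          { SMALL_BUFFER_SIZE, MEDIUM_BUFFER_SIZE, LARGE_BUFFER_SIZE };

void allocate_things(BufferConfiguration config, BufferSize size, int default_value);
Because the two enums are distinct types, swapping the first two arguments is also a compile error, which ties this back to the original question.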


C equivalent to C++ decltype

In my C project, there is a struct, created by another colleague, containing some function pointers:
struct tools {
int (*tool_a) (int, int, int);
...
};
I have no right to change this struct and relative files.
Now I'm coding with the struct.
I have to define a function whose return type and list of arguments must be the same as those of tools.tool_a.
Meaning that my function must be as below:
int my_func(int, int, int);
The problem is that the struct changes a lot, especially the return types, for example int is replaced by size_t today, so I have to change my code a lot.
I know that decltype in C++ can help me so I just want to know if C has something equivalent?
I'm thinking I might use a macro, but I don't know how; I don't even know whether it's possible.
REAL CASE
I'm developing some testing tools for linux-kernel with C.
There have been many versions of custom kernels coming from other groups in my company. For historical reasons, some of them used int, others used size_t or ssize_t and so on.
Now when I code, I have to do like this:
// int my_func(int a, int b, int c)
size_t my_func(int a, int b, int c)
// ssize_t my_func(int a, int b, int c)
{}
struct tools my_tool = {
.tool_a = my_func
};
I have to keep commenting and uncommenting...
The sane solution is to enforce a typedef. If that isn't possible, and the number of alternative types the function could have is limited, as seems to be the case, you could cook up something with C11 _Generic.
Instead of having a single function called my_func, create multiple functions with different names. Prefix their names depending on the return type. Then have a macro which in turn re-directs to the appropriate function, based on the type passed.
Example:
#include <stdio.h>
/*** the struct that cannot be changed ***/
struct tools {
int (*tool_a) (int, int, int);
};
/*** any number of functions with different types ***/
int int_my_func(int a, int b, int c)
{
puts(__func__);
return 0; /* dummy value so the declared return type is honoured */
}
size_t size_t_my_func(int a, int b, int c)
{
puts(__func__);
return 0;
}
/*** macro to select the appropriate function based on type ***/
#define my_func_typeof(type) \
_Generic( (type), \
int(*)(int,int,int) : int_my_func, \
size_t(*)(int,int,int) : size_t_my_func)
/*** caller code ***/
int main (void)
{
struct tools my_tool = {
.tool_a = my_func_typeof( (struct tools){0}.tool_a )
};
my_tool.tool_a(1,2,3);
}
Here I used a compound literal (struct tools){0}.tool_a to create a dummy object of the same type as tool_a, then passed that on to the macro which picks the appropriate function. If the type is not supported, there will be a compiler error since no matching _Generic association could be found.
Well, this isn't decltype but if you can just convince your colleague to use a type alias, you can have your static type checking.
If your colleague can be persuaded to do this:
typedef int tool_a_prototype(int, int, int);
struct tools {
tool_a_prototype *tool_a;
};
Then you can declare your functions like this:
tool_a_prototype my_tool_a;
int my_tool_a(int a, int b, int c) {
//Whatever
}
And your friendly compiler will tell you if there's a prototype mismatch.
The problem is that the struct changes a lot, especially the return
types, for example int is replaced by size_t today, so I have to
change my code a lot.
I know that decltype in C++ can help me so I just want to know if C
has something equivalent?
If you are willing to use a non standard gcc extension, you can use typeof:
struct tools {
int (*tool_a) (int, int, int);
};
typedef typeof( ((struct tools*)NULL)->tool_a ) tool_a_type;
typedef typeof( ((tool_a_type)NULL)(0,0,0) ) tool_a_return_type;
tool_a_return_type my_func(int x, int y, int z)
{
}
struct tools my_tool = {
.tool_a = my_func
};

How to call a function from an object with a std::string

Here's my issue: I would like to call the getters/setters of one of my objects, but not directly; I want to do it by using a std::string.
I found this, but it won't work in my case; I think that's because my functions aren't defined in my main method but in my Square class. Also, my functions are not all declared the same way: there's void(std::string), std::string(), void(int)...
Here's an example of what I would like to do.
My Square object:
#include <map>
#include <functional>
#include <string>
class Square{
private:
std::string name;
int width;
float happinessPoint; //extremly important for your square.
public:
void setName(std::string);
void setWidth(int);
void setHappinessPoint(float);
std::string getName();
int getWidth();
float getHappinnessPoint();
};
and my main
#include "Square.h/cpp"
int main(){
Square square = Square("Roger",2,3.5);
// here in my magicalFunction I ask the user for the new values for my square (all as std::string for now)
std::vector<std::string> newValueForSquare = magicalFunction();
for (unsigned int i=0; i < newValueForSquare.size(); i++){
//here I have a function which tells me if my std::string
// is in fact a float or an int
// and I would like to call each of my setters one by one to
// set my Square to the values I asked the user for before all that.
// something like this:
// someFunction("setName","Henry")
}
}
I hope I have been clear; it's pretty hard to explain something you don't know how to do. If you want me to be more specific, tell me and I'll do what I can.
EDIT: What I want to do is to call, for example, my square.setName() with a std::string, without writing square.setName in my main.
To call functions based on a string, you have some choices. Before I list the choices, please search the internet for "C++ factory design pattern".
If-else ladder
Lookup table
Map / Associative array
Hash table
There may be other methods, but the above come to mind.
if-else ladder (a.k.a. switch)
The problem with this method is that the switch statement doesn't work with strings or text literals. So you'll have to suffice with if statements:
if (string == "Roger")
{
Process_Roger();
}
else if (string == "Felicity")
{
Process_Felicity();
}
else
{
Display_Error_Message();
}
Anytime you need to add a new string, you will have to add another "else if" statement to the ladder. Not only do you have to change the code, but you also have to retest it.
Lookup Table
You will need to understand function pointers for this technique and the map technique. Consider this a prerequisite.
Use a structure for mapping text strings to function pointers:
// Assuming all the processing functions share one signature, e.g.:
typedef void (*Function_Pointer)(const std::string &);
struct Text_Function_Pointer
{
const char * name;
Function_Pointer p_function;
};
static const Text_Function_Pointer table[] =
{
{"Larry", Process_Larry},
{"Felicity", Process_Felicity},
};
static const unsigned int table_size =
sizeof(table) / sizeof(table[0]);
//...
for (unsigned int i = 0; i < table_size; ++i)
{
if (search_name == table[i].name)
{
// Execute the processing function.
table[i].p_function(search_name);
break;
}
}
An issue with this technique is that all the function pointers must have the same signature. This is true for the map as well.
A nice feature is that the data in the table is constant, so it can be placed in Read-Only Memory.
Also, to add more associations, add an entry to the table. The search / processing function hasn't changed, so it doesn't need to be tested again.
Map / Associative Array
Prerequisite: Function pointers.
Declare a std::map<std::string, Function_Pointer_Type>. Add your names and functions to the map:
std::map<std::string, Function_Pointer_Type> dispatch_table;
dispatch_table["Roger"] = Process_Roger;
dispatch_table["Felicity"] = Process_Felicity;
dispatch_table["Larry"] = Process_Larry;
//...
// Execute appropriate processing function:
(dispatch_table[search_name])();
One issue with this method is that the std::map data structure needs to be initialized; it can't be directly accessed or loaded from executable code.
Again, all functions must have the same signature.
Hash Table
The idea here is to have an array of function pointers or an array of structures with text & function pointers. Create a hash function that generates a unique array index based on the name string. Use the index to get the function pointer from the array, then execute the function via the function pointer.
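To make the hash-table variant concrete for the question above, here is a hedged sketch using std::unordered_map (the standard hash container). It assumes the Square class from the question is visible here, and it gives every entry the same string-in signature so the numeric conversion happens inside each lambda:
#include <functional>
#include <string>
#include <unordered_map>

// assumes the Square class from the question has been included
using Setter = std::function<void(Square&, const std::string&)>;

const std::unordered_map<std::string, Setter> setters = {
    {"setName",           [](Square& s, const std::string& v) { s.setName(v); }},
    {"setWidth",          [](Square& s, const std::string& v) { s.setWidth(std::stoi(v)); }},
    {"setHappinessPoint", [](Square& s, const std::string& v) { s.setHappinessPoint(std::stof(v)); }}
};

void someFunction(Square& square, const std::string& name, const std::string& value)
{
    auto it = setters.find(name);
    if (it != setters.end())
        it->second(square, value);   // convert the string and call the matching setter
}
someFunction plays the role of the someFunction("setName", "Henry") call sketched in the question; unknown names are simply ignored in this sketch.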
Several solutions are available to you. You basically want to parse user input to fill your Square class attributes.
One way is to use the std::stoi family of functions:
std::vector<std::string> values { "Roger", "2", "3.5" };
std::string name = values[0]; // No problem, two strings
int width = std::stoi(values[1]); // stoi = stringToInt
float happiness = std::stof(values[2]); // stof = stringToFloat
I'm not sure why you'd need the for loop, unless there is something I didn't understand in your question. I'll update my answer accordingly.
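If you also need the "is this string an int or a float?" check mentioned in the question, one hedged way is to let the conversion functions answer it (a sketch; it only accepts strings that parse completely):
#include <cstddef>
#include <string>

bool is_int(const std::string& s)
{
    std::size_t pos = 0;
    try { (void)std::stoi(s, &pos); } catch (...) { return false; }
    return pos == s.size();   // the whole string must have been consumed
}

bool is_float(const std::string& s)
{
    std::size_t pos = 0;
    try { (void)std::stof(s, &pos); } catch (...) { return false; }
    return pos == s.size();
}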
Update 1
After reading other answers, I would like to propose my solution to your problem. As stated several times in my comments, this is not an easy answer!
I needed such a class to write a generic test engine, and this is the code I used. It works really well with any type of function (except for routines with a return type of void -- a simple template specialization would solve it though)
# include <functional>
# include <tuple>
template<int ...>
struct seq
{
};
template<int N, int ...S>
struct gens : gens<N - 1, N - 1, S...>
{
};
template<int ...S>
struct gens<0, S...>
{
typedef seq<S...> type;
};
struct callable_base
{
virtual void operator()() = 0;
virtual ~callable_base()
{ }
};
class Task
{
private:
template<class RT, class Functor, class ...Args>
struct functor : public callable_base
{
functor(RT& result, Functor func, Args ...args)
: _ret(result)
{
_func = func;
_args = std::make_tuple(args...);
}
void operator()()
{
_ret = call(typename gens<sizeof...(Args)>::type());
}
template<int ...S>
RT call(seq<S...>)
{
return (_func(std::get<S>(_args)...));
}
private:
std::function<RT(Args...)> _func;
std::tuple<Args...> _args;
RT& _ret;
};
public:
Task()
{
_functor = nullptr;
}
template<class RT, class Functor, class ...Args>
Task(RT& result, Functor func, Args... args)
{
_functor = new functor<RT, Functor, Args...>(result, func, args...);
}
void operator()()
{
(*_functor)();
}
~Task()
{
delete _functor;
}
private:
callable_base *_functor;
};
The idea behind this code is to hide the function signature in the inner class Task::functor and get the return value in the first parameter passed to the Task(...) constructor. I'm giving this code first because I think it might help some people, but also because I think it is an elegant solution to your problem. Bear in mind that to understand most of the code, you need solid C++ knowledge. I'll detail the code in subsequent updates if needed.
Here's how you'd use it:
int main()
{
int retVal;
std::string newName;
std::map<std::string, Task *> tasks {
{"setName", new Task(retVal, &Square::setName, &newName)}
...
};
/* Modify the name however you want */
...
(*tasks["setName"])();
}
This whole class could be optimized, of course, primarily thanks to C++14 and move semantics, universal references and all, but I kept it simple ~
A major problem is that you have to use pointers if you don't know the values of the parameters at the time you fill the task map. I'm working on another version to simplify this aspect, but I wanted to show you that C++ is not designed to do what you ask simply. Maybe you come from a functional or JS world, in which this would be trivial x)
Update 2
I just wanted to point out that with C++14 you could omit the first 3 structures, which are only there to help me expand my tuple into an argument list, by using std::integer_sequence.

How do I declare non-const arguments to an object template

I wrote two classes. The first is a class template, Bit<size>, that converts a decimal number to binary. The second is the LogicalExpression class.
Bit class:
template<int size>
class Bit
{
public:
Bit(int);
void ConvertToBinary(int);
bool number[size];
int bit;
};
template <int size> Bit<size>::Bit(int decimalNumber)
{
this->bit = 0;
ConvertToBinary(decimalNumber);
}
template <int size> void Bit<size>::ConvertToBinary(int decimalNumber)
{
number[size - ++this->bit] = decimalNumber % 2;
if (size != this->bit) {
ConvertToBinary(decimalNumber / 2);
}
}
LogicalExpression class:
#include "Bit.h"
class LogicalExpression
{
private:
char* expression;
char* variables;
int expLenght;
int varLenght;
public:
LogicalExpression(char*);
~LogicalExpression();
bool ExpressionToBoolean(char*, Bit<????>); //here is the problem
I want to use the LogicalExpression class as a normal non-template class; as a result I do not know how to declare the constant argument for Bit<???>. It should be Bit<varLenght>, but varLenght is a non-const value and I do not want to do LogicalExpression<varLenght> obj.
I hope my English is not so bad that you cannot understand me.
The problem here is possibly a misunderstanding of how templates work.
Templates are evaluated at compile time. Therefore the value in between < > cannot be non-const. It's simply not possible, because templates are not evaluated at run time. This is actually a strength, not a weakness (see TMP). For comparison, they are more like preprocessor defines than, say, a function call, but they are not actually the same thing as macros.
In this case you need to rethink your design. In this part:
template<int size>
class Bit
{
public:
Bit(int);
void ConvertToBinary(int);
bool number[size];
int bit;
};
You either want "number" to be a dynamic array, so that it would become something like:
class Bit
{
public:
Bit(int length) { number = new bool[length]; }
~Bit() { delete[] number; }
void ConvertToBinary(int);
bool* number;
int bit;
};
it doesn't need to be a template and would be used like:
bool ExpressionToBoolean(char*)
{
Bit foo(varLength);
}
You could use std::vector for simplicity.
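For instance, a rough sketch of Bit rewritten on top of std::vector, so the manual new/delete (and the copy-handling it would eventually need) goes away:
#include <vector>

class Bit
{
public:
    explicit Bit(int length) : number(length), bit(0) {}
    void ConvertToBinary(int decimalNumber);
    std::vector<bool> number;   // sized at run time, no manual delete needed
    int bit;
};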
OR "LogicalExpression" should be a template class (which you have said you don't want)
template<int varLenght>
class LogicalExpression
{
private:
char* expression;
char* variables;
int expLenght;
public:
LogicalExpression(char*);
~LogicalExpression();
bool ExpressionToBoolean(char*, Bit<varLenght>); // now varLenght is the template parameter, so this works
But really this boils down to a question of where you want your memory allocated: do you want it on the heap or the stack?
Heap: dynamic array (the size can be decided at run time)
Stack: templates (the size must be known at compile time)
If you don't care, I would probably just stick with the dynamic array approach, because you could easily overcomplicate the problem with templates... but this problem may be suited to TMP, depending on your requirements. If you want it on the stack then you will have to use some form of
LogicalExpression< "const" > obj
"somewhere", which if its a syntactical taste you could use something like:
typedef LogicalExpresion8Bit LogicalExpression<8>
If you want dynamic evaluation then you have to either use dynamic memory or something a bit crazier like a combination of polymorphic and interfaces which will most likely lead to more memory on the stack then you actually want/need, not to mention a lot more code...(i.e. each variant stored in an array and selected via index).

C++ automatic generation of switch statement

Consider the following code
#include <iostream>
enum MyEnum{
A,
B,
END
};
template <int N>
class Trait {};
template<>
class Trait<A> {
public:
static int funct(int i) {return i*3;}
};
template<>
class Trait<B> {
public:
static int funct(int i) {return i*24;}
};
using namespace std;
int main(){
int i = 1;
switch(i){
case A: cout << Trait<A>::funct(i) << endl; break;
case B: cout << Trait<B>::funct(i) << endl; break;
}
}
Which will print 24 on the screen.
Assume now that I have many more values in the enum and that I define all the corresponding
template specializations of the class Trait.
To avoid writing all the code necessary in the switch statement I wrote a REPEAT macro which works almost like I want:
#include <iostream>
#define REPEAT(N, macro) REPEAT_##N(macro)
#define REPEAT_0(macro)
#define REPEAT_1(macro) REPEAT_0(macro) macro(0)
#define REPEAT_2(macro) REPEAT_1(macro) macro(1)
#define REPEAT_3(macro) REPEAT_2(macro) macro(2)
#define REPEAT_4(macro) REPEAT_3(macro) macro(3)
// etc...
// enum and class definitions
int main(){
#define MY_MACRO(N) case N: cout << Trait<N>::funct(i) << endl; break;
switch(i){
REPEAT(2, MY_MACRO)
}
}
The problem I have with this approach is that I cannot use
REPEAT(END, MY_MACRO)
because the preprocessor doesn't know about my enum.
Question: Is there a way to generate automatically the switch statement?
Notes:
The situation where I have to use this is much more complicated and having something automated would be really helpful.
It is important for me to use a switch statement because of the speed which can be achieved (speed is crucial for my application).
Thanks!
EDIT 1
More notes:
It is important that the generation of the switch depends on the value of END defined in the enum.
EDIT 2/3
I decided to make an addition here to explain better my application and why I prefer some solutions to others
In my real application the enum contains almost 50 different values and it will be extended in the future (hopefully by other people). The enum contains consecutive values.
the class "Trait" has more than 1 member function (currently 5). Furthermore, I need to use all this in 5 different files. If I don't use an automated way of generating what I need I end up writing many times code which is basically the same.
the member functions of Trait are used in the same way all the times.
currently, inside my switch I have a function call which looks like this (in1, in2 and out are all doubles passed by reference, const in the first two cases).
case A: Trait::funct(in1, in2, out); break;
Why do I like templates?
Consider the case where Trait has 2 functions, funct1 and funct2. I could define
template <int N>
class Trait {
public:
static int funct1(int i){static_assert(N!=N, "You forgot to define funct1");}
static int funct2(int i){static_assert(N!=N, "You forgot to define funct2");}
};
Now, if a function definition is missing, the compiler will return a meaningful error. When other people make additions this will be helpful.
Using the method based on C++11 features suggested by Jarod42 I can avoid maintaining long arrays of function pointers which would be error prone.
Speed tests
So far I have experimented with 3 solutions, but with only two member functions in Trait:
the solution suggested by Jarod42
a simple array of function pointers as suggested by nndru and Ali
switch statement with the REPEAT macro
The first two solutions seem to be equivalent, while the one based on the switch is 5 times faster. I used gcc version 4.6.3 with the flag -O3.
As you say, your enum is contiguous. In that case you don't need any templates or std::map or switch:
Simply use an array of function pointers and the enum as the index into the function pointer array!
#include <cassert>
#include <cstdio>
enum {
A,
B,
SIZE
};
int A_funct(int i) { return 3*i; }
int B_funct(int i) { return 24*i; }
typedef int (*enum_funct)(int );
enum_funct map[] = { A_funct, B_funct };
// In C++11 use this:
//static_assert( sizeof(map)/sizeof(map[0])==SIZE , "Some enum is missing its function!");
int main() {
assert(sizeof(map)/sizeof(map[0])==SIZE && "Some enum is missing its function!");
int i = 1;
std::printf("case A prints %d\n", map[A](i) );
std::printf("case B prints %d\n", map[B](i) );
}
UPDATE: From your comments:
My only concern about maintainability is about writing down explicitly
5 different function pointer arrays (if I don't automate this).
OK, now I understand the maintenance concern.
I believe you can only achieve this (no matter whether you use function pointer arrays or the switch approach) if you use some sort of source code generation, either with macros or write your own source code generator. You also have to work out some naming conventions so that the function pointer arrays (or the code at the case statements in the switch approach) can be automatically generated.
Since you didn't specify it, I just made up my own naming convention. If you are comfortable with macros, here is what I hacked together with the Boost Preprocessor Library by some mindless editing of the example:
#include <boost/preprocessor/repetition.hpp>
#define ENUM_SIZE 2
#define ENUM(z, n, unused) e##n,
enum {
BOOST_PP_REPEAT(ENUM_SIZE, ENUM, ~)
SIZE
};
#undef ENUM
int fA_e0(int i) { return 3*i; }
int fA_e1(int i) { return 24*i; }
int fB_e0(int i) { return 32*i; }
int fB_e1(int i) { return 8*i; }
typedef int (*enum_funct)(int );
#define MAP(z, n, case) f ## ##case ## _e##n,
enum_funct map_A[] = {
BOOST_PP_REPEAT(ENUM_SIZE, MAP, A)
};
enum_funct map_B[] = {
BOOST_PP_REPEAT(ENUM_SIZE, MAP, B)
};
#undef MAP
Here is what we get after the preprocessor resolved these macros (g++ -E myfile.cpp):
enum { e0, e1, SIZE };
[...]
typedef int (*enum_funct)(int );
enum_funct map_A[] = {
fA_e0, fA_e1,
};
enum_funct map_B[] = {
fB_e0, fB_e1,
};
So, as you can see, if you specify your own naming conventions, you can automatically generate the maps (function pointer arrays). The documentation is good.
However, if I were you, I would write my own source code generator. I would specify a simple text file format (key - value pairs on one line, separated by white space) and write my own tool to generate the desired C++ source files from this simple text file. The build system would then invoke my source code generator tool in the pre-build step. In that way, you don't have to mess with macros. (By the way, I wrote a little testing framework for myself and to circumvent the lack of reflection in C++ I use my own source code generator. Really not that difficult.)
The first two solutions seem to be equivalent, while the one based on
the switch is 5 times faster. I used gcc version 4.6.3 with the flag
-O3.
I would have to see your source code, the generated assembly and how you measured the time in order to understand how that happened.
So I also did my own speed tests. Since it would clutter this answer, the source codes are here: switch approach and the function pointer array approach.
As I expected: the switch approach is faster but only if you have a handful of branches. Andrei Alexandrescu also says the same in his talk
Writing Quick Code in C++, Quickly, at around 38 min. On my machine, the switch approach is as fast as the function pointer array approach if the enum size is 5. If the enum size is bigger than 5, the function pointer array approach is consistently faster. If the enum size is 200 and there are 10^8 function invocations, it is more than 10% faster on my machine. (The online codes have only 10^7 function invocations otherwise it times out.)
(I used link time optimization (-O3 -flto flag both to the compiler and the linker) and I can only recommend it; it gives a nice performance boost (in my codes up to 2.5x) and the only thing you need to do is to pass one extra flag. However, in your case the code was so simple that it didn't change anything. If you wish to try it: The link time optimization is either not available or only experimental in gcc 4.6.3.)
From your comments:
I made new experiments following step by step your benchmark method
but I still get better results with the switch statement (when the
enum size is 150 the switch is still almost twice as fast as than the
solution with pointers). [...]
In the test with my code the switch method performs always better. I run also some
experiments with your code and I got the same kind of results you got.
I have looked at the generated assembly codes, having at least 5 functions (5 cases). If we have at least this many functions, roughly speaking, what happens is that the compiler turns the switch approach into the function pointer approach with one significant disadvantage. Even in the best case, the switch always goes through 1 extra branch (integer comparison potentially followed by a jump) compared to the hand-coded function pointer array approach when dispatching to the function to be called. This extra branch belongs to the default: label which is generated even if you deliberately omit it in the C++ code; there is no way to stop the compiler from generating the code for this. (If you have at most 4 cases and all 4 function calls can be inlined, then it is different; however you already have 50 cases so it doesn't matter.)
Apart from that, with the switch approach, additional (redundant) instructions and paddings are generated, corresponding to the code at the case: labels. This potentially increases your cache misses. So, as I see it, the switch is always inferior to the function pointer approach if you have more than a handful of cases (5 cases on my machine). That is what Andrei Alexandrescu says in his talk too; he gives a limit of ~7 cases.
As for the reasons why your speedtests indicate the opposite: These sort of speed testings are always unreliable because they are extremely sensitive to alignment and caching. Nevertheless, in my primitive tests, the switch approach was always slightly worse than the function pointer array, which is in agreement with my above analysis of the assembly codes.
Another advantage of the function pointer arrays is that it can be built and changed at runtime; this is something that you don't get with the switch approach.
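As a small illustration of "changed at runtime", using the map array from the code above (alternative_B_funct and the wrapper function are made-up names):
int alternative_B_funct(int i) { return 100 * i; }

// e.g. called after reading a config flag at run time:
void use_alternative_B() { map[B] = alternative_B_funct; }   // re-route the B case without touching any switch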
The weird thing is that the speed I get with the function pointer
array changes depending on the enum size (I would expect it to be
roughly constant).
As the enum size grows, you have more functions and the instruction cache misses are more likely to happen. In other words, the program should run slightly slower if you have more functions. (It does on my machine.) Of course the whole thing happens at random, so there will be significant deviations, don't be surprised if it runs faster for ENUM_SIZE=42 than for 41. And as mentioned earlier, alignment adds additional noise to this.
In C++11, you may do the following:
#if 1 // Not in C++11
#include <cstdint>
template <std::size_t ...> struct index_sequence {};
template <std::size_t I, std::size_t ...Is>
struct make_index_sequence : make_index_sequence < I - 1, I - 1, Is... > {};
template <std::size_t ... Is>
struct make_index_sequence<0, Is...> : index_sequence<Is...> {};
#endif
#include <array> // for the std::array used below
namespace detail {
template <std::size_t ... Is>
int funct(MyEnum e, int i, index_sequence<Is...>)
{
// create an array of pointer on function and call the correct one.
return std::array<int(*)(int), sizeof...(Is)>{{&Trait<MyEnum(Is)>::funct...}}[(int)e](i);
}
} // namespace detail
int funct(MyEnum e, std::size_t i)
{
return detail::funct(e, i, make_index_sequence<std::size_t(END)>());
}
Note: the enum should not have holes (so here A=0 and B=1 is OK)
Following macro may help:
#define DYN_DISPATCH(TRAIT, NAME, SIGNATURE, ENUM, ENUM_END) \
namespace detail { \
template <std::size_t ... Is> \
constexpr auto NAME(ENUM e, index_sequence<Is...>) -> SIGNATURE \
{ \
return std::array<SIGNATURE, sizeof...(Is)>{{&TRAIT<ENUM(Is)>::NAME...}}[(int)e]; \
} \
} /*namespace detail */ \
template <typename...Ts> \
auto NAME(ENUM e, Ts&&...ts) \
-> decltype(std::declval<SIGNATURE>()(std::declval<Ts>()...)) \
{ \
return detail::NAME(e, make_index_sequence<std::size_t(ENUM_END)>())(std::forward<Ts>(ts)...); \
}
And then use it as:
DYN_DISPATCH(Trait, funct, int(*)(int), MyEnum, END)
// now `int funct(MyEnum, int)` can be called.
You don't need templates at all to do this; good old X macros will do:
#define MY_ENUM_LIST VAL(A) VAL(B)
// define an enum
#define VAL(x) x,
enum MyEnum { MY_ENUM_LIST END };
#undef VAL
// define a few functions doing a switch on Enum values
void do_something_with_Enum (MyEnum value, int i)
{
switch (value)
{
#define VAL(N) case N: std::cout << Trait<N>::funct(i) << std::endl; break;
MY_ENUM_LIST
#undef VAL
}
}
int do_something_else_with_Enum (MyEnum value)
{
switch (value)
{
#define VAL(x) case x: yet_another_template_mayhem(x);
MY_ENUM_LIST
#undef VAL
}
}
I've wasted enough time with this already. If you think templates are the solution, simply change your question to "template experts only, preprocessor not good enough" or something.
You will not be the first to waste your time on useless templates. Many people make a fat living on providing bloated, useless solutions to nonexistent problems.
Besides, your assumption of a switch being faster than an array of function pointers is highly debatable. It all depends on the number of values in your enum and the variability of the code inside your case statements.
Now if optimization is not such a big issue, you can simply use virtual methods to specialize the behaviour of whatever objects are selected by your enum and let the compiler handle the whole "automatic switch" stuff for you.
The only benefit of this approach is to avoid duplicating code if your objects are similar enough to make you think you will do a better job than the compiler handling them in a specialized way.
What you seem to be asking for is a generic solution for optimizing an unknown code pattern, and that is a contradiction in terms.
EDIT: thanks to Jarod42 for cleaning up the example.
It looks like you would like to associate an integer id with each function and find functions by that id.
If your id's are sequential you can have an array of function pointers indexed by that id, which would give you O(1) lookup complexity, e.g.:
typedef int Fn(int);
enum FnId {
A,
B,
FNID_COUNT
};
int fn_a(int);
int fn_b(int);
Fn* const fns[FNID_COUNT] = {
fn_a,
fn_b
};
int main() {
fns[A](1); // invoke function with id A.
}
If the id's are not sequential, you can still have a sorted array of {id, function_ptr} tuples and do binary search on it, O(lg(N)) lookup complexity.
None of these require macros or templates.
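A hedged sketch of the non-sequential case with std::lower_bound, reusing the Fn typedef and fn_a/fn_b from above (the id values 10 and 42 are made up):
#include <algorithm>
#include <iterator>

struct IdFn { int id; Fn* fn; };

// must stay sorted by id for the binary search to work
const IdFn sorted_fns[] = { {10, fn_a}, {42, fn_b} };

Fn* find_fn(int id) {
    const IdFn* it = std::lower_bound(std::begin(sorted_fns), std::end(sorted_fns), id,
        [](const IdFn& entry, int value) { return entry.id < value; });
    return (it != std::end(sorted_fns) && it->id == id) ? it->fn : nullptr;   // unknown id gives nullptr
}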
For numeric (database) type identifiers I have a template holding the identifiers. A dispatch via variadic templates calls a functor with the matching type traits:
#include <iostream>
#include <stdexcept>
// Library
// =======
class TypeIdentifier
{
public:
typedef unsigned Integer;
enum Value
{
Unknown,
Bool,
Int8,
UInt8,
Int16,
UInt16,
Int32,
UInt32,
Int64,
UInt64,
Float,
Double,
String,
LargeObject,
Date,
Time,
DateTime
};
template <Value ...Ids> struct ListType {};
typedef ListType<
Bool,
Int8,
UInt8,
Int16,
UInt16,
Int32,
UInt32,
Int64,
UInt64,
Float,
Double,
String,
LargeObject,
Date,
DateTime,
// Always the last value:
Unknown
>
List;
public:
TypeIdentifier(Integer value = Unknown)
: m_id(value)
{}
Integer id() const { return m_id; }
/// Invoke a functor having a member function 'Result apply<Traits>()'.
template<typename Functor>
typename Functor::result_type dispatch(const Functor&);
private:
Integer m_id;
};
template<TypeIdentifier::Value I>
struct TypeTraits
{
static constexpr TypeIdentifier::Value Id = I;
static constexpr bool is(TypeIdentifier::Integer id) { return (Id == id); }
static bool is(TypeIdentifier type_identifier) { return (Id == type_identifier.id()); }
// And conversion functions
};
namespace TypeIdentifierDispatch {
template <typename Functor, TypeIdentifier::Value I, TypeIdentifier::Value ... Ids> struct Evaluate;
template <typename Functor>
struct Evaluate<Functor, TypeIdentifier::Unknown> {
static typename Functor::result_type
apply(TypeIdentifier::Integer id, const Functor&) {
throw std::logic_error("Unknown Type");
}
};
template <typename Functor, TypeIdentifier::Value I, TypeIdentifier::Value ... Ids>
struct Evaluate {
static typename Functor::result_type
apply(TypeIdentifier::Integer id, const Functor& functor) {
if(TypeTraits<I>::is(id))
return functor.template apply<TypeTraits<I>>();
else return Evaluate<Functor, Ids...>::apply(id, functor);
}
};
template <typename Functor, TypeIdentifier::Value ... Ids>
inline typename Functor::result_type
evaluate(TypeIdentifier::Integer id, const Functor& functor, TypeIdentifier::ListType<Ids...>)
{
return Evaluate<Functor, Ids...>::apply(id, functor);
}
} // namespace TypeIdentifierDispatch
template<typename Functor>
inline
typename Functor::result_type TypeIdentifier::dispatch(const Functor& functor) {
return TypeIdentifierDispatch::evaluate(id(), functor, TypeIdentifier::List());
}
// Usage
// =====
struct Print {
typedef void result_type;
template <typename Traits>
result_type apply() const {
std::cout << "Type Identifier: " << Traits::Id << '\n';
}
};
inline void print_identifier(unsigned value) {
TypeIdentifier(value).dispatch(Print());
}
int main ()
{
print_identifier(TypeIdentifier::String);
return 0;
}
Adding a new type to the library requires adjusting TypeIdentifier and (maybe) adding a specialized TypeTraits.
Note the enum values can be arbitrary.
Using recursive templates you can automatically generate a construct equivalent to
if (i == A)
Trait<A>::funct(i);
else if (i == B)
Trait<B>::funct(i);
I think its performance is similar to a switch statement's. Your original example can be rewritten as below.
#include <iostream>
using namespace std;
enum MyEnum {
A,
B,
END
};
template <MyEnum N>
class Trait
{ public:
static int funct(int i)
{
cout << "You forgot to define funct" << i;
return i;
}
};
template<>
class Trait<A> {
public:
static int funct(int i) { return i * 3; }
};
template<>
class Trait<B> {
public:
static int funct(int i) { return i * 24; }
};
template <MyEnum idx>
int Switch(const MyEnum p, const int n)
{
return (p == idx) ? Trait<idx>::funct(n) : Switch<(MyEnum)(idx - 1)>(p, n);
}
template <>
int Switch<(MyEnum)(0)>(const MyEnum p, const int n)
{
return Trait<(MyEnum)(0)>::funct(n);
}
int funct(MyEnum n)
{
return Switch<END>(n, n);
}
int main() {
MyEnum i = B;
cout << funct(i);
}

optimize output value using a class and public member

Suppose you have a function that you call a lot of times, and every time it returns a big object. I've optimized the problem using a functor that returns void and stores the result in a public member:
#include <vector>
const int N = 100;
std::vector<double> fun(const std::vector<double> & v, const int n)
{
std::vector<double> output = v;
output[n] *= output[n];
return output;
}
class F
{
public:
F() : output(N) {};
std::vector<double> output;
void operator()(const std::vector<double> & v, const int n)
{
output = v;
output[n] *= n;
}
};
int main()
{
std::vector<double> start(N,10.);
std::vector<double> end(N);
double a;
// first solution
for (unsigned long int i = 0; i != 10000000; ++i)
a = fun(start, 2)[3];
// second solution
F f;
for (unsigned long int i = 0; i != 10000000; ++i)
{
f(start, 2);
a = f.output[3];
}
}
Yes, I could inline the function or optimize this in another way, but here I want to stress this point: with the functor I declare and construct the output variable only once, whereas with the function I do that every time it is called. The second solution is twice as fast as the first with g++ -O1 or g++ -O2. What do you think about it, is it an ugly optimization?
Edit:
To clarify my aim: I have to evaluate the function >10M times, but I need the output only a few random times. It's important that the input is not changed; in fact I declared it as a const reference. In this example the input is always the same, but in the real world the input changes and is a function of the previous output of the function.
A more common scenario is to create an object with a large enough reserved size outside the function and pass the large object to the function by pointer or by reference. You can reuse this object across several calls to your function, and thus reduce continual memory allocation.
In both cases you are allocating new vector many many times.
What you should do is to pass both input and output objects to your class/function:
void fun(const std::vector<double> & in, const int n, std::vector<double> & out)
{
out[n] *= in[n];
}
This way you separate your logic from the algorithm. You only have to create a new std::vector once and can pass it to the function as many times as you want. Notice that no unnecessary copy/allocation is made.
P.S. It's been a while since I did C++; it may not compile right away.
It's not an ugly optimization. It's actually a fairly decent one.
I would, however, hide output and make an operator[] member to access its members. Why? Because you just might be able to perform a lazy evaluation optimization by moving all the math to that function, thus only doing that math when the client requests that value. Until the user asks for it, why do it if you don't need to?
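A rough sketch of that suggestion, assuming the input vector outlives the functor call (class name reuses F from the question, C++11 in-class initializers used for brevity):
#include <cstddef>
#include <vector>

class F
{
public:
    // remember the inputs, do no work yet
    void operator()(const std::vector<double>& v, int n) { in = &v; idx = n; }

    // do the math only for the element that is actually requested
    double operator[](std::size_t i) const
    {
        double x = (*in)[i];
        return (i == static_cast<std::size_t>(idx)) ? x * idx : x;
    }

private:
    const std::vector<double>* in = nullptr;
    int idx = 0;
};

// usage, mirroring the second loop of the question:
//   F f;
//   f(start, 2);
//   a = f[3];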
Edit:
Just checked the standard. The behavior of the assignment operator is based on insert(). Notes for that function state that an allocation occurs if the new size exceeds the current capacity. Of course this does not seem to explicitly disallow an implementation from reallocating even otherwise... I'm pretty sure you'll find none that do, and I'm sure the standard says something about it somewhere else. Thus you've improved speed by removing allocation calls.
You should still hide the internal vector. You'll have more chance to change implementation if you use encapsulation. You could also return a reference (maybe const) to the vector from the function and retain the original syntax.
I played with this a bit, and came up with the code below. I keep thinking there's a better way to do this, but it's escaping me for now.
The key differences:
I'm allergic to public member variables, so I made output private, and put getters around it.
Having the operator return void isn't necessary for the optimization, so I have it return the value as a const reference so we can preserve return value semantics.
I took a stab at generalizing the approach into a templated base class, so you can then define derived classes for a particular return type, and not re-define the plumbing. This assumes the object you want to create takes a one-arg constructor, and the function you want to call takes in one additional argument. I think you'd have to define other templates if this varies.
Enjoy...
#include <vector>
template<typename T, typename ConstructArg, typename FuncArg>
class ReturnT
{
public:
ReturnT(ConstructArg arg): output(arg){}
virtual ~ReturnT() {}
const T& operator()(const T& in, FuncArg arg)
{
output = in;
this->doOp(arg);
return this->getOutput();
}
const T& getOutput() const {return output;}
protected:
T& getOutput() {return output;}
private:
virtual void doOp(FuncArg arg) = 0;
T output;
};
class F : public ReturnT<std::vector<double>, std::size_t, const int>
{
public:
F(std::size_t size) : ReturnT<std::vector<double>, std::size_t, const int>(size) {}
private:
virtual void doOp(const int n)
{
this->getOutput()[n] *= n;
}
};
int main()
{
const int N = 100;
std::vector<double> start(N,10.);
double a;
// second solution
F f(N);
for (unsigned long int i = 0; i != 10000000; ++i)
{
a = f(start, 2)[3];
}
}
It seems quite strange (I mean, the need for optimization at all): I think that a decent compiler should perform return value optimization in such cases. Maybe all you need is to enable it.