What is the performance overhead of std::function? - c++

I heard on a forum that using std::function<> causes a performance drop. Is it true? If so, is it a big performance drop?

There are, indeed, performance issues with std::function that must be taken into account whenever using it. The main strength of std::function, namely its type-erasure mechanism, does not come for free, and we might (but not necessarily must) pay a price for that.
std::function is a template class that wraps callable types. However, it is not parametrized on the callable type itself but only on its return and argument types. The callable type is known only at construction time and, therefore, std::function cannot have a pre-declared member of this type to hold a copy of the object given to its constructor.
Roughly speaking (actually, things are more complicated than that) std::function can hold only a pointer to the object passed to its constructor, and this raises a lifetime issue. If the pointer points to an object whose lifetime is shorter than that of the std::function object, then the inner pointer will become dangling. To prevent this problem, std::function might make a copy of the object on the heap through a call to operator new (or a custom allocator). This dynamic memory allocation is what people most often refer to as the performance penalty of std::function.
I have recently written an article with more details and that explains how (and where) one can avoid paying the price of a memory allocation.
Efficient Use of Lambda Expressions and std::function
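To make the copy-versus-pointer trade-off described above concrete, here is a minimal sketch. The Counter type and the two helper functions are hypothetical names introduced only for illustration. It shows that std::function stores its own copy of the callable by default, and that wrapping the callable in std::ref makes it store a reference instead. A reference wrapper is typically small enough to be held without a heap allocation, but the lifetime of the original object then becomes the caller's responsibility.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stateful callable used only for this illustration.
struct Counter {
    int calls = 0;
    void operator()() { ++calls; }
};

// std::function copies the callable, so the original never sees the calls.
inline int calls_seen_by_original_when_copied() {
    Counter original;
    std::function<void()> f = original;  // stores a copy of original
    assert(f);                           // f holds a target
    f();
    f();
    return original.calls;
}

// std::ref makes std::function store a reference to the original instead.
// The original must outlive f, otherwise the reference dangles.
inline int calls_seen_by_original_when_referenced() {
    Counter original;
    std::function<void()> f = std::ref(original);
    assert(f);
    f();
    f();
    return original.calls;
}
```

In the first helper the original counter stays at zero because only the internal copy is invoked; in the second, both calls reach the original object.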

You can find information in Boost's reference materials: How much overhead does a call through boost::function incur? and Performance
This doesn't settle "yes or no" for boost::function. The performance drop may well be acceptable given the program's requirements. More often than not, parts of a program are not performance-critical, and even in those that are, the overhead may be acceptable. Only you can determine this.
As to the standard library version, the standard only defines an interface. It is entirely up to individual implementations to make it work. I suppose a similar implementation to boost's function would be used.

Firstly, the relative overhead shrinks as the work done inside the function grows; the higher the workload, the smaller the overhead.
Secondly: g++ 4.5 does not show any difference compared to virtual functions:
main.cc
#include <ctime> // clock(), clock_t, CLOCKS_PER_SEC
#include <functional>
#include <iostream>
// Interface for virtual function test.
struct Virtual {
virtual ~Virtual() {}
virtual int operator() () const = 0;
};
// Factory functions defined in another translation unit, to hide the implementations from g++ and prevent some optimizations.
Virtual *create_virt();
std::function<int ()> create_fun();
std::function<int ()> create_fun_with_state();
// The test. Generates actual output to prevent some optimizations.
template <typename T>
int test (T const& fun) {
int ret = 0;
for (int i=0; i<1024*1024*1024; ++i) {
ret += fun();
}
return ret;
}
// Executing the tests and outputting their values to prevent some optimizations.
int main () {
{
const clock_t start = clock();
std::cout << test(*create_virt()) << '\n';
const double secs = (clock()-start) / double(CLOCKS_PER_SEC);
std::cout << "virtual: " << secs << " secs.\n";
}
{
const clock_t start = clock();
std::cout << test(create_fun()) << '\n';
const double secs = (clock()-start) / double(CLOCKS_PER_SEC);
std::cout << "std::function: " << secs << " secs.\n";
}
{
const clock_t start = clock();
std::cout << test(create_fun_with_state()) << '\n';
const double secs = (clock()-start) / double(CLOCKS_PER_SEC);
std::cout << "std::function with bindings: " << secs << " secs.\n";
}
}
impl.cc
#include <functional>
struct Virtual {
virtual ~Virtual() {}
virtual int operator() () const = 0;
};
struct Impl : Virtual {
virtual ~Impl() {}
virtual int operator() () const { return 1; }
};
Virtual *create_virt() { return new Impl; }
std::function<int ()> create_fun() {
return []() { return 1; };
}
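std::function<int ()> create_fun_with_state() {
// Note: [=] captures only the variables the body actually uses, so x, y and z
// are not captured here and this closure carries no state.
int x,y,z;
return [=]() { return 1; };
}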
Output of g++ --std=c++0x -O3 impl.cc main.cc && ./a.out:
1073741824
virtual: 2.9 secs.
1073741824
std::function: 2.9 secs.
1073741824
std::function with bindings: 2.9 secs.
So, fear not. If your design or maintainability can benefit from preferring std::function over virtual calls, try it. Personally, I really like the idea of not forcing interfaces and inheritance on clients of my classes.

This depends strongly on whether you are passing the function without binding any arguments (which does not allocate heap space) or not.
It also depends on other factors, but this is the main one.
It is true that you need something to compare against: you can't simply say that it 'adds overhead' compared to not using it at all; you need to compare it to an alternative way of passing a function. And if you can dispense with it altogether, then it was not needed in the first place.
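One way to set up such a comparison is to write the same routine twice, once taking the callable as a template parameter and once as a std::function. The sketch below uses hypothetical function names. Both versions compute the same result, so they give a like-for-like baseline for whatever timing you measure on your own compiler.

```cpp
#include <cassert>
#include <functional>

// Alternative 1: the callable's type is a template parameter, so the compiler
// sees the concrete type and can usually inline the call.
template <typename F>
long sum_via_template(int n, F&& f) {
    assert(n >= 0);
    long total = 0;
    for (int i = 0; i < n; ++i) total += f(i);
    return total;
}

// Alternative 2: the callable is type-erased behind std::function, so each
// call goes through an indirection.
inline long sum_via_function(int n, const std::function<int(int)>& f) {
    assert(n >= 0);
    long total = 0;
    for (int i = 0; i < n; ++i) total += f(i);
    return total;
}
```

Passing the same lambda to both, for example [](int i) { return i * i; }, exercises the two calling conventions with identical work inside the loop.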

std::function<> / std::function<> with bind( ... ) is extremely fast. Check this:
#include <iostream>
#include <functional>
#include <chrono>
using namespace std;
using namespace chrono;
int main()
{
static size_t const ROUNDS = 1'000'000'000;
static
auto bench = []<typename Fn>( Fn const &fn ) -> double
{
auto start = high_resolution_clock::now();
fn();
return (int64_t)duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count() / (double)ROUNDS;
};
int i = 0; // initialized so that i += j never reads an indeterminate value
static
auto CLambda = []( int &i, int j )
{
i += j;
};
auto bCFn = [&]() -> double
{
void (*volatile pFnLambda)( int &i, int j ) = CLambda;
return bench( [&]()
{
for( size_t j = ROUNDS; j--; j )
pFnLambda( i, 2 );
} );
};
auto bndObj = bind( CLambda, ref( i ), 2 );
auto bBndObj = [&]() -> double
{
decltype(bndObj) *volatile pBndObj = &bndObj;
return bench( [&]()
{
for( size_t j = ROUNDS; j--; j )
(*pBndObj)();
} );
};
using fn_t = function<void()>;
auto bFnBndObj = [&]() -> double
{
fn_t fnBndObj = fn_t( bndObj );
fn_t *volatile pFnBndObj = &fnBndObj;
return bench( [&]()
{
for( size_t j = ROUNDS; j--; j )
(*pFnBndObj)();
} );
};
auto bFnBndObjCap = [&]() -> double
{
auto capLambda = [&i]( int j )
{
i += j;
};
fn_t fnBndObjCap = fn_t( bind( capLambda, 2 ) );
fn_t *volatile pFnBndObjCap = &fnBndObjCap;
return bench( [&]()
{
for( size_t j = ROUNDS; j--; j )
(*pFnBndObjCap)();
} );
};
using bench_fn = function<double()>;
static const
struct descr_bench
{
char const *descr;
bench_fn const fn;
} dbs[] =
{
{ "C-function",
bench_fn( bind( bCFn ) ) },
{ "C-function in bind( ... ) with all parameters",
bench_fn( bind( bBndObj ) ) },
{ "C-function in function<>( bind( ... ) ) with all parameters",
bench_fn( bind( bFnBndObj ) ) },
{ "lambda capturing first parameter in function<>( bind( lambda, 2 ) )",
bench_fn( bind( bFnBndObjCap ) ) }
};
for( descr_bench const &db : dbs )
cout << db.descr << ":" << endl,
cout << db.fn() << endl;
}
All calls are below 2ns on my computer.

Related

Determining function time using a wrapper

I'm looking for a generic way of measuring a function's timing like here, but for C++.
My main goal is to not have cluttered code like this piece everywhere:
auto t1 = std::chrono::high_resolution_clock::now();
function(arg1, arg2);
auto t2 = std::chrono::high_resolution_clock::now();
auto tDur = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1);
But rather have a nice wrapper around the function.
What I got so far is:
timing.hpp:
#pragma once
#include <chrono>
#include <functional>
template <typename Tret, typename Tin1, typename Tin2> unsigned int getDuration(std::function<Tret(Tin1, Tin2)> function, Tin1 arg1, Tin2 arg2, Tret& retValue)
{
auto t1 = std::chrono::high_resolution_clock::now();
retValue = function(arg1, arg2);
auto t2 = std::chrono::high_resolution_clock::now();
auto tDur = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1);
return tDur.count();
}
main.cpp:
#include "timing.hpp"
#include "matrix.hpp"
constexpr int G_MATRIXSIZE = 2000;
int main(int argc, char** argv)
{
CMatrix<double> myMatrix(G_MATRIXSIZE);
bool ret;
// this call is quite ugly
std::function<bool(int, std::vector<double>)> fillRow = std::bind(&CMatrix<double>::fillRow, &myMatrix, 0, fillVec);
auto duration = getDuration(fillRow, 5, fillVec, ret );
std::cout << "duration(us): " << duration << std::endl; // getDuration returns microseconds
}
In case somebody wants to test the code, matrix.hpp:
#pragma once
#include <iostream>
#include <string>
#include <sstream>
#include <vector>
template<typename T> class CMatrix {
public:
// ctor
CMatrix(int size) :
m_size(size)
{
m_matrixData = new std::vector<std::vector<T>>;
createUnityMatrix();
}
// dtor
~CMatrix()
{
std::cout << "Destructor of CMatrix called" << std::endl;
delete m_matrixData;
}
// print to std::out
void printMatrix()
{
std::ostringstream oss;
for (int i = 0; i < m_size; i++)
{
for (int j = 0; j < m_size; j++)
{
oss << m_matrixData->at(i).at(j) << ";";
}
oss << "\n";
}
std::cout << oss.str() << std::endl;
}
bool fillRow(int index, std::vector<T> row)
{
// checks
if (!indexValid(index))
{
return false;
}
if (row.size() != m_size)
{
return false;
}
// data replacement
for (int j = 0; j < m_size; j++)
{
m_matrixData->at(index).at(j) = row.at(j);
}
return true;
}
bool fillColumn(int index, std::vector<T> column)
{
// checks
if (!indexValid(index))
{
return false;
}
if (column.size() != m_size)
{
return false;
}
// data replacement
for (int j = 0; j < m_size; j++)
{
m_matrixData->at(index).at(j) = column.at(j);
}
return true;
}
private:
// variables
std::vector<std::vector<T>>* m_matrixData;
int m_size;
bool indexValid(int index)
{
if (index + 1 > m_size)
{
return false;
}
return true;
}
// functions
void createUnityMatrix()
{
for (int i = 0; i < m_size; i++)
{
std::vector<T> _vector;
for (int j = 0; j < m_size; j++)
{
if (i == j)
{
_vector.push_back(1);
}
else
{
_vector.push_back(0);
}
}
m_matrixData->push_back(_vector);
}
}
};
The thing is, this code is still quite ugly due to the std::function usage. Is there a better and/or simpler option?
(Also, I'm sure I messed something up with the std::bind; I think I need to use std::placeholders since I want to set the arguments later on.)
// edit, correct use of placeholder in main:
std::function<bool(int, std::vector<double>)> fillRow = std::bind(&CMatrix<double>::fillRow, &myMatrix, std::placeholders::_1, std::placeholders::_2);
auto duration = getDuration(fillRow, 18, fillVec, ret );
You can utilize RAII to implement a timer that records the execution time of a code block and a template function that wraps the function you would like to execute with the timer.
#include <chrono>
#include <cstdio> // printf
#include <string>
#include <utility> // std::move, std::forward
#include <unistd.h> // usleep (POSIX)
struct Timer
{
std::string fn, title;
std::chrono::time_point<std::chrono::steady_clock> start;
Timer(std::string fn, std::string title)
: fn(std::move(fn)), title(std::move(title)), start(std::chrono::steady_clock::now())
{
}
~Timer()
{
const auto elapsed =
std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::steady_clock::now() - start).count();
printf("%s: function=%s; elapsed=%f ms\n", title.c_str(), fn.c_str(), elapsed / 1000.0);
}
};
#ifndef ENABLE_BENCHMARK
static constexpr inline void dummy_fn() { }
#define START_BENCHMARK_TIMER(...) dummy_fn()
#else
// Timer is declared at global scope above, so it is named without a namespace qualifier.
#define START_BENCHMARK_TIMER(title) Timer timer(__FUNCTION__, title)
#endif
template<typename F, typename ...Args>
auto time_fn(F&& fn, Args&&... args) {
START_BENCHMARK_TIMER("wrapped fn");
return fn(std::forward<Args>(args)...);
}
int foo(int i) {
usleep(70000);
return i;
}
int main()
{
printf("%d\n", time_fn(foo, 3));
}
stdout:
wrapped fn: function=time_fn; elapsed=71.785000 ms
3
General Idea:
time_fn is a simple template function that invokes START_BENCHMARK_TIMER and then calls fn with the provided arguments.
START_BENCHMARK_TIMER creates a Timer object, which records the current time in start. Note that __FUNCTION__ is replaced with the name of the enclosing function, which here is time_fn.
When the provided fn returns or throws an exception, that Timer object is destroyed and its destructor runs. The destructor calculates the difference between the current time and the recorded start time and prints it to stdout.
Note:
Even though declaring start and end in time_fn instead of the RAII timer will work, having an RAII timer will allow you to cleanly handle the situation when fn throws an exception
If you are on c++11, you will need to change time_fn declaration to typename std::result_of<F &&(Args &&...)>::type time_fn(F&& fn, Args&&... args).
Edit: Updated the response to include a wrapper function approach.
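If the caller needs the measured duration as a value, as getDuration in the question returns it, rather than printed output, a variant along the following lines may help. This is only a sketch under stated assumptions: it requires C++17 for std::invoke, it assumes the wrapped callable returns a non-void value, and timed_invoke and Accumulator are hypothetical names that appear in neither the question nor the answer.

```cpp
#include <cassert>
#include <chrono>
#include <functional>  // std::invoke (C++17)
#include <utility>

// Hypothetical helper: calls f(args...) and returns {elapsed, result}.
// Assumes f returns a non-void value.
template <typename F, typename... Args>
auto timed_invoke(F&& f, Args&&... args) {
    const auto start = std::chrono::steady_clock::now();
    auto result = std::invoke(std::forward<F>(f), std::forward<Args>(args)...);
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);  // steady_clock never runs backwards
    return std::make_pair(
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start),
        std::move(result));
}

// Hypothetical class standing in for something like CMatrix.
struct Accumulator {
    int total = 0;
    bool add(int value) { total += value; return true; }
};
```

Because std::invoke accepts pointers to member functions, the call from the question could then be written without std::function or std::bind, for example as auto [elapsed, ok] = timed_invoke(&CMatrix<double>::fillRow, &myMatrix, 18, fillVec);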

how can i find a beautiful function wrapper

typedef void (*void_proc)(void* parameter);
void* parallel_init(void* dummy, int core_number);
int parallel_addtask(void* parallel_monitor, void_proc process, void *parameter);
int parallel_waittask(void* parallel_monitor, int task_id);
int parallel_uninit(void* parallel_monitor);
struct parallel_parameter {
int end;
int begin;
};
void process(void* parameter) {
auto p = reinterpret_cast<parallel_parameter*>(parameter);
// your_function_name(p->begin, p->end);
}
Above is a C-style parallel library which I would like to use. Every time you call it, you have to define a specific parameter struct. This is annoying enough that I want to implement a template function to reduce the number of steps per call. I have tried several approaches to achieve this, but all of them failed.
template<typename _function, typename... _parameter>
int parallel_executor(_function&& function, _parameter&&... parameter) {
auto res = 0;
parallel_parameter p[8]{0};
auto body = [](void* para) -> void {
auto p = reinterpret_cast<parallel_parameter*>(para);
function(p->begin, p->end, std::forward<_parameter>(parameter)...);
};
auto parallel_handle = parallel_init(nullptr, 8);
do {
for (int i = 0;i < 8; ++i) {
res = parallel_addtask(parallel_handle, body, static_cast<void*>(&p[i]));
if (res != 0) break;
}
for (int i = 0; i < 8; ++i) {
res = parallel_waittask(parallel_handle, i);
if (res != 0) break;
}
} while (false);
parallel_uninit(parallel_handle);
return res;
}
This call is simplified to show my dilemma. When I use parallel_executor, it turns out that session cannot be accessed, because I did not specify a capture mode; but when I change the body to the style below, parallel_addtask will not accept the body function.
auto body = [&](void* para) -> void {
auto p = reinterpret_cast<parallel_parameter*>(para);
function(p->begin, p->end, std::forward<_parameter>(parameter)...);
};
I have been stuck in this awkward position for a while now. Below is the call style I would prefer.
auto ret = parallel_executor(
[](int begin, int end, int parameter_1, int parameter_2) {
std::cout << begin << " ==> " << end << " ==> " << parameter_1 << std::endl;
},
100, // parameter_1
200 // parameter_2
);
I hope I have made the issue clear. Any suggestion is appreciated.
Wrapper might look like:
class ParrallelWrapper
{
public:
ParrallelWrapper(int core_number) :
parallel_monitor(parallel_init(nullptr, core_number))
{}
ParrallelWrapper(const ParrallelWrapper&) = delete;
ParrallelWrapper& operator= (const ParrallelWrapper&) = delete;
~ParrallelWrapper() { parallel_uninit(parallel_monitor); }
int AddTask(std::function<void()> f) {
// Unary + converts the captureless lambda to a plain function pointer.
auto run_function = +[](void* f){
(*reinterpret_cast<std::function<void()>*>(f))();
};
functions.push_back(std::make_unique<std::function<void()>>(f));
return parallel_addtask(parallel_monitor, run_function, functions.back().get());
}
int Wait(int task_id) { return parallel_waittask(parallel_monitor, task_id); }
private:
void* parallel_monitor = nullptr;
// Ensure lifetime, and pointer consistence.
std::vector<std::unique_ptr<std::function<void()>>> functions;
};
Demo
With appropriate blanks for specifying begin, end, and the number of tasks, you can use something like
struct parallel_deleter {
void operator()(void *m) const {parallel_uninit(m);}
};
template<class F,class ...TT>
int parallel_executor(F f,TT &&...tt) {
constexpr auto p=+f; // require captureless
constexpr int n=/*...*/;
std::unique_ptr<void,parallel_deleter> m(parallel_init(nullptr,n));
struct arg {
int begin,end;
std::tuple<TT...> user;
};
std::vector<arg> v(n,{0,0,{tt...}});
for(auto &x : v) {
x.begin=/*...*/;
x.end=/*...*/;
if(const int res=parallel_addtask(m.get(),[](void *v) {
const auto &a=*static_cast<arg*>(v);
std::apply([&a](auto &...aa) {p(a.begin,a.end,aa...);},a.user);
},&x)) return res;
}
for(int i=0;i<n;++i)
if(const int res=parallel_waittask(m.get(),i)) return res;
return parallel_uninit(m.release());
}
This design relies on a captureless lambda being passed (so that p can be used inside the task lambda without capturing anything); if you need to support any callable, Jarod42's solution based on std::function is superior.
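The conflict described in the question is that a capturing lambda cannot convert to a plain function pointer. Both answers work around it in the same way: the callback handed to the C API captures nothing, and the real callable travels through the void* argument. A minimal sketch of that trampoline follows, assuming a hypothetical run_now function as a stand-in for the library, which here simply invokes the callback immediately.

```cpp
#include <cassert>

// Same callback signature as the C library in the question.
typedef void (*void_proc)(void* parameter);

// Hypothetical stand-in for a C API such as parallel_addtask: it simply runs
// the callback immediately with the user pointer.
inline void run_now(void_proc process, void* parameter) {
    process(parameter);
}

// Passes any callable taking no arguments through the C-style interface.
// The lambda below captures nothing (F is a type, not a variable), so it
// converts to void_proc; the callable itself travels via the void* argument.
template <typename F>
void call_through_c_api(F& callable) {
    run_now(
        [](void* p) {
            assert(p != nullptr);
            (*static_cast<F*>(p))();
        },
        &callable);
}
```

With a real asynchronous library the callable must stay alive until the task has finished, which is what the functions vector in the wrapper class above takes care of.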

How to create my own loop version in C++?

I was wondering if it possible to create custom functions like for, for_each, while etc.
There's nothing that I want to do that the existing loops won't do it. I am just curious to learn how they work and if I ever need to create my own.
For example, suppose one wants to create another version of the for function that takes only one parameter.
In this example, I want to create a for that takes only one parameter, an integer.
Instead of writing
for (int i = 0; i < 50; ++i)
I would create a for version like this
for_(50)
and they would act the same. How would I do something like that?
I have posted this question in another forum.
In addition to the proposals in other answers, you could create a function like the one below, but it is, at the very end, very similar to using the standard std::for_each.
#include <iostream>
#include <functional>
template<typename C, typename F>
void for_(C begin_, C end_, F&& f) { // [begin_, end_)
for (C i = begin_; i < end_; ++i) {
f(i);
}
}
template<typename C, typename F>
void for_(C count, F&& f) { // special case for [0, count)
for_(0, count, f);
}
void mul2(int x) {
std::cout << x*2 << " ";
}
int main() {
for_(10, [](int i) { std::cout << i << "\n"; });
for_(2, 10, mul2);
}
An ugly and unsafe solution is to use a macro:
#include <iostream>
#define REPEAT(i,N) for(int (i) = 0; (i) < (N); ++(i))
int main()
{
REPEAT(i,10) std::cout << i << std::endl;
return 0;
}
You can't extend the C++ syntax for new loops.
You could use a macro, but this is pretty ugly, and generally best avoided. Another way to get something similar is by passing a functor as a parameter, greatly helped by the introduction of lambda expressions to C++. You can find some examples of such in the <algorithm> header.
For example:
#include <algorithm>
#include <vector>
int main()
{
std::vector<int> numbers = { 1, 4, 5, 7, 10 };
int even_count = 0;
for (auto x : numbers)
{
if (x % 2 == 0)
{
++even_count;
}
}
auto even_count2 = std::count_if(numbers.begin(), numbers.end(), [](int x) { return x % 2 == 0; });
}
You could use a lambda function and pass in a function object as a parameter to be performed for every iteration of the loop.
#include <iostream>
#include <functional>
int main()
{
auto for_ = [](int start, int size, std::function<void (int i)> fn)
{
int end = start + size;
for (int i = start; i < end; ++i)
{
fn(i);
}
};
for_(0, 10, [](int i) { std::cout << i << std::endl; });
for_(0, 10, [](int i) { std::cout << i*2 << std::endl; });
}
It seems like you are reinventing the wheel here a bit. You could just use std::for_each.
However, you could have custom lambda functions that do different things and just implement the operation within the lambda itself without taking in a function object for the operation.
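Although new loop keywords cannot be added, the range-based for statement works with any type that provides begin() and end(), which gets fairly close to the for_(50) form asked about. The sketch below uses a hypothetical helper class named upto and assumes a non-negative count.

```cpp
#include <cassert>

// Hypothetical helper: an iterable over [0, limit) for range-based for loops.
class upto {
public:
    explicit upto(int limit) : limit_(limit) { assert(limit >= 0); }

    // Minimal iterator: just enough for the range-based for statement,
    // which needs operator*, prefix operator++ and operator!=.
    struct iterator {
        int value;
        int operator*() const { return value; }
        iterator& operator++() { ++value; return *this; }
        bool operator!=(const iterator& other) const { return value != other.value; }
    };

    iterator begin() const { return iterator{0}; }
    iterator end() const { return iterator{limit_}; }

private:
    int limit_;
};
```

A loop is then written as for (int i : upto(50)) { ... }, which visits 0 through 49 just like for (int i = 0; i < 50; ++i).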

Is it possible to write one function for std::string and std::wstring?

I just wrote a simple utility function for std::string. Then I noticed that the function would look exactly the same if the std::string was a std::wstring or a std::u32string. Is it possible to use a template function here? I am not very familiar with templates, and std::string and std::wstring are templates themselves, which might be an issue.
template<class StdStringClass>
inline void removeOuterWhitespace(StdStringClass & strInOut)
{
const unsigned int uiBegin = strInOut.find_first_not_of(" \t\n");
if (uiBegin == StdStringClass::npos)
{
// the whole string is whitespace
strInOut.clear();
return;
}
const unsigned int uiEnd = strInOut.find_last_not_of(" \t\n");
strInOut = strInOut.substr(uiBegin, uiEnd - uiBegin + 1);
}
Is this a proper way to do it? Are there pitfalls with this idea. I am not talking about this function but the general concept of using a templated class StdStringClass and calling the usual std::string functions like find, replace, erase, etc.
It's a good idea, but I'd build the template on top of std::basic_string rather than a general StdStringClass:
template<class T>
inline void removeOuterWhitespace(std::basic_string<T>& strInOut)
{
constexpr T delim[] = {T(' '),T('\t'),T('\n'),T(0)}; // auto cannot be used to declare an array
const auto uiBegin = strInOut.find_first_not_of(delim);
if (uiBegin == std::basic_string<T>::npos)
{
// the whole string is whitespace
strInOut.clear();
return;
}
const auto uiEnd = strInOut.find_last_not_of(delim);
strInOut = strInOut.substr(uiBegin, uiEnd - uiBegin + 1);
}
I would also ditch the MSDN-style "inout" notation in favor of a simpler name like str. Programmers will figure out for themselves that str holds the result, since it is passed by non-const reference and the function returns void.
Also, I changed unsigned int to auto. The standard C++ containers and strings return size_type (typically size_t) when returning indexes, and size_t might not be unsigned int. auto deduces the correct return type.
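As a variation on the same idea, the following sketch returns a trimmed copy instead of modifying its argument and deduces the traits and allocator parameters as well, so that strings with custom traits or allocators also match. The name trimmed is hypothetical, and the set of whitespace characters is the one from the question.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a copy of s without leading and trailing
// spaces, tabs and newlines. Deducing all three template parameters lets it
// also accept strings with custom traits or allocators.
template <class CharT, class Traits, class Alloc>
std::basic_string<CharT, Traits, Alloc>
trimmed(const std::basic_string<CharT, Traits, Alloc>& s) {
    using String = std::basic_string<CharT, Traits, Alloc>;
    // Build the delimiter set in the string's own character type.
    const CharT delim[] = {CharT(' '), CharT('\t'), CharT('\n'), CharT(0)};
    const auto first = s.find_first_not_of(delim);
    if (first == String::npos) return String();  // all whitespace
    const auto last = s.find_last_not_of(delim);
    assert(last >= first);  // a non-whitespace character exists
    return s.substr(first, last - first + 1);
}
```

The same function template then serves std::string, std::wstring and std::u32string arguments, because the character type is deduced from the argument.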
Assuming your template works as expected (haven't checked...sorry), another option would be to wrap the function in class, and control which types of strings classes you'd like the function to be applied to using constructors.
EDIT: added illustrative framework
EDIT2 one that compiles (at least with vs2015) :-)
class StringType1;
class StringTypeN;
class str {
//template function
template<class StdStringClass>
inline void removeOuterWhitespace(StdStringClass & strInOut)
{
//.
//.
//.
}
public:
//constructors
str(StringType1 &s1) { removeOuterWhitespace(s1); }
//.
//.
//.
str(StringTypeN &sN) { removeOuterWhitespace(sN); }
};
int main() {
return 0;
}
EDIT3 Proof of concept
#include <iostream>
class incr {
//template function
template<class incrementor>
inline void removeOuterWhitespace(incrementor & n)
{
n++;
}
public:
//constructors
incr(int &n1) { removeOuterWhitespace(n1); }
incr(double &n1) { removeOuterWhitespace(n1); }
incr(float &n1) { removeOuterWhitespace(n1); }
};
int main() {
int n1 = 1;
double n2 = 2;
float n3 = 3;
std::cout << n1 << "\t" << n2 << "\t" << n3 << std::endl;
auto test1 = incr(n1);
auto test2 = incr(n2);
auto test3 = incr(n3);
//all variables modified
std::cout << "all variables modified by constructing incr" << std::endl;
std::cout << n1 << "\t" << n2 << "\t" << n3 << std::endl;
return 0;
}

Is there a macro-based adapter to make a functor from a class?

Creating a functor requires unnecessary boilerplate. The state has to be written 4 times!
struct f{
double s; // 1st
f(double state): s(state) {} // 2nd, 3rd and 4th
double operator() (double x) {
return x*s;
}
};
Is there a library with a macro that would reduce this to just double functor(state)(x){ return x*state; } or something similar?
BOOST_FOREACH is a macro adapter that works well. I'm looking for something similar.
Any suggestions on how to write one are appreciated too.
P.S. Is using a struct as a functor faster than binding a class's operator() or binding a function as a functor?
Update(1)
In regards to lambdas:
The functor has to be modular, meaning it should be reusable in other functions. Lambdas have to be defined within a function: a lambda defined in main can be called from main, but other functions outside of main can't call it.
How about relying on aggregate initialization? Simply do not declare the constructor:
struct f {
double s;
double operator()(double x) {
return x * s;
}
};
Use it like this:
#include <iostream>
int main()
{
auto ff = f{42};
std::cout << ff(2);
return 0;
}
Define the functionality you want, e.g., your multiplication, as a function and then use std::bind() to create a suitable function object:
#include <functional>
double some_operation(double state, double x) {
return state * x;
}
int main() {
auto function = std::bind(&some_operation, 17, std::placeholders::_1);
return function(18);
}
Since a call through a function pointer generally can't be inlined, you might want to write your function as a function object instead:
#include <functional>
struct some_operation {
double operator()(double state, double x) const {
return state * x;
}
};
int main() {
auto function = std::bind(some_operation(), 17, std::placeholders::_1);
return function(18);
}
Below is a test program which seems to indicate that a hand-crafted function object and a bound function object are about equally fast, i.e., the results I get are
in 90 ms, functor as a struct; result = 1.5708e+16
in 262 ms, function pointer through bind; result = 1.5708e+16
in 261 ms, function through bind; result = 1.5708e+16
in 87 ms, function object through bind; result = 1.5708e+16
in 88 ms, non-called bind with function; result = 1.5708e+16
in 88 ms, non-called bind with function pointer; result = 1.5708e+16
using a recent version of clang (more precisely: clang version 3.4 (trunk 182411)) on a MacOS system, optimizing with the -O2 option. Using gcc (more precisely: gcc version 4.9.0 20130811 (experimental) (GCC)) gives similar results.
It seems to make a difference whether the function object is built in the local context or passed via a template argument to a separate function. This difference is interesting, as I would expect that most uses of bind() will result in passing the resulting function object off somewhere.
The code is based on https://stackoverflow.com/a/18175033/1120273:
#include <iostream>
#include <functional>
#include <chrono>
using namespace std;
using namespace std::placeholders;
using namespace std::chrono;
struct fs {
double s;
fs(double state) : s(state) {}
double operator()(double x) {
return x*s;
}
};
struct ff {
double operator()(double x, double state) const {
return x * state;
}
};
double fb(double x, double state) {
return x*state;
}
template <typename Function>
void measure(char const* what, Function function)
{
const auto stp1 = high_resolution_clock::now();
double sresult(0.0);
for(double x=0.0; x< 1.0e8; ++x) {
sresult += function(x);
}
const auto stp2 = high_resolution_clock::now();
const auto sd = duration_cast<milliseconds>(stp2 - stp1);
cout << "in " << sd.count() << " ms, ";
cout << what << "; result = " << sresult << endl;
}
int main() {
double state=3.1415926;
measure("functor as a struct", fs(state));
measure("function through bind", std::bind(&fb, _1, state));
measure("function object through bind", std::bind(ff(), _1, state));
{
const auto stp1 = high_resolution_clock::now();
double sresult(0.0);
auto function = std::bind(fb, _1, state);
for(double x=0.0; x< 1.0e8; ++x) {
sresult += function(x);
}
const auto stp2 = high_resolution_clock::now();
const auto sd = duration_cast<milliseconds>(stp2 - stp1);
cout << "in " << sd.count() << " ms, ";
cout << "embedded bind with function; result = " << sresult << endl;
}
{
const auto stp1 = high_resolution_clock::now();
double sresult(0.0);
auto function = std::bind(&fb, _1, state);
for(double x=0.0; x< 1.0e8; ++x) {
sresult += function(x);
}
const auto stp2 = high_resolution_clock::now();
const auto sd = duration_cast<milliseconds>(stp2 - stp1);
cout << "in " << sd.count() << " ms, ";
cout << "embedded bind with function pointer; result = " << sresult << endl;
}
return 0;
}
We've got lambdas for this:
double s = 42;
auto f = [s](double x) {
return s * x;
};
Down to a single mention of state on line 2 (as you don't seem to count the one in the actual expression). Whether the initialization on line 1 counts as a mention is debatable; your desired form does not contain any initialization, which is required, so I assume this to be acceptable.
In C++14 we'll get an extension of the lambda capture syntax allowing an even terser form:
auto f = [s{42}](double x) {
return s * x;
};
Have a look at BOOST_LOCAL_FUNCTION, which seems to be exactly what you're looking for, as you even mention a macro :)
double s = 42;
double BOOST_LOCAL_FUNCTION(bind& s, double x) {
return x*s;
} BOOST_LOCAL_FUNCTION_NAME(f)
Personal note: If you have a modern compiler, go with C++11 lambdas.
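Regarding the concern from the question that a lambda is tied to the function it is defined in: one common way around that is to return the lambda from a named factory function, which can then be called from anywhere. The sketch below assumes C++14 (for the deduced return type) and uses the hypothetical names make_scaler and scale_twice.

```cpp
#include <cassert>

// Hypothetical factory: the state is mentioned once as a parameter, once in
// the capture list and once in the body.
inline auto make_scaler(double state) {
    return [state](double x) { return x * state; };
}

// Any function can create and use the functor, not only the one defining it.
inline double scale_twice(double x, double state) {
    const auto scale = make_scaler(state);
    return scale(scale(x));
}

inline void usage() {
    const auto triple = make_scaler(3.0);
    assert(triple(2.0) == 6.0);
    assert(scale_twice(2.0, 3.0) == 18.0);
}
```

Since the closure type is known to the compiler at each call site, calls through the returned lambda can be inlined in the same way as calls to a hand-written struct.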