Understanding the overhead of lambda functions in C++11 - c++

This was already touched on in "Why C++ lambda is slower than ordinary function when called multiple times?" and "C++0x Lambda overhead".
But I think my example is a bit different from the discussion in the former and contradicts the result in the latter.
While searching for a bottleneck in my code I found a recursive template function that processes a variadic argument list with a given processor function, for example one that copies each value into a buffer.
template <typename T>
void ProcessArguments(std::function<void(const T &)> process)
{}
template <typename T, typename HEAD, typename ... TAIL>
void ProcessArguments(std::function<void(const T &)> process, const HEAD &head, const TAIL &... tail)
{
process(head);
ProcessArguments(process, tail...);
}
I compared the runtime of a program that uses this code together with a lambda function as well as a global function that copies the arguments into a global buffer using a moving pointer:
int buffer[10];
int main(int argc, char **argv)
{
int *p = buffer;
for (unsigned long int i = 0; i < 10E6; ++i)
{
p = buffer;
ProcessArguments<int>([&p](const int &v) { *p++ = v; }, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}
}
Compiled with g++ 4.6 and -O3 and measured with the time tool, this takes more than 6 seconds on my machine, while
int buffer[10];
int *p = buffer;
void CopyIntoBuffer(const int &value)
{
*p++ = value;
}
int main(int argc, char **argv)
{
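// p is the global declared above; a second, local p here would shadow it and CopyIntoBuffer would never see the reset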
for (unsigned long int i = 0; i < 10E6; ++i)
{
p = buffer;
ProcessArguments<int>(CopyIntoBuffer, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}
return 0;
}
takes about 1.4 seconds.
I do not get what is going on behind the scenes that explains the time overhead and am wondering if I can change something to make use of lambda functions without paying with runtime.

The problem here is your usage of std::function.
You pass it by value, so its contents are copied (and copied again at every level of the recursion as you unwind the parameters).
Now, for a pointer to function, the contents are just the function pointer.
For a lambda, the contents are at least a pointer to function plus the reference that you captured. That is twice as much to copy. Plus, because of std::function's type erasure, copying any data will most likely be slower (not inlined).
There are several options here, and the best would probably be to pass not a std::function but a template parameter instead. The benefits are that the call is more likely to be inlined, no type erasure by std::function happens, and no copying happens. Like this:
template <typename TFunc>
void ProcessArguments(const TFunc& process)
{}
template <typename TFunc, typename HEAD, typename ... TAIL>
void ProcessArguments(const TFunc& process, const HEAD &head, const TAIL &... tail)
{
process(head);
ProcessArguments(process, tail...);
}
The second option is doing the same, but passing process by copy. Now copying does happen, but it is still neatly inlined.
What's equally important is that the body of process can also be inlined, especially for a lambda. Depending on the size of the lambda object and the complexity of copying it, passing by copy may or may not be faster than passing by reference. It may be faster because the compiler may have a harder time reasoning about a reference than about a local copy.
template <typename TFunc>
void ProcessArguments(TFunc process)
{}
template <typename TFunc, typename HEAD, typename ... TAIL>
void ProcessArguments(TFunc process, const HEAD &head, const TAIL &... tail)
{
process(head);
ProcessArguments(process, tail...);
}
Third option is, well, try passing std::function<> by reference. This way you at least avoid copying, but calls will not be inlined.
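For completeness, here is a rough sketch of what that third option might look like. The names ProcessArgumentsByRef and FillThree are made up for this illustration; the only real change from the original is that the std::function parameter becomes a const reference, so it is built once at the call site and not copied again during the recursion.

```cpp
#include <cassert>
#include <functional>

// Base case: no arguments left to process.
template <typename T>
void ProcessArgumentsByRef(const std::function<void(const T &)> &)
{}

// Recursive case: only a reference to the std::function travels down the
// recursion, so the wrapper is never copied after it has been created.
template <typename T, typename HEAD, typename... TAIL>
void ProcessArgumentsByRef(const std::function<void(const T &)> &process,
                           const HEAD &head, const TAIL &... tail)
{
    process(head);
    ProcessArgumentsByRef<T>(process, tail...);
}

// Hypothetical helper: writes 1, 2, 3 into `out` and returns how many
// values were written.
inline int FillThree(int *out)
{
    int *p = out;
    ProcessArgumentsByRef<int>([&p](const int &v) { *p++ = v; }, 1, 2, 3);
    return static_cast<int>(p - out);
}
```

Each call still goes through std::function's type-erased call operator, which is why this variant is not expected to inline as well as the template versions.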
Here are some perf results (using ideone's C++11 compiler).
Note that, as expected, the inlined lambda body gives you the best performance:
Original function:
0.483035s
Original lambda:
1.94531s
Function via template copy:
0.094748s
Lambda via template copy:
0.0264867s
Function via template reference:
0.0892594s
Lambda via template reference:
0.0264201s
Function via std::function reference:
0.0891776s
Lambda via std::function reference:
0.09s
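The code used to produce those numbers is not shown, so the following is only an assumed sketch of how such timings could be collected. TimeSeconds is a hypothetical helper, not part of the original benchmark.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `body` `iterations` times and returns the
// elapsed wall-clock time in seconds.
template <typename F>
double TimeSeconds(F body, unsigned long iterations)
{
    const auto start = std::chrono::steady_clock::now();
    for (unsigned long i = 0; i < iterations; ++i)
        body();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

With optimisation enabled the compiler may remove work whose result is never observed, so the timed body should write to something that is read afterwards (a global buffer, as in the question, is one way).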

Related

How to properly declare a method that only takes a lambda?

In the following example, I would like a traverse method that receives a callback. This example works perfectly as long as I don't capture anything [] because the lambda can be reduced into a function pointer. However, in this particular case, I would like to access sum.
struct Collection {
int array[10];
void traverse(void (*cb)(int &n)) {
for(int &i : array)
cb(i);
}
int sum() {
int sum = 0;
traverse([&](int &i) {
sum += i;
});
return sum;
}
};
What is the proper way (without using any templates) to solve this? A solution is to use a typename template as follows. But in this case, you lack visibility on what traverse gives in each iteration (an int):
template <typename F>
void traverse(F cb) {
for(int &i : array)
cb(i);
}
Lambda types are unspecified; there is no way to name them.
So you have two options:
1. Make traverse a template (or have it take auto, which is effectively the same thing). Fortunately this is a completely normal and commonplace thing to do.
2. Have traverse take a std::function<void(int&)>. This incurs some overhead, but does at least mean the function need not be a template.
"But in this case, you lack visibility on what traverse gives in each iteration (an int)"
We don't tend to consider that a problem. I do understand that giving this in the function's type is more satisfying and clear, but generally a comment is sufficient, because if the callback doesn't accept an int, you'll get a compilation error anyway.
Only captureless lambdas can be used with function pointers. As every lambda definition has its own type, you have to use a template parameter in all places where you accept lambdas which capture.
"But in this case, you lack visibility on what traverse gives in each iteration (an int)."
This can be checked easily by using SFINAE or, even simpler, by using concepts in C++20. And to make it another step simpler, you do not even need to define a concept and use it later; you can directly use an ad-hoc requirement like this (which results in the double use of the requires keyword):
struct Collection {
int array[10];
template <typename F>
// check if F is "something" which can be called with an `int&` and returns void.
requires requires ( F f, int& i) { {f(i)} -> std::same_as<void>; }
void traverse(F cb)
{
for(int &i : array)
cb(i);
}
// alternatively you can use `std::invocable` from <concepts>
// check if F is "something" which can be called with an `int&`, no return type check
template <std::invocable<int&> F>
void traverse2(F cb)
{
for(int &i : array)
cb(i);
}
int sum() {
int sum = 0;
traverse([&](int &i) {
sum += i;
});
return sum;
}
};
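The SFINAE route mentioned above is not spelled out, so here is one way it might look for compilers without C++20 concepts. This is only a sketch; CollectionSfinae is a made-up name, and the constraint mirrors the first concepts version (callable with an int& and returning void).

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct CollectionSfinae {
    int array[10];

    // This overload only takes part in overload resolution if `cb(int&)`
    // is a well-formed expression whose result type is void.
    template <typename F,
              typename = typename std::enable_if<std::is_void<
                  decltype(std::declval<F &>()(std::declval<int &>()))>::value>::type>
    void traverse(F cb)
    {
        for (int &i : array)
            cb(i);
    }

    int sum()
    {
        int total = 0;
        traverse([&](int &i) { total += i; });
        return total;
    }
};
```

Compared with the requires version the error messages tend to be harder to read, which is one reason to prefer concepts where they are available.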
In your case you have several ways of declaring a callback in C++:
Function pointer
void traverse(void (*cb)(int &n)) {
for(int &i : array)
cb(i);
}
This solution only supports types that can decay into a function pointer. As you mentioned, lambdas with captures would not make it.
Typename template
template <typename F>
void traverse(F cb) {
for(int &i : array)
cb(i);
}
It does accept anything, but as you noticed, the code is hard to read.
Standard Functions (C++11)
void traverse(std::function<void(int &)> cb) {
for(int &i : array)
cb(i);
}
This is the most versatile solution, at a slight overhead cost.
Don't forget to include <functional>.
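Put together with the sum() from the question, the std::function variant might look roughly like this. CollectionFn is a made-up name for the illustration, and taking the std::function by const reference is my own choice here to avoid one copy, not something the answers above require.

```cpp
#include <cassert>
#include <functional>

struct CollectionFn {
    int array[10];

    // The parameter type itself documents that the callback gets an int&.
    void traverse(const std::function<void(int &)> &cb)
    {
        for (int &i : array)
            cb(i);
    }

    int sum()
    {
        int total = 0;
        // A capturing lambda is fine here: std::function type-erases it.
        traverse([&total](int &i) { total += i; });
        return total;
    }
};
```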

Avoid memory allocation with std::function and member function

This code is just for illustrating the question.
#include <functional>
struct MyCallBack {
void Fire() {
}
};
int main()
{
MyCallBack cb;
std::function<void(void)> func = std::bind(&MyCallBack::Fire, &cb);
}
Experiments with valgrind show that the line assigning to func dynamically allocates about 24 bytes with gcc 7.1.1 on Linux.
In the real code, I have a few handfuls of different structs all with a void(void) member function that gets stored in ~10 million std::function<void(void)>.
Is there any way I can avoid memory being dynamically allocated when doing std::function<void(void)> func = std::bind(&MyCallBack::Fire, &cb); ? (Or otherwise assigning these member function to a std::function)
Unfortunately, allocator support for std::function was removed in C++17.
Now the accepted solution to avoid dynamic allocations inside std::function is to use lambdas instead of std::bind. That does work, at least in GCC: it has enough static space to store the lambda in your case, but not enough space to store the binder object.
std::function<void()> func = [&cb]{ cb.Fire(); };
// sizeof lambda is sizeof(MyCallBack*), which is small enough
As a general rule, with most implementations, a lambda which captures only a single pointer (or a reference) will avoid dynamic allocations inside std::function with this technique (it is also generally the better approach, as the other answer suggests).
Keep in mind, for that to work you need to guarantee that the captured object outlives the std::function. Obviously, that is not always possible, and sometimes you have to capture state by (large) copy. If that happens, there is currently no way to eliminate dynamic allocations in std::function, other than tinkering with the STL yourself (obviously not recommended in the general case, but it could be done in some specific cases).
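When the large state can live somewhere that outlives the std::function, a related option is to hand std::function a std::reference_wrapper to the functor instead of a copy. The standard says construction from a reference_wrapper does not throw, and the common implementations store it in place. The sketch below uses made-up names (BigCallback, WrapByReference) and carries the same lifetime caveat as the reference-capturing lambda.

```cpp
#include <cassert>
#include <functional>

// Hypothetical functor whose state is far too large for any small buffer.
struct BigCallback {
    char payload[256];
    int calls;
    void operator()() { ++calls; }
};

// Wraps a reference to `cb`; the functor itself stays where the caller
// put it, so `cb` must outlive the returned std::function.
inline std::function<void()> WrapByReference(BigCallback &cb)
{
    return std::ref(cb);
}
```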
As an addendum to the already existent and correct answer, consider the following:
MyCallBack cb;
std::cerr << sizeof(std::bind(&MyCallBack::Fire, &cb)) << "\n";
auto a = [&] { cb.Fire(); };
std::cerr << sizeof(a);
This program prints 24 and 8 for me, with both gcc and clang. I don't exactly know what bind is doing here (my understanding is that it's a fantastically complicated beast), but as you can see, it's almost absurdly inefficient here compared to a lambda.
As it happens, std::function is guaranteed to not allocate if constructed from a function pointer, which is also one word in size. So constructing a std::function from this kind of lambda, which only needs to capture a pointer to an object and should also be one word, should in practice never allocate.
Run this little hack and it probably will print the amount of bytes you can capture without allocating memory:
#include <iostream>
#include <functional>
#include <cstring>
void h(std::function<void(void*)>&& f, void* g)
{
f(g);
}
template<size_t number_of_size_t>
void do_test()
{
size_t a[number_of_size_t];
std::memset(a, 0, sizeof(a));
a[0] = sizeof(a);
std::function<void(void*)> g = [a](void* ptr) {
if (&a != ptr)
std::cout << "malloc was called when capturing " << a[0] << " bytes." << std::endl;
else
std::cout << "No allocation took place when capturing " << a[0] << " bytes." << std::endl;
};
h(std::move(g), &g);
}
int main()
{
do_test<1>();
do_test<2>();
do_test<3>();
do_test<4>();
}
With gcc version 8.3.0 this prints
No allocation took place when capturing 8 bytes.
No allocation took place when capturing 16 bytes.
malloc was called when capturing 24 bytes.
malloc was called when capturing 32 bytes.
Many std::function implementations will avoid allocations and use space inside the function class itself rather than allocating if the callback it wraps is "small enough" and has trivial copying. However, the standard does not require this, only suggests it.
On g++, a non-trivial copy constructor on a function object, or data exceeding 16 bytes, is enough to cause it to allocate. But if your function object has no data and uses the builtin copy constructor, then std::function won't allocate.
Also, if you use a function pointer or a member function pointer, it won't allocate.
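To illustrate the member function pointer case: if the call site can supply the object, the std::function can hold just the member function pointer and take the object as a parameter. This is only a sketch with made-up names (CountingCallBack, MakeFire), and it changes the signature from void() to void(CountingCallBack&), which may or may not fit the real code.

```cpp
#include <cassert>
#include <functional>

struct CountingCallBack {
    int fired;
    void Fire() { ++fired; }
};

// The std::function stores only the member function pointer; the object
// is passed in at call time instead of being bound in.
inline std::function<void(CountingCallBack &)> MakeFire()
{
    return &CountingCallBack::Fire;
}
```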
While not directly part of your question, it is part of your example.
Do not use std::bind. In virtually every case, a lambda is better: smaller, better inlining, can avoid allocations, better error messages, faster compiles, the list goes on. If you want to avoid allocations, you must also avoid bind.
I propose a custom class for your specific usage.
While it's true that you shouldn't try to re-implement existing library functionality, because the library versions will be much better tested and optimized, it's also true that this applies to the general case. If you have a particular situation like in your example and the standard implementation doesn't suit your needs, you can explore implementing a version tailored to your specific use case, which you can measure and tweak as necessary.
So I have created a class akin to std::function<void (void)> that works only for methods and has all the storage in place (no dynamic allocations).
I have lovingly called it Trigger (inspired by your Fire method name). Please do give it a more suited name if you want to.
#include <new>         // placement new
#include <type_traits> // std::aligned_storage_t
// helper alias for method
// can be used in user code
template <class T>
using Trigger_method = auto (T::*)() -> void;
namespace detail
{
// Polymorphic classes needed for type erasure
struct Trigger_base
{
virtual ~Trigger_base() noexcept = default;
virtual auto placement_clone(void* buffer) const noexcept -> Trigger_base* = 0;
virtual auto call() -> void = 0;
};
template <class T>
struct Trigger_actual : Trigger_base
{
T& obj;
Trigger_method<T> method;
Trigger_actual(T& obj, Trigger_method<T> method) noexcept : obj{obj}, method{method}
{
}
auto placement_clone(void* buffer) const noexcept -> Trigger_base* override
{
return new (buffer) Trigger_actual{obj, method};
}
auto call() -> void override
{
return (obj.*method)();
}
};
// in Trigger (below) we need to allocate enough storage
// for any Trigger_actual template instantiation
// since all templates basically contain 2 pointers
// we assume (and test it with static_asserts)
// that all will have the same size
// we will use Trigger_actual<Trigger_test_size>
// to determine the size of all Trigger_actual templates
struct Trigger_test_size {};
}
struct Trigger
{
std::aligned_storage_t<sizeof(detail::Trigger_actual<detail::Trigger_test_size>)>
trigger_actual_storage_;
// vital. We cannot just cast `&trigger_actual_storage_` to `Trigger_base*`
// because there is no guarantee by the standard that
// the base pointer will point to the start of the derived object
// so we need to store separately the base pointer
detail::Trigger_base* base_ptr = nullptr;
template <class X>
Trigger(X& x, Trigger_method<X> method) noexcept
{
static_assert(sizeof(trigger_actual_storage_) >=
sizeof(detail::Trigger_actual<X>));
static_assert(alignof(decltype(trigger_actual_storage_)) %
alignof(detail::Trigger_actual<X>) == 0);
base_ptr = new (&trigger_actual_storage_) detail::Trigger_actual<X>{x, method};
}
Trigger(const Trigger& other) noexcept
{
if (other.base_ptr)
{
base_ptr = other.base_ptr->placement_clone(&trigger_actual_storage_);
}
}
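auto operator=(const Trigger& other) noexcept -> Trigger&
{
if (this == &other)
{
// self-assignment: destroying first would leave nothing to clone from
return *this;
}
destroy_actual();
if (other.base_ptr)
{
base_ptr = other.base_ptr->placement_clone(&trigger_actual_storage_);
}
return *this;
}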
~Trigger() noexcept
{
destroy_actual();
}
auto destroy_actual() noexcept -> void
{
if (base_ptr)
{
base_ptr->~Trigger_base();
base_ptr = nullptr;
}
}
auto operator()() const
{
if (!base_ptr)
{
// deal with this situation (error or just ignore and return)
return;
}
base_ptr->call();
}
};
Usage:
struct X
{
auto foo() -> void;
};
auto test()
{
X x;
Trigger f{x, &X::foo};
f();
}
Warning: only tested for compilation errors.
You need to thoroughly test it for correctness.
You need to profile it and see if it has a better performance than other solutions. The advantage of this is because it's in house cooked you can make tweaks to the implementation to increase performance on your specific scenarios.
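If the member function is known at compile time, a different and smaller technique than the Trigger above is the classic "delegate" trick: store the object as a void* plus a plain function pointer to a generated thunk. This is a swapped-in alternative, not the Trigger design; all names here (MethodDelegate, bind, Counter) are made up for the sketch.

```cpp
#include <cassert>

// Hypothetical lightweight delegate: one object pointer plus one plain
// function pointer, no virtual functions and no heap allocation.
class MethodDelegate {
public:
    // M is a compile-time member function pointer, so it needs no storage.
    template <class T, void (T::*M)()>
    static MethodDelegate bind(T &obj)
    {
        return MethodDelegate(&obj, &thunk<T, M>);
    }

    void operator()() const { call_(obj_); }

private:
    MethodDelegate(void *obj, void (*call)(void *)) : obj_(obj), call_(call) {}

    // Casts the erased pointer back to T and invokes the member function.
    template <class T, void (T::*M)()>
    static void thunk(void *obj)
    {
        (static_cast<T *>(obj)->*M)();
    }

    void *obj_;
    void (*call_)(void *);
};

// Small example type used to exercise the delegate.
struct Counter {
    int n;
    void bump() { ++n; }
};
```

Usage would be along the lines of MethodDelegate::bind<Counter, &Counter::bump>(c). The trade-off is that the member function must be named as a template argument, and the bound object must outlive the delegate.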
As @Quuxplusone mentioned in their answer-as-a-comment, you can use inplace_function here. Include the header in your project, and then use it like this:
#include "inplace_function.h"
struct big { char foo[20]; };
static stdext::inplace_function<void(), 8> inplacefunc;
static std::function<void()> stdfunc;
int main() {
static_assert(sizeof(inplacefunc) == 16);
static_assert(sizeof(stdfunc) == 32);
inplacefunc = []() {};
// fine
struct big a;
inplacefunc = [a]() {};
// test.cpp:15:24: required from here
// inplace_function.h:237:33: error: static assertion failed: inplace_function cannot be constructed from object with this (large) size
// 237 | static_assert(sizeof(C) <= Capacity,
// | ~~~~~~~~~~^~~~~~~~~~~
// inplace_function.h:237:33: note: the comparison reduces to ‘(20 <= 8)’
}

Is it safe to convert a template lambda to a `void *`?

I'm working on implementing fibers using coroutines implemented in assembler. The coroutines work by cocall to change stack.
I'd like to expose this in C++ using a higher level interface, as cocall assembly can only handle a single void* argument.
In order to handle template lambdas, I've experimented with converting them to a void* and found that while it compiles and works, I was left wondering if it was safe to do so, assuming ownership semantics of the stack (which are preserved by fibers).
template <typename FunctionT>
struct Coentry
{
static void coentry(void * arg)
{
// Is this safe?
FunctionT * function = reinterpret_cast<FunctionT *>(arg);
(*function)();
}
static void invoke(FunctionT function)
{
coentry(reinterpret_cast<void *>(&function));
}
};
template <typename FunctionT>
void coentry(FunctionT function)
{
Coentry<FunctionT>::invoke(function);
}
int main(int argc, const char * argv[]) {
auto f = [&]{
std::cerr << "Hello World!" << std::endl;
};
coentry(f);
}
Is this safe and additionally, is it efficient? By converting to a void* am I forcing the compiler to choose a less efficient representation?
Additionally, by invoking coentry(void*) on a different stack, but the original invoke(FunctionT) has returned, is there a chance that the stack might be invalid to resume? (would be similar to, say invoking within a std::thread I guess).
Everything done above is defined behaviour. The only performance hit is that inlining something aliased through a void pointer could be slightly harder.
However, the lambda is an actual value, and if stored in automatic storage it only lasts as long as the stack frame it is stored in.
You can fix this in a number of ways. std::function is one; another is to store the lambda in a shared_ptr<void> or unique_ptr<void, void(*)(void*)>. If you do not need type erasure, you can even store the lambda in a struct with deduced type.
The first two are easy. The third:
template <typename FunctionT>
struct Coentry {
FunctionT f;
static void coentry(void * arg)
{
auto* self = reinterpret_cast<Coentry*>(arg);
(self->f)();
}
Coentry(FunctionT fin):f(std::move(fin)){}
};
template<class FunctionT>
Coentry<FunctionT> make_coentry( FunctionT f ){ return {std::move(f)}; }
Now keep your Coentry around until the task completes.
The details of how you manage lifetime depend on the structure of the rest of your problem.
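Since the unique_ptr<void, void(*)(void*)> option is only mentioned in passing, here is a rough sketch of how it could be done. The names ErasedPtr, DeleteAs, MakeErased and ErasedEntry are invented for this example; the idea is that the callable is moved to the heap and the deleter remembers its real type.

```cpp
#include <cassert>
#include <memory>
#include <utility>

using ErasedPtr = std::unique_ptr<void, void (*)(void *)>;

// Deleter that knows the real type behind the void*.
template <class F>
void DeleteAs(void *p)
{
    delete static_cast<F *>(p);
}

// Moves the callable to the heap; the returned pointer owns it.
template <class F>
ErasedPtr MakeErased(F f)
{
    return ErasedPtr(new F(std::move(f)), &DeleteAs<F>);
}

// Entry point with the void(void*) shape that a cocall-style API expects.
template <class F>
void ErasedEntry(void *arg)
{
    (*static_cast<F *>(arg))();
}
```

The owner of the ErasedPtr decides how long the callable lives, so it can be kept in the fiber object itself and outlive the function that created it.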

variadic boost bind type resolution

I'm trying to write an async logger which accepts variadic arguments that are then strung together using a variadic stringer and then pushed onto a single producer single consumer queue.
I'm stuck in my enqueue function part of my Log struct which looks as follows:
template <typename T>
std::string Log::stringer(T const & t){
return boost::lexical_cast<std::string>(t);
}
template<typename T, typename ... Args>
std::string Log::stringer(T const & t, Args const & ... args){
return stringer(t) + stringer(args...);
}
template<typename T, typename ... Args>
void Log::enqueue(T & t, Args & ... args){
boost::function<std::string()> f
= boost::bind(&Log::stringer<T &, Args & ...>,this,
boost::ref(t),
boost::forward<Args>(args)...);
/// the above statement fails to compile though if i use 'auto f' it works ->
/// but then it is unclear to me what the signature of f really is ?
// at this point i would like to post the functor f onto my asio::io_service,
// but not able to cause it's not clear to me what the type of f is.
// I think it should be of type boost::function<std::string()>
}
Inside main(), I call
Log t_log;
t_log.enqueue("hello"," world");
My suggestion for the function you ask about:
template <typename T, typename... Args> void enqueue(T &t, Args const&... args) {
this->io_service->post([=]{
auto s = stringer(t, args...);
//std::fprintf(stderr, "%s\n", s.c_str());
});
}
This works with GCC and Clang (GCC 4.9 or later because of a known issue with captured variadic packs).
But really, I'd reconsider the design at hand, and certainly start a lot simpler until you know what areas deserve further optimization.
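To make the capture-by-value idea above testable without Asio, here is a small stand-alone sketch. Append and MakeDeferred are made-up names, and std::ostringstream stands in for boost::lexical_cast; the point is only that the arguments are copied into the closure, so the stringization can run later.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// End of recursion: nothing left to append.
inline void Append(std::ostringstream &) {}

// Streams the first argument, then recurses on the rest.
template <class T, class... Rest>
void Append(std::ostringstream &os, T const &t, Rest const &... rest)
{
    os << t;
    Append(os, rest...);
}

// Copies all arguments into the closure ([=]) and defers the formatting
// until the returned function is called.
template <class... Args>
std::function<std::string()> MakeDeferred(Args const &... args)
{
    return [=]() -> std::string {
        std::ostringstream os;
        Append(os, args...);
        return os.str();
    };
}
```

Copying protects against dangling references to the arguments themselves, but a copied const char* still points at the caller's buffer, so owning types such as std::string are the safer thing to pass.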
Questionables
There are many things I don't understand about this code:
Why are the arguments being taken by non-const reference
Why are you subsequently using std::forward<> on them (you already know the value category, and it's not going to change)
Why are you passing the stringization to an io_service?
the queue is going to introduce locking (kind of refuting the lockfree queue) and
the stringization is going to have its result ignored...
Why would you use boost::function here? This incurs a (another) dynamic allocation and an indirect dispatch... Just post f
Why are the arguments bound by reference in the first place? If you're going to process the arguments on a different thread, this leads to Undefined Behaviour. E.g. imagine the caller doing
char const msg[] = "my message"; // perhaps some sprintf output
l.enqueue(cat.c_str(), msg);
The c_str() is stale after the enqueue returned and msg goes out of scope soon, or gets overwritten with other data.
Why are you using bind approaches when you clearly have c++11 support (because you used std::forward<> and attributes)?
Why are you using a lockfree queue? Do you anticipate logging constantly at max CPU? In that case, logging is the core functionality of your application and you should probably think this through a bit (a lot) more rigorously (e.g. write into preallocated alternating buffers and decide on a max backlog, etc.).
In all other cases, you probably want at most 1 single thread running on a lockfree queue. This would likely already be overkill (spinning a thread constantly is expensive). Instead, you could gracefully fall back to yields/synchronization if there has been nothing to do for n cycles.
You can just bind to a shared_ptr. This is a lot safer and more convenient than binding to .get()
In my sample below I've just removed the need for scoped_ptrs by not allocating everything from the heap (why was that?). (You can use boost::optional<work> if you needed work.)
The explicit memory-order load/stores give me bad vibes too. The way they're written would make sense only if exactly two threads are involved in the flag, but this is in no way apparent to me at the moment (threads are created all around).
On most platforms there will be no difference, and in light of the above, the presence of explicit memory ordering stands out as a clear code smell
The same thing applies to the attempts to forcibly inline certain functions. You can trust your compiler and you should probably refrain from second guessing it until you know you have a bottleneck caused by suboptimal generated code
Since you intend to give threads thread affinity, do use thread locals. Either use GCC/MSVC extensions in C++03 (__thread) or use c++11 thread_local, e.g. in pop()
thread_local std::string s;
s.reserve(1000);
s.resize(0);
This enormously reduces the number of allocations (at the cost of making pop() non-reentrant, which is not required).
I later noticed this pop() is limited to a single thread
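As a stand-alone illustration of the thread_local buffer idea, here is a small sketch. FormatLine is a hypothetical function, not part of the logger; it just shows the reserve/resize pattern keeping one buffer per thread alive across calls.

```cpp
#include <cassert>
#include <string>

// Hypothetical formatter: reuses one buffer per thread across calls.
// The returned reference is only valid until the next call on the
// same thread.
inline const std::string &FormatLine(const std::string &msg, int n)
{
    thread_local std::string s;
    s.reserve(1000); // allocates on the first call on this thread only
    s.resize(0);     // drops the contents but keeps the capacity
    s += msg;
    s += ' ';
    s += std::to_string(n);
    return s;
}
```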
What is the use of having that lockfree queue if all you do is ... spinlock manually around it?
void push(std::string const &s) {
while (std::atomic_flag_test_and_set_explicit(&this->lock, std::memory_order_acquire))
;
while (!this->q->push(s))
;
std::atomic_flag_clear_explicit(&this->lock, std::memory_order_release);
}
Cleanup Suggestion
Live On Coliru
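#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/atomic.hpp>
#include <boost/lockfree/spsc_queue.hpp>
#include <boost/thread/thread.hpp>
#include <sys/time.h> // gettimeofday (POSIX)
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>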
/*
* safe for use from a single thread only
*/
template <unsigned line_maxchars = 1000>
class Log {
public:
Log(std::string const &logFileName, int32_t queueSize)
: fp(stderr), // std::fopen(logFileName.c_str(),"w")
_shutdown(false),
_thread(&Log::pop, this),
_queue(queueSize)
{ }
void pop() {
std::string s;
s.reserve(line_maxchars);
struct timeval ts;
while (!_shutdown) {
while (_queue.pop(s)) {
gettimeofday(&ts, NULL);
std::fprintf(fp, "%li.%06li %s\n", ts.tv_sec, ts.tv_usec, s.c_str());
}
std::fflush(fp); // RECONSIDER HERE?
}
while (_queue.pop(s)) {
gettimeofday(&ts, NULL);
std::fprintf(fp, "%li.%06li %s\n", ts.tv_sec, ts.tv_usec, s.c_str());
}
}
template <typename S, typename T> void stringer(S& stream, T const &t) {
stream << t;
}
template <typename S, typename T, typename... Args>
void stringer(S& stream, T const &t, Args const &... args) {
stringer(stream, t);
stringer(stream, args...);
}
template <typename T, typename... Args> void enqueue(T &t, Args const&... args) {
thread_local char buffer[line_maxchars] = {};
boost::iostreams::array_sink as(buffer);
boost::iostreams::stream<boost::iostreams::array_sink> stream(as);
stringer(stream, t, args...);
auto output = as.output_sequence();
push(std::string(output.first, output.second));
}
void push(std::string const &s) {
while (!_queue.push(s));
}
~Log() {
_shutdown = true;
_thread.join();
assert(_queue.empty());
std::fflush(fp);
std::fclose(fp);
fp = NULL;
}
private:
FILE *fp;
boost::atomic_bool _shutdown;
boost::lockfree::spsc_queue<std::string> _queue; // declared before _thread so it is constructed before pop() starts running
boost::thread _thread;
};
#include <chrono>
#include <iostream>
int main() {
using namespace std::chrono;
auto start = high_resolution_clock::now();
{
Log<> l("/tmp/junk.log", 1024);
for (int64_t i = 0; i < 10; ++i) {
l.enqueue("hello ", i, " world");
}
}
std::cout << duration_cast<microseconds>(high_resolution_clock::now() - start).count() << "μs\n";
}
As you can see, I've reduced the code by a third. I've documented the fact that it's only safe for use from a single thread.
Asio is gone. Lexical cast is gone. Things have meaningful names. No more memory order fiddling. No more thread affinity fiddling. No more inline envy. No more tedious string allocations.
The things that you'd likely benefit the most from are
make the array_sinks/buffers pooled and stored in the queue by reference
not flush on every log

C++ varargs - Is how I am using them okay or are they bad? Is there a good alternative?

The ultimate goal of this is to have a function which can take a variable number of arguments of a certain type (the same type, not different types), that can be declared on the function call.
As I'm using Visual Studio 2010, I CANNOT do:
MyFunction({1,2,3});
In an earlier question which was answered, I found I could use boost::assign::list_of(), however I discovered later that this seems to have a bug of some kind if you try to pass it only one parameter.
So I did some more searching and found that I could use variadic functions to achieve what I was aiming for.
void TestFunction2<int>(int count, ...)
{}
However, I wanted to restrict it by type, so eventually found I could do this with templates:
template <class T>
void TestFunction(const T& count, ...);
template <>
void TestFunction<int>(const int& count, ...);
Unfortunately, varargs things like va_list do not apparently like references. The examples I saw to restrict types like this used const references. If I remove the const reference aspect of the count parameter, it works as I want, but I don't know if this is going to lead to horrible side-effects down the road, OR if this whole varargs thing is a bad idea to begin with.
So I guess my question is, is what I'm doing in the last example above good or bad? If it's bad, what is a good alternative so I can call a function with one or more parameters in-line like, say, int parameters?
What you want is std::initializer_list<T>; unfortunately this requires C++11 support.
An alternative, that is nearly as elegant and easy enough to upgrade from, is to use an array:
#include <iostream>
template <typename T, size_t N>
void func(T (&s)[N]) {
for (size_t i = 0; i != N; ++i) {
std::cout << s[i] << '\n';
}
}
int main() {
int array[] = {1, 2, 3};
func(array);
}
When you move on to a compiler that supports initializer lists, this can be changed into:
#include <iostream>
template <typename T>
void func(std::initializer_list<T> s) {
for (T const& t: s) {
std::cout << t << '\n';
}
}
int main() {
func({1, 2, 3});
}
So updating both the function and the call sites will be painless.
Note: the call site could be made completely similar using a macro, but I advise against such an approach; the purported gain is not worth the obfuscation.
EDIT:
One more solution... if your compiler's IDE partially supports C++11, you may be able to initialize a std::vector at call time, i.e.
template <typename T>
void TestFunction(std::vector<T> vect)
{
....
}
....
TestFunction(std::vector<int>{1,2,3});
An advantage of this approach is that the temporary vector frees its memory automatically once the call completes.
If that doesn't work you can resort to a few extra lines...
template <typename T>
void TestFunction(std::vector<T> vect)
{
....
}
....
int init[] = {1, 2, 3};
std::vector<int> tmp(init, init + 3);
TestFunction(tmp);
The big downside is that here the vector, along with the heap memory it owns, stays alive until you leave that scope (or until you explicitly clear it and release its storage).
Both approaches share some advantages... the count is built in and you have access to other useful member functions or affiliate methods (like std::sort).
......................................
Why not use variable arguments?
See the answer here, for example...
Is it a good idea to use varargs in a C API to set key value pairs?
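To make the trade-off concrete, here is roughly what the plain varargs version from the question looks like when written out. SumInts is a made-up example; note that count is taken by value, since calling va_start on a reference parameter is undefined behaviour, which is the problem the question ran into.

```cpp
#include <cassert>
#include <cstdarg>

// `count` is taken by value: va_start on a reference parameter is
// undefined behaviour.
inline int SumInts(int count, ...)
{
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int); // the type is an unchecked promise
    va_end(args);
    return total;
}
```

Nothing stops a caller from passing a wrong count or a double where an int is read, and the compiler will not complain, which is the main argument for the array, vector or initializer_list alternatives.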
On non-C++11 compliant compilers (like your IDE), you can try...
template <typename T>
void TestFunction(const unsigned int count, T * arr);
std::string arr[] = {"One", "Two", "Three"};
TestFunction<std::string>(3, arr);
(Sounds like you can't use this in your IDE, but...)
If you're confident you're only compiling on modern machines and are primarily using simple types, this is best/most standards compliant solution...
As of C++11 you can use std::initializer_list, which std::vector's constructors also accept:
#include <initializer_list>
template <typename T>
void TestFunction(const std::initializer_list<T>& v)
{ }
int main()
{
TestFunction<double>({1.0, 2.0});
return 0;
}
..........................
...however this requires your compiler to support C++11, so it's not perfectly portable. For anything other than simple types, it also becomes harder to read.
I realize you say on the function call, but you may want to rethink that from a readability and ease of coding approach.
I agree with part of your approach -- what you want is to use a template function (this handles the variable type). Before you call you initialize your collection of same-type elements into a temporary standard C array or a std::vector/std::list (STL's array wrapper).
http://www.cplusplus.com/doc/tutorial/templates/
http://www.cplusplus.com/reference/vector/
http://www.cplusplus.com/reference/list/
It's more lines of code, but it's much more readable and standardized.
i.e.
Rather than...
MyFunction({1,2,3});
Use:
template <typename T>
void TestFunction(const int count, T * arr)
{
for (unsigned int i = 0; i < count; i++)
{
.... arr[i] ... ; //do stuff
...
}
}
int main()
{
int myArr[] = {1,2,3};
TestFunction<int>(3, myArr);
}
...or...
#include <vector>
template <typename T>
void TestFunction(std::vector<T> vect)
{
for (unsigned int i = 0; i < vect.size(); i++)
{
.... vect[i] ... ; //do stuff
...
}
}
int main()
{
std::vector<int> myVect;
myVect.push_back(1);
myVect.push_back(2);
myVect.push_back(3);
TestFunction<int>(myVect);
}
std::list would also be perfectly acceptable, and may perform better, depending on your use case.