Optimize C++ template executions


I am working on a project where performance is critical. The application processes a huge amount of data. The code is written in C++ and I need to make some changes.
Given the following code (it is NOT my code, and I have simplified it to the minimum):
template <int PARAM1, int PARAM2>
void process() {
// processing the data
}
void processTheData (int param1, int param2) { // wrapper
if (param1 == 1 && param2 == 1) { // Ugly looking block of if's
process<1, 1>();
} else if (param1 == 1 && param2 == 2) {
process<1, 2>();
} else if (param1 == 1 && param2 == 3) {
process<1, 3>();
} else if (param1 == 1 && param2 == 4) {
process<1, 4>();
} else if (param1 == 2 && param2 == 1) {
process<2, 1>();
} else if (param1 == 2 && param2 == 2) {
process<2, 2>();
} else if (param1 == 2 && param2 == 3) {
process<2, 3>();
} else if (param1 == 2 && param2 == 4) {
process<2, 4>();
} // and so on....
}
And the main function:
int main(int argc, char *argv[]) {
int factor1 = atoi(argv[1]);
int factor2 = atoi(argv[2]);
// choose some optimal param1 and param2
int param1 = choseTheOptimal(factor1, factor2);
int param2 = choseTheOptimal(factor1, factor2);
processTheData(param1, param2); //start processing
return 0;
}
Hopefully the code is clear.
The functions:
process is the core function that processes the data,
processTheData is a wrapper around the process function.
There is a limited number of values that the parameters (param1 and param2) can take (let's say about 10 x 10).
The values of param1 and param2 are NOT known before execution.
If I simply rewrite the process function so that it takes function parameters instead of template constants (that is, process(int PARAM1, int PARAM2)), the processing is about 10 times slower.
Because of this, PARAM1 and PARAM2 must be compile-time constants of the process function.
Is there any smart way to get rid of this ugly block of ifs in the processTheData function?

Like this.
#include <array>
#include <utility>
template<int PARAM1, int PARAM2>
void process() {
// processing the data
}
// make a jump table to call process<X, Y> where X is known and Y varies
template<std::size_t P1, std::size_t...P2s>
constexpr auto make_table_over_p2(std::index_sequence<P2s...>)
{
return std::array<void (*)(), sizeof...(P2s)>
{
&process<int(P1), int(P2s)>...
};
}
// make a table of jump tables to call process<X, Y> where X and Y both vary
template<std::size_t...P1s, std::size_t...P2s>
constexpr auto make_table_over_p1_p2(std::index_sequence<P1s...>, std::index_sequence<P2s...> p2s)
{
using element_type = decltype(make_table_over_p2<0>(p2s));
return std::array<element_type, sizeof...(P1s)>
{
make_table_over_p2<P1s>(p2s)...
};
}
void processTheData (int param1, int param2) { // wrapper
// make a 10x10 jump table
static const auto table = make_table_over_p1_p2(
std::make_index_sequence<10>(),
std::make_index_sequence<10>()
) ;
// todo - put some limit checks here
// dispatch
table[param1][param2]();
}
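For completeness, here is a minimal sketch of the same jump-table idea with the limit checks filled in. The limits kMax1 and kMax2, the helper names make_row and make_table, and the int return value of process are assumptions made for illustration only (the original process returns void); the return value just makes it visible which instantiation was selected.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the real kernel: it only reports its constants.
template <int P1, int P2>
int process() { return P1 * 100 + P2; }

using handler = int (*)();

// One row of the table: P1 is fixed, P2 varies over the index sequence.
template <std::size_t P1, std::size_t... P2s>
constexpr std::array<handler, sizeof...(P2s)> make_row(std::index_sequence<P2s...>)
{
    return {{ &process<static_cast<int>(P1), static_cast<int>(P2s)>... }};
}

// The full table: one row per value of P1.
template <std::size_t... P1s, std::size_t... P2s>
constexpr std::array<std::array<handler, sizeof...(P2s)>, sizeof...(P1s)>
make_table(std::index_sequence<P1s...>, std::index_sequence<P2s...> p2s)
{
    return {{ make_row<P1s>(p2s)... }};
}

constexpr std::size_t kMax1 = 10; // assumed number of valid param1 values
constexpr std::size_t kMax2 = 10; // assumed number of valid param2 values

int processTheData(int param1, int param2)
{
    static const auto table = make_table(std::make_index_sequence<kMax1>{},
                                         std::make_index_sequence<kMax2>{});
    // Reject values that have no entry in the table.
    if (param1 < 0 || param2 < 0 ||
        static_cast<std::size_t>(param1) >= kMax1 ||
        static_cast<std::size_t>(param2) >= kMax2)
        throw std::out_of_range("processTheData: parameter outside the table");
    return table[param1][param2](); // one indirect call, no if-chain
}
```

With this sketch, processTheData(1, 2) ends up in process<1, 2>, and an out-of-range argument throws instead of indexing past the table.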

This is what I call the magic switch. It takes a runtime value (within a specified range) and turns it into a compile-time value.
#include <cstddef>
#include <type_traits>
#include <utility>
namespace details
{
template<std::size_t I>
using index_t = std::integral_constant<std::size_t, I>;
template<class F>
using f_result = std::result_of_t< F&&(index_t<0>) >;
template<class F>
using f_ptr = f_result<F>(*)(F&& f);
template<class F, std::size_t I>
f_ptr<F> get_ptr() {
return [](F&& f)->f_result<F> {
return std::forward<F>(f)(index_t<I>{});
};
}
template<class F, std::size_t...Is>
auto dispatch( F&& f, std::size_t X, std::index_sequence<Is...> ) {
static const f_ptr<F> table[]={
get_ptr<F, Is>()...
};
return table[X](std::forward<F>(f));
}
}
template<std::size_t max, class F>
details::f_result<F>
dispatch( F&& f, std::size_t I ) {
return details::dispatch( std::forward<F>(f), I, std::make_index_sequence<max>{} );
}
What this does is build a jump table that converts runtime data into a compile-time constant. I use a lambda because it keeps things nice and generic, and pass it an integral constant. An integral constant is a stateless runtime object whose type carries the constant with it.
An example use:
template<std::size_t a, std::size_t b>
void process() {
static_assert( sizeof(int[a+1]) + sizeof(int[b+1]) >= 0 );
}
constexpr int max_factor_1 = 10;
constexpr int max_factor_2 = 10;
int main() {
int factor1 = 1;
int factor2 = 5;
dispatch<max_factor_1>(
[factor2](auto factor1) {
dispatch<max_factor_2>(
[factor1](auto factor2) {
process< decltype(factor1)::value, decltype(factor2)::value >();
},
factor2
);
},
factor1
);
}
where max_factor_1 and max_factor_2 are constexpr values or expressions.
This uses C++14 for auto lambdas and constexpr implicit cast from integral constants.

This is what I came up with. It uses fewer fancy features (only enable_if, no variadic templates or function pointers), but it is also less generic. Pasting the code into godbolt indicates that compilers are able to optimize this away completely for the example code, which may translate into a performance advantage in real code.
#include <type_traits>
template <int param1, int param2>
void process() {
static_assert(sizeof(int[param1 + 1]) + sizeof(int[param2 + 1]) > 0);
}
template <int limit2, int param1, int param2>
std::enable_if_t<(param2 > limit2)> pick_param2(int) {
// Invalid value for parameter 2: reached at run time only when the value exceeds limit2; report the error here.
}
template <int limit2, int param1, int param2>
std::enable_if_t<param2 <= limit2> pick_param2(int p) {
if (p > 0) {
pick_param2<limit2, param1, param2 + 1>(p - 1);
} else {
process<param1, param2>();
}
}
template <int limit1, int limit2, int param>
std::enable_if_t<(param > limit1)> pick_param1(int, int) {
// Invalid value for parameter 1: reached at run time only when the value exceeds limit1; report the error here.
}
template <int limit1, int limit2, int param>
std::enable_if_t<param <= limit1> pick_param1(int p1, int p2) {
if (p1 > 0) {
pick_param1<limit1, limit2, param + 1>(p1 - 1, p2);
} else {
pick_param2<limit2, param, 0>(p2);
}
}
template <int limit_param1, int limit_param2>
void pick_params(int param1, int param2) {
pick_param1<limit_param1, limit_param2, 0>(param1, param2);
}
int main() {
int p1 = 3;
int p2 = 5;
pick_params<10, 10>(p1, p2);
}
I'd be interested in profiling results.

Related

C++ LRUCache decorator | call a non-member function in a class scope

You may be familiar with Python decorators such as @lru_cache. It wraps any function and memoizes its results, improving the runtime:
@functools.lru_cache(maxsize=100)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)
I want to build an @lru_cache decorator in C++.
My implementation has one problem: when I wrap a recursive function and call it through the wrapping class, the subsequent recursive calls have no access to the cache. As a result, I get 0 hits. Let me illustrate (I'll skip a bit of code).
LRUCache class:
template <typename Key, typename Val>
class LRUCache
{
public:
LRUCache( int capacity = 100 ) : capacity{ capacity } {}
Val get( Key key ) {... }
void put( Key key, Val value ) {... }
...
private:
int capacity;
std::list<std::pair<Val, Key>> CACHE;
std::unordered_map<Key, typename std::list<std::pair<Val, Key>>::iterator> LOOKUP;
};
_LruCacheFunctionWrapper class:
template <typename Key, typename Val>
class _LruCacheFunctionWrapper
{
struct CacheInfo {... };
public:
_LruCacheFunctionWrapper( std::function<Val( Key )> func, int maxSize )
: _wrapped{ func }
, _cache{ maxSize }
, _hits{ 0 }
, _misses{ 0 }
, _maxsize{ maxSize }
{}
template<typename... Args>
Val operator()( Args... args )
{
auto res = _cache.get( args... );
if( res == -1 )
{
++_misses;
res = _wrapped( args... );
_cache.put( args..., res );
}
else
++_hits;
return res;
}
CacheInfo getCacheInfo() {... }
void clearCache() {... }
private:
std::function<Val( Key )> _wrapped;
LRUCache<Key, Val> _cache;
int _hits;
int _misses;
int _maxsize;
};
And lastly, the target function:
long long fib( int n )
{
if( n < 2 )
return n;
return fib( n - 1 ) + fib( n - 2 );
}
You may see that the line:
res = _wrapped( args... );
is sending me into the function scope and I have to recalculate all recursive calls. How can I solve it?
Main.cpp:
_LruCacheFunctionWrapper<int, long long> wrapper( &fib, 50 );
for( auto i = 0; i < 16; ++i )
std::cout << wrapper( i ) << " ";
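One possible direction, sketched under assumptions and not taken from the code above: let the wrapped function receive the wrapper itself as its first argument, so that its recursive calls go back through the cache instead of calling the raw function. The class name Memoized and its members are hypothetical, and LRU eviction is left out to keep the sketch short.

```cpp
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical minimal memoizer (no LRU eviction, for brevity).
template <typename Key, typename Val>
class Memoized {
public:
    // The wrapped callable takes the memoizer as its first argument,
    // so recursion can be routed through operator() below.
    using Func = std::function<Val(Memoized&, Key)>;

    explicit Memoized(Func f) : func_(std::move(f)) {}

    Val operator()(Key key)
    {
        auto it = cache_.find(key);
        if (it != cache_.end()) {
            ++hits_;
            return it->second;
        }
        ++misses_;
        Val result = func_(*this, key); // may recurse into operator()
        cache_.emplace(key, result);
        return result;
    }

    int hits() const { return hits_; }
    int misses() const { return misses_; }

private:
    Func func_;
    std::unordered_map<Key, Val> cache_;
    int hits_ = 0;
    int misses_ = 0;
};

// The target function recurses through `self`, not through fib directly.
long long fib(Memoized<int, long long>& self, int n)
{
    if (n < 2)
        return n;
    return self(n - 1) + self(n - 2);
}
```

Constructing Memoized<int, long long> wrapper(&fib) and calling wrapper(30) then computes each value of n only once; every repeated request is served from the cache and counted as a hit.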

Main function body doesn't detect call to overloaded variadic-templated function C++

I'm currently learning variadic template functions and parameter packing/unpacking.
This is my code,
template<typename T, typename U>
void my_insert(std::vector<int>& v, T& t) {
int i;
if (typeid(t).name() == typeid(const char*).name()) {
i = stoi(t);
}
else if (typeid(t).name() == typeid(char).name()) {
i = t - 48;
}
else if (typeid(t).name() == typeid(int).name()) {
i = t;
}
else if (typeid(t).name() == typeid(double).name()) {
i = static_cast<int>(round(t));
}
else if (typeid(t).name() == typeid(bool).name()) {
if (t) i == 1;
else i == 0;
}
else if (typeid(t).name() == typeid(std::vector<U>).name()) {
int j = 0;
while (j < t.size()) {
my_insert(v, t[j]);
++j;
}
}
else return;
v.push_back(i);
}
template<typename T, typename U, typename ...Args>
void my_insert(std::vector<int>& v, T& t, Args&... args) {
int i;
if (typeid(t).name() == typeid(const char*).name()) {
if (isdigit(t[0])) i = stoi(t);
// else do nothing
}
else if (typeid(t).name() == typeid(char).name()) {
i = t - 48;
}
else if (typeid(t).name() == typeid(int).name()) {
i = t;
}
else if (typeid(t).name() == typeid(double).name()) {
i = static_cast<int>(round(t));
}
else if (typeid(t).name() == typeid(bool).name()) {
if (t) i == 1;
else i == 0;
}
else if (typeid(t).name() == typeid(std::vector<U>).name()) {
int j = 0;
while (j < t.size()) {
my_insert(v, t[j]);
++j;
}
}
//else do nothing
v.push_back(i);
my_insert(args...);
}
int main() {
std::vector<int> v;
my_insert(v, "123", "-8", 32, 3.14159, true, true, false, '5', "12.3");
return 0;
}
ERROR : no instance of overloaded function my_insert matches the argument list
I don't understand what mistake I've made, since the exact same approach works for a print() function with the body { cout << t << endl; print(args...); } and the signature template<typename T, typename ...Args> void print(const T& t, const Args... args);
I know that a variadic function can be implemented with recursive calls to a non-variadic overload of the same function, the so-called base case.
With all that being said, I'm unsure what I'm doing incorrectly.

Well... there are some problems in your code.
The blocking error is the template parameter U
template<typename T, typename U>
void my_insert(std::vector<int>& v, T& t)
template<typename T, typename U, typename ...Args>
void my_insert(std::vector<int>& v, T& t, Args&... args)
The compiler can't deduce it, and in the call
my_insert(v, "123", "-8", 32, 3.14159, true, true, false, '5', "12.3");
U isn't specified explicitly.
I suppose the idea is "if T is a std::vector of some type U, add all elements of the vector". If I understand correctly, I suggest adding a different overloaded version of the function.
Other problems...
1) In a couple of places you write something like
if (t) i == 1;
else i == 0;
It seems to me that you're using operator == (comparison) instead of = (assignment).
General suggestion: enable the highest warning level to catch this sort of trivial error.
2) You're using typeid
if (typeid(t).name() == typeid(char).name())
to compare types.
Suggestion: use std::is_same instead
if ( std::is_same<T, char>::value )
3) The base case of your recursion is a my_insert() function that is almost identical to the recursive version; the only differences are the absence of the Args... argument and of the recursive call.
This is error-prone because, if you modify one of the two versions, you must remember to modify the other in the same way.
Suggestion: write an empty, do-nothing base case; something like
void my_insert (std::vector<int> & v)
{ }
4) you can't compile
i = stoi(t);
when t isn't a char const *
Analogous problems with other assignments.
The problem is that when you write [pseudocode]
if ( condition )
statement_1;
else
statement_2;
the compiler must compile both statement_1 and statement_2, even when it knows at compile time whether condition is true or false.
To avoid the compilation of the unused statement, you have to use if constexpr.
So you have to write something like
if constexpr ( std::is_same_v<T, char const *> )
i = std::stoi(t);
else if constexpr ( std::is_same_v<T, char> )
i = t - 48;
else if constexpr ( std::is_same_v<T, int> )
i = t;
else if constexpr ( std::is_same_v<T, double> )
i = static_cast<int>(std::round(t));
else if constexpr ( std::is_same_v<T, bool> )
i = t;
Unfortunately, if constexpr is available only starting from C++17.
Before C++17, you have to write different overloaded functions.
5) when calling my_insert() recursively, you have to remember to pass the v vector
my_insert(args...); // <-- wrong! no v
my_insert(v, args...); // <-- correct
6) take into account that "123" is convertible to char const * but isn't a char const * (it's a char const [4]); so, instead of
if constexpr ( std::is_same_v<T, char const *> )
i = std::stoi(t);
you can try with
if constexpr ( std::is_convertible_v<T, char const *> )
i = std::stoi(t);
The following is a possible C++17 implementation of your code
#include <cmath>
#include <string>
#include <type_traits>
#include <vector>
#include <iostream>
void my_insert (std::vector<int> const &)
{ }
template <typename T, typename ... As>
void my_insert (std::vector<int> &, std::vector<T> const &, As const & ...);
template <typename T, typename ... As>
void my_insert (std::vector<int> & v, T const & t, As const & ... as)
{
int i{};
if constexpr ( std::is_convertible_v<T, char const *> )
i = std::stoi(t);
else if constexpr ( std::is_same_v<T, char> )
i = t - 48;
else if constexpr ( std::is_same_v<T, int> )
i = t;
else if constexpr ( std::is_same_v<T, double> )
i = static_cast<int>(std::round(t));
else if constexpr ( std::is_same_v<T, bool> )
i = t;
// else ???
v.push_back(i);
my_insert(v, as...);
}
template <typename T, typename ... As>
void my_insert (std::vector<int> & v, std::vector<T> const & t,
As const & ... as)
{
for ( auto const & val : t )
my_insert(v, val);
my_insert(v, as...);
}
int main ()
{
std::vector<int> v;
std::vector<char> u { '9', '8', '7' };
my_insert(v, "123", "-8", 32, 3.14159, true, u, false, '5', "12.3");
for ( auto const & val : v )
std::cout << val << ' ';
std::cout << std::endl;
}
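Regarding the remark in point 4 that, before C++17, you have to write different overloaded functions: a minimal sketch of that variant could look as follows. The helper name to_int is an assumption introduced for illustration, and the std::vector overload is omitted for brevity.

```cpp
#include <cmath>
#include <string>
#include <vector>

// One small overload per supported type replaces the if constexpr chain.
inline int to_int(const char* s) { return std::stoi(s); }
inline int to_int(char c)        { return c - '0'; }
inline int to_int(int i)         { return i; }
inline int to_int(double d)      { return static_cast<int>(std::round(d)); }
inline int to_int(bool b)        { return b ? 1 : 0; }

// Empty, do-nothing base case of the recursion.
inline void my_insert(std::vector<int>&) {}

template <typename T, typename... As>
void my_insert(std::vector<int>& v, const T& t, const As&... as)
{
    v.push_back(to_int(t)); // overload resolution picks the conversion
    my_insert(v, as...);    // recurse on the remaining arguments, passing v along
}
```

A string literal such as "123" decays to char const * and selects the first overload, so no is_convertible test is needed here. The sketch compiles as C++11.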

Recursive iteration over parameter pack

I'm currently trying to implement a function, which accepts some data and a parameter-pack ...args. Inside I call another function, which recursively iterates the given arguments.
Sadly I'm having some issues to compile it. Apparently the compiler keeps trying to compile the recursive function, but not the overload to stop the recursion.
Does anyone have an idea what the issue is ?
class Sample
{
public:
template<class ...TArgs, std::size_t TotalSize = sizeof...(TArgs)>
static bool ParseCompositeFieldsXXX(const std::vector<std::string> &data, TArgs &&...args)
{
auto field = std::get<0>(std::forward_as_tuple(std::forward<TArgs>(args)...));
//bool ok = ParseField(field, 0, data);
auto x = data[0];
bool ok = true;
if (TotalSize > 1)
return ok && ParseCompositeFields<1>(data, std::forward<TArgs>(args)...);
return ok;
}
private:
template<std::size_t Index, class ...TArgs, std::size_t TotalSize = sizeof...(TArgs)>
static bool ParseCompositeFields(const std::vector<std::string> &data, TArgs &&...args)
{
auto field = std::get<Index>(std::forward_as_tuple(std::forward<TArgs>(args)...));
//bool ok = ParseField(field, Index, data);
auto x = data[Index];
bool ok = true;
if (Index < TotalSize)
return ok && ParseCompositeFields<Index + 1>(data, std::forward<TArgs>(args)...);
return ok;
}
template<std::size_t Index>
static bool ParseCompositeFields(const std::vector<std::string> &data)
{
volatile int a = 1 * 2 + 3;
}
};
int wmain(int, wchar_t**)
{
short x1 = 0;
std::string x2;
long long x3 = 0;
Sample::ParseCompositeFieldsXXX({ "1", "Sxx", "-5,32" }, x1, x2, x3);
return 0;
}
\utility(446): error C2338: tuple index out of bounds
...
\main.cpp(56): note: see reference to class template
instantiation 'std::tuple_element<3,std::tuple>' being compiled

Alternative approach
You seem to be using a rather old technique here. A simple pack expansion is what you're looking for:
#include <cstddef>
#include <utility>
#include <tuple>
#include <vector>
#include <string>
class Sample
{
template <std::size_t index, typename T>
static bool parse_field(T&& field, const std::vector<std::string>& data)
{
return true;
}
template <typename Tuple, std::size_t ... sequence>
static bool parse_impl(Tuple&& tup, const std::vector<std::string>& data, std::index_sequence<sequence...>)
{
using expander = bool[];
expander expansion{parse_field<sequence>(std::get<sequence>(tup), data)...};
bool result = true;
for (auto iter = std::begin(expansion); iter != std::end(expansion); ++iter)
{
result = result && *iter;
}
return result;
}
public:
template<class ...TArgs, std::size_t TotalSize = sizeof...(TArgs)>
static bool ParseCompositeFieldsXXX(const std::vector<std::string> &data, TArgs &&...args)
{
return parse_impl(std::forward_as_tuple(std::forward<TArgs>(args)...),
data, std::make_index_sequence<sizeof...(TArgs)>{});
}
};
int main()
{
short x1 = 0;
std::string x2;
long long x3 = 0;
Sample::ParseCompositeFieldsXXX({ "1", "Sxx", "-5,32" }, x1, x2, x3);
return 0;
}
If what you're looking at behaves like an array, then treat it as an array. Don't use recursion unless it is required, as it usually complicates things. Of course, there are exceptions.
Making it better
As you can see, one doesn't even need a class here. Just remove it.
Possible problems
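One problem might arise if the order of invocation matters. The elements of a braced-init-list are guaranteed to be evaluated left to right since C++11, but some older compilers did not implement this correctly, so it is worth verifying on your toolchain.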
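If C++17 is available, the array-based expansion can also be written as a fold expression, which additionally stops at the first failing field. The following is a minimal sketch; the body of parse_field is a hypothetical stand-in (it only checks that the data has a non-empty entry at the index), and parse_composite_fields is an assumed name for the free-function entry point.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical field parser: succeeds when data has a non-empty entry at index.
template <std::size_t index, typename T>
bool parse_field(T&, const std::vector<std::string>& data)
{
    return index < data.size() && !data[index].empty();
}

template <typename Tuple, std::size_t... Is>
bool parse_impl(Tuple&& tup, const std::vector<std::string>& data,
                std::index_sequence<Is...>)
{
    // Fold over &&: operands are evaluated left to right and evaluation
    // stops at the first false; an empty pack yields true.
    return (parse_field<Is>(std::get<Is>(tup), data) && ...);
}

template <class... TArgs>
bool parse_composite_fields(const std::vector<std::string>& data, TArgs&... args)
{
    return parse_impl(std::tie(args...), data,
                      std::index_sequence_for<TArgs...>{});
}
```

Called as parse_composite_fields(data, x1, x2, x3), this visits the fields in index order without recursion and without a temporary array.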

Does anyone have an idea what the issue is ?
The crucial point are the lines:
if (Index < TotalSize)
return ok && ParseCompositeFields<Index + 1>(data, std::forward<TArgs>(args)...);
First of all, to be logically correct, the condition should read Index < TotalSize - 1, as tuple element indices are zero-based.
Furthermore, even if Index == TotalSize - 1, the compiler is still forced to instantiate ParseCompositeFields<Index + 1> (as it has to compile the if-branch), which effectively is ParseCompositeFields<TotalSize>. This, however, leads to the error you got, because it tries to instantiate std::get<TotalSize>.
So, in order to compile the if-branch only when the condition is fulfilled, you would have to use if constexpr(Index < TotalSize - 1) (see on godbolt). For C++14, you have to fall back on template specializations and function objects:
class Sample
{
template<std::size_t Index, bool>
struct Helper {
template<class ...TArgs, std::size_t TotalSize = sizeof...(TArgs)>
static bool ParseCompositeFields(const std::vector<std::string> &data, TArgs &&...args)
{
auto field = std::get<Index>(std::forward_as_tuple(std::forward<TArgs>(args)...));
//bool ok = ParseField(field, Index, data);
auto x = data[Index];
bool ok = true;
return ok && Helper<Index + 1, (Index < TotalSize - 1)>::ParseCompositeFields(data, std::forward<TArgs>(args)...);
}
};
template<std::size_t Index>
struct Helper<Index, false> {
template<class ...TArgs, std::size_t TotalSize = sizeof...(TArgs)>
static bool ParseCompositeFields(const std::vector<std::string> &data, TArgs &&...args) {
volatile int a = 1 * 2 + 3;
return true;
}
};
public:
template<class ...TArgs, std::size_t TotalSize = sizeof...(TArgs)>
static bool ParseCompositeFieldsXXX(const std::vector<std::string> &data, TArgs &&...args)
{
auto field = std::get<0>(std::forward_as_tuple(std::forward<TArgs>(args)...));
//bool ok = ParseField(field, 0, data);
auto x = data[0];
bool ok = true;
return ok && Helper<1, (TotalSize > 1)>::ParseCompositeFields(data, std::forward<TArgs>(args)...);
}
};
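For comparison, the if constexpr variant mentioned above could be sketched like this in C++17. It is written as a free function for brevity; ParseField here is a hypothetical stand-in for the commented-out call in the question, and the condition is spelled Index + 1 < TotalSize, which is equivalent but avoids unsigned wrap-around.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real ParseField: it only checks that the
// data vector has an entry for this index.
template <class Field>
bool ParseField(Field&, std::size_t index, const std::vector<std::string>& data)
{
    return index < data.size();
}

template <std::size_t Index = 0, class... TArgs>
bool ParseCompositeFields(const std::vector<std::string>& data, TArgs&... args)
{
    constexpr std::size_t TotalSize = sizeof...(TArgs);
    static_assert(Index < TotalSize, "Index must refer to an existing argument");

    auto& field = std::get<Index>(std::tie(args...));
    bool ok = ParseField(field, Index, data);

    // The recursive call is discarded at compile time for the last index,
    // so ParseCompositeFields<TotalSize> is never instantiated.
    if constexpr (Index + 1 < TotalSize)
        return ok && ParseCompositeFields<Index + 1>(data, args...);
    else
        return ok;
}
```

No separate overload is needed to stop the recursion, because the discarded branch is never instantiated.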

Pass-by-reference hinders gcc from tail call elimination

See BlendingTable::create and BlendingTable::print. Both have the same form of tail recursion, but while create will be optimized as a loop, print will not and cause a stack overflow.
Go down to see a fix, which I got from a hint from one of the gcc devs on my bug report of this problem.
#include <cstdlib>
#include <iostream>
#include <memory>
#include <array>
#include <limits>
class System {
public:
template<typename T, typename... Ts>
static void print(const T& t, const Ts&... ts) {
std::cout << t << std::flush;
print(ts...);
}
static void print() {}
template<typename... Ts>
static void printLine(const Ts&... ts) {
print(ts..., '\n');
}
};
template<typename T, int dimension = 1>
class Array {
private:
std::unique_ptr<T[]> pointer;
std::array<int, dimension> sizes;
int realSize;
public:
Array() {}
template<typename... Ns>
Array(Ns... ns):
realSize(1) {
checkArguments(ns...);
create(1, ns...);
}
private:
template<typename... Ns>
static void checkArguments(Ns...) {
static_assert(sizeof...(Ns) == dimension, "dimension mismatch");
}
template<typename... Ns>
void create(int d, int n, Ns... ns) {
realSize *= n;
sizes[d - 1] = n;
create(d + 1, ns...);
}
void create(int) {
pointer = std::unique_ptr<T[]>(new T[realSize]);
}
int computeSubSize(int d) const {
if (d == dimension) {
return 1;
}
return sizes[d] * computeSubSize(d + 1);
}
template<typename... Ns>
int getIndex(int d, int n, Ns... ns) const {
return n * computeSubSize(d) + getIndex(d + 1, ns...);
}
int getIndex(int) const {
return 0;
}
public:
template<typename... Ns>
T& operator()(Ns... ns) const {
checkArguments(ns...);
return pointer[getIndex(1, ns...)];
}
int getSize(int d = 1) const {
return sizes[d - 1];
}
};
class BlendingTable : public Array<unsigned char, 3> {
private:
enum {
SIZE = 0x100,
FF = SIZE - 1,
};
public:
BlendingTable():
Array<unsigned char, 3>(SIZE, SIZE, SIZE) {
static_assert(std::numeric_limits<unsigned char>::max() == FF, "unsupported byte format");
create(FF, FF, FF);
}
private:
void create(int dst, int src, int a) {
(*this)(dst, src, a) = (src * a + dst * (FF - a)) / FF;
if (a > 0) {
create(dst, src, a - 1);
} else if (src > 0) {
create(dst, src - 1, FF);
} else if (dst > 0) {
create(dst - 1, FF, FF);
} else {
return;
}
}
void print(int dst, int src, int a) const {
System::print(static_cast<int>((*this)(FF - dst, FF - src, FF - a)), ' ');
if (a > 0) {
print(dst, src, a - 1);
} else if (src > 0) {
print(dst, src - 1, FF);
} else if (dst > 0) {
print(dst - 1, FF, FF);
} else {
System::printLine();
return;
}
}
public:
void print() const {
print(FF, FF, FF);
}
};
int main() {
BlendingTable().print();
return EXIT_SUCCESS;
}
Changing the class definition of System from
class System {
public:
template<typename T, typename... Ts>
static void print(const T& t, const Ts&... ts) {
std::cout << t << std::flush;
print(ts...);
}
static void print() {}
template<typename... Ts>
static void printLine(const Ts&... ts) {
print(ts..., '\n');
}
};
to
class System {
public:
template<typename T, typename... Ts>
static void print(T t, Ts... ts) {
std::cout << t << std::flush;
print(ts...);
}
static void print() {}
template<typename... Ts>
static void printLine(Ts... ts) {
print(ts..., '\n');
}
};
magically allows gcc to eliminate the tail calls.
Why does 'whether or not passing function arguments by reference' make such a big difference in gcc's behaviour? Semantically they both look the same to me in this case.

As noted by @jxh, the cast static_cast<int>() creates a temporary whose reference is passed to the print function. Without such a cast, the tail recursion is optimized correctly.
The issue is very similar to the old case Why isn't g++ tail call optimizing while gcc is? and the workaround may be similar to https://stackoverflow.com/a/31793391/4023446.
It is still possible to use System with arguments passed by reference if the call to System::print is moved to a separate private helper function SystemPrint:
class BlendingTable : public Array<unsigned char, 3> {
//...
private:
void SystemPrint(int dst, int src, int a) const
{
System::print(static_cast<int>((*this)(FF - dst, FF - src, FF - a)), ' ');
}
void print(int dst, int src, int a) const {
SystemPrint(dst, src, a);
if (a > 0) {
print(dst, src, a - 1);
} else if (src > 0) {
print(dst, src - 1, FF);
} else if (dst > 0) {
print(dst - 1, FF, FF);
} else {
System::printLine();
return;
}
}
// ...
};
Now the tail call optimization works (g++ (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2 with the optimization option -O2) and the print does not cause a stack overflow.
Update
I verified it with other compilers:
the original code without any change is perfectly optimized by clang++ Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn) with -O1 optimization
g++ (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4 fails to perform TCO even without the cast or with the SystemPrint wrapper workaround; here only the workaround of passing the System::print arguments by value works.
So, the issue is very specific to compiler versions.

std::string is passing the std::is_fundamental check when it should not - template metaprogramming

I'm having a problem with an assignment of mine. The question for the assignment is as follows:
Write a function template named Interpolate that will make the below work. Each argument will be output when its corresponding % is encountered in the format string. All output should be ultimately done with the appropriate overloaded << operator. A \% sequence should output a percent sign.
SomeArbitraryClass obj;
int i = 1234;
double x = 3.14;
std::string str("foo");
std::cout << Interpolate(R"(i=%, x1=%, x2=%\%, str1=%, str2=%, obj=%)", i, x, 1.001, str, "hello", obj) << std::endl;
If there is a mismatch between the number of percent signs and the number of arguments to output, throw an exception of type cs540::WrongNumberOfArgs.
Now, I've started to write the code to make it work. However, I'm running into a problem using non-PODs. Here is what I have written so far:
#include <iostream>
#include <sstream>
#include <string>
#include <type_traits>
std::string Interpolate(std::string raw_string) {
std::size_t found = raw_string.find_first_of("%");
if(found != std::string::npos && raw_string[found-1] != '\\') {
std::cout << "Throw cs540::ArgsMismatchException" << std::endl;
}
return raw_string;
}
template <typename T, typename ...Args>
std::string Interpolate(std::string raw_string, T arg_head, Args... arg_tail) {
std::size_t found = raw_string.find_first_of("%");
while(found != 0 && raw_string[found-1] == '\\') {
found = raw_string.find_first_of("%", found + 1);
}
if(found == std::string::npos) {
std::cout << "Throw cs540::ArgsMismatchException." << std::endl;
}
// Checking the typeid of the arg_head, and converting it to a string, and concatenating the strings together.
else {
if(std::is_arithmetic<T>::value) {
raw_string = raw_string.substr(0, found) + std::to_string(arg_head) + raw_string.substr(found + 1, raw_string.size());
}
}
return Interpolate(raw_string, arg_tail...);
}
int main(void) {
int i = 24332;
float x = 432.321;
std::string str1("foo");
//Works
std::cout << Interpolate(R"(goo % goo % goo)", i, x) << std::endl;
// Does not work, even though I'm not actually doing anything with the string argument
std::cout << Interpolate(R"(goo %)", str1) << std::endl;
}

Semantically, this is a run-time check. This means that the code in the {} is compiled even if the expression is always false:
if(std::is_arithmetic<T>::value) {
raw_string = raw_string.substr(0, found) + std::to_string(arg_head) + raw_string.substr(found + 1, raw_string.size());
}
To fix this, you can do this:
template<typename T>
void do_arithmetic( std::string& raw_string, std::size_t found, T&& t, std::true_type /* is_arithmetic */ ) {
raw_string = raw_string.substr(0, found) + std::to_string(std::forward<T>(t)) + raw_string.substr(found + 1, raw_string.size());
}
template<typename T>
void do_arithmetic( std::string& raw_string, std::size_t found, T&& t, std::false_type /* is_arithmetic */ ) {
// do nothing
}
then put in your code:
do_arithmetic( raw_string, found, arg_head, std::is_arithmetic<T>() );
which does a compile-time branch. std::is_arithmetic<T> derives from either std::true_type or std::false_type, depending on whether T is arithmetic. This causes a different overload of do_arithmetic to be called.
In C++14 (C++1y) you can do this inline.
template<typename F, typename...Args>
void do_if(std::true_type, F&& f, Args&&... args){
std::forward<F>(f)( std::forward<Args>(args)... );
}
template<typename...Args>
void do_if(std::false_type, Args&&...){
}
template<bool b,typename...Args>
void do_if_not(std::integral_constant<bool,b>, Args&&... args){
do_if( std::integral_constant<bool,!b>{}, std::forward<Args>(args)... );
}
template<typename C, typename F_true, typename F_false, typename...Args>
void branch( C c, F_true&&f1, F_false&& f0, Args&&... args ){
do_if(c, std::forward<F_true>(f1), std::forward<Args>(args)... );
do_if_not(c, std::forward<F_false>(f0), std::forward<Args>(args)... );
}
which is boilerplate. We can then do in our function:
do_if(std::is_arithmetic<T>{},
[&](auto&& arg_head){
raw_string = raw_string.substr(0, found) + std::to_string(arg_head) + raw_string.substr(found + 1, raw_string.size());
},
arg_head
);
or, if you want both branches:
branch(std::is_arithmetic<T>{},
[&](auto&& x){
raw_string = std::to_string(x); // blah blah
}, [&](auto&&) {
// else case
},
arg_head
);
and the first lambda only gets instantiated with x=arg_head if is_arithmetic is true.
Needs polish, but sort of neat.
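With a C++17 compiler, the same compile-time branch can be expressed directly with if constexpr. A minimal sketch, assuming a hypothetical helper to_text that turns one argument into its string form (it is not part of the code in the question):

```cpp
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: converts one argument to text.
template <typename T>
std::string to_text(const T& value)
{
    if constexpr (std::is_arithmetic_v<T>) {
        // Only instantiated for arithmetic T, so std::to_string is valid here.
        return std::to_string(value);
    } else {
        // Everything else goes through its overloaded operator<<.
        std::ostringstream out;
        out << value;
        return out.str();
    }
}
```

Because the branch that is not taken is discarded during instantiation, calling to_text with a std::string never tries to compile std::to_string(std::string).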
