I have a matrix class tailored to the algorithm I need to implement. I know about Eigen, but it doesn't fit my needs, so I had to write my own. So far I have worked exclusively with column-major ordering, but I now have a strong use case for row-major as well. I would like to specialize my matrix class template with an extra template parameter that defines the ordering, without breaking the existing code.
The concrete effect would be to use partial template specialization to generate two or three key member functions differently, e.g. operator()(int i, int j), which define the ordering. Something similar could be done with a preprocessor #define, but that is not very elegant and only works if everything is compiled in one mode or the other. This is a sketch of what I'm trying to accomplish:
enum tmatrix_order {
COLUMN_MAJOR, ROW_MAJOR
};
/**
* Concrete definition of matrix in major column ordering that delegates most
* operations to LAPACK and BLAS functions. This implementation provides support
* for QR decomposition and fast updates. The following sequence of QR updates
* are supported:
*
* 1) [addcol -> addcol] integrates nicely with the LAPACK compact form.
* 2) [addcol -> delcol] delcols holds additional Q, and most to date R
* 3) [delcol -> delcol] delcols holds additional Q, and most to date R
* 4) [delcol -> addcol] delcols Q must also be applied to the new column
* 5) [addcol -> addrow] addrows holds additional Q, R is updated in original QR
* 6) [delcol -> addrow] addrows holds additional Q, R is updated in original QR
*/
template<typename T, tmatrix_order O = COLUMN_MAJOR>
class tmatrix {
private:
// basic matrix structure
T* __restrict m_data;
int m_rows, m_cols;
// ...
};
template <typename T>
inline T& tmatrix<T, COLUMN_MAJOR>::operator()(int i, int j) {
return m_data[j*m_rows + i];
}
template <typename T>
inline const T& tmatrix<T, COLUMN_MAJOR>::operator()(int i, int j) const {
return m_data[j*m_rows + i];
}
template <typename T>
inline T& tmatrix<T, ROW_MAJOR>::operator()(int i, int j) {
return m_data[i*m_cols + j];
}
template <typename T>
inline const T& tmatrix<T, ROW_MAJOR>::operator()(int i, int j) const {
return m_data[i*m_cols + j];
}
But the compiler complains about the partial specialization:
/Users/bravegag/code/fastcode_project/code/src/matrix.h:227:59: error: invalid use of incomplete type 'class tmatrix<T, (tmatrix_order)0u>'
/Users/bravegag/code/fastcode_project/code/src/matrix.h:45:7: error: declaration of 'class tmatrix<T, (tmatrix_order)0u>'
However, if I fully specialize these functions as shown below, it works, but this is very inflexible:
template <>
inline double& tmatrix<double, COLUMN_MAJOR>::elem(int i, int j) {
return m_data[j*m_rows + i];
}
Is this a language partial template specialization support issue or am I using the wrong syntax?
A possible solution:
enum tmatrix_order {
COLUMN_MAJOR, ROW_MAJOR
};
template<typename T>
class tmatrix_base {
protected:
// basic matrix structure
T* __restrict m_data;
int m_rows, m_cols;
};
template<typename T, tmatrix_order O = COLUMN_MAJOR>
class tmatrix : public tmatrix_base<T>{
public:
tmatrix() {this->m_data = new T[5];}
T& operator()(int i, int j) {
return this->m_data[j*this->m_rows + i];
}
const T& operator()(int i, int j) const {
return this->m_data[j*this->m_rows + i];
}
};
template<typename T>
class tmatrix<T, ROW_MAJOR> : public tmatrix_base<T>{
public:
tmatrix() {this->m_data = new T[5];}
T& operator()(int i, int j) {
return this->m_data[i*this->m_cols + j];
}
const T& operator()(int i, int j) const {
return this->m_data[i*this->m_cols + j];
}
};
int main()
{
tmatrix<double, COLUMN_MAJOR> m1;
m1(0, 0);
tmatrix<double, ROW_MAJOR> m2;
m2(0, 0);
}
The idea is that you provide the full class definition, instead of providing only the functions in the specialized class. I think the basic problem is that the compiler does not know what that class's definition is otherwise.
Note: you need the this-> to be able to access the templated base members, otherwise lookup will fail.
Note: the constructor is just there so I could test the main function without blowing up. You will need your own (which I'm sure you already have)
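If duplicating the whole class per ordering is unappealing, a variation on the same idea is to keep a single class and move only the index computation into a small helper that is fully specialized per ordering. This is just a sketch: index_policy, the (rows, cols) constructor, data() and the std::vector storage are my own assumptions, not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum tmatrix_order { COLUMN_MAJOR, ROW_MAJOR };

// Hypothetical helper: one full specialization per ordering.
template <tmatrix_order O>
struct index_policy;

template <>
struct index_policy<COLUMN_MAJOR> {
    static std::size_t at(std::size_t i, std::size_t j,
                          std::size_t rows, std::size_t /*cols*/) {
        return j * rows + i;
    }
};

template <>
struct index_policy<ROW_MAJOR> {
    static std::size_t at(std::size_t i, std::size_t j,
                          std::size_t /*rows*/, std::size_t cols) {
        return i * cols + j;
    }
};

// Single class definition; only the index computation depends on O.
template <typename T, tmatrix_order O = COLUMN_MAJOR>
class tmatrix {
public:
    tmatrix(std::size_t rows, std::size_t cols)
        : m_data(rows * cols), m_rows(rows), m_cols(cols) {}

    T& operator()(std::size_t i, std::size_t j) {
        return m_data[index_policy<O>::at(i, j, m_rows, m_cols)];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        return m_data[index_policy<O>::at(i, j, m_rows, m_cols)];
    }

    // Raw storage access, only here so the layout can be inspected.
    const T* data() const { return m_data.data(); }

private:
    std::vector<T> m_data;  // assumption: owning storage instead of T* __restrict
    std::size_t m_rows, m_cols;
};
```

Existing code that writes tmatrix<double> keeps compiling because the ordering parameter still defaults to COLUMN_MAJOR.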
I'd keep it simple, and write it like this:
template <typename T, tmatrix_order O>
inline T& tmatrix<T, O>::operator()(int i, int j) {
if (O == COLUMN_MAJOR) {
return m_data[j*m_rows + i];
} else {
return m_data[i*m_cols + j];
}
}
While not guaranteed by the language specification, I bet your compiler will optimize out that comparison with a compile-time constant.
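If C++17 is available, the same branch can be written with if constexpr, which makes the compile-time selection explicit instead of relying on the optimizer. A minimal sketch, where linear_index is a hypothetical free function I made up to keep the example standalone:

```cpp
#include <cassert>
#include <cstddef>

enum tmatrix_order { COLUMN_MAJOR, ROW_MAJOR };

// Hypothetical helper: maps (i, j) to a linear offset for ordering O.
// Only the branch matching O is instantiated.
template <tmatrix_order O>
constexpr std::size_t linear_index(std::size_t i, std::size_t j,
                                   std::size_t rows, std::size_t cols) {
    if constexpr (O == COLUMN_MAJOR) {
        return j * rows + i;
    } else {
        return i * cols + j;
    }
}

// Element (1, 2) of a 3 x 4 matrix.
static_assert(linear_index<COLUMN_MAJOR>(1, 2, 3, 4) == 7, "2*3 + 1");
static_assert(linear_index<ROW_MAJOR>(1, 2, 3, 4) == 6, "1*4 + 2");
```

Inside the class, operator() would then simply return m_data[linear_index<O>(i, j, m_rows, m_cols)].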
class Whatever {
public:
// doThing overloads:
template <typename T>
inline static T doThing(T t, float n) {
/* It's a SmoothStartN function in my code,
but don't worry about the specifics.
Includes a for loop up to n times
(result gets interpolated between non-integer ns). */
return whatever;
}
template <unsigned int n, typename T>
inline static T doThing(T t) {
/* Same as the other one, except now the compiler can
unroll the for loop if appropriate.
Or so I assume, anyway; I might be wrong. */
return whatever;
}
// doMoreComplexThing overloads:
template <unsigned int n, typename T>
inline static T doMoreComplexThing(T t1, T t2) {
float halfN = ((float)n) * 0.5f;
return (doThing(t1, halfN) * doThing(t2, halfN));
}
};
My problem: doMoreComplexThing() currently has to use the presumably-less-well-optimised version of doThing() in all cases. However, in half of all cases, where n is even, it can be evenly divided into integers and thus the more efficient template-uint version is viable.
How could I set this up so that, at compile time, doMoreComplexThing() detects whether n is even and uses the appropriate overload? Is such a thing possible? For that matter, is it likely any more performant to bother with this, or should I just stick with the float overload?
Answer: Thanks to Quentin's suggestion, I believe a good solution looks something like this:
template <unsigned int n, typename T>
inline static T doMoreComplexThing(T t1, T t2) {
if constexpr((n % 2u) == 0u) {
constexpr unsigned int halfN = n / 2u; // must be a constant expression to be used as a template argument
return (doThing<halfN>(t1) * doThing<halfN>(t2));
}
else {
float halfN = ((float)n) * 0.5f;
return (doThing(t1, halfN) * doThing(t2, halfN));
}
}
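For anyone who wants to try this out, here is a compilable sketch of the whole class (C++17). The two doThing bodies are stand-ins I made up, a plain power function, since the real SmoothStartN code isn't shown in the question.

```cpp
#include <cassert>
#include <cmath>

struct Whatever {
    // Stand-in for the runtime overload (assumption: plain t^n).
    template <typename T>
    static T doThing(T t, float n) {
        return static_cast<T>(std::pow(t, n));
    }

    // Stand-in for the compile-time overload: n multiplications,
    // with a loop bound the compiler knows at compile time.
    template <unsigned int n, typename T>
    static T doThing(T t) {
        T result = T(1);
        for (unsigned int k = 0; k < n; ++k) {
            result *= t;
        }
        return result;
    }

    template <unsigned int n, typename T>
    static T doMoreComplexThing(T t1, T t2) {
        if constexpr ((n % 2u) == 0u) {
            // Even n: n/2 is an integer, so the template overload applies.
            constexpr unsigned int halfN = n / 2u;
            return doThing<halfN>(t1) * doThing<halfN>(t2);
        } else {
            // Odd n: fall back to the float overload.
            const float halfN = static_cast<float>(n) * 0.5f;
            return doThing(t1, halfN) * doThing(t2, halfN);
        }
    }
};
```

Whether the even branch is actually faster is something only a benchmark on the real doThing can tell.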
I have a templated class which has its own storage, but can also be used to 'view' (or even modify) a bigger array using a pointer to some position in the array. Viewing the bigger array is done using e.g. (see full example below):
Tensor<double> t;
t.view(&array[i]);
When the array is marked const the following can be used
Tensor<const double> t;
t.view(&array[i]);
My problem is now:
I want to write a function with ONE template argument for Tensor<...>, that can be used to map the const-array and to modify a copy of the map. How can I remove the const from a 'nested template'? Or, if that is not possible, how can I use the map without marking it const in the template?
Note that I have currently no conversion from Tensor<const double> to Tensor<double>.
A simple, complete, example showing this behavior is
#include <iostream>
#include <vector>
template<class T>
class Tensor
{
private:
T m_container[4];
T *m_data;
public:
// constructor
Tensor() { m_data = &m_container[0];};
// copy constructor
Tensor(const Tensor& other)
{
for ( auto i = 0 ; i < 4 ; ++i )
m_container[i] = other[i];
m_data = &m_container[0];
}
// index operator
T& operator[](size_t i) { return m_data[i]; }
const T& operator[](size_t i) const { return m_data[i]; }
// point to external object
void view(T *data) { m_data = data; }
};
template<class T>
T someOperation(std::vector<double> &input, size_t i)
{
T t;
t.view(&input[i*4]);
// ... some code that uses "t" but does not modify it
T s = t;
return s;
}
int main()
{
std::vector<double> matrix = { 1., 2., 3., 4., 11., 12., 13., 14. };
Tensor<double> tensor = someOperation<Tensor<double>>(matrix, 1);
return 0;
}
Compiled for example using clang++ -std=c++14 so.cpp.
Now I want to change the signature of the function to
template<class T>
T someOperation(const std::vector<double> &input, size_t i)
The above function can be used using
someOperation<Tensor<const double>>(...)
But obviously I cannot change s anymore. How can I solve this?
Consider std::remove_const:
template<typename T>
void f(T& t);
double const d = 1.0;
f(d);
template<typename T>
void f(T& t)
{
T ct = t;
//ct += 1.0; // refuses to compile!
typename std::remove_const<T>::type nct = t;
nct += 1.0; // fine...
}
Edit: OK, only half of the truth...
With your example provided, matter gets a little more complicated, as you need to exchange the inner template type...
This can be done with a template template parameter:
template<template < class > class T, typename V>
auto someOperation(std::vector<double>& input, size_t i)
{
T<V> t;
t.view(&input[i*4]);
T<typename std::remove_const<V>::type> s = t;
return s;
}
However, this imposes quite some trouble on you:
Constant members cannot be initialized in the constructor body, so you need:
Tensor(Tensor const& other)
: m_container { other[0], other[1], other[2], other[3] },
m_data(m_container)
{ }
Tensor<double> and Tensor<double const> are entirely different types, so they need to be constructible from one another:
Tensor(Tensor<typename std::remove_const<T>::type> const& other);
Tensor(Tensor <T const> const& other);
// both with same implementation as above
We don't need all combinations, but we get them for free... Alternatively, a template constructor:
template<typename TT>
Tensor(Tensor<TT> const& other);
This would even allow you to initialize e.g. a double tensor from an int tensor; whether that is desirable is up to you.
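Putting the pieces together, a complete version might look like the sketch below. It uses the template-constructor variant; the parameter name TensorT and the value-initialising default constructor are my own choices, and the return type is spelled out so that it also compiles as C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

template <class T>
class Tensor {
    T m_container[4];
    T* m_data;

public:
    // Value-initialise the storage so that T may be const-qualified.
    Tensor() : m_container{}, m_data(m_container) {}

    // Copy constructor: copies the four viewed values into own storage.
    Tensor(const Tensor& other)
        : m_container{other[0], other[1], other[2], other[3]},
          m_data(m_container) {}

    // Converting constructor, e.g. Tensor<double> from Tensor<const double>.
    template <class U>
    Tensor(const Tensor<U>& other)
        : m_container{other[0], other[1], other[2], other[3]},
          m_data(m_container) {}

    T& operator[](std::size_t i) { return m_data[i]; }
    const T& operator[](std::size_t i) const { return m_data[i]; }

    // Point to external data instead of the internal storage.
    void view(T* data) { m_data = data; }
};

// TensorT<V> views the input; the result is a mutable copy.
template <template <class> class TensorT, class V>
TensorT<typename std::remove_const<V>::type>
someOperation(const std::vector<double>& input, std::size_t i) {
    TensorT<V> t;
    t.view(&input[i * 4]);  // requires V to be const-qualified
    // ... some code that uses "t" but does not modify it
    TensorT<typename std::remove_const<V>::type> s = t;
    return s;
}
```

It would be called as someOperation<Tensor, const double>(matrix, 1), and the returned Tensor<double> can be modified without touching the original vector.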
Suppose I have the following Matrix template class and there is a requirement to represent vector as either 1 x RowSize or ColSize x 1 matrix (so that I can reuse many matrix operators which are compatible with vectors: multiplying 2 matrices, multiplying matrix by a scalar etc):
template <class T, size_t ColumnSize, size_t RowSize>
struct Matrix {
T data[ColumnSize][RowSize];
};
I have two questions:
1) If I am not mistaken I can achieve that either by partial specialization or using SFINAE on Matrix methods (for example to enable 'length' method when either ColSize or RowSize is 1). What are the pros and cons of mentioned options?
2) If I choose to go with the partial specialization, is there a way to define one specialization for both row and column vectors, instead of this:
template <class T, size_t ColumnSize>
struct Matrix<T, ColumnSize, 1> {
T length() const;
T data[ColumnSize][1];
};
template <class T, size_t RowSize>
struct Matrix<T, 1, RowSize> {
T length() const;
T data[1][RowSize];
};
It really depends on whether the requirement is "a general Matrix must not have a length method" (then SFINAE or inheritance should be used), or "length must not be called on a general Matrix" (then a static_assert inside of the length body is applicable). A third option is to not do anything and make length applicable on generic matrices, however there are still other operations that only work on vectors.
For "a general Matrix must not have a length method": to save space, I will use int and shorter symbol names. Instead of int_, you should use std::integral_constant. The int_ wrapper is needed because of language restrictions that forbid specializing with more complex computations if the parameter is a non-type parameter. Therefore we make the parameter a type and wrap the value in it. The following does not use SFINAE, but inheritance. With d() of the vector mixin base class, you can access the data of the vector at any time from within the mixin class.
template<int> struct int_;
template<typename D, typename S>
struct V { };
template<typename T, int A, int B>
struct M : V<M<T, A, B>, int_<A * B>> {
T data[A][B];
};
template<typename T, int A, int B>
struct V<M<T, A, B>, int_<A + B - 1>> {
int length() const { return A * B; }
M<T, A, B> *d() { return static_cast<M<T, A, B>*>(this); }
const M<T, A, B> *d() const { return static_cast<const M<T, A, B>*>(this); }
};
With this, the following works:
int main() {
M<float, 1, 3> m1; m1.length();
M<float, 3, 1> m2; m2.length();
// M<float, 3, 2> m3; m3.length(); error
}
For "length must not be called on a general Matrix", you can use "static_assert"
template<typename T, int A, int B>
struct M {
int length() const {
static_assert(A == 1 || B == 1, "must not be called on a matrix!");
return A * B;
}
T data[A][B];
};
Choose whichever is most appropriate.
SFINAE is only able to disable a template declaration based on its own parameters. It's a bit unnatural to disable a non-template member function such as length, using the parameters of the enclosing class. The technique looks like this:
template <class T, size_t RowSize, size_t ColumnSize>
struct Matrix {
// SFINAE turns a non-template into a template.
// Introduce a fake dependency so enable_if resolves upon function call.
template< typename size_t_ = size_t >
static constexpr
// Now write the actual condition within the return type.
std::enable_if_t< RowSize == 1 || ColumnSize == 1
, size_t_ > length()
{ return RowSize * ColumnSize; }
T data[RowSize][ColumnSize];
};
If you can stomach this ugliness, then you get exactly what you want: a function of the desired type, which completely vanishes when the condition is not met. No other support is needed.
On the other hand, partial specialization affects the entire class definition. Since it's usually poor design to duplicate the entire class in each partial specialization, inheritance is used as Johannes describes.
Just to add one alternative to his answer, SFINAE can be used within partial specialization, to avoid the clever algebra and the int_ issue.
// Add "typename = void" for idiomatic class SFINAE.
template<size_t RowSize, size_t ColumnSize, typename = void>
struct maybe_vector_interface { }; // Trivial specialization for non-vectors
// Partial specialization for vectors:
template<size_t RowSize, size_t ColumnSize>
struct maybe_vector_interface< RowSize, ColumnSize,
std::enable_if_t< RowSize == 1 || ColumnSize == 1 > > {
static constexpr int length() // a static member function cannot be const-qualified
{ return RowSize * ColumnSize; }
};
template<typename T, size_t RowSize, size_t ColumnSize>
struct Matrix
: maybe_vector_interface<RowSize, ColumnSize> {
T data[RowSize][ColumnSize];
};
I have just started to use template meta-programming in my code. I have a class with a member that is a vector of multi-dimensional Cartesian points. Here is a basic setup of the class:
template<size_t N>
class TGrid{
public:
void round_points_3(){
for(std::size_t i = 0; i < Xp.size();i++){
Xp[i][0] = min[0] + (std::floor((Xp[i][0] - min[0]) * nbins[0] / (max[0] - min[0])) * bin_w[0]) + bin_w[0]/2.0;
Xp[i][1] = min[1] + (std::floor((Xp[i][1] - min[1]) * nbins[1] / (max[1] - min[1])) * bin_w[1]) + bin_w[1]/2.0;
Xp[i][2] = min[2] + (std::floor((Xp[i][2] - min[2]) * nbins[2] / (max[2] - min[2])) * bin_w[2]) + bin_w[2]/2.0;
}
}
void round_points_2(){
for(std::size_t i = 0; i < Xp.size();i++){
Xp[i][0] = min[0] + (std::floor((Xp[i][0] - min[0]) * nbins[0] / (max[0] - min[0])) * bin_w[0]) + bin_w[0]/2.0;
Xp[i][1] = min[1] + (std::floor((Xp[i][1] - min[1]) * nbins[1] / (max[1] - min[1])) * bin_w[1]) + bin_w[1]/2.0;
}
}
void round_points_1(){
for(std::size_t i = 0; i < Xp.size();i++){
Xp[i][0] = min[0] + (std::floor((Xp[i][0] - min[0]) * nbins[0] / (max[0] - min[0])) * bin_w[0]) + bin_w[0]/2.0;
}
}
public:
std::vector<std::array<double,N> > Xp;
std::vector<double> min, max, nbins, bin_w;
};
This class represents a multidimensional grid. The dimension is specified by the template value N. Many operations could be made more efficient with member functions tailored to a specific dimension, for example through loop unrolling.
In the class TGrid, I have 3 functions specific to the dimensions N=1, N=2 and N=3, indicated by the suffixes _1, _2 and _3 of the function names.
I am looking for a template meta-programming oriented approach to write these three functions more compactly.
I have seen examples of loop unrolling, but none of them consider member functions of a class template.
Putting to one side the question of whether or not this is an appropriate optimisation, or if other optimisations should be regarded first, this is how I would do it. (But I do agree, sometimes it is demonstrably better to explicitly unroll loops — the compiler isn't always the best judge.)
One can't partially specialize a member function, and one can't specialize a nested struct without specializing the outer struct, so the only solution is to use a separate templated struct for the unrolling mechanism. Feel free to put this in some other namespace :)
The unrolling implementation:
template <int N>
struct sequence {
template <typename F,typename... Args>
static void run(F&& f,Args&&... args) {
sequence<N-1>::run(std::forward<F>(f),std::forward<Args>(args)...);
f(args...,N-1);
}
};
template <>
struct sequence<0> {
template <typename F,typename... Args>
static void run(F&& f,Args&&... args) {}
};
This takes an arbitrary functional object and a list of arguments, and then calls the object with the arguments and an additional final argument N times, where the final argument ranges from 0 to N-1. The universal references and variadic templates are not necessary; the same idea can be employed in C++98 with less generality.
round_points<K> then calls sequence<K>::run with a helper static member function:
template <size_t N>
class TGrid {
public:
template <size_t K>
void round_points(){
for (std::size_t i = 0; i < Xp.size();i++) {
sequence<K>::run(TGrid<N>::round_item,*this,i);
}
}
static void round_item(TGrid &G,int i,int j) {
G.Xp[i][j] = G.min[j] + (std::floor((G.Xp[i][j] - G.min[j]) * G.nbins[j] / (G.max[j] - G.min[j])) * G.bin_w[j]) + G.bin_w[j]/2.0;
}
// ...
};
Edit: Addendum
Doing the equivalent with a pointer-to-member function appears to be hard for compilers to inline. As an alternative, to avoid the use of a static round_item, you can use a lambda, e.g.:
template <size_t N>
class TGrid {
public:
template <size_t K>
void round_points(){
for (std::size_t i = 0; i < Xp.size();i++) {
sequence<K>::run([&](int j) {round_item(i,j);});
}
}
void round_item(int i,int j) {
Xp[i][j] = min[j] + (std::floor((Xp[i][j] - min[j]) * nbins[j] / (max[j] - min[j])) * bin_w[j]) + bin_w[j]/2.0;
}
// ...
};
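For reference, here is the lambda variant as one self-contained sketch. I've trimmed sequence::run to take only the callable, since the lambda captures everything else, and added a static_assert on K that is my own addition.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Compile-time recursion: calls f(0), f(1), ..., f(N-1) in order.
template <int N>
struct sequence {
    template <typename F>
    static void run(F&& f) {
        sequence<N - 1>::run(f);
        f(N - 1);
    }
};

template <>
struct sequence<0> {
    template <typename F>
    static void run(F&&) {}
};

template <std::size_t N>
class TGrid {
public:
    // Rounds the first K coordinates of every point to its bin centre.
    template <std::size_t K>
    void round_points() {
        static_assert(K <= N, "cannot round more coordinates than N");
        for (std::size_t i = 0; i < Xp.size(); ++i) {
            sequence<static_cast<int>(K)>::run(
                [&](int j) { round_item(i, j); });
        }
    }

    void round_item(std::size_t i, int j) {
        Xp[i][j] = min[j]
                 + std::floor((Xp[i][j] - min[j]) * nbins[j]
                              / (max[j] - min[j])) * bin_w[j]
                 + bin_w[j] / 2.0;
    }

    std::vector<std::array<double, N>> Xp;
    std::vector<double> min, max, nbins, bin_w;
};
```

With min = 0, max = 10, nbins = 10 and bin_w = 1 in both dimensions, the point (2.3, 7.9) ends up at the bin centre (2.5, 7.5) after round_points<2>().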
Here are two template functions, that differ only in their template parameters. The rest of the parameters are exactly the same.
template<int module>
void template_const(int &a,int & b){
a = a & module;
b = b % module;
}
template<bool x>
void template_const(int &a,int & b){
int w;
if (x){
w = 123;
}
else w = 512;
a = a & w;
b = b % w;
}
When I try to call them like this
template_const<true>(a,b)
or
template_const<123>(a,b)
the compiler tells me that the call is ambiguous. How can I call these two functions?
As @jogojapan pointed out, the problem is that the compiler cannot order these two functions, i.e. there is not one that is more specialized than the other. As explained in §14.5.6.2, when a call to an overloaded function template is ambiguous, the compiler uses a partial ordering between the various overloads to select the most specialized one.
To order the overloads, the compiler transforms each one of them and performs template argument deduction to see if one is more specialized than another one (there is a short explanation at the end of this answer). In your case, the two overloads are equivalent (or not comparable): template<int> void template_const(int &,int &) is not more specialized than template<bool> void template_const(int &, int &), and vice-versa.
Therefore, the compiler cannot select one over the other, hence generating an ambiguous call error.
If you are OK with explicitly specifying the type of the parameter you want to pass, you can use partial template specialization as follows:
template<typename T, T param>
struct template_const_impl;
template <int module>
struct template_const_impl<int, module>
{
static void apply(int &a, int &b)
{
a = a & module;
b = b % module;
}
};
template<bool x>
struct template_const_impl<bool, x>
{
static void apply(int &a, int &b)
{
const int w = x ? 123 : 512;
a = a & w;
b = b % w;
}
};
template <typename T, T param>
void template_const(int &a, int &b)
{
return template_const_impl<T, param>::apply(a, b);
}
int main()
{
int i = 512, j = 256;
template_const<int, 123>(i, j);
template_const<bool, true>(i, j);
}
This is not ideal, but I don't think there is a cleaner solution unless you can use C++11 and are willing to rely on some macros, in which case you can simplify the calling code a bit (idea taken from @Nawaz in this answer):
#define TEMPLATE_CONST(x) template_const<decltype(x), x>
int main()
{
int i = 512, j = 256;
TEMPLATE_CONST(123)(i, j);
TEMPLATE_CONST(true)(i, j);
}
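If C++17 is an option, a single template <auto> parameter combined with if constexpr should give the call syntax from the question without a macro. This goes beyond what the answer above relies on, so treat it as a sketch:

```cpp
#include <cassert>
#include <type_traits>

// One template; the type of the non-type argument selects the behaviour.
template <auto Param>
void template_const(int& a, int& b) {
    if constexpr (std::is_same_v<decltype(Param), bool>) {
        // bool argument: choose the constant from the flag.
        constexpr int w = Param ? 123 : 512;
        a = a & w;
        b = b % w;
    } else {
        static_assert(std::is_same_v<decltype(Param), int>,
                      "expected an int or bool template argument");
        // int argument: use it directly.
        a = a & Param;
        b = b % Param;
    }
}
```

Both template_const<true>(a, b) and template_const<123>(a, b) then resolve to the same template, so there is nothing left to be ambiguous.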
I do not think it is going to work like this. You have overloads with the same parameter types. You will probably have to give them different names in the end and call them the way you tried.