More efficient way with multi-dimensional arrays (matrices) using C++

I wrote a small program that compares the performance of several ways to fill an 8x8 matrix using different kinds of containers. Here is the code:
#define MATRIX_DIM 8
#define OCCUR_MAX 100000
static void genHeapAllocatedMatrix(void)
{
Pixel **pPixels = new Pixel *[MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
pPixels[idy] = new Pixel[MATRIX_DIM];
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++)
pPixels[idy][idx] = 42;
}
}
static void genStackAllocatedMatrix(void)
{
std::array<std::array<int, 8>, 8> matrix;
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
matrix[idy][idx] = 42;
}
}
}
static void genStackAllocatedMatrixBasic(void)
{
int matrix[MATRIX_DIM][MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
matrix[idy][idx] = 42;
}
}
}
int main(void)
{
clock_t begin, end;
double time_spent;
begin = clock();
for (type::uint32 idx = 0; idx < OCCUR_MAX; idx++)
{
//genHeapAllocatedMatrix();
genStackAllocatedMatrix();
//genStackAllocatedMatrixBasic();
}
end = clock();
time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
std::cout << "Elapsed time = " << time_spent << std::endl;
return (0);
}
As you might guess, the most efficient way is the last one, with a plain two-dimensional C array (hard-coded size). Of course, the worst choice is the first one, which uses heap allocations.
My problem is that I want to store this two-dimensional array as an attribute in a class. Here is the definition of a custom class that handles a matrix:
template <typename T>
class Matrix
{
public:
Matrix(void);
Matrix(type::uint32 column, type::uint32 row);
Matrix(Matrix const &other);
virtual ~Matrix(void);
public:
Matrix &operator=(Matrix const &other);
bool operator!=(Matrix const &other);
bool operator==(Matrix const &other);
type::uint32 rowCount(void) const;
type::uint32 columnCount(void) const;
void printData(void) const;
T **getData(void) const;
void setData(T **matrix);
private:
type::uint32 m_ColumnCount;
type::uint32 m_RowCount;
T **m_pMatrix;
};
To get this done, I tried the following, using a cast:
Matrix<int> matrix;
int tab[MATRIX_DIM][MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
tab[idy][idx] = 42;
}
}
matrix.setData((int**)&tab[0][0]);
This code compiles, but printing the matrix causes a segmentation fault:
int tab[MATRIX_DIM][MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
tab[idy][idx] = 42;
}
}
int **matrix = (int**)&tab[0][0];
std::cout << matrix[0][0] << std::endl; //Segmentation fault
Is there a way to store this kind of two-dimensional array as an attribute without heap allocation?

That's because a two-dimensional array is not an array of pointers.
So, you should use int * for your matrix type, but then of course you will not be able to index it by two dimensions.
Another option is to store a pointer to the array:
int (*matrix)[MATRIX_DIM][MATRIX_DIM];
matrix = &tab;
std::cout << (*matrix)[0][0] << std::endl;
But that doesn't fit well with the idea of encapsulating the matrix in a class. A better idea would be for the class to allocate the storage itself (possibly in a single heap allocation) and to provide access to the matrix through methods only (e.g. GetCell(row, col) etc.), without exposing raw pointers.
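If the dimensions are known at compile time, the same encapsulation idea can work without any heap allocation. Here is a minimal sketch, assuming a hypothetical `FixedMatrix` class (not from the question) that keeps its cells in a `std::array` member and only exposes them through a `cell(row, col)` accessor:

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size matrix: the dimensions are template parameters, so
// the storage is a plain member array and no heap allocation takes place.
template <typename T, std::size_t Rows, std::size_t Cols>
class FixedMatrix {
public:
    // Fill every cell with the same value.
    void fill(const T& value) { cells_.fill(value); }

    // Row-major access; this sketch does no bounds checking.
    T& cell(std::size_t row, std::size_t col) { return cells_[row * Cols + col]; }
    const T& cell(std::size_t row, std::size_t col) const { return cells_[row * Cols + col]; }

    static constexpr std::size_t rowCount() { return Rows; }
    static constexpr std::size_t columnCount() { return Cols; }

private:
    std::array<T, Rows * Cols> cells_{};  // value-initialized, lives inside the object
};
```

A `FixedMatrix<int, 8, 8>` declared as a local variable or as a class member then lives wherever its owner lives. The trade-off is that the size is part of the type, so it cannot be chosen at run time.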

Measuring the speed of operations on an 8 x 8 array is largely pointless. For a data set as small as that, the cost of the operation will be close to zero and you are mostly measuring setup time, etc.
Timings become important for larger data sets, but you cannot sensibly extrapolate the small set results to the larger set. With larger data sets you will often find that the data exists on multiple memory pages. There is a danger that paging costs will dominate other costs. Very large improvements in efficiency are possible by ensuring that your algorithm processes all (or most) of the data on one page before moving to the next page, rather than constantly swapping pages.
In general, you are best off using the simplest data structures, with the least likelihood of programming error, and optimising the processing algorithms. I say "in general" because extreme cases do exist where small differences in access time matter, but they are rare.
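To illustrate the point about processing data in the order it sits in memory, here is a small sketch. It assumes a row-major matrix stored in one contiguous `std::vector<int>`; both function names are made up for this example. The two functions compute the same sum and differ only in traversal order:

```cpp
#include <cstddef>
#include <vector>

// Sum a rows x cols matrix stored row-major in one contiguous vector,
// visiting the elements in the same order they are laid out in memory.
long long sumInStorageOrder(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];  // consecutive addresses
    return total;
}

// Same result, but every step jumps cols elements ahead, so a large matrix
// is touched page after page instead of one page at a time.
long long sumAcrossStorageOrder(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];  // stride of cols elements
    return total;
}
```

For an 8 x 8 matrix the difference should be negligible. Any timing comparison only becomes meaningful on data sets much larger than the cache, and has to be measured on the target machine.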

Use a single array to represent the matrix instead of allocating for each index.
I've written a class for this already. Feel free to use it:
#include <vector>
template<typename T, typename Allocator = std::allocator<T>>
class DimArray
{
private:
int Width, Height;
std::vector<T, Allocator> Data;
public:
DimArray(int Width, int Height);
DimArray(T* Data, int Width, int Height);
DimArray(T** Data, int Width, int Height);
virtual ~DimArray() {}
DimArray(const DimArray &da);
DimArray(DimArray &&da);
inline std::size_t size() {return Data.size();}
inline std::size_t size() const {return Data.size();}
inline int width() {return Width;}
inline int width() const {return Width;}
inline int height() {return Height;}
inline int height() const {return Height;}
inline T* operator [](const int Index) {return Data.data() + Height * Index;}
inline const T* operator [](const int Index) const {return Data.data() + Height * Index;}
inline DimArray& operator = (DimArray da);
};
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(int Width, int Height) : Width(Width), Height(Height), Data(Width * Height, 0) {}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(T* Data, int Width, int Height) : Width(Width), Height(Height), Data(Width * Height, 0) {std::copy(&Data[0], &Data[0] + Width * Height, const_cast<T*>(this->Data.data()));}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(T** Data, int Width, int Height) : Width(Width), Height(Height), Data(Width * Height, 0) {std::copy(Data[0], Data[0] + Width * Height, const_cast<T*>(this->Data.data()));}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(const DimArray &da) : Width(da.Width), Height(da.Height), Data(da.Data) {}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(DimArray &&da) : Width(std::move(da.Width)), Height(std::move(da.Height)), Data(std::move(da.Data)) {}
template<typename T, typename Allocator>
DimArray<T, Allocator>& DimArray<T, Allocator>::operator = (DimArray<T, Allocator> da)
{
this->Width = da.Width;
this->Height = da.Height;
this->Data.swap(da.Data);
return *this;
}
Usage:
int main()
{
DimArray<int> Matrix(1000, 1000); //creates a 1000 * 1000 matrix.
Matrix[0][0] = 100; //ability to index it like a multi-dimensional array.
}
More usage:
template<typename T, std::size_t size>
class uninitialised_stack_allocator : public std::allocator<T>
{
private:
alignas(16) T data[size];
public:
typedef typename std::allocator<T>::pointer pointer;
typedef typename std::allocator<T>::size_type size_type;
typedef typename std::allocator<T>::value_type value_type;
template<typename U>
struct rebind {typedef uninitialised_stack_allocator<U, size> other;};
pointer allocate(size_type n, const void* hint = 0) {return static_cast<pointer>(&data[0]);}
void deallocate(void* ptr, size_type n) {}
size_type max_size() const {return size;}
};
int main()
{
DimArray<int, uninitialised_stack_allocator<int, 1000 * 1000>> Matrix(1000, 1000);
}

Related

C++ Multidimensional array in existing memory

(This is not a duplicate of this or this that refer to fixed sizes, the issue is not to understand how pointers are stored, but if the compiler can automate the manual function).
Based on this SO question multidimensional arrays are stored sequentially.
// These arrays are the same
int array1[3][2] = {{0, 1}, {2, 3}, {4, 5}};
int array2[6] = { 0, 1, 2, 3, 4, 5 };
However I'm trying to create a 2 dimension array of floats in pre-allocated memory:
float a[5][10];
float b[50]; // should be same memory
Then I'm trying:
vector<char> x(1000);
float** a = (float**)x.data();
a[0][1] = 5;
The above code crashes, obviously because the compiler does not know the size of the array to allocate it in memory like in the compiler-level known array in the first example.
Is there a way to tell the compiler to allocate a multi dimensional array in sequential memory without manually calculating the pointers (say, by manually shifting the index and calling placement new for example)?
Currently, I'm doing it manually, for example:
template <typename T> size_t CreateBuffersInMemory(char* p,int n,int BufferSize)
{
// ib = T** to store the data
int ty = sizeof(T);
int ReqArraysBytes = n * sizeof(void*);
int ReqT = ReqArraysBytes + n * (ty*BufferSize); // pointer table plus n buffers
if (!p)
return ReqT;
memset(p, 0, ReqT);
ib = (T**)p;
p += n * sizeof(void*);
for (int i = 0; i < n; i++)
{
ib[i] = (T*)p;
p += ty*BufferSize;
}
return ReqT;
}
Thanks a lot.
To allocate T[rows][cols] array as a one-dimensional array allocate T[rows * cols].
To access element [i][j] of that one-dimensional array you can do p[i * cols + j].
Example:
template<class T>
struct Array2d {
T* elements_;
unsigned columns_;
Array2d(unsigned rows, unsigned columns)
: elements_(new T[rows * columns]{}) // Allocate and value-initialize.
, columns_(columns)
{}
T* operator[](unsigned row) {
return elements_ + row * columns_;
}
// TODO: Implement the special member functions.
};
int main() {
Array2d<int> a(5, 10);
a[3][1] = 0;
}
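One way to sidestep the "special member functions" TODO is to let a `std::vector` own the storage, so that copying, moving and destruction are generated correctly by the compiler. A minimal sketch of that variant, with the hypothetical name `Array2dVec` (it is not part of the answer above):

```cpp
#include <cstddef>
#include <vector>

// Variant of Array2d whose storage is owned by std::vector, so the
// compiler-generated copy, move and destructor already do the right thing.
template <class T>
class Array2dVec {
    std::vector<T> elements_;  // rows * columns elements, row-major
    std::size_t columns_;
public:
    Array2dVec(std::size_t rows, std::size_t columns)
        : elements_(rows * columns)  // value-initialized elements
        , columns_(columns)
    {}
    // Returns a pointer to the first element of the row, so a[i][j] works.
    T* operator[](std::size_t row) { return elements_.data() + row * columns_; }
    const T* operator[](std::size_t row) const { return elements_.data() + row * columns_; }
};
```

Usage is the same as before: `Array2dVec<int> a(5, 10); a[3][1] = 0;`. Note that this sketch does not work for `T = bool`, because `std::vector<bool>` has no `data()` member.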
Your code invokes undefined behavior because x.data() does not point to an array of pointers but to an array of 1000 objects of type char. You should be thankful that it crashes… ;-)
One way to access a contiguous buffer of some type as if it was a multidimensional array is to have another object that represents a multidimensional view into this buffer. This view object can then, e.g., provide member functions to access the data using a multidimensional index. To enable the a[i][j][k] kind of syntax (which you seem to be aiming for), provide an overloaded [] operator which returns a proxy object that itself offers an operator [] and so on until you get down to a single dimension.
For example, for the case that dimensions are fixed at compile time, we can define
template <int Extent, int... Extents>
struct row_major_layout;
template <int Extent>
struct row_major_layout<Extent>
{
template <typename T>
static auto view(T* data) { return data; }
};
template <int Extent, int... Extents>
struct row_major_layout
{
static constexpr int stride = (Extents * ... * 1);
static constexpr int size = Extent * stride; // total number of elements
template <typename T>
class span
{
T* data;
public:
span(T* data) : data(data) {}
auto operator[](std::size_t i) const
{
return row_major_layout<Extents...>::view(data + i * stride);
}
};
template <typename T>
static auto view(T* data) { return span<T>(data); }
};
and then simply create and access such a row_major_layout view
void test()
{
constexpr int M = 7, N = 2, K = 5;
std::vector<int> bla(row_major_layout<M, N, K>::size);
auto a3d = row_major_layout<M, N, K>::view(data(bla));
a3d[2][1][3] = 42;
}
Or in case the array bounds are dynamic:
template <int D>
class row_major_layout;
template <>
class row_major_layout<1>
{
public:
row_major_layout(std::size_t extent) {}
static constexpr std::size_t size(std::size_t extent)
{
return extent;
}
template <typename T>
friend auto view(T* data, const row_major_layout&)
{
return data;
}
};
template <int D>
class row_major_layout : row_major_layout<D - 1>
{
std::size_t stride;
public:
template <typename... Dim>
row_major_layout(std::size_t extent, Dim&&... extents)
: row_major_layout<D - 1>(std::forward<Dim>(extents)...), stride((extents * ... * 1))
{
}
template <typename... Dim>
static constexpr std::size_t size(std::size_t extent, Dim&&... extents)
{
return extent * row_major_layout<D - 1>::size(std::forward<Dim>(extents)...);
}
template <typename T>
class span
{
T* data;
std::size_t stride;
const row_major_layout<D - 1>& layout;
public:
span(T* data, std::size_t stride, const row_major_layout<D - 1>& layout)
: data(data), stride(stride), layout(layout)
{
}
auto operator[](std::size_t i) const
{
return view(data + i * stride, layout);
}
};
template <typename T>
friend auto view(T* data, const row_major_layout& layout)
{
return span<T>(data, layout.stride, layout);
}
};
and
void test(int M, int N, int K)
{
row_major_layout<3> layout(M, N, K); // must outlive the view, which keeps a reference to it
std::vector<int> bla(row_major_layout<3>::size(M, N, K));
auto a3d = view(data(bla), layout);
a3d[2][1][3] = 42;
}
Based on this answer assuming you want an array of char you can do something like
std::vector<char> x(1000);
char (&ar)[200][5] = *reinterpret_cast<char (*)[200][5]>(x.data());
Then you can use ar as a normal two-dimensional array, like
char c = ar[2][3];
For anyone trying to achieve the same, I've created a variadic template function that creates an n-dimensional array in existing memory:
template <typename T = char> size_t CreateArrayAtMemory(void*, size_t bs)
{
return bs*sizeof(T);
}
template <typename T = char,typename ... Args>
size_t CreateArrayAtMemory(void* p, size_t bs, Args ... args)
{
size_t R = 0;
size_t PS = sizeof(void*);
char* P = (char*)p;
char* P0 = (char*)p;
size_t BytesForAllPointers = bs*PS;
R = BytesForAllPointers;
char* pos = P0 + BytesForAllPointers;
for (size_t i = 0; i < bs; i++)
{
char** pp = (char**)P;
if (p)
*pp = pos;
size_t RLD = CreateArrayAtMemory<T>(p ? pos : nullptr, args ...);
P += PS;
R += RLD;
pos += RLD;
}
return R;
}
Usage:
Create a 2x3x4 char array:
int j = 0;
size_t n3 = CreateArrayAtMemory<char>(nullptr,2,3,4);
std::vector<char> a3(n3);
char*** f3 = (char***)a3.data();
CreateArrayAtMemory<char>(f3,2,3,4);
for (int i1 = 0; i1 < 2; i1++)
{
for (int i2 = 0; i2 < 3; i2++)
{
for (int i3 = 0; i3 < 4; i3++)
{
f3[i1][i2][i3] = j++;
}
}
}

How to return a well defined section of memory? for instance a color value of a pixel from image data

I need to perform some manipulation on images. The images can be color/greyscale and 8-bit/16-bit. I want to do it without using any third party library (opencv, IPP etc).
I have:
The image data as void *.
Width and height of the image.
Number of channels
Bit Resolution of each color channel.
I was thinking of having following structures to represent Color and Image.
Color structure
template<typename ColorDataType, std::size_t channelCount = 3, std::size_t bitResolution = 8>
struct Color
{
using DataType = ColorDataType;
std::array<DataType, channelCount> colorData;
};
Image structure
template<typename ColorType>
class Image
{
std::size_t width;
std::size_t height;
typename ColorType::DataType * data; // Or a unique_ptr<DataType[]>; I haven't decided on the ownership yet.
public:
Image(std::size_t inWidth, std::size_t inHeight, typename ColorType::DataType * inData)
: width(inWidth), height(inHeight), data(inData)
{}
ColorType & GetColor(std::size_t row, std::size_t col)
{
// How can I return a color element here that could be manipulated from the receiving side?
// The change should be reflected in the memory addressed by data pointer.
}
};
What would be the best way to return a section of the image data so that I can manipulate it later? I would also like to have a const mechanism for the same, in case my Image is const.
Additional information:
The requirement is more about treating the given data stream as a 3-D matrix whose dimensions are:
Row of the image
Column of the image
Color of the image
The 3rd dimension, Color, can have 1 or 3 elements depending on whether the image is a gray-scale image or RGB image. I would like to have a way to access the Color element of an image based on the row and column provided and be able to edit the real data in the data stream based on the Color element received.
By doing so, I was hoping to represent Color as a separate entity, so that I can say "Color has a DataType, channelCount and bitResolution" And Image is made of a particular Color Type
I'm assuming your Color class knows everything about itself to be constructable just from a pointer to your memory. In order for it to actually manipulate your image, pass iterators into data to the constructor of Color and store these iterators. Every time you manipulate your color object, make sure (beware thread-safety!) to also manipulate the data behind the iterators.
If you don't want to force your receiver to accept references, you can also return decltype(auto) from GetColor.
I used something similar in a 2D-array class for which I wanted something like an operator[][]:
#include <cassert>
#include <vector>
template <typename T> class Matrix {
std::vector<T> _matrix;
const int _rows;
const int _cols;
public:
Matrix(int rows, int cols, T def) : _matrix(rows * cols, def), _rows(rows), _cols(cols) {}
Matrix(int rows, int cols) : Matrix(rows, cols, T()) {}
class Row {
private:
Matrix<T> &_m;
const int _row;
public:
Row(Matrix<T> &m, int row) : _m(m), _row(row) {}
decltype(auto) operator[](int col) {
assert(col >= 0);
assert(col < _m.cols());
return _m._matrix[_row * _m.cols() + col];
}
};
friend class Row;
Row operator[](int row) {
assert(row >= 0);
assert(row < _rows);
return Row(*this, row);
}
int rows() const { return _rows; }
int cols() const { return _cols; }
};
EDIT: This should be more like what you are looking for:
#include <array>
#include <cassert>
#include <iostream>
#include <string>
#include <vector>
template <std::size_t channelCount = 3, std::size_t bitResolution = 8> struct Color {
private:
std::vector<int>::iterator begin;
public:
Color(std::vector<int>::iterator begin) : begin(begin) {}
std::string toString() const {
std::string rep = "(";
for (std::size_t i = 0; i < channelCount; ++i) {
rep += std::to_string(*(begin + i));
if (i + 1 < channelCount)
rep += ", ";
}
rep += ")";
return rep;
}
template <std::size_t channel> void setChannelColor(int color) {
static_assert(channel < channelCount);
*(begin + channel) = color;
}
static constexpr std::size_t getChannelCount() { return channelCount; }
static constexpr std::size_t getBitResolution() { return bitResolution; }
};
template <class ColorTypeT> class Image {
using ColorType = Color<ColorTypeT::getChannelCount(), ColorTypeT::getBitResolution()>;
static constexpr std::size_t channelCount = ColorTypeT::getChannelCount();
static constexpr std::size_t bitResolution = ColorTypeT::getBitResolution();
std::size_t width;
std::size_t height;
std::vector<int> data;
public:
Image(std::size_t inWidth, std::size_t inHeight, std::vector<int> inData) : width(inWidth), height(inHeight), data(std::move(inData)) {}
decltype(auto) GetColor(std::size_t row, std::size_t col) { return ColorType(data.begin() + (row * width + col) * channelCount); }
std::size_t getWidth() const { return width; }
std::size_t getHeight() const { return height; }
};
template <class Image> void printImage(Image &image) {
for (std::size_t i = 0; i < image.getHeight(); ++i) {
for (std::size_t j = 0; j < image.getWidth(); ++j) {
std::cout << image.GetColor(i, j).toString() << " ";
}
std::cout << std::endl;
}
}
int main(void) {
std::vector<int> data = {255, 0, 0, 0, 255, 255, 0, 0, 0, 128, 0, 0, 0, 0, 0, 255, 0, 0, 255, 0, 0, 255, 0, 0, 0, 0, 255};
using Color1 = Color<3, 8>;
using Color2 = Color<1, 8>;
Image<Color1> i1{3, 3, data};
Image<Color2> i2{9, 3, data};
printImage(i1);
std::cout << std::endl;
printImage(i2);
std::cout << std::endl;
i1.GetColor(0, 0).setChannelColor<1>(128);
// i2.GetColor(0, 0).setChannelColor<2>(128); // static_assert fails
i2.GetColor(0, 0).setChannelColor<0>(128);
printImage(i1);
std::cout << std::endl;
printImage(i2);
return 0;
}
It does not take the resolution into account yet, but that should not be much work.
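The question also asked for a const mechanism and for working on an existing raw buffer. Here is a rough sketch of how that could look, assuming the `void *` has already been cast to the right element type (for example `std::uint8_t` or `std::uint16_t`) and that the pixels are interleaved and row-major. `PixelView` and `ImageView` are hypothetical names, not part of the code above:

```cpp
#include <cstddef>

// Hypothetical non-owning view of one pixel: a pointer to its first channel.
// With T = const X the view is read-only.
template <typename T, std::size_t Channels>
class PixelView {
    T* first_;
public:
    explicit PixelView(T* first) : first_(first) {}
    T& operator[](std::size_t channel) const { return first_[channel]; }
    static constexpr std::size_t channelCount() { return Channels; }
};

// Hypothetical non-owning image over an interleaved, row-major buffer.
template <typename T, std::size_t Channels>
class ImageView {
    T* data_;
    std::size_t width_;
    std::size_t height_;
public:
    ImageView(T* data, std::size_t width, std::size_t height)
        : data_(data), width_(width), height_(height) {}

    // Mutable access: writes through the view land in the original buffer.
    PixelView<T, Channels> pixel(std::size_t row, std::size_t col) {
        return PixelView<T, Channels>(data_ + (row * width_ + col) * Channels);
    }
    // Const access: a const image only hands out read-only pixel views.
    PixelView<const T, Channels> pixel(std::size_t row, std::size_t col) const {
        return PixelView<const T, Channels>(data_ + (row * width_ + col) * Channels);
    }

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }
};
```

With this, `img.pixel(1, 0)[2] = 200;` modifies the underlying buffer, while the same expression on a `const ImageView` fails to compile. Ownership of the buffer is deliberately left out of this sketch.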
EDIT:
The two classes now don't have a "connection", you can plug in anything that has the methods getChannelCount and getBitResolution. To enforce this, you can use a concept:
template<typename T>
concept bool ColorType = requires(T a) {
{ a.getChannelCount() } -> std::size_t;
{ a.getBitResolution() } -> std::size_t;
};
And then change the definition of the Image class to:
template <ColorType ColorTypeT> class Image
clang does not support this yet, however, GCC 8 does.

How to implement a tensor class for the Kronecker product [closed]

Recently I came across an interesting article on what is called the Kronecker product. At the same time, I'm working on my neural network library.
For my algorithm to work, I need a tensor class where I can get the product of two tensors with an overloaded * operator.
Consider the following example/questions:
How to efficiently construct/store the nested matrices?
How to perform the product of two tensors?
How to visualize tensor c as simply as possible?
My tensor class, which currently only supports 3 dimensions:
#pragma once
#include <iostream>
#include <sstream>
#include <random>
#include <cmath>
#include <cstring>
#include <iomanip>
#include <vector>
template<typename T>
class tensor {
public:
const unsigned int x, y, z, s;
tensor(unsigned int x, unsigned int y, unsigned int z, T val) : x(x), y(y), z(z), s(x * y * z) {
p_data = new T[s];
for (unsigned int i = 0; i < s; i++) p_data[i] = val;
}
tensor(const tensor<T> & other) : x(other.x), y(other.y), z(other.z), s(other.s) {
p_data = new T[s];
memcpy(p_data, other.p_data, s * sizeof(T)); // other is const, so read its pointer directly
}
~tensor() {
delete[] p_data;
p_data = nullptr;
}
T * get_data() {
return p_data;
}
static tensor<T> * random(unsigned int x, unsigned int y, unsigned int z, T val, T min, T max) {
tensor<T> * p_tensor = new tensor<T>(x, y, z, val);
std::random_device rd;
std::mt19937 mt(rd());
std::uniform_real_distribution<T> dist(min, max);
for (unsigned int i = 0; i < p_tensor->s; i++) {
T rnd = dist(mt);
while (std::abs(rnd) < 0.001) rnd = dist(mt);
p_tensor->get_data()[i] = rnd;
}
return p_tensor;
}
static tensor<T> * from(std::vector<T> * p_data, T val) {
tensor<T> * p_tensor = new tensor<T>(p_data->size(), 1, 1, val);
for (unsigned int i = 0; i < p_tensor->x; i++) (*p_tensor)(i) = p_data->at(i);
return p_tensor;
}
friend std::ostream & operator <<(std::ostream & stream, tensor<T> & tensor) {
stream << "(" << tensor.x << "," << tensor.y << "," << tensor.z << ") Tensor\n";
for (unsigned int i = 0; i < tensor.x; i++) {
for (unsigned int k = 0; k < tensor.z; k++) {
stream << "[";
for (unsigned int j = 0; j < tensor.y; j++) {
stream << std::setw(5) << roundf(tensor(i, j, k) * 1000) / 1000;
if (j + 1 < tensor.y) stream << ",";
}
stream << "]";
}
stream << std::endl;
}
return stream;
}
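tensor<T> operator +(tensor<T> & other) {
tensor<T> result(*this); // placeholder: returns a copy of *this, by value
return result;
}
tensor<T> operator -(tensor<T> & other) {
tensor<T> result(*this); // placeholder: returns a copy of *this, by value
return result;
}
tensor<T> operator *(tensor<T> & other) {
tensor<T> result(*this); // placeholder: returns a copy of *this, by value
return result;
}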
T & operator ()(unsigned int i, unsigned int j, unsigned int k) {
return p_data[i + (j * x) + (k * x * y)];
}
T & operator ()(unsigned int i) {
return p_data[i];
}
private:
T * p_data = nullptr;
};
int main() {
tensor<double> * p_tensor_input = tensor<double>::random(6, 2, 3, 0.0, 0.0, 1.0);
tensor<double> * p_tensor_weight = tensor<double>::random(2, 6, 3, 0.0, 0.0, 1.0);
std::cout << *p_tensor_input << std::endl;
std::cout << *p_tensor_weight << std::endl;
tensor<double> p_tensor_output = *p_tensor_input + *p_tensor_weight;
return 0;
}
Your first step is #2 -- and get it correct.
After that, optimize.
Start with a container C<T>.
Define some operations on it. wrap(T) returns a C<T> containing that T. map takes a C<T> and a function on T U f(T) and returns C<U>. flatten takes a C<C<U>> and returns a C<U>.
Define scale( T, C<T> ) which takes a T and a C<T> and returns a C<T> with the elements scaled. Aka, scalar multiplication.
template<class T>
C<T> scale( T scalar, C<T> container ) {
return map( container, [&](T t){ return t*scalar; } );
}
Then we have:
template<class T>
C<T> tensor( C<T> lhs, C<T> rhs ) {
return flatten( map( lhs, [&](T t) { return scale( t, rhs ); } ) );
}
is your tensor product. And yes, that can be your actual code. I would tweak it a bit for efficiency.
(Note I used different terms, but I'm basically describing monadic operations using different words.)
After you have this, test, optimize, and iterate.
As for 3, the results of tensor products get large and complex; there is no simple visualization for a large tensor.
Oh, and keep things simple and store data in a std::vector to start.
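To make the outline above concrete, here is a minimal sketch that takes `C<T>` to be `std::vector<T>`, which is an assumption made for this example. The answer's `map` is called `fmap` here and the product function is called `tensorProduct`, purely to avoid name clashes. In this flat form it computes the Kronecker product of two vectors; matrices would additionally need their row and column indices interleaved.

```cpp
#include <vector>

// Assumption for this sketch: the container C<T> is simply std::vector<T>.
template <class T>
using C = std::vector<T>;

// fmap (the answer's "map"): apply f to every element and collect the results.
template <class T, class F>
auto fmap(const C<T>& in, F f) -> C<decltype(f(in.front()))> {
    C<decltype(f(in.front()))> out;
    out.reserve(in.size());
    for (const T& t : in) out.push_back(f(t));
    return out;
}

// flatten: concatenate a container of containers into a single container.
template <class T>
C<T> flatten(const C<C<T>>& nested) {
    C<T> out;
    for (const C<T>& inner : nested) out.insert(out.end(), inner.begin(), inner.end());
    return out;
}

// scale: scalar multiplication.
template <class T>
C<T> scale(T scalar, const C<T>& container) {
    return fmap(container, [scalar](T t) { return scalar * t; });
}

// Product of two flat containers: every element of lhs scales a copy of rhs.
template <class T>
C<T> tensorProduct(const C<T>& lhs, const C<T>& rhs) {
    return flatten(fmap(lhs, [&rhs](T t) { return scale(t, rhs); }));
}
```

For example, `tensorProduct` of `{1, 2}` and `{3, 4}` yields `{3, 4, 6, 8}`. The intermediate vector of vectors is exactly the kind of thing to optimize away once the result is known to be correct.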
Here are some tricks for efficient vectors I learned in class; they should work equally well for a tensor.
Define an empty constructor (one that allocates the data without filling it) and assignment operators. For example:
tensor(unsigned int x, unsigned int y, unsigned int z) : x(x), y(y), z(z), s(x * y * z) {
p_data = new T[s];
}
tensor& operator=( tensor const& that ) {
for (int i=0; i<size(); ++i) {
p_data[i] = that(i) ;
}
return *this ;
}
template <typename T>
tensor& operator=( T const& that ) {
for (int i=0; i<size(); ++i) {
p_data[i] = that(i) ;
}
return *this ;
}
Now we can implement things like addition and scaling with deferred evaluation. For example:
template<typename T1, typename T2>
class tensor_sum {
public:
//add value_type to base tensor class for this to work
typedef decltype( typename T1::value_type() + typename T2::value_type() ) value_type ;
tensor_sum( T1 const& t1, T2 const& t2 ) : t1_(t1), t2_(t2) {}
//also add function to get size of tensor
value_type operator()( int i, int j, int k ) const {
return t1_(i,j,k) + t2_(i,j,k) ;
}
value_type operator()( int i ) const {
return t1_(i) + t2_(i) ;
}
private:
T1 const& t1_;
T2 const& t2_;
};
template <typename T1, typename T2>
tensor_sum<T1,T2> operator+(T1 const& t1, T2 const& t2 ) {
return tensor_sum<T1,T2>(t1,t2) ;
}
This tensor_sum behaves exactly like any normal tensor, except that we don't have to allocate memory to store the result. So we can do something like this:
tensor<double> t0(...);
tensor<double> t1(...);
tensor<double> t2(...);
tensor<double> result(...); //define result to be empty, we will fill it later
result = t0 + t1 + 5.0*t2;
The compiler should be able to optimize this into a single loop, without storing intermediate results or modifying the original tensors. You can do the same thing for scaling and the Kronecker product. Depending on what you want to do with the tensors, this can be a big advantage. But be careful, this isn't always the best option.
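For reference, here is a small self-contained sketch of the deferred-evaluation idea, reduced to a one-dimensional `Vec` of doubles so that it fits in a few lines. The names `Expr`, `Vec` and `Sum` are made up for this example. The CRTP base `Expr` is one common way to keep `operator+` from matching unrelated types; it is not taken from the snippets above.

```cpp
#include <cstddef>
#include <vector>

// CRTP base: marks a type as an element-wise expression.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Hypothetical 1-D container; a tensor would add (i, j, k) indexing on top.
class Vec : public Expr<Vec> {
    std::vector<double> data_;
public:
    explicit Vec(std::size_t n, double value = 0.0) : data_(n, value) {}
    std::size_t size() const { return data_.size(); }
    double operator()(std::size_t i) const { return data_[i]; }
    double& operator()(std::size_t i) { return data_[i]; }

    // Evaluate any expression element by element, in a single loop.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = expr(i);
        return *this;
    }
};

// Lazy sum: stores references to its operands and adds on demand.
template <typename L, typename R>
class Sum : public Expr<Sum<L, R>> {
    const L& lhs_;
    const R& rhs_;
public:
    Sum(const L& lhs, const R& rhs) : lhs_(lhs), rhs_(rhs) {}
    std::size_t size() const { return lhs_.size(); }
    double operator()(std::size_t i) const { return lhs_(i) + rhs_(i); }
};

// Only participates for types derived from Expr.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& lhs, const Expr<R>& rhs) {
    return Sum<L, R>(lhs.self(), rhs.self());
}
```

With this, `r = a + b + c;` builds a nested `Sum` object and evaluates it inside `Vec::operator=`. The `Sum` objects hold references, so they should only be used within the expression that creates them and not stored for later.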
When implementing the Kronecker product, be careful about the ordering of your loops: try to go through the tensors in the order they are stored, for cache efficiency.
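As a sketch of what such a loop ordering can look like for plain matrices, assume a hypothetical row-major `Mat` struct (it is not the question's tensor class, which uses a different layout). The loops are nested so that the output is written strictly front to back:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major matrix used only for this sketch.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;  // element (r, c) lives at data[r * cols + c]
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    double operator()(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Kronecker product: out(i * b.rows + k, j * b.cols + l) = a(i, j) * b(k, l).
// Loop order i, k, j, l enumerates the output row by row, left to right.
Mat kronecker(const Mat& a, const Mat& b) {
    Mat out(a.rows * b.rows, a.cols * b.cols);
    std::size_t next = 0;  // walks out.data from front to back
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = 0; k < b.rows; ++k)
            for (std::size_t j = 0; j < a.cols; ++j)
                for (std::size_t l = 0; l < b.cols; ++l)
                    out.data[next++] = a(i, j) * b(k, l);
    return out;
}
```

For a column-major layout such as the question's `i + j * x + k * x * y` indexing, the loops would need to be reordered accordingly.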

How to make a "variadic" vector like class

I am trying to make a class that acts as a multidimensional vector. It doesn't have to do anything fancy. I basically want a "container" class foo whose elements I can access by foo[x][y][z]. I would also need similar classes for foo[x][y] and foo[x], which led me to ponder the following (more general) question: is there a way to write something that you can initialize as foo A(a,b,c,...) for any number n of arguments and get an n-dimensional vector with elements accessible by [][][]...? Below is the class I have for (as an example) the four-dimensional case.
First the header
#ifndef FCONTAINER_H
#define FCONTAINER_H
#include <iostream>
using namespace std;
class Fcontainer
{
private:
unsigned dim1, dim2, dim3, dim4 ;
double* data;
public:
Fcontainer(unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4);
~Fcontainer();
Fcontainer(const Fcontainer& m);
Fcontainer& operator= (const Fcontainer& m);
double& operator() (unsigned const dim1, unsigned const dim2, unsigned const dim3, unsigned const dim4);
double const& operator() (unsigned const dim1, unsigned const dim2, unsigned const dim3, unsigned const dim4) const;
};
#endif // FCONTAINER_H
Now the cpp:
#include "fcontainer.hpp"
#include <stdexcept> // std::invalid_argument
Fcontainer::Fcontainer(unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4)
{
dim1 = dims1; dim2 = dims2; dim3 = dims3; dim4 = dims4;
if (dims1 == 0 || dims2 == 0 || dims3 == 0 || dims4 == 0)
throw std::invalid_argument("Container constructor has 0 size");
data = new double[dims1 * dims2 * dims3 * dims4];
}
Fcontainer::~Fcontainer()
{
delete[] data;
}
double& Fcontainer::operator() (unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4)
{
if (dims1 >= dim1 || dims2 >= dim2 || dims3 >= dim3 || dims4 >= dim4)
throw std::invalid_argument("Container subscript out of bounds");
return data[dims1*dim2*dim3*dim4 + dims2*dim3*dim4 + dims3*dim4 + dims4];
}
double const& Fcontainer::operator() (unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4) const
{
if(dims1 >= dim1 || dims2 >= dim2 || dims3 >= dim3 || dims4 >= dim4)
throw std::invalid_argument("Container subscript out of bounds");
return data[dims1*dim2*dim3*dim4 + dims2*dim3*dim4 + dims3*dim4 + dims4];
}
So I want to extend this to an arbitrary number of dimensions. I suppose it will take something along the lines of a variadic template or a std::initializer_list, but I am not clear on how to approach this for this problem.
Messing around in Visual Studio for a little while, I came up with this nonsense:
template<typename T>
class Matrix {
std::vector<size_t> dimensions;
std::unique_ptr<T[]> _data;
template<typename ... Dimensions>
size_t apply_dimensions(size_t dim, Dimensions&& ... dims) {
dimensions.emplace_back(dim);
return dim * apply_dimensions(std::forward<Dimensions>(dims)...);
}
size_t apply_dimensions(size_t dim) {
dimensions.emplace_back(dim);
return dim;
}
public:
Matrix(std::vector<size_t> dims) : dimensions(std::move(dims)) {
size_t size = flat_size();
_data = std::make_unique<T[]>(size);
}
template<typename ... Dimensions>
Matrix(size_t dim, Dimensions&&... dims) {
size_t size = apply_dimensions(dim, std::forward<Dimensions>(dims)...);
_data = std::make_unique<T[]>(size);
}
T & operator()(std::vector<size_t> const& indexes) {
if(indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
T const& operator()(std::vector<size_t> const& indexes) const {
if (indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
template<typename ... Indexes>
T & operator()(size_t idx, Indexes&& ... indexes) {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
template<typename ... Indexes>
T const& operator()(size_t idx, Indexes&& ... indexes) const {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
T & at(size_t flat_index) {
return _data[flat_index];
}
T const& at(size_t flat_index) const {
return _data[flat_index];
}
size_t dimension_size(size_t dim) const {
return dimensions[dim];
}
size_t num_of_dimensions() const {
return dimensions.size();
}
size_t flat_size() const {
size_t size = 1;
for (size_t dim : dimensions)
size *= dim;
return size;
}
private:
size_t get_flat_index(std::vector<size_t> const& indexes) const {
size_t dim = 0;
size_t flat_index = 0;
for (size_t index : indexes) {
flat_index += get_offset(index, dim++);
}
return flat_index;
}
template<typename ... Indexes>
size_t get_flat_index(size_t dim, size_t index, Indexes&& ... indexes) const {
return get_offset(index, dim) + get_flat_index(dim + 1, std::forward<Indexes>(indexes)...);
}
size_t get_flat_index(size_t dim, size_t index) const {
return get_offset(index, dim);
}
size_t get_offset(size_t index, size_t dim) const {
if (index >= dimensions[dim])
throw std::runtime_error("Index out of Bounds");
for (size_t i = dim + 1; i < dimensions.size(); i++) {
index *= dimensions[i];
}
return index;
}
};
Let's talk about what this code accomplishes.
//private:
template<typename ... Dimensions>
size_t apply_dimensions(size_t dim, Dimensions&& ... dims) {
dimensions.emplace_back(dim);
return dim * apply_dimensions(std::forward<Dimensions>(dims)...);
}
size_t apply_dimensions(size_t dim) {
dimensions.emplace_back(dim);
return dim;
}
public:
Matrix(std::vector<size_t> dims) : dimensions(std::move(dims)) {
size_t size = flat_size();
_data = std::make_unique<T[]>(size);
}
template<typename ... Dimensions>
Matrix(size_t dim, Dimensions&&... dims) {
size_t size = apply_dimensions(dim, std::forward<Dimensions>(dims)...);
_data = std::make_unique<T[]>(size);
}
What this code enables us to do is write an initializer for this matrix that takes an arbitrary number of dimensions.
int main() {
Matrix<int> mat{2, 2}; //Yields a 2x2 2D Rectangular Matrix
mat = Matrix<int>{4, 6, 5};//mat is now a 4x6x5 3D Rectangular Matrix
mat = Matrix<int>{9};//mat is now a 9-length 1D array.
mat = Matrix<int>{2, 3, 4, 5, 6, 7, 8, 9};//Why would you do this? (yet it compiles...)
}
And if the number and sizes of the dimensions are only known at runtime, this code handles that too:
int main() {
std::cout << "Input the sizes of each of the dimensions.\n";
std::string line;
std::getline(std::cin, line);
std::stringstream ss(line);
size_t dim;
std::vector<size_t> dimensions;
while(ss >> dim)
dimensions.emplace_back(dim);
Matrix<int> mat{dimensions};//Voila.
}
Then, we want to be able to access arbitrary indexes of this matrix. This code offers two ways to do so: either statically using templates, or variably at runtime.
//public:
T & operator()(std::vector<size_t> const& indexes) {
if(indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
T const& operator()(std::vector<size_t> const& indexes) const {
if (indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
template<typename ... Indexes>
T & operator()(size_t idx, Indexes&& ... indexes) {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
template<typename ... Indexes>
T const& operator()(size_t idx, Indexes&& ... indexes) const {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
And then, in practice:
Matrix<int> mat{6, 5};
mat(5, 2) = 17;
//mat(5, 1, 7) = 24; //throws exception at runtime because of wrong number of dimensions.
mat = Matrix<int>{9, 2, 8};
mat(5, 1, 7) = 24;
//mat(5, 2) = 17; //throws exception at runtime because of wrong number of dimensions.
And this works fine with runtime-dynamic indexing:
std::vector<size_t> indexes;
/*...*/
mat(indexes) = 54; //Will throw if index count is wrong, will succeed otherwise
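For reference, the arithmetic behind both operator() overloads is ordinary row-major flattening: each index is scaled by the product of the sizes of all later dimensions. Here is a minimal standalone sketch of that mapping. The free function flat_index is a hypothetical name, not part of the class above, and it accumulates the result in Horner form rather than calling get_offset per dimension.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical free function (not part of the Matrix class above):
// maps a multi-dimensional index to a position in a contiguous,
// row-major buffer.
std::size_t flat_index(std::vector<std::size_t> const& dims,
                       std::vector<std::size_t> const& indexes) {
    if (indexes.size() != dims.size())
        throw std::runtime_error("Incorrect number of indexes");
    std::size_t flat = 0;
    for (std::size_t d = 0; d < dims.size(); ++d) {
        if (indexes[d] >= dims[d])
            throw std::runtime_error("Index out of Bounds");
        // Horner form: equivalent to summing
        // indexes[d] * (product of all later dimension sizes).
        flat = flat * dims[d] + indexes[d];
    }
    return flat;
}
```

For a 9x2x8 matrix, the index (5, 1, 7) lands at 5*16 + 1*8 + 7 = 95, which is the same slot the class computes for mat(5, 1, 7).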
There are a number of other functions that this kind of object might want, like a resize method, but choosing how to implement that is a high-level design decision. I've also left out tons of other potentially valuable implementation details (like an optimizing move-constructor, a comparison operator, a copy constructor) but this should give you a pretty good idea of how to start.
EDIT:
If you want to avoid templates entirely, you can cut roughly half of the code provided here and just use the methods/constructor that take a std::vector<size_t> for the dimension/index data. If you don't need to adapt to the number of dimensions at runtime, you can remove the std::vector<size_t> overloads, and possibly even make the number of dimensions a template argument of the class itself (which would let you use size_t[N] or std::array<size_t, N> to store the dimension sizes).
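As a rough illustration of that last option, here is a sketch with the number of dimensions as a template argument. The class name FixedRankMatrix and its exact interface are my own assumptions, not something taken from the code above; it requires C++14 for std::make_unique.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical fixed-rank variant: the number of dimensions N is part of
// the type, so passing the wrong number of indexes fails at compile time.
template <typename T, std::size_t N>
class FixedRankMatrix {
public:
    explicit FixedRankMatrix(std::array<std::size_t, N> const& dims)
        : dimensions(dims), _data(std::make_unique<T[]>(flat_size())) {}

    template <typename... Indexes>
    T& operator()(Indexes... indexes) {
        static_assert(sizeof...(Indexes) == N, "wrong number of indexes");
        std::array<std::size_t, N> const idx{{static_cast<std::size_t>(indexes)...}};
        std::size_t flat = 0;
        for (std::size_t d = 0; d < N; ++d) {
            if (idx[d] >= dimensions[d])
                throw std::out_of_range("Index out of Bounds");
            flat = flat * dimensions[d] + idx[d];  // row-major flattening
        }
        return _data[flat];
    }

    std::size_t flat_size() const {
        std::size_t size = 1;
        for (std::size_t dim : dimensions)
            size *= dim;
        return size;
    }

private:
    // Declared before _data so it is initialized before flat_size() runs.
    std::array<std::size_t, N> dimensions;
    std::unique_ptr<T[]> _data;
};
```

With this shape, writing m(5, 2, 1) on a FixedRankMatrix<int, 2> is rejected by the compiler instead of throwing at runtime. The trade-off is that the rank can no longer be chosen from user input.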
Well, assuming you care about efficiency at all, you probably want to store all of the elements in a contiguous manner regardless. So you probably want to do something like:
template <std::size_t N, class T>
class MultiArray {
MultiArray(const std::array<std::size_t, N>& sizes)
: m_sizes(sizes)
, m_data(product(m_sizes)) {} // size the vector in the initializer list
std::array<std::size_t, N> m_sizes;
std::vector<T> m_data;
};
The indexing part is where it gets kind of fun. Basically, if you want a[1][2][3] etc. to work, you have to return some kind of proxy object that has its own operator[]. Each proxy has to be aware of its own rank. Each time you apply [], it returns a proxy that lets you specify the next index.
template <std::size_t N, class T>
class MultiArray {
// as before
template <std::size_t rank>
class Indexor {
Indexor(MultiArray& parent, const std::array<std::size_t, N>& indices = {})
: m_parent(parent), m_indices(indices) {}
auto operator[](std::size_t index) {
m_indices[rank] = index;
return Indexor<rank+1>(m_parent, m_indices);
}
std::array<std::size_t, N> m_indices;
MultiArray& m_parent;
};
auto operator[](std::size_t index) {
return Indexor<0>(*this)[index];
}
};
Finally, you have a specialization for when you're done with the last index:
template <>
class Indexor<N-1> { // with obvious constructor
auto operator[](std::size_t index) {
m_indices[N-1] = index;
return m_parent.m_data[indexed_product(m_indices, m_parent.m_sizes)];
}
std::array<std::size_t, N> m_indices;
MultiArray& m_parent;
};
Obviously this is a sketch, but at this point it's just a matter of filling out details and getting it to compile. There are other approaches, such as having the indexor object hold two iterators and narrow them at each step, but that seemed a bit more complex. You also don't need to template the Indexor class and could use a runtime integer instead, but that would make it very easy to misuse: having one too many or too few [] would be a runtime error rather than a compile-time one.
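For what it's worth, here is one way the details could be filled in so that it compiles as C++14. This is only a sketch under a few assumptions of mine: the proxy carries a running flat offset instead of the m_indices array, the "last index" case is selected with std::enable_if rather than a specialization, and there is no bounds checking.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <vector>

template <std::size_t N, class T>
class MultiArray {
public:
    explicit MultiArray(std::array<std::size_t, N> const& sizes)
        : m_sizes(sizes), m_data(total(sizes)) {}

    // Proxy that has already consumed `Rank` indices.
    template <std::size_t Rank>
    class Indexor {
    public:
        Indexor(MultiArray& parent, std::size_t offset)
            : m_parent(parent), m_offset(offset) {}

        // Not the last index yet: fold it in and hand back the next proxy.
        template <std::size_t R = Rank>
        typename std::enable_if<(R + 1 < N), Indexor<R + 1>>::type
        operator[](std::size_t index) const {
            return Indexor<R + 1>(m_parent, next(index));
        }

        // Last index: return a reference to the element itself.
        template <std::size_t R = Rank>
        typename std::enable_if<(R + 1 == N), T&>::type
        operator[](std::size_t index) const {
            return m_parent.m_data[next(index)];
        }

    private:
        // Row-major accumulation of the flat position (no bounds check).
        std::size_t next(std::size_t index) const {
            return m_offset * m_parent.m_sizes[Rank] + index;
        }
        MultiArray& m_parent;
        std::size_t m_offset;
    };

    // Yields Indexor<1> when N > 1, or T& directly when N == 1.
    decltype(auto) operator[](std::size_t index) {
        return Indexor<0>(*this, 0)[index];
    }

private:
    static std::size_t total(std::array<std::size_t, N> const& sizes) {
        std::size_t n = 1;
        for (std::size_t s : sizes) n *= s;
        return n;
    }
    std::array<std::size_t, N> m_sizes;
    std::vector<T> m_data;
};
```

With a MultiArray<3, int> of sizes 2x3x4, a[1][2][3] = 7 compiles and writes one element, while a[1][2] = 7 does not compile because an Indexor is not assignable from an int.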
Edit: you would also be able to initialize this in the way you describe in C++17, but not in C++14. In C++14 you can just use a function instead:
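template <class ... Ts>
auto make_double_array(Ts ... ts) {
// one constructor argument per dimension; the rank is deduced from the pack
return MultiArray<sizeof...(Ts), double>({static_cast<std::size_t>(ts)...});
}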
Edit2: I use product and indexed_product in the implementation. The first is obvious; the second is less so, but hopefully both are clear. The latter is a function that, given an array of dimensions and an array of indices, returns the position of that element in the flat array.
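In case it helps, a minimal sketch of what those two helpers might look like, assuming row-major layout. The signatures are my guess at something that fits the sketch above, not a fixed interface.

```cpp
#include <array>
#include <cstddef>

// Total number of elements: the product of all dimension sizes.
template <std::size_t N>
std::size_t product(std::array<std::size_t, N> const& sizes) {
    std::size_t result = 1;
    for (std::size_t s : sizes)
        result *= s;
    return result;
}

// Flat position of the element at `indices` in a row-major array whose
// dimension sizes are `sizes`. No bounds checking is done here.
template <std::size_t N>
std::size_t indexed_product(std::array<std::size_t, N> const& indices,
                            std::array<std::size_t, N> const& sizes) {
    std::size_t position = 0;
    for (std::size_t d = 0; d < N; ++d)
        position = position * sizes[d] + indices[d];
    return position;
}
```

For sizes {2, 3, 4}, product gives 24, and the indices {1, 2, 3} map to (1*3 + 2)*4 + 3 = 23, the last element.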

Understanding on User Defined function

Create a UserArray of bit fields, which can be declared as follows. The size occupied by our array will be less than that of a normal array. Suppose we want an array of 20 flags (TRUE/FALSE). A bool FLAG[20] will take 20 bytes of memory, while UserArray<bool,bool,0,20> will take 4 bytes of memory.
Use class Template to create user array.
Use Bit wise operators to pack the array.
Equality operation should also be implemented.
template<class T,int W,int L,int H>//I have used template<class T> before,
//but never in this form
class UserArray{
//....
};
typedef UserArray<bool,4,0,20> MyType;
where:
T = type of an array element
W = width of an array element, 0 < W < 8
L = low bound of array index (preferably zero)
H = high bound of array index
A main program:
int main() {
MyType Display; //typedef UserArray<T,W,L,H> MyType; defined above
Display[0] = FALSE; //need to understand that how can we write this?
Display[1] = TRUE; //need to understand that how can we write this?
//assert(Display[0]);//commented once, need to understand above code first
//assert(Display[1]);//commented once..
//cout << "Size of the Display" << sizeof(Display);//commented once..
}
My question is how the parameters T, W, L and H are used inside class UserArray, and how an instance of UserArray can be written as Display[0] and Display[1]. What do those expressions represent?
A short, simple example of a similar type would help me understand.
W, L and H are non-type template parameters. You can instantiate a template (at compile-time) with constant values, e.g.:
template <int N>
class MyArray
{
public:
float data[N];
void print() { std::cout << "MyArray of size " << N << std::endl; }
};
MyArray<7> foo;
MyArray<8> bar;
foo.print(); // "MyArray of size 7"
bar.print(); // "MyArray of size 8"
In the example above, everywhere that N appears in the template definition, it will be replaced at compile-time by the supplied constant.
Note that MyArray<7> and MyArray<8> are completely different types as far as the compiler is concerned.
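You can check that last point at compile time. A small sketch, using a cut-down copy of MyArray with a size() helper that I added for illustration:

```cpp
#include <type_traits>

// Cut-down copy of the MyArray template from above; size() is an
// illustrative addition.
template <int N>
struct MyArray {
    float data[N];
    static constexpr int size() { return N; }
};

// Each distinct value of N produces a distinct, unrelated type.
static_assert(!std::is_same<MyArray<7>, MyArray<8>>::value,
              "MyArray<7> and MyArray<8> are different types");
static_assert(std::is_same<MyArray<7>, MyArray<3 + 4>>::value,
              "equal constant expressions name the same type");
```

This is also why you cannot assign a MyArray<7> to a MyArray<8>, or pass one where the other is expected, without writing a conversion yourself.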
I have no idea what the solution to your specific problem is. But your code won't compile, currently, as you have not provided values for the template parameters.
This is not simple, particularly as you can have variable bit widths.
<limits.h> has a constant CHAR_BIT, which is the number of bits in a byte. Usually this is 8, but it could be greater than 8 (not less though).
I suggest the number of elements per byte be CHAR_BIT / W. This might waste a few bits (for example, if the width is 3 and CHAR_BIT is 8), but this is complicated enough as is.
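To make the storage arithmetic concrete, here is a small sketch. The helper names elements_per_byte and bytes_needed are hypothetical; they only illustrate the CHAR_BIT / W rule suggested above.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: how many W-bit elements fit in one byte when
// elements are not allowed to straddle a byte boundary.
constexpr std::size_t elements_per_byte(std::size_t width) {
    return CHAR_BIT / width;
}

// Hypothetical helper: bytes needed for `count` elements, rounded up.
constexpr std::size_t bytes_needed(std::size_t width, std::size_t count) {
    return (count + elements_per_byte(width) - 1) / elements_per_byte(width);
}
```

Assuming CHAR_BIT is 8, a width of 3 gives 2 elements per byte (2 bits wasted in each byte), so 20 elements need 10 bytes. A width of 1 gives 8 per byte, so 20 flags fit in 3 bytes.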
You'll then need to define operator[] to access the elements, and will likely need some bit fiddling to do so. For the non-const version of operator[], you'll probably have to return some sort of proxy object when there is more than one element in a byte, with its operator= overloaded so that it writes back to the appropriate spot in the array.
It's a good exercise to figure this one out, though.
Here's some code that implements what you ask for, except that the lower bound is fixed at 0. It also shows a rare use case for overloading the address-of operator (operator&). You could take this further and make the container compatible with STL algorithms if you liked.
#include <iostream>
#include <limits.h>
#include <stddef.h>
template<class T, size_t WIDTH, size_t SIZE>
class UserArray;
template<class T, size_t WIDTH, size_t SIZE>
class UserArrayProxy;
template<class T, size_t WIDTH, size_t SIZE>
class UserArrayAddressProxy
{
public:
typedef UserArray<T, WIDTH, SIZE> array_type;
typedef UserArrayProxy<T, WIDTH, SIZE> proxy_type;
typedef UserArrayAddressProxy<T, WIDTH, SIZE> this_type;
UserArrayAddressProxy(array_type& a_, size_t i_) : a(a_), i(i_) {}
UserArrayAddressProxy(const this_type& x) : a(x.a), i(x.i) {}
proxy_type operator*() { return proxy_type(a, i); }
this_type& operator+=(size_t n) { i += n; return *this; }
this_type& operator-=(size_t n) { i -= n; return *this; }
this_type& operator++() { ++i; return *this; }
this_type& operator--() { --i; return *this; }
this_type operator++(int) { this_type x = *this; ++i; return x; }
this_type operator--(int) { this_type x = *this; --i; return x; }
this_type operator+(size_t n) const { this_type x = *this; x += n; return x; }
this_type operator-(size_t n) const { this_type x = *this; x -= n; return x; }
bool operator==(const this_type& x) { return (&a == &x.a) && (i == x.i); }
bool operator!=(const this_type& x) { return !(*this == x); }
private:
array_type& a;
size_t i;
};
template<class T, size_t WIDTH, size_t SIZE>
class UserArrayProxy
{
public:
static const size_t BITS_IN_T = sizeof(T) * CHAR_BIT;
static const size_t ELEMENTS_PER_T = BITS_IN_T / WIDTH;
static const size_t NUMBER_OF_TS = (SIZE - 1) / ELEMENTS_PER_T + 1;
static const T MASK = (1 << WIDTH) - 1;
typedef UserArray<T, WIDTH, SIZE> array_type;
typedef UserArrayProxy<T, WIDTH, SIZE> this_type;
typedef UserArrayAddressProxy<T, WIDTH, SIZE> address_proxy_type;
UserArrayProxy(array_type& a_, size_t i_) : a(a_), i(i_) {}
this_type& operator=(T x)
{
a.write(i, x);
return *this;
}
address_proxy_type operator&() { return address_proxy_type(a, i); }
operator T()
{
return a.get(i);
}
private:
array_type& a;
size_t i;
};
template<class T, size_t WIDTH, size_t SIZE>
class UserArray
{
public:
typedef UserArrayAddressProxy<T, WIDTH, SIZE> ptr_t;
static const size_t BITS_IN_T = sizeof(T) * CHAR_BIT;
static const size_t ELEMENTS_PER_T = BITS_IN_T / WIDTH;
static const size_t NUMBER_OF_TS = (SIZE - 1) / ELEMENTS_PER_T + 1;
static const T MASK = (1 << WIDTH) - 1;
T operator[](size_t i) const
{
return get(i);
}
UserArrayProxy<T, WIDTH, SIZE> operator[](size_t i)
{
return UserArrayProxy<T, WIDTH, SIZE>(*this, i);
}
friend class UserArrayProxy<T, WIDTH, SIZE>;
private:
void write(size_t i, T x)
{
T& element = data[i / ELEMENTS_PER_T];
int offset = (i % ELEMENTS_PER_T) * WIDTH;
x &= MASK;
element &= ~(MASK << offset);
element |= x << offset;
}
T get(size_t i) const // const so the const operator[] above can call it
{
return (data[i / ELEMENTS_PER_T] >> ((i % ELEMENTS_PER_T) * WIDTH)) & MASK;
}
T data[NUMBER_OF_TS];
};
int main()
{
typedef UserArray<int, 6, 20> myarray_t;
myarray_t a;
std::cout << "Sizeof a in bytes: " << sizeof(a) << std::endl;
for (size_t i = 0; i != 20; ++i) { a[i] = i; }
for (size_t i = 0; i != 20; ++i) { std::cout << a[i] << std::endl; }
std::cout << "We can even use address_of operator: " << std::endl;
for (myarray_t::ptr_t e = &a[0]; e != &a[20]; ++e) { std::cout << *e << std::endl; }
}