I have a large lookup table that currently needs 12 bits per entry. Is there a standard class that will give me a memory-efficient container for storing odd-sized data? I have about a billion items in the table, so I care more about memory efficiency than speed.
I need to be able to get the underlying data and read/write it to a file as well.
How about this:
#include <stdio.h>

typedef unsigned char byte;
typedef unsigned short word;
typedef unsigned int uint;
typedef unsigned long long int qword;

enum {
    bits_per_cell = 12, cellmask = (1<<bits_per_cell)-1,
    N_cells = 1000000,
    bufsize = (N_cells*bits_per_cell+7)/8,
};

byte* buf;

byte* Alloc( void ) {
    buf = new byte[bufsize];
    return buf;
}

// little-endian only
void put( uint i, uint c ) {
    qword x = qword(i)*bits_per_cell;
    uint y = x&15, z = (x>>4)<<1;
    uint& a = (uint&)buf[z];
    uint mask = ~(cellmask<<y);
    a = a & mask | ((c&cellmask)<<y);
}

uint get( uint i ) {
    qword x = qword(i)*bits_per_cell;
    uint y = x&15, z = (x>>4)<<1;
    uint& a = (uint&)buf[z];
    return (a>>y)&cellmask;
}

/*
// bigendian/universal
void put( uint i, uint c ) {
    qword x = qword(i)*bits_per_cell;
    uint y = x&7, z = (x>>3);
    uint a = buf[z] + (buf[z+1]<<8) + (buf[z+2]<<16);
    uint mask = ~(cellmask<<y);
    a = a & mask | ((c&cellmask)<<y);
    buf[z] = byte(a); buf[z+1]=byte(a>>8); buf[z+2]=byte(a>>16);
}

uint get( uint i ) {
    qword x = qword(i)*bits_per_cell;
    uint y = x&7, z = (x>>3);
    uint a = buf[z] + (buf[z+1]<<8) + (buf[z+2]<<16);
    return (a>>y)&cellmask;
}
*/

int main( void ) {
    if( Alloc()==0 ) return 1;
    uint i;
    for( i=0; i<N_cells; i++ ) put( i^1, i );
    for( i=0; i<N_cells; i++ ) {
        uint j = i^1, c, d;
        c = get(j); d = i & cellmask;
        if( c!=d ) printf( "x[%08X]=%04X, not %04X\n", j,c,d );
    }
}
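The question also asked about getting the underlying data and reading/writing it to a file; with a flat byte buffer that is just a raw dump. A sketch (not part of the code above; save/load are made-up helpers reusing the same buf and bufsize):

// Sketch: dump/load the packed buffer with C stdio.
int save( const char* path ) {
    FILE* f = fopen( path, "wb" );
    if( !f ) return 0;
    size_t n = fwrite( buf, 1, bufsize, f );
    fclose( f );
    return n==bufsize;
}
int load( const char* path ) {
    FILE* f = fopen( path, "rb" );
    if( !f ) return 0;
    size_t n = fread( buf, 1, bufsize, f );
    fclose( f );
    return n==bufsize;
}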
Have you looked at boost::dynamic_bitset? I'm not saying that it would be the be-all, end-all of your dreams, but it could help with some of the characteristics you've described. It's very similar to std::bitset from the standard library, only resizable at runtime.
I might not try to use it by itself to solve your problem. Instead, I might combine it with another container class and use it in conjunction with some sort of mapping scheme. I don't know what type of mapping as it will depend on the data and the frequency of cycles. However, thinking more about this:
std::vector<std::bitset<12> > oneBillionDollars; //Austin Powers, my hero!
You have a problem of packing. The only idea I can offer is that you'd want to find the LCM of N and some power of two. Not going to be easy, but definitely workable.
Also, you cannot really manipulate some weird sized data, so you need to pack it into a bigger integer type. The table will contain the data packed, but the "accessor" will yield an unpacked one.
// General structure
template <size_t N>
class Pack
{
public:
static size_t const Number = N;
static size_t const Density = 0; // number of sets of N bits
typedef char UnpackedType; // some integral
UnpackedType Get(size_t i) const; // i in [0..Density)
void Set(size_t i, UnpackedType t); // i in [0..Density)
// arbitrary representation
};
// Example, for 12 bits
// Note: I assume that all is set, you'll have to pad...
// but for a million one or two more should not be too much of an issue I guess
// if it is, the table shall need one more data member, which is reasonable
class Pack12
{
public:
typedef uint16_t UnpackedType;
static size_t const Number = 12;
static size_t const Density = 4;
UnpackedType get(size_t i) const;
void set(size_t i, UnpackedType t);
private:
uint16_t data[3];
};
Now we can build on that to build a generic table that'll work for any pack:
template <typename Pack>
class Table
{
public:
typedef typename Pack::UnpackedType UnpackedType;
bool empty() const;
size_t size() const;
UnpackedType get(size_t i) const;
void set(size_t i, UnpackedType t);
private:
static size_t const NumberBits = Pack::Number;
static size_t const Density = Pack::Density;
std::deque<Pack> data;
};
template <typename Pack>
bool Table<Pack>::empty() const { return data.empty(); }
template <typename Pack>
size_t Table<Pack>::size() const { return data.size() * Density; }
template <typename Pack>
typename Table<Pack>::UnpackedType Table<Pack>::get(size_t i) const
{
Pack const& pack = data.at(i / Density);
return pack.get(i % Density);
}
// Table<Pack>::set is the same
A cleverer way would be for Pack<N> to be able to deduce the getters and representation itself... but it doesn't seem worth the effort: the Pack interface is minimal, and Table can present a richer interface without asking for more than that.
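For concreteness, here is one way Pack12::get/set could be written over the uint16_t data[3] block (my sketch, not the answer's code; it assumes value i occupies bits [i*12, i*12+12) of the 48-bit block, little-endian within the block):

Pack12::UnpackedType Pack12::get(size_t i) const
{
    size_t bit = i * Number;                  // absolute bit offset: 0, 12, 24 or 36
    size_t word = bit / 16, shift = bit % 16;
    uint32_t window = data[word];
    if (word + 1 < 3) window |= uint32_t(data[word + 1]) << 16;
    return UnpackedType((window >> shift) & 0x0FFF);
}

void Pack12::set(size_t i, UnpackedType t)
{
    size_t bit = i * Number;
    size_t word = bit / 16, shift = bit % 16;
    uint32_t window = data[word];
    if (word + 1 < 3) window |= uint32_t(data[word + 1]) << 16;
    window = (window & ~(uint32_t(0x0FFF) << shift)) | (uint32_t(t & 0x0FFF) << shift);
    data[word] = uint16_t(window);
    if (word + 1 < 3) data[word + 1] = uint16_t(window >> 16);
}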
Related
(This is not a duplicate of this or this, which refer to fixed sizes; the issue is not to understand how pointers are stored, but whether the compiler can automate the manual function below.)
Based on this SO question, multidimensional arrays are stored sequentially.
// These arrays are the same
int array1[3][2] = {{0, 1}, {2, 3}, {4, 5}};
int array2[6] = { 0, 1, 2, 3, 4, 5 };
However, I'm trying to create a two-dimensional array of floats in pre-allocated memory:
float a[5][10];
float b[50]; // should be the same memory
Then I'm trying:
vector<char> x(1000);
float** a = (float**)x.data();
a[0][1] = 5;
The above code crashes, obviously because the compiler does not know the size of the array, so it cannot lay it out in memory the way it does for the arrays with compile-time-known dimensions in the first example.
Is there a way to tell the compiler to allocate a multidimensional array in sequential memory without manually calculating the pointers (say, by manually shifting the index and calling placement new, for example)?
Currently, I'm doing it manually, for example:
template <typename T> size_t CreateBuffersInMemory(char* p, int n, int BufferSize)
{
    // ib is a T** member that receives the data
    int ty = sizeof(T);
    int ReqArraysBytes = n * sizeof(void*);
    int ReqT = ReqArraysBytes + n * ty * BufferSize; // pointer table + n buffers
    if (!p)
        return ReqT;
    memset(p, 0, ReqT);
    ib = (T**)p;
    p += n * sizeof(void*);
    for (int i = 0; i < n; i++)
    {
        ib[i] = (T*)p;
        p += ty * BufferSize;
    }
    return ReqT;
}
Thanks a lot.
To allocate a T[rows][cols] array as a one-dimensional array, allocate T[rows * cols].
To access element [i][j] of that one-dimensional array, you can do p[i * cols + j].
Example:
template<class T>
struct Array2d {
    T* elements_;
    unsigned columns_;

    Array2d(unsigned rows, unsigned columns)
        : elements_(new T[rows * columns]{}) // Allocate and value-initialize.
        , columns_(columns)
    {}

    T* operator[](unsigned row) {
        return elements_ + row * columns_;
    }

    // TODO: Implement the special member functions.
};

int main() {
    Array2d<int> a(5, 10);
    a[3][1] = 0;
}
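The TODO above can be filled in along these lines (a sketch of mine, not the answer's code). Since the class owns a raw new[] allocation it needs a destructor; the easiest safe choice is to disable copying (a real copy would also need the row count, which the class does not store) and provide a move:

~Array2d() { delete[] elements_; }
Array2d(const Array2d&) = delete;            // copying would also need the row count
Array2d& operator=(const Array2d&) = delete;
Array2d(Array2d&& other) noexcept
    : elements_(other.elements_), columns_(other.columns_) {
    other.elements_ = nullptr;               // leave the source destructible
}
Array2d& operator=(Array2d&& other) noexcept {
    if (this != &other) {
        delete[] elements_;
        elements_ = other.elements_;
        columns_ = other.columns_;
        other.elements_ = nullptr;
    }
    return *this;
}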
Your code invokes undefined behavior because x.data() does not point to an array of pointers but to an array of 1000 objects of type char. You should be thankful that it crashes… ;-)
One way to access a contiguous buffer of some type as if it was a multidimensional array is to have another object that represents a multidimensional view into this buffer. This view object can then, e.g., provide member functions to access the data using a multidimensional index. To enable the a[i][j][k] kind of syntax (which you seem to be aiming for), provide an overloaded [] operator which returns a proxy object that itself offers an operator [] and so on until you get down to a single dimension.
For example, for the case that dimensions are fixed at compile time, we can define
template <int Extent, int... Extents>
struct row_major_layout;

template <int Extent>
struct row_major_layout<Extent>
{
    static constexpr int size = Extent;   // total number of elements, used by test() below

    template <typename T>
    static auto view(T* data) { return data; }
};

template <int Extent, int... Extents>
struct row_major_layout
{
    static constexpr int stride = (Extents * ... * 1);
    static constexpr int size = Extent * stride;   // total number of elements, used by test() below

    template <typename T>
    class span
    {
        T* data;
    public:
        span(T* data) : data(data) {}
        auto operator[](std::size_t i) const
        {
            return row_major_layout<Extents...>::view(data + i * stride);
        }
    };

    template <typename T>
    static auto view(T* data) { return span<T>(data); }
};
and then simply create and access such a row_major_layout view
void test()
{
    constexpr int M = 7, N = 2, K = 5;
    std::vector<int> bla(row_major_layout<M, N, K>::size);
    auto a3d = row_major_layout<M, N, K>::view(data(bla));
    a3d[2][1][3] = 42;
}
live example here
Or in case the array bounds are dynamic:
template <int D>
class row_major_layout;
template <>
class row_major_layout<1>
{
public:
row_major_layout(std::size_t extent) {}
static constexpr std::size_t size(std::size_t extent)
{
return extent;
}
template <typename T>
friend auto view(T* data, const row_major_layout&)
{
return data;
}
};
template <int D>
class row_major_layout : row_major_layout<D - 1>
{
std::size_t stride;
public:
template <typename... Dim>
row_major_layout(std::size_t extent, Dim&&... extents)
: row_major_layout<D - 1>(std::forward<Dim>(extents)...), stride((extents * ... * 1))
{
}
template <typename... Dim>
static constexpr std::size_t size(std::size_t extent, Dim&&... extents)
{
return extent * row_major_layout<D - 1>::size(std::forward<Dim>(extents)...);
}
template <typename T>
class span
{
T* data;
std::size_t stride;
const row_major_layout<D - 1>& layout;
public:
span(T* data, std::size_t stride, const row_major_layout<D - 1>& layout)
: data(data), stride(stride), layout(layout)
{
}
auto operator[](std::size_t i) const
{
return view(data + i * stride, layout);
}
};
template <typename T>
friend auto view(T* data, const row_major_layout& layout)
{
return span<T>(data, layout.stride, layout);
}
};
and
void test(int M, int N, int K)
{
std::vector<int> bla(row_major_layout<3>::size(M, N, K));
auto a3d = view(data(bla), row_major_layout<3>(M, N, K));
a3d[2][1][3] = 42;
}
live example here
Based on this answer, assuming you want an array of char, you can do something like
std::vector<char> x(1000);
char (&ar)[200][5] = *reinterpret_cast<char (*)[200][5]>(x.data());
Then you can use ar as a normal two-dimensional array, like
char c = ar[2][3];
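The same trick covers the float a[5][10] case from the question (a sketch; it assumes the char buffer is large enough and suitably aligned for float, which vector<char> does not formally guarantee):

std::vector<char> x(1000);                        // 5 * 10 * sizeof(float) = 200 bytes needed
float (&a)[5][10] = *reinterpret_cast<float (*)[5][10]>(x.data());
a[0][1] = 5.0f;                                   // indexes like a normal 2D array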
For anyone trying to achieve the same, I've created a variadic template function that creates an n-dimensional array in existing memory:
template <typename T = char> size_t CreateArrayAtMemory(void*, size_t bs)
{
return bs*sizeof(T);
}
template <typename T = char,typename ... Args>
size_t CreateArrayAtMemory(void* p, size_t bs, Args ... args)
{
size_t R = 0;
size_t PS = sizeof(void*);
char* P = (char*)p;
char* P0 = (char*)p;
size_t BytesForAllPointers = bs*PS;
R = BytesForAllPointers;
char* pos = P0 + BytesForAllPointers;
for (size_t i = 0; i < bs; i++)
{
char** pp = (char**)P;
if (p)
*pp = pos;
size_t RLD = CreateArrayAtMemory<T>(p ? pos : nullptr, args ...);
P += PS;
R += RLD;
pos += RLD;
}
return R;
}
Usage:
Create a 2x3x4 char array:
int j = 0;
size_t n3 = CreateArrayAtMemory<char>(nullptr,2,3,4);
std::vector<char> a3(n3);
char*** f3 = (char***)a3.data();
CreateArrayAtMemory<char>(f3,2,3,4);
for (int i1 = 0; i1 < 2; i1++)
{
for (int i2 = 0; i2 < 3; i2++)
{
for (int i3 = 0; i3 < 4; i3++)
{
f3[i1][i2][i3] = j++;
}
}
}
I'm writing a ProtectedPtr class that protects objects in memory using the Windows Crypto API, and I've run into a problem creating a generic constant-time compare function. My current code:
template <class T>
bool operator==(volatile const ProtectedPtr& other)
{
std::size_t thisDataSize = sizeof(*protectedData) / sizeof(T);
std::size_t otherDataSize = sizeof(*other) / sizeof(T);
volatile auto thisData = (byte*)getEncyptedData();
volatile auto otherData = (byte*)other.getEncyptedData();
if (thisDataSize != otherDataSize)
return false;
volatile int result = 0;
for (int i = 0; i < thisDataSize; i++)
result |= thisData[i] ^ otherData[i];
return result == 0;
}
getEncryptedData function:
std::unique_ptr<T> protectedData;
const T& getEncyptedData() const
{
ProtectMemory(true);
return *protectedData;
}
The problem is casting to byte*. When using this class with strings, my compiler complains that strings can't be cast to byte pointers. I was thinking of maybe basing my function on Go's ConstantTimeByteEq function, but that still brings me back to my original problem of converting a template type to an int, or something that I can perform binary manipulation on.
Go's ConstantTimeByteEq function:
func ConstantTimeByteEq(x, y uint8) int {
z := ^(x ^ y)
z &= z >> 4
z &= z >> 2
z &= z >> 1
return int(z)
}
How can I easily convert a template type into something that can have binary manipulation easily performed on it?
UPDATE: Working generic constant-time compare function based on suggestions from lockcmpxchg8b:
//only works on primitive types, and types that don't have
//internal pointers pointing to dynamically allocated data
byte* serialize()
{
const size_t size = sizeof(*protectedData);
byte* out = new byte[size];
ProtectMemory(false);
memcpy(out, &(*protectedData), size);
ProtectMemory(true);
return out;
}
bool operator==(ProtectedPtr& other)
{
    if (sizeof(*protectedData) != sizeof(*other))
        return false;
    const size_t size = sizeof(*protectedData);
    volatile auto thisData = serialize();
    volatile auto otherData = other.serialize();
    volatile int result = 0;
    for (size_t i = 0; i < size; i++)
        result |= thisData[i] ^ otherData[i];
    //wipe the unencrypted copies of the data
    //(pass the data size; sizeof(thisData) would only be the size of the pointer)
    SecureZeroMemory(thisData, size);
    SecureZeroMemory(otherData, size);
    delete[] thisData;
    delete[] otherData;
    return result == 0;
}
Generally, what you're trying to accomplish in your current code is called Format Preserving Encryption. I.e., to encrypt a std::string such that the resulting ciphertext is also a valid std::string. This is much harder than letting the encryption process convert from the original type to a flat array of bytes.
To do the conversion to a flat array, declare a second template argument for a "Serializer" object that knows how to serialize objects of type T into an array of unsigned char. You could default it to a generic sizeof/memcpy serializer that would work for all primitive types.
Here's an example for std::string.
template <class T>
class Serializer
{
public:
virtual size_t serializedSize(const T& obj) const = 0;
virtual size_t serialize(const T& obj, unsigned char *out, size_t max) const = 0;
virtual void deserialize(const unsigned char *in, size_t len, T& out) const = 0;
};
class StringSerializer : public Serializer<std::string>
{
public:
size_t serializedSize(const std::string& obj) const {
return obj.length();
};
size_t serialize(const std::string& obj, unsigned char *out, size_t max) const {
if(max >= obj.length()){
memcpy(out, obj.c_str(), obj.length());
return obj.length();
}
throw std::runtime_error("overflow");
}
void deserialize(const unsigned char *in, size_t len, std::string& out) const {
out = std::string((const char *)in, (const char *)(in+len));
}
};
Once you've reduced the objects down to a flat array of unsigned chars, then your given constant-time compare algorithm will work just fine.
Here's a really dumbed-down version of your example code using the serializer above.
template <class T, class S>
class Test
{
std::unique_ptr<unsigned char[]> protectedData;
size_t serSize;
public:
Test(const T& obj) : protectedData() {
S serializer;
size_t size = serializer.serializedSize(obj);
protectedData.reset(new unsigned char[size]);
serSize = serializer.serialize(obj, protectedData.get(), size);
// "Encrypt"
for(size_t i=0; i< size; i++)
protectedData.get()[i] ^= 0xa5;
}
size_t getEncryptedLen() const {
return serSize;
}
const unsigned char *getEncryptedData() const {
return protectedData.get();
}
const T getPlaintextData() const {
S serializer;
T target;
//"Decrypt"
for(size_t i=0; i< serSize; i++)
protectedData.get()[i] ^= 0xa5;
serializer.deserialize(protectedData.get(), serSize, target);
return target;
}
};
int main(int argc, char *argv[])
{
std::string data = "test";
Test<std::string, StringSerializer> tester(data);
const unsigned char *ptr = tester.getEncryptedData();
std::cout << "\"Encrypted\" bytes: ";
for(size_t i=0; i<tester.getEncryptedLen(); i++)
std::cout << std::setw(2) << std::hex << std::setfill('0') << (unsigned int)ptr[i] << " ";
std::cout << std::endl;
std::string recov = tester.getPlaintextData();
std::cout << "Recovered: " << recov << std::endl;
}
Output:
$ ./a.out
"Encrypted" bytes: d1 c0 d6 d1
Recovered: test
Edit: answering a request for a generic serializer for primitive/flat types. Consider this pseudocode, because I'm typing it into a browser without testing. I'm not sure if that's the right template syntax.
template<class T>
class PrimitiveSerializer : public Serializer<T>
{
public:
size_t serializedSize(const T& obj) const {
return sizeof obj;
};
size_t serialize(const T& obj, unsigned char *out, size_t max) const {
if(max >= sizeof obj){
memcpy(out, &obj, sizeof obj);
return sizeof obj;
}
throw std::runtime_error("overflow");
}
void deserialize(const unsigned char *in, size_t len, T& out) const {
if(len < sizeof out) {
throw std::runtime_error("underflow");
}
memcpy(&out, in, sizeof out);
}
};
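To get the "default it to a generic sizeof/memcpy serializer" behaviour mentioned at the top of this answer, the second template parameter of Test can simply be defaulted; this is a sketch of mine, not code from the answer:

// Default the serializer so flat/primitive types need no explicit second argument.
template <class T, class S = PrimitiveSerializer<T>>
class Test { /* body exactly as above */ };

// usage
Test<int> flatTest(42);                            // PrimitiveSerializer<int> picked by default
Test<std::string, StringSerializer> strTest("x");  // explicit serializer for non-flat types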
I'm curious about what error the compiler gives you.
That said, try casting to a const char* or const void*.
Another issue could be the cast from a 64-bit pointer to an 8-bit byte. Try casting to an int, long, or long long.
Edit: Based upon your feedback, another minor change:
volatile auto thisData = (byte*)&getEncyptedData();
volatile auto otherData = (byte*)&other.getEncyptedData();
(note the ampersands). That will allow the previous casts to work.
I am trying to make a class that acts as a multidimensional vector. It doesn't have to do anything fancy. I basically want to have a "container" class foo where I can access elements by foo[x][y][z]. Now I would also need similar classes for foo[x][y] and foo[x]. Which led me to ponder the following (more general) question: is there a way to make something like this where you can just initialize it as foo A(a,b,c,...) for any number n of arguments and get an n-dimensional vector with elements accessible by [][][]...? Below is the class I have for (as an example) the four-dimensional case.
First the header
#ifndef FCONTAINER_H
#define FCONTAINER_H
#include <iostream>
using namespace std;
class Fcontainer
{
private:
unsigned dim1, dim2, dim3, dim4 ;
double* data;
public:
Fcontainer(unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4);
~Fcontainer();
Fcontainer(const Fcontainer& m);
Fcontainer& operator= (const Fcontainer& m);
double& operator() (unsigned const dim1, unsigned const dim2, unsigned const dim3, unsigned const dim4);
double const& operator() (unsigned const dim1, unsigned const dim2, unsigned const dim3, unsigned const dim4) const;
};
#endif // FCONTAINER_H
Now the cpp:
#include "fcontainer.hpp"
Fcontainer::Fcontainer(unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4)
{
dim1 = dims1; dim2 = dims2; dim3 = dims3; dim4 = dims4;
if (dims1 == 0 || dims2 == 0 || dims3 == 0 || dims4 == 0)
throw std::invalid_argument("Container constructor has 0 size");
data = new double[dims1 * dims2 * dims3 * dims4];
}
Fcontainer::~Fcontainer()
{
delete[] data;
}
double& Fcontainer::operator() (unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4)
{
    if (dims1 >= dim1 || dims2 >= dim2 || dims3 >= dim3 || dims4 >= dim4)
        throw std::invalid_argument("Container subscript out of bounds");
    // row-major index: ((i1 * d2 + i2) * d3 + i3) * d4 + i4
    return data[dims1*dim2*dim3*dim4 + dims2*dim3*dim4 + dims3*dim4 + dims4];
}
double const& Fcontainer::operator() (unsigned const dims1, unsigned const dims2, unsigned const dims3, unsigned const dims4) const
{
    if (dims1 >= dim1 || dims2 >= dim2 || dims3 >= dim3 || dims4 >= dim4)
        throw std::invalid_argument("Container subscript out of bounds");
    return data[dims1*dim2*dim3*dim4 + dims2*dim3*dim4 + dims3*dim4 + dims4];
}
So I want to expand this to an arbitrary number of dimensions. I suppose it will take something along the lines of a variadic template or a std::initializer_list, but I am not clear on how to approach this for this problem.
Messing around in Visual Studio for a little while, I came up with this nonsense:
template<typename T>
class Matrix {
std::vector<size_t> dimensions;
std::unique_ptr<T[]> _data;
template<typename ... Dimensions>
size_t apply_dimensions(size_t dim, Dimensions&& ... dims) {
dimensions.emplace_back(dim);
return dim * apply_dimensions(std::forward<Dimensions>(dims)...);
}
size_t apply_dimensions(size_t dim) {
dimensions.emplace_back(dim);
return dim;
}
public:
Matrix(std::vector<size_t> dims) : dimensions(std::move(dims)) {
size_t size = flat_size();
_data = std::make_unique<T[]>(size);
}
template<typename ... Dimensions>
Matrix(size_t dim, Dimensions&&... dims) {
size_t size = apply_dimensions(dim, std::forward<Dimensions>(dims)...);
_data = std::make_unique<T[]>(size);
}
T & operator()(std::vector<size_t> const& indexes) {
if(indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
T const& operator()(std::vector<size_t> const& indexes) const {
if (indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
template<typename ... Indexes>
T & operator()(size_t idx, Indexes&& ... indexes) {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
template<typename ... Indexes>
T const& operator()(size_t idx, Indexes&& ... indexes) const {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
T & at(size_t flat_index) {
return _data[flat_index];
}
T const& at(size_t flat_index) const {
return _data[flat_index];
}
size_t dimension_size(size_t dim) const {
return dimensions[dim];
}
size_t num_of_dimensions() const {
return dimensions.size();
}
size_t flat_size() const {
size_t size = 1;
for (size_t dim : dimensions)
size *= dim;
return size;
}
private:
size_t get_flat_index(std::vector<size_t> const& indexes) const {
size_t dim = 0;
size_t flat_index = 0;
for (size_t index : indexes) {
flat_index += get_offset(index, dim++);
}
return flat_index;
}
template<typename ... Indexes>
size_t get_flat_index(size_t dim, size_t index, Indexes&& ... indexes) const {
return get_offset(index, dim) + get_flat_index(dim + 1, std::forward<Indexes>(indexes)...);
}
size_t get_flat_index(size_t dim, size_t index) const {
return get_offset(index, dim);
}
size_t get_offset(size_t index, size_t dim) const {
if (index >= dimensions[dim])
throw std::runtime_error("Index out of Bounds");
for (size_t i = dim + 1; i < dimensions.size(); i++) {
index *= dimensions[i];
}
return index;
}
};
Let's talk about what this code accomplishes.
//private:
template<typename ... Dimensions>
size_t apply_dimensions(size_t dim, Dimensions&& ... dims) {
dimensions.emplace_back(dim);
return dim * apply_dimensions(std::forward<Dimensions>(dims)...);
}
size_t apply_dimensions(size_t dim) {
dimensions.emplace_back(dim);
return dim;
}
public:
Matrix(std::vector<size_t> dims) : dimensions(std::move(dims)) {
size_t size = flat_size();
_data = std::make_unique<T[]>(size);
}
template<typename ... Dimensions>
Matrix(size_t dim, Dimensions&&... dims) {
size_t size = apply_dimensions(dim, std::forward<Dimensions>(dims)...);
_data = std::make_unique<T[]>(size);
}
What this code enables us to do is write an initializer for this matrix that takes an arbitrary number of dimensions.
int main() {
Matrix<int> mat{2, 2}; //Yields a 2x2 2D Rectangular Matrix
mat = Matrix<int>{4, 6, 5};//mat is now a 4x6x5 3D Rectangular Matrix
mat = Matrix<int>{9};//mat is now a 9-length 1D array.
mat = Matrix<int>{2, 3, 4, 5, 6, 7, 8, 9};//Why would you do this? (yet it compiles...)
}
And if the number and sizes of the dimensions are only known at runtime, this code will work around that:
int main() {
std::cout << "Input the sizes of each of the dimensions.\n";
std::string line;
std::getline(std::cin, line);
std::stringstream ss(line);
size_t dim;
std::vector<size_t> dimensions;
while(ss >> dim)
dimensions.emplace_back(dim);
Matrix<int> mat{dimensions};//Voila.
}
Then, we want to be able to access arbitrary indexes of this matrix. This code offers two ways to do so: either statically using templates, or variably at runtime.
//public:
T & operator()(std::vector<size_t> const& indexes) {
if(indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
T const& operator()(std::vector<size_t> const& indexes) const {
if (indexes.size() != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
return _data[get_flat_index(indexes)];
}
template<typename ... Indexes>
T & operator()(size_t idx, Indexes&& ... indexes) {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
template<typename ... Indexes>
T const& operator()(size_t idx, Indexes&& ... indexes) const {
if (sizeof...(indexes)+1 != dimensions.size())
throw std::runtime_error("Incorrect number of parameters used to retrieve Matrix Data!");
size_t flat_index = get_flat_index(0, idx, std::forward<Indexes>(indexes)...);
return at(flat_index);
}
And then, in practice:
Matrix<int> mat{6, 5};
mat(5, 2) = 17;
//mat(5, 1, 7) = 24; //throws exception at runtime because of wrong number of dimensions.
mat = Matrix<int>{9, 2, 8};
mat(5, 1, 7) = 24;
//mat(5, 2) = 17; //throws exception at runtime because of wrong number of dimensions.
And this works fine with runtime-dynamic indexing:
std::vector<size_t> indexes;
/*...*/
mat(indexes) = 54; //Will throw if index count is wrong, will succeed otherwise
There are a number of other functions that this kind of object might want, like a resize method, but choosing how to implement that is a high-level design decision. I've also left out tons of other potentially valuable implementation details (like an optimizing move-constructor, a comparison operator, a copy constructor) but this should give you a pretty good idea of how to start.
EDIT:
If you want to avoid use of templates entirely, you can cut about half of the code provided here, and just use the methods/constructor that take std::vector<size_t> to provide dimension/index data. If you don't need the ability to dynamically adapt at runtime to the number of dimensions, you can remove the std::vector<size_t> overloads, and possibly even make the number of dimensions a template argument for the class itself (which would enable you to use size_t[] or std::array<size_t, N> to store dimensional data).
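A rough sketch of that last suggestion (dimension count as a class template argument); FixedMatrix and its members are illustrative names of mine, not code from the answer:

#include <array>
#include <cstddef>
#include <memory>

template <typename T, std::size_t N>
class FixedMatrix {
    std::array<std::size_t, N> dimensions;   // rank fixed at compile time
    std::unique_ptr<T[]> _data;
public:
    explicit FixedMatrix(std::array<std::size_t, N> dims) : dimensions(dims) {
        std::size_t size = 1;
        for (std::size_t d : dimensions) size *= d;
        _data = std::make_unique<T[]>(size);
    }
    T& operator()(std::array<std::size_t, N> const& idx) {
        std::size_t flat = 0;
        for (std::size_t d = 0; d < N; ++d) flat = flat * dimensions[d] + idx[d];   // row-major
        return _data[flat];
    }
};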
Well, assuming you care about efficiency at all, you probably want to store all of the elements in a contiguous manner regardless. So you probably want to do something like:
template <std::size_t N, class T>
class MultiArray {
MultiArray(const std::array<std::size_t, N> sizes)
    : m_sizes(sizes)
    , m_data(product(m_sizes)) {} // construct the vector at its final size (resize() can't be called in an init list)
std::array<std::size_t, N> m_sizes;
std::vector<T> m_data;
};
The indexing part is where it gets kind of fun. Basically, if you want a[1][2][3] etc. to work, you have to have [] return some kind of proxy object that has its own operator[]. Each one would have to be aware of its own rank. Each time you do [] it returns a proxy letting you specify the next index.
template <std::size_t N, class T>
class MultiArray {
// as before
template <std::size_t rank>
class Indexor {
Indexor(MultiArray& parent, const std::array<std::size_t, N>& indices = {})
: m_parent(parent), m_indices(indices) {}
auto operator[](std::size_t index) {
m_indices[rank] = index;
return Indexor<rank+1>(m_parent, m_indices); // (parent, indices) to match the constructor above
}
std::array<std::size_t, N> m_indices;
MultiArray& m_parent;
};
auto operator[](std::size_t index) {
return Indexor<0>(*this)[index];
}
};
Finally, you have a specialization for when you're done with the last index:
template <>
class Indexor<N-1> { // with obvious constructor
auto operator[](std::size_t index) {
m_indices[N-1] = index;
return m_parent.m_data[indexed_product(m_indices, m_parent.m_sizes)];
}
std::array<std::size_t, N> m_indices;
MultiArray& m_parent;
};
Obviously this is a sketch, but at this point it's just filling out details and getting it to compile. There are other approaches, like instead having the indexor object hold two iterators and narrowing, but that seemed a bit more complex. You also don't need to template the Indexor class and could use a runtime integer instead, but that would make it very easy to misuse: having one too many or too few [] would be a runtime error, not a compile-time one.
Edit: you would also be able to initialize this in the way you describe in C++17, but not in C++14. But in C++14 you can just use a function:
template <class ... Ts>
auto make_double_array(Ts... ts) {
    return MultiArray<sizeof...(Ts), double>({static_cast<std::size_t>(ts)...});
}
Edit2: I use product and indexed_product in the implementation. The first is obvious; the second less so, but hopefully still clear: given an array of dimensions and an array of indices, it returns the position of that element in the flat array.
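Since product and indexed_product are only named, not shown, here is roughly what they could look like (my sketch, assuming the usual row-major layout):

#include <array>
#include <cstddef>

template <std::size_t N>
std::size_t product(const std::array<std::size_t, N>& sizes) {
    std::size_t p = 1;
    for (std::size_t s : sizes) p *= s;            // total number of elements
    return p;
}

template <std::size_t N>
std::size_t indexed_product(const std::array<std::size_t, N>& indices,
                            const std::array<std::size_t, N>& sizes) {
    std::size_t pos = 0;
    for (std::size_t d = 0; d < N; ++d)
        pos = pos * sizes[d] + indices[d];         // ((i0*s1 + i1)*s2 + i2)...
    return pos;
}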
I coded a small, simple program that compares the performance of several ways to fill a simple 8x8 matrix with different kinds of containers. Here's the code:
#define MATRIX_DIM 8
#define OCCUR_MAX 100000
static void genHeapAllocatedMatrix(void)
{
int **pPixels = new Pixel *[MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
pPixels[idy] = new Pixel[MATRIX_DIM];
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++)
pPixels[idy][idx] = 42;
}
}
static void genStackAllocatedMatrix(void)
{
std::array<std::array<int, 8>, 8> matrix;
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
matrix[idy][idx] = 42;
}
}
}
static void genStackAllocatedMatrixBasic(void)
{
int matrix[MATRIX_DIM][MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
matrix[idy][idx] = 42;
}
}
}
int main(void)
{
clock_t begin, end;
double time_spent;
begin = clock();
for (type::uint32 idx = 0; idx < OCCUR_MAX; idx++)
{
//genHeapAllocatedMatrix();
genStackAllocatedMatrix();
//genStackAllocatedMatrixBasic();
}
end = clock();
time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
std::cout << "Elapsed time = " << time_spent << std::endl;
return (0);
}
As you can guess, the most efficient way is the last one, with a simple two-dimensional C array (hard-coded). Of course, the worst choice is the first one, using heap allocations.
My problem is that I want to store this two-dimensional array as an attribute in a class. Here's the definition of a custom class that handles a matrix:
template <typename T>
class Matrix
{
public:
Matrix(void);
Matrix(type::uint32 column, type::uint32 row);
Matrix(Matrix const &other);
virtual ~Matrix(void);
public:
Matrix &operator=(Matrix const &other);
bool operator!=(Matrix const &other);
bool operator==(Matrix const &other);
type::uint32 rowCount(void) const;
type::uint32 columnCount(void) const;
void printData(void) const;
T **getData(void) const;
void setData(T **matrix);
private:
type::uint32 m_ColumnCount;
type::uint32 m_RowCount;
T **m_pMatrix;
};
To get the job done I tried the following, using a cast:
Matrix<int> matrix;
int tab[MATRIX_DIM][MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
tab[idy][idx] = 42;
}
}
matrix.setData((int**)&tab[0][0]);
This code compiles correctly, but if I try to print the matrix there's a segmentation fault.
int tab[MATRIX_DIM][MATRIX_DIM];
for (type::uint32 idy = 0; idy < MATRIX_DIM; idy++) {
for (type::uint32 idx = 0; idx < MATRIX_DIM; idx++) {
tab[idy][idx] = 42;
}
}
int **matrix = (int**)&tab[0][0];
std::cout << matrix[0][0] << std::endl; //Segmentation fault
Is there a way to store this kind of two-dimensional array as an attribute without heap allocation?
That's because a two-dimensional array is not an array of pointers.
So, you should use int * for your matrix type, but then of course you will not be able to index it by two dimensions.
Another option is to store a pointer to the array:
int (*matrix)[MATRIX_DIM][MATRIX_DIM];
matrix = &tab;
std::cout << (*matrix)[0][0] << std::endl;
But that doesn't suit the idea of encapsulating the matrix in a class very well. A better idea would be for the class to allocate the storage itself (possibly in a single heap allocation) and to provide access to the matrix through methods only (e.g. GetCell(row, col) etc.), without exposing raw pointers.
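A minimal sketch of that "methods only" idea (my code, with illustrative names; it hard-codes the question's 8x8 case and keeps everything in one flat block):

#include <array>
#include <cstddef>

class Matrix8 {
    std::array<int, 8 * 8> m_data{};   // single contiguous block, no per-row allocation
public:
    int GetCell(std::size_t row, std::size_t col) const { return m_data[row * 8 + col]; }
    void SetCell(std::size_t row, std::size_t col, int value) { m_data[row * 8 + col] = value; }
};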
Measuring the speed of operations on an 8 x 8 array is largely pointless. For a data set as small as that, the cost of the operation will be close to zero and you are mostly measuring setup time, etc.
Timings become important for larger data sets, but you cannot sensibly extrapolate the small set results to the larger set. With larger data sets you will often find that the data exists on multiple memory pages. There is a danger that paging costs will dominate other costs. Very large improvements in efficiency are possible by ensuring that your algorithm processes all (or most) of the data on one page before moving to the next page, rather than constantly swapping pages.
In general, you are best off using the simplest data structures with the least likelihood of programming error, and optimising the processing algorithms. I say "in general" as extreme cases do exist where small differences in access time matter, but they are rare.
Use a single array to represent the matrix instead of allocating for each index.
I've written a class for this already. Feel free to use it:
#include <vector>
template<typename T, typename Allocator = std::allocator<T>>
class DimArray
{
private:
int Width, Height;
std::vector<T, Allocator> Data;
public:
DimArray(int Width, int Height);
DimArray(T* Data, int Width, int Height);
DimArray(T** Data, int Width, int Height);
virtual ~DimArray() {}
DimArray(const DimArray &da);
DimArray(DimArray &&da);
inline std::size_t size() {return Data.size();}
inline std::size_t size() const {return Data.size();}
inline int width() {return Width;}
inline int width() const {return Width;}
inline int height() {return Height;}
inline int height() const {return Height;}
inline T* operator [](const int Index) {return Data.data() + Height * Index;}
inline const T* operator [](const int Index) const {return Data.data() + Height * Index;}
inline DimArray& operator = (DimArray da);
};
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(int Width, int Height) : Width(Width), Height(Height), Data(Width * Height, 0) {}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(T* Data, int Width, int Height) : Width(Width), Height(Height), Data(Width * Height, 0) {std::copy(&Data[0], &Data[0] + Width * Height, const_cast<T*>(this->Data.data()));}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(T** Data, int Width, int Height) : Width(Width), Height(Height), Data(Width * Height, 0) {std::copy(Data[0], Data[0] + Width * Height, const_cast<T*>(this->Data.data()));}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(const DimArray &da) : Width(da.Width), Height(da.Height), Data(da.Data) {}
template<typename T, typename Allocator>
DimArray<T, Allocator>::DimArray(DimArray &&da) : Width(std::move(da.Width)), Height(std::move(da.Height)), Data(std::move(da.Data)) {}
template<typename T, typename Allocator>
DimArray<T, Allocator>& DimArray<T, Allocator>::operator = (DimArray<T, Allocator> da)
{
this->Width = da.Width;
this->Height = da.Height;
this->Data.swap(da.Data);
return *this;
}
Usage:
int main()
{
DimArray<int> Matrix(1000, 1000); //creates a 1000 * 1000 matrix.
Matrix[0][0] = 100; //ability to index it like a multi-dimensional array.
}
More usage:
template<typename T, std::size_t size>
class uninitialised_stack_allocator : public std::allocator<T>
{
private:
alignas(16) T data[size];
public:
typedef typename std::allocator<T>::pointer pointer;
typedef typename std::allocator<T>::size_type size_type;
typedef typename std::allocator<T>::value_type value_type;
template<typename U>
struct rebind {typedef uninitialised_stack_allocator<U, size> other;};
pointer allocate(size_type n, const void* hint = 0) {return static_cast<pointer>(&data[0]);}
void deallocate(void* ptr, size_type n) {}
size_type max_size() const {return size;}
};
int main()
{
DimArray<int, uninitialised_stack_allocator<int, 1000 * 1000>> Matrix(1000, 1000);
}
I am attempting to implement a packed_bits class using variadic templates and std::bitset.
In particular, I am running into problems writing a get function which returns a reference to a subset of the member m_bits which contains all the packed bits. The function should be analogous to std::get for std::tuple.
It should act as an reference overlay so I can manipulate a subset of packed_bits.
For instance,
using my_bits = packed_bits<8,16,4>;
my_bits b;
std::bitset< 8 >& s0 = get<0>( b );
std::bitset< 16 >& s1 = get<1>( b );
std::bitset< 4 >& s2 = get<2>( b );
UPDATE
Below is the code that has been rewritten according to Yakk's recommendations below. I am stuck at the point of his last paragraph: not sure how to glue together the individual references into one bitset-like reference. Thinking/working on that last part now.
UPDATE 2
Okay, my new approach is going to be to let bit_slice<> do the bulk of the work:
it is meant to be short-lived
it will publicly subclass std::bitset<length>, acting as a temporary buffer
on construction, it will copy from packed_bits<>& m_parent;
on destruction, it will write to m_parent
in addition to the reference via m_parent, it must also know offset, length
get<> will become a free function which takes a packed_bits<> and returns a bit_slice<> by value instead of a bitset<> by reference
There are various short-comings to this approach:
bit_slice<> has to be relatively short-lived to avoid aliasing issues, since we only update on construction and destruction
we must avoid multiple overlapping references while coding (whether threaded or not)
we will be prone to slicing if we attempt to hold a pointer to the base class when we have an instance of the child class
but I think this will be sufficient for my needs. I will post the finished code when it is complete.
UPDATE 3
After fighting with the compiler, I think I have a basic version working. Unfortunately, I could not get the free-floating ::get() to compile properly: BROKEN shows the spot. Otherwise, I think it's working.
Many thanks to Yakk for his suggestions: the code below is about 90%+ based on his comments.
UPDATE 4
Free-floating ::get() fixed.
UPDATE 5
As suggested by Yakk, I have eliminated the copy. bit_slice<> will read on get_value() and write on set_value(). Probably 90%+ of my calls will be through these interfaces anyways, so no need to subclass/copy.
No more dirtiness.
CODE
#include <cassert>
#include <bitset>
#include <iostream>
// ----------------------------------------------------------------------------
template<unsigned... Args>
struct Add { enum { val = 0 }; };
template<unsigned I,unsigned... Args>
struct Add<I,Args...> { enum { val = I + Add<Args...>::val }; };
template<int IDX,unsigned... Args>
struct Offset { enum { val = 0 }; };
template<int IDX,unsigned I,unsigned... Args>
struct Offset<IDX,I,Args...> {
enum {
val = IDX>0 ? I + Offset<IDX-1,Args...>::val : Offset<IDX-1,Args...>::val
};
};
template<int IDX,unsigned... Args>
struct Length { enum { val = 0 }; };
template<int IDX,unsigned I,unsigned... Args>
struct Length<IDX,I,Args...> {
enum {
val = IDX==0 ? I : Length<IDX-1,Args...>::val
};
};
// ----------------------------------------------------------------------------
template<unsigned... N_Bits>
struct seq
{
static const unsigned total_bits = Add<N_Bits...>::val;
static const unsigned size = sizeof...( N_Bits );
template<int IDX>
struct offset
{
enum { val = Offset<IDX,N_Bits...>::val };
};
template<int IDX>
struct length
{
enum { val = Length<IDX,N_Bits...>::val };
};
};
// ----------------------------------------------------------------------------
template<unsigned offset,unsigned length,typename PACKED_BITS>
struct bit_slice
{
PACKED_BITS& m_parent;
bit_slice( PACKED_BITS& t ) :
m_parent( t )
{
}
~bit_slice()
{
}
bit_slice( bit_slice const& rhs ) :
m_parent( rhs.m_parent )
{ }
bit_slice& operator=( bit_slice const& rhs ) = delete;
template<typename U_TYPE>
void set_value( U_TYPE u )
{
for ( unsigned i=0; i<length; ++i )
{
m_parent[offset+i] = u&1;
u >>= 1;
}
}
template<typename U_TYPE>
U_TYPE get_value() const
{
U_TYPE x = 0;
for ( int i=length-1; i>=0; --i )
{
if ( m_parent[offset+i] )
++x;
if ( i!=0 )
x <<= 1;
}
return x;
}
};
template<typename SEQ>
struct packed_bits :
public std::bitset< SEQ::total_bits >
{
using bs_type = std::bitset< SEQ::total_bits >;
using reference = typename bs_type::reference;
template<int IDX> using offset = typename SEQ::template offset<IDX>;
template<int IDX> using length = typename SEQ::template length<IDX>;
template<int IDX> using slice_type =
bit_slice<offset<IDX>::val,length<IDX>::val,packed_bits>;
template<int IDX>
slice_type<IDX> get()
{
return slice_type<IDX>( *this );
}
};
template<int IDX,typename T>
typename T::template slice_type<IDX>
get( T& t )
{
    return t.template get<IDX>(); // 'template' disambiguator needed for the dependent member template
}
// ----------------------------------------------------------------------------
int main( int argc, char* argv[] )
{
using my_seq = seq<8,16,4,8,4>;
using my_bits = packed_bits<my_seq>;
using my_slice = bit_slice<8,16,my_bits>;
using slice_1 =
bit_slice<my_bits::offset<1>::val,my_bits::length<1>::val,my_bits>;
my_bits b;
my_slice s( b );
slice_1 s1( b );
assert( sizeof( b )==8 );
assert( my_seq::total_bits==40 );
assert( my_seq::size==5 );
assert( my_seq::offset<0>::val==0 );
assert( my_seq::offset<1>::val==8 );
assert( my_seq::offset<2>::val==24 );
assert( my_seq::offset<3>::val==28 );
assert( my_seq::offset<4>::val==36 );
assert( my_seq::length<0>::val==8 );
assert( my_seq::length<1>::val==16 );
assert( my_seq::length<2>::val==4 );
assert( my_seq::length<3>::val==8 );
assert( my_seq::length<4>::val==4 );
{
auto s2 = b.get<2>();
}
{
auto s2 = ::get<2>( b );
s2.set_value( 25 ); // 25==11001, but only 4 bits, so we take 1001
assert( s2.get_value<unsigned>()==9 );
}
return 0;
}
I wouldn't have get return a bitset, because each bitset needs to manage its own memory.
Instead, I'd use a bitset internally to manage all of the bits, and create bitset::reference-like individual bit references, and bitset-like "slices", which get can return.
A bitslice would have a pointer back to the original packed_bits, and would know the offset where it starts and how wide it is. Its references to individual bits would be references from the original packed_bits, which are references from the internal bitset, possibly.
Your Size is redundant -- sizeof...(pack) tells you how many elements are in the pack.
I'd pack up the sizes of the slices into a sequence so you can pass them around more easily. I.e.:
template<unsigned... Vs>
struct seq {};
is a type from which you can extract an arbitrary length list of unsigned ints, yet can be passed as a single parameter to a template.
As a first step, write bit_slice<offset, length>, which takes a std::bitset<size> and produces bitset::references to individual bits, where bit_slice<offset, length>[n] is the same as bitset[n+offset].
Optionally, bit_slice could store offset as a run-time parameter (because offset as a compile-time parameter is just an optimization, and not that strong of one I suspect).
Once you have bit_slice, working on the tuple syntax of packed_bits is feasible. get<n, offset=0>( packed_bits<a,b,c,...>& ) returns a bit_slice<x> determined by indexing the packed_bits sizes, with an offset determined by adding the first n-1 packed_bits sizes, which is then constructed from the internal bitset of the packed_bits.
Make sense?
Apparently not. Here is a quick bit_slice that represents some sub-range of bits within a std::bitset.
#include <bitset>
template<unsigned Width, unsigned Offset, std::size_t SrcBitWidth>
struct bit_slice {
private:
std::bitset<SrcBitWidth>* bits;
public:
// cast to `bitset`:
operator std::bitset<Width>() const {
std::bitset<Width> retval;
for(unsigned i = 0; i < Width; ++i) {
retval[i] = (*this)[i];
}
return retval;
}
typedef typename std::bitset<SrcBitWidth>::reference reference;
reference operator[]( size_t pos ) {
// TODO: check that pos < Width?
return (*bits)[pos+Offset];
}
constexpr bool operator[]( size_t pos ) const {
// TODO: check that pos < Width?
return (*bits)[pos+Offset];
}
typedef bit_slice<Width, Offset, SrcBitWidth> self_type;
// can be assigned to from any bit_slice with the same width:
template<unsigned O_Offset, unsigned O_SrcBitWidth>
self_type& operator=( bit_slice<Width, O_Offset, O_SrcBitWidth>&& o ) {
for (unsigned i = 0; i < Width; ++i ) {
(*this)[i] = o[i];
}
return *this;
}
// can be assigned from a `std::bitset<Width>` of the same size:
self_type& operator=( std::bitset<Width> const& o ) {
for (unsigned i = 0; i < Width; ++i ) {
(*this)[i] = o[i];
}
return *this;
}
explicit bit_slice( std::bitset<SrcBitWidth>& src ):bits(&src) {}
bit_slice( self_type const& ) = default;
bit_slice( self_type&& ) = default;
bit_slice( self_type&o ):bit_slice( const_cast<self_type const&>(o)) {}
// I suspect, but am not certain, that the default move/copy ctor would do...
// dtor not needed, as there is nothing to destroy
// TODO: reimplement rest of std::bitset's interface that you care about
};
template<unsigned offset, unsigned width, std::size_t src_width>
bit_slice< width, offset, src_width > make_slice( std::bitset<src_width>& src ) {
return bit_slice< width, offset, src_width >(src);
}
#include <iostream>
int main() {
std::bitset<16> bits;
bits[8] = true;
auto slice = make_slice< 8, 8 >( bits );
bool b0 = slice[0];
bool b1 = slice[1];
std::cout << b0 << b1 << "\n"; // should output 10
}
Another useful class would be a bit_slice with runtime offset-and-source size. This will be less efficient, but easier to program against.
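Such a runtime-parameterised slice could look roughly like this (my sketch; dyn_bit_slice and its members are made-up names). It trades the compile-time width checking of the version above for flexibility:

#include <bitset>
#include <cstddef>

template <std::size_t SrcBitWidth>
class dyn_bit_slice {
    std::bitset<SrcBitWidth>* bits;
    std::size_t offset, width;
public:
    dyn_bit_slice(std::bitset<SrcBitWidth>& src, std::size_t offset, std::size_t width)
        : bits(&src), offset(offset), width(width) {}
    std::size_t size() const { return width; }
    typename std::bitset<SrcBitWidth>::reference operator[](std::size_t pos) {
        return (*bits)[pos + offset];              // no bounds check, as in the template version
    }
    bool operator[](std::size_t pos) const { return (*bits)[pos + offset]; }
};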
I'm going to guess it's something like this:
#include <iostream>
#include <bitset>
using namespace std;
template<int N, int L, int R>
bitset<L-R+1>
slice(bitset<N> value)
{
    constexpr size_t W = L - R + 1; // width of the slice: bits R..L inclusive
    static_assert(W <= sizeof(uint64_t) * 8, "Exceeding integer word size");
    uint64_t mask = (W == 64) ? ~uint64_t(0) : ((uint64_t(1) << W) - 1);
    uint64_t svalue = (value.to_ullong() >> R) & mask;
    return bitset<L-R+1>{svalue};
}

int main()
{
    bitset<16> beef { 0xBEEF };
    bitset<4-3+1> sliced_beef = slice<16, 4, 3>(beef);
    auto fast_sliced_beef = slice<16, 4, 3>(beef);
    return 0;
}
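With the width fixed to L-R+1, the call above extracts bits 3..4 of 0xBEEF (bit 3 is 1, bit 4 is 0), so printing the result gives:

std::cout << sliced_beef << "\n";   // prints "01"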