I am searching for a 2D matrix (or bitmap) class that is flexible but also has fast element access. A flexible class should allow you to choose its dimensions at runtime, and would look something like this (simplified):
class Matrix
{
public:
    Matrix(int w, int h) :
        width(w), data(new int[w*h]) {}
    void SetElement(int x, int y, int val)
    {
        data[x + y*width] = val;
    }
    // ...
private:
    int width;
    int* data;
};
A faster, often-proposed solution uses templates (simplified):
template <int W, int H>
class TMatrix {
public:
    TMatrix() : data(new int[W*H]) {}
    void SetElement(int x, int y, int val)
    {
        data[x + y*W] = val;
    }
private:
    int* data;
};
This is faster because the width can be "inlined" into the code, which the first solution does not allow. However, it is no longer flexible: you can't change the size at runtime.
So my question is:
Is there a way to tell the compiler to generate faster code (as with the template solution) when the size is fixed in the code, and flexible code when it is runtime dependent?
I tried to achieve this by writing "const" wherever possible. I tried it with gcc and VS2005, but with no success. This kind of optimization would be useful in many other similar cases.
I'd just go with the first version, myself.
But, if you really want to try to get the best of both worlds, you could have a Matrix class which holds a pointer to a polymorphic implementation type. For common sizes (say up to 4x4), you could point at template instantiations, and for larger you could point at an implementation that handled the general MxN case.
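A minimal sketch of that idea (the names are hypothetical, the 4x4 special case is just an illustration, and a real version would also need copy control):

struct MatrixImpl
{
    virtual ~MatrixImpl() {}
    virtual void SetElement(int x, int y, int val) = 0;
};

template <int W, int H>
struct FixedImpl : MatrixImpl
{
    int data[W*H]; // size known at compile time, no heap allocation for the elements
    void SetElement(int x, int y, int val) { data[x + y*W] = val; }
};

struct DynamicImpl : MatrixImpl
{
    int width;
    int* data;
    DynamicImpl(int w, int h) : width(w), data(new int[w*h]) {}
    ~DynamicImpl() { delete[] data; }
    void SetElement(int x, int y, int val) { data[x + y*width] = val; }
};

class Matrix
{
public:
    Matrix(int w, int h)
    {
        if (w == 4 && h == 4) impl = new FixedImpl<4,4>;    // common size: template instantiation
        else                  impl = new DynamicImpl(w, h); // general MxN case
    }
    ~Matrix() { delete impl; }
    void SetElement(int x, int y, int val) { impl->SetElement(x, y, val); }
private:
    MatrixImpl* impl;
};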
Having said all that, I think all the indirection & virtual calls would negate any performance improvement that might come from the templates. I don't think you can have your cake & eat it too, in this case.
If you're always dealing with data whose size is known at compile time (graphics/geometry vectors, for example), you're better off with the template version (possibly storing the data in statically sized, non-heap-allocated arrays). If you need a general capability for arbitrary data, use the dynamic version instead.
Of course your needs may differ, but I'd skip the automatic generation and just go with a plain and simple set of "fixed" versions, e.g. Vector3, Vector4, Matrix3x3, Matrix3x4, and Matrix4x4. I suppose you could derive all of those from the templated version, but it won't make any particular performance difference.
Is there any particular reason why you want to be able to change the dimensions at runtime? Because I would suggest that just copying from one to the other wouldn't be terribly costly for the (what I suspect to be rare) instances when the change needs to occur.
Finally, something that I've seen done is to have named element access as well as the array, but you can only do that with "hard coded" types. Something like:
class Vector3
{
public:
    // other stuff...
    union
    {
        struct { float x, y, z; };
        float m[3];
    };
};
(that may not be entirely legal C++, hack to suit your compiler.)
Oh, and the templated version doesn't need to use new. Just declare the data as a plain array member (int data[W*H];). Getting it out of the heap will be a bigger performance boost than "optimizing out" a bit of math.
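A minimal sketch of that change to the template version above:

template <int W, int H>
class TMatrix {
public:
    void SetElement(int x, int y, int val)
    {
        data[x + y*W] = val; // W is a compile-time constant, so the multiply can be folded
    }
private:
    int data[W*H]; // stored in the object itself, not on the heap
};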
Not so much a complete answer, but some info that may help (if you're not already aware of these): Both OpenCV and Boost (uBLAS) have very good (fast/complete/full-featured) matrix implementations. I've not looked inside them to see how they set/get elements or resize after instantiation though.
Related
I have a very big class with a bunch of members, and I want to initialize them all with a given specific value. The code below is the most naive implementation, but I don't like it, since it's inelegant and hard to maintain: I have to list every member in the constructor.
struct I_Dont_Like_This_Approach {
    int foo;
    long bar;
    unsigned baz;
    int a;
    int b;
    int c;
    int d;
    SomeStruct and_so_on;
    /*...*/
public:
    explicit I_Dont_Like_This_Approach(int i) : foo(i), bar(i), baz(i), a(i), b(i), c(i), d(i), and_so_on(i) /*...*/ {}
};
I thought of an alternative implementation using templates.
template <int N>
struct MyBigClass {
    int foo{N};
    long bar{N};
    unsigned baz{N};
    int a{N};
    int b{N};
    int c{N};
    int d{N};
    SomeStruct and_so_on{N};
    /*...*/
};
but I'm not sure if the code below is safe.
MyBigClass<1> all_one;
MyBigClass<2> all_two;
/* Is the following reinterpret_cast safe? */
all_one = reinterpret_cast<decltype(all_one) &>(all_two);
Does the C++ specification have any guarantees about the data layout compatibility of such templated structs? Or is there a more reasonable implementation? (in standard C++, and don't use macros)
I would argue that the first one is much more maintainable. With the right warnings enabled (and a modern compiler), you will see at compile time if your initializer list gets out of sync with the class's fields.
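For example, GCC's -Weffc++ is one such warning (a minimal illustration; exact flags and diagnostic wording vary by compiler and version):

struct S {
    int a;
    int b;
    explicit S(int i) : a(i) {} // g++ -Weffc++: warning: 'S::b' should be initialized in the member initialization list
};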
As to your alternative: you're using template parameters as compile-time constructor arguments, which is not what they're meant to be. That brings a whole slew of issues:
instantiated templates get copied in memory, making your executable larger. Though in this case, I'm hoping your compiler is smart enough to see that the field structure is the same and treat it as one type.
your code now works only with constant literal integers, no more run-time variables.
there is indeed no guarantee that the memory layout of those two classes is the same. You can disable layout optimizations in most compilers (packing, alignment, etc.), but that comes at the cost of disabling optimizations, which isn't actually necessary except to support this specific code.
And related to the last one, if you ever need to consider whether this is ever going to break, you're heading down a very dark road. I mean any sane person can tell you it will "probably work", but the fact that you have no guarantees in the language that pretty much popularized memory corruption and buffer overflows should terrify you. Write constructors.
I recently had to write my own matrix multiplication library. Initially I wrote it as a template class, then I realized that most classes use the matrix class without caring about the datatype used by the Matrix class: they just apply a certain transformation to the matrix without checking the result. So they really ought not to be aware of the datatype. I was thinking of making a matrix class with a void pointer to the data.
class Mat
{
private:
    void *data;
    int dtype; // data type used by the matrix
    int cols, rows;
    template<class type>
    Mat add(const Mat& a, type unused); // note the unused parameter
public:
    Mat(int dtype);
    ~Mat();
    Mat operator+(const Mat& a);
    template<class type>
    type* getdata(); // the only function that exposes the datatype to the user,
                     // since they want to read the elements
};
I need the addition function to be a template since it accelerates computations using SSE intrinsics, and I have abstracted the intrinsics using template classes. So I thought of adding an unused parameter to the template add so that the compiler can distinguish between the different instantiations.
Mat Mat::operator+(const Mat& a)
{
    Mat result(dtype); // Mat has no default constructor, so pass the dtype through
    switch(dtype)
    {
    case 0: // int
        result = this->add<int>(a, 0);
        break;
    case 1: // float
        result = this->add<float>(a, 0);
        break;
    }
    return result;
}
Is this a bad idea? If not, is there any way to get rid of the unused parameter in the add method?
Another idea I had was to make IntMatrix and FloatMatrix classes that inherit from Mat, just to have each call the add function with its template type, avoiding the switch in the operator overload. Is this also bad design?
Clarification:
I want to be able to have 2 vectors:
vector<Transform*> transformVector; // list of classes operating on matrices
vector<Mat*> results; // intermediate results vector

results.push_back(input_mat);
for(int i = 0; i < transformVector.size(); ++i){
    results.push_back(transformVector[i]->transform(results[i]));
    // transform here might have to return a result of type float
    // even though the input was of type int
}
It would be more efficient to make the Mat class templated and let the compiler create the necessary add functions.
With the current implementation, you'd have to add a new switch case for each new type, and be careful to properly cast the void* to the right type. When you use templates, the compiler helps you out by checking your types.
You could even create a template which lets you add a Mat<int> to a Mat<float> (or two other matrices of different types).
template <typename T, size_t Col, size_t Row>
class Mat {
    std::array<T, Col * Row> data; // or another data structure
    // ...
public:
    template <typename OtherT>
    Mat add(const Mat<OtherT, Col, Row>& other);
};
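A sketch of what the definition could look like, using std::common_type so that adding a Mat<int> to a Mat<float> yields a Mat<float> (the loop body and the choice of result type are my assumptions, not part of the answer above):

#include <array>
#include <cstddef>
#include <type_traits>

template <typename T, std::size_t Col, std::size_t Row>
struct Mat {
    std::array<T, Col * Row> data;

    template <typename OtherT>
    Mat<typename std::common_type<T, OtherT>::type, Col, Row>
    add(const Mat<OtherT, Col, Row>& other) const
    {
        Mat<typename std::common_type<T, OtherT>::type, Col, Row> result;
        for (std::size_t i = 0; i < Col * Row; ++i)
            result.data[i] = data[i] + other.data[i]; // element-wise sum in the wider type
        return result;
    }
};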
One hard point here is type* getdata(). Here again, either you return a plain void* and require the caller to do an explicit cast on it, or you have to use a templated function.
Long story short, you have traded a templated class (where the overloads are resolved at compile time) for a bunch of template methods and a few switches that resolve some of the functions at run time.
You say most classes use the matrix class without caring about the datatype. That is exactly what templates are made for: a bunch of storage and processing that is independent of the underlying type (well, templates can do a little more, but that is what they were initially created for).
void* is a safe pointer to anything and is a great choice for C-compatible APIs. But unless you have performance problems (templates can use too much memory on tiny systems, because they generate a different class for each instantiation), and can prove that a void* is better for a specific use case, you should stick to the common rules: write simple and easy-to-read code, and only optimize once you have found a bottleneck.
After your edit, I can see that you want to store matrices of different underlying types in a single container. I can imagine polymorphism, if all matrices can derive from a common non-templated type, but I would not be surprised if you then hit the type* getdata() problem later: you static_cast a void pointer, so the compiler has no way to prevent a bad cast. Another possibility would be std::variant over the matrix types (if C++17), or boost::variant, or any other variant alternative. Some of them implement tricks for catching bad casts at run time.
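A minimal sketch of the variant idea (assuming C++17 and a templated Mat<T>; the names are illustrative):

#include <variant>
#include <vector>

template <class T>
struct Mat { /* data, add(), ... */ };

using AnyMat = std::variant<Mat<int>, Mat<float>>;

int main() {
    std::vector<AnyMat> results;
    results.push_back(Mat<int>{});
    results.push_back(Mat<float>{}); // an int input can produce a float result
    // std::visit dispatches on the type actually stored, with no manual casts:
    std::visit([](auto& m) { /* transform m */ (void)m; }, results.back());
}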
Hard to know which way is best without experimenting on the real problem...
Some other languages, like Java, do not have templates (a different class for each and every instantiation) but generics (a common class that acts on objects). The pro is that there is only one class, so the question of the correct template instance being available at link time vanishes; the con is that it takes some tricks to make the actual type available at run time.
Why not just use templates instead of actual types? I mean, then you would not have to care about what type you are dealing with at any time, right? Or am I wrong, and is there actually a reason why we use concrete types like int and char?
Thank you!
I think it's a case of overcomplication that will never give a benefit.
Consider a simple class:
class Row {
    size_t len;
    size_t cap;
    int* values;
};
NB: you'd really instantiate std::vector<int>, but let's look at this as a familiar example...
Looked at that way, we certainly gain a benefit here by making this a template in the type of values.
template<typename VALUE>
class Row {
    size_t len;
    size_t cap;
    VALUE* values;
};
That's a big win! We can write a general-purpose Row class, and even if this is part of (say) a maths package and Row is a vector-space tuple with members like sum() and max() and so on, we can use other arithmetic types like long and double and build a very useful template.
How about going further? Why not parameterize the len and cap members?
template<typename VALUE, typename SIZE>
class Row {
    SIZE len;
    SIZE cap;
    VALUE* values;
};
What have we won? Not so much, it seems. The purpose of size_t is to be the type suitable for representing object sizes. You could use int or unsigned or whatever, but you're not going to gain flexibility (negative lengths don't make sense), and all you will do is arbitrarily limit the size of a row.
Remember, to follow this through, every single use of Row must be a template and accept an alternative for SIZE. Here's our Matrix template:
template<typename VALUE, typename ROW_SIZE, typename COL_SIZE>
class Matrix {
    Row< Row<VALUE,ROW_SIZE>, COL_SIZE> rows;
};
OK, so we can simplify by making ROW_SIZE the same type as COL_SIZE, but ultimately we've done that by picking size_t as the common denominator of sizes.
We can take this to its logical conclusion, and the entry point of the program will become:
int main() {
    main<VALUE,SIZE,/*... many many types ...*/,INDEX_TYPE>(); // illustrative only; main can't actually be a template
    return EXIT_SUCCESS;
}
Here every type decision is a parameter that has been threaded up through all the functions and classes to the entry point.
There are a number of problems with this:
It's a maintenance nightmare. You can't change or add to a buried class without threading its type decisions up to the entry point.
It will be a compilation nightmare. C++ isn't fast at compiling and this will make it a shed load worse. For a large program I can imagine you might even run out of memory as the compiler resolves the mother of all templates. [more of an issue on larger applications]
Incomprehensible error messages. For good reason, compilers struggle to provide easy-to-trace errors in templates. With templates nested in templates to who-knows-what depth, that would be a real problem.
You won't gain any useful flexibility. The types are so interlinked that the many sundry types already have a good answer provided, one that you won't want to change anyway.
In the end, if you do have a type that you think is an application parameter (such as the value type in some mathematical package), the best way to parameterize is to use a typedef. typedef double real_type; in effect makes the whole source code a template, without all that template gubbins all round the shop.
You can typedef float real_type; or typedef Rational real_type; (where Rational is some imagined rational-number implementation) and genuinely create a flexible, parameterized library.
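A minimal sketch of that style (the header name is hypothetical, and Row is reused from above):

// real_config.h -- a single library-wide header (hypothetical name)
#include <cstddef>

typedef double real_type; // swap in float or Rational here to re-parameterize the whole library

class Row {
    size_t len;
    size_t cap;
    real_type* values; // every use of the value type goes through the alias
};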
But even then, you probably won't typedef size_t size_type or whatever, because you're not expecting to vary that type.
So in summary: you'll end up doing a lot of work to provide flexibility, much of which you won't use, when mechanisms such as a library-level typedef let you parameterize your application in far less conspicuous and labour-intensive ways.
I'd say a draft guideline for templates is "Do you need two of them?". If some function or class is likely to have instances with different parameters then the answer is templates. If you think you've got a type (or a value) that is fixed for a given instance of the application then you should use compile time constants and library level typedefs.
There are a couple of reasons. Some of them I shall now list:
Backwards compatibility. Some code bases don't use templates, and you can't just go and replace all the existing code.
Errors in code. Sometimes you want to be certain that you are getting a float/int/char or what have you, in order for your code to run without errors. Now, it would be a fair assumption to use templates and then cast the types back to what you need, but that doesn't always work. For example:
#include <iostream>
#include <string>

using namespace std;

void hello(string msg){
    msg += "!!!";
    std::cout << msg << '\n';
}

int main(){
    hello("Hi there"); // prints "Hi there!!!"
}
This works. But replacing the function above with this one doesn't work:
template<typename T>
void hello(T msg){
    msg += "!!!";
    std::cout << msg << '\n';
}
(Note: the call hello("Hi there") now deduces T as const char*, and pointers have no operator+=, so you get an error along the lines of "no match for 'operator+=' (operand types are 'const char*' and 'const char [4]')".)
Now there are ways to get around such errors, but sometimes you just want a simple working solution.
One reason is that templates need to be instantiated for every concrete type. So, let's say you have a function like this:
void f(SomeObject object, Int x){
    object.do_thing_a(x);
    object.do_thing_b(x);
}
If Int is a template parameter, the compiler must generate one instance of f, do_thing_a, do_thing_b, and probably many more functions called from do_thing_a and do_thing_b, for every Int, be it short or unsigned long long. Sometimes this can even lead to a combinatorial explosion of instances.
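A tiny illustration of the duplication (a sketch; the bodies are elided):

template <class Int>
void f(Int x) { /* ... */ }

int main() {
    f<short>(1);              // instantiates f<short>
    f<unsigned long long>(1); // instantiates f<unsigned long long>
    // each instantiation becomes separate machine code in the binary
}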
Also, you can't make virtual member functions templates, for obvious reasons: there is no way the compiler can know which instances it should put into the vtable prior to compiling the whole program.
By the way, functional languages with type inference do this all the time. When you write
f x y = x + y
in Haskell, you actually get (very loosely speaking) something close to this C++:
template<class Num, class A>
A f(A x, A y){
    return Num::Add(x, y);
}
Yet in Haskell the compiler is not obligated to generate an instance for every concrete A.
Assume I have a class 'Widget'. In my application, I create a lot of Widgets which (for cache locality and other reasons) I keep in a vector.
For efficient lookups I would like to implement an index data structure. For the sake of the question, let's assume it is a simple lookup table from int indices to Widget elements in the above-mentioned vector.
My question is: what should the contents of the lookup table be?
In other words, with which type should I replace the question mark in
using LookupTable = std::vector<?>
I see the following options:
References (Widget&, or rather, since it has to be assignable: reference_wrapper<Widget>)
Pointers (Widget*)
Indices in the Widget vector (size_t)
Iterator objects pointing into the Widget vector (std::vector<Widget>::iterator)
Among these options, indices seem to be the only one that doesn't get invalidated by a vector resize. I might actually be able to avoid resizes; however, implementing the lookup table under that assumption means making assumptions about the vector implementation, which seems unreasonable from a 'decoupled design' standpoint.
OTOH, indices are not type-safe: if the thing I got out of the lookup table were a reference, I could only use it to access the corresponding widget. Using size_t values, I can perform nonsensical operations like multiplying the result by 3. Also consider the following two signatures:
void doSomethingWithLookupResult(Widget& lookupResult);
void doSomethingWithLookupResult(size_t lookupResult);
The former is significantly more descriptive.
In summary: Which datatype can I use for my lookup table to achieve both a decoupling from the vector implementation and type safety?
Use std::vector<Widget>::size_type (not size_t). It may be size_t in most implementations, but for the sake of portability and future-proofing, we'll do it right.
Go ahead and make a typedef:
using WidgetIndex = std::vector<Widget>::size_type;
so that this looks reasonable:
void doSomethingWithLookupResult(WidgetIndex lookupResult);
This avoids the vector resize issue which, while you downplay it in your question, will eventually come back to bite you.
Don't play games with some user-defined type such as the one tohava (very cleverly) proposes, unless you plan to use this idiom a great deal in your code base. Here is why not:
The problem that you are addressing (type-safety) is real and we'd like a solution to it if it is "free," but compared to other opportunities C++ programmers have to shoot themselves in the foot, this isn't that big an issue.
You'll be wasting time: your time to design the class, and then the time of every user of your code base (including yourself, after you've forgotten the implementation in a few months) who will stare at that code and have to puzzle it out.
At some point in the future you'll trip over that "interesting" corner case that none of us can see now by staring at this code.
All that said, if you are going to use this idiom often in your code base (you have many classes that are stored in very static vectors or arrays), then it may make sense to make this investment. In that case the maintenance burden is spread over more code and the possibility of using the wrong index type with the wrong container is greater.
You can create a class that represents an index which also carries type information (at compile time).
#include <vector>

template <class T>
struct typed_index {
    typed_index(int i) : i(i) {}
    template <class CONTAINER>
    T &operator[](CONTAINER &c) { return c[i]; }
    template <class CONTAINER>
    const T &operator[](const CONTAINER &c) const { return c[i]; }
    int i;
};

int main() {
    std::vector<int> v1 = {0};
    std::vector<const char *> v2 = {"asd"};
    typed_index<int> i = 0; // index 0 so the v1 access is in range
    int z = i[v1];
    const char *s = i[v2]; // will fail to compile: a typed_index<int> can't index non-int containers
}
I've come across a bit of code which essentially looks like this:
#include <iostream>

// in a header file
class xxx {
public:
    xxx() { xxx_[0]=0; xxx_[1]=0; xxx_[2]=0; }
    double x0() const { return xxx_[0]; }
private:
    double xxx_[3]; // ???
};

// in the main.cpp
int main(){
    xxx x;
    std::cout << x.x0() << "\n";
}
The question is: is declaring an array of fixed size as a class member really allowed by the standard?
There is nothing wrong with the above code. It might not be the best way to write it, but there is nothing intrinsically wrong with it.
Yes, your class xxx may contain a fixed-size array as a member. It's allowed in C too.
The compiler, even when reading the header to use it, knows how big to make sizeof(xxx) as a result.
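A minimal illustration (the exact value is implementation-defined, but 24 is typical for three doubles):

#include <iostream>

class xxx {
    double xxx_[3];
};

int main() {
    std::cout << sizeof(xxx) << "\n"; // typically 24: the array is stored inline in the object
}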
There is nothing wrong with declaring a fixed-size array as a member of a class:
class A
{
    int a[3];
};
It is allowed.
Design-wise, this is often not ideal, though; arrays don't have such a nice interface as std::array has:
std::array<double,3> xxx_;
for (auto it : xxx_) {...}
xxx_.size()
std::transform (xxx_.begin(), xxx_.end(), ...);
etc. So if you find yourself using your (statically sized) array as a container most of the time, you should replace it with std::array (which has no space overhead). If you need dynamically sized arrays, look at std::vector, which has a small overhead (it stores size and capacity; though with manual allocation you'd have to remember the size yourself anyway, so the only real overhead is capacity).
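For instance, the original class rewritten with std::array (a sketch; it behaves the same as the hand-written version above):

#include <array>

class xxx {
public:
    xxx() : xxx_{} {} // value-initializes all three elements to 0.0
    double x0() const { return xxx_[0]; }
private:
    std::array<double, 3> xxx_;
};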