What is the better Matrix4x4 class design? (C++ newbie)

What would be better to use as a way to store matrix values?
float m1,m2,m3 ... ,m16
or
float[4][4].
I first tried float[16], but when I was debugging and testing, VS wouldn't show what was inside the array :( I could implement a cout and read the values from a console test application.
Then I tried float m1, m2, m3, etc. While testing and debugging, the values could be read in VS, so it seemed easier to work with.
My question, since I'm fairly new to C++: what is the better design?
I find float m1, m2, ..., m16 easier to work with when debugging.
I would also love it if someone could say, from experience or from benchmark data, which performs better. My gut says it shouldn't really matter, because the matrix data should be laid out the same way in memory, right?
Edit:
Some more info: it's a column-major matrix.
As far as I know, I only need a 4x4 matrix for the view transformation pipeline.
So nothing bigger, and so I have some constant values.
I'm busy writing a simple software renderer as a way to learn more C++, get some more experience, and improve my linear algebra skills. I will probably only go as far as per-fragment shading and some simple lighting models, and from what I have seen so far, a 4x4 matrix is the biggest I will need for rendering.
Edit2:
Found out why I couldn't read the array data: I had used a float pointer, and the debugger only showed the pointer value. I did discover a way to see the array values in the Watch window: enter pointer, n where n is the number of elements you want to see.
Thanks to everybody who answered. I will use the Vector4 m[4] answer for now.

You should consider a Vector4 with float [4] members, and a Matrix4 with Vector4 [4] members. With operator [], you have two useful classes, and maintain the ability to access elements with: [i][j] - in most cases, the element data will be contiguous, provided you don't use virtual methods.
You can also benefit from vector (SIMD) instructions this way, e.g., in Vector4
union alignas(16) { __m128 _v; float _s[4]; }; // members
inline float & operator [] (int i) { return _s[i]; }
inline const float & operator [] (int i) const { return _s[i]; }
and in Matrix4
Vector4 _m[4]; // members
inline Vector4 & operator [] (int i) { return _m[i]; }
inline const Vector4 & operator [] (int i) const { return _m[i]; }
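
For reference, here is a minimal, self-contained sketch of the same two classes using plain floats only (no __m128), so it builds without SSE headers. The member names s and m, the identity() helper, and the bounds-checking asserts are assumptions added for illustration; they are not part of the answer above.

```cpp
#include <cassert>
#include <cstddef>

// Plain-float sketch of the two classes; no SIMD, so it builds on any target.
struct Vector4 {
    float s[4];
    float&       operator[](std::size_t i)       { assert(i < 4); return s[i]; }
    const float& operator[](std::size_t i) const { assert(i < 4); return s[i]; }
};

struct Matrix4 {
    Vector4 m[4];  // m[i] is one column if the matrix is treated as column-major
    Vector4&       operator[](std::size_t i)       { assert(i < 4); return m[i]; }
    const Vector4& operator[](std::size_t i) const { assert(i < 4); return m[i]; }
};

// Only float members and no virtual functions, so the 16 values are packed.
static_assert(sizeof(Vector4) == 4 * sizeof(float), "Vector4 has padding");
static_assert(sizeof(Matrix4) == 16 * sizeof(float), "Matrix4 has padding");

// Hypothetical helper: build an identity matrix.
inline Matrix4 identity() {
    Matrix4 r = {};  // value-initialise: every element starts at 0.0f
    for (std::size_t i = 0; i < 4; ++i) r[i][i] = 1.0f;
    return r;
}
```

With this, identity()[2][2] reads 1.0f and m[i][j] = value writes a single element, which is the [i][j] access the answer describes.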

The float m1, m2 .. m16; becomes very awkward to deal with when it comes to using loops to iterate through things. Using arrays of some sort is much easier. And, most likely, the compiler will generate AT LEAST as efficient code when you use loops as if you "hand-code", unless you actually write inline assembler or use SSE intrinsics.
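
To illustrate the point about loops, here is a rough sketch of a 4x4 multiply over array storage. The Mat4 struct and the m[col][row] convention are assumptions chosen to match the column-major layout mentioned in the question; with sixteen named members the same function would need sixteen hand-written expressions.

```cpp
#include <cstddef>

// Hypothetical storage: m[col][row], i.e. column-major as in the question.
struct Mat4 {
    float m[4][4];
};

// C = A * B, written as three nested loops.
// Element (row, col) of C is the dot product of row `row` of A and column `col` of B.
inline Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c = {};  // all elements start at 0.0f
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 4; ++row)
            for (std::size_t k = 0; k < 4; ++k)
                c.m[col][row] += a.m[k][row] * b.m[col][k];
    return c;
}
```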

The 16-float solution is fine as long as the code doesn't evolve (it is a hassle to maintain and not really readable).
The float[4][4] is a much better design (in terms of size parametrization), but you have to understand the notion of pointers.

I would use an array of 16 floats, like float m[16];, for the sole reason that it is very easy to pass to a library like OpenGL, using the functions with the Matrix4fv suffix.
A 2D array like float m[4][4]; should also be laid out in memory identically to float m[16] (see May I treat a 2D array as a contiguous 1D array?), and it would be more convenient because it allows [row][col] indexing (or [col][row]; I am not sure which is correct for OpenGL). Compare m[1][1] with m[5].
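
A small sketch of how the two ideas can be combined: keep flat float[16] storage so a pointer can be handed to a C API, and add a two-index accessor on top. The names Mat4Flat, at, and data are hypothetical, and which index means "row" or "column" is a convention you pick; the accessor only fixes the arithmetic i * 4 + j.

```cpp
#include <cassert>
#include <cstddef>

// Flat storage that can be passed to a C API as a const float*.
struct Mat4Flat {
    float m[16];

    // Hypothetical accessor: the first index selects a group of four consecutive floats.
    float& at(std::size_t i, std::size_t j) {
        assert(i < 4 && j < 4);
        return m[i * 4 + j];
    }
    const float& at(std::size_t i, std::size_t j) const {
        assert(i < 4 && j < 4);
        return m[i * 4 + j];
    }

    // Pointer to the first of the 16 contiguous floats.
    const float* data() const { return m; }
};

// Both array shapes occupy exactly sixteen floats.
static_assert(sizeof(float[4][4]) == sizeof(float[16]), "sizes differ");
```

Here at(1, 1) refers to the same float as m[5], which is the m[1][1] versus m[5] comparison made above.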

Using separate variables for matrix elements may prove to be problematic. What are you planning to do when dealing with big matrices like 100x100?
Of course you need some array-like structure, and I strongly recommend that you at least use arrays.

Related

Zero overhead subscript operator for a set of values

Assume we have a function with the following signature (the signature may not be changed, since this function is part of a legacy API):
void Foo(const std::string& s, float v0, float v1, float v2)
{ ... }
How can one access the last three arguments by index using the subscript operator [] without actually copying the data into some sort of container?
Regularly when I come across this kind of issue I put the values in a container, like const std::array<float,3> args{v0,v1,v2}; and access these values using args[0], which unfortunately needs to copy the values.
Another idea would be to access the arguments using a parameter pack, which in turn involves the creation of a templated function which seems to be overkill for this task.
I'm aware that the version using the std::array<> might be suitable since the compiler probably will optimize this kind of stuff, however, this question is kind of academically motivated.
You can't. Not in a way that guarantees zero overhead, or overhead similar to that of array subscripting.
You could, of course, do something like float* vs[]{&v0, &v1, &v2};, and then dereference the result of vs[i]. For that matter, you could make a utility class to act as a transparent reference (to try to get around arrays of references being illegal), though the result is inevitably limited.
The ultimate problem, though, is that nothing in the standard guarantees (or even suggests) that function arguments be stored in any particular memory ordering. On most platforms, at least one of those floats is going to be in a register, meaning that there's just no way to natively subscript it.
If a group of objects does not start out as an array, it's not possible to treat them as an array.
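
As a sketch of the pointer-array idea mentioned in this answer: the function below has the same parameter list as Foo but a hypothetical name and body (sumArgs is not part of the legacy API). It gives runtime indexing without copying the floats, at the cost of one extra indirection per access.

```cpp
#include <cstddef>
#include <string>

// Same parameter list as the legacy signature; the name and body are illustrative only.
inline float sumArgs(const std::string& /*s*/, float v0, float v1, float v2) {
    // An array of pointers allows vs[i] with a runtime index.
    // The floats themselves are not copied, but each access dereferences a pointer.
    const float* vs[] = { &v0, &v1, &v2 };
    float total = 0.0f;
    for (std::size_t i = 0; i < 3; ++i)
        total += *vs[i];
    return total;
}
```

Taking the addresses of the parameters may force the compiler to place them in memory instead of registers, so this is not guaranteed to be free.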
Another idea would be to access the arguments using a parameter pack, which in turn involves the creation of a templated function which seems to be overkill for this task.
Not necessarily. One thing you can do is use std::tie to build a std::tuple of references to the function parameters and then access that tuple via std::get. That should optimize out, but let you refer to the parameters as if they are part of a single collection. That would look like
void Foo(const std::string& s, float v0, float v1, float v2)
{
auto args = std::tie(v0, v1, v2);
std::cout << std::get<1>(args);
}
It's not using operator [], and requires your indices be compile time constants, but you can now pass them to something else as one object.
Danger, Will Robinson! Danger!
This is going to be horribly implementation dependent, and an all around bad idea! This relies on undefined behavior. Less awful with set hardware and tools, but less as in "we're only going to eat 5 babies, not a full dozen".
Those three floats are on the stack next to each other. I don't know if there are any packing rules for the stack. I don't know which order on the stack they'll be in ("v0 v1 v2" vs "v2 v1 v0"). Hell, some optimized build might even put them in a different order just to optimize some oddball case that doesn't actually come up in real life. I dunno. But I suspect something like this will work.
void Foo(const std::string& s, float v0, float v1, float v2)
{
float* vp = &v2;
for (int i = 0; i < 3; ++i)
{
printf("%f\n", vp[i]);
}
}
int main()
{
Foo("", 1.0f, 2.0f, 3.0f);
}
3.0000
2.0000
1.0000
So it is possible. It's also ugly, vile, evil, and probably both fattening and carcinogenic.
On godbolt.org, using gcc x86-64 9.3, the above code worked fine. In VS2017 intel/64, I had to use float* vp = &v0 and for (int i = 0; i < 5; i += 2). Different alignment, different order, and different output (1, 2, 3, not 3, 2, 1).
I'm pretty sure I just consigned my soul to the Nth circle of hell.

c++ casting partial array (at some offset) to a different type

In OpenCV library, Vec4f is defined to be a 4-tuple floating point, similar to this:
struct Vec4f {
float data[4];
/* here we may have some methods for this struct */
inline float operator [] (int i) { return data[i]; }
};
Sometimes it is used to represent a line: data[0-1] represents a unit directional vector of that line, while data[2-3] represents a point on that line.
In many cases, I need to convert it to a two-tuple "Point". In OpenCV, a point (or vector) is defined similar to something below:
struct Point2f {
float x; // value for x-coordination
float y; // value for y-coordination
};
I want to convert a line:
Vec4f myLine;
to a directional vector and a point on that line:
Point2f & lineDirectionalVector;
Point2f & linePoint;
I don't want to copy it, so I just use references. Thus, I wrote:
Point2f & lineDirectionalVector = *(Point2f*)(&myLine[0]);
Point2f & linePoint= *(Point2f*)(&myLine[2]);
The question is: is this good practice? If not, how should I do it?
Perhaps the first one can be written like this:
Point2f & lineDirectionalVector = reinterpret_cast<Point2f>(myLine);
Any suggestions? Preferred ways? Or perhaps I just make a copy like this:
Point2f lineDirectionalVector(myLine[0], myLine[1]);
Point2f linePoint(myLine[2], myLine[3]);
which is more readable...
The Eigen C++ linear algebra library supports what you want more directly, with the segment, head, and tail vector methods. It is normally best to use a library that makes this a first-class operation: there are various errors in what you have typed that might take you a while to hunt down, and if you ask which is most readable, that cast-based code would always raise the hairs on the back of my neck. OpenCV's whole point-versus-vec distinction is also dubious.
It's worth noting, however, that arrays are guaranteed to be contiguous with no padding, so the array layout itself is not a problem for casts like that. You do need to be careful when casting to structures such as Point2f, though, because alignment and padding issues can easily arise once structs are involved. It's a good idea to use static_asserts when you write that type of code, and to read the relevant parts of the standard.
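
A sketch of what such static_asserts might look like. Vec4f and Point2f here are stand-ins modelled on the structs in the question, not OpenCV's real types, and pointAt is a hypothetical helper. Note that it copies with std::memcpy instead of casting a reference, which is a different technique from the one in the question; it is shown because it needs no aliasing assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Stand-ins modelled on the structs in the question, not OpenCV's real types.
struct Vec4f   { float data[4]; };
struct Point2f { float x; float y; };

// Compile-time checks on the layout assumptions behind any float-to-Point2f trick.
static_assert(std::is_standard_layout<Point2f>::value, "Point2f layout is not predictable");
static_assert(std::is_trivially_copyable<Point2f>::value, "Point2f cannot be copied bytewise");
static_assert(sizeof(Point2f) == 2 * sizeof(float), "Point2f has padding");

// Hypothetical helper: copy the two consecutive floats starting at `offset` into a Point2f.
inline Point2f pointAt(const Vec4f& v, std::size_t offset) {
    assert(offset + 2 <= 4);
    Point2f p;
    std::memcpy(&p, v.data + offset, sizeof p);
    return p;
}
```

For two floats the copy is cheap, so pointAt(myLine, 0) and pointAt(myLine, 2) are close in spirit to the readable copy-based version at the end of the question.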

Simultaneously multiply all struct-elements with a scalar

I have a struct that represents a vector. This vector consists of two one-byte integers. I use them to keep values from 0 to 255.
typedef unsigned char uint8_T; // typedef syntax is: existing type first, new name second
struct Vector
{
uint8_T x;
uint8_T y;
};
Now, the main use case in my program is to multiply both elements of the vector with a 32bit float value:
typedef float real32_T;
Vector Vector::operator * ( const real32_T f ) const {
return Vector( (uint8_T)(x * f), (uint8_T)(y * f) );
}
This needs to be performed very often. Is there a way that these two multiplications can be performed simultaneously? Maybe by vectorization, SSE or similar? Or is the Visual studio compiler already doing this simultaneously?
Another usecase is to interpolate between two Vectors.
Vector Vector::interpolate(const Vector& rhs, real32_T z) const
{
return Vector(
(uint8_T)(x + z * (rhs.x - x)),
(uint8_T)(y + z * (rhs.y - y))
);
}
This already uses an optimized interpolation approach (https://stackoverflow.com/a/4353537/871495).
But again the values of the vectors are multiplied by the same scalar value.
Is there a possibility to improve the performance of these operations?
Thanks
(I am using Visual Studio 2010 with an 64bit compiler)
In my experience, Visual Studio (especially an older version like VS2010) does not do a lot of vectorization on its own. They have improved it in the newer versions, so if you can, you might see if a change of compiler speeds up your code.
Depending on the code that uses these functions and the optimization the compiler does, it may not even be the calculations that slow down your program. Function calls and cache misses may hurt a lot more.
You could try the following:
If not already done, define the functions in the header file, so the compiler can inline them
If you use these functions in a tight loop, try doing the calculations 'by hand' without any function calls (temporarily expose the variables) and see if it makes a speed difference.
If you have a lot of vectors, look at how they are laid out in memory. Store them contiguously to minimize cache misses.
For SSE to work really well, you'd have to work with 4 values at once - so multiply 2 vectors with 2 floats. In a loop, use a step of 2 and write a static function that calculates 2 vectors at once using SSE instructions. Because your vectors are not aligned (and hardly ever will be with 8 bit variables), the code could even run slower than what you have now, but it's worth a try.
If applicable and if you don't depend on the clamping that occurs with your cast from float to uint8_t (e.g. if your floats are in range [0,1]), try using float everywhere. This may allow the compiler to do far better optimization.
You haven't shown the full algorithm, but conversion between integer and floating-point numbers is a very slow operation. Eliminating it and using only one type (preferably integers, if possible) can greatly improve performance.
Alternatively, you can use lrint() to do the conversion, as explained here.
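
One way to read the "use only integers" suggestion is fixed-point scaling. The sketch below assumes the scale factor is non-negative and can be expressed in steps of 1/256; the names scaleChannel, scale, and factor256 are hypothetical. Unlike the cast in the question, it saturates at 255 and truncates toward zero, so results can differ slightly from the float version.

```cpp
#include <cstdint>

struct Vector {
    std::uint8_t x;
    std::uint8_t y;
};

// factor256 is the scale factor multiplied by 256 (8.8 fixed point); 128 means 0.5.
// Assumes factor256 is small enough that v * factor256 fits in 32 bits.
inline std::uint8_t scaleChannel(std::uint8_t v, std::uint32_t factor256) {
    std::uint32_t r = (v * factor256) >> 8;                   // integer multiply, then drop the fraction
    return static_cast<std::uint8_t>(r > 255u ? 255u : r);    // saturate instead of wrapping
}

// Scale both components with the same factor, using integer arithmetic only.
inline Vector scale(Vector v, std::uint32_t factor256) {
    Vector out = { scaleChannel(v.x, factor256), scaleChannel(v.y, factor256) };
    return out;
}
```

For example, scaling {200, 10} by factor256 = 128 gives {100, 5}. Whether this is actually faster than the float version depends on the compiler and the surrounding loop, so it is worth measuring.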

Returning multiple vectors from a function in c++?

I want to return multiple vectors from a function.
I am not sure whether a tuple can work. I tried, but it is not working.
xxx myfunction (vector<vector<float>> matrix1 , vector<vector<float>> matrix2) {
// some functional code: e.g.
// vector<vector<float>> matrix3 = matrix1 + matrix2;
// vector<vector<float>> matrix4 = matrix1 - matrix2;
return matrix3, matrix4; // intent: return both (as written, the comma operator yields only matrix4)
}
If these matrices are very small then this approach might be OK, but generally I would not do it this way. First, regardless of their size, you should pass them in by const reference.
Also, std::vector<std::vector<T>> is not a very good "matrix" implementation - much better to store the data in a contiguous block and implement element-wise operations over the entire block. Also, if you are going to return the matrices (via a pair or other class) then you'll want to look into move semantics as you don't want extra copies.
If you are not using C++11 then I'd pass in matrices by reference and fill them in the function; e.g.
typedef std::vector<std::vector<float> > Matrix; // pre-C++11 spelling; or preferably something better
void myfunction(const Matrix &m1, const Matrix &m2, Matrix &diff, Matrix &sum)
{
// sum/diff clear / resize / whatever is appropriate for your use case
// sum = m1 + m2
// diff = m1 - m2
}
The main issue with functional-style code, e.g. returning std::tuple<Matrix,Matrix>, is avoiding copies. There are clever things one can do here to avoid extra copies, but sometimes it is just simpler, IMO, to go with a less "pure" style of coding.
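
For the C++11 case, a minimal sketch of returning both results as a std::pair, with the data stored contiguously and moved into the return value. The Matrix struct and the name sumAndDiff are assumptions made for illustration, not code from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical contiguous matrix: rows * cols floats in a single block.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
};

// Returns {sum, diff}. Inputs are taken by const reference (no copy);
// the two results are moved into the pair rather than copied.
inline std::pair<Matrix, Matrix> sumAndDiff(const Matrix& a, const Matrix& b) {
    assert(a.rows == b.rows && a.cols == b.cols);
    Matrix sum(a.rows, a.cols);
    Matrix diff(a.rows, a.cols);
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        sum.data[i]  = a.data[i] + b.data[i];
        diff.data[i] = a.data[i] - b.data[i];
    }
    return std::make_pair(std::move(sum), std::move(diff));
}
```

The caller reads the results as result.first and result.second, or unpacks them with std::tie into two existing Matrix objects.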
For Matrices, I normally create a Struct or Class for it that has these vectors, and send objects of that class in to the function. It would also help to encapsulate Matrix related operations inside that Class.
If you still want to use vector of vector, here is my opinion. You could use InOut parameters using references/pointers : Meaning, if the parameters can be updated to hold results of calculation, you would be sending the arguments in, and you would not have to return anything in that case.
If the parameters need to be const and cannot be changed, then I normally send In parameters as const references, and separate Out parameters in the function argument list itself.
Hope this helps a bit.

Class design: arrays vs multiple variables

I have a bit of a theoretical question, however it is a problem I sometimes face when designing classes and I see it done differently when reading others code. Which of the following would be better and why:
example 1:
class Color
{
public:
Color(float, float, float);
~Color();
friend bool operator==(Color& lhs, Color& rhs);
void multiply(Color);
// ...
float get_r();
float get_g();
float get_b();
private:
float color_values[3];
};
example 2:
class Color
{
public:
// as above
private:
float r;
float g;
float b;
};
Is there a general rule one should follow in cases like this or is it just up to a programmer and what seems to make more sense?
Both!
Use this:
class Color {
// ...
private:
union {
struct {
float r, g, b;
};
float c[3];
};
};
Then c[0] will be equivalent to r, et cetera.
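
Be aware that anonymous structs are a compiler extension in C++ (widely supported, but not standard), and that reading a union member other than the one last written is formally undefined behavior. A sketch of an alternative that offers both names and indexing using only a plain array; the accessor names and the c_ member are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>

class Color {
public:
    Color(float r, float g, float b) { c_[0] = r; c_[1] = g; c_[2] = b; }

    // Named access: each accessor refers to one fixed slot of the array.
    float& r() { return c_[0]; }
    float& g() { return c_[1]; }
    float& b() { return c_[2]; }
    float r() const { return c_[0]; }
    float g() const { return c_[1]; }
    float b() const { return c_[2]; }

    // Indexed access for loops.
    float& operator[](std::size_t i)       { assert(i < 3); return c_[i]; }
    float  operator[](std::size_t i) const { assert(i < 3); return c_[i]; }

private:
    float c_[3];  // single storage; names and indices are two views of it
};
```

Here color.g() = 0.5f and color[1] = 0.5f write the same float, without relying on a union.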
It depends: do you intend to iterate over the whole array?
In that case, I think solution 1 is more appropriate.
It is very useful to have an array like that when you have functions that operate in a loop on the data
e.g.
void BumpColors(float idx)
{
for (int i = 0; i < 3; ++i)
color_values[i] += idx;
}
vs
void BumpColors(float idx)
{
color_values[0] += idx;
color_values[1] += idx;
color_values[2] += idx;
}
Of course this is trivial, and I think it really is a matter of preference. In some rare occasion you might have APIs that take a pointer to the data though, and while you can do
awesomeAPI((float*)&r);
I would much prefer doing
awesomeAPI((float*)&color_values[0]);
because the array guarantees contiguity, whereas with separate members you can break contiguity by mistakenly adding an unrelated member variable after float r.
Performance wise there would be no difference.
I'd say the second one is the best one.
First, the data your variables contain isn't (conceptually) meant to be in an array. If you had, for example, a class with exactly 3 students, no more and no less, you'd put them in an array, because they are an array of students, but here it's just colors.
Second, someone who reads your code can understand very quickly, in the second case, what your variables contain (r is red, etc.). That isn't the case with an array.
Third, you'll have fewer bugs: you won't have to remember "oh, in my array, red is 0, g is 1, b is 2", and you won't replace by mistake
return color_values[0]
by
return color_values[1]
in your code.
I think that you are right: "It just up to a programmer and what seems to make more sense." If this were my program, I would choose one form or the other without worrying too much about it, then write some other parts of the program, then revisit the matter later.
One of the benefits of class-oriented design is that it makes internal implementation details of this kind private, which makes it convenient to alter them later.
I think that your question does matter, only I doubt that one can answer it well until one has written more code. In the abstract, there are only three elements, and the three have names -- red, green and blue -- so I think that you could go either way with this. If forced to choose, I choose example 2.
Is there a general rule one should follow in cases like this or is it just up to a programmer and what seems to make more sense?
It's definitely up to the programmer and whatever makes more sense.
In your case, the second option seems more appropriate. After all, logically thinking, your member isn't an array of values, but values for r, g and b.
Advantages of using an array:
Maintainability: You can loop over the values in the array.
Maintainability: When a value has to be added (like yellow?), then you don't have to change a lot of code.
Disadvantage:
Readability: Separate variables have clearer names (namely r, g, b in this case).
In your case the r, g, b variables are probably best, since it's unlikely that a color will be added, and looping over 3 elements probably matters less than readability.
Sometimes a programmer will use an array (or data structure) in order to save the data faster to disk (or memory) using 1 write operation.
This is especially useful if you are reading and writing a lot of data.
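
A rough sketch of what "1 write operation" can look like with an array member. The Color struct and the save, load, and roundTrip names are hypothetical; the raw bytes written this way are only readable on a platform with the same float representation and byte order.

```cpp
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>

struct Color {
    float c[3];
};

// Write all three components with a single call.
inline void save(std::ostream& out, const Color& col) {
    out.write(reinterpret_cast<const char*>(col.c), sizeof col.c);
}

// Read them back with a single call.
inline void load(std::istream& in, Color& col) {
    in.read(reinterpret_cast<char*>(col.c), sizeof col.c);
}

// Usage example: round-trip through an in-memory binary stream.
inline Color roundTrip(const Color& col) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    save(buf, col);
    Color result = {};
    load(buf, result);
    return result;
}
```

With separate r, g, b members the same thing would take three writes, or a write of the whole object that relies on there being no padding between members.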