Are there benefits to allocating large data contiguously? - c++

In my program, I have the following arrays of double: a1, a2, ..., am; b1, b2, ..., bm; c1, c2, ..., cm; which are members of a class, all of length N, where m and N are known at run time. The reason I named them a, b, and c is that they mean different things, and that's how they are accessed outside the class. I wonder what's the best way to allocate memory for them. I was thinking:
1) Allocating everything in one big chunk. Something like.. double *ALL = new double[3*N*m] and then have a member function return a pointer to the requested part using pointer arithmetic.
2) Create 2D arrays A, B, and C of size m*N each.
3) Use std::vector? But since m is only known at run time, I would need a vector of vectors.
or does it not really matter what I use? I'm just wondering what's a good general practice.

If all three are linked in some way, i.e. if there is any relationship between a[i] and b[i], then they should all be stored together, ideally in a structure that names them with a meaningful and relevant name. This will be easier to understand for any future developer and ensures that the lengths of the arrays are always correct by default.
This is called design affordance, meaning that the structure of an object or interface lends itself to being used as intended by default. Just think how a programmer who had never seen the code before would interpret its purpose; the less ambiguity, the better.
EDIT
Rereading, I realize you might be asking about some kind of memory optimization (?), although it isn't clear. I'd still say use something like this: either an array of class pointers or structs, depending on just how large N is.
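For example, a minimal sketch of what I mean (Sample and make_samples are just placeholder names):
#include <cstddef>
#include <vector>

// One record per index, so a[i], b[i] and c[i] always stay together and the
// three "arrays" can never get out of sync in length.
struct Sample {                      // pick a name that says what a, b and c mean
    double a, b, c;
};

std::vector<Sample> make_samples(std::size_t n) {
    return std::vector<Sample>(n);   // n known only at run time is fine
}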

This really depends significantly on how the data are used. If each array is used independently then the straightforward approach is either a number of named vectors or a vector of vectors.
If the arrays are used together, where for example a[i] and b[i] are related and used together, separate arrays are not really a good approach because you'll keep accessing different areas of memory, potentially causing a lot of cache misses. Instead you would want to aggregate the elements of a and b together into a struct or class and then have a single vector of those aggregates.
I don't see a big problem with allocating a big array and providing an appropriate interface to access the correct sets of elements. But please don't do this with new to manage your memory: use vector even in this case: std::vector<double> all(3*N*m);. However, I'm not sure this buys you anything either, so one of my other options may express the intention more clearly.
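Something along these lines, as a rough sketch (the class and member names are only placeholders):
#include <cstddef>
#include <vector>

// All 3*m arrays of length N live in one contiguous block; the accessors hand
// back a pointer to the start of the requested sub-array.
class Data {
public:
    Data(std::size_t m, std::size_t n) : m_(m), n_(n), all_(3 * m * n) {}

    double* a(std::size_t i) { return all_.data() + (0 * m_ + i) * n_; } // a1..am
    double* b(std::size_t i) { return all_.data() + (1 * m_ + i) * n_; } // b1..bm
    double* c(std::size_t i) { return all_.data() + (2 * m_ + i) * n_; } // c1..cm

private:
    std::size_t m_, n_;
    std::vector<double> all_;   // no manual new/delete needed
};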

Use option 3, a vector of vectors. That will free you from worrying about memory management.
Then hide it behind an interface so you can change it if you feel the need.

Related

Do multi-dimensional arrays cause any problems in C and/or C++?

I know that this question may seem a little bit odd at first sight. But as I came across this question I found a comment by @BasileStarynkevitch, an experienced C and C++ user, in which he claimed that multidimensional arrays should not be used, neither in C nor in C++:
Don't use multi-dimensional arrays in C++ (or in C).
Why? Why shouldn't I use multi-dimensional arrays in C++ or in C?
What did he mean by this statement?
Thereafter, another user replied to this comment:
Basile is right. It's possible to declare a 3D array in C/C++ but causes too many problems.
Which problems?
I use multi-dimensional arrays a lot and see no disadvantages in using them. On the contrary, I think they have only advantages.
Are there any issues with using multi-dimensional arrays that I do not know about?
Can anyone explain to me what they meant?
This is quite a broad (and interesting) performance-related topic. We could discuss cache misses, the cost of initialization of multi-dimensional arrays, vectorization, allocation of a multidimensional std::array on the stack, allocation of a multidimensional std::vector on the heap, access to the latter two, and so on.
That said, if your program works fine with your multidimensional arrays, leave it the way it is, especially if your multidimensional arrays allow for more readability.
A performance related example:
Consider a std::vector which holds many std::vector<double>:
std::vector<std::vector<double>> v;
We know that each std::vector object inside v is allocated contiguously. Also, all the elements in a std::vector<double> in v are allocated contiguously. However, not all the double's present in v are in contiguous memory. So, depending on how you access those elements (how many times, in what order, ...), a std::vector of std::vector's can be very slow compared to a single std::vector<double> containing all the double's in contiguous memory.
Matrix libraries will typically store a 5x5 matrix in a plain array of size 25.
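As an illustration (a rough sketch of the usual row-major convention, not any particular library's API):
#include <vector>

std::vector<double> mat(5 * 5, 0.0);   // 25 doubles in one contiguous buffer

// element (row, col) of the 5x5 matrix lives at index row * 5 + col
double get(const std::vector<double>& m, int row, int col) {
    return m[row * 5 + col];
}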
You cannot answer this question for C and C++ at once, because there is a fundamental difference between these two languages and their handling of multidimensional arrays. So this answer contains two parts:
C++
Multidimensional arrays are pretty useless in C++ because you cannot allocate them with dynamic sizes. The sizes of all dimensions except the outermost one must be compile time constants. In virtually all the use cases for multidimensional arrays I have encountered, the size parameters are simply not known at compile time, because they come from the dimensions of an image file, or some simulation parameter, etc.
There might be some special cases where the dimensions are actually known at compile time, and in these cases, there is no issue with using multidimensional arrays in C++. In all the other cases, you'll need to either use pointer arrays (tedious to set up), nested std::vector<std::vector<std::vector<...>>>, or a 1D array with manual index computation (error prone).
C
C allows for true multidimensional arrays with dynamic sizes since C99. These are called VLAs (variable-length arrays), and they allow you to create fully dynamically sized multidimensional arrays both on the stack and on the heap.
However, there are two catches:
You can pass a multidimensional VLA to a function, but you can't return it. If you want to pass multidimensional data out of a function, you must return it by reference.
void foo(int width, int height, int (*data)[width]); //works
//int (*bar(int width, int height))[width]; //does not work
You can have pointers to multidimensional arrays in variables, and you can pass them to functions, but you cannot store them in structs.
struct foo {
int width, height;
//int (*data)[width]; //does not work
};
Both problems can be worked around (passing by reference to return a multidimensional array, and storing the pointer as a void* in the struct), but it's not trivial. And since it's not a heavily used feature, only very few people know how to do it right.
Compile time array sizes
Both C and C++ allow you to use multidimensional arrays with dimensions known at compile time. These do not have the drawbacks listed above.
But their usefulness is reduced greatly: There are just so many cases where you would want to use a multidimensional array, and where you do not have the ghost of a chance to know the involved sizes at compile time. An example is image processing: You don't know the dimensions of the image before you have opened the image file. Likewise with any physics simulation: You do not know how large your working domain is until your program has loaded its configuration files. Etc.
So, in order to be useful, multidimensional arrays must support dynamic sizes imho.
As with most data structures, there is a "right" time to use them, and a "wrong" time. This is largely subjective, but for the purposes of this question let's just assume you're using a 2D array in a place where it wouldn't make sense.
That said, I think there are two notable reasons to avoid using multidimensional arrays in C++, and they mainly arise based on the use cases of the array. Namely:
1. Slow(er) Memory Traversal
A 2-dimensional array such as i[j][k] can be accessed contiguously, but the computer must spend extra time computing the address of each element, more than it would spend on a 1D array. More importantly, iterators lose their usability in multidimensional arrays, forcing you to use the [j][k] notation, which is slower. One main advantage of simple arrays is their ability to sequentially access all members. This is partially lost with a 2+D array.
2. Inflexible size
This is just an issue with arrays in general, but resizing a multidimensional array becomes much more complex with 2, 3, or more dimensions. If one dimension needs to change size, the entire structure has to be copied over. If your application needs resizing, it's best to use some structure besides a multidimensional array.
Again these are use-case based, but those are both significant issues that could arise by using multidimensional arrays. In both cases above, there are other solutions available that would be better choices than a multi-dimensional array.
Well the "problems" referred to are not using the structure properly, walking off the end of one or another of the dimensions of the array. If you know what you are doing and code carefully it will work perfectly.
I have often used multidimensional arrays for complex matrix manipulations in C and C++. They come up very frequently in signals analysis and signal detection, as well as in high performance libraries for analyzing geometries in simulations. I did not even consider dynamic array allocation as part of the question. Even then, typically sized arrays for certain bounded problems, with a reset function, could save memory and speed up performance for complex analysis. One could use a cache for smaller matrix manipulations in a library and a more complex C++ OO treatment for larger dynamic allocations on a per-problem basis.
The statements are widely applicable, but not universal. If you have static bounds, it's fine.
In C++, if you want dynamic bounds, you can't have a single contiguous allocation, because the dimensions are part of the type. Even if you don't care for a contiguous allocation, you have to be extra careful, especially if you wish to resize a dimension.
Much simpler is to have a single dimension in some container that will manage the allocation, and a multidimensional view over it.
Given:
std::size_t N, M, L;
std::cin >> N >> M >> L;
Compare:
int *** arr = new int**[N];
std::generate_n(arr, N, [M, L]()
{
    int ** sub = new int*[M];
    std::generate_n(sub, M, [L](){ return new int[L]; });
    return sub;
});
// use arr
std::for_each_n(arr, N, [M](int** sub)
{
    std::for_each_n(sub, M, [](int* subsub){ delete[] subsub; });
    delete[] sub;
});
delete[] arr;
With:
std::vector<int> vec(N * M * L);
gsl::multi_span arr(vec.data(), gsl::strided_bounds<3>({ N, M, L }));
// use arr

Declaring 3D array structure in c++ using vector

Hi, I am a graduate student studying scientific computing using C++. Some of our research focuses on the speed of an algorithm, therefore it is important to construct an array structure that is fast enough.
I've seen two ways of constructing 3D Arrays.
The first one is to use the vector library.
vector<vector<vector<double>>> a(isize, vector<vector<double>>(jsize, vector<double>(ksize, 0)));
This gives 3D array structure of size isize x jsize x ksize.
The other one is to construct a structure containing a 1D array of size isize * jsize * ksize using
new double[isize*jsize*ksize]. To access the specific location (i,j,k) easily, operator overloading is necessary (am I right?).
And from what I have experienced, the first one is much faster since it can access location (i,j,k) easily, while the latter one has to compute the location and return the value. But I have seen some people preferring the latter over the first one. Why do they prefer the latter setup? And is there any disadvantage to using the first one?
Thanks in advance.
Main difference between those will be the layout:
vector<vector<vector<T>>>
This will get you a 1D array of vector<vector<T>>.
Each item will be a 1D array of vector<T>.
And each item of those 1D array will be a 1D array of T.
The point is, vector itself does not store its content. It manages a chunk of memory, and stores the content there. This has a number of bad consequences:
For a matrix of dimension X·Y·Z, you will end up allocating 1 + X + X·Y memory chunks. That's horribly slow, and will trash the heap. Imagine: a cube matrix of size 20 would trigger 421 calls to new!
To access a cell, you have 3 levels of indirection:
You must access the vector<vector<vector<T>>> object to get pointer to top-level memory chunk.
You must then access the vector<vector<T>> object to get pointer to second-level memory chunk.
You must then access the vector<T> object to get pointer to the leaf memory chunk.
Only then can you access the T data.
Those memory chunks will be spread around the heap, causing a lot of cache misses and slowing the overall computation.
Should you get it wrong at some point, it is possible to end up with some lines in your matrix having different lengths. After all, they're independent 1-d arrays.
Having a contiguous memory block (like new T[X * Y * Z]) on the other hand gives:
You allocate 1 memory chunk. No heap trashing, O(1).
You only need to access the pointer to the memory chunk, then can go straight for desired element.
The whole matrix is contiguous in memory, which is cache-friendly.
These days, a single cache miss means dozens or hundreds of lost computing cycles; do not underestimate the cache-friendliness aspect.
By the way, there is a probably better way you didn't mention: using one of the numerous matrix libraries that will handle this for you automatically and provide nice support tools (like SSE-accelerated matrix operations). One such library is Eigen, but there are plenty others.
→ You want to do scientific computing? Let a lib handle the boilerplate and the basics so you can focus on the scientific computing part.
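For illustration only, a minimal sketch of basic Eigen usage (the sizes here are arbitrary):
#include <Eigen/Dense>

int main() {
    Eigen::MatrixXd a = Eigen::MatrixXd::Zero(100, 100);   // contiguous storage managed for you
    Eigen::MatrixXd b = Eigen::MatrixXd::Random(100, 100);
    Eigen::MatrixXd c = a + b * b;                          // operations the library can vectorize
    return c(0, 0) > 0 ? 0 : 1;                             // element access via operator()
}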
In my point of view, there are a lot of advantages that std::vector has over normal plain arrays.
In short here are some:
It is much harder to create memory leaks with std::vector. This point alone is one of the biggest advantages. This has nothing to do with performance, but should be considered all the time.
std::vector is part of the STL. This part of C++ is one of the most used ones. Thousands of people use the STL, so it gets "tested" every day. Over the last years it has been optimized so radically that it doesn't lack any performance anymore. (Please correct me if I'm wrong about this.)
Handling a std::vector is as easy as 1, 2, 3. No pointer handling, no nothing... just access it via its methods, the []-operator, and so on.
First of all, the idea that you access (i,j,k) in your vec^3 directly is somewhat flawed. What you have is a structure of pointers where you need to dereference three pointers along the way. Note that I have no idea whether that is faster or slower than computing the position within a one-dimensional array, though. You'd need to test that, and it might depend on the size of your data (especially whether it fits in cache).
Second, the vector^3 requires pointers and vector sizes, which require more memory. In many cases, this will be irrelevant (as the image grows cubically but the memory difference only quadratically), but if your algorithm is really going to fill out any memory available, that can matter.
Third, the raw array stores everything in consecutive memory, which is good for streaming and can be good for certain algorithms because of quick cache accesses. For example when you add one 3D image to another.
Note that all of this is about hyper-optimization that you might not need. The advantages of vectors that skratchi.at pointed out in his answer are quite strong, and I add the advantage that vectors usually increase readability. If you do not have very good reasons not to use vectors, then use them.
If you should decide for the raw array, in any case, make sure that you wrap it well and keep the class small and simple, in order to counter problems regarding leaks and such.
Welcome to SO.
If those two alternatives are everything you have, then the first one could be better.
Prefer using STL array or vector instead of a C array
You should avoid using plain C++ arrays, since you need to manage the memory yourself, allocating/deallocating with new/delete, plus other boilerplate code like keeping track of the size and checking bounds. In clear words: "C arrays are less safe, and have no advantages over array and vector."
However, there are some important drawbacks in the first alternative. Something I would like to highlight is that:
std::vector<std::vector<std::vector<T>>>
is not a 3-D matrix. In a matrix, all the rows must have the same size. On the other hand, in a "vector of vectors" there is no guarantee that all the nested vectors have the same length. The reason is that a vector is a linear 1-D structure, as pointed out in @spectras' answer. Hence, to avoid all sorts of bad or unexpected behaviours, you must include guards in your code to keep the rectangular invariant guaranteed.
Luckily, the first alternative is not the only one you may have in hands.
For example, you can replace the C-style array with a std::array:
constexpr int n = i_size * j_size * k_size; // the sizes must be compile-time constants for std::array
std::array<int, n> myFlattenMatrix;
or use std::vector in case your matrix dimensions can change.
Accessing element by its 3 coordinates
Regarding your question
To access the specific location of (i,j,k) easily, operator
overloading is necessary(am I right?).
Not exactly. Since neither std::vector nor std::array has a 3-parameter subscript operator, you can't overload one directly. But you can create a template class or function to wrap it for you. In any case you will have to either dereference the 3 nested vectors or calculate the flattened index of the element in the linear storage.
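A rough sketch of such a wrapper (the class name and interface are only illustrative):
#include <cstddef>
#include <vector>

// Flat, contiguous 3D storage with a 3-index accessor.
template <typename T>
class Grid3D {
public:
    Grid3D(std::size_t ni, std::size_t nj, std::size_t nk)
        : nj_(nj), nk_(nk), data_(ni * nj * nk) {}

    T& operator()(std::size_t i, std::size_t j, std::size_t k) {
        return data_[(i * nj_ + j) * nk_ + k];   // flattened index computation
    }

private:
    std::size_t nj_, nk_;
    std::vector<T> data_;
};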
Consider not using a third-party matrix library like Eigen for your experiments
You aren't coding it for production but for research purposes instead. In particular, your research is exactly about the performance of algorithms. In that case, I would not recommend using a third-party library like Eigen at all. Of course it depends a lot on what kind of "speed of an algorithm" metrics you are willing to gather, but Eigen, for instance, will do a lot of things under the hood (like vectorization) which will have a tremendous influence on your experiments. Since it will be hard for you to control those unseen optimizations, the library's features may lead you to wrong conclusions about your algorithms.
Algorithm's performance and big-o notation
Usually, the performance of algorithms is analysed using the big-O approach, where factors like the actual time spent, hardware speed or programming language traits aren't taken into account. The book "Data Structures and Algorithms in C++" by Adam Drozdek can provide more details about it.

What advantages do arrays hold over vectors?

Well, after a full year of programming and only knowing of arrays, I was made aware of the existence of vectors (by some members of StackOverflow on a previous post of mine). I did a load of researching and studying them on my own and rewrote an entire application I had written with arrays and linked lists, with vectors. At this point, I'm not sure if I'll still use arrays, because vectors seem to be more flexible and efficient. With their ability to grow and shrink in size automatically, I don't know if I'll be using arrays as much. At this point, the only advantage I personally see is that arrays are much easier to write and understand. The learning curve for arrays is nothing, whereas there is a small learning curve for vectors. Anyway, I'm sure there's probably a good reason for using arrays in some situations and vectors in others; I was just curious what the community thinks. I'm entirely a novice, so I assume that I'm just not well-informed enough on the strict usages of either.
And in case anyone is even remotely curious, this is the application I'm practicing using vectors with. It's really rough and needs a lot of work: https://github.com/JosephTLyons/Joseph-Lyons-Contact-Book-Application
A std::vector manages a dynamic array. If your program need an array that changes its size dynamically at run-time then you would end up writing code to do all the things a std::vector does but probably much less efficiently.
What the std::vector does is wrap all that code up in a single class so that you don't need to keep writing the same code to do the same stuff over and over.
Accessing the data in a std::vector is no less efficient than accessing the data in a dynamic array because the std::vector functions are all trivial inline functions that the compiler optimizes away.
If, however, you need a fixed size then you can get slightly more efficient than a std::vector with a raw array. However you won't lose anything using a std::array in those cases.
The places I still use raw arrays are like when I need a temporary fixed-size buffer that isn't going to be passed around to other functions:
// some code
{ // new scope for temporary buffer
    char buffer[1024]; // buffer
    file.read(buffer, sizeof(buffer)); // use buffer
} // buffer is destroyed here
But I find it hard to justify ever using a raw dynamic array over a std::vector.
This is not a full answer, but one thing I can think of is that the "ability to grow and shrink" is not such a good thing if you know what you want. For example: assume you want to store 1000 objects, but the memory will be filled at a rate that causes the vector to grow each time. The overhead you'll get from growing will be costly, when you could simply define a fixed-size array.
Generally speaking: if you use an array over a vector, you will have more power in your hands, meaning no "background" function calls you don't actually need (resizing), and no extra memory kept for things you don't use (the size of the vector...).
Additionally, using memory on the stack (array) is faster than heap (vector*) as shown here
*as shown here it's not entirely precise to say vectors reside on the heap, but they sure hold more memory on the heap than the array (that holds none on the heap)
One reason is that if you have a lot of really small structures, small fixed length arrays can be memory efficient.
compare
struct point
{
    float coords[4];
};
with
struct point
{
    std::vector<float> coords;
};
Alternatives include std::array for cases like this. Also, std::vector implementations will over-allocate, meaning that if you resize to 4 slots, you might have memory allocated for 16 slots.
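For the fixed-size case, that std::array alternative looks roughly like this:
#include <array>

struct point
{
    std::array<float, 4> coords;   // fixed size, stored inline: no heap allocation
                                   // and no extra capacity, unlike std::vector
};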
Furthermore, the memory locations will be scattered and hard to predict, killing performance; using an exceptionally large number of std::vectors may also lead to memory fragmentation issues, where new starts failing.
I think this question is best answered flipped around:
What advantages does std::vector have over raw arrays?
I think this list is more easily enumerable (not to say this list is comprehensive):
Automatic dynamic memory allocation
Proper stack, queue, and sort implementations attached
Integration with C++11-related syntactic features such as iterators
If you aren't using such features there's not any particular benefit to std::vector over a "raw array" (though, similarly, in most cases the downsides are negligible).
Despite me saying this, for typical user applications (i.e. running on Windows/Unix desktop platforms) std::vector or std::array is (probably) typically the preferred data structure, because even if you don't need all these features everywhere, if you're already using std::vector anywhere else you may as well keep your data types consistent so your code is easier to maintain.
However, since at its core std::vector simply adds functionality on top of "raw arrays", I think it's important to understand how arrays work in order to fully take advantage of std::vector or std::array (knowing when to use std::array being one example), so you can reduce the "carbon footprint" of std::vector.
Additionally, be aware that you are going to see raw arrays when working with
Embedded code
Kernel code
Signal processing code
Cache efficient matrix implementations
Code dealing with very large data sets
Any other code where performance really matters
The lesson shouldn't be to freak out and say "must std::vector all the things!" when you encounter this in the real world.
One of the powerful features of C++ is that often you can write a class (or struct) that exactly models the memory layout required by a specific protocol, then aim a class-pointer at the memory you need to work with to conveniently interpret or assign values. For better or worse, many such protocols often embed small fixed sized arrays.
There's a decades-old hack for putting an array of 1 element (or even 0 if your compiler allows it as an extension) at the end of a struct/class, aiming a pointer to the struct type at some larger data area, and accessing array elements off the end of the struct based on prior knowledge of the memory availability and content (if reading before writing) - see What's the need of array with zero elements?
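A rough illustration of the hack being described (the names are made up, and relying on this is formally undefined behaviour in standard C++, which is part of why it is called a hack):
#include <cstdlib>

struct Message {
    int  length;
    char payload[1];   // "array of 1 element" at the end of the struct
};

// Allocate one block big enough for the header plus n payload bytes, then
// index payload[0..n-1] even though it is declared with size 1.
Message* make_message(int n) {
    Message* m = static_cast<Message*>(std::malloc(sizeof(Message) + n - 1));
    if (m) m->length = n;
    return m;
}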
Embedding arrays can localise memory access requirements, improving cache hits and therefore performance.

2D array vs. structure (C++)

If execution speed is important, should I use this,
struct myType {
    float dim[3];
};
myType arr[size];
or to use a 2D array as arr[size][index]
It does not matter. The compiler will produce the exact same code regardless in almost all cases. The only difference would be if the struct induced some kind of padding, but given that you have floats that seems unlikely.
It depends on your use case. If you typically use the three dimensions together, the struct organization can be reasonable. Especially when using the dimensions individually, the array layout is most likely to give better performance: contemporary processors don't just load individual words but rather whole cache lines. If only part of the data is used, words are loaded which aren't used.
The array layout is also more amenable to parallel processing, e.g. using SIMD operations. This is unfortunate to some extent because the object layout is generally different. Actually, the arrays you are using are probably similar, but if you change things to float array[3][size], things become different.
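Roughly, the two layouts being contrasted (the size constant is made up for illustration):
constexpr int size = 1024;   // illustrative only

// "array of structs": each element's three components sit next to each other;
// good when a whole element is used together.
struct myType { float dim[3]; };
myType aos[size];

// "struct of arrays" / float array[3][size]: all first components adjacent,
// all second components adjacent, ...; friendlier to SIMD across elements.
float soa[3][size];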
No difference at all. Pick what is more readable to you.
Unless you're working on some weird platform, the memory layout of those two will be the same -- and for the compiler this is what counts most.
The only difference is when you pass something to a function.
When you use the array solution, you never copy the array contents; you just pass the array's address.
With the struct solution, the structs will always be copied if you don't explicitly pass the struct's address.
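Sketched out (the function names are invented for illustration):
struct myType { float dim[3]; };

void takes_array(float dim[3]);           // the array decays: only a float* is passed
void takes_struct(myType v);              // the whole struct (three floats) is copied
void takes_struct_ref(const myType& v);   // pass by reference to avoid the copy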
One other thing to keep in mind that another poster mentioned: If dim will always have a size of 3 in the struct, but the collection really represents something like "Red, Green, Blue" or "X, Y, Z" or "Car, Truck, Boat", from a maintenance standpoint you might be better off breaking them out. That is, use something like
typedef struct VEHICLES
{
    float fCar;
    float fTruck;
    float fBoat;
} Vehicles;
That way when you come back in two years to debug it, or someone else has to look at it, they will not have to guess what dim[0], dim[1] and dim[2] refer to.
You might want to map the 2D array to 1D. It might be more cache-friendly.

Vector, Matrix, Algebra class Design

I am in the process of designing a maths library. I want to start with a simple vector 4 and a 4x4 matrix, and I'll extend it as needs arise. I am trying to weigh the pros and cons of several designs I have seen so that I can choose. I find it frustrating: I searched a lot and found a lot, but almost no answers talked about the efficiency of the design, which is critical for a maths library.
What I am taking into consideration: compilers are amazing nowadays. I know I can't be smarter than the compiler, but I want to help it to the max. C++11 brings good stuff, like move semantics, and other things like std::tuple...
From what I know, the data should be stored in contiguous memory.
Where I am a bit lost and need more info is:
A) Should the data be:
value_type[Rows * Cols] (simple C array), or
value_type* (allocated on the heap, of size Rows * Cols), or use something like
std::tuple
B) Inheritance or composition/aggregation?
I could have a template base class for the data, or I could do it with composition/aggregation.
C) I saw 3 layouts for the data:
a struct and union (like in this article http://www.gamasutra.com/view/feature/4248/designing_fast_crossplatform_simd_.php)
a simple member variable
another one uses static pointers to members instead of a union (http://www.gamedev.net/topic/261920-a-slick-trick-in-c/)
D) Also, in the Gamasutra article (which seems to be old, and compilers are better now) he says that the class should not have operator overloads and that global functions should be used instead. For example, the crossProduct function should be a non-member instead of a member function.
I have all of those questions, I know there are a lot. What is your take on them, especially on A and C?
Edit:
Thanks all for the answers on point A. I must say that at the moment my biggest question is about point C; sorry, I know it wasn't clear. Point C is really about the design of the classes. I saw 2 options (kind of three if you consider the static trick at http://www.gamedev.net/topic/261920-a-slick-trick-in-c/) for a Vector4. For example, I could have members x, y, z and w publicly available, or I could make a union of those members with an array, or I could have only an array and provide functions X(), Y(), Z(), W() as accessors. And finally there is the static trick that I linked just above, but I would prefer if the x, y, z and w were static and the array were the data member.
Refer to Blitz++. Also see its "About" page to get the gist of it. It is one of the popular industrial-strength math libraries written in C++. Though you didn't ask which library to refer to, I am citing this mainly because you can learn from some of the design choices made in this library. You might find insights into the very questions you are pondering.
For a small 4x4 matrix, I would avoid dynamically allocating memory on the heap ... a simple one-dimensional array that you can index as a 2D array should suffice (i.e., the ordered pair (x,y) value would be matrix_array[COLUMNS * y + x]), especially considering that loading any single value in the array will also cause adjacent values to be stored on the processor's cache-line, speeding up access to any adjacent elements. The cache-loading can happen with heap-allocated memory as well, but the main reason to avoid heap allocation if possible for small matrices is that many sequential math operations will require you to return a temporary, and without r-value references, you're going to end up doing a lot of calls to new and delete inside the copy-constructors of those temporaries, which will slow things down tremendously compared to quickly allocating memory on the stack.
Secondly, I would suggest you go with a templated approach, as this will let you define your matrix not only for plain-old-data types like double, etc., but also for any secondary composite types you may decide to define later, or import from another library, such as rationals, complex numbers, etc. Whether you decide to add operator overloads is really up to you ... some people do not like it because it "hides" the complexity of what may be happening underneath the hood (i.e., A * B for doubles will be far simpler than A * B for a 4x4 matrix<double>). On the other hand, it can greatly simplify the amount of code you write for complex math operations.
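Putting the two suggestions together, a minimal sketch (the template and member names are just placeholders):
#include <cstddef>

// Fixed-size matrix held in a plain one-dimensional array on the stack;
// element (x, y) lives at index Cols * y + x, so temporaries need no new/delete.
template <typename T, std::size_t Rows, std::size_t Cols>
struct Matrix {
    T data[Rows * Cols] = {};

    T& operator()(std::size_t x, std::size_t y) {
        return data[Cols * y + x];
    }
    const T& operator()(std::size_t x, std::size_t y) const {
        return data[Cols * y + x];
    }
};

using Mat4 = Matrix<double, 4, 4>;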