Vector, Matrix, Algebra class Design

Vector, Matrix, Algebra class Design - c++

I am in the process of designing a maths library. I want to start with a simple vector 4 and a matrix 4x4 and I'll extend it with the needs. I am trying to pro and cons of several design I have seen so that I can choose. I find it frustrating I searched a lot, I found a lot but almost no answer were talking about efficiency of the design which is critical for a maths library.
What I am taking into consideration, compiler are amazing now a days I know I can't be smarter that the compiler, but I want to help him to the max. C++11 is bringing good stuff, like move semantics and other stuff like std::tuple....
From what I know the data should be stored in continuous memory.
Where I am a bit lost and need more info is:
A) Should the data be:
value_type[ Rows * Cols] (simple c array) or
value_type* (allocated on the heap of size Rows * Cols) or use something like
std::tuple
B) Also Inheritance or composition/aggregation
I could have a template base class for the data or I could do it with composition/aggregation
C) I saw 3 layout of the data
a struct and union (like in this article http://www.gamasutra.com/view/feature/4248/designing_fast_crossplatform_simd_.php)
a simple member variable
another one used static pointer to member instead of a union. (http://www.gamedev.net/topic/261920-a-slick-trick-in-c/)
D) Also in the gamasutra article (which seem to be old and compiler are better now) He say that the class should not have operator overload and that global function should be used instead. For example the crossProduct function to make it non-member instead of a member function.
I have all of those question, I know there is a lot. What are your take on those, especially on A and C.
Edit:
Thanks all for those answer for the point A, I must say that at the moment my biggest question is with point C, sorry I know it wasn't clear. Point c is really about the design of the classes. I saw 2 option (kind of three if you consider this static trick http://www.gamedev.net/topic/261920-a-slick-trick-in-c/) I could have for a Vector4 in example I could have members of x, y, z and w publicly available or I could also make a union with those members and an array, or I could have only an array and have functions X(), Y(), Z(), W() for accessor. And finally there is the static trick that I provided the link just above but I would prefer if the x,y,z and w would be static and the array would be the data member.

Refer to Blitz++. Also see its "About" page to get a gist about it. It is one of the popular industrial strength math library written in C++. Though you didn't ask for which library to refer to I am citing this mainly because you can learn from some of the design choices made in this library. You might find insights on the very questions you are pondering.

For a small 4x4 matrix, I would avoid dynamically allocating memory on the heap ... a simple one-dimensional array that you can index as a 2D array should suffice (i.e., the ordered pair (x,y) value would be matrix_array[COLUMNS * y + x]), especially considering that loading any single value in the array will also cause adjacent values to be stored on the processor's cache-line, speeding up access to any adjacent elements. The cache-loading can happen with heap-allocated memory as well, but the main reason to avoid heap allocation if possible for small matricies is that many sequential math operations will require you to return a temporary, and without r-value references, you're going to end up doing a lot of calls to new and delete inside the copy-constructors of those temporaries which will slow things down tremendously compared to quickly allocating memory on the stack.
Secondly, I would suggest you go with a templated approach, as this will let your define you matrix for not only plain-old-data-types like double, etc., but also for any secondary composite types you may decide to define later, or import from another library such as rationals, complex numbers, etc. Whether you decide to add operator overloads is really up to you ... some people do not like it because it "hides" the complexity of what may be happening underneath the hood (i.e, A * B for doubles will be far simpler than A * B for a 4x4 matrix<double>). On the other-hand, it can greatly simplify the amount of code you write for complex math operations.

Related

Declaring 3D array structure in c++ using vector

Hi I am a graduate student studying scientific computing using c++. Some of our research focus on speed of an algorithm, therefore it is important to construct array structure that is fast enough.
I've seen two ways of constructing 3D Arrays.
First one is to use vector liblary.
vector<vector<vector<double>>> a (isize,vector<double>(jsize,vector<double>(ksize,0)))
This gives 3D array structure of size isize x jsize x ksize.
The other one is to construct a structure containing 1d array of size isize* jsize * ksize using
new double[isize*jsize*ksize]. To access the specific location of (i,j,k) easily, operator overloading is necessary(am I right?).
And from what I have experienced, first one is much faster since it can access to location (i,j,k) easily while latter one has to compute location and return the value. But I have seen some people preferring latter one over the first one. Why do they prefer the latter setting? and is there any disadvantage of using the first one?
Thanks in adavance.

Main difference between those will be the layout:
vector<vector<vector<T>>>
This will get you a 1D array of vector<vector<T>>.
Each item will be a 1D array of vector<T>.
And each item of those 1D array will be a 1D array of T.
The point is, vector itself does not store its content. It manages a chunk of memory, and stores the content there. This has a number of bad consequences:
For a matrix of dimension X·Y·Z, you will end up allocating 1 + X + X·Y memory chunks. That's horribly slow, and will trash the heap. Imagine: a cube matrix of size 20 would trigger 421 calls to new!
To access a cell, you have 3 levels of indirection:
You must access the vector<vector<vector<T>>> object to get pointer to top-level memory chunk.
You must then access the vector<vector<T>> object to get pointer to second-level memory chunk.
You must then access the vector<T> object to get pointer to the leaf memory chunk.
Only then you can access the T data.
Those memory chunks will be spread around the heap, causing a lot of cache misses and slowing the overall computation.
Should you get it wrong at some point, it is possible to end up with some lines in your matrix having different lengths. After all, they're independent 1-d arrays.
Having a contiguous memory block (like new T[X * Y * Z]) on the other hand gives:
You allocate 1 memory chunk. No heap trashing, O(1).
You only need to access the pointer to the memory chunk, then can go straight for desired element.
All matrix is contiguous in memory, which is cache-friendly.
Those days, a single cache miss means dozens or hundreds lost computing cycles, do not underestimate the cache-friendliness aspect.
By the way, there is a probably better way you didn't mention: using one of the numerous matrix libraries that will handle this for you automatically and provide nice support tools (like SSE-accelerated matrix operations). One such library is Eigen, but there are plenty others.
→ You want to do scientific computing? Let a lib handle the boilerplate and the basics so you can focus on the scientific computing part.

In my point of view, there are too much advantages std::vector's have over normal plain arrays.
In short here are some:
It is much harder to create memory leaks with std::vector. This point alone is one of the biggest advantages. This has nothing to do with performance, but should be considered all the time.
std::vector is part of the STL. This part of C++ is one of the most used one. Thousands of people use the STL and so they get "tested" every day. Over the last years they got optimized so radically, they don't lack any performance anymore. (pls correct me if i see this wrong)
Handling with std::vector is easy as 1, 2, 3. No pointer handling no nothing... Just accessing it via methods or with []-operator and more other methods.

First of all, the idea that you access (i,j,k) in your vec^3 directly is somewhat flawed. What you have is a structure of pointers where you need to dereference three pointers along the way. Note that I have no idea whether that is faster or slower than computing the position within a one-dimensional array, though. You'd need to test that and it might depend on the size of your data (especially whether it fits in a chunk).
Second, the vector^3 requires pointers and vector sizes, which require more memory. In many cases, this will be irrelevant (as the image grows cubically but the memory difference only quadratically) but if your algoritm is really going to fill out any memory available, that can matter.
Third, the raw array stores everything in consecutive memory, which is good for streaming and can be good for certain algorithms because of quick cache accesses. For example when you add one 3D image to another.
Note that all of this is about hyper-optimization that you might not need. The advantages of vectors that skratchi.at pointed out in his answer are quite strong, and I add the advantage that vectors usually increase readability. If you do not have very good reasons not to use vectors, then use them.
If you should decide for the raw array, in any case, make sure that you wrap it well and keep the class small and simple, in order to counter problems regarding leaks and such.

Welcome to SO.
If everything what you have are the two alternatives, then the first one could be better.
Prefer using STL array or vector instead of a C array
You should avoid to use C++ plain arrays since you need to manage yourself the memory allocating/deallocating with new/delete and other boilerplate code like keep track of the size/check bounds. In clearly words "C arrays are less safe, and have no advantages over array and vector."
However, there are some important drawbacks in the first alternative. Something I would like to highlight is that:
std::vector<std::vector<std::vector<T>>>
is not a 3-d matrix. In a matrix, all the rows must have the same size. On the other hand, in a "vector of vectors" there is no guarantee that all the nested vectors have the same length. The reason is that a vector is a linear 1-D structure as pointed out in the #spectras answer. Hence, to avoid all sort of bad or unexpected behaviours, you must to include guards in your code to obtain the rectangular invariant guaranteed.
Luckily, the first alternative is not the only one you may have in hands.
For example, you can replace the c-style array by a std::array:
const int n = i_size * j_size * k_size;
std::array<int, n> myFlattenMatrix;
or use std::vector in case your matrix dimensions can change.
Accessing element by its 3 coordinates
Regarding your question
To access the specific location of (i,j,k) easily, operator
overloading is necessary(am I right?).
Not exactly. Since there isn't a 3-parameter operator for neither std::vector nor array, you can't overload it. But you can create a template class or function to wrap it for you. In any case you will must to deference the 3 vectors or calculate the flatten index of the element in the linear storage.
Considering do not use a third part matrix library like Eigen for your experiments
You aren't coding it for production but for research purposes instead. Particularly, your research is exactly regarding the performance of algorithms. In that case, I prefer do not recommend to use a third part library, like Eigen, absolutely. Of course it depends a lot of what kind of "speed of an algorithm" metrics are you willing to gather, but Eigen, for instance, will do a lot of things under the hood (like vectorization) which will have a tremendous influence on your experiments. Since it will be hard for you to control those unseen optimizations, these library's features may lead you to wrong conclusions about your algorithms.
Algorithm's performance and big-o notation
Usually, the performance of algorithms are analysed by using the big-O approach where factors like the actual time spent, hardware speed or programming language traits aren't taken in account. The book "Data Structures and Algorithms in C++" by Adam Drozdek can provide more details about it.

C++ Vector vs Array [duplicate]

I'm new to C++. I'm reading "Beginning C++ Through Game Programming" by Michael Dawson. However, I'm not new to programming in general. I just finished a chapter that dealt with vectors, so I've got a question about their use in the real world (I'm a computer science student, so I don't have much real-world experience yet).
The author has a Q/A at the end of each chapter, and one of them was:
Q: When should I use a vector instead of an array?
A: Almost always. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.
What do you guys think? I remember learning about vectors in a Java book, but we didn't cover them at all in my Intro to Comp. Sci. class, nor my Data Structures class at college. I've also never seen them used in any programming assignments (Java and C). This makes me feel like they're not used very much, although I know that school code and real-world code can be extremely different.
I don't need to be told about the differences between the two data structures; I'm very aware of them. All I want to know is if the author is giving good advice in his Q/A, or if he's simply trying to save beginner programmers from destroying themselves with complexities of managing fixed-size data structures. Also, regardless of what you think of the author's advice, what do you see in the real-world more often?

A: Almost always [use a vector instead of an array]. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.
That's an over-simplification. It's fairly common to use arrays, and can be attractive when:
the elements are specified at compile time, e.g. const char project[] = "Super Server";, const Colours colours[] = { Green, Yellow };
with C++11 it will be equally concise to initialise std::vectors with values
the number of elements is inherently fixed, e.g. const char* const bool_to_str[] = { "false", "true" };, Piece chess_board[8][8];
first-use performance is critical: with arrays of constants the compiler can often write a memory snapshot of the fully pre-initialised objects into the executable image, which is then page-faulted directly into place ready for use, so it's typically much faster that run-time heap allocation (new[]) followed by serialised construction of objects
compiler-generated tables of const data can always be safely read by multiple threads, whereas data constructed at run-time must complete construction before other code triggered by constructors for non-function-local static variables attempts to use that data: you end up needing some manner of Singleton (possibly threadsafe which will be even slower)
In C++03, vectors created with an initial size would construct one prototypical element object then copy construct each data member. That meant that even for types where construction was deliberately left as a no-operation, there was still a cost to copy the data elements - replicating their whatever-garbage-was-left-in-memory values. Clearly an array of uninitialised elements is faster.
One of the powerful features of C++ is that often you can write a class (or struct) that exactly models the memory layout required by a specific protocol, then aim a class-pointer at the memory you need to work with to conveniently interpret or assign values. For better or worse, many such protocols often embed small fixed sized arrays.
There's a decades-old hack for putting an array of 1 element (or even 0 if your compiler allows it as an extension) at the end of a struct/class, aiming a pointer to the struct type at some larger data area, and accessing array elements off the end of the struct based on prior knowledge of the memory availability and content (if reading before writing) - see What's the need of array with zero elements?
classes/structures containing arrays can still be POD types
arrays facilitate access in shared memory from multiple processes (by default vector's internal pointers to the actual dynamically allocated data won't be in shared memory or meaningful across processes, and it was famously difficult to force C++03 vectors to use shared memory like this even when specifying a custom allocator template parameter).
embedding arrays can localise memory access requirement, improving cache hits and therefore performance
That said, if it's not an active pain to use a vector (in code concision, readability or performance) then you're better off doing so: they've size(), checked random access via at(), iterators, resizing (which often becomes necessary as an application "matures") etc.. It's also often easier to change from vector to some other Standard container should there be a need, and safer/easier to apply Standard algorithms (x.end() is better than x + sizeof x / sizeof x[0] any day).
UPDATE: C++11 introduced a std::array<>, which avoids some of the costs of vectors - internally using a fixed-sized array to avoid an extra heap allocation/deallocation - while offering some of the benefits and API features: http://en.cppreference.com/w/cpp/container/array.

One of the best reasons to use a vector as opposed to an array is the RAII idiom. Basically, in order for c++ code to be exception-safe, any dynamically allocated memory or other resources should be encapsulated within objects. These objects should have destructors that free these resources.
When an exception goes unhandled, the ONLY things that are gaurenteed to be called are the destructors of objects on the stack. If you dynamically allocate memory outside of an object, and an uncaught exception is thrown somewhere before it is deleted, you have a memory leak.
It's also a nice way to avoid having to remember to use delete.
You should also check out std::algorithm, which provides a lot of common algorithms for vector and other STL containers.
I have on a few occasions written code with vector that, in retrospect, probably would have been better with a native array. But in all of these cases, either a Boost::multi_array or a Blitz::Array would have been better than either of them.

A std::vector is just a resizable array. It's not much more than that. It's not something you would learn in a Data Structures class, because it isn't an intelligent data structure.
In the real world, I see a lot of arrays. But I also see a lot of legacy codebases that use "C with Classes"-style C++ programming. That doesn't mean that you should program that way.

I am going to pop my opinion in here for coding large sized array/vectors used in science and engineering.
The pointer based arrays in this case can be quite a bit faster especially for standard types. But the pointers add the danger of possible memory leaks. These memory leaks can lead to longer debug cycle. Additionally if you want to make the pointer based array dynamic you have to code this by hand.
On the other hand vectors are slower for standard types. They also are both dynamic and memory safe as long as you are not storing dynamically allocated pointers in the stl vector.
In science and engineering the choice depends on the project. how important is speed vs debug time? For example LAAMPS which is a simulation software uses raw pointers that are handled through their memory management class. Speed is priority for this software. A software I am building, i have to balance speed, with memory footprint and debug time. I really dont want to spend a lot of time debugging so i am using the STL vector.
I wanted to add some more information to this answer that I discovered from extensive testing of large scale arrays and lots of reading the web. So, another problem with stl vector and large sized arrays (one million +) occurs in how memory gets allocated for these arrays. Stl vector uses the std::allocator class for handling memory. This class is a pool based memory allocator. Under small scale loading the pool based allocation is extremely efficient in terms of speed and memory use. As the size of the vector gets into the millions, the pool based strategy becomes a memory hog. This happens because the pools tendency is to always hold more space than is being currently used by the stl vector.
For large scale vectors you are either better off writing your own vector class or using pointers (raw or some sort of memory management system from boost or the c++ library). There are advantages and disadvantages to both approaches. The choice really depends on the exact problem you are tackling (too many variables to add in here). If you do happen to write your own vector class make sure to allow the vector an easy way to clear its memory. Currently for the Stl vector you need to use swap operations to do something that really should have been built into the class in the first place.

Rule of thumb: if you don't know the number of elements in advance, or if the number of elements is expected to be large (say, more than 10), use vector. Otherwise, you could also use an array. For example, I write a lot of geometry-processing code and I define a line as an ARRAY of 2 coordinates. A line is defined by two points, and it will ALWAYS be defined by exactly two points. Using a vector instead of an array would be overkill in many ways, also performance-wise.
Another thing: when I say "array" I really DO MEAN array: a variable declared using an array syntax, such as int evenOddCount[2]; If you consider choosing between a vector and a dynamically-allocated block of memory, such as int *evenOddCount = new int[2];, the answer is clear: USE VECTOR!

It's a rare case in the real world where you deal with fixed collections, of a known size. In almost all cases there is a degree of the unknown in exactly what size of data set you will be accommodating in your program. Indeed it is the hallmark of a good program that it can accomodate a wide range of possible scenarios.
Take these (trivial) scenarios as examples:
You have implemented a view
controller to track AI combatants in
a FPS. The game logic spawns a random
number of combatants in various zones
every couple of seconds. The player
is downing AI combatants at a rate
known only at run time.
A lawyer has accessed the Municipal
Court website in his state and is
querying the number of new DUI cases
that came in over the night. He
chooses to filter the list by a set
of variables including time the
accident occurred, zip code, and
arresting officer.
The operating system needs to
maintain a list of memory addresses
in use by the various programs
running on it. The number of programs
and their memory usage changes in
unpredictable ways.
In any of these cases a good argument can be made that a variable size list (that accommodates dynamic inserts and deletes) will perform better than a simple array. With the main benefits coming from reduced need to alloc/dealloc memory space for the fixed array as you add or remove elements from it.

As far as arrays are considered, simple integer or string arrays are very easy to use. On the other hand, for common functions like searching,sorting,insertion,removal, you can achieve much faster speed using standard algorithms (built in library functions) on vectors. Specially if you are using vectors of objects.
Secondly there is this huge difference that vectors can grow in size dynamically as more objects are inserted. Hope that helps.

c++11 std::array vs static array vs std::vector

First question, is it a good thing to start using c++11 if I will develop a code for the 3 following years?
Then if it is, what is the "best" way to implement a matrix if I want to use it with Lapack? I mean, doing std::vector<std::vector< Type > > Matrix is not easily compatible with Lapack.
Up to now, I stored my matrix with Type* Matrix(new Type[N]) (the pointer form with new and delete were important because the size of the array is not given as a number like 5, but as a variable).
But with C++11 it is possible to use std::array. According to this site, this container seems to be the best solution... What do you think?

First things first, if you are going to learn C++, learn C++11. The previous C++ standard was released in 2003, meaning it's already ten years old. That's a lot in IT world. C++11 skills will also smoothly translate to upcoming C++1y (most probably C++14) standard.
The main difference between std::vector and std::array is the dynamic (in size and allocation) and static storage. So if you want to have a matrix class that's always, say, 4x4, std::array<float, 4*4> will do just fine.
Both of these classes provide .data() member, which should produce a compatible pointer. Note however, that std::vector<std::vector<float>> will NOT occuppy contiguous 16*sizeof(float) memory (so v[0].data() won't work). If you need a dynamically sized matrix, use single vector and resize it to the width*height size.
Since the access to the elements will be a bit harder (v[width * y +x] or v[height * x + y]), you might want to provide a wrapper class that will allow you to access arbitrary field by row/column pair.
Since you've also mentioned C-style arrays; std::array provides nicer interface to deal with the same type of storage, and thus should be preferred; there's nothing to gain with static arrays over std::array.

This is a very late reply to the question, but if someone reads this, I just want to point out that one should almost never implement a matrix as a ''vector of vectors''. The reason is that each row of the matrix gets stored in some random location on the heap. This means that matrix operations will do a lot of random memory accesses leading to cache misses, which slows down the implementation considerably.
In other words, if you care at all about performance, just allocate an array/std::array/std::vector of size rows * columns, then use wrapper functions that transforms a pair of integers to the corresponding element in the array. Unless you need to support things like returning references to rows of the matrix, then all of these options should work just fine.

Are there benefits to allocating large data contiguously?

In my program, I have the following arrays of double: a1, a2, ..., am; b1, b2, ..., bm; c1, c2, ..., cm; which are members of a class, all of length N, where m and N are known at run time. The reason I named them a, b, and, c is because they mean different things and that's how they are accessed outside the class. I wonder what's the best way to allocate memory for them. I was thinking:
1) Allocating everything in one big chunk. Something like.. double *ALL = new double[3*N*m] and then have a member function return a pointer to the requested part using pointer arithmetic.
2) Create 2D arrays A, B, and C of size m*N each.
3) Use std::vector? But since m is known at run time, then I need vector of vectors.
or does it not really matter what I use? I'm just wondering what's a good general practice.

If all three are linked in some way, if there is any relationship between a[i] and b[i], then they should all be stored together, ideally in a structure that names them with a meaningful and relevant name. This will be easier to understand for any future developer and ensures that the length of the array is always correct by default.
This is called design affordance, meaning that the structure of an object or interface lends itself to be used as intended by default. Just think how a programmer who had never seen the code before would interpret its purpose, the less ambiguity the better.
EDIT
Rereading I realize you might be asking about some kind of memory optimization (?) although it isn't clear. I'd still say use something like this, either an array of class pointers or structs depending on just how large N is.

This really depends significantly on how the data are used. If each array is used independently then the straightforward approach is either a number of named vectors of vectors.
If the arrays are used together where for example a[i] and b[i] are related and used together, separate arrays is not really a good approach because you'll keep accessing different areas of memory potentially causing a lot of cache misses. Instead you would want to aggregate the elements of a and b together into a struct or class and then have a single vector of those aggregates.
I don't see a big problem with allocating a big array and providing an appropriate interface to access the correct sets of elements. But please don't do this with new to manage your memory: Use vector even in this case: std::vector<double> all(3*N*m); However I'm not sure this buys you anything either so one of my other options may be more clear for the intention.

Use option 3, a vector of vectors. That will free you from worrying about memory management.
Then hide it behind an interface so you can change it if you feel the need.

C++ std::vector vs array in the real world

I'm new to C++. I'm reading "Beginning C++ Through Game Programming" by Michael Dawson. However, I'm not new to programming in general. I just finished a chapter that dealt with vectors, so I've got a question about their use in the real world (I'm a computer science student, so I don't have much real-world experience yet).
The author has a Q/A at the end of each chapter, and one of them was:
Q: When should I use a vector instead of an array?
A: Almost always. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.
What do you guys think? I remember learning about vectors in a Java book, but we didn't cover them at all in my Intro to Comp. Sci. class, nor my Data Structures class at college. I've also never seen them used in any programming assignments (Java and C). This makes me feel like they're not used very much, although I know that school code and real-world code can be extremely different.
I don't need to be told about the differences between the two data structures; I'm very aware of them. All I want to know is if the author is giving good advice in his Q/A, or if he's simply trying to save beginner programmers from destroying themselves with complexities of managing fixed-size data structures. Also, regardless of what you think of the author's advice, what do you see in the real-world more often?

A: Almost always [use a vector instead of an array]. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.
That's an over-simplification. It's fairly common to use arrays, and can be attractive when:
the elements are specified at compile time, e.g. const char project[] = "Super Server";, const Colours colours[] = { Green, Yellow };
with C++11 it will be equally concise to initialise std::vectors with values
the number of elements is inherently fixed, e.g. const char* const bool_to_str[] = { "false", "true" };, Piece chess_board[8][8];
first-use performance is critical: with arrays of constants the compiler can often write a memory snapshot of the fully pre-initialised objects into the executable image, which is then page-faulted directly into place ready for use, so it's typically much faster that run-time heap allocation (new[]) followed by serialised construction of objects
compiler-generated tables of const data can always be safely read by multiple threads, whereas data constructed at run-time must complete construction before other code triggered by constructors for non-function-local static variables attempts to use that data: you end up needing some manner of Singleton (possibly threadsafe which will be even slower)
In C++03, vectors created with an initial size would construct one prototypical element object then copy construct each data member. That meant that even for types where construction was deliberately left as a no-operation, there was still a cost to copy the data elements - replicating their whatever-garbage-was-left-in-memory values. Clearly an array of uninitialised elements is faster.
One of the powerful features of C++ is that often you can write a class (or struct) that exactly models the memory layout required by a specific protocol, then aim a class-pointer at the memory you need to work with to conveniently interpret or assign values. For better or worse, many such protocols often embed small fixed sized arrays.
There's a decades-old hack for putting an array of 1 element (or even 0 if your compiler allows it as an extension) at the end of a struct/class, aiming a pointer to the struct type at some larger data area, and accessing array elements off the end of the struct based on prior knowledge of the memory availability and content (if reading before writing) - see What's the need of array with zero elements?
classes/structures containing arrays can still be POD types
arrays facilitate access in shared memory from multiple processes (by default vector's internal pointers to the actual dynamically allocated data won't be in shared memory or meaningful across processes, and it was famously difficult to force C++03 vectors to use shared memory like this even when specifying a custom allocator template parameter).
embedding arrays can localise memory access requirement, improving cache hits and therefore performance
That said, if it's not an active pain to use a vector (in code concision, readability or performance) then you're better off doing so: they've size(), checked random access via at(), iterators, resizing (which often becomes necessary as an application "matures") etc.. It's also often easier to change from vector to some other Standard container should there be a need, and safer/easier to apply Standard algorithms (x.end() is better than x + sizeof x / sizeof x[0] any day).
UPDATE: C++11 introduced a std::array<>, which avoids some of the costs of vectors - internally using a fixed-sized array to avoid an extra heap allocation/deallocation - while offering some of the benefits and API features: http://en.cppreference.com/w/cpp/container/array.

One of the best reasons to use a vector as opposed to an array is the RAII idiom. Basically, in order for c++ code to be exception-safe, any dynamically allocated memory or other resources should be encapsulated within objects. These objects should have destructors that free these resources.
When an exception goes unhandled, the ONLY things that are gaurenteed to be called are the destructors of objects on the stack. If you dynamically allocate memory outside of an object, and an uncaught exception is thrown somewhere before it is deleted, you have a memory leak.
It's also a nice way to avoid having to remember to use delete.
You should also check out std::algorithm, which provides a lot of common algorithms for vector and other STL containers.
I have on a few occasions written code with vector that, in retrospect, probably would have been better with a native array. But in all of these cases, either a Boost::multi_array or a Blitz::Array would have been better than either of them.

A std::vector is just a resizable array. It's not much more than that. It's not something you would learn in a Data Structures class, because it isn't an intelligent data structure.
In the real world, I see a lot of arrays. But I also see a lot of legacy codebases that use "C with Classes"-style C++ programming. That doesn't mean that you should program that way.

I am going to pop my opinion in here for coding large sized array/vectors used in science and engineering.
The pointer based arrays in this case can be quite a bit faster especially for standard types. But the pointers add the danger of possible memory leaks. These memory leaks can lead to longer debug cycle. Additionally if you want to make the pointer based array dynamic you have to code this by hand.
On the other hand vectors are slower for standard types. They also are both dynamic and memory safe as long as you are not storing dynamically allocated pointers in the stl vector.
In science and engineering the choice depends on the project. how important is speed vs debug time? For example LAAMPS which is a simulation software uses raw pointers that are handled through their memory management class. Speed is priority for this software. A software I am building, i have to balance speed, with memory footprint and debug time. I really dont want to spend a lot of time debugging so i am using the STL vector.
I wanted to add some more information to this answer that I discovered from extensive testing of large scale arrays and lots of reading the web. So, another problem with stl vector and large sized arrays (one million +) occurs in how memory gets allocated for these arrays. Stl vector uses the std::allocator class for handling memory. This class is a pool based memory allocator. Under small scale loading the pool based allocation is extremely efficient in terms of speed and memory use. As the size of the vector gets into the millions, the pool based strategy becomes a memory hog. This happens because the pools tendency is to always hold more space than is being currently used by the stl vector.
For large scale vectors you are either better off writing your own vector class or using pointers (raw or some sort of memory management system from boost or the c++ library). There are advantages and disadvantages to both approaches. The choice really depends on the exact problem you are tackling (too many variables to add in here). If you do happen to write your own vector class make sure to allow the vector an easy way to clear its memory. Currently for the Stl vector you need to use swap operations to do something that really should have been built into the class in the first place.

Rule of thumb: if you don't know the number of elements in advance, or if the number of elements is expected to be large (say, more than 10), use vector. Otherwise, you could also use an array. For example, I write a lot of geometry-processing code and I define a line as an ARRAY of 2 coordinates. A line is defined by two points, and it will ALWAYS be defined by exactly two points. Using a vector instead of an array would be overkill in many ways, also performance-wise.
Another thing: when I say "array" I really DO MEAN array: a variable declared using an array syntax, such as int evenOddCount[2]; If you consider choosing between a vector and a dynamically-allocated block of memory, such as int *evenOddCount = new int[2];, the answer is clear: USE VECTOR!

It's a rare case in the real world where you deal with fixed collections, of a known size. In almost all cases there is a degree of the unknown in exactly what size of data set you will be accommodating in your program. Indeed it is the hallmark of a good program that it can accomodate a wide range of possible scenarios.
Take these (trivial) scenarios as examples:
You have implemented a view
controller to track AI combatants in
a FPS. The game logic spawns a random
number of combatants in various zones
every couple of seconds. The player
is downing AI combatants at a rate
known only at run time.
A lawyer has accessed the Municipal
Court website in his state and is
querying the number of new DUI cases
that came in over the night. He
chooses to filter the list by a set
of variables including time the
accident occurred, zip code, and
arresting officer.
The operating system needs to
maintain a list of memory addresses
in use by the various programs
running on it. The number of programs
and their memory usage changes in
unpredictable ways.
In any of these cases a good argument can be made that a variable size list (that accommodates dynamic inserts and deletes) will perform better than a simple array. With the main benefits coming from reduced need to alloc/dealloc memory space for the fixed array as you add or remove elements from it.

As far as arrays are considered, simple integer or string arrays are very easy to use. On the other hand, for common functions like searching,sorting,insertion,removal, you can achieve much faster speed using standard algorithms (built in library functions) on vectors. Specially if you are using vectors of objects.
Secondly there is this huge difference that vectors can grow in size dynamically as more objects are inserted. Hope that helps.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Vector, Matrix, Algebra class Design - c++

Related

Declaring 3D array structure in c++ using vector

C++ Vector vs Array [duplicate]

c++11 std::array vs static array vs std::vector

Are there benefits to allocating large data contiguously?

C++ std::vector vs array in the real world

Categories

Resources