Discussion about std::vector<T> and standard array - c++

Discussion about std::vector and standard array
Say if we have following code:
void myclass::loadArray(void *outData)
void myclass::loadVector(void *outData)
void myclass::func ()
{
//here we have a vector
std::vector<int> myVector;
myVector.resize(10)
// here we have an array
int myArray[10];
here I wonder what will be the differences between following implementations
//1: array
myclass::loadArray(myArray)
//2: array
myclass::loadArray(&(myArray[0]))
//1: vector
myclass::loadVector(myVector)
//2: vector
myclass::loadVector(&(myVector[0]))
}
From my understanding, Just depending on if we want to use array and vector we pick different solution.
There is no difference between 1 and 2. Could you please correct me if I am wrong.

The two versions with the array are equivalent: in the first, the array is implicitly converted to a pointer to its first element, which the second creates explicitly.
The first version with the vector won't compile, since there is no implicit conversion to a pointer. You'll have to explictly get the address of the array; either as you do in the second version, or with vector.data().

1 and 2 will be the same for the array versions. In the first, the array will decay to a pointer to the first element, the second calls explicitly with a pointer to the first element.
The first vector version will not compile as you can't implicitly change a std::vector to a void*.
For the second vector version, you will be calling the function with a pointer to the first element stored in the vector, not the vector itself.

Related

Declaring arrays in C++

I am new to C++ and currently learning it with a book by myself. This book seems to say that there are several kinds of arrays depending on how you declare it. I guess the difference between dynamic arrays and static arrays are clear to me. But I do not understand the difference between the STL std::array class and a static array.
An STL std::array variable is declared as:
std::array < int, arraySize > array1;
Whereas a static array variable is declared as:
int array1[arraySize];
Is there a fundamental difference between the two? Or is it just syntax and the two are basically the same?
A std::array<> is just a light wrapper around a C-style array, with some additional nice interface member functions (like begin, end etc) and typedefs, roughly defined as
template<typename T, size_t N>
class array
{
public:
T _arr[N];
T& operator[](size_t);
const T& operator[](size_t) const;
// other member functions and typedefs
}
One fundamental difference though is that the former can be passed by value, whereas for the latter you only pass a pointer to its first element or you can pass it by reference, but you cannot copy it into the function (except via a std::copy or manually).
A common mistake is to assume that every time you pass a C-style array to a function you lose its size due to the array decaying to a pointer. This is not always true. If you pass it by reference, you can recover its size, as there is no decay in this case:
#include <iostream>
template<typename T, size_t N>
void f(T (&arr)[N]) // the type of arr is T(&)[N], not T*
{
std::cout << "I'm an array of size " << N;
}
int main()
{
int arr[10];
f(arr); // outputs its size, there is no decay happening
}
Live on Coliru
The main difference between these two is an important one.
Besides the nice methods the STL gives you, when passing a std::array to a function, there is no decay. Meaning, when you receive the std::array in the function, it is still a std::array, but when you pass an int[] array to a function, it effectively decays to an int* pointer and the size of the array is lost.
This difference is a major one. Once you lose the array size, the code is now prone to a lot of bugs, as you have to keep track of the array size manually. sizeof() returns the size of a pointer type instead of the number of elements in the array. This forces you to manually keep track of the array size using interfaces like process(int *array, int size). This is an ok solution, but prone to errors.
See the guidelines by Bjarne Stroustroup:
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rp-run-time
That can be avoided with a better data type, which std::array is designed for, among many other STL classes.
As a side note, unless there's a strong reason to use a fixed size array, std::vector may be a better choice as a contiguous memory data structure.
std::array and C-style arrays are similar:
They both store a contiguous sequence of objects
They are both aggregate types and can therefore be initialized using aggregate initialization
Their size is known at compile time
They do not use dynamic memory allocation
An important advantage of std::array is that it can be passed by value and doesn't implicitly decay to a pointer like a C-style array does.
In both cases, the array is created on the stack.
However, the STL's std::array class template offers some advantages over the "raw" C-like array syntax of your second case:
int array1[arraySize];
For example, with std::array you have a typical STL interface, with methods like size (which you can use to query the array's element count), front, back, at, etc.
You can find more details here.
Is there a fundamental difference between the two? or is it just syntax and the two are basically the same?
There's a number of differences for a raw c-style array (built-in array) vs. the std::array.
As you can see from the reference documentation there's a number of operations available that aren't with a raw array:
E.g.: Element access
at()
front()
back()
data()
The underlying data type of the std::array is still a raw array, but garnished with "syntactic sugar" (if that should be your concern).
The key differences of std::array<> and a C-style array is that the former is a class that wraps around the latter. The class has begin() and end() methods that allow std::array objects to be easily passed in as parameters to STL algorithms that expect iterators (Note that C-style arrays can too via non member std::begin/std::end methods). The first points to the beginning of the array and the second points to one element beyond its end. You see this pattern with other STL containers, such as std::vector, std::map, std::set, etc.
What's also nice about the STL std::array is that it has a size() method that lets you get the element count. To get the element count of a C-style array, you'll have to write sizeof(cArray)/sizeof(cArray[0]), so doesn't stlArray.size() looks much more readable?
You can get full reference here:
http://en.cppreference.com/w/cpp/container/array
Usually you should prefer std::array<T, size> array1; over T array2[size];, althoug the underlying structure is identical.
The main reason for that is that std::array always knows its size. You can call its size() method to get the size. Whereas when you use a C-style array (i.e. what you called "built-in array") you always have to pass the size around to functions that work with that array. If you get that wrong somehow, you could cause buffer overflows and the function tries to read from/write to memory that does not belong to the array anymore. This cannot happen with std::array, because the size is always clear.
IMO,
Pros: It’s efficient, in that it doesn’t use any more memory than built-in fixed arrays.
Cons: std::array over a built-in fixed array is a slightly more awkward syntax, and that you have to explicitly specify the array length (the compiler won’t calculate it for you from the initializer).

Pointer to Array holding two Vectors

I am having something i can't seem to figure out myself again. I don't know how to call this problem.
vector<int> *integerVectors[2] = { new vector<int>, new vector<int>};
(*integerVectors)[0].push_back(1);
(*integerVectors)[1].push_back(1);
When i run this i get an unhandled exeption. What I want is an Array with 2 indexes and each of them holding a vector.
EDIT: The Problem seems to appear when i start pushing back.
this syntax:
MyType *var[size];
creates an array of pointers to MyType of size size, which means that this:
vector<int> *integerVectors[2];
produces a size 2 array of pointers to integer vectors, a fact which is backed up by your ability to initialize integerVectors with an initializer-list of pointers to integer vectors produced by calls to new.
this:
(*integerVectors)
produces a pointer to your first vector pointer. You then call operator[] on it, which offsets the pointer by the size of a vector. But this is no longer a pointer to your array--if you call it with an argument of greater than 0, you'll be referencing an imaginary vector next to the one pointed by your first vector element.
Then you call push_back on the imaginary vector, naturally leading to massive problems at run time.
You either want to offset before dereferencing, as in #Abstraction's suggestion of
integerVectors[i]->push_back(1);
or you want to avoid C-style arrays. You're already using one vector, and nesting them rather than making arrays of them will avoid much confusion of this type in the future, while preserving the correct syntax:
vector<vector<int>*> integerVectors = {new vector<int>, new vector<int>};
integerVectors[1]->push_back(1);
Without the c-style array, your needed syntax is vastly clearer.
Even better, you could just avoid the pointers altogether and use
vector<vector<int>> integerVectors = {vector<int>{}, vector<int>{}};
integerVectors[1].push_back(1);
The first line of which has a few equivalent syntaxes, as pointed out by #NeilKirk:
vector<vector<int>> integerVectors(2);
vector<vector<int>> integerVectors{{},{}};
vector<vector<int>> integerVectors(2, vector<int>{});
vector<vector<int>> integerVectors(2, {});
The curly braces I'm using in initialization are another way to initialize an object, without the possibility of the compiler confusing it for a function in the most vexing parse. (However, you have to be careful with it around things which can be initialized with initializer lists like here, which is why some of these initializations use parentheses, instead) The first syntax default-initializes two vectors, and so does the second, though it explicitly states the same thing. It can also be modified to produce more complex structures: two vectors both with the elements {1,2,3} could be one of the below:
vector<vector<int>> integerVectors = {{1,2,3},{1,2,3}};
vector<vector<int>> integerVectors{{1,2,3},{1,2,3}};
vector<vector<int>> integerVectors(2, vector<int>{1,2,3});
vector<vector<int>> integerVectors(2, {1,2,3});
This line is the problem
(*integerVectors)[1].push_back(1);
Dereferencing *integerVectors gives you the first vector pointer (equivalent of integerVectors[0]. Then you call you call operator[] with 1 as arguent on this pointer which will give reference to vector with address shifted with one vector size forward (equivalent to *(integerVectors[0]+1) ) which is not valid.
The right syntax is
integerVectors[1]->push_back(1);

Vector initialization using array

we can initialize a vector using array. suppose,
int a[]={1,2,3}
vector<int>x(a,a+3)
this is a way . My question is, what is a and a+3 here, are they pointer?
and someone could explain this:
for the above array
vector<int>x(a,&a[3])
also gives no error and do the same as above code. If we write a[3], it should be outside of the array? can someone tell me the inner mechanism?
Yes, a and a+3 are pointers. The plain a is an array that gets converted implicitly to a pointer at the drop of a hat.
The construct &a[3] is identical in meaning to a+3 for plain C arrays. Yes, a[3] is outside the array, but one-past is allowed to point to, if you don't dereference it.
A vector's range constructor looks like this:
template <class InputIterator>
vector (InputIterator first, InputIterator last,
const allocator_type& alloc = allocator_type());
It will construct elements in the range [first, last), meaning that the last element is not included. &a[3] points to an element outside the bounds of the array, similar to how std::end(a) will point one past the end of a. Compare it to:
std::vector<int> x(std::begin(a), std::end(a));
Also, *(a + 3) is equivalent to a[3].
int a[]={1,2,3}
vectorx(a,a+3)
a is an array so it is always pointing to its base address.
a+3 mean base address+(sizeof(int) *3)
suppose base address is 100
a=100;
a+3=106;
vectorx(a,&a[3])
Here &a[3] is equivalent to a+3

How can I sort an array passed as a parameter?

I have to write a method within already-written code that passes me an array directly. However once inside my method that array has become a pointer to the first object in the array. So now I have done some calculations and want to sort the array. But since it's now not considered an array, I can't perform the sort() function.
What's the best way to sort an array when I only have the pointer to work with?
You either need to know the number of elements in the array, passed as a separate parameter or have a pointer to one past the last element.
void my_sort(int* p, unsigned n) {
std::sort(p, p+n);
}
or
void my_sort2(int* p, int* p_end) {
std::sort(p, p_end);
}
and you would call them
int a[] = { 3, 1, 2 };
my_sort(a, sizeof a / sizeof a[0]); // or 3...
my_sort2(a, &a[2] + 1); // one past the last element! i.e. a+3
In c there is essentially no difference between an "array" and a "pointer to the first object in the array". Arrays are referred to using their base pointer, that is, pointer to first object.
Technically precise explanation at Array base pointer and its address are same. Why?
So, just sort the array as you would anywhere else. Got an example sort or sample code in mind or is that sufficient?
Sort it exactly as you would sort it before you passed it in. If your sort() function requires a length, then pass the length as an additional parameter.
The best would be if you could start using std::array from C++11 on:
http://en.cppreference.com/w/cpp/container/array
This way, you would also have the size known and accessible by the corresponding size method. You could also consider other std container types rather than raw array. In general, it is better to avoid raw arrays as much as possible.
Failing that, you would need to know the size of the array either through function parameter, or other means like class member variable if it is happening inside a class, and so on.
Then, you could use different type of sorting algorithms based on your complexity desire; let it be quick sort, bubble sort, heap sort, stable sort, etc... it depends on what kind of data, the array represents, etc.
One sorting algorithm is to use std::sort. Therefore, you would be writing something like this:
std::sort (mystdarray.begin(), mystdarray.end());
or
std::sort (myrawarray, myrawarray+size);

How can a vector be used in this way?

Can someone explain this code please? how is it that the function bar accepts a reference to the first element of the vector?
jintArray arry;
std::vector<int> foo = GetIntegerArray(env, arry);
bar(&foo[0])
where the protoytpe of bar is
bar(int* array)
This is valid as long as the template type isn't bool. The C++ vector type specifies that the vector elements are consecutive in memory like that so that you can do exactly this.
The reason why it doesn't work with bool is due to template specialization. Where the bools are compressed down to a bitfield.
http://en.wikipedia.org/wiki/Vector_%28C%2B%2B%29#vector.3Cbool.3E_specialization
how is it that the function bar accepts a reference to the first element of the vector?
This seems to be the source of the confusion. The expression &foo[0] is not a reference to the first element, but rather a pointer. operator[] is overloaded in the vector class to obtain a reference (or const-reference), and applying & will obtain the address of the object.
Yes. Just make sure the vector is not empty, or &foo[0] would be an error. C++11 introduced the std::vector<T>::data() function that does not have this problem.
Also, returning a vector by value is usually not a good idea. You might want to use an output iterator or a vector reference parameter in GetIntegerArray, so you would call it like this:
std::vector<int> foo;
GetIntegerArray(env, arry, back_inserter(foo));
or
std::vector<int> foo;
GetIntegerArray(env, arry, foo);
When you use std::vector<int>, it is guaranteed that all the element are created in contiguous memory. As such, when you write &v[0] it returns the pointer to the first element, and from this you can go the next element by writing &v[0]+1, and so on.
By the way, if you want to traverse through all elements or a section of elements, then a better interface for bar would be this:
void bar(int *begin, int *end)
{
for ( ; begin != end; ++begin)
{
//code
}
}
So you can call like this:
bar(&foo[0], &foo[0] + foo.size());//process all elements
bar(&foo[0], &foo[0] + foo.size()/2);//process first half elements
bar(&foo[0], &foo[0] + N); //process first N elements(assumingN <=foo.size())
bar(&foo[0]+foo.size()/2, &foo[0]+foo.size());//process second half elements