Simple Deque initialization question - c++

I have used the following code to insert some data in a deque.
int data[] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
deque<int> rawData (data, data + sizeof(data) / sizeof(int));
But I don't understand this part of the code:
data + sizeof(data) / sizeof(int)
What does it mean?

Let's take that bit by bit.
data is the iterator showing where to start. It's an array, but in C and C++ arrays decay to pointers on any provocation, so it's used as a pointer. Start taking in data from data on, and continue until the end iterator.
The end iterator is a certain amount past the start iterator, so it can be expressed as data + <something>, where <something> is the length. The start iterator is an int [] that is treated as an int *, so we want to find the length in ints. (In C and C++, pointers increment by the length of the pointed-to type.)
Therefore, sizeof(data) / sizeof(int) should be the length of the array. sizeof(data) is the total size of the array in bytes. (This is one of the differences between arrays and pointers: arrays have a defined size, while pointers point to what might be the start of an array of unknown size.) sizeof(int) is the total size of an int in bytes, and so the quotient is the total size of array in ints.
We want the size of the array in ints because the array decays into an int *, and so data + x points to the memory location x ints past data. From the beginning and the element count we find the end of data, and so everything in data from the beginning to the end is copied.

That's a pointer to the imaginary element just beyond the last element of the array. sizeof(data)/sizeof(data[0]) yields the number of elements in the data array. The deque constructor accepts an "iterator to the first element" and an "iterator beyond the last element" (which is what an end() iterator is). This expression effectively computes what .end() would yield for a container.

Related

How does List::insert work with arrays

I am on YouTube looking over tutorials. One had a block of code using the list library like this:
int arr1[5]{ 1,2,3,4,5 };
list<int> list1;
list1.insert(list1.begin(), arr1, arr1 + 5);
My question is, how can one use an array like this? Last I checked, arr1 is an array not a pointer that you use to loop through elements. How does the insert function work?
When an array is used by name, its name refers to the first element of the array. For an array arr, whenever you write arr[x], the [] operator is defined in terms of pointers. It means: start at a pointer to the first element of arr and move x steps ahead. The size of each step is the sizeof of the data type your array is made up of. Thus, arr[x] can also be written as *(arr + x), dereferencing the pointer after x steps.
Now, speaking of your list insertion, it means: copy all the elements in the range from arr1 up to, but not including, arr1 + 5 into the list.
arr1 can be used as a pointer to the beginning of the array, because it gets converted automatically.

Inserting a sub-array into a vector

The following statement inserts part of an array into an empty vector. It then prints the last element inserted, which is 14 in this case. My question is, how is the final array element that is inserted being determined with this syntax? How is "myArray+3" returning the third element in the array to the function?
vector <int> myVector(10);
int myArray[5] = {3,9,14,19,94};
myVector.insert(myVector.begin(), myArray, myArray+3);
cout << myVector.at(2) << endl;
For starters, the vector is not empty. It has 10 elements initialized to zero.
vector <int> myVector(10);
As for these arguments
myArray, myArray+3
then they specify a range in the array in the following way:
[&myArray[0], &myArray[3])
^                        ^
That means that the elements pointed to by these pointers
&myArray[0], &myArray[1], &myArray[2]
will be inserted into the vector. That is, the second value of the range is exclusive: only the elements before it are included.
The element pointed to by the pointer &myArray[3] (that is by the pointer myArray + 3) will not be inserted to the vector.
For comparison: if an array has N elements, then the range of valid indices for its elements is
[0, N-1]
^      ^
which can also be written as
[0, N)
^    ^
Arrays in C++ are laid out in a contiguous fashion, so that the address of the array is the same as the address of the first element of the array, followed by the address of the next, and the next, etc.
Now when you write myArray + 3, this is actually saying, "start at the first element and move three elements forward". The result points to myArray[3], the fourth element, which is why it works as the exclusive end of a range covering the first three.
So if you had written (myArray + 1) + 3, this would mean: first move from the first position to the second, and then, using the new position as the reference point, move three positions on from there.
How does it know where to go? Simply by taking the size in bytes of a single element of the array and multiplying that by the distance you wanted to move forward, and then adding this value to the address of the reference position, and voila! You have gotten to the nth element of the array.

Pointer to 2D Array(why does it work)

I have the following function (I want to print all elements from a given row):
void print_row(int j, int row_dimension, int *p)
{
    p = p + (j * row_dimension);
    for(int i = 0; i< row_dimension; i++)
        cout<<*(p+i)<< " ";
}
Creating an array
int j[3][3]={{1,2,3},
             {4,5,6},
             {7,8,9}};
What I do not understand is why can I call the function in the following way :
print_row(i, 3, *j);
Why can I give as a parameter "*j" ? Shouldn't an address be passed? Why can I use the indirection operator?
int j[3][3] =
{{1,2,3},
{4,5,6},
{7,8,9}}; // 2d array
auto t1 = j; // int (*t1)[3]
auto t2 = *j; // int *t2
So what is happening is that *j produces j[0], which is an int[3] which then decays to an int*.
j is in fact an array of arrays. As such, *j is an array of three integers, and when used as an rvalue, it decays to a pointer to its first element; said differently, it decays to &j[0][0].
Then in print_row you compute the starting address of the first element of each subarray. That's the less nice part, and I'll come back to it later. Then you correctly use *(p+i), the equivalent of p[i], to access each element of the subarray.
The remaining part of the answer is my interpretation of a strict reading of C standard
I said that computing the starting address of each subarray was the less nice part. It works because we all know that a 2D array of size NxM has the same representation in memory as a linear array of size N*M, and we alias those representations. But if we strictly respect the standard, the int pointer obtained from *j (that is, &j[0][0]) points to the first element of an array of three elements. As such, when you add the size of a row, you point past the end of that array, which leads to undefined behaviour if you later dereference this address. Of course it works with all common compilers, but on an old question of mine, @HansPassant gave me a reference to experimental compilers able to enforce checks on array sizes. Those compilers could detect the access past the end of the array and raise a run-time error... but it would break a lot of existing code!
To be strictly standard conformant, you should use a pointer to arrays of 3 integers. When the row length is only known at run time, that requires Variable Length Arrays, which are an optional feature in C (and not part of standard C++) but fully standard conformant on systems supporting them. Alternatively, you can go down to the byte representation of the 2D array, get its initial address, and from there compute as byte addresses the starting point of each subarray. It is a lot of boilerplate address computation, but it fully respects the ##!%$ strict aliasing rule...
TL;DR: this code works with all common compilers, and will probably work with a lot of future versions of them, but it is not correct in a strict interpretation of the standard.
Your code works because *j decays to a pointer that holds the same address as j or j[0]. This behavior is caused by the way the compiler lays out two-dimensional arrays.
When you declare a 2D array:
int j[3][3]={{1,2,3},
{4,5,6},
{7,8,9}};
the compiler actually puts all the values sequentially in memory, so the following declaration has the same memory footprint:
int j[9]={1,2,3,4,5,6,7,8,9};
So in your case the pointers j, *j and j[0] all point to the same place in memory.
Memory isn't multidimensional, so even if it's a 2D array, its data will be placed sequentially. If you get a pointer to that array (which is implicitly a pointer to its first element) and start reading the elements sequentially, you will iterate through all elements of this 2D array, reading the elements of each subsequent row just after the last element of the previous one.

Calculate number of elements in an array based on pointers to the first and last elements

Suppose there exists an array, A, such that its elements are of struct Element and I am told that the struct Element is packed with no padding.
If I am given pointers to the first and last element in A, can I determine the number of elements in A based on the address of the pointers and amount of memory an Element takes up? Or does the structure of an array in memory not work like that?
My thought is if the pointers I'm given are Element* start and Element* finish...
number of elements = (finish - start) / sizeof(Element)
Is this logical thinking?
If you have:
Element* start; // first element
Element* finish; // last element
Then:
numElements = finish - start + 1;
If finish is like an end in STL, you do not have the +1.
Because of pointer arithmetic, you do not have to divide by sizeof(Element).
With regard to considering whether there might be padding at the structure end: as Billy indicated, sizeof already accounts for that, as does pointer arithmetic. From the C++14 final draft:
N3797/5.3.3/2 [ sizeof ]
When applied to a class, the result is the number of bytes in an
object of that class including any padding required for placing objects of that type in an array.
When you use pointer arithmetic, you can say that the "unit" is the size of one element of the type pointed to.
I.e. if you have Element* start pointing to the 0-th element of an array, start + 1 will point to the 1-st element of that array.
So, when you use finish - start, you already get the number of elements between them, and there is no need to divide by sizeof(Element).

Returning unique() function in C++

I came across the following function, which sorts an array passed down by main(), removes duplicates, and returns the number of unique elements. It's the last bit I'm having a hard time wrapping my head around.
int reduce(long ar[], int n) {
sort(ar, ar + n);
return unique(ar, ar + n) - ar; // ???
}
To my understanding unique() returns a pointer to the end of the segment that stores the unique values in the array. But I don't see why subtracting the array name from the iterator results in an int that equals the number of unique elements, or why unique(ar, ar+n) can't be typecasted to int to achieve the same result.
"why unique(ar, ar+n) can't be typecasted to int to achieve the same result."
Because, as you said, unique returns a pointer. A pointer is a memory address, not an index. So casting a pointer to an int is meaningless.
"why subtracting the array name from the iterator results in an int that equals the number of unique elements"
Subtracting two pointers (into the same array) evaluates to the number of elements between them.*
* As pointed out by @Nawaz in comments below, this result is signed. So (p1 - p2) == -(p2 - p1).
Say you have an array like this:
{1, 2, 2, 3, 4, 4, 5}
After calling std::unique, you'll probably end up with this (thank-you, Nawaz), with the elements past the new end left as they used to be before the call:
{1, 2, 3, 4, 5, 4, 5}
                ^
std::unique returns an iterator to the new end of the array, so where the arrow is. From there it makes logical sense that subtracting the beginning of the array would return the number of unique elements. If you want to be a bit more explicit, you can use return std::distance(ar, std::unique(ar, ar + n));, which also works when the iterator doesn't support subtraction.