Question on STL fill function in C++

Let's say I have an array like this:
string x[2][55];
If I want to fill it with "-1", is this the correct way:
fill(&x[0][0],&x[2][55],"-1");
That crashed when I tried to run it. If I change x[2][55] to x[1][54] it works but it doesn't init the last element of the array.
Here's an example to prove my point:
string x[2][55];
x[1][54] = "x";
fill(&x[0][0],&x[1][54],"-1");
cout<<x[1][54]<<endl; // this prints "x"

With a multi-dimensional array, the address one past the last element is a little confusing to calculate. The simple answer is to use this:
&x[1][55]
Let's consider how a 2D array x[N][M] is laid out in memory:
[0][0] [0][1] ... [0][M-1] [1][0] [1][1] ... [1][M-1] ... [N-1][0] ... [N-1][M-1]
So, the very last element is [N-1][M-1] and the first element beyond it is [N-1][M], which is the same address as [N][0]. If you take the address of [N][M] instead, you end up M elements past that point, and filling up to it overwrites memory that does not belong to the array.
Another way to calculate the first address beyond the end is to use sizeof.
&x[0][0] + sizeof(x) / sizeof(std::string);

From the formal and pedantic point of view, this is illegal. Reinterpreting a 2D array as a 1D array results in undefined behavior, since you are literally attempting to access 1D x[0] array beyond its boundary.
In practice, this will work (although some code analysis tools might catch and report this as a violation). But in order to specify the pointer to the "element beyond the last" correctly, you have to be careful. It can be specified as &x[1][55] or as &x[2][0] (both are the same address).
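A minimal sketch of both options for the 2 x 55 array from the question. The names fill_flat and fill_rows are made up for illustration. fill_flat computes the end from sizeof as described above (works in practice, formally questionable), while fill_rows fills one inner array at a time and never steps outside it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

constexpr std::size_t ROWS = 2;
constexpr std::size_t COLS = 55;

// Flat fill: treats the 2D array as one run of ROWS * COLS strings.
// The end pointer is &x[0][0] + sizeof(x) / sizeof(std::string).
void fill_flat(std::string (&x)[ROWS][COLS], const std::string& value) {
    const std::size_t count = sizeof(x) / sizeof(std::string);
    assert(count == ROWS * COLS);
    std::fill(&x[0][0], &x[0][0] + count, value);
}

// Row-by-row fill: each call to std::fill stays inside one inner array.
void fill_rows(std::string (&x)[ROWS][COLS], const std::string& value) {
    for (auto& row : x) {
        std::fill(std::begin(row), std::end(row), value);
    }
}
```

Either function can be called as fill_rows(x, "-1"); afterwards x[1][54] holds "-1" as well.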

Related

Dynamic and static array

I am studying C++ by reading Stroustrup's book, which in my opinion is not very clear on this topic (arrays). From what I have understood, C++ has (like Delphi) two kinds of arrays:
Static arrays that are declared like
int test[3] = {10,487,-22};
Dynamic arrays that are called vectors
std::vector<int> a;
a.push_back(10);
a.push_back(487);
a.push_back(-22);
I have already seen answers about this (and there were tons of lines and concepts inside) but they didn't clarify the concept for me.
From what I have understood vectors consume more memory but they can change their size (dynamically, in fact). Arrays instead have a fixed size that is given at compile time.
In the chapter Stroustrup said that vectors are safe while arrays aren't, without explaining the reason. I trust him, but why? Is the safety related to where the memory is located (heap/stack)?
I would like to know why I should use vectors and what makes them safe.
The reason arrays are unsafe is because of memory leaks.
If you declare a dynamic array
int * arr = new int[size];
and you don't do delete [] arr, then the memory remains uncleared and this is known as a memory leak. It should be noted, ANY time you use the word new in C++, there must be a delete somewhere in there to free that memory. If you use malloc(), then free() should be used.
http://ptolemy.eecs.berkeley.edu/ptolemyclassic/almagest/docs/prog/html/ptlang.doc7.html
It is also very easy to go out of bounds in an array, for example by writing to an index larger than its size - 1. With a vector, you can push_back() as many elements as you want and the vector will resize automatically. If you have an array of size 15 and you try to say arr[18] = x, then you may get a segmentation fault. The program will compile, but it may crash when it reaches a statement that goes outside the array bounds.
In general when you have large code, arrays are used infrequently. Vectors are objectively superior in almost every way, and so using arrays becomes sort of pointless.
EDIT: As Paul McKenzie pointed out in the comments, going out of array bounds does not guarantee a segmentation fault. It is undefined behavior, so there is no guarantee about what happens.
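To illustrate the new[]/delete[] point: a short sketch (not part of the original answer) of how standard library types release the memory for you. The function name sum_first_n is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Fills two buffers with 0..n-1 and sums them up.
// Neither buffer needs an explicit delete[]; both are released on return.
long sum_first_n(std::size_t n) {
    std::unique_ptr<int[]> raw(new int[n]);  // delete[] runs automatically
    std::vector<int> vec(n);                 // storage freed by the destructor

    for (std::size_t i = 0; i < n; ++i) {
        raw[i] = static_cast<int>(i);
        vec[i] = static_cast<int>(i);
    }

    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        assert(raw[i] == vec[i]);
        total += raw[i];
    }
    return total;
}
```

Even if the function returned early or threw, both buffers would still be freed, which is the part that is easy to get wrong with a bare new[].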
Let us take the case of reading numbers from a file.
We don't know how many numbers are in the file.
To declare an array to hold the numbers, we need to know the capacity or quantity, which is unknown. We could pick a number like 64. If the file has more than 64 numbers, we start overwriting the array. If the file has fewer than 64 (like 16), we are wasting memory (by not using 48 slots). What we need is to dynamically adjust the size of the container (array).
To dynamically adjust the capacity of an array, a new larger array must be created, then elements copied and the old array deleted.
The std::vector will adjust its capacity as necessary. It handles the dynamic allocation of memory for you.
Another aspect is the passing of the container to a function. With an array, you need to pass the array and its size. With std::vector, you only need to pass the vector; the vector object can be queried for its size.
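The file-reading case above could look roughly like this. It is only a sketch: read_numbers, sum and demo are illustrative names, and an in-memory string stream stands in for the file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Reads integers until the stream runs out; the vector grows as needed.
std::vector<int> read_numbers(std::istream& in) {
    std::vector<int> numbers;
    int value;
    while (in >> value) {
        numbers.push_back(value);
    }
    return numbers;
}

// Only the vector is passed; its size travels with it.
long sum(const std::vector<int>& numbers) {
    long total = 0;
    for (int n : numbers) {
        total += n;
    }
    return total;
}

// Small self-check with a string stream in place of a real file.
void demo() {
    std::istringstream file("10 487 -22");
    std::vector<int> numbers = read_numbers(file);
    assert(numbers.size() == 3);
    assert(sum(numbers) == 475);
}
```

With a real file, a std::ifstream can be passed to read_numbers in the same way.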
One safety feature I can see is that you can't access something in a vector which is not there.
What I mean by that is, if you push_back only 4 elements and you try to access index 7, then it will throw an error. But with an array that doesn't happen.
In short, it stops you from accessing corrupt data.
Edit:
The programmer has to compare the index with vector.size() to throw an error; it doesn't happen automatically with operator[]. One has to do it oneself, or use at(), which performs the check and throws std::out_of_range.
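For reference, std::vector::at() does the index comparison itself and throws std::out_of_range, whereas operator[] does not check. A small sketch with a made-up helper name, try_get:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true and stores the element in `out` if `index` is valid;
// returns false if at() reports the index as out of range.
bool try_get(const std::vector<int>& v, std::size_t index, int& out) {
    try {
        out = v.at(index);  // checked access
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

void demo() {
    std::vector<int> v;
    for (int i = 0; i < 4; ++i) {
        v.push_back(i);  // only 4 elements
    }
    int value = -1;
    assert(try_get(v, 3, value) && value == 3);
    assert(!try_get(v, 7, value));  // index 7 does not exist
}
```

Writing v[7] in the same situation would compile and would be undefined behavior, just like with a plain array.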

C++ indexing beyond an array's size

My code takes an int input and sets that as the array's size, and I have some test prints that print the array elements at indices 0 to 4.
std::cout<<array[0]<<std::endl;
std::cout<<array[1]<<std::endl;
std::cout<<array[2]<<std::endl;
std::cout<<array[3]<<std::endl;
std::cout<<array[4]<<std::endl;
However, I noticed that if the input is smaller than 5, say 2 for instance, then the first two cout print out correctly, but then the rest print out 0 or random numbers like 17 and 135137. Is this an out of bounds thing that happens when you index beyond the array size or is this a problem in my code? (I know I have to change the print statements)
The arrays are dynamically allocated by the way, which I think shouldn't matter.
Is this an out of bounds thing that happens when you index beyond the array size or is this a problem in my code?
Both.
Assuming array itself has a size of at least 5 elements, the initial contents of it before you set the values to anything are undefined; essentially random (they're just whatever happened to be hanging out in that particular block of memory that your array now occupies). If array itself has a size of less than 5, the values are still undefined but accessing them also runs the risk of crashing the program. In either case, the fact that you are printing values beyond the end of the initialized, valid data in your array is a problem with your code.
If you allocate an array of n elements, accessing the (n+1)th element is undefined behaviour (UB). (Note after comments: The (n+1)th element is the element with index n. So if array has only size 3, accessing array[3] already causes UB).
So, yes it is an "out of bounds thing" and it is a problem of your code (because it's you who accesses the array beyond its size).
Why not simply loop over the existing elements to print them, instead of hardcoding the indices?
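Something along these lines, assuming the array holds ints and its size is kept in a variable (format_all is just an illustrative name):

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Formats exactly `size` elements, one per line.
// The loop never touches array[size] or anything beyond it.
std::string format_all(const int* array, std::size_t size) {
    assert(array != nullptr || size == 0);
    std::ostringstream out;
    for (std::size_t i = 0; i < size; ++i) {
        out << array[i] << '\n';
    }
    return out.str();
}
```

Then std::cout << format_all(array, n); prints only the n elements that were actually allocated, whatever the input was.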

C++: Write to/read from invalid/out of bound array index?

First of all, I am a beginner when it comes to C++ programming. Yesterday I encountered something rather strange. I was trying to determine the length of an array via a pointer pointing towards it. Since sizeof didn't work I did a little Google search and ended up on this website where I found the answer that it was not possible. Instead I should put an out of bound value at the last index of the array and increment a counter until this index is reached. Because I didn't want to overwrite the information that was contained at the last index, I tried putting the out of bound value one index after the last one. I expected it to fail, but for some reason it didn't.
I thought that I made a mistake somewhere else and that the array was longer than I assigned it to be, so I made the following test:
int a[4];
a[20] = 42;
std::cout << a[20];
The output is 42 without any errors. Why does this work? This should not be valid at all, right? What's even more interesting is the fact that this works with any primitive type array. However, once I use a std::string the program instantly exits with 1.
Any ideas?
Your system just happens to not be using the memory that just happens to be 20 * sizeof(int) bytes further from the address of your array. (From the beginning of it.) Or the memory belongs to your process and therefore you can mess with it and either break something for yourself or just by lucky coincidence break nothing.
Bottom line, don't do that :)
I think what you need to understand is the following:
When you create a[4], the compiler allocates memory for 4 integers, and a refers to the address of the first one: (a == &a[0]).
When you read or write, the compiler doesn't check whether you are in bounds (because it no longer has this information). It just goes to the address of the requested cell of the array in the following way: &a[X] == a + X, that is, sizeof(int) * X bytes past the start of a.
In C++ it is the programmer's responsibility to check the bounds when accessing an array.
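A sketch of what doing the check yourself can look like. checked_get is a hypothetical helper, and the length has to be passed in because a bare pointer does not carry it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// a[i] means *(a + i); the compiler scales i by sizeof(int) on its own.
// Nothing in that expression knows the array length, so the caller supplies it.
int checked_get(const int* a, std::size_t length, std::size_t i) {
    if (i >= length) {
        throw std::out_of_range("index outside the array");
    }
    assert(&a[i] == a + i);  // same address either way
    return *(a + i);
}
```

For int a[4], checked_get(a, 4, 20) throws instead of silently reading memory that is not part of the array.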

How do I take the address of one past the end of an array if the last address is 0xFFFFFFFF?

If it is legal to take the address one past the end of an array, how would I do this if the last element of array's address is 0xFFFFFFFF?
How would this code work:
for (vector<char>::iterator it = vector_.begin(); it != vector_.end(); ++it)
{
}
Edit:
I read here that it is legal before making this question: May I take the address of the one-past-the-end element of an array?
If this situation is a problem for a particular architecture (it may or may not be), then the compiler and runtime can be expected to arrange that allocated arrays never end at 0xFFFFFFFF. If they were to fail to do this, and something breaks when an array does end there, then they would not conform to the C++ standard.
Accessing out of the array boundaries is undefined behavior. You shouldn't be surprised if a demon flies out of your nose (or something like that)
What might actually happen would be an overflow in the address which could lead to you reading address zero and hence segmentation fault.
If you are always within the array range, and you do the last ++it which goes out of the array and you compare it against vector_.end(), then you are not really accessing anything and there should not be a problem.
I think there is a good argument for suggesting that a conformant C implementation cannot allow an array to end at (e.g.) 0xFFFFFFFF.
Let p be a pointer to one-element-off-the-end-of-the-array: if buffer is declared as char buffer[BUFFSIZE], then p = buffer+BUFFSIZE, or p = &buffer[BUFFSIZE]. (The latter means the same thing, and its validity was made explicit in the C99 standard document.)
We then expect the ordinary rules of pointer comparison to work, since the initialization of p was an ordinary bit of pointer arithmetic. (You cannot compare arbitrary pointers in standard C, but you can compare them if they are both based in a single array, memory buffer, or struct.) But if buffer ended at 0xFFFFFFFF, then p would be 0x00000000, and we would have the unlikely situation that p < buffer!
This would break a lot of existing code which assumes that, in valid pointer arithmetic done relative to an array base, the intuitive address-ordering property holds.
It's not legal to access one past the end of an array, but that code doesn't actually access that address. And you will never get an address like that for your objects on a real system.
The difference is between dereferencing that element and taking its address. In your example the element past the end won't be dereferenced and so it is valid. Although this was not really clear in the early days of C++ it is clear now. Also the value you pass to subscript does not really matter.
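A small sketch of that distinction, using a plain char buffer: the one-past-the-end pointer is formed and compared but never dereferenced. count_elements is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>

// Walks from `first` up to `last`, where `last` is the one-past-the-end pointer.
// `last` is only compared against; it is never dereferenced.
std::size_t count_elements(const char* first, const char* last) {
    assert(first <= last);  // comparable because both belong to the same array
    std::size_t n = 0;
    for (const char* p = first; p != last; ++p) {
        ++n;
    }
    return n;
}
```

For char buffer[16], count_elements(buffer, buffer + 16) is fine, while reading *(buffer + 16) would not be.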
Sometimes the best thing you can do about corner cases is forbid them. I saw this class of problem with some bit field extraction instructions of the NS32032 in which the hardware would load 32 bits starting at the byte address and extract from that datum. So even single-bit fields anywhere in the last 3 bytes of mapped memory would fail. The solution was to never allow the last 4 bytes of memory to be available for allocation.
Quite a few architectures that would be affected by this solve the problem by reserving offset 0xFFFFFFFF (and a bit more) for the OS.

What may cause losing object at the other end of a pointer in c++?

EDIT: I have found the error: I did not initialize an array with a size. question can be closed.
I have a class V, and another class N. An object of N will have an array of pointers to objects of class V (say V **vList). So, N has a function like
V **getList();
Now in some function of other classes or simply a driver function, if I say V **theList = (N)n.getList(); Q1: theList would be pointing at the 1st element of the array? Given that the size of array is known, can I loop through with index i and say V *oneV = *vList[i]? Please correct me if what I'm doing above is wrong.
I have been using debugger to trace through the whole process of my program running, the thing I found was that after using V *oneV = vList[i], the value of the pointers in the array, vList, were the same as when they were created, but if I follow the pointer to where it is pointing at, the object was gone. I'm guessing that might be the reason why I am getting seg fault or bus error. Could it be the case? WHY did I 'loose' the object at the other end of a pointer? What did I do wrong?
And yes, I am working on a school assignment; that's why I do not want to post my code. I want to finish it myself, but I need help finding the problem. I think I still need an explanation of arrays of pointers. Thank you.
Q1 is right. For the second part, V *oneV = vList[i] would be the correct syntax. In your syntax you are dereferencing one more time (treating an object of type V as a pointer to such an object) which obviously is crashing your code.
EDIT:
Since you are using the correct syntax, the reason of segfaults would depend on your memory management of the objects of type V. If you have inserted addresses of objects created on the stack (automatic vars, not by new or malloc) inside a function and are trying to access them outside of it, then the pointers would be dangling and your code will crash.
Class N has to manage the number of elements in a list somehow. The usual approaches are to make a public function which returns the number of elements in the array, or to provide an iterator function which loops over all the list's elements.
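One possible shape for this, as a sketch only. The members add, size and get are invented here, and a vector of owning pointers replaces the raw V** from the question so that the element count and the lifetime of the objects are handled in one place.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct V {
    int id;
};

class N {
public:
    // Each V is heap-allocated and owned by the list, so pointers handed
    // out by get() stay valid for as long as this N exists.
    void add(int id) { items_.push_back(std::unique_ptr<V>(new V{id})); }

    // Number of elements, so callers never have to guess the bounds.
    std::size_t size() const { return items_.size(); }

    V* get(std::size_t i) const {
        assert(i < items_.size());  // valid indices are 0 .. size() - 1
        return items_[i].get();
    }

private:
    std::vector<std::unique_ptr<V>> items_;
};
```

A caller can then loop with for (std::size_t i = 0; i < n.size(); ++i) and use n.get(i) without touching the storage directly.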
The elements of an array with N elements are stored at array[0] through array[N-1]. You're accessing one past the end of the array.
First rule out the usual causes:
you are initializing correctly (new instead of automatic/local variables)
you are accessing the elements correctly (not like in the typo you posted in the question, based on your comment)
you are using the right size
If you go through all the usual ones and everything is OK, then pay special attention to your loops, size calculations, and anything else that could be causing you to write to unintended addresses.
It is possible to write garbage at unintended locations and then get the error in unexpected places. The worst case I saw like that was some file descriptor variables being corrupted because of an array gone wrong right before those variables; it broke in file-related functions, which seemed very crazy.
theList would be pointing at the 1st element of the array? Given that the size of array is known, can I loop through with index i and say V *oneV = *vList[i]?
Yes, that is correct.
I'm guessing that might be the reason why I am getting seg fault or bus error. Could it be the case?
Yes, if you have an invalid pointer and try to dereference it you'll get a segfault.
WHY did I 'loose' the object at the other end of a pointer? What did I do wrong?
That is difficult to predict without seeing the actual code. Most probable causes are that either you are not filling the V** correctly or after putting a V* pointer inside V** array you are deleting that object from some other place. BTW, I am assuming that you are allocating memory using new, is this assumption correct?