why doesn't segmentation fault occur here? [duplicate] - c++

This question already has answers here:
Why don't I get a segmentation fault when I write beyond the end of an array?
(4 answers)
Closed 1 year ago.
If I don't give my vector a size, a segmentation fault occurs. But when I give it a size smaller than the range of indices I actually use, as below, no segmentation fault occurs. Why?
What is the difference between the two cases?
#include <iostream>
#include <string>
using namespace std;
#include <vector>
int main()
{
vector <string> a(2);
a[0]="hello world";
a[2]="maryam";
cout << a[0] << a[2];
return 0;
}

C++ has a notion of undefined behavior. Accessing a vector out-of-bounds is a typical example of undefined behavior (because vector::operator[] performs no bounds checking). Nothing meaningful can be said about the outcome of the program.
But to explain what probably happens...
A vector is a class that usually contains a pointer to a heap-allocated array, together with its size and its capacity.
An empty vector typically holds a null pointer. Dereferencing a null pointer often leads to an immediate segfault because there is usually no virtual memory mapped at address 0.
On the other hand, a vector of size 2 points to an array with room for at least 2 elements. Writing past it is often possible: you simply overwrite heap memory that happens to reside immediately after that array. This is called a heap buffer overflow. In a simple program it may appear to work fine, but in a larger program the corruption will cause problems somewhere further down the code.

Undefined behaviour (which is what you're doing when reading or writing beyond the end of a vector with operator[]) is exactly that, undefined.
It could just as easily crash for the size two case and seem to work for size zero.
Or both could format your storage devices while playing maniacal_laughter.mp3.
Undefined behaviour places no limits on the outcome, and it's something you would be well advised to steer clear of.

The [] operator does no range checking. It directly accesses a memory address computed from a "base" address and an offset.
From your example,
a[0], a[2]
0 and 2 are offset values, and the "base" address is the address of the array that the vector points to.
If you want range checking, use:
a.at(0), a.at(2)

Related

C++ new[] operator creates array of length = length + 1?

Why does the new[] operator in C++ actually create an array of length + 1? For example, see this code:
#include <iostream>
int main()
{
std::cout << "Enter a positive integer: ";
int length;
std::cin >> length;
int *array = new int[length]; // use array new. Note that length does not need to be constant!
//int *array;
std::cout << "I just allocated an array of integers of length " << length << '\n';
for (int n = 0; n<=length+1; n++)
{
array[n] = 1; // set element n to value 1
}
std::cout << "array[0] " << array[0] << '\n';
std::cout << "array[length-1] " << array[length-1] << '\n';
std::cout << "array[length] " << array[length] << '\n';
std::cout << "array[length+1] " << array[length+1] << '\n';
delete[] array; // use array delete to deallocate array
array = 0; // use nullptr instead of 0 in C++11
return 0;
}
We dynamically create an array of length "length" but we are able to assign a value at the index length+1. If we try to do length+2, we get an error.
Why is this? Why does C++ make the length = length + 1?
It doesn’t. You’re allowed to calculate the address array + n, for the purpose of checking that another address is less than it. Trying to access the element array[n] is undefined behavior, which means the program becomes meaningless and the compiler is allowed to do anything whatsoever. Literally anything; one old version of GCC, if it saw a #pragma directive, started a roguelike game on the terminal. (Thanks, Revolver_Ocelot, for reminding me: that was technically implementation-defined behavior, a different category.) Even calculating the address array + n + 1 is undefined behavior.
Because it can do anything, the particular compiler you tried that on decided to let you shoot yourself in the foot. If, for example, the next two words after the array were the header of another block in the heap, you might get a memory-corruption bug. Or maybe a compiler stored the array at the top of your memory space, so that the address &array[n+1] is a NULL pointer, and trying to dereference it causes a segmentation fault. Or maybe the next page of memory is not readable or writable and trying to access it crashes the program with a protection fault. Or maybe the implementation bounds-checks your array accesses at runtime and crashes the program. Maybe the runtime stuck a canary value after the array and checks later to see if it was overwritten. Or maybe it happens, by accident, to work.
In practice, you really want the compiler to catch those bugs for you instead of trying to track down the bugs that buffer overruns cause later. It would be better to use a std::vector than a dynamic array. If you must use an array, you want to check that all your accesses are in-bounds yourself, because you cannot rely on the compiler to do that for you and skipping them is a major cause of bugs.
If you write or read beyond the end of an array or other object you create with new, your program's behaviour is no longer defined by the C++ standard.
Anything can happen and the compiler and program remain standard compliant.
The most likely thing to happen in this case is that you are corrupting memory in the heap. In a small program this "seems to work" because the section of the heap you use isn't being used by any other code; in a larger one you will crash or behave randomly elsewhere, in a seemingly unrelated bit of code.
But arbitrary things could happen. The compiler could prove that a branch leads to access beyond the end of an array and dead-code eliminate the paths that lead to it (UB that time travels), or it could hit a protected memory region and crash, or it could corrupt heap management data and cause a future new/delete to crash, or nasal demons, or whatever else.
In the for loop you are assigning elements beyond the bounds of the array; remember that C++ does not do bounds checking.
So when you initialize the array you write beyond its bounds. Say the user enters 3 for length: you assign 1 to array[0] through array[4], because the condition is n <= length + 1.
The behavior is unpredictable once you go beyond the bounds of the array; the program may crash or may appear to work. In this case you are going 2 elements beyond the bounds because the condition uses <= together with length + 1.
There is no requirement that the new [] operator allocate more memory than requested.
What is happening is that your code is running past the end of the allocated array. It therefore has undefined behaviour.
Undefined behaviour means that the C++ standard imposes no requirements on what happens. Therefore, your implementation (compiler and standard library, in this case) will be equally correct if your program SEEMS to work properly (as it does in your case), produces a run time error, trashes your system drive, or anything else.
In practice, all that is happening is that your code is writing to memory, and later reading from that memory, past the end of the allocated memory block. What happens depends on what is actually in that memory location. In your case, whatever happens to be in that memory location is able to be modified (in the loop) or read (in order to print to std::cout).
Conclusion: the explanation is not that new[] over-allocates. It is that your code has undefined behaviour, so can seem to work anyway.

delete[] operator causes segmentation fault in very simple case

I have a very strange segmentation fault that occurs when I call delete[] on an allocated dynamic array (created with the new keyword). At first it occurred when I deleted a global pointer, but it also happens in the following very simple case, where I delete[] arr
int main(int argc, char * argv [])
{
double * arr = new double [5];
delete[] arr;
}
I get the following message:
*** Error in `./energy_out': free(): invalid next size (fast): 0x0000000001741470 ***
Aborted (core dumped)
Apart from the main function, I define some fairly standard functions, as well as the following (defined before the main function)
vector<double> cos_vector()
{
vector<double> cos_vec_temp = vector<double>(int(2*pi()/trig_incr));
double curr_val = 0;
int curr_idx = 0;
while (curr_val < 2*pi())
{
cos_vec_temp[curr_idx] = cos(curr_val);
curr_idx++;
curr_val += trig_incr;
}
return cos_vec_temp;
}
const vector<double> cos_vec = cos_vector();
Note that the return value of cos_vector, cos_vec_temp, gets assigned to the global variable cos_vec before the main function is called.
The thing is, I know what causes the error: cos_vec_temp should be one element bigger, as cos_vec_temp[curr_idx] ends up accessing one element past the end of the vector cos_vec_temp. When I make cos_vec_temp one element larger at its creation, the error does not occur. But I do not understand why it occurs at the delete[] of arr. When I run gdb, after setting a breakpoint at the start of the main function, just after the creation of arr, I get the following output when examining contents of the variables:
(gdb) p &cos_vec[6283]
$11 = (__gnu_cxx::__alloc_traits<std::allocator<double> >::value_type *) 0x610468
(gdb) p arr
$12 = (double *) 0x610470
In the first gdb command, I show the memory location of the element just past the end of the cos_vec vector, which is 0x610468. The second gdb command shows the memory location of the arr pointer, which is 0x610470. Since I assigned a double to the invalid memory location 0x610468, I understand it must have written over the 8 bytes just before the location that starts at 0x610470, but this was done before arr was even created (the function is called before main). So why does this affect arr? I would have thought that when arr is created, it does not "care" what was previously done to the memory location there, since it is not registered as being in use.
Any clarification would be appreciated.
NOTE:
cos_vec_temp was previously declared as a dynamic double array of size int(2*pi()/trig_incr) (same size as the one in the code, but created with new). In that case, I also had the invalid access as above, and it also did not give any errors when I accessed the element at that location. But when I tried to call delete[] on the cos_vec global variable (which was of type double * then) it also gave a segmentation fault, but it did not give the message that I got for the case above.
NOTE 2:
Before you downvote me for using a dynamic array, I am just curious as to why this occurs. I normally use STL containers and all their conveniences (I almost NEVER use dynamic arrays).
Many heap allocators store meta-data next to the memory they allocate for you, before or after it (or both). If you write out of bounds of some heap-allocated memory (and remember that std::vector dynamically allocates off the heap) you might overwrite some of this meta-data, corrupting the heap.
None of this is actually specified in the C++ specification. All it says is that going out of bounds leads to undefined behavior. What the allocators do or store, and where they store any meta-data, is up to the implementation.
As for a solution, most people will tell you to use push_back instead of direct indexing, and that will solve the problem. Unfortunately it also means that the vector may need to be reallocated and copied a few times. That can be mitigated by reserving an approximate amount of memory beforehand, and then letting the extra stray element lead to one reallocation and copy.
Or, of course, make a better prediction of the actual number of elements the vector will contain.
It looks like you are writing past the end of the vector allocated in the function executing before main, causing undefined behavior later on.
You should be able to fix the problem by rounding the number up when allocating the vector (casting to int rounds the number down), or using push_back instead of indexing:
cos_vec_temp.push_back(cos(curr_val));

Access to out of array range does not give any error [duplicate]

This question already has answers here:
Accessing an array out of bounds gives no error, why?
(18 answers)
Closed 7 years ago.
Why this:
#include <iostream>
using namespace std;
int main() {
int a[1] = {0};
a[2048] = 1234;
cout << a[2048] << endl;
return 0;
}
does not give any compile-time error? (gcc 4.9.3)
Because this is legal C++.
You can try to dereference any pointer, even one that was not allocated by your program, and you can try to access any cell of an array, even one that is out of bounds. The legality of an expression doesn't depend on the values of the variables involved in that expression.
The compiler doesn't have to run any static analysis to check whether you'll actually cause undefined behaviour or not, and shouldn't fail to compile if it assumes that you will (even when it is obvious that you will).
The problem is that you can't check all possible array access at compile-time (that would be way too expensive), so you'd have to arbitrarily draw a line somewhere (the problem being the word "arbitrarily", that wouldn't fit well in the standard).
Hence, checking that you won't cause undefined behaviour is the responsibility of the programmer (or of specific static analysis tools).
Access to out of array range does not give any error
This is just because you were unlucky. :) What you are seeing is called "Undefined Behavior". The compiler does not do any bounds checking on arrays, and the statement a[2048] = 1234; tries to write to a memory location on the stack that happens to be unused.

How could it get more memory than I wanted?(C++) [duplicate]

This question already has answers here:
Why doesn't my program crash when I write past the end of an array?
(9 answers)
Closed 8 years ago.
I wanted to allocate memory for just 1 integer, so how can this program work?
Code:
#include<iostream>
using namespace std;
int main(){
int* k=new int[1];
for(int i=0;i<5;i++)
cin>>k[i];
for(int i=0;i<5;i++)
cout<<k[i]<<"\n";
delete[] k;
return 0;
}
Input:
999999
999998
999997
999996
999995
Output:
999999
999998
999997
999996
999995
You invoked undefined behavior by accessing memory you did not allocate. This works purely "by chance". Literally any behavior of your program would be legal, including the program ordering pizza, ...
This will probably work in practice most of the time because your OS will usually not give you just 4 bytes or so, but a whole page of memory (often 4 kB). But to emphasize this: you can never rely on this behavior!
The way a C++ program indexes an array is that it takes the index that you want, multiplies it by the size of the element type, then adds the result to the address of the first element. It just so happened that, where this array was placed in your program, going an additional 4 elements past the end didn't corrupt anything, so you appeared to be fine. The language doesn't check this for you. However, if you overwrite another variable, or a stack pointer, then you run into trouble. Don't do this in practice: the behavior is undefined.

Why are we able to access unallocated memory in a class?

I am sorry if I may not have phrased the question correctly, but in the following code:
int main() {
char* a=new char[5];
a="2222";
a[7]='f'; //Error thrown here
cout<<a;
}
If we try to access a[7] in the program, we get an error because we haven't been assigned a[7].
But if I do the same thing in a class :
class str
{
public:
char* a;
str(char *s) {
a=new char[5];
strcpy(a,s);
}
};
int main()
{
str s("ssss");
s.a[4]='f';s.a[5]='f';s.a[6]='f';s.a[7]='f';
cout<<s.a<<endl;
return 0;
}
The code works, printing the characters "abcdfff".
How are we able to access a[7], etc in the code when we have only allocated char[5] to a while we were not able to do so in the first program?
In your first case, you have an error:
int main()
{
char* a=new char[5]; // declare a dynamic char array of size 5
a="2222"; // assign the pointer to a string literal "2222" - MEMORY LEAK HERE
a[7]='f'; // accessing array out of bounds!
// ...
}
You are creating a memory leak and then asking why undefined behavior is undefined.
Your second example is asking, again, why undefined behavior is undefined.
As others have said, it's undefined behavior. When you write to memory out of bounds of the allocated memory for the pointer, several things can happen:
You overwrite an allocated, but unused and so far unimportant location
You overwrite a memory location that stores something important for your program, which will lead to errors because you've corrupted your own memory at that point
You overwrite a memory location that you aren't allowed to access (something out of your program's memory space) and the OS freaks out, causing an error like "AccessViolation" or something
For your specific examples, where the memory is allocated is based on how the variable is defined and what other memory has to be allocated for your program to run. This may impact the probability of getting one error or another, or not getting an error at all. BUT, whether or not you see an error, you shouldn't access memory locations out of your allocated memory space because like others have said, it's undefined and you will get non-deterministic behavior mixed with errors.
int main() {
char* a=new char[5];
a="2222";
a[7]='f'; //Error thrown here
cout<<a;
}
If we try to access a[7] in the program, we get an error because we
haven't been assigned a[7].
No, you get a memory error from writing to memory that is write-protected, because a is pointing to the read-only memory of "2222", and by chance the location two bytes after the end of that string is ALSO write-protected. If you used the same strcpy as you use in the class str, the memory access would overwrite some "random" data after the allocated memory, which is quite possibly NOT going to fail in the same way.
It is indeed invalid (undefined behaviour) to access memory outside of the memory you have allocated. The compiler, C and C++ runtime library and OS that your code is produced with and running on top of is not guaranteed to detect all such things (because it can be quite time-consuming to check every single operation that accesses memory). But it's guaranteed to be "wrong" to access memory outside of what has been allocated - it just isn't always detected.
As mentioned in other answers, accessing memory past the end of an array is undefined behavior, i.e. you don't know what will happen. If you are lucky, the program crashes; if not, the program continues as if nothing was wrong.
C and C++ do not perform bounds checks on (simple) arrays for performance reasons.
The syntax a[7] simply means go to memory position X + 7 * sizeof(a[0]), where X is the address where a starts to be stored, and read/write there. If you try to read/write within the memory that you have reserved, everything is fine; if outside, nobody knows what happens (see the answer from @reblace).