From C++ Primer 18.1.1:
If the [thrown] expression has an array or function type, the expression is
converted to its corresponding pointer type.
How is it that this program can produce correct output of 9876543210 (g++ 5.2.0)?
#include <iostream>
using namespace std;
int main() {
    try {
        int a[10] = {9,8,7,6,5,4,3,2,1,0};
        throw a;
    }
    catch (int* b) {
        for (int i = 0; i < 10; ++i) cout << *(b + i);
    }
}
From the quote, throw a would create an exception object of type int* which is a pointer to the first element of the array. But surely the array elements of a would be destroyed when we exit the try block and enter the catch clause since we change block scope? Am I getting a false positive or are the array elements "left alone" (not deleted) during the duration of the catch clause?
Dereferencing a dangling pointer is Undefined Behaviour.
When a program encounters Undefined Behaviour, it is free to do anything. Which includes crashing and making daemons fly out of your nose, but it also includes doing whatever you expected.
In this case, there are no intervening operations that would overwrite that part of memory, and access past the stack pointer (below it, because the stack grows down on most platforms) is normally not checked.
So yes, this is a false positive. The memory is not guaranteed to contain the values any more and is not guaranteed to be accessible at all, but it still happens to contain them and be accessible.
Also note that GCC optimizations are rather infamous for relying on the program not invoking undefined behaviour. Quite often, if you have undefined behaviour, the unoptimized version appears to work, but once you turn on optimizations, it starts doing something completely unexpected.
But surely the array elements of a would be destroyed when we exit the try block and enter the catch clause since we change block scope?
Correct.
Am I getting a false positive or are the array elements "left alone" (not deleted) during the duration of the catch clause?
They're not "left alone". The array is destroyed as you correctly assumed. The program is accessing invalid memory. The behaviour is undefined.
How is it that this program can produce correct output of 9876543210 (g++ 5.2.0)?
There is no correct output when the behaviour is undefined. The program can produce what it did because it can produce any output when the behaviour is undefined.
Reasoning about UB is usually pointless, but in this case, the contents of the invalid memory could be identical if the part of the memory had not yet been overwritten by the program.
You will find later in this chapter a Warning:
Throwing a pointer requires that the object to which the pointer
points exist wherever the corresponding handler resides.
So your example would be valid if the array a were static or global; otherwise it's UB. Or (as Jan Hudec writes in a comment) if it were declared in the enclosing block of the try/catch statement.
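To illustrate that Warning, here's a minimal sketch (my adaptation, not from the book) where the array has static storage duration, so it outlives the try block and the thrown pointer never dangles:

#include <iostream>
using namespace std;

int main() {
    try {
        static int a[10] = {9,8,7,6,5,4,3,2,1,0}; // static: lives for the whole program
        throw a; // still decays to int*, but now points to live storage
    }
    catch (int* b) {
        for (int i = 0; i < 10; ++i) cout << b[i]; // well-defined: prints 9876543210
    }
}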
Related
I am new to programming and am currently studying address typecasting. I don't understand why I get this: *** stack smashing detected ***: terminated Aborted (core dumped) when I run the following code:
#include <iostream>
using namespace std;

void updateValue(int *p) {
    *p = 610 % 255;
}

int main() {
    char ch = 'A';
    updateValue((int*)&ch);
    cout << ch;
}
Here's what I understand about the code:
The address of ch is typecast to int* and passed into the function updateValue(). Now, inside updateValue(), an integer pointer p points at ch. When p is dereferenced and assigned, the memory at ch's address is treated as an int, so 4 (or 8) contiguous bytes are written instead of 1. So 'A' (65), along with adjacent bytes that don't belong to ch, gets overwritten by 610 % 255, i.e. 100.
But I don't understand, what and where things are going wrong?
when p is dereferenced, it interprets ...
When you indirect through the reinterpreted p and access an object of the wrong type, the behaviour of the program is undefined.
what and where things are going wrong?
Things started going wrong when you reinterpreted a pointer to one type as a pointer to an unrelated type.
Some rules of thumb:
Don't use reinterpret casts until you know what they do. They are difficult to get right and are rarely useful.
Don't use reinterpret casts when it would result in undefined behaviour.
Don't use C-style casts at all.
If you think that you need to reinterpret cast, then take a few steps back, and consider why you think that you need it.
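For instance, here's a sketch of one well-defined alternative to the program above (my rewrite, assuming the goal is just to update ch): pass a pointer of the correct type, so no reinterpretation is needed.

#include <iostream>
using namespace std;

void updateValue(char *p) { // char*, matching the real type of the object
    *p = 610 % 255;         // 100, which is 'd' in ASCII
}

int main() {
    char ch = 'A';
    updateValue(&ch); // no cast, no undefined behaviour
    cout << ch;       // prints 'd'
}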
The problem is that you're typecasting a char* to an int* and then dereferencing p which leads to undefined behavior.
Undefined behavior means anything[1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
So the output that you're seeing (or may be seeing) is a result of undefined behavior. And as I said, don't rely on the output of a program that has UB. The program may just crash, which is what happens in your case.
For example, the very same program may crash under one compiler and run to completion under another.
So the first step to make the program correct would be to remove the UB. Then, and only then, can you start reasoning about the output of the program.
[1] For a more technically accurate definition of undefined behavior, see the standard's wording, where it is mentioned that there are no restrictions on the behavior of the program.
But I don't understand, what and where things are going wrong?
In this statement
*p = 610 % 255;
memory that does not belong to the object ch, which has type char, is overwritten. That is, instead of the one byte occupied by ch, the 4 bytes that an object of type int would occupy are overwritten.
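If the intent really is to reinterpret bytes, a hedged sketch of the well-defined route is std::memcpy between properly sized objects (the printed value is only illustrative, since byte order is implementation-defined):

#include <cstring>
#include <iostream>

int main() {
    int n = 610 % 255; // 100
    unsigned char bytes[sizeof n];
    std::memcpy(bytes, &n, sizeof n); // copy the int's object representation
    std::cout << static_cast<int>(bytes[0]) << '\n'; // 100 on a little-endian machine
}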
Let's consider the program below:
int main()
{
    int *p, *r;
    p = (int*)malloc(sizeof(int));
    cout << "Addr of p = " << p << endl;
    cout << "Value of p = " << *p << endl;
    free(p);
    cout << "After free(p)" << endl;
    r = (int*)malloc(sizeof(int));
    cout << "Addr of r = " << r << endl;
    cout << "Value of r = " << *r << endl;
    *p = 100;
    cout << "Value of p = " << *p << endl;
    cout << "Value of r = " << *r << endl;
    return 0;
}
Output:
Addr of p = 0x2f7630
Value of p = 3111728
After free(p)
Addr of r = 0x2f7630
Value of r = 3111728
Value of p = 100
Value of r = 100
In the above code, p and r are dynamically created.
p is created and freed. r is created after p is freed.
On changing the value in p, r's value also gets changed. But I have already freed p's memory, then why on changing p's value, r's value also gets modified with the same value as that of p?
I have come to the conclusion below. Please comment on whether I am right.
Explanation:
Pointer variables p and r are dynamically declared. Garbage values are stored initially. Pointer variable p is freed/deleted. Another pointer variable r is declared. The address allocated for r is the same as that of p (p still points to the old address). Now if the value of p is modified, r's value also gets modified with the same value as that of p (since both variables point to the same address).
The operator free() only frees the memory address from the pointer variable and returns the address to the operating system for re-use, but the pointer variable (p in this case) still points to the same old address.
The free() function and the delete operator do not change the content of a pointer, as the pointer is passed by value.
However, the stuff in the location pointed to by the pointer may not be available after using free() or delete.
So if we have memory location 0x1000:
+-----------------+
0x1000 | |
| stuff in memory |
| |
+-----------------+
Lets assume that the pointer variable p contains 0x1000, or points to the memory location 0x1000.
After the call to free(p), the operating system is allowed to reuse the memory at 0x1000. It may not use it immediately or it could allocate the memory to another process, task or program.
However, the variable p was not altered, so it still points to the memory area. In this case, the variable p still has a value, but you should not dereference (use the memory) because you don't own the memory any more.
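A common defensive habit that follows from this (a sketch, not something the standard requires): set the pointer to nullptr immediately after free, so a later accidental use fails fast instead of silently reading reused memory.

#include <cstdlib>
#include <iostream>

int main() {
    int *p = static_cast<int*>(std::malloc(sizeof(int)));
    if (!p) return 1;
    *p = 42;
    std::cout << *p << '\n'; // well-defined: prints 42
    std::free(p);
    p = nullptr;             // p no longer dangles
    if (p) {                 // guards like this now work as intended
        std::cout << *p << '\n';
    }
}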
Your analysis is superficially close in some ways but not correct.
p and r are defined to be pointers in the first statement of main(). They are not dynamically created. They are defined as variables of automatic storage duration within main(), so they cease to exist when (actually if, in the case of your program) main() returns.
It is not p that is created and freed. malloc() dynamically allocates memory and, if it succeeds, returns a pointer which identifies that dynamically allocated memory (or a NULL pointer if the dynamic allocation fails) but does not initialise it. The value returned by malloc() is (after conversion into a pointer to int, which is required in C++) assigned to p.
Your code then prints the value of p.
(I have highlighted the next para in italic, since I'll refer back to it below).
The next statement prints the value of *p. Doing that means accessing the value at the address pointed to by p. However, that memory is uninitialised, so the result of accessing *p is undefined behaviour. With your implementation (compiler and library), at this time, that happens to result in a "garbage value", which is then printed. However, that behaviour is not guaranteed - it could actually do anything. Different implementations could give different results, such as abnormal termination (crash of your program), reformatting a hard drive, or [markedly less likely in practice] playing the song "Crash" by the Primitives through your computer's loudspeakers.
After calling free(p) your code goes through a similar sequence with the pointer r.
The assignment *p = 100 has undefined behaviour, since p holds the value returned by the first malloc() call, but that has been passed to free(). So, as far as your program is concerned, that memory is no longer guaranteed to exist.
The first cout statement after that accesses *p. Since the memory p points to has been deallocated (p having been passed to free()), that gives undefined behaviour.
The second cout statement after that accesses *r. That operation has undefined behaviour, for exactly the same reason I described in the italic paragraph above (for p, as it was then).
Note, however, that there have been five occurrences of undefined behaviour in your code. When even a single instance of undefined behaviour occurs, all bets are off for being able to predict behaviour of your program. With your implementation, the results happen to be printing p and r with the same value (since malloc() returns the same value 0x2f7630 in both cases), printing a garbage value in both cases, and then (after the statement *p = 100) printing the value of 100 when printing *p and *r.
However, none of those results are guaranteed. The reason for no guarantee is that the meaning of "undefined behaviour" in the C++ standard is that the standard describes no limits on what is permitted, so an implementation is free to do anything. Your analysis might be correct, for your particular implementation, at the particular time you compiled, linked, and ran your code. It might even be correct next week, but be incorrect a month from now after updating your standard library (e.g. applying bug fixes). It is probably incorrect for other implementations.
Lastly, a couple of minor points.
Firstly, your code is incomplete and would not even compile in the form you have described. In the discussion above, I have assumed your code is actually preceded by
#include <iostream>
#include <cstdlib>
using namespace std;
Second, malloc() and free() are functions in the standard library. They are not operators.
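For completeness, here's a sketch of well-defined alternatives to reading freshly allocated (uninitialised) memory: calloc() zero-fills in C style, and value-initialised new is the idiomatic C++ route.

#include <cstdlib>
#include <iostream>

int main() {
    // C style: calloc zero-initialises the allocated bytes
    int *p = static_cast<int*>(std::calloc(1, sizeof(int)));
    if (!p) return 1;
    std::cout << *p << '\n'; // well-defined: prints 0
    std::free(p);

    // C++ style: value-initialisation guarantees *q == 0
    int *q = new int();
    std::cout << *q << '\n'; // well-defined: prints 0
    delete q;
}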
Your analysis of what actually happened is correct; however, the program is not guaranteed to behave this way reliably. Every use of p after free(p) "provokes undefined behavior". (This also happens when you access *p and *r without having written anything there first.) Undefined behavior is worse than just producing an unpredictable result, and worse than just potentially causing the program to crash, because the compiler is explicitly allowed to assume that code that provokes undefined behavior will never execute. For instance, it would be valid for the compiler to treat your program as identical to
int main() {}
because there is no control flow path in your program that does not provoke undefined behavior, so it must be the case that the program will never run at all!
free() releases the heap memory for reuse. But the contents present at that memory address are not erased or removed.
I have encountered a problem in my learning of C++, where a local variable in a function is being passed to the local variable with the same name in another function, both of these functions run in main().
When this is run,
#include <iostream>
using namespace std;
void next();
void again();
int main()
{
    int a = 2;
    cout << a << endl;
    next();
    again();
    return 0;
}

void next()
{
    int a = 5;
    cout << a << endl;
}

void again()
{
    int a;
    cout << a << endl;
}
it outputs:
2
5
5
I expected that again() would say null or 0 since 'a' is declared again there, and yet it seems to use the value that 'a' was assigned in next().
Why does next() pass the value of local variable 'a' to again() if 'a' is declared another time in again()?
http://en.cppreference.com/w/cpp/language/ub
You're correct: an uninitialized variable is a no-no. However, you are allowed to declare a variable and not initialize it until later. Memory is set aside to hold the integer, but what value happens to be in that memory until you initialize it can be anything at all. Some compilers will auto-initialize variables to junk values (to help you catch bugs), some will auto-initialize to default values, and some do nothing at all. C++ itself promises nothing, hence it's undefined behavior. In your case, with your simple program, it's easy enough to imagine how the compiler created assembly code that reused that exact same piece of memory without altering it. However, that's blind luck, and even in your simple program it isn't guaranteed to happen. These types of bugs can be fairly insidious, so make it a rule: be vigilant about uninitialized variables.
An uninitialized non-static local variable of *built-in type (phew! that was a mouthful) has an indeterminate value. Except for the char types, using that value yields formally Undefined Behavior, a.k.a. UB. Anything can happen, including the behavior that you see.
Apparently, with your compiler and options, the stack area that was used for a in the call of next was not used for anything else until the call of again, where it was reused for the a in again, now with the same value as before.
But you cannot rely on that. With UB anything, or nothing, can happen.
* Or more generally of POD type, Plain Old Data. The standard's specification of this is somewhat complicated. In C++11 it starts with §8.5/11, “If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value.”. Where “automatic … storage duration” includes the case of local non-static variable. And where the “no initialization” can occur in two ways via §8.5/6 that defines default initialization, namely either via a do-nothing default constructor, or via the object not being of class or array type.
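A minimal sketch (my example) contrasting the forms §8.5 describes: default-initialisation of a built-in type leaves an indeterminate value, while value-initialisation guarantees zero.

#include <iostream>

int main() {
    int a;     // default-initialised: indeterminate value; reading it would be UB
    int b{};   // value-initialised: guaranteed to be 0
    int c = 0; // explicitly initialised: guaranteed to be 0
    std::cout << b << ' ' << c << '\n'; // well-defined: prints "0 0"
    (void)a;   // fine as long as a's value is never read
}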
This is completely coincidental and undefined behavior.
What's happened is that you have two functions called immediately after one another. Both will have more or less identical function prologs and both reserve a variable of exactly the same size on the stack.
Since there are no other variables in play and the stack is not modified between the calls, you just happen to end up with the local variable in the second function "landing" in the same place as the previous function's local variable.
Clearly, this is not good to rely upon. In fact, it's a perfect example of why you should always initialize variables!
I've checked myself, I wrote a program like this
#include <iostream>
using namespace std;

int main() {
    int i;
    cout << i;
    return 0;
}
I ran the program a few times and the result was the same every time: zero.
I've tried it in C and the result was the same.
But my textbook says
If you don't initialize a variable that's defined inside a function, the variable's value remains undefined. That means the element takes on whatever value previously resided at that location in memory.
How is this possible when the program always assigns a free memory location to the variable? How could it be something other than zero? (I assume that the default value of free memory is zero.)
How is this possible when the program always assigns a free memory location to the variable? How could it be something other than zero?
Let's take a look at an example of a practical implementation. Let's say it uses the stack to keep local variables.
#include <stdio.h>

void
foo(void)
{
int foo_var = 42;
}
void
bar(void)
{
int bar_var;
printf("%d\n", bar_var);
}
int
main(void)
{
bar();
foo();
bar();
}
The totally broken code above illustrates the point. After we call foo, a certain location on the stack where foo_var was placed is set to 42. When we then call bar, bar_var occupies that exact location. And indeed, executing the code prints 0 and then 42, showing that bar_var's value cannot be relied upon unless initialized.
Now it should be clear that local variable initialisation is required. But could main be an exception? Is there anything which could play with the stack and, as a result, give us a non-zero value?
Yes. main is not the first function executed in your program. In fact there is a ton of work required to set everything up. Any of this work could have used the stack and left some non-zeros on it. Not only can you not expect the same value on different operating systems, it may very well suddenly change on the very system you are using right now. Interested parties can google for "dynamic linker".
Finally, the C language standard does not even have the term stack. Having a "place" for local variables is left to the compiler. It could even get random leftovers from whatever happened to be in a given register. It really can be totally anything. In fact, if undefined behaviour is triggered, the compiler has the freedom to do whatever it feels like.
If you don't initialize a variable that's defined inside a function, the variable's value remains undefined.
This bit is true.
That means the element takes on whatever value previously resided at that location in memory.
This bit is not.
Sometimes in practice this will occur, and you should realise that getting zero or not getting zero perfectly fits this theory, for any given run of your program.
In theory your compiler could actually assign a random initial value to that integer if it wanted, so trying to reason about this is entirely pointless. But let's continue as if we assumed that "the element takes on whatever value previously resided at that location in memory"…
How could it be something other than zero? (I assume that the default value of free memory is zero.)
Well, this is what happens when you assume. :)
This code invokes Undefined Behavior (UB), since the variable is used uninitialized.
The compiler should emit a warning when a warning flag, like -Wall, is used:
warning: 'i' is used uninitialized in this function [-Wuninitialized]
cout << i;
^
It just happens that on this run, on your system, it had the value 0. That means the garbage value the variable happened to hold was 0, because that was the leftover content of that memory.
However, notice that kernel-zeroed pages appear relatively often, which means it is quite common to get zero as output on a given system; but it is not guaranteed and should not be taken as a promise.
Static variables and global variables are initialized to zero:
Global:
int a; // a is initialized as 0

void myfunc() {
    static int x; // x is also initialized as 0
    printf("%d", x);
}
Whereas non-static or auto variables, i.e. the local variables, have indeterminate values (indeterminate usually means anything can happen: the value can be zero, it can be whatever was there before, or using it can crash the program). Reading them prior to assigning a value results in undefined behavior.
void myfunc2() {
    int x; // x is uninitialized; its value is indeterminate
    printf("%d", x); // undefined behavior: x is read before being assigned
}
In practice the observed value depends mostly on the compiler and environment; it often happens to come out as 0, but that is never guaranteed.
#include <iostream>
using namespace std;
int main()
{
    int a[100], n;
    cout << &n << " " << &a[100] << endl;
    if (&n != &a[100])
    {
        cout << " What is wrong with C++?";
    }
}
It prints the same address for n and a[100]. But when I compare the two addresses in the if statement, it says they are not equal.
What does this mean?
When I change the value of n, a[100] also changes, so doesn't that mean n and a[100] are the same?
First, let's remember that there is no a[100]. It does not exist! If you tried to access a[100]'s "value" then, according to the abstract machine called C++, anything can happen. This includes blowing up the sun or dyeing my hair purple; now, I like my hair the way it is, so please don't!
Anyway, what you're doing is playing with the array's "one-past-the-end" pointer. You are allowed to obtain this pointer, which is a fake pointer of sorts, as long as you do not dereference it.
It is available only because the "101st element" would be "one past the end" of the array. (And there is debate as to whether you are allowed to write &a[100], rather than a+100, to get such a pointer; I am in the "no" camp.)
However, that still says nothing about comparing it to the address of some entirely different object. You cannot assume anything about the relative location of local variables in memory. You simply have no idea where n will live with respect to a.
The results you're observing are unpredictable, unreliable and meaningless side effects of the undefined behaviour exhibited by your program, caused by a combination of compiler optimisations and data locality.
For the array declaration int a[100], valid indices run from 0 to 99. When you take the address of the 101st element, which is out of range, it may happen to coincide with the next object on the stack (in your case, the variable n). However, it's undefined behavior.
For a bit of background, here is the relevant text from the specification.
Equality operator (==,!=)
Pointers to objects of the same type can be compared for equality with the 'intuitive' expected results:
From § 5.10 of the C++11 standard:
Pointers of the same type (after pointer conversions) can be compared for equality. Two pointers of the same type compare equal if and only if they are both null, both point to the same function, or both represent the same address (3.9.2).
(leaving out details on comparison of pointers to members and/or the null pointer constants - they continue down the same line of 'Do What I Mean':)
[...] If both operands are null, they compare equal. Otherwise if only one is null, they compare unequal.[...]
The most 'conspicuous' caveat has to do with virtuals, and it does seem to be the logical thing to expect too:
[...] if either is a pointer to a virtual member function, the result is unspecified. Otherwise they compare equal if and only if they would refer to the same member of the same most derived object (1.8) or the same subobject if they were dereferenced with a hypothetical object of the associated class type. [...]
Maybe the problem is that an array of int is not the same as an int!
By writing &a[100] you're invoking undefined behavior, since there is no element with index 100 in the array a. To get the same address safely, you could instead write a+100 (or &a[0]+100), in which case your program would be well-defined; but whether or not the if condition will hold cannot be predicted or relied upon on any implementation.
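To show the legitimate side of that pointer, here's a minimal sketch (my example): a + 100 may be formed and compared as a loop bound, as long as it is never dereferenced.

#include <iostream>

int main() {
    int a[100] = {}; // zero-initialised
    // a + 100 is the one-past-the-end pointer: fine to form and compare,
    // never fine to dereference
    for (int *p = a; p != a + 100; ++p) {
        *p = static_cast<int>(p - a); // fill with 0..99
    }
    std::cout << a[99] << '\n'; // prints 99
}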