#include <iostream>
using namespace std;

int main()
{
    int *p;
    double *q;
    cout << p << " " << q << endl;
    p++;
    q++;
    cout << p << " " << q << endl;
    //*p = 5; // should be wrong!
}
This program prints
0x7ffe6c0591a0 0
0x7ffe6c0591a4 0x8
Why does p point to some random address and q to zero? Also, when I uncomment the line *p = 5, shouldn't it throw an error? It still works fine:
Output with the line uncommented:
0x7ffc909a2f70 0
0x7ffc909a2f74 0x8
What can explain this weird behaviour?
When local (auto) variables of basic type (int, double, pointers to them, etc) are uninitialised, any operation that accesses their value yields undefined behaviour.
Printing a variable accesses its value, so both statements with cout << ... give undefined behaviour. Incrementing a variable also accesses its value (it is not possible to produce the result of incrementing without reading the previous value), so both increment operators give undefined behaviour. Dereferencing an uninitialised pointer (as in *p) gives undefined behaviour, as does assigning to the result (*p = 5).
So every statement you have shown after the definitions of p and q gives undefined behaviour.
Undefined behaviour means there are no constraints on what is permitted to happen - or, more simply, that anything can happen. That allows any result from "appear to do nothing" to "crash" to "reformat your hard drive".
The particular output you are getting therefore doesn't really matter. You may get completely different behaviour when the code is built with a different compiler, or even during a different phase of the moon.
In terms of a partial explanation of what you are seeing .... The variables p and q will probably receive values corresponding to whatever happens to be in memory at the location where they are created - and therefore to whatever some code (within an operating system driver, within your program, even within some other program) happened to write at that location previously. But that is only one of many possible explanations.
As you have not initialised the variables, the code has undefined behaviour, so anything can happen, including what you have experienced.
C++ gives you more than enough rope to hang yourself. Compile with all warnings switched on (e.g. -Wall -Wextra with GCC or Clang) to avoid some of the perils.
When you do not initialise a variable, its value is indeterminate; what you actually observe depends on the compiler and on whatever happened to be in that memory.
About ++: if you create a pointer to a type A and use ++, the pointer is incremented by sizeof(A). That is why p (an int*) advanced by 4 and q (a double*) advanced by 8 in your output.
And about your last question: *p = 5 is a good instance of undefined behaviour, since you have not allocated any memory for p to point to.
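For contrast, here is a minimal sketch with the pointers initialised, so the sizeof-based increments can be observed without undefined behaviour (the printed addresses will of course differ from run to run):
#include <iostream>
using namespace std;

int main()
{
    int i = 0;
    double d = 0.0;
    int *p = &i;     // initialised: points at a real int
    double *q = &d;  // initialised: points at a real double
    cout << p << " " << q << endl;
    p++;             // advances by sizeof(int), typically 4 bytes
    q++;             // advances by sizeof(double), typically 8 bytes
    cout << p << " " << q << endl;
    // p and q are now one-past-the-end pointers: fine to print or compare,
    // but dereferencing them would again be undefined behaviour
}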
So, as of now, it seems to be impossible to actually modify a "const" value in C++ (tested in VS 2017).
const int a = 5;

int* ptr = (int*)&a;                  // Method 1
// *((int*)(&a)) = 6;                 // Method 2 (tried separately)
// int* ptr = const_cast<int*>(&a);   // Method 3 (tried separately)

*ptr = 55;
cout << a << "\t" << &a << endl;
cout << *ptr << "\t" << ptr << endl;
Result:
5 SOMEMEMORYADDRESS
55 SOMEMEMORYADDRESS
Anyone got any idea what else can be tried to achieve the effect? Really curious how it is possible to have 1 memory address (at least according to the console) with 2 values.
Please note: there are topics like this for older C++ versions (and they used to work in the past - but they don't, anymore).
Really curious how it is possible to have 1 memory address (at least according to the console) with 2 values.
It's because you invoked undefined behavior. The C++ standard, from C++98, has expressly forbidden you from modifying an object that is declared const. And the standard has a catch-all statement such that if you do anything which causes modification of a const object, you get undefined behavior.
Because modifying an object declared const is UB, the compiler is free to assume that this object will never be modified. So, since the compiler can see that a is const and is initialised to 5, it is 100% valid for the compiler to replace, at compile time, everything that refers to this object with 5. So when you do cout << a, the compiler is free not to bother accessing memory; it can just do cout << 5.
If you did something to modify the memory behind a, that's UB, so the compiler doesn't have to care about what happens in that case.
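In other words, a compiler may legitimately transform the snippet above into something like this sketch (what a given compiler actually emits will, of course, vary):
cout << 5 << "\t" << &a << endl;      // 'a' folded to its compile-time constant
cout << *ptr << "\t" << ptr << endl;  // the store through ptr may or may not be visible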
they used to work in the past - but they don't, anymore
No, they never "worked". They merely happened to do what you thought they should. But C++ never guaranteed that compilers would behave in this way, so you have no right to complain about compilers changing that behavior now.
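For contrast, const_cast itself is not the problem; writing through it is well-defined when the underlying object was never declared const. A minimal sketch:
#include <iostream>
using namespace std;

int main()
{
    int real = 5;                  // the object itself is NOT const
    const int* view = &real;       // a read-only view of a modifiable object
    *const_cast<int*>(view) = 55;  // well-defined: the object is non-const
    cout << real << endl;          // prints 55

    const int a = 5;               // the object itself IS const
    // *const_cast<int*>(&a) = 55; // undefined behaviour, as discussed above
}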
Why does the new[] operator in C++ actually create an array of length + 1? For example, see this code:
#include <iostream>

int main()
{
    std::cout << "Enter a positive integer: ";
    int length;
    std::cin >> length;

    int *array = new int[length]; // use array new. Note that length does not need to be constant!
    //int *array;

    std::cout << "I just allocated an array of integers of length " << length << '\n';

    for (int n = 0; n <= length + 1; n++)
    {
        array[n] = 1; // set element n to value 1
    }

    std::cout << "array[0] " << array[0] << '\n';
    std::cout << "array[length-1] " << array[length-1] << '\n';
    std::cout << "array[length] " << array[length] << '\n';
    std::cout << "array[length+1] " << array[length+1] << '\n';

    delete[] array; // use array delete to deallocate array
    array = 0; // use nullptr instead of 0 in C++11
    return 0;
}
We dynamically create an array of length "length" but we are able to assign a value at the index length+1. If we try to do length+2, we get an error.
Why is this? Why does C++ behave as if the array had length + 1 elements?
It doesn’t. You’re allowed to calculate the address array + n, for the purpose of checking that another address is less than it. Trying to access the element array[n] is undefined behavior, which means the program becomes meaningless and the compiler is allowed to do anything whatsoever. Literally anything; one old version of GCC, if it saw a #pragma directive, started a roguelike game on the terminal. (Thanks, Revolver_Ocelot, for reminding me: that was technically implementation-defined behavior, a different category.) Even calculating the address array + n + 1 is undefined behavior.
Because it can do anything, the particular compiler you tried that on decided to let you shoot yourself in the foot. If, for example, the next two words after the array were the header of another block in the heap, you might get a memory-corruption bug. Or maybe the compiler stored the array at the top of your memory space, the address &array[n+1] is a NULL pointer, and trying to dereference it causes a segmentation fault. Or maybe the next page of memory is not readable or writable and trying to access it crashes the program with a protection fault. Or maybe the implementation bounds-checks your array accesses at runtime and crashes the program. Maybe the runtime stuck a canary value after the array and checks later to see if it was overwritten. Or maybe it happens, by accident, to work.
In practice, you really want the compiler to catch those bugs for you instead of trying to track down the bugs that buffer overruns cause later. It would be better to use a std::vector than a dynamic array. If you must use an array, you want to check that all your accesses are in-bounds yourself, because you cannot rely on the compiler to do that for you and skipping them is a major cause of bugs.
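For illustration, here is a minimal sketch of the same program using std::vector, whose at() member checks bounds at runtime instead of silently corrupting memory:
#include <iostream>
#include <stdexcept>
#include <vector>

int main()
{
    std::cout << "Enter a positive integer: ";
    int length;
    std::cin >> length;

    std::vector<int> v(length, 1); // allocates and sets every element to 1

    try {
        v.at(length) = 1;          // one past the end: throws instead of UB
    } catch (const std::out_of_range& e) {
        std::cout << "out of range: " << e.what() << '\n';
    }
}   // no delete[] needed: the vector releases its storage automatically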
If you write or read beyond the end of an array or other object you create with new, your program's behaviour is no longer defined by the C++ standard.
Anything can happen, and the compiler and program remain standard-compliant.
The most likely thing to happen in this case is that you are corrupting memory in the heap. In a small program this "seems to work" because the section of the heap you use isn't being used by any other code; in a larger one you will crash or behave randomly elsewhere, in a seemingly unrelated bit of code.
But arbitrary things could happen. The compiler could prove that a branch leads to an access beyond the end of an array and dead-code-eliminate the paths that lead to it (UB that time travels), or it could hit a protected memory region and crash, or it could corrupt heap management data and cause a future new/delete to crash, or nasal demons, or whatever else.
In the for loop you are assigning elements beyond the bounds of the array, and remember that C++ does not do bounds checking.
So when you initialize the array you are writing beyond its bounds (say the user enters 3 for length: you are then assigning 1 to array[0] through array[4], because the condition is n <= length + 1).
The behavior of the program is unpredictable when you go beyond the array's bounds, but most likely it will crash. In this case you are going two elements beyond the bounds, because you have used <= in the condition together with length + 1.
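The fix is to keep the index strictly below length, so only the valid indices 0 through length - 1 are touched:
for (int n = 0; n < length; n++) // valid indices are 0 .. length-1
{
    array[n] = 1;
}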
There is no requirement that the new[] operator allocate more memory than requested.
What is happening is that your code is running past the end of the allocated array. It therefore has undefined behaviour.
Undefined behaviour means that the C++ standard imposes no requirements on what happens. Therefore, your implementation (compiler and standard library, in this case) will be equally correct if your program SEEMS to work properly (as it does in your case), produces a run time error, trashes your system drive, or anything else.
In practice, all that is happening is that your code is writing to memory, and later reading from that memory, past the end of the allocated memory block. What happens depends on what is actually in that memory location. In your case, whatever happens to be in that memory location is able to be modified (in the loop) or read (in order to print to std::cout).
Conclusion: the explanation is not that new[] over-allocates. It is that your code has undefined behaviour, so can seem to work anyway.
I am a C++ learner. Others told me "an uninitialised pointer may point anywhere". How can I prove that by code? I made a little test, but my uninitialised pointer always points to 0. In which cases does it not point to 0? Thanks
#include <cstdio>

int main() {
    int* p;
    printf("%d\n", p); // aside: %p (with a void* cast) is the correct specifier for printing a pointer
    char* p1;
    printf("%d\n", p1);
    return 0;
}
Any uninitialized variable by definition has an indeterminate value until a value is supplied, and even accessing it is undefined behaviour. Because this is the grey area of undefined behaviour, there's no way to guarantee that an uninitialized pointer will be 0, or any other particular value.
Anything you write to demonstrate this would be dictated by the compiler and system you are running on.
If you really want to, you can try writing a function that fills up a local array with garbage values, and create another function that defines an uninitialized pointer and prints it. Run the second function after the first in your main() and you might see it.
Edit: To satisfy your curiosity, I exhibited the behaviour with VS2015 on my system with this code:
#include <cstdint>
#include <iostream>

void f1()
{
    // fill some stack memory with junk
    char arr[24];
    for (char& c : arr) c = 1;
}

void f2()
{
    // uninitialized pointers, occupying (roughly) the same stack area
    int* ptr[4];
    std::cout << (std::uintptr_t)ptr[1] << std::endl;
}

int main()
{
    f1();
    f2();
    return 0;
}
Which prints 16843009 (0x01010101, i.e. four bytes of the value 1 left behind by f1). But again, this is all undefined behaviour.
Well, I think it is not worth trying to prove this, because good coding style says: initialise all variables! One example: when you free a pointer, give it a value again, like in this example:
char *p = NULL; // yes, this is not needed, but do it! Later you may change your program and add code beneath this line...
p = (char *)malloc(512);
...
free(p);
p = NULL;
That is a safe and good style. Also, if you call free(p) again by accident, it will not crash your program, because free(NULL) is a no-op. In this example, if you don't set p to NULL after the free(), you can use the pointer again by mistake, and your program would try to address already-freed memory; this will crash your program or (worse) may end in strange results.
So don't waste time on your question about a case where pointers do not point to NULL. Just give your variables (pointers) values! :-)
It depends on the compiler. Your code, executed on an old MSVC 2008, displays in release mode (effectively random values):
1955116784
1955116784
and in debug mode (after warning about uninitialized pointer usage):
-858993460
-858993460
because that implementation sets uninitialized pointers to 0xCCCCCCCC in debug mode to detect their usage (-858993460 is exactly 0xCCCCCCCC reinterpreted as a signed 32-bit integer).
The standard says that using an uninitialized pointer leads to undefined behaviour. That means that from the standard anything can happen. But a particular implementation is free to do whatever it wants:
yours happens to set the pointers to 0 (but you should not rely on that unless it is documented in the implementation's documentation)
MSVC in debug mode sets the pointers to 0xCCCCCCCC, but AFAIK does not document it (*), so we still cannot rely on it
(*) at least I could not find any reference...
I've met a situation that I think is undefined behavior: there is a structure that has some members, and one of them is a void pointer (it is not my code and it is not public; I suppose the void pointer is there to make it more generic). At some point, some char memory is allocated and assigned to this pointer:
void fooTest(ThatStructure * someStrPtr) {
    try {
        someStrPtr->voidPointer = new char[someStrPtr->someVal + someStrPtr->someOtherVal];
    } catch (std::bad_alloc& ba) {
        std::cerr << ba.what() << std::endl;
    }
    // ...
and at some point it crashes in the allocation part (operator new) with a segmentation fault (a few times it works; there are more calls of this function, more cases). I've seen this in the debugger.
My machine runs Linux, but I also know that on Windows there is a segmentation fault right at the beginning (I suppose in the first call of the function that allocates the memory).
Moreover, if I add a print of the values:
std::cout << someStrPtr->someVal << " " << someStrPtr->someOtherVal << std::endl;
before the try block, it runs through to the end. I added this print to see whether there was some other problem with the structure pointer, but the values are printed and are not 0 or negative.
I've seen these topics: topic1, topic2, topic3, and I am thinking that there is some UB linked to the void pointer. Can anyone help me pin down the issue here so I can solve it? Thanks.
No, that in itself is not undefined behavior. In general, when code "crashes at the allocation part", it's because something earlier messed up the heap, typically by writing past one end of an allocated block or releasing the same block more than once. In short: the bug isn't in this code.
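A minimal, hypothetical sketch of the kind of earlier bug that makes a later allocation crash: the overflow damages the heap's bookkeeping, and the damage only surfaces when the allocator next touches that metadata, far from the real bug:
#include <cstring>

int main()
{
    char* block = new char[16];
    std::memset(block, 0, 32);  // BUG: writes 16 bytes past the end of the block
    char* other = new char[16]; // may crash here, nowhere near the real bug
    delete[] other;
    delete[] block;             // ... or here, when the allocator checks its metadata
}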
A void pointer is a perfectly fine thing to use in C/C++, and you can usually cast from/to other types.
When you get a seg-fault during the allocation, it usually means some of the parameters used are themselves invalid:
Is someStrPtr valid?
Are someStrPtr->someVal and someStrPtr->someOtherVal valid?
Are the printed values what you were expecting?
Also, if this is a multithreaded application, make sure that no other thread is accessing those variables (especially between your print and the allocation statement). That kind of bug is really difficult to catch.
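A minimal sketch of those checks, using the names from the question (the struct definition here is a hypothetical stand-in, since the real one is not public; note also that a null test only catches the simplest kind of invalid pointer, a dangling pointer would still pass it):
#include <iostream>

// hypothetical stand-in for the (non-public) structure from the question
struct ThatStructure {
    int someVal;
    int someOtherVal;
    void* voidPointer;
};

void fooTest(ThatStructure* someStrPtr) {
    if (someStrPtr == nullptr) { // is someStrPtr valid (at least non-null)?
        std::cerr << "null structure pointer" << std::endl;
        return;
    }
    // are the values sane before they are used as an allocation size?
    std::cout << someStrPtr->someVal << " " << someStrPtr->someOtherVal << std::endl;
    someStrPtr->voidPointer = new char[someStrPtr->someVal + someStrPtr->someOtherVal];
}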
I was working with pointers and was trying different things.
So here I made a general print function
void print(const int &value)
{
    cout << "value = " << value << " & address = " << &value << endl;
}
Now here's some code that works somehow:
int j = 9090;
int* p;
*p = j; //Please read further
print(*p);
I know *p = j is something illogical, and I should have done p = &j or p = new int etc., but why did this work?
The output was
value = 9090 & address = 0x28ff48
Moreover, if I print j, its address is 0x28ff44. Which means it allocated new memory automatically for p (as the address it printed ends in ...48).
Now, if I only add another line before j's declaration:
int i = 80;
int j = 9090;
int* p;
*p = j; //Please read further
print(*p);
This program crashes.
However, if I add the line declaring i after the declaration of p, the program runs:
int j = 9090;
int* p;
int i = 80;
*p = j;
print(*p);
Am I doing something wrong? I was using the GCC compiler in the Dev-C++ 4.9.9.2 IDE. What's going on?
The problem is that dereferencing an uninitialised pointer is undefined.
This means that anything can happen - sometimes a program will crash, sometimes just give strange results, and sometimes (if you're unlucky) it will just seem to work.
Since your pointer is uninitialised, you will access memory at whatever random location it happens to represent.
Sometimes, this is a valid address, and the program will keep running.
Sometimes, it's an invalid address, and the program will crash.
int* p;
*p = j;
This is syntactically correct C++. It means: store the value of j at the address held in p. The crash originates from the fact that you never provided any memory for p. When you write int* p, you have a pointer to an int, but you haven't given it the address of any real int object in which to store the value. There is some rubbish in p, but that rubbish is treated as a valid address by the program, which tries to write the value of j there. If you are lucky and that address happens to be writable and unused, the program will keep working; otherwise it will crash. To avoid such undefined behaviour you need to allocate memory for p:
int* p = new int;
*p = j;
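Alternatively, since j already exists, you can simply point p at it; this avoids the allocation (and the need to delete later):
int j = 9090;
int* p = &j; // p now points at the existing object j; nothing to delete
print(*p);   // well-defined: reads j through p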
I know *p = j is something illogical
In this case it is undefined behavior. You are dereferencing a pointer that wasn't initialized, so it effectively points to an unknown (random) location.
Which means it allocated new memory automatically for p
No, it didn't. The address you see is just the random, undefined value which ended up in your uninitialized pointer.
why did this work?
That's the thing with undefined behavior: It might work by coincidence, for example if the random value in the pointer happened to point into some valid, writable memory region, your code would work fine for that specific run. Another time, it might just crash. Yet another time, it might corrupt your heap, or some unrelated data, and cause a seemingly unrelated crash or bug some time later. Which is just screaming for long nights in front of your debugger ;)
Bottom line: Don't do it - avoid UB at all cost.
However if I add this i's line after declaration of p the program runs.
I think the behavior you observe (crash vs. no crash) is not directly related to the code changes, but coincidental. In fact, if you'd run the "working" version multiple times, I'm sure you'd still get a crash after some time.