Change the way gdb displays uninitialized data - gdb

I'm debugging a program that declares an array with 1024 elements and it's not initialized until much later. Every time I use "info locals" it shows me this really long list of uninitialized data. Is there any way to change the way that gdb presents uninitialized variables? Something along the lines of lot_data[1024]=UNINITIALIZED.

Is there any way to change the way that gdb presents uninitialized variables?
No.
GDB doesn't know whether a memory location has been assigned or not. To GDB it's just bits, and it can't display bits differently depending on where their value came from (which it doesn't know).
P.S. Actually tracking the state of bits is possible with instrumentation (clang -fsanitize=memory -fsanitize-memory-track-origins ...), but is a fairly expensive thing to do.
Also consider that memory can remain uninitialized despite being assigned:
int buf[5]; // uninitialized memory declared
int k = buf[0]; // k is still uninitialized
int *ip = malloc(sizeof(buf)); // uninitialized memory created
memcpy(buf, ip, sizeof(buf)); // buf is still uninitialized, despite being written to
int j = buf[0]; // j is still uninitialized
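One practical workaround, offered as a sketch and not as something the answer above proposes: if the array is value-initialized where it is declared, every element starts at zero, and GDB's default printing normally collapses a long run of identical elements into a single `<repeats N times>` entry (the threshold is controlled by `set print repeats`). The names `make_lot_data` and `all_zero` are hypothetical and exist only for illustration. A plain `int lot_data[1024] = {0};` is assumed to behave the same way.

```cpp
#include <algorithm>
#include <array>

// Hypothetical helper: true when every element of the array is zero.
bool all_zero(const std::array<int, 1024>& data) {
    return std::all_of(data.begin(), data.end(), [](int v) { return v == 0; });
}

// Value-initialization ("{}") zeroes all 1024 elements up front,
// so the debugger sees one long run of identical values.
std::array<int, 1024> make_lot_data() {
    std::array<int, 1024> lot_data{};
    return lot_data;
}
```

The cost is one extra pass over the array at declaration time, which is usually negligible for 1024 ints.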

Related

Code running successfully using vector but showing error using array

I was practicing an array manipulation question. While solving I declared an array (array A in code).
For some test cases, I got a segmentation fault. I replaced the array with vector and got AC. I don't know the reason for this. Plz, explain.
#include <bits/stdc++.h>
using namespace std;
int main()
{
int n,m,a,b,k;
cin>>n>>m;
vector<long int> A(n+2);
//long int A[n+2]={0};
for(int i=0;i<m;i++)
{
cin>>a>>b>>k;
A[a]+=k;
A[b+1]-=k;
}
long res=0;
for(int i=1;i<n+2;i++)
{
A[i]+=A[i-1];
if(res<A[i])
res=A[i];
}
cout<<res;
return 0;
}
Since it looks like you haven't been programming in C++ for very long, I will try to break it down to make it simpler to understand:
First of all, C++ does not initialize any values for you (this is not Java), so please do not do:
int n,m,a,b,k;
And then use:
A[a]+=k;
A[b+1]-=k;
At this point we have no idea what a and b are; they might be -300 for all we know, because you never initialized them. Hence, occasionally you get lucky and the value the variable happens to hold does not cause a segmentation fault, and other times you are not so lucky and it does.
long int A[n+2]={0}; is not legal in Standard C++. There are a bunch of reasons for this and I think you stumbled over one of them.
Compilers that allow Variable Length Arrays follow the example of C99 and the array is allocated on the stack. Stack is a limited resource, usually between 1 and 10 MB for a desktop computer. If the user inputs an n of sufficient size, the array will take up too much of the stack or breach the bounds of the stack, resulting in Undefined Behaviour. Often this behaviour manifests in a segmentation fault from accessing memory that is so far off the end of the stack that it's not controlled by the program. There are typically no warnings when you overflow the stack. Often a program crash or corrupted data is the way you find out, and it's too late to salvage the program by then.
On the other hand, a vector allocates its internal buffer from the freestore, and on a modern PC with virtual memory and 64 bit addressing the freestore is fantastically huge and throws an exception if you attempt to exceed what it can allocate.
Another important difference is
long int A[n+2]={0};
likely did not zero initialize the array. This is the case with g++. The first byte will be set to zero and the remainder are uninitialized. Such is the curse of using non-Standard extensions. You cannot count on the behaviour guaranteed by the Standard.
std::vector will zero initialize the whole array or set the array to whatever value you tell it to use.
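As a rough illustration of the vector-based approach, here is a minimal sketch of the question's difference-array loop. The `Update` struct and the function name `max_after_updates` are made up for this example, and skipping out-of-range updates is an assumption about what the caller wants, not part of the original code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One range update: add k to every element in [a, b] (1-based, inclusive).
struct Update {
    std::size_t a;
    std::size_t b;
    long long k;
};

// Difference-array sketch: returns the largest element after all updates.
// Updates whose bounds fall outside [1, n] are skipped instead of indexing out of range.
long long max_after_updates(std::size_t n, const std::vector<Update>& updates) {
    std::vector<long long> diff(n + 2);  // value-initialized: every element is 0
    for (const Update& u : updates) {
        if (u.a < 1 || u.b > n || u.a > u.b) continue;
        diff[u.a] += u.k;
        diff[u.b + 1] -= u.k;
    }
    long long best = 0;
    long long running = 0;
    for (std::size_t i = 1; i <= n; ++i) {
        running += diff[i];  // prefix sum rebuilds the actual element value
        best = std::max(best, running);
    }
    return best;
}
```

Because the buffer lives on the freestore, a large n does not eat into the stack, and `diff(n + 2)` guarantees zeroed elements without relying on a compiler extension.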

C++/Address Space: 2 Bytes per address?

I was just trying something and i was wondering how this could be. I have the following Code:
int var1 = 132;
int var2 = 200;
int *secondvariable = &var2;
cout << *(secondvariable+2) << endl << sizeof(int) << endl;
I get the Output
132
4
So how is it possible that the second int is only 2 addresses higher? I mean shouldn't it be 4 addresses? I'm currently under WIN10 x64.
Regards
With cout << *(secondvariable+2) you don't print a pointer, you print the value at secondvariable[2], which is invalid indexing and leads to undefined behavior.
If you want to print a pointer then drop the dereference and print secondvariable+2.
While you already are far in the field of undefined behaviour (see Some programmer dude's answer) due to indexing an array out of bounds (a single variable is considered an array of length 1 for such matters), some technical background:
Alignment! Compilers are allowed to place variables at addresses such that they can be accessed most efficiently. As you seem to have gotten valid output by adding 2*sizeof(int) to the second variable's address, you apparently have reached the first one by accident. Apparently, the compiler decided to leave a gap in between the two variables so that both can be aligned to addresses divisible by 8.
Be aware, though, that you don't have any guarantee for such alignment, different compilers might decide differently (or same compiler on another system), and alignment even might be changed via compiler flags.
On the other hand, arrays are guaranteed to occupy contiguous memory, so you would have gotten the expected result in the following example:
int array[2];
int* a0 = &array[0];
int* a1 = &array[1];
uintptr_t diff = reinterpret_cast<uintptr_t>(a1) - reinterpret_cast<uintptr_t>(a0); // pointer-to-integer conversion needs reinterpret_cast
std::cout << diff;
The cast to uintptr_t (or alternatively to char*) assures that you get address difference in bytes, not sizes of int...
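A minimal sketch of both ways to measure the distance between two array elements; the function names are hypothetical. Note that a pointer-to-integer conversion requires reinterpret_cast, and that the exact integer values produced are implementation-defined, so the `char*` route is the more portable of the two.

```cpp
#include <cstddef>
#include <cstdint>

// Distance between two pointers into the same array, measured in bytes.
std::ptrdiff_t byte_distance(const int* first, const int* second) {
    return reinterpret_cast<const char*>(second) -
           reinterpret_cast<const char*>(first);
}

// Same distance measured in elements: plain pointer subtraction.
std::ptrdiff_t element_distance(const int* first, const int* second) {
    return second - first;
}

// Integer route: the numeric address of a pointer (implementation-defined mapping).
std::uintptr_t address_of(const int* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}
```

For `int array[2]`, `byte_distance(&array[0], &array[1])` gives `sizeof(int)` while `element_distance` gives 1, which is the difference the paragraph above describes.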
This is not how C++ works.
You can't "navigate" your scope like this.
Such pointer antics have completely undefined behaviour and shall not be relied upon.
You are not punching holes in tape now, you are writing a description of a program's semantics, that gets converted by your compiler into something executable by a machine.
Code to these abstractions and everything will be fine.

Undefined behaviour observed in C++/memory allocation

#include <iostream>
using namespace std;
int main()
{
int a=50;
int b=50;
int *ptr = &b;
ptr++;
*ptr = 40;
cout<<"a= "<<a<<" b= "<<b<<endl;
cout<<"address a "<<&a<<" address b= "<<&b<<endl;
return 0;
}
The above code prints :
a= 50 b= 50
address a 0x7ffdd7b1b710 address b= 0x7ffdd7b1b714
Whereas when I remove the following line from the above code
cout<<"address a "<<&a<<" address b= "<<&b<<endl;
I get output as
a= 40 b= 50
My understanding was that the stack grows downwards, so the second output seems to be the correct one. I am not able to understand why the print statement would mess up the memory layout.
EDIT:
I forgot to mention, I am using 64 bit x86 machine, with OS as ubuntu 14.04 and gcc version 4.8.4
First of all, it's all undefined behavior. The C++ standard says that you can increment pointers only as long as you are in array boundaries (plus one element after), with some more exceptions for standard layout classes, but that's about it. So, in general, snooping around with pointers is uncharted territory.
Coming to your actual code: since you are never asking for its address, probably the compiler either just left a in a register, or even straight propagated it as a constant throughout the code. For this reason, a never touches the stack, and you cannot corrupt it using the pointer.
Notice anyhow that the compiler isn't restricted to pushing/popping variables on the stack in the order of their declaration - it may reorder them however it sees fit, and they can even move within the stack frame (or be replaced) throughout the function - and a seemingly small change in the function may make the compiler alter the stack layout completely. So, even comparing the addresses as you did says nothing about the direction of stack growth.
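If the goal is to reach one int from another through a pointer, the well-defined way is to put both in one array, whose elements are guaranteed to be contiguous. A minimal sketch, with the hypothetical name `set_second`:

```cpp
#include <array>

// Elements of one array are contiguous, so stepping from element 0
// to element 1 is well defined (unlike stepping between separate locals).
void set_second(std::array<int, 2>& values, int x) {
    int* ptr = values.data();  // points at values[0]
    ++ptr;                     // now points at values[1], still inside the array
    *ptr = x;
}
```

Starting from `{50, 50}`, calling `set_second(values, 40)` leaves `values[0]` at 50 and sets `values[1]` to 40, regardless of optimization level or of what else the function prints.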
UB - You have taken a pointer to b and then moved it with ptr++, which means you are pointing to some unknown memory that was never assigned to you, and you try to write to that memory region, which is undefined behavior.
On VS 2008, debugging it step-by-step will throw this message for you which is very self-explanatory::

Why do I get a random number when increasing the integer value of a pointer?

I am an expert C# programmer, but I am very new to C++. I get the basic idea of pointers just fine, but I was playing around. You can get the actual integer value of a pointer by casting it as an int:
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
Which makes sense; it's a memory address. But I can move to the next pointer, and cast it as an int:
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
int* jptr = (int*)((int)iptr + 1);
int j = (int)*iptr;
and I get a seemingly random number (although this is not a good PRNG). What is this number? Is it another number used by the same process? Is it possibly from a different process? Is this bad practice, or disallowed? And if not, is there a use for this? It's kind of cool.
What is this number? Is it another number used by the same process? Is it possibly from a different process?
You cannot generally cast pointers to integers and back and expect them to be dereferencable. Integers are numbers. Pointers are pointers. They are totally different abstractions and are not compatible.
If integers are not large enough to be able to store the internal representation of pointers (which is likely the case; integers are usually 32 bits long and pointers are usually 64 bits long), or if you modify the integer before casting it back to a pointer, your program exhibits undefined behaviour and as such anything can happen.
See C++: Is it safe to cast pointer to int and later back to pointer again?
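For the round trip itself, a minimal sketch using `std::uintptr_t`, the integer type intended to hold a pointer value (it is optional in the standard, but mainstream platforms are assumed here to provide it). The function name `round_trip` is hypothetical. The key point is that the integer is not modified between the two casts.

```cpp
#include <cstdint>

static_assert(sizeof(std::uintptr_t) >= sizeof(void*),
              "uintptr_t must be able to hold a pointer");

// Converts a pointer to an integer and back without modifying the integer in between.
int* round_trip(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits);
}
```

With a plain `int` instead of `std::uintptr_t`, the upper half of a 64-bit address would typically be lost, which is the size mismatch described above.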
Is this bad practice, or disallowed?
Disallowed? Nah.
Bad practice? Terrible practice.
You move beyond i pointer by 4 or 8 bytes and print out the number, which might be another number stored in your program space. The value is unknown and this is Undefined Behavior. Also there is a good chance that you might get an error (that means your program can blow up) [Ever heard of SIGSEGV? The Segmentation violation problem]
You are discovering that random places in memory contain "unknown" data. Not only that, but you may find yourself pointing to memory that your process does not have "rights" to so that even the act of reading the contents of an address can cause a segmentation fault.
In general, if you allocate some memory to a pointer (for example with malloc) you may take a look at these locations (which may have random data "from the last time" in them) and modify them. But accessing data that does not belong explicitly to a pointer's block of memory can result in all kinds of undefined behavior.
Incidentally, if you want to look at the "next" location, just do
NextValue = *(iptr + 1);
Don't do any casting - pointer arithmetic knows (in your case) exactly what the above means: "the contents of the next int location".
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
int* jptr = (int*)((int)iptr + 1);
int j = (int)*iptr;
You can cast int to pointer and back again, and it will give you same value
Is it possibly from a different process? No, it's not, and you can't access the memory of another process except by using ReadProcessMemory and WriteProcessMemory under the Win32 API.
You get a different number because you add 1 to the pointer; try subtracting 1 and you will get the same value.
When you define an integer by
int i = 5;
it means you allocate space on your thread's stack and initialize it to 5. Then you get a pointer to this memory, which is actually a position in your current thread's stack.
When you increase your pointer by 1, it means you point to the next location in your thread's stack, and you read it again as an integer,
int* jptr = (int*)((int)iptr + 1);
int j = (int)*jptr;
Then you will get an integer from your thread's stack which is close to where you defined your int i.
Of course this is not recommended, unless you want to become a hacker and exploit stack overflows (here it means what it is, not the site name, ha!)
Using a pointer to point to a random address is very dangerous. You must not point to an address unless you know what you're doing. You could overwrite its content or you may try to modify a constant in read-only memory which leads to an undefined behaviour...
Pointer arithmetic is useful, for example, when you want to retrieve the elements of an array. But you should not cast the pointer to an integer for that. You just point to the start of the array and increase your pointer by 1 to get the next element.
int arr[5] = {1, 2, 3, 4, 5};
int *p = arr;
printf("%d", *p); // this will print 1
p++; // pointer arithmetics
printf("%d", *p); // this will print 2
It's not "random". It just means that there are some data on the next address
Reading a 32-bit word from an address A will copy the 4 bytes at [A], [A+1], [A+2], [A+3] into a register. But if you dereference an int at [A+1] then the CPU will load the bytes from [A+1] to [A+4]. Since the value of [A+4] is unknown it may make you think that the number is "random"
Anyway this is EXTREMELY dangerous 💀 since
the pointer is misaligned. You may see the program runs fine because x86 allows for unaligned accesses (with some performance penalty). But most other architectures prohibit unaligned operations and your program will just end in segmentation fault. For more information read Purpose of memory alignment, Data Alignment: Reason for restriction on memory address being multiple of data type size
you may not be allowed to touch the next byte as it may be outside of your address space, is write-only, is used for another variable and you changed its value, or whatever other reasons. You'll also get a segfault in that case
the next byte may not be initialized and reading it will crash your application on some architectures
That's why the C and C++ standards state that reading memory outside an array invokes undefined behavior. See
How dangerous is it to access an array out of bounds?
Access array beyond the limit in C and C++
Is accessing a global array outside its bound undefined behavior?
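When four bytes at an arbitrary offset genuinely need to be read as one 32-bit value, the usual well-defined route is to copy them with `std::memcpy` instead of dereferencing a misaligned `int*`. A minimal sketch, assuming the caller guarantees that all four bytes lie inside one valid buffer; the name `load_u32` is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Reads four bytes starting at p into a 32-bit value, whatever p's alignment.
// The caller must guarantee that p[0]..p[3] all lie inside one valid object.
std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);  // byte-wise copy: no aligned load required
    return value;
}
```

Calling `load_u32(buf + 1)` on an 8-byte buffer reads bytes 1 to 4 of that buffer, mirroring the [A+1] to [A+4] case above, but without leaving the object or relying on the CPU tolerating unaligned access. The numeric result still depends on the machine's byte order.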

C++: float value reused across iteration

Let's look at the following piece of code which I unintentionally wrote:
void test (){
for (int i = 1; i <=5; ++i){
float newNum;
newNum +=i;
cout << newNum << " ";
}
}
Now, this is what happened in my head:
I have always been thinking that float newNum would create a new variable newNum with a brand-new value for each iteration since the line is put inside the loop. And since float newNum doesn't throw a compile error, C++ must be assigning some default value (huhm, must be 0). I then expected the output to be "1 2 3 4 5". What was printed was "1 3 6 10 15".
Please help me know what's wrong with my expectation that float newNum would create a new variable for each iteration?
Btw, in Java, this piece of code won't compile due to newNum not initialized and that's probably better for me since I would know I need to set it to 0 to get the expected output.
Since newNum is not initialized explicitly, it will have a random value (determined by the garbage data contained in the memory block it is allocated to), at least on the first iteration.
On subsequent iterations, it may have its earlier values reused (as the compiler may allocate it repeatedly to the same memory location - this is entirely up to the compiler's discretion). Judging from the output, this is what actually happened here: in the first iteration newNum had the value 0 (by pure chance), then 1, 3, 6 and 10, respectively.
So to get the desired output, initialize the variable explicitly inside the loop:
float newNum = 0.0;
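To make the two intents explicit, here is a small sketch that collects the values into a vector instead of printing them. The function names are hypothetical. The first version initializes the variable inside the loop, the second declares it once outside the loop when accumulation is actually wanted.

```cpp
#include <vector>

// Declared and initialized inside the loop: each iteration starts from 0.
std::vector<float> fresh_each_iteration() {
    std::vector<float> out;
    for (int i = 1; i <= 5; ++i) {
        float newNum = 0.0f;
        newNum += i;
        out.push_back(newNum);
    }
    return out;
}

// Declared once outside the loop: the value deliberately accumulates.
std::vector<float> running_total() {
    std::vector<float> out;
    float newNum = 0.0f;
    for (int i = 1; i <= 5; ++i) {
        newNum += i;
        out.push_back(newNum);
    }
    return out;
}
```

The first function yields 1 2 3 4 5 and the second yields 1 3 6 10 15, and neither result depends on what happened to be in memory beforehand.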
C++ must be assigning some default
value (huhm, must be 0)
This is the mistake in your assumptions. C++ doesn't attempt to assign default values, you must explicitly initialise everything.
Most likely it will assign the same location in memory each time around the loop and so (in this simple case) newNum will probably seem to persist from each iteration to the next.
In a more complicated scenario the memory assigned to newNum would be in an essentially random state and you could expect weird behaviour.
http://www.cplusplus.com/doc/tutorial/variables/
The float you are creating is not initialised at all. Looks like you got lucky and it turned out to be zero on the first pass, though it could have had any value in it.
In each iteration of the loop a new float is created, but it uses the same bit of memory as the last one, so ended up with the old value that you had.
To get the effect you wanted, you will need to initialise the float on each pass.
float newNum = 0.0;
The mistake in the thoughts you've expressed is "C++ must be assigning some default value". It will not. newNum contains garbage.
You are using an uninitialized automatic stack variable. In each loop iteration it is located at the same place on the stack, so even though it has an undefined value, in your case it will be the value of the previous iteration.
Also beware that in the first iteration it could potentially have any value, not only 0.0.
You might have got your expected answer in a debug build but not release as debug builds sometimes initialise variables to 0 for you. I think this is undefined behaviour - because C++ doesn't auto initialise variables for you, every time around the loop it is creating a new variable but it keeps using the same memory as it was just released and doesn't scrub out the previous value. As other people have said you could have ended up with complete nonsense printing out.
It is not a compile error to use an uninitialised variable, but there should usually be a warning about it. It is always good to turn warnings on and try to remove all of them in case something nastier is hidden amongst them.
Using garbage value in your code invokes Undefined Behaviour in C++. Undefined Behavior means anything can happen i.e the behavior of the code is not defined.
float newNum; //uninitialized (may contain some garbage value)
newNum +=i; // same as newNum=newNum+i
^^^^^
Whoa!!
So better try this
float newNum=0; //initialize the variable
for (int i = 1; i <=5; ++i){
newNum +=i;
cout << newNum << " ";
}
C++ must be assigning some default value (huhm, must be 0).
C++ doesn't initialize things without default constructor (it may set it to something like 0xcccccccc on debug build, though) - because as any proper tool, compiler "thinks" that if you haven't provided initialization then it is what you wanted. float doesn't have default constructor, so it is unknown value.
I then expected the output to be "1 2 3 4 5". What was printed was "1 3 6 10 15".
Please help me know what's wrong with my expectation that float newNum would create a new variable for each iteration?
A variable is a block of memory. In this case the variable is allocated on the stack. You didn't initialize it, and in each iteration it just happens to be placed at the same memory address, which is why it holds the previous value. Of course, you shouldn't rely on such behavior. If you want the value to persist across iterations, declare it outside of the loop.
Btw, in Java, this piece of code won't compile due to newNum not initialized
BTW, in C++ normal compiler would give you a warning that variable is not initialized (Example: "warning C4700: uninitialized local variable 'f' used"). And on debug build you would get crt debug error (Example: "Run-Time check failure #3 - The variable 'f' is being used without being initialized").
and that's probably better
Negative. You DO need uninitialized variables from time to time (normally - to initialize them without "standard" assignment operator - by passing into function by pointer/reference, for example), and forcing me to initialize every one of them will be a waste of my time.
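A sketch of the out-parameter pattern this refers to: the caller declares the variable without an initializer and only reads it after the function reports success. The function `parse_unsigned` is made up for this example, and the 9-digit limit is a simplifying assumption to avoid overflowing a 32-bit int.

```cpp
#include <string>

// Parses a non-negative decimal number of at most 9 digits.
// On success writes the result to `out` and returns true.
// On failure returns false and leaves `out` untouched.
bool parse_unsigned(const std::string& text, int& out) {
    if (text.empty() || text.size() > 9) return false;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    out = value;
    return true;
}
```

A caller would write `int result;` followed by `if (parse_unsigned("42", result)) { /* result is 42 here */ }`. Reading `result` on the failure path would still be a read of an uninitialized variable.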
I don't do too much C++ since last month, but:
The float value is newly allocated for each iteration. I'm a bit surprised about the initial zero value, though. The thing is, after each iteration the float goes out of scope, and the next step (re-entering the loop scope) first reallocates the float's memory, which will often return the same memory block that was just freed.
(I'm waiting for any bashing :-P)