C++ Stack around the variable x was corrupted - c++

I am new to C++. I am learning some basics.
I tried the below program and got a run time error Stack around the variable x was corrupted.
int x = 56;
int *ptr = &x;
ptr[1]=8;
cout << *ptr << endl;
But if i update the index in line 3 to 0, say ptr[0] = 8, I am not getting any run time error and the console shows 8 as the output.
I assume 2 digits in the integer variable x and thought pointer index will have 0 and 1 as valid values. Why ptr[1] is causing run tume error where as ptr[2], ptr[3] does not cause any run times error and simply shows 56 as o/p.
Can any one help me to understand what is really going on here. May be a better tutorial site as an add on would help me to understand more on this subject.

ptr[1] is actually the integer next to the variable pointed to by ptr. Writing to such memory means overwriting about anything. You may overwrite other variables. Or return addresses. Or stack frames. Or about anything else. Heck, you may overwrite ptr itself, depending on how variables on the stack are arranged.
Undefined behavior. Compilers are allowed to assume it doesn't happen.

Let see what you are doing here (Lets assume int is 4 bytes):
int x = 56;
int *ptr = &x;
ptr[1]=8;
cout << *ptr << endl;
<- 4 bytes->......(rest of the stack)
----------------
| x | |
----------------
^ ^
ptr[0] ptr[1]
So, pre[1] is writing to a memory location which does not yet exist. So, you are writing data out-of-bound.

Presumably you were expecting ptr[1] to mean the second byte in x. But that's not how pointer arithmetic works. The pointer is an int*, so arithmetic is performed in "chunks" of int-sizes. Therefore, ptr[1] is the non-existent integer "next to" x.
You could probably see this working by making ptr a char* instead, but be careful because this is real hackery and probably not a good idea unless you really know what you're doing.
A further misconception is your indication that the number of decimal digits in the human-readable representation of x's value has anything to do with the number of bytes taking up by x in memory; it doesn't.

Related

C++/Address Space: 2 Bytes per adress?

I was just trying something and i was wondering how this could be. I have the following Code:
int var1 = 132;
int var2 = 200;
int *secondvariable = &var2;
cout << *(secondvariable+2) << endl << sizeof(int) << endl;
I get the Output
132
4
So how is it possible that the second int is only 2 addresses higher? I mean shouldn't it be 4 addresses? I'm currently under WIN10 x64.
Regards
With cout << *(secondvariable+2) you don't print a pointer, you print the value at secondvariable[2], which is an invalid indexing and lead to undefined behavior.
If you want to print a pointer then drop the dereference and print secondvariable+2.
While you already are far in the field of undefined behaviour (see Some programmer dude's answer) due to indexing an array out of bounds (a single variable is considered an array of length 1 for such matters), some technical background:
Alignment! Compilers are allowed to place variables at addresses such that they can be accessed most efficiently. As you seem to have gotten valid output by adding 2*sizeof(int) to the second variable's address, you apparently have reached the first one by accident. Apparently, the compiler decided to leave a gap in between the two variables so that both can be aligned to addresses dividable by 8.
Be aware, though, that you don't have any guarantee for such alignment, different compilers might decide differently (or same compiler on another system), and alignment even might be changed via compiler flags.
On the other hand, arrays are guaranteed to occupy contiguous memory, so you would have gotten the expected result in the following example:
int array[2];
int* a0 = &array[0];
int* a1 = &array[1];
uintptr_t diff = static_cast<uintptr_t>(a1) - static_cast<uintptr_t>(a0);
std::cout << diff;
The cast to uintptr_t (or alternatively to char*) assures that you get address difference in bytes, not sizes of int...
This is not how C++ works.
You can't "navigate" your scope like this.
Such pointer antics have completely undefined behaviour and shall not be relied upon.
You are not punching holes in tape now, you are writing a description of a program's semantics, that gets converted by your compiler into something executable by a machine.
Code to these abstractions and everything will be fine.

C++ / Arduino understanding the usage of uint8_t and *

Can someone explain to me what's going on in this code block? Specifically on line 3. I have a hunch the * before ptr is significant. And (uint8_t *) looks like a cast to a byte... But what's up with the *? It also looks like r, g, and b will all evaluate to the same value.
case TRUECOLOR: { // 24-bit ('truecolor') image (no palette)
uint8_t pixelNum, r, g, b,
*ptr = (uint8_t *)&imagePixels[imageLine * NUM_LEDS * 3];
for(pixelNum = 0; pixelNum < NUM_LEDS; pixelNum++) {
r = *ptr++;
g = *ptr++;
b = *ptr++;
strip.setPixelColor(pixelNum, r, g, b);
}
I work primarily in C#.
The second and third line can be expressed more cleanly:
uint8_t pixelNum;
uint8_t r;
uint8_t g;
uint8_t b;
uint8_t *ptr = (uint8_t *)&imagePixels[imageLine * NUM_LEDS * 3];
The first four variable declarations should be fairly simple, the fifth one is something C# does not have. It declares ptr as a pointer to a uint8_t. This pointer is set to the address of the value which is the imageLine * NUM_LEDS * 3th element in the imagePixels array. As this might be a different type (maybe a pointer to a char, who knows), this value is cast to a pointer to an uint8_t.
The next occurence of the asterisk (*) is in the for-loop body, where it is used as the dereference operator, which basically resolves a pointer to get the actual value.
Pointers 101
A pointer is like the street address of a house. It shows you where the house is so you can find it, but when you pass it around, you don't pass around the whole house. You can dereference it, meaning you can actually visit the house.
The two operators used in conjunction with pointers are the asterisk (*) and the ampersand (&). The asterisk is used in declarations of pointers and to dereference a pointer, the ampersand is used to get the address of something.
Take a look at the following example:
int x = 12;
int *y = &x;
std::cout << "X is " << *y; // Will print "X is 12"
We declare x as an int holding the value 12. Now we declare y as a pointer to an int, and set it to point at x by storing x's address. By using *y, we access the actual value of x, the int that y points at.
Since a pointer is a type of reference, modifying the value via the pointer changes the actual value of the thing pointed at.
int x = 12;
int *y = &x;
*y = 10;
std::cout << "X is " << x; // Will print "X is 10"
Pointers 102
Pointers are a large topic, and I suggest you take your time to read about them from different sources if necessary.
Used in a variable definition, the * means ptr is a pointer. The value it stores is an address in memory for another variable or a part of another variable. In this case ptr is a pointer to a block of memory inside imagePixels and from the names of the variables involved it's a line in an image. Since the type is uint8_t, this is taking whatever imagePixels is and using it as a block of individual bytes.
Used outside a varable definition, the * takes on a different meaning: dereference the pointer. Go to the location in memory stored in the pointer and get the value.
And yeah, * can also be used for multiplication, upping the code-reading fun level.
Incrementing (++) a pointer moves the address to the next address. If you had a uint32_t * the address would advance by 4 to point at the next uint32_t. In this case we have uint8_t, so the address is advanced one byte. So
r = *ptr++;
A) Get value at pointer.
After A) Advance the pointer.
After A) Assign value to r.
Exactly where the "advance the pointer" stage goes is tricky. It is after step A. In C++17 or greater it is before "Assign the value" because there is now a separation between the stuff on the right and the stuff on the left of an equals sign. But before C++17 all we can say is it's after step A. Search keyterm: "Sequence Points".
g = *ptr++;
b = *ptr++;
Do it again, get and assign the current value at ptr, advance the pointer.
strip.setPixelColor(pixelNum, r, g, b);
From the naming I presume this sets a given pixel to the colours read above.
You can't just
strip.setPixelColor(pixelNum, *ptr++, *ptr++, *ptr++);
Because of sequencing again. There are no guarantees of the order in which the parameters will be computed. This is to allow compiler developers to make optimizations for speed and size that they cannot if the ordering is specified, but it's a kick in the teeth to those expecting left-to-right resolution. My understanding is this still holds true in the C++17 standard.
OK. So what is this doing?
There is a big block of memory from which you want one and only one line.
*ptr = (uint8_t *)&imagePixels[imageLine * NUM_LEDS * 3];
pinpoints the beginning of that line and sets it up to be treated like a dumb array of bytes.
for(pixelNum = 0; pixelNum < NUM_LEDS; pixelNum++) {
Generic for loop. For all the pixels on the line of LEDs.
r = *ptr++;
g = *ptr++;
b = *ptr++;
Get the colour of one pixel on the line in the standard 8 bit RGB format and point at the next pixel
strip.setPixelColor(pixelNum, r, g, b);
writes the read colour to one pixel.
The for loop will then loop around and start working on the next pixel until there are no more pixels on the line.
The asterisk(*) is the symbol for a pointer. So the (uint8_t *) is a cast to a pointer that is pointing to a uint8_t. Then within the loop, where the asterisk is prefixed to a symbol (ie *ptr) that is dereferencing that pointer. Dereferencing the pointer returns the data that the pointer is pointing to.
I suggest reading a bit about pointers as they are critical to understanding C/C++. Here is the C++ Docs on Pointers
MildlyInformed, I would need more code to run through it to explain it. One tool I found really, really useful though is the C visualizer. It's an online debug tool that helps you figure out what's happening in code by running you through step-by-step, line by line. It can be found at: http://www.pythontutor.com/visualize.html#mode=edit
Even though the URL talks about python, it can do C and a bunch of languages. I would have commented instead of posting an answer, but my rep isn't high enough. I hope this helps!
(I'm not affiliated with the above website, other than to use it occasionally when I'm baffled)

Why do I get a random number when increasing the integer value of a pointer?

I am an expert C# programmer, but I am very new to C++. I get the basic idea of pointers just fine, but I was playing around. You can get the actual integer value of a pointer by casting it as an int:
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
Which makes sense; it's a memory address. But I can move to the next pointer, and cast it as an int:
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
int* jptr = (int*)((int)iptr + 1);
int j = (int)*iptr;
and I get a seemingly random number (although this is not a good PSRG). What is this number? Is it another number used by the same process? Is it possibly from a different process? Is this bad practice, or disallowed? And if not, is there a use for this? It's kind of cool.
What is this number? Is it another number used by the same process? Is it possibly from a different process?
You cannot generally cast pointers to integers and back and expect them to be dereferencable. Integers are numbers. Pointers are pointers. They are totally different abstractions and are not compatible.
If integers are not large enough to be able to store the internal representation of pointers (which is likely the case; integers are usually 32 bits long and pointers are usually 64 bits long), or if you modify the integer before casting it back to a pointer, your program exhibits undefined behaviour and as such anything can happen.
See C++: Is it safe to cast pointer to int and later back to pointer again?
Is this bad practice, or disallowed?
Disallowed? Nah.
Bad practice? Terrible practice.
You move beyond i pointer by 4 or 8 bytes and print out the number, which might be another number stored in your program space. The value is unknown and this is Undefined Behavior. Also there is a good chance that you might get an error (that means your program can blow up) [Ever heard of SIGSEGV? The Segmentation violation problem]
You are discovering that random places in memory contain "unknown" data. Not only that, but you may find yourself pointing to memory that your process does not have "rights" to so that even the act of reading the contents of an address can cause a segmentation fault.
In general is you allocate some memory to a pointer (for example with malloc) you may take a look at these locations (which may have random data "from the last time" in them) and modify them. But data that does not belong explicitly to a pointer's block of memory can behave all kings of undefined behavior.
Incidentally if you want to look at the "next" location just to
NextValue = *(iptr + 1);
Don't do any casting - pointer arithmetic knows (in your case) exactly what the above means : " the contents of the next I refer location".
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
int* jptr = (int*)((int)iptr + 1);
int j = (int)*iptr;
You can cast int to pointer and back again, and it will give you same value
Is it possibly from a different process? no it's not, and you can't access memory of other process except using readProcessMemmory and writeProcessMemory under win32 api.
You get other number because you add 1 to the pointer, try to subtract 1 and you will same value.
When you define an integer by
int i = 5;
it means you allocate a space in your thread stack, and initialize it as 5. Then you get a pointer to this memory, which is actually a position in you current thread stack
When you increase your pointer by 1, it means you point to the next location in your thread stack, and you parse it again as an integer,
int* jptr = (int*)((int)iptr + 1);
int j = (int)*jptr;
Then you will get an integer from you thread stack which is close to where you defined your int i.
Of course this is not suggested to do, unless you want to become an hacker and want to exploit stack overflow (here it means what it is, not the site name, ha!)
Using a pointer to point to a random address is very dangerous. You must not point to an address unless you know what you're doing. You could overwrite its content or you may try to modify a constant in read-only memory which leads to an undefined behaviour...
This for example when you want to retrieve the elements of an array. But cannot cast a pointer to integer. You just point to the start of the array and increase your pointer by 1 to get the next element.
int arr[5] = {1, 2, 3, 4, 5};
int *p = arr;
printf("%d", *p); // this will print 1
p++; // pointer arithmetics
printf("%d", *p); // this will print 2
It's not "random". It just means that there are some data on the next address
Reading a 32-bit word from an address A will copy the 4 bytes at [A], [A+1], [A+2], [A+3] into a register. But if you dereference an int at [A+1] then the CPU will load the bytes from [A+1] to [A+4]. Since the value of [A+4] is unknown it may make you think that the number is "random"
Anyway this is EXTREMELY dangerous 💀 since
the pointer is misaligned. You may see the program runs fine because x86 allows for unaligned accesses (with some performance penalty). But most other architectures prohibit unaligned operations and your program will just end in segmentation fault. For more information read Purpose of memory alignment, Data Alignment: Reason for restriction on memory address being multiple of data type size
you may not be allowed to touch the next byte as it may be outside of your address space, is write-only, is used for another variable and you changed its value, or whatever other reasons. You'll also get a segfault in that case
the next byte may not be initialized and reading it will crash your application on some architectures
That's why the C and C++ standard state that reading memory outside an array invokes undefined behavior. See
How dangerous is it to access an array out of bounds?
Access array beyond the limit in C and C++
Is accessing a global array outside its bound undefined behavior?

Alocating local variables on stack & using pointer arithemtic

I read that in function the local variables are put on stack as they are defined after the parameters has been put there first.
This is mentioned also here
5 .All function arguments are placed on the stack. 6.The instructions
inside of the function begin executing. 7.Local variables are pushed
onto the stack as they are defined.
So I excpect that if the C++ code is like this:
#include "stdafx.h"
#include <iostream>
int main()
{
int a = 555;
int b = 666;
int *p = &a;
std::cout << *(p+1);
return 0;
}
and if integer here has 4 bytes and we call the memory space on stack that contains first 8 bits of int 555 x, then 'moving' another 4 bytes to the top of the stack via *(p+1) we should be looking into memory at address x + 4.
However, the output of this is -858993460 - an that is always like that no matter what value int b has. Evidently its some standard value. Of course I am accessing a memory which I should not as for this is the variable b. It was just an experiment.
How come I neither get the expected value nor an illegal access error?
Where is my assumption wrong?
What could -858993460 represent?
What everyone else has said (i.e. "don't do that") is absolutely true. Don't do that. However, to actually answer your question, p+1 is most likely pointing at either a pointer to the caller's stack frame or the return address itself. The system-maintained stack pointer is decremented when you push something on it. This is implementation dependent, officially speaking, but every stack pointer I've ever seen (this is since the 16-bit era) has been like this. Thus, if as you say, local variables are pushed on the stack as they are initialized, &a should == &b + 1.
Perhaps an illustration is in order. Suppose I compile your code for 32 bit x86 with no optimizations, and the stack pointer esp is 20 (this is unlikely, for the record) before I call your function. This is what memory looks like right before the line where you invoke cout:
4: 12 (value of p)
8: 666 (value of b)
12: 555 (value of a)
16: -858993460 (return address)
p+1, since p is an int*, is 16. The memory at this location isn't read protected because it's needed to return to the calling function.
Note that this answer is academic; it's possible that the compiler's optimizations or differences between processors caused the unexpected result. However, I would not expect p+1 to == &b on any processor architecture with any calling convention I've ever seen because the stack usually grows downward.
Your assumptions are true in theory (From the CS point of view).
In practice there is no guarantee to do pointer arithmetic in that way expecting those results.
For example, your asumption "All function arguments are placed on the stack" is not true: The allocation of function argumments is implementation-defined (Depending on the architecture, it could use registers or the stack), and also the compiler is free to allocate local variables in registers if it feels necesary.
Also the asumption "int size is 4 bytes, so adding 4 to the pointer goes to b" is false. The compiler could have added padding between a and b to ensure memory aligment.
The conclusion here is: Don't use low-level tricks, they are implementation-defined. Even if you have to (Regardless of our advises) do it, you have to know how the compiler works and how it generates the code.

Where is pointer metadata stored?

Could be that I am overlooking something obvious, but where is pointer metadata stored? For instance if I have a 32-bit int pointer ptr and I execute ptr++ it knows to advance 4 bytes in memory. However, if I have a 64-bit int pointer it knows to advance 8 bytes. So who keeps track of what type of pointer ptr is and where is it stored? For simplicity you can limit this to C++.
It isn't stored anywhere, per-se. The compiler looks at the type of the ptr and turns the ++ operation into an increment of the correct number of bytes.
In the symbol table while the compiler runs. Nowhere while your program runs, or rather it is implicit in the lower level code produced by the compiler.
It's not stored anywhere, it's determined at compile time. In fact, take this code as an example:
int *abc = NULL;
cout << abc + 1; /* Prints sizeof(int) */
cout << (void *)((char *)abc + 1); /* Prints 1. Casting it back to void * is necessary,
otherwise it will try to dereference it and print as a string. */