Can anyone please explain what this C++ code is doing? - c++

char b = 'a';
int *a = (int*)&b;
std::cout << *a;
What could be the content of *a? It is showing garbage value. Can you anyone please explain. Why?

Suppose char takes one byte in memory and int takes two bytes (the exact number of bytes depends of the platform, but usually they are not same for char and int). You set a to point to the memory location same as b. In case of b dereferencing will consider only one byte because it's of type char. In case of a dereferencing will access two bytes and thus will print the integer stored at these locations. That's why you get a garbage: first byte is 'a', the second is random byte - together they give you a random integer value.

Either the first or the last byte should be hex 61 depending on byte order. The other three bytes are garbage. best to change the int to an unsigned int and change the cout to hex.
I don't know why anyone would want to do this.

You initialize a variable with the datatype char ...
a char in c++ should have 1 Byte and an int should contain 2 Byte. Your a points to the address of the b variable... an adress should be defined as any hexadecimal number. Everytime you call this "program" there should be any other hexadecimal number, because the scheduler assigns any other address to your a variable if you start this program new.

Think of it as byte blocks. Char has one byte block (8 bits). If you set a conversion (int*) then you get the next 7 byte blocks from the char's address. Therefore you get 7 random byte blocks which means you'll get a random integer. That's why you get a garbage value.

The code invokes undefined behavior, garbage is a form of undefined behavior, but your program could also cause a system violation and crash with more consequences.
int *a = (int*)&b; initializes a pointer to int with the address of a char. Dereferencing this pointer will attempt to read an int from that address:
If the address is misaligned and the processor does not support misaligned accesses, you may get a system specific signal or exception.
If the address is close enough to the end of a segment that accessing beyond the first byte causes a segment violation, that's what you can get.
If the processor can read the sizeof(int) bytes at the address, only one of those will be a, (0x61 in ASCII) but the others have undetermined values (aka garbage). As a matter of fact, on some architectures, reading from uninitialized memory may cause problems: under valgrind for example, this will cause a warning to be displayed to the user.
All the above are speculations, undefined behavior means anything can happen.

Related

std::copy resulting in "Stack around the variable " was corrupted" error

I am currently trying to copy 4 bytes out of a vector<BYTE> into an integer value. When my function returns, I continually get an error that my stack is corrupted (Stack around the variable 'rxPacket' was corrupted). I am running in debug mode and debugging my DLL. I stripped the function down to something very basic shown below, but I still get the resulting error. I did find this issue and I'm wondering if I am experiencing something similar. But, I wanted to check to see if there was something obvious I am missing.
AtpSocket::RxPacket AtpSocket::sendAndWait(AtpSocket::Command cmd,const void* data,size_t dataLen,int timeout) {
AtpSocket::TxPacket txPacket;
AtpSocket::RxPacket rxPacket;
int rxCommandStringLength = 0;
std::vector<BYTE> readBuffer(20, 55);
std::reverse_copy(readBuffer.begin(), readBuffer.begin() + sizeof rxCommandStringLength, &rxCommandStringLength);
return rxPacket;
}
rxCommandStringLength is an int, so &rxCommandStringLength is an int*. Let's just assume for the sake of argument that sizeof(int) is 4 bytes 1.
1: to really ensure that exactly 4 bytes are being copied, you should be using int32_t instead of int, since int is not guaranteed to be 4 bytes on all platforms.
Iterators (which includes raw pointers) are incremented/decremented in terms of whole elements, not bytes. The element type is whatever type is returned when the iterator is dereferenced.
Since your input iterator is a vector::iterator to a BYTE array, and your output iterator is an int* pointer, reverse_copy() will iterate through the source array in whole BYTEs and through the destination memory in whole ints, not in single bytes like you are expecting. In other words, when reverse_copy() increments the input iterator it will jump forward by 1 byte, but when it increments the destination iterator it will jump forward in memory by 4 bytes.
In this example, reverse_copy() will read the 1st input BYTE and write the value to the 1st output int, then read the 2nd input BYTE and write the value to the 2nd output int, and so on, and so on. By the end, the input iterator will have advanced by 4 BYTEs, and the destination pointer will have advanced by 4 ints - 16 bytes total, which goes WAY beyond the bounds of the actual rxCommandStringLength variable, and thus reverse_copy() will have written into surrounding memory, corrupting the stack (if it doesn't just crash instead).
Since you want reverse_copy() to iterate in 1-byte increments for BOTH input AND output, you need to type-cast the int* pointer to match the data type used by the input iterators, eg:
std::reverse_copy(readBuffer.begin(), readBuffer.begin() + sizeof rxCommandStringLength, reinterpret_cast<BYTE*>(&rxCommandStringLength));

C++/Address Space: 2 Bytes per adress?

I was just trying something and i was wondering how this could be. I have the following Code:
int var1 = 132;
int var2 = 200;
int *secondvariable = &var2;
cout << *(secondvariable+2) << endl << sizeof(int) << endl;
I get the Output
132
4
So how is it possible that the second int is only 2 addresses higher? I mean shouldn't it be 4 addresses? I'm currently under WIN10 x64.
Regards
With cout << *(secondvariable+2) you don't print a pointer, you print the value at secondvariable[2], which is an invalid indexing and lead to undefined behavior.
If you want to print a pointer then drop the dereference and print secondvariable+2.
While you already are far in the field of undefined behaviour (see Some programmer dude's answer) due to indexing an array out of bounds (a single variable is considered an array of length 1 for such matters), some technical background:
Alignment! Compilers are allowed to place variables at addresses such that they can be accessed most efficiently. As you seem to have gotten valid output by adding 2*sizeof(int) to the second variable's address, you apparently have reached the first one by accident. Apparently, the compiler decided to leave a gap in between the two variables so that both can be aligned to addresses dividable by 8.
Be aware, though, that you don't have any guarantee for such alignment, different compilers might decide differently (or same compiler on another system), and alignment even might be changed via compiler flags.
On the other hand, arrays are guaranteed to occupy contiguous memory, so you would have gotten the expected result in the following example:
int array[2];
int* a0 = &array[0];
int* a1 = &array[1];
uintptr_t diff = static_cast<uintptr_t>(a1) - static_cast<uintptr_t>(a0);
std::cout << diff;
The cast to uintptr_t (or alternatively to char*) assures that you get address difference in bytes, not sizes of int...
This is not how C++ works.
You can't "navigate" your scope like this.
Such pointer antics have completely undefined behaviour and shall not be relied upon.
You are not punching holes in tape now, you are writing a description of a program's semantics, that gets converted by your compiler into something executable by a machine.
Code to these abstractions and everything will be fine.

Why do I get a random number when increasing the integer value of a pointer?

I am an expert C# programmer, but I am very new to C++. I get the basic idea of pointers just fine, but I was playing around. You can get the actual integer value of a pointer by casting it as an int:
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
Which makes sense; it's a memory address. But I can move to the next pointer, and cast it as an int:
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
int* jptr = (int*)((int)iptr + 1);
int j = (int)*iptr;
and I get a seemingly random number (although this is not a good PSRG). What is this number? Is it another number used by the same process? Is it possibly from a different process? Is this bad practice, or disallowed? And if not, is there a use for this? It's kind of cool.
What is this number? Is it another number used by the same process? Is it possibly from a different process?
You cannot generally cast pointers to integers and back and expect them to be dereferencable. Integers are numbers. Pointers are pointers. They are totally different abstractions and are not compatible.
If integers are not large enough to be able to store the internal representation of pointers (which is likely the case; integers are usually 32 bits long and pointers are usually 64 bits long), or if you modify the integer before casting it back to a pointer, your program exhibits undefined behaviour and as such anything can happen.
See C++: Is it safe to cast pointer to int and later back to pointer again?
Is this bad practice, or disallowed?
Disallowed? Nah.
Bad practice? Terrible practice.
You move beyond i pointer by 4 or 8 bytes and print out the number, which might be another number stored in your program space. The value is unknown and this is Undefined Behavior. Also there is a good chance that you might get an error (that means your program can blow up) [Ever heard of SIGSEGV? The Segmentation violation problem]
You are discovering that random places in memory contain "unknown" data. Not only that, but you may find yourself pointing to memory that your process does not have "rights" to so that even the act of reading the contents of an address can cause a segmentation fault.
In general is you allocate some memory to a pointer (for example with malloc) you may take a look at these locations (which may have random data "from the last time" in them) and modify them. But data that does not belong explicitly to a pointer's block of memory can behave all kings of undefined behavior.
Incidentally if you want to look at the "next" location just to
NextValue = *(iptr + 1);
Don't do any casting - pointer arithmetic knows (in your case) exactly what the above means : " the contents of the next I refer location".
int i = 5;
int* iptr = &i;
int ptrValue = (int)iptr;
int* jptr = (int*)((int)iptr + 1);
int j = (int)*iptr;
You can cast int to pointer and back again, and it will give you same value
Is it possibly from a different process? no it's not, and you can't access memory of other process except using readProcessMemmory and writeProcessMemory under win32 api.
You get other number because you add 1 to the pointer, try to subtract 1 and you will same value.
When you define an integer by
int i = 5;
it means you allocate a space in your thread stack, and initialize it as 5. Then you get a pointer to this memory, which is actually a position in you current thread stack
When you increase your pointer by 1, it means you point to the next location in your thread stack, and you parse it again as an integer,
int* jptr = (int*)((int)iptr + 1);
int j = (int)*jptr;
Then you will get an integer from you thread stack which is close to where you defined your int i.
Of course this is not suggested to do, unless you want to become an hacker and want to exploit stack overflow (here it means what it is, not the site name, ha!)
Using a pointer to point to a random address is very dangerous. You must not point to an address unless you know what you're doing. You could overwrite its content or you may try to modify a constant in read-only memory which leads to an undefined behaviour...
This for example when you want to retrieve the elements of an array. But cannot cast a pointer to integer. You just point to the start of the array and increase your pointer by 1 to get the next element.
int arr[5] = {1, 2, 3, 4, 5};
int *p = arr;
printf("%d", *p); // this will print 1
p++; // pointer arithmetics
printf("%d", *p); // this will print 2
It's not "random". It just means that there are some data on the next address
Reading a 32-bit word from an address A will copy the 4 bytes at [A], [A+1], [A+2], [A+3] into a register. But if you dereference an int at [A+1] then the CPU will load the bytes from [A+1] to [A+4]. Since the value of [A+4] is unknown it may make you think that the number is "random"
Anyway this is EXTREMELY dangerous ๐Ÿ’€ since
the pointer is misaligned. You may see the program runs fine because x86 allows for unaligned accesses (with some performance penalty). But most other architectures prohibit unaligned operations and your program will just end in segmentation fault. For more information read Purpose of memory alignment, Data Alignment: Reason for restriction on memory address being multiple of data type size
you may not be allowed to touch the next byte as it may be outside of your address space, is write-only, is used for another variable and you changed its value, or whatever other reasons. You'll also get a segfault in that case
the next byte may not be initialized and reading it will crash your application on some architectures
That's why the C and C++ standard state that reading memory outside an array invokes undefined behavior. See
How dangerous is it to access an array out of bounds?
Access array beyond the limit in C and C++
Is accessing a global array outside its bound undefined behavior?

What does *(int*) mean in C++?

I encountered the following line in a OpenGL tutorial and I wanna know what does the *(int*) mean and what is its value
if ( *(int*)&(header[0x1E])!=0 )
Let's take this a step at a time:
header[0x1E]
header must be an array of some kind, and here we are getting a reference to the 0x1Eth element in the array.
&(header[0x1E])
We take the address of that element.
(int*)&(header[0x1E])
We cast that address to a pointer-to-int.
*(int*)&(header[0x1E])
We dereference that pointer-to-int, yielding an int by interpreting the first sizeof(int) bytes of header, starting at offset 0x1E, as an int and gets the value it finds there.
if ( *(int*)&(header[0x1E])!=0 )
It compares that resulting value to 0 and if it isn't 0, executes whatever is in the body of the if statement.
Note that this is potentially very dangerous. Consider what would happen if header were declared as:
double header [0xFF];
...or as:
int header [5];
It's truly a terrible piece of code, but what it's doing is:
&(header[0x1E])
takes the address of the (0x1E + 1)th element of array header, let's call it addr:
(int *)addr
C-style cast this address into a pointer to an int, let's call this pointer p:
*p
dereferences this memory location as an int.
Assuming header is an array of bytes, and the original code has been tested only on intel, it's equivalent with:
header[0x1E] + header[0x1F] << 8 + header[0x20] << 16 + header[0x21] << 24;
However, besides the potential alignment issues the other posters mentioned, it has at least two more portability problems:
on a platform with 64 bit ints, it will make an int out of bytes 0x1E to 0x25 instead of the above; it will be also wrong on a platform with 16 bit ints, but I suppose those are too old to matter
on a big endian platform the number will be wrong, because the bytes will get reversed and it will end up as:
header[0x1E] << 24 + header[0x1F] << 16 + header[0x20] << 8 + header[0x21];
Also, if it's a bmp file header as rici assumed, the field is probably unsigned and the cast is done to a signed int. In this case it doesn't matter as it's being compared to zero, but in some other case it may.

Integer to Character conversion in C

Lets us consider this snippet:
int s;
scanf("%c",&s);
Here I have used int, and not char, for variable s, now for using s for character conversion safely I have to make it char again because when scanf reads a character it only overwrites one byte of the variable it is assigning it to, and not all four that int has.
For conversion I could use s = (char)s; as the next line, but is it possible to implement the same by subtracting something from s ?
What you've done is technically undefined behaviour. The %c format calls for a char*, you've passed it an int* which will (roughly speaking) be reinterpreted. Even assuming that the pointer value is still good after reinterpreting, storing an arbitrary character to the first byte of an int and then reading it back as int is undefined behaviour. Even if it were defined, reading an int when 3 bytes of it are uninitialized, is undefined behaviour.
In practice it probably does something sensible on your machine, and you just get garbage in the top 3 bytes (assuming little-endian).
Writing s = (char)s converts the value from int to char and then back to int again. This is implementation-defined behaviour: converting an out-of-range value to a signed type. On different implementations it might clean up the top 3 bytes, it might return some other result, or it might raise a signal.
The proper way to use scanf is:
char c;
scanf("%c", &c);
And then either int s = c; or int s = (unsigned char)c;, according to whether you want negative-valued characters to result in a negative integer, or a positive integer (up to 255, assuming 8-bit char).
I can't think of any good reason for using scanf improperly. There are good reasons for not using scanf at all, though:
int s = getchar();
Are you trying to convert a digit to its decimal value? If so, then
char c = '8';
int n = c - '0';
n should 8 at this point.
That's probably not a good idea; GCC gives me a warning for that code:
main.c:10: warning: format โ€˜%cโ€™ expects type โ€˜char *โ€™, but
argument 2 has type โ€˜int *โ€™
In this case you're ok since you're passing a pointer to more space than you need (for most systems), but what if you did it the other way around? Could be crash city. If you really want to do something like what you have there, just do the typecast or mask it - the mask will be endian-dependent.
As written this won't work reliably . The argument, &s, to scanf is a pointer to int and scanf is expecting a pointer to char. The two data type (int and char) have different sizes (at least on most architectures) so the data may get put in the wrong spot in memeory, and the other part of s may not get properly cleared.
The answers suggesting manipulation of the result after using a pointer to int rely on unspecified behavior (i.e. that scanf will put the character value it has in the least significant byte of the int you're pointing to), and are not safe.
Not but you could use the following:
s = s & 0xFF
That will blank out all of the data except the first byte. But in general all these ideas (and the ones above) are bad ideas, since not all systems store the lowest part of the integer in memory first. So if you ever have to port this code to a big endian system, you'll be screwed.
True, you may never have to port the code, but why write unportable code to begin with?
See this for more info:
http://en.wikipedia.org/wiki/Endianness