Pointer typecasting in C++

#include <iostream>
#define print(x) std::cout << x;
#define println(x) std::cout << x << std::endl;

int main() {
    int ex[5];
    int* ptr = ex;
    for (int i = 0; i < 5; i++) {
        ex[i] = 2;
    }
    ex[2] = 3;
    *(int*)((char*)ptr + 8) = 4;
    println(ex[2]);
}
On line 13 I'm using (char*), and when I run println(sizeof(char*)) it says it's 4 bytes, but my instructor says it's 1 byte long, which is why we need to add 8 bytes to access the value in ex[2]. How can this be? I don't understand! :/

It depends on the architecture you use. Also note that sizeof(char*) measures the size of a pointer to char (4 bytes on a 32-bit platform), not the size of a char itself. By definition char is the type that has size 1, so sizeof(char) evaluates to 1, but that does not automatically mean it is 8 bits.
To access the next value, you must add sizeof(int) to the pointer so that your code works independently of the architecture it runs on.
When you work with pointers, you tell the compiler that the value the pointer points to occupies the space of that type in memory, and that the next thing in memory is that many units (bytes) further on. So if you cast your int pointer to a char pointer, you have to add sizeof(int) to the char pointer to get the same effect as adding 1 to the int pointer. This works because char has size 1 by definition; with any type other than char it would not, since there is no architecture-independent specification of the sizes of the other types.
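To illustrate, here's a minimal sketch of the architecture-independent version of the line in question (the values written are just for illustration):

#include <iostream>

int main() {
    int ex[5] = {2, 2, 3, 2, 2};
    int* ptr = ex;

    // Equivalent ways of writing to ex[2]; the char* version must step past
    // two whole ints, so the byte offset is 2 * sizeof(int), which is 8 only
    // on platforms where int is 4 bytes.
    *(int*)((char*)ptr + 2 * sizeof(int)) = 4;   // architecture-independent byte offset
    *(ptr + 2) = 5;                              // plain int pointer arithmetic
    ex[2] = 6;                                   // ordinary indexing

    std::cout << ex[2] << std::endl;             // prints 6, the last value written
}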

Related

Pointer Arithmetic (adding ints to arrays)

So I read online that if you have an array like
int arr[3] = {1,2,3};
I read that if you take (arr+n) for any number n, it will just add sizeof(n) to the address, so if n is an integer (taking up 4 bytes) it will just add 4, right? But then I also experimented on my own, read some more, and found that
arr[i] == *(arr+i) for any i, which means it's not just adding sizeof(i), so how exactly is this working?
Because obviously arr[0] == *(arr+0) and arr[1] == *(arr+1), so it's not just adding sizeof(the number). What is it doing?
I read that if you take (arr+n) any number n it will just add sizeof(n) to the address
This is wrong. When n is an int (or an integer literal of type int), then sizeof(n) is the same as sizeof(int), and that is a compile-time constant. What actually happens is that first arr decays to a pointer to the first element of the array. Then sizeof(int) * n is added to the pointer's value (because the element type is int):
 1      2      3
 ^             ^
 |             |
arr          arr+2
This is because each element in the array occupies sizeof(int) bytes, and to get to the memory address of the next element you have to add sizeof(int).
[...] and read some more stuff and found that arr[i] == *(arr+i)
This is correct. For C-arrays, arr[i] is just a shorthand way of writing *(arr+i).
When you write some_pointer + x, how much the pointer value is incremented depends on the type of the pointer. Consider this example:
#include <iostream>

int main() {
    int* x = 0;
    double* y = 0;
    std::cout << x + 2 << "\n";
    std::cout << y + 2 << "\n";
}
Possible output is
0x8
0x10
because x is incremented by 2 * sizeof(int) while y is incremented by 2 * sizeof(double). That's also the reason why you get different results here:
#include <iostream>

int main() {
    int x[] = {1, 2, 3};
    std::cout << &x + 1 << "\n";
    std::cout << &x[0] + 1;
}
However, note that you get different output with int* x = new int[3]{1,2,3}; because then x is just an int* that points to the first element; it is not an array. This distinction between arrays and pointers causes much confusion. It is important to understand that arrays are not pointers, but they often decay to pointers to their first element.
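As a minimal sketch of that distinction (the names are just for illustration):

#include <iostream>

int main() {
    int arr[3] = {1, 2, 3};   // an array object: 3 ints, sizeof(arr) == 3 * sizeof(int)
    int* p = arr;             // arr decays to a pointer to its first element

    std::cout << sizeof(arr) << "\n";  // size of the whole array, e.g. 12
    std::cout << sizeof(p) << "\n";    // size of a pointer, e.g. 8

    std::cout << arr + 1 << "\n";      // address of arr[1]: first element + sizeof(int)
    std::cout << &arr + 1 << "\n";     // one whole array further: + 3 * sizeof(int)
}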

Calculate the offset of the pointer in C++

I am a C++ novice.
When I did some pointer arithmetic on the address of the Player structure, the result surprised me:
struct Player
{
    const char* Name = "ab";
    uintptr_t Health = 6;
    uintptr_t Coins = 3;
} player;

std::cout << &player << std::endl;                  // 0100C000
uintptr_t* playerBaseAddress = (uintptr_t*)&player;
std::cout << playerBaseAddress << std::endl;        // 0100C000
std::cout << (playerBaseAddress + 4) << std::endl;  // 0100C010
0100C000 + 4: how do I get 0100C004 instead of 0100C010?
Can someone explain that, please?
Like this:
uintptr_t playerBaseAddress = (uintptr_t)&player;
In your version you have a pointer, so when you added 4 to it, the offset was multiplied by the size of the object being pointed at. Clearly on your platform uintptr_t has size 4, so you got 0100C000 + 4*4, which is 0100C010.
This would also work
char* playerBaseAddress = (char*)&player;
because here the size of char is 1, so you get 0100C000 + 1*4, which equals 0100C004.
In pointer arithmetic, the offsets you add are multiplied by the size of the pointed-to type.
This makes it easy to reference data that sits next to each other in memory.
For example:
int* ptr = new int[5];
ptr[3] = 4;
std::cout << *(ptr + 3) << std::endl; // 4
delete[] ptr;
You could add exactly four bytes by converting it to a pointer to a type whose size is one byte, for example char*.
playerBaseAddress is of type uintptr_t* which is a pointer. Presumably uintptr_t takes 4 bytes in your environment. Now this piece
playerBaseAddress + 4
involves pointer arithmetic: you move the pointer 4 * sizeof(uintptr_t) = 4 * 4 = 16 bytes forward. 16 in hex is 10. Hence your result.
Note that dereferencing uintptr_t* playerBaseAddress = (uintptr_t*)&player; would be undefined behaviour anyway (it violates strict aliasing). I assume you meant uintptr_t playerBaseAddress = (uintptr_t)&player; instead.
Calculating the offset into a struct can be done by using offsetof, e.g.
#include <cstddef>
#include <cstdint>
#include <iostream>

struct Player {
    const char *name = "ab";
    uintptr_t health = 6;
    uintptr_t coins = 3;
};

int main()
{
    size_t off = offsetof(Player, health);
    std::cout << "off=" << off << '\n';
}
This will show 4 or 8, depending on the architecture and the size of the structure's members.
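If the goal is to compute a member's address from the object's base address, a sketch along those lines (reusing the member names from the question) might look like this:

#include <cstddef>
#include <cstdint>
#include <iostream>

struct Player {
    const char* Name = "ab";
    uintptr_t Health = 6;
    uintptr_t Coins = 3;
};

int main()
{
    Player player;

    // Byte offsets are only meaningful on a char* view of the object,
    // because char has size 1.
    char* base = reinterpret_cast<char*>(&player);
    auto* health = reinterpret_cast<uintptr_t*>(base + offsetof(Player, Health));

    std::cout << *health << '\n'; // prints 6
}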

C++: Why is this code giving me memory issues / undefined behavior?

A bit of Background if you are interested...
The next piece of code is an attempt at implementing a Packet Error Code generator using Cyclical Redundancy Check (CRC-15). This is used to detect communication data corruption. A more detailed introduction is unnecessary.
Code and Issues
init_PEC15_Table function is a lookup-table generator.
pec15 function takes a data input, calculates the address of the solution, and finds the result in the lookup-table.
data is a char array that I have assigned a value of 1 to. This is going to be passed to pec15.
Now, I found that by just reordering the cout commands, the value of "stuffed pec", which is the output I am interested in, changes. By reading online I understood that this could be due to the memory stack unexpectedly changing in a way that affects the result registers and that this could be due to out-of-bounds operations on other variables. Am I mistaken in my understanding?
Now, I am a beginner and this is very daunting. I might have made some gross mistakes that I am not aware of so please feel free to tear the code to shreds.
Also, if it matters, this code is running on an mbed LPC1768.
#include <iostream>
using namespace std;

unsigned short pec15Table[256];
const unsigned int CRC15_POLY = 0x4599;

void init_PEC15_Table() // Cyclical Redundancy Check lookup table generator function
{
    unsigned short rem;
    for (int i = 0; i < 256; i++)
    {
        rem = i << 7;
        for (int bit = 8; bit > 0; --bit)
        {
            if (rem & 0x4000)
            {
                rem = ((rem << 1));
                rem = (rem ^ CRC15_POLY);
            }
            else
            {
                rem = ((rem << 1));
            }
        }
        pec15Table[i] = rem & 0xFFFF;
        // cout << hex << pec15Table[i] << endl;
    }
}

unsigned short pec15(char* data, int lengt = 16) // Takes data as an input
{
    int rem, address;
    rem = 16; // PEC seed (initial PEC value)
    for (int i = 0; i < lengt; i++)
    {
        address = ((rem >> 7) ^ data[i]) & 0xff; // calculate PEC table address
        rem = (rem << 8) ^ pec15Table[address];
    }
    return (rem * 2); // The CRC15 has a 0 in the LSB so the final value must be multiplied by 2
}

int main()
{
    init_PEC15_Table();                                    // initialise pec table
    char data = (short) 0x1;                               // Write 0x1 to char array containing the data 0x1
    char* dataPtr = &data;                                 // Create a pointer to that array
    unsigned short result = pec15(dataPtr);                // Pass data pointer to pec calculator
    cout << "data in: " << (short) *dataPtr << endl;       // Print the short representation of the char data array (Outputs 1)
    cout << "size of data: " << sizeof(*dataPtr) << endl;  // Print the size of the char array (Outputs 1)
    cout << "stuffed pec: " << result << endl;             // Print the output of the pec calculation
    return 0;
}
The code you've written here does not sync with the comments you've written:
char data = (short) 0x1 ; // Write 0x1 to char array containing the data 0x1
char* dataPtr = &data; // Create a pointer to that array
The first line does not write anything to a character array. Rather, it creates a char variable whose numeric value is 1. As a note, the cast to short here isn't needed and has no effect - did you mean to write something else?
The second line does not create a pointer to an array. Rather, it creates a pointer to the data variable. You could potentially think of this as a pointer to an array of length one, but that's probably not what you meant to do.
The two above lines don't by themselves do anything bad. The next line, however, is a real problem:
unsigned short result = pec15(dataPtr); //Pass data pointer to pec calculator
Remember that pec15 has a second argument that's supposed to denote the length of the data passed in. Since you didn't specify it, it defaults to 16. However, your dataPtr pointer only points to a single char value, not 16 char values, so this results in undefined behavior.
I'm not sure how to fix this because I don't have a good sense for the intent behind your code. Did you mean to make a sixteen-element array? Did you mean to create an array filled with the value 0x1? The correct fix here depends on the answer to that question.
Try:
unsigned short result = pec15(dataPtr, 1);
otherwise lengt is 16 (its default value). I'd also advise removing the default value of lengt, as it makes little sense in the context of the pec15 function.
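Putting the two answers together, a possible corrected main (assuming the intent was a one-byte data buffer, and reusing init_PEC15_Table, pec15, and the using directive from the question's code) might look like this:

int main()
{
    init_PEC15_Table();                      // initialise pec table

    char data[1] = { 0x1 };                  // an actual array holding the single byte 0x1
    unsigned short result = pec15(data, 1);  // pass the real length, so pec15 never reads past the array

    cout << "data in: " << (short) data[0] << endl;
    cout << "stuffed pec: " << result << endl;
    return 0;
}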

c++ - store byte[4] in an int

I want to take a byte array with 4 bytes in it, and store it in an int.
For example (non-working code):
unsigned char _bytes[4];
int * combine;
_bytes[0] = 1;
_bytes[1] = 1;
_bytes[2] = 1;
_bytes[3] = 1;
combine = &_bytes[0];
I do not want to use bit shifting to put the bytes into the int; I would like to point at the bytes' memory and use them as an int if possible.
In Standard C++ it's not possible to do this reliably. The strict aliasing rule says that when you read through an expression of type int, it must actually designate an int object (or a const int etc.) otherwise it causes undefined behaviour.
However you can do the opposite: declare an int and then fill in the bytes:
int combine;
unsigned char *bytes = reinterpret_cast<unsigned char *>(&combine);
bytes[0] = 1;
bytes[1] = 1;
bytes[2] = 1;
bytes[3] = 1;
std::cout << combine << std::endl;
Of course, which value you get out of this depends on how your system represents integers. If you want your code to use the same mapping on different systems then you can't use memory aliasing; you'd have to use an equation instead.
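For completeness, two portable sketches (not what the question asked for, but they avoid the aliasing problem entirely): std::memcpy copies the bytes into an int with well-defined behaviour, and a shift-based equation produces the same value on every platform for a chosen byte order.

#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    unsigned char bytes[4] = {1, 1, 1, 1};

    // Well-defined, but the resulting value still depends on the
    // platform's byte order.
    int combined = 0;
    std::memcpy(&combined, bytes, sizeof bytes); // assumes sizeof(int) >= 4, true on common platforms

    // Same value on every platform: treat bytes[0] as the least
    // significant byte (little-endian interpretation).
    std::uint32_t portable = bytes[0]
                           | (bytes[1] << 8)
                           | (bytes[2] << 16)
                           | (static_cast<std::uint32_t>(bytes[3]) << 24);

    std::cout << combined << " " << portable << std::endl;
}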

Conceptual problem in Union

My code is this
// using_a_union.cpp
#include <stdio.h>
union NumericType
{
int iValue;
long lValue;
double dValue;
};
int main()
{
union NumericType Values = { 10 }; // iValue = 10
printf("%d\n", Values.iValue);
Values.dValue = 3.1416;
printf("%d\n", Values.iValue); // garbage value
}
Why do I get garbage value when I try to print Values.iValue after doing Values.dValue = 3.1416?
I thought the memory layout would be like this. What happens to Values.iValue and
Values.lValue when I assign something to Values.dValue?
In a union, all of the data members overlap. You can only use one data member of a union at a time.
iValue, lValue, and dValue all occupy the same space.
As soon as you write to dValue, the iValue and lValue members are no longer usable: only dValue is usable.
Edit: To address the comments below: You cannot write to one data member of a union and then read from another data member. To do so results in undefined behavior. (There's one important exception: you can reinterpret any object in both C and C++ as an array of char. There are other minor exceptions, like being able to reinterpret a signed integer as an unsigned integer.) You can find more in both the C Standard (C99 6.5/6-7) and the C++ Standard (C++03 3.10, if I recall correctly).
Might this "work" in practice some of the time? Yes. But unless your compiler expressly states that such reinterpretation is guaranteed to be work correctly and specifies the behavior that it guarantees, you cannot rely on it.
Because floating point numbers are represented differently than integers are.
All of those members occupy the same area of memory (with the double obviously occupying more). If you try to read the first four bytes of that double as an int, you are not going to get back what you think. You are dealing with the raw memory layout here, and you need to know how these types are represented.
EDIT: I should have also added (as James has already pointed out) that writing to one variable in a union and then reading from another does invoke undefined behavior and should be avoided (unless you are re-interpreting the data as an array of char).
Well, let's just look at a simpler example first. Ed's answer describes the floating-point part, but how about we examine how ints and chars are stored first!
Here's an example I just coded up:
#include "stdafx.h"
#include <iostream>
using namespace std;
union Color {
int value;
struct {
unsigned char R, G, B, A;
};
};
int _tmain(int argc, _TCHAR* argv[])
{
Color c;
c.value = 0xFFCC0000;
cout << (int)c.R << ", " << (int)c.G << ", " << (int)c.B << ", " << (int)c.A << endl;
getchar();
return 0;
}
What would you expect the output to be?
255, 204, 0, 0
Right?
If an int is 32 bits, and each of the chars is 8 bits, then R should correspond to the left-most byte, G to the second one, and so forth.
But that's wrong. At least on my machine/compiler, it appears ints are stored in reverse byte order. I get,
0, 0, 204, 255
So to make this give the output we'd expect (or the output I would have expected anyway), we have to change the struct to A,B,G,R. This has to do with endianness.
Anyway, I'm not an expert on this stuff; it's just something I stumbled upon when trying to decode some binaries. The point is, floats aren't necessarily encoded the way you'd expect either... you have to understand how they're stored internally to understand why you're getting that output.
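If you want to check the byte order directly, a small sketch like this (assuming a 4-byte int) shows which byte of an int comes first in memory:

#include <iostream>

int main() {
    int value = 0x01020304;
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);

    // On a little-endian machine the first byte in memory is 0x04 (least significant first);
    // on a big-endian machine it is 0x01.
    std::cout << (bytes[0] == 0x04 ? "little-endian" : "big-endian") << std::endl;
}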
You've done this:
union NumericType Values = { 10 }; // iValue = 10
printf("%d\n", Values.iValue);
Values.dValue = 3.1416;
How a compiler uses memory for this union is similar to using the variable with largest size and alignment (any of them if there are several), and reinterpret cast when one of the other types in the union is written/accessed, as in:
double dValue; // creates a variable with alignment & space
// as per "union Numerictype Values"
*reinterpret_cast<int*>(&dValue) = 10; // separate step equiv. to = { 10 }
printf("%d\n", *reinterpret_cast<int*>(dValue)); // print as int
dValue = 3.1416; // assign as double
printf("%d\n", *reinterpret_cast<int*>(dValue)); // now print as int
The problem is that in setting dValue to 3.1416 you've completely overwritten the bits that used to hold the number 10. The new value may appear to be garbage, but it's simply the result of interpreting the first sizeof(int) bytes of the double 3.1416 as if they held a useful int value.
If you want the two things to be independent - so setting the double doesn't affect the earlier-stored int - then you should use a struct/class.
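For instance, a minimal sketch of the struct alternative (the name NumericValues is just for illustration), where each member gets its own storage:

#include <cstdio>

struct NumericValues
{
    int iValue;
    long lValue;
    double dValue;
};

int main()
{
    NumericValues v = { 10, 0, 0.0 };
    v.dValue = 3.1416;                 // does not touch iValue; each member has its own bytes
    std::printf("%d\n", v.iValue);     // still prints 10
}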
It may help you to consider this program:
#include <cstddef>
#include <cstdint>
#include <iostream>

void print_bits(std::ostream& os, const void* pv, size_t n)
{
    for (size_t i = 0; i < n; ++i)
    {
        uint8_t byte = static_cast<const uint8_t*>(pv)[i];
        for (int j = 0; j < 8; ++j)
            os << ((byte & (128 >> j)) ? '1' : '0');
        os << ' ';
    }
}
union X
{
    int i;
    double d;
};

int main()
{
    X x = { 10 };
    print_bits(std::cout, &x, sizeof x);
    std::cout << '\n';
    x.d = 3.1416;
    print_bits(std::cout, &x, sizeof x);
    std::cout << '\n';
}
Which, for me, produced this output:
00001010 00000000 00000000 00000000 00000000 00000000 00000000 00000000
10100111 11101000 01001000 00101110 11111111 00100001 00001001 01000000
Crucially, the first half of each line shows the 32 bits that are used for iValue: note that the binary 1010 in the least significant byte (printed on the left, since an Intel CPU like mine is little-endian) is 10 decimal. Writing 3.1416 changes the entire 64 bits to a pattern representing 3.1416 (see http://en.wikipedia.org/wiki/Double_precision_floating-point_format). The old 1010 pattern is overwritten, clobbered, an electromagnetic memory no more.