Calculate the offset of the pointer in C++

I am a C++ novice.
After I did pointer arithmetic on the address of a Player structure, the result surprised me:
struct Player
{
const char* Name = "ab";
uintptr_t Health = 6;
uintptr_t Coins = 3;
} player;
std::cout << &player << std::endl; // 0100C000
uintptr_t* playerBaseAddress = (uintptr_t*)&player;
std::cout << playerBaseAddress << std::endl; // 0100C000
std::cout << (playerBaseAddress + 4) << std::endl; // 0100C010
I expected 0100C000 + 4 to give 0100C004. How do I get 0100C004 instead of 0100C010?
Can someone explain that, please?

Like this
uintptr_t playerBaseAddress = (uintptr_t)&player;
In your version playerBaseAddress is a pointer, so when you added 4 to it, the 4 was multiplied by the size of the object being pointed at. Evidently uintptr_t has size 4 on your platform, so you got 0100C000 + 4*4, which is 0100C010.
This would also work
char* playerBaseAddress = (char*)&player;
because here the size of char is 1, so you get 0100C000 + 1*4, which equals 0100C004.

In pointer arithmetic, the number you add or subtract is multiplied by the size of the pointed-to type.
This way it's easy to reference data right next to each other in memory.
For example:
int* ptr = new int[5];
ptr[3] = 4;
std::cout << *(ptr+3) << std::endl; // 4
delete[] ptr;
You could add four bytes to it by converting it to a pointer type which has the size of one byte, for example char*.

playerBaseAddress is of type uintptr_t* which is a pointer. Presumably uintptr_t takes 4 bytes in your environment. Now this piece
playerBaseAddress + 4
involves the pointer arithmetic: you move the pointer 4*sizeof(uintptr_t)=4*4=16 bytes forward. 16 in hex is 10. Hence your result.
Note that dereferencing a pointer obtained with uintptr_t* playerBaseAddress = (uintptr_t*)&player; is UB anyway. I assume you meant uintptr_t playerBaseAddress = (uintptr_t)&player; instead.

Calculating the offset into a struct can be done by using offsetof, e.g.
#include <cstddef>
#include <cstdint>
#include <iostream>
struct Player {
const char *name = "ab";
uintptr_t health = 6;
uintptr_t coins = 3;
};
int main()
{
size_t off = offsetof(Player, health);
std::cout << "off=" << off << '\n';
}
This prints 4 or 8, depending on the architecture and the size of the structure's elements.

Related

Pointer typecasting c++

#include <iostream>
#define print(x) std::cout << x;
#define println(x) std::cout << x << std::endl;
int main() {
int ex[5];
int* ptr = ex;
for (int i = 0; i < 5; i++) {
ex[i] = 2;
}
ex[2] = 3;
*(int*)((char*)ptr + 8) = 4;
println(ex[2]);
}
On the line *(int*)((char*)ptr + 8) = 4; I'm using (char*). When I run println(sizeof(char*)) it prints 4 bytes, but my instructor says a char is 1 byte long, so we need to add 8 bytes to reach the value in ex[2]. How is this possible? I don't understand.
sizeof(char*) is the size of a pointer to char, not the size of a char, and it depends on the architecture you use (4 bytes on a 32-bit target). By definition char is the type that has size 1, so sizeof(char) evaluates to 1, but that does not automatically mean that a char is 8 bits.
To reach the next element, add sizeof(int) to the char pointer (and 2 * sizeof(int) to reach ex[2]) rather than a hard-coded 8, so that your code works independently of the architecture it runs on.
When you work with pointers, you tell the compiler that the value pointed to occupies the space of that type in memory, and that the next value follows after that many units (bytes). So if you cast your int pointer to a char pointer, you must add sizeof(int) to the char pointer to get the same effect as adding 1 to the int pointer. This works because char is 1 unit by definition; with any other type it would not, since there is no architecture-independent specification of the sizes of the other types.

How to calculate number of bytes in a vector in C++?

I have a code that looks as below:
typedef struct{
uint8_t distance[2];
uint8_t reflectivity;
}data_point;
typedef struct{
uint8_t flag[2];
uint8_t Azimuth[2];
std::vector<data_point> points; // always size is 32 but not initialized whereas filled in run-time using emplace_back
}data_block;
typedef struct{
uint8_t UDP[42];
std::vector<data_block> data; // always size is 12 but not initialized whereas filled in run-time using emplace_back
uint8_t time_stamp[4];
uint8_t factory[2];
}data_packet;
static std::vector<data_packet> packets_from_current_frame;
Assuming packets_from_current_frame.size() = 26, How can I calculate the number of bytes in packets_from_current_frame?
My solution on paper:
1 data_packet (assuming 32 points and 12 data) will have 42+ (12*(2+2+32(3))) + 4 + 2 = 1248. Hence, the end address is _begin + sizeof(uint8_t) * 26 * 1248 (_begin is the start address of the memory buffer).
With this calculation I always lose some data. The number of bytes lost depends on packets_from_current_frame.size(). What is wrong with the calculation?
Take a look at this code:
int main() {
static std::vector<data_packet> packets_from_current_frame;
std::cout << "sizeof data_point = " << (sizeof(data_point)) << std::endl;
std::cout << "sizeof data_block = " << (sizeof(data_block)) << std::endl;
std::cout << "sizeof data_packet = " << (sizeof(data_packet)) << std::endl;
return 0;
}
Output (on a 64-bit build; the exact numbers are implementation-dependent):
sizeof data_point = 3
sizeof data_block = 32
sizeof data_packet = 80
From this it is clear why your calculation fails.
You have forgotten to take into account things like the vector itself, fields after the vector and padding.
The correct way to calculate it is:
packets_from_current_frame.size() * (sizeof(data_packet) + data.size() * (sizeof(data_block) + points.size() * sizeof(data_point)))
Note: These bytes are not stored in one contiguous memory block, so don't try a single direct memcpy.
Each individual vector does store its elements contiguously. If you want the address of the data held by a vector, use vector::data().

Why does memcpy to int not work after calling memcpy to bool value

I was playing around with memcpy when I stumbled on a strange result: a memcpy of an int from the same block of memory, done right after a memcpy of a bool, gives an unexpected value.
I created a simple test struct that has a bunch of variables of different types. I cast the struct to an unsigned char pointer and then use memcpy to copy data from that pointer into separate variables. I tried playing around with the memcpy offset and moving the int memcpy before the bool one (changing the layout of the test struct so that the int comes before the bool too). Surprisingly, the reordering fixed the problem.
// Simple struct containing 3 floats
struct vector
{
float x;
float y;
float z;
};
// My test struct
struct test2
{
float a;
vector b;
bool c;
int d;
};
int main()
{
// I create my structure on the heap here and assign values
test2* test2ptr = new test2();
test2ptr->a = 50;
test2ptr->b.x = 100;
test2ptr->b.y = 101;
test2ptr->b.z = 102;
test2ptr->c = true;
test2ptr->d = 5;
// Then turn the struct into an array of single bytes
unsigned char* data = (unsigned char*)test2ptr;
// Variable for keeping track of the offset
unsigned int offset = 0;
// Variables that I want the memory copied into
float a;
vector b;
bool c;
int d;
// I copy the memory here in the same order as it is defined in the struct
std::memcpy(&a, data, sizeof(float));
// Add the copied data size in bytes to the offset
offset += sizeof(float);
std::memcpy(&b, data + offset, sizeof(vector));
offset += sizeof(vector);
std::memcpy(&c, data + offset, sizeof(bool));
offset += sizeof(bool);
// It all works until here the results are the same as the ones I assigned
// however the int value becomes 83886080 instead of 5
// moving this above the bool memcpy (and moving the variable in the struct too) fixes the problem
std::memcpy(&d, data + offset, sizeof(int));
offset += sizeof(int);
return 0;
}
So I expected the value of d to be 5 however it becomes 83886080 which I presume is just random uninitialized memory.
You are ignoring the padding between the members of your struct.
Take a look at the following simplified example:
struct X
{
bool b;
int i;
};
int main()
{
X x;
std::cout << "Address of b " << (void*)(&x.b) << std::endl;
std::cout << "Address of i " << (void*)(&x.i) << std::endl;
}
On my PC this prints:
Address of b 0x7ffce023f548
Address of i 0x7ffce023f54c
As you see, the bool in the struct occupies 4 bytes here even though its content needs fewer. The compiler adds padding bytes to the struct so that the CPU can access each member directly at an aligned address. If the members were laid out back to back, as your code assumes, the compiler would have to generate extra instructions on every access to cope with the misaligned data, which can slow your program down considerably.
You can force the compiler to lay the members out without padding by using #pragma pack or something similar for your compiler. All the pragma things are compiler-specific!
In your program, you have to compute each memcpy source from the address of the member you want to read, not from the sizes of the members before it, because summing sizes ignores the padding bytes.
If I add a pragma pack(1) before my program, the output is:
Address of b 0x7ffd16c79cfb
Address of i 0x7ffd16c79cfc
As you can see, there are no longer any padding bytes between the bool and the int. But the code that accesses i later may be larger and slower, depending on the architecture. So avoid #pragma pack wherever you can!
You've got the answer you need so I'll not get into detail. I just made an extraction function with logging to make it easier to follow what's happening.
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <iostream>
// Simple struct containing 3 floats
struct vector {
float x;
float y;
float z;
};
// My test struct
struct test2 {
float a;
vector b;
bool c;
int d;
};
template<typename T>
void extract(T& dest, unsigned char* data, size_t& offset) {
std::uintptr_t dp = reinterpret_cast<std::uintptr_t>(data + offset);
size_t align_overstep = dp % alignof(T);
std::cout << "sizeof " << sizeof(T) << " alignof " << alignof(T) << " data "
<< dp << " mod " << align_overstep << "\n";
if(align_overstep) {
size_t missing = alignof(T) - align_overstep;
std::cout << "misaligned - adding " << missing << " to align it again\n";
offset += missing;
}
std::memcpy(&dest, data + offset, sizeof(dest));
offset += sizeof(dest);
}
int main() {
std::cout << std::boolalpha;
// I create my structure on the heap here and assign values
test2* test2ptr = new test2();
test2ptr->a = 50;
test2ptr->b.x = 100;
test2ptr->b.y = 101;
test2ptr->b.z = 102;
test2ptr->c = true;
test2ptr->d = 5;
// Then turn the struct into an array of single bytes
unsigned char* data = reinterpret_cast<unsigned char*>(test2ptr);
// Variable for keeping track of the offset
size_t offset = 0;
// Variables that I want the memory copied into
float a;
vector b;
bool c;
int d;
// I copy the memory here in the same order as it is defined in the struct
extract(a, data, offset);
std::cout << "a " << a << "\n";
extract(b, data, offset);
std::cout << "b.x " << b.x << "\n";
std::cout << "b.y " << b.y << "\n";
std::cout << "b.z " << b.z << "\n";
extract(c, data, offset);
std::cout << "c " << c << "\n";
extract(d, data, offset);
std::cout << "d " << d << "\n";
std::cout << offset << "\n";
delete test2ptr;
}
Possible output
sizeof 4 alignof 4 data 12840560 mod 0
a 50
sizeof 12 alignof 4 data 12840564 mod 0
b.x 100
b.y 101
b.z 102
sizeof 1 alignof 1 data 12840576 mod 0
c true
sizeof 4 alignof 4 data 12840577 mod 1
misaligned - adding 3 to align it again
d 5
24
There are apparently three padding bytes between the bool and the subsequent int. This is allowed by the standard due to alignment considerations (accessing a 4 byte int that is not aligned on a 4 byte boundary may be slow or crash on some systems).
So when you do offset += sizeof(bool), you are not incrementing enough. The int follows 4 bytes after, not 1. The result is that the 5 is not the first byte you read but the last one - you are reading three padding bytes plus the first one from test2ptr->d into d. And it is no coincidence that 83886080 = 2^24 * 5 (the padding bytes were apparently all zeros).

C++: Why is this code giving me memory issues / undefined behavior?

A bit of Background if you are interested...
The next piece of code is an attempt at implementing a Packet Error Code generator using Cyclical Redundancy Check (CRC-15). This is used to detect communication data corruption. A more detailed introduction is unnecessary.
Code and Issues
init_PEC15_Table function is a lookup-table generator.
pec15 function takes a data input, calculates the address of the solution and find the result in the lookup-table.
data is a char array that I have assigned a value of 1 to. This is going to be passed to pec15.
Now, I found that by just reordering the cout commands, the value of "stuffed pec", which is the output I am interested in, changes. By reading online I understood that this could be due to the memory stack unexpectedly changing in a way that affects the result registers and that this could be due to out-of-bounds operations on other variables. Am I mistaken in my understanding?
Now, I am a beginner and this is very daunting. I might have made some gross mistakes that I am not aware of so please feel free to tear the code to shreds.
Also, if it matters, this code is running on an mbed LPC1768.
#include <iostream>
using namespace std;
unsigned short pec15Table[256];
const unsigned int CRC15_POLY = 0x4599;
void init_PEC15_Table() // Cyclical Redundancy Check lookup table generator function
{
unsigned short rem;
for (int i = 0; i < 256; i++)
{
rem = i << 7;
for (int bit = 8; bit > 0; --bit)
{
if (rem & 0x4000)
{
rem = ((rem << 1));
rem = (rem ^ CRC15_POLY);
}
else
{
rem = ((rem << 1));
}
}
pec15Table[i] = rem & 0xFFFF;
// cout << hex << pec15Table [i] << endl;
}
}
unsigned short pec15(char* data, int lengt = 16) //Takes data as an input,
{
int rem, address;
rem = 16;//PEC seed (initial PEC value)
for (int i = 0; i < lengt; i++)
{
address = ((rem >> 7) ^ data[i]) & 0xff;//calculate PEC table address
rem = (rem << 8) ^ pec15Table[address];
}
return (rem * 2);//The CRC15 has a 0 in the LSB so the final value must be multiplied by 2
}
int main()
{
init_PEC15_Table(); //initialise pec table
char data = (short) 0x1 ; // Write 0x1 to char array containing the data 0x1
char* dataPtr = &data; // Create a pointer to that array
unsigned short result = pec15(dataPtr); //Pass data pointer to pec calculator
cout << "data in: " << (short) *dataPtr << endl; //Print the short representation of the char data array (Outputs 1)
cout << "size of data: " << sizeof(*dataPtr) << endl; //Print the size of the char array (Outputs 1)
cout << "stuffed pec: " << result << endl; //Print the output of the pec calculation
return 0;
}
The code you've written here does not match the comments you've written:
char data = (short) 0x1 ; // Write 0x1 to char array containing the data 0x1
char* dataPtr = &data; // Create a pointer to that array
The first line does not write anything to a character array. Rather, it creates a char variable whose numeric value is 1. As a note, the cast to short here isn't needed and has no effect - did you mean to write something else?
The second line does not create a pointer to an array. Rather, it creates a pointer to the data variable. You could potentially think of this as a pointer to an array of length one, but that's probably not what you meant to do.
The two above lines don't by themselves do anything bad. The next line, however, is a real problem:
unsigned short result = pec15(dataPtr); //Pass data pointer to pec calculator
Remember that pec15 has a second argument that's supposed to denote the length of the data passed in. Since you didn't specify it, it defaults to 16. However, your dataPtr pointer only points to a single char value, not 16 char values, so this results in undefined behavior.
I'm not sure how to fix this because I don't have a good sense for the intent behind your code. Did you mean to make a sixteen-element array? Did you mean to create an array filled with the value 0x1? The correct fix here depends on the answer to that question.
Try:
unsigned short result = pec15(dataPtr, 1);
otherwise lengt is 16 (its default value). I'd also advise removing the default value of lengt, as it makes little sense in the context of the pec15 function.

memory location of pointer variable itself?

Is it possible to find the memory location of a pointer variable itself?
i.e. I don't want to know the memory location the pointer is referencing, I want to know what the memory location of the pointer variable is.
int A = 5;
int *k = &A;
cout << k;
will give me the location of A. Does k have a location?
The location of int *k is &k, like so:
// For this example, assume the variables start at 0x00 and are 32 bits each.
int A = 9; // 0x00 = 0x09
int * k = &A; // 0x04 = 0x00
int ** k_2 = &k; // 0x08 = 0x04
// Thus:
cout << "Value of A: " << A; // "Value of A: 9"
cout << "Address of A: " << k; // "Address of A: 0x00"
cout << "Address of k: " << k_2; // "Address of k: 0x04"
assert( A == *k);
assert(&A == k);
assert(&A == *k_2);
assert( A == **k_2);
assert( k == *k_2);
assert(&k == k_2);
A pointer is a variable, 32 bits wide in a 32-bit binary and 64 bits in a 64-bit one, storing a memory address. Like any other variable, it has an address of its own, and the syntax is identical. For most intents and purposes, pointers act like unsigned ints of the same size.
Now, when you take &k, you now have an int **, with all the interesting complexities that go with that.
From int ** k_2 = &k, *k_2 is k and **k_2 is *k, so everything works about how you'd expect.
This can be a useful way to handle creating objects cross-library (pointers are rather fundamental and mostly safe to pass). Double pointers are commonly used in interfaces where you want to pass a pointer that will be filled with an object, but not expose how the object is being created (DirectX, for example).
Yes. You can use &k to print out the memory location of k itself.
int A = 5;
int *k = &A;
cout << k;
A's address is k;
k's address is &k.
Yo dawg, we put an address at your address so you can address while you address.