Add offset to pointer without casting to UINT - c++

If have written the following code to learn a bit about pointers and pointer arithmetic, meaning going from offset to offset to read something from a struct.
I have the following code
DWORD * x = (DWORD*)((UINT)ptr1 + sizeof(int) + sizeof(float));
float f = *(float*)x;
This Code works as it should. However I struggled a lot to get it working as at the beginning I was not casting ptr1 to UINT and therefore was getting a different address as I wanted to have.
Now it works. However can someone explain to me why I cannot add the offset (sizeof...) to the ptr1 directly which is of type DWORD * ?
DWORD * x = (DWORD*)(ptr1 + sizeof(int) + sizeof(float));
float f = *(float*)x;

Adding a number n to a T * pointer will move the pointer sizeof(T)*n bytes (not n bytes).
For your example, if we suppose that both sizeof(int) and sizeof(DWORD) are 4, adding sizeof(int) to ptr1 will move ptr1 16 bytes (instead of 4, which you likely intended).

going from offset to offset to read something from a struct.
You can't do that. That's not how pointers work. You shall not treat them as just numbers, addresses; that way madness lies. C++ is an abstraction and you can only have a pointer to an object or an array of objects. Once you start playing around with it to navigate between objects, you've lost. Prepare for UB weirdness relating to padding, alignment, aliasing, optimisations...

Related

C++ adding 4 bytes to pointer address

I have a question about pointers, and memory addresses:
Supposing I have the following code:
int * array = (int *) malloc(sizeof(int) * 4);
Now in array im storing a memory address, I know that c++ takes already care when adding +1 to this pointer it will add 4 bytes, but what If I want to add manually 4 bytes?
array + 0x004
If im correct this will lead to add 4*4 (16) bytes, but my Idea is to add manually those 4 bytes.
Why? Just playing around, i've tried this and I got a totally different result from what I expected, then i've researched and i've seen that c++ takes already care when you add +1 to a pointer (it sums 4 bytes in this case).
Any idea?
For a pointer p to a type T with value v, the expression p+n will (on most systems anyway) result in a pointer to the address v+n*sizeof(T). To get a fixed-byte offset to the pointer, you can first cast it to a character pointer, like this:
reinterpret_cast<T*>(reinterpret_cast<char*>(p) + n)
In c++, sizeof(char) is defined to be equal to 1.
Do note that accessing improperly aligned values can have large performance penalties.
Another thing to note is that, in general, casting pointers to different types is not allowed (called the strict aliasing rule), but an exception is explicitly made for casting any pointer type to char* and back.
The trick is convert the type of array into any pointer-type with a size of 1 Byte, or store the pointer value in an integer.
#include <stdint.h>
int* increment_1(int* ptr) {
//C-Style
return (int*)(((char*)ptr) + 4);
}
int* increment_2(int* ptr) {
//C++-Style
char* result = reinterpret_cast<char*>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
int* increment_3(int* ptr) {
//Store in integer
intptr_t result = reinterpret_cast<intptr_t>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
Consider that if you add an arbitrary number of bytes to an address of an object of type T, it no longer makes sense to use a pointer of type T, since there might not be an object of type T at the incremented memory address.
If you want to access a particular byte of an object, you can do so using a pointer to a char, unsigned char or std::byte. Such objects are the size of a byte, so incrementing behaves just as you would like. Furthermore, while rules of C++ disallow accessing objects using incompatible pointers, these three types are excempt of that rule and are allowed to access objects of any type.
So, given
int * array = ....
You can access the byte at index 4 like this:
auto ptr = reinterpret_cast<unsigned char*>(array);
auto byte_at_index_4 = ptr + 4;
array + 0x004
If im correct this will lead to add 4*4 (16) bytes
Assuming sizeof(int) happens to be 4, then yes. But size of int is not guaranteed to be 4.

C++ / Arduino understanding the usage of uint8_t and *

Can someone explain to me what's going on in this code block? Specifically on line 3. I have a hunch the * before ptr is significant. And (uint8_t *) looks like a cast to a byte... But what's up with the *? It also looks like r, g, and b will all evaluate to the same value.
case TRUECOLOR: { // 24-bit ('truecolor') image (no palette)
uint8_t pixelNum, r, g, b,
*ptr = (uint8_t *)&imagePixels[imageLine * NUM_LEDS * 3];
for(pixelNum = 0; pixelNum < NUM_LEDS; pixelNum++) {
r = *ptr++;
g = *ptr++;
b = *ptr++;
strip.setPixelColor(pixelNum, r, g, b);
}
I work primarily in C#.
The second and third line can be expressed more cleanly:
uint8_t pixelNum;
uint8_t r;
uint8_t g;
uint8_t b;
uint8_t *ptr = (uint8_t *)&imagePixels[imageLine * NUM_LEDS * 3];
The first four variable declarations should be fairly simple, the fifth one is something C# does not have. It declares ptr as a pointer to a uint8_t. This pointer is set to the address of the value which is the imageLine * NUM_LEDS * 3th element in the imagePixels array. As this might be a different type (maybe a pointer to a char, who knows), this value is cast to a pointer to an uint8_t.
The next occurence of the asterisk (*) is in the for-loop body, where it is used as the dereference operator, which basically resolves a pointer to get the actual value.
Pointers 101
A pointer is like the street address of a house. It shows you where the house is so you can find it, but when you pass it around, you don't pass around the whole house. You can dereference it, meaning you can actually visit the house.
The two operators used in conjunction with pointers are the asterisk (*) and the ampersand (&). The asterisk is used in declarations of pointers and to dereference a pointer, the ampersand is used to get the address of something.
Take a look at the following example:
int x = 12;
int *y = &x;
std::cout << "X is " << *y; // Will print "X is 12"
We declare x as an int holding the value 12. Now we declare y as a pointer to an int, and set it to point at x by storing x's address. By using *y, we access the actual value of x, the int that y points at.
Since a pointer is a type of reference, modifying the value via the pointer changes the actual value of the thing pointed at.
int x = 12;
int *y = &x;
*y = 10;
std::cout << "X is " << x; // Will print "X is 10"
Pointers 102
Pointers are a large topic, and I suggest you take your time to read about them from different sources if necessary.
Used in a variable definition, the * means ptr is a pointer. The value it stores is an address in memory for another variable or a part of another variable. In this case ptr is a pointer to a block of memory inside imagePixels and from the names of the variables involved it's a line in an image. Since the type is uint8_t, this is taking whatever imagePixels is and using it as a block of individual bytes.
Used outside a varable definition, the * takes on a different meaning: dereference the pointer. Go to the location in memory stored in the pointer and get the value.
And yeah, * can also be used for multiplication, upping the code-reading fun level.
Incrementing (++) a pointer moves the address to the next address. If you had a uint32_t * the address would advance by 4 to point at the next uint32_t. In this case we have uint8_t, so the address is advanced one byte. So
r = *ptr++;
A) Get value at pointer.
After A) Advance the pointer.
After A) Assign value to r.
Exactly where the "advance the pointer" stage goes is tricky. It is after step A. In C++17 or greater it is before "Assign the value" because there is now a separation between the stuff on the right and the stuff on the left of an equals sign. But before C++17 all we can say is it's after step A. Search keyterm: "Sequence Points".
g = *ptr++;
b = *ptr++;
Do it again, get and assign the current value at ptr, advance the pointer.
strip.setPixelColor(pixelNum, r, g, b);
From the naming I presume this sets a given pixel to the colours read above.
You can't just
strip.setPixelColor(pixelNum, *ptr++, *ptr++, *ptr++);
Because of sequencing again. There are no guarantees of the order in which the parameters will be computed. This is to allow compiler developers to make optimizations for speed and size that they cannot if the ordering is specified, but it's a kick in the teeth to those expecting left-to-right resolution. My understanding is this still holds true in the C++17 standard.
OK. So what is this doing?
There is a big block of memory from which you want one and only one line.
*ptr = (uint8_t *)&imagePixels[imageLine * NUM_LEDS * 3];
pinpoints the beginning of that line and sets it up to be treated like a dumb array of bytes.
for(pixelNum = 0; pixelNum < NUM_LEDS; pixelNum++) {
Generic for loop. For all the pixels on the line of LEDs.
r = *ptr++;
g = *ptr++;
b = *ptr++;
Get the colour of one pixel on the line in the standard 8 bit RGB format and point at the next pixel
strip.setPixelColor(pixelNum, r, g, b);
writes the read colour to one pixel.
The for loop will then loop around and start working on the next pixel until there are no more pixels on the line.
The asterisk(*) is the symbol for a pointer. So the (uint8_t *) is a cast to a pointer that is pointing to a uint8_t. Then within the loop, where the asterisk is prefixed to a symbol (ie *ptr) that is dereferencing that pointer. Dereferencing the pointer returns the data that the pointer is pointing to.
I suggest reading a bit about pointers as they are critical to understanding C/C++. Here is the C++ Docs on Pointers
MildlyInformed, I would need more code to run through it to explain it. One tool I found really, really useful though is the C visualizer. It's an online debug tool that helps you figure out what's happening in code by running you through step-by-step, line by line. It can be found at: http://www.pythontutor.com/visualize.html#mode=edit
Even though the URL talks about python, it can do C and a bunch of languages. I would have commented instead of posting an answer, but my rep isn't high enough. I hope this helps!
(I'm not affiliated with the above website, other than to use it occasionally when I'm baffled)

memset pointer + offset

For example, I have:
DWORD pointer = 0x123456;
DWORD offset = 0xABC;
I want to add offset to pointer and set the value at the address pointed by that pointer to 1.0f. How do I give memset() a pointer and an offset as first argument?
DWORDs are the same as uint32_t's. Just add them together as you would with any other integer.
Also, while setting a float (I am assuming you're setting a float due to the 'f' after the '1.0'), I wouldn't use memset. Just cast the pointer to a float, and de-reference it like so:
DWORD pointer = 0x123456;
DWORD offset = 0xABC;
pointer += offset;
float* float_pointer = reinterpret_cast<float*>(pointer);
*float_pointer = 1.0f;
This can be straightforward pointer arithmetic, but we've got a few preliminary issues to take care of first.
Why is your pointer variable declared as a DWORD? We'll need a proper pointer type eventually.
Why are you asking about memset, if you're trying to set a floating-point value? memset sets plain bit patterns; it's not much good for floating-point.
I assume your offset is supposed to be measured in bytes, not sizeof(float).
Anyway, computing the pointer you want could go something like this:
float *fp = (float *)(pointer + offset);
[Notice that I perform the addition inside the parens, before the cast, so as to get a byte offset. If I instead wrote (float *)pointer + offset, it'd offset by `sizeof(float).]
Once we've got that float pointer, we can set the location it points to to 1.0 in the usual way:
*fp = 1.0f;
Or we could set it to 0 using memset:
memset(fp, 0, sizeof(float));

Casting/dereferencing char pointers to a double array

Is there anything wrong with the casting a double pointer to a char pointer? Goal in the following code is to change the 1 element in three different ways.
double vec1[100];
double *vp = vec1;
char *yp = (char*) vp;
vp++;
vec1[1] = 19.0;
*vp = 12.0;
*((double*) (yp + (1*sizeof (vec1[0])))) = 34.0;
Casts of this type fall into the category of "OK if you know what you're doing but dangerous if you don't".
For example, in this case you already know the pointer value of "yp" (it was pointing to a double) so it is technically safe to increase its value by the size of a double and re-cast back to a double*.
A counter-example: suppose you didn't know where the char* came from...say, it was given to you as a function parameter. Now, your cast would be a big problem: since char* is technically 1-byte-aligned and a double is usually 8-byte-aligned, you can't be sure if you were given an 8-byte-aligned address. If it's aligned, your arithmetic would produce a valid double*; if not, it would crash when dereferenced.
This is just one example of how casts can go wrong. What you're doing (at first glance) looks like it will work but in general you really have to pay attention when you cast things.
With newer INTEL processors the main problem you can run into is alignment. Say you were to write something like this:
*((double*) (yp + 4)) = 34.0;
Then you are likely to have a runtime error because a double should be aligned on 8 bytes. This was also true on processors such as 68k, or MIPS.
This is similar to having a structure and doing casts on that structure. You are not unlikely to break things.
In most cases, if you can avoid such, your code will be a lot stronger. Personally, I do not even use such casts when reading a file. Instead, I get the data from the file and put it in a structure as required. Say I read 4 bytes in a buffer to convert to an integer, I'd write something like this:
unsigned char buf[4];
...
fread(buf, 1, 4, f);
my_struct.integer = buf[0] | (buf[1] << 8) | (buf[2] << 16) | (buf[3] << 24);
Now I did not do an ugly cast and I could control the endianess of the integer in the file whatever the endian of the processor you are running with.

C++ How to directly access memory

Say I have manually allocated a large portion of memory in C++, say 10 MB.
Say for the heck of it I want to store a few bits around the middle of this region.
How would I get at the memory at that location?
The only way I know of accessing raw memory is using array notation.
And array notation works well for that, as the allocated memory can be seen as a large array.
// Set the byte in the middle to `123`
((char *) memory_ptr)[5 * 1024 * 1024] = 123;
I typecast to a char pointer in case the pointer is of another type. If it's already a char pointer then the typecast isn't needed.
If you only want to set a single bit, see the memory as a giant bit field with 80 million separate bits. To find the bit you want, say bit number 40000000, you must first find the byte it's in and then the bit. This is done with normal division (to find the char) and modulo (to find the bit):
int wanted_bit = 40000000;
int char_index = wanted_bit / 8; // 8 bits to a byte
int bit_number = wanted_bit % 8;
((char *) memory_ptr)[char_index] |= 1 << bit_number; // Set the bit
Array notation is just another way of writing pointers. You can use that, or use pointers directly like so:
char *the_memory_block = // your allocated block.
char b = *(the_memory_block + 10); // get the 11th byte, *-operator is a dereference.
*(the_memory_block + 20) = b; // set the 21st byte to b, same operator.
memcpy, memzero, memmove, memcmp and others may also be very useful, like this:
char *the_memory_block = // your allocated block.
memcpy(the_memory_block + 20, the_memory_block + 10, 1);
Of course this code is also the same:
char *the_memory_block = // your allocated block.
char b = the_memory_block[10];
the_memory_block[20] = b;
And so is this:
char *the_memory_block = // your allocated block.
memcpy(&the_memory_block[20], &the_memory_block[10], 1);
Also, one is not safer then the other, they are completely equivalent.
I think the array notation would be your answer... You can use the bitshift operators << and >> with AND and OR bitmasks to access specific bits.
You can use array notation, or you can use pointer arithmetic:
char* buffer = new char[1024 * 1024 * 10];
// copy 3 bytes to the middle of the memory region using pointer arithmetic
//
std::memcpy(buffer + (1024 * 1024 * 5), "XXX", 3);
C/C++, arrays are treated as pointers to their first elements.
So, an array name is nothing but an alias to its first element:
*pName is equivalent pName[0]
And then:
*(pName+1) == pName[1];
*(pName+2) == pName[2];
And so on. Parenthesis are used to avoid precedence issues. Never forget using them.
After compilation, both ways will behave the same.
I do prefer brackets notation for readability.