Casting a void pointer to check memory alignment - c++

One of the solutions which often turns up in several posts on how to determine if a void * points to aligned memory involves casting the void pointer. That is say I get a void *ptr = CreateMemory() and now I want to check if memory pointed to by ptr is a multiple of some value, say 16.
See How to determine if memory is aligned? which makes a specific claim.
A pointer p is aligned on a 16-byte boundary iff ((unsigned long)p & 15) == 0.
A solution in a similar vein appears in this post.
How to check if a pointer points to a properly aligned memory location?
Can someone clarify, how does this casting work? I mean, if I had a void *ptr = CreateMem();, it seems to me that (unsigned long)(ptr) would give me the pointer itself reinterpreted as an unsigned long. Why would the value of this reinterpreted void pointer have any bearing on the alignment of the memory pointed to?
EDIT: Thanks for all the useful comments. Please bear with me a bit more. Perhaps a simplified example will help me understand this better.
#include <iostream>
using namespace std;
struct __attribute__((aligned(16))) Data0 {
uint8_t a;
};
struct Data1 {
uint8_t a;
};
int main() {
std::cout<<sizeof(Data0) << "\n"; //< --(1)
std::cout<<sizeof(Data1) << "\n";
Data0 ptr0[10];
Data1 ptr1[10];
std::cout<<(unsigned long) (ptr0 + 1) - (unsigned long) ptr0 << "\n"; //< --- (2)
std::cout<<(unsigned long) (ptr1 + 1) - (unsigned long) ptr1 << "\n";
return 0;
}
To date, I always interpreted aligned memory to have the following two requirements. sizeof() should return a multiple of the the specified size (See condition (1)). And the while incrementing a pointer to aligned struct array, the stride would end up also being a multiple of the specified size (See condition (2)).
So I am somewhat surprised to see there is a third requirement on the actual value of ptr0 as well. Or I might have misunderstood this entirely. Would the memory pointed to by ptr0 in the above example be considered aligned and not so for ptr1?
As I am typing this, I realize I do not actually understand what aligned memory itself means. Since I have mostly dealt with this while allocating cuda buffers, I tend to relate it to some sort of padding required for my data structs.
Consider a second example. That of aligned_alloc. https://en.cppreference.com/w/c/memory/aligned_alloc
Allocate size bytes of uninitialized storage whose alignment is specified by alignment.
I am not sure how to interpret this. Say, I do a void *p0 = aligned_alloc(16, 16 * 2), what is different about the memory pointed to by p0 as compared to say p1 where std::vector<char> buffer(32); char *p1 = buffer.data().

(unsigned long)(ptr) casting a pointer to an unsigned integer returns the pointer's memory address.
p & 15 is equivalent of p % 16, which is the rest of the division by 16. If it's 0, then it means the memory is aligned to multiple of 16.
((unsigned long)(ptr) & 15) == 0 returns true if the memory is aligned to multiple of 16.
uintptr_t is a better type for such casting, as it should be more platform-agnostic.

Why would the value of this reinterpreted void pointer have any bearing on the alignment of the memory pointed to?
Because in common architectures the conversion of a void pointer to an unsigned long is the address of the memory pointed to by the pointer. Or because of the way conversion to an unsigned type works, the low bits of the address if it is too big to fit in an unsigned long. So is is enough to test for alignment because in that case you only need the lowest bits of the address.
I am not sure whether it is really mandated per standard, because the actual representation of a memory address if left to the implementation, and I can remember the segment+address representation used by 16 bits intel processors. In those architectures the addressable memory used 20 bits, and a far pointer (able to represent any address) was represented as a 32 bits unsigned value where the high 16 bits were the segment and the low 16 bits were the offset. Those were then combined as:
Segment S S S S _ (4 * 4 = 16 bits)
Offset _ O O O O (4 * 4 = 16 bits)
Address A A A A A (5 * 4 = 20 bits)
You can see that this architecture did not allow to easily test for alignment higher than 16. For example this pointer value 0x00010040 is divisible by 64 but the actual address is 0x00050 (80) which is only divisible by 16.
But only dinosaurs can remember the Intel 8086 and its segmented address representation, and even with it, testing alignment up to 16 bytes was straightforward...

Related

C++ adding 4 bytes to pointer address

I have a question about pointers, and memory addresses:
Supposing I have the following code:
int * array = (int *) malloc(sizeof(int) * 4);
Now in array im storing a memory address, I know that c++ takes already care when adding +1 to this pointer it will add 4 bytes, but what If I want to add manually 4 bytes?
array + 0x004
If im correct this will lead to add 4*4 (16) bytes, but my Idea is to add manually those 4 bytes.
Why? Just playing around, i've tried this and I got a totally different result from what I expected, then i've researched and i've seen that c++ takes already care when you add +1 to a pointer (it sums 4 bytes in this case).
Any idea?
For a pointer p to a type T with value v, the expression p+n will (on most systems anyway) result in a pointer to the address v+n*sizeof(T). To get a fixed-byte offset to the pointer, you can first cast it to a character pointer, like this:
reinterpret_cast<T*>(reinterpret_cast<char*>(p) + n)
In c++, sizeof(char) is defined to be equal to 1.
Do note that accessing improperly aligned values can have large performance penalties.
Another thing to note is that, in general, casting pointers to different types is not allowed (called the strict aliasing rule), but an exception is explicitly made for casting any pointer type to char* and back.
The trick is convert the type of array into any pointer-type with a size of 1 Byte, or store the pointer value in an integer.
#include <stdint.h>
int* increment_1(int* ptr) {
//C-Style
return (int*)(((char*)ptr) + 4);
}
int* increment_2(int* ptr) {
//C++-Style
char* result = reinterpret_cast<char*>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
int* increment_3(int* ptr) {
//Store in integer
intptr_t result = reinterpret_cast<intptr_t>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
Consider that if you add an arbitrary number of bytes to an address of an object of type T, it no longer makes sense to use a pointer of type T, since there might not be an object of type T at the incremented memory address.
If you want to access a particular byte of an object, you can do so using a pointer to a char, unsigned char or std::byte. Such objects are the size of a byte, so incrementing behaves just as you would like. Furthermore, while rules of C++ disallow accessing objects using incompatible pointers, these three types are excempt of that rule and are allowed to access objects of any type.
So, given
int * array = ....
You can access the byte at index 4 like this:
auto ptr = reinterpret_cast<unsigned char*>(array);
auto byte_at_index_4 = ptr + 4;
array + 0x004
If im correct this will lead to add 4*4 (16) bytes
Assuming sizeof(int) happens to be 4, then yes. But size of int is not guaranteed to be 4.

Reliably determine the size of char

I was wondering how to reliably determine the size of a character in a portable way. AFAIK sizeof(char) can not be used because this yields always 1, even on system where the byte has 16 bit or even more or less.
For example when dealing with bits, where you need to know exactly how big it is, I was wondering if this code would give the real size of a character, independent on what the compiler thinks of it. IMO the pointer has to be increased by the compiler to the correct size, so we should have the correct value. Am I right on this, or might there be some hidden problem with pointer arithmetics, that would yield also wrong results on some systems?
int sizeOfChar()
{
char *p = 0;
p++;
int size_of_char = (int)p;
return size_of_char;
}
There's a CHAR_BIT macro defined in <limits.h> that evaluates to exactly what its name suggests.
IMO the pointer has to be increased by the compiler to the correct size, so we should have the correct value
No, because pointer arithmetic is defined in terms of sizeof(T) (the pointer target type), and the sizeof operator yields the size in bytes. char is always exactly one byte long, so your code will always yield the NULL pointer plus one (which may not be the numerical value 1, since NULL is not required to be 0).
I think it's not clear what you consider to be "right" (or "reliable", as in the title).
Do you consider "a byte is 8 bits" to be the right answer? If so, for a platform where CHAR_BIT is 16, then you would of course get your answer by just computing:
const int octets_per_char = CHAR_BIT / 8;
No need to do pointer trickery. Also, the trickery is tricky:
On an architecture with 16 bits as the smallest addressable piece of memory, there would be 16 bits at address 0x00001, another 16 bits at address 0x0001, and so on.
So, your example would compute the result 1, since the pointer would likely be incremented from 0x0000 to 0x0001, but that doesn't seem to be what you expect it to compute.
1 I use a 16-bit address space for brevity, it makes the addresses easier to read.
The size of one char (aka byte ) in bits is determined by the macro CHAR_BIT in <limits.h> (or <climits> in C++).
The sizeof operator always returns the size of a type in bytes, not in bits.
So if on some system CHAR_BIT is 16 and sizeof(int) is 4, that means an int has 64 bits on that system.

What is the size of a pointer? [duplicate]

This question already has answers here:
Do all pointers have the same size in C++?
(10 answers)
Closed 5 months ago.
Is the size of a pointer the same as the size as the type it's pointing to, or do pointers always have a fixed size? For example...
int x = 10;
int * xPtr = &x;
char y = 'a';
char * yPtr = &y;
std::cout << sizeof(x) << "\n";
std::cout << sizeof(xPtr) << "\n";
std::cout << sizeof(y) << "\n";
std::cout << sizeof(yPtr) << "\n";
What would the output of this be? Would sizeof(xPtr) return 4 and sizeof(yPtr) return 1, or would the 2 pointers actually return the same size?
The reason I ask this is because the pointers are storing a memory address and not the values of their respective stored addresses.
Function Pointers can have very different sizes, from 4 to 20 bytes on an x86 machine, depending on the compiler. So the answer is no - sizes can vary.
Another example: take an 8051 program. It has three memory ranges and thus has three different pointer sizes, from 8 bit, 16 bit, 24 bit, depending on where the target is located, even though the target's size is always the same (e.g., char).
Pointers generally have a fixed size, for ex. on a 32-bit executable they're usually 32-bit. There are some exceptions, like on old 16-bit windows when you had to distinguish between 32-bit pointers and 16-bit... It's usually pretty safe to assume they're going to be uniform within a given executable on modern desktop OS's.
Edit: Even so, I would strongly caution against making this assumption in your code. If you're going to write something that absolutely has to have a pointers of a certain size, you'd better check it!
Function pointers are a different story -- see Jens' answer for more info.
On 32-bit machine sizeof pointer is 32 bits ( 4 bytes), while on 64 bit machine it's 8 byte. Regardless of what data type they are pointing to, they have fixed size.
To answer your other question. The size of a pointer and the size of what it points to are not related. A good analogy is to consider them like postal addresses. The size of the address of a house has no relationship to the size of the house.
Pointers are not always the same size on the same architecture.
You can read more on the concept of "near", "far" and "huge" pointers, just as an example of a case where pointer sizes differ...
http://en.wikipedia.org/wiki/Intel_Memory_Model#Pointer_sizes
They can be different on word-addressable machines (e.g., Cray PVP systems).
Most computers today are byte-addressable machines, where each address refers to a byte of memory. There, all data pointers are usually the same size, namely the size of a machine address.
On word-adressable machines, each machine address refers instead to a word larger than a byte. On these, a (char *) or (void *) pointer to a byte of memory has to contain both a word address plus a byte offset within the addresed word.
http://docs.cray.com/books/004-2179-001/html-004-2179-001/rvc5mrwh.html
Recently came upon a case where this was not true, TI C28x boards can have a sizeof pointer == 1, since a byte for those boards is 16-bits, and pointer size is 16 bits. To make matters more confusing, they also have far pointers which are 22-bits. I'm not really sure what sizeof far pointer would be.
In general, DSP boards can have weird integer sizes.
So pointer sizes can still be weird in 2020 if you are looking in weird places
The size of a pointer is the size required by your system to hold a unique memory address (since a pointer just holds the address it points to)

Memory alignment in C-structs

I'm working on a 32-bit machine, so I suppose that the memory alignment should be 4 bytes. Say I have this struct:
typedef struct {
unsigned short v1;
unsigned short v2;
unsigned short v3;
} myStruct;
The plain added size is 6 bytes, and I suppose that the aligned size should be 8, but sizeof(myStruct) returns me 6.
However if I write:
typedef struct {
unsigned short v1;
unsigned short v2;
unsigned short v3;
int i;
} myStruct;
the plain added size is 10 bytes, aligned size shall be 12, and this time sizeof(myStruct) == 12.
Can somebody explain what is the difference?
At least on most machines, a type is only ever aligned to a boundary as large as the type itself [Edit: you can't really demand any "more" alignment than that, because you have to be able to create arrays, and you can't insert padding into an array]. On your implementation, short is apparently 2 bytes, and int 4 bytes.
That means your first struct is aligned to a 2-byte boundary. Since all the members are 2 bytes apiece, no padding is inserted between them.
The second contains a 4-byte item, which gets aligned to a 4-byte boundary. Since it's preceded by 6 bytes, 2 bytes of padding is inserted between v3 and i, giving 6 bytes of data in the shorts, two bytes of padding, and 4 more bytes of data in the int for a total of 12.
Forget about having different members, even if you write two structs whose members are exactly same, with a difference is that the order in which they're declared is different, then size of each struct can be (and often is) different.
For example, see this,
#include <iostream>
using namespace std;
struct A
{
char c;
char d;
int i;
};
struct B
{
char c;
int i; //note the order is different!
char d;
};
int main() {
cout << sizeof(A) << endl;
cout << sizeof(B) << endl;
}
Compile it with gcc-4.3.4, and you get this output:
8
12
That is, sizes are different even though both structs has same members!
Code at Ideone : http://ideone.com/HGGVl
The bottomline is that the Standard doesn't talk about how padding should be done, and so the compilers are free to make any decision and you cannot assume all compilers make the same decision.
By default, values are aligned according to their size. So a 2-byte value like a short is aligned on a 2-byte boundary, and a 4-byte value like an int is aligned on a 4-byte boundary
In your example, 2 bytes of padding are added before i to ensure that i falls on a 4-byte boundary.
(The entire structure is aligned on a boundary at least as big as the biggest value in the structure, so your structure will be aligned to a 4-byte boundary.)
The actual rules vary according to the platform - the Wikipedia page on Data structure alignment has more details.
Compilers typically let you control the packing via (for example) #pragma pack directives.
Assuming:
sizeof(unsigned short) == 2
sizeof(int) == 4
Then I personally would use the following (your compiler may differ):
unsigned shorts are aligned to 2 byte boundaries
int will be aligned to 4 byte boundaries.
typedef struct
{
unsigned short v1; // 0 bytes offset
unsigned short v2; // 2 bytes offset
unsigned short v3; // 4 bytes offset
} myStruct; // End 6 bytes.
// No part is required to align tighter than 2 bytes.
// So whole structure can be 2 byte aligned.
typedef struct
{
unsigned short v1; // 0 bytes offset
unsigned short v2; // 2 bytes offset
unsigned short v3; // 4 bytes offset
/// Padding // 6-7 padding (so i is 4 byte aligned)
int i; // 8 bytes offset
} myStruct; // End 12 bytes
// Whole structure needs to be 4 byte aligned.
// So that i is correctly aligned.
Firstly, while the specifics of padding are left up to the compiler, the OS also imposes some rules as to alignment requirements. This answer assumes that you are using gcc, though the OS may vary
To determine the space occupied by a given struct and its elements, you can follow these rules:
First, assume that the struct always starts at an address that is properly aligned for all data types.
Then for every entry in the struct:
The minimum space needed is the raw size of the element given by sizeof(element).
The alignment requirement of the element is the alignment requirement of the element's base type.
Notably, this means that the alignment requirement for a char[20] array is the same as
the requirement for a plain char.
Finally, the alignment requirement of the struct as a whole is the maximum of the alignment requirements of each of its elements.
gcc will insert padding after a given element to ensure that the next one (or the struct if we are talking about the last element) is correctly aligned. It will never rearrange the order of the elements in the struct, even if that will save memory.
Now the alignment requirements themselves are also a bit odd.
32-bit Linux requires that 2-byte data types have 2-byte alignment (their addresses must be even). All larger data types must have 4-byte alignment (addresses ending in 0x0, 0x4, 0x8 or 0xC). Note that this applies to types larger than 4 bytes as well (such as double and long double).
32-bit Windows is more strict in that if a type is K bytes in size, it must be K byte aligned. This means that a double can only placed at an address ending in 0x0 or 0x8. The only exception to this is the long double which is still 4-byte aligned even though it is actually 12-bytes long.
For both Linux and Windows, on 64-bit machines, a K byte type must be K byte aligned. Again, the long double is an exception and must be 16-byte aligned.
Each data type needs to be aligned on a memory boundary of its own size. So a short needs to be on aligned on a 2-byte boundary, and an int needs to be on a 4-byte boundary. Similarly, a long long would need to be on an 8-byte boundary.
The reason for the second sizeof(myStruct) being 12 is the padding that gets inserted between v3 and i to align i at a 32-bit boundary. There is two bytes of it.
Wikipedia explains the padding and alignment reasonably clearly.
In your first struct, since every item is of size short, the whole struct can be aligned on short boundaries, so it doesn't need to add any padding at the end.
In the second struct, the int (presumably 32 bits) needs to be word aligned so it inserts padding between v3 and i to align i.
Sounds like its being aligned to bounderies based on the size of each var, so that the address is a multiple of the size being accessed(so shorts are aligned to 2, ints aligned to 4 etc), if you moved one of the shorts after the int, sizeof(mystruct) should be 10. Of course this all depends on the compiler being used and what settings its using in turn.
The standard doesn't say much about the layout of structs with complete types - it's up to to the compiler. It decided that it needs the int to start on a boundary to access it, but since it has to do sub-boundary memory addressing for the shorts there is no need to pad them

Understanding sizeof(char) in 32 bit C compilers

(sizeof) char always returns 1 in 32 bit GCC compiler.
But since the basic block size in 32 bit compiler is 4, How does char occupy a single byte when the basic size is 4 bytes???
Considering the following :
struct st
{
int a;
char c;
};
sizeof(st) returns as 8 as agreed with the default block size of 4 bytes (since 2 blocks are allotted)
I can never understand why sizeof(char) returns as 1 when it is allotted a block of size 4.
Can someone pls explain this???
I would be very thankful for any replies explaining it!!!
EDIT : The typo of 'bits' has been changed to 'bytes'. I ask Sorry to the person who made the first edit. I rollbacked the EDIT since I did not notice the change U made.
Thanks to all those who made it a point that It must be changed especially #Mike Burton for downvoting the question and to #jalf who seemed to jump to conclusions over my understanding of concepts!!
sizeof(char) is always 1. Always. The 'block size' you're talking about is just the native word size of the machine - usually the size that will result in most efficient operation. Your computer can still address each byte individually - that's what the sizeof operator is telling you about. When you do sizeof(int), it returns 4 to tell you that an int is 4 bytes on your machine. Likewise, your structure is 8 bytes long. There is no information from sizeof about how many bits there are in a byte.
The reason your structure is 8 bytes long rather than 5 (as you might expect), is that the compiler is adding padding to the structure in order to keep everything nicely aligned to that native word length, again for greater efficiency. Most compilers give you the option to pack a structure, either with a #pragma directive or some other compiler extension, in which case you can force your structure to take minimum size, regardless of your machine's word length.
char is size 1, since that's the smallest access size your computer can handle - for most machines an 8-bit value. The sizeof operator gives you the size of all other quantities in units of how many char objects would be the same size as whatever you asked about. The padding (see link below) is added by the compiler to your data structure for performance reasons, so it is larger in practice than you might think from just looking at the structure definition.
There is a wikipedia article called Data structure alignment which has a good explanation and examples.
It is structure alignment with padding. c uses 1 byte, 3 bytes are non used. More here
Sample code demonstrating structure alignment:
struct st
{
int a;
char c;
};
struct stb
{
int a;
char c;
char d;
char e;
char f;
};
struct stc
{
int a;
char c;
char d;
char e;
char f;
char g;
};
std::cout<<sizeof(st) << std::endl; //8
std::cout<<sizeof(stb) << std::endl; //8
std::cout<<sizeof(stc) << std::endl; //12
The size of the struct is bigger than the sum of its individual components, since it was set to be divisible by 4 bytes by the 32 bit compiler. These results may be different on different compilers, especially if they are on a 64 bit compiler.
First of all, sizeof returns a number of bytes, not bits. sizeof(char) == 1 tells you that a char is eight bits (one byte) long. All of the fundamental data types in C are at least one byte long.
Your structure returns a size of 8. This is a sum of three things: the size of the int, the size of the char (which we know is 1), and the size of any extra padding that the compiler added to the structure. Since many implementations use a 4-byte int, this would imply that your compiler is adding 3 bytes of padding to your structure. Most likely this is added after the char in order to make the size of the structure a multiple of 4 (a 32-bit CPU access data most efficiently in 32-bit chunks, and 32 bits is four bytes).
Edit: Just because the block size is four bytes doesn't mean that a data type can't be smaller than four bytes. When the CPU loads a one-byte char into a 32-bit register, the value will be sign-extended automatically (by the hardware) to make it fill the register. The CPU is smart enough to handle data in N-byte increments (where N is a power of 2), as long as it isn't larger than the register. When storing the data on disk or in memory, there is no reason to store every char as four bytes. The char in your structure happened to look like it was four bytes long because of the padding added after it. If you changed your structure to have two char variables instead of one, you should see that the size of the structure is the same (you added an extra byte of data, and the compiler added one fewer byte of padding).
All object sizes in C and C++ are defined in terms of bytes, not bits. A byte is the smallest addressable unit of memory on the computer. A bit is a single binary digit, a 0 or a 1.
On most computers, a byte is 8 bits (so a byte can store values from 0 to 256), although computers exist with other byte sizes.
A memory address identifies a byte, even on 32-bit machines. Addresses N and N+1 point to two subsequent bytes.
An int, which is typically 32 bits covers 4 bytes, meaning that 4 different memory addresses exist that each point to part of the int.
In a 32-bit machine, all the 32 actually means is that the CPU is designed to work efficiently with 32-bit values, and that an address is 32 bits long. It doesn't mean that memory can only be addressed in blocks of 32 bits.
The CPU can still address individual bytes, which is useful when dealing with chars, for example.
As for your example:
struct st
{
int a;
char c;
};
sizeof(st) returns 8 not because all structs have a size divisible by 4, but because of alignment. For the CPU to efficiently read an integer, its must be located on an address that is divisible by the size of the integer (4 bytes). So an int can be placed on address 8, 12 or 16, but not on address 11.
A char only requires its address to be divisible by the size of a char (1), so it can be placed on any address.
So in theory, the compiler could have given your struct a size of 5 bytes... Except that this wouldn't work if you created an array of st objects.
In an array, each object is placed immediately after the previous one, with no padding. So if the first object in the array is placed at an address divisible by 4, then the next object would be placed at a 5 bytes higher address, which would not be divisible by 4, and so the second struct in the array would not be properly aligned.
To solve this, the compiler inserts padding inside the struct, so its size becomes a multiple of its alignment requirement.
Not because it is impossible to create objects that don't have a size that is a multiple of 4, but because one of the members of your st struct requires 4-byte alignment, and so every time the compiler places an int in memory, it has to make sure it is placed at an address that is divisible by 4.
If you create a struct of two chars, it won't get a size of 4. It will usually get a size of 2, because when it contains only chars, the object can be placed at any address, and so alignment is not an issue.
Sizeof returns the value in bytes. You were talking about bits. 32 bit architectures are word aligned and byte referenced. It is irrelevant how the architecture stores a char, but to compiler, you must reference chars 1 byte at a time, even if they use up less than 1 byte.
This is why sizeof(char) is 1.
ints are 32 bit, hence sizeof(int)= 4, doubles are 64 bit, hence sizeof(double) = 8, etc.
Because of optimisation padding is added so size of an object is 1, 2 or n*4 bytes (or something like that, talking about x86). That's why there is added padding to 5-byte object and to 1-byte not. Single char doesn't have to be padded, it can be allocated on 1 byte, we can store it on space allocated with malloc(1). st cannot be stored on space allocated with malloc(5) because when st struct is being copied whole 8 bytes are being copied.
It works the same way as using half a piece of paper. You use one part for a char and the other part for something else. The compiler will hide this from you since loading and storing a char into a 32bit processor register depends on the processor.
Some processors have instructions to load and store only parts of the 32bit others have to use binary operations to extract the value of a char.
Addressing a char works as it is AFAIR by definition the smallest addressable memory. On a 32bit system pointers to two different ints will be at least 4 address points apart, char addresses will be only 1 apart.