Confused about data address alignment - C++

I have a question about the answer provided by
@dan04 to "What is aligned memory allocation?".
In particular, if I have something like this:
int main(){
int num; // 4byte
char s; // 1byte
int *ptr;
}
On a 32-bit machine, would the compiler still pad this data by default?
The previous question asked about structs; I am asking about variables declared in main.
update:
a = 2 bytes
b = 4 bytes
c = 1 byte
d = 1 byte
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words

There are no rules for this. It depends on the implementation you are using. Further it may change depending on compiler options. The best you can do is to print the address of each variable. Then you can see how the memory layout is.
Something like this:
#include <stdio.h>

int main(void)
{
    int num;
    char s;
    int *ptr;
    printf("num: %p - size %zu\n", (void*)&num, sizeof num);
    printf("s : %p - size %zu\n", (void*)&s, sizeof s);
    printf("ptr: %p - size %zu\n", (void*)&ptr, sizeof ptr);
    return 0;
}
Possible output:
num: 0x7ffee97fce84 - size 4
s : 0x7ffee97fce83 - size 1
ptr: 0x7ffee97fce88 - size 8
Also notice that in case you don't take the address (&) of a variable, the compiler may optimize your code so that the variable is never put into memory at all.
In general, alignment is chosen to get the best performance out of the hardware platform. That typically implies that variables are aligned to their size, or at least 4-byte aligned for variables larger than 4 bytes.
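If you want to check this rather than eyeball the printed addresses, here is a minimal sketch. The helper names is_aligned and locals_are_aligned are made up for illustration; alignof only reports the minimum alignment the type requires, and the compiler is free to over-align.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if address p is a multiple of `alignment`,
// which must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Checks that three locals, like the ones in the question, each satisfy
// the alignment their own type requires.
inline bool locals_are_aligned() {
    int num = 0;
    char s = 0;
    int* ptr = &num;
    return is_aligned(&num, alignof(int)) &&
           is_aligned(&s, alignof(char)) &&
           is_aligned(&ptr, alignof(int*));
}
```

Calling locals_are_aligned() should return true on any conforming implementation. It says nothing about the order or the gaps between the variables, which remain implementation specific.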
Update:
OP gives a specific layout example in the update and asks if that layout can/will ever happen.
Again the answer is: it is implementation dependent.
So in principle it could happen on some specific system. That said, I doubt that it will happen on any mainstream system.
Here is another code example, compiled with gcc -O3:
#include <stdio.h>

int main(void)
{
    short s1;
    int i1;
    char c1;
    int i2;
    char c2;
    printf("s1: %p - size %zu\n", (void*)&s1, sizeof s1);
    printf("i1: %p - size %zu\n", (void*)&i1, sizeof i1);
    printf("c1: %p - size %zu\n", (void*)&c1, sizeof c1);
    printf("i2: %p - size %zu\n", (void*)&i2, sizeof i2);
    printf("c2: %p - size %zu\n", (void*)&c2, sizeof c2);
    return 0;
}
Output from my system:
s1: 0x7ffd222fc146 - size 2 <-- 2 byte aligned
i1: 0x7ffd222fc148 - size 4 <-- 4 byte aligned
c1: 0x7ffd222fc144 - size 1
i2: 0x7ffd222fc14c - size 4 <-- 4 byte aligned
c2: 0x7ffd222fc145 - size 1
Notice how the locations in memory differ from the order in which the variables were defined in the code. That ensures good alignment.
Sorting by address:
c1: 0x7ffd222fc144 - size 1
c2: 0x7ffd222fc145 - size 1
s1: 0x7ffd222fc146 - size 2 <-- 2 byte aligned
i1: 0x7ffd222fc148 - size 4 <-- 4 byte aligned
i2: 0x7ffd222fc14c - size 4 <-- 4 byte aligned
So again to answer the update-question:
On most systems I doubt you'll see a 4-byte variable placed at an address ending in 2, 6, a or e. But still, systems may exist where that could happen.
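For struct members (as opposed to locals), the layout from the question's update can be checked directly with offsetof. This is only a sketch: the struct name Layout and the fixed-width types are my assumptions for the a/b/c/d sizes given in the update.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct mirroring the sizes from the question's update.
struct Layout {
    std::int16_t a;  // 2 bytes
    std::int32_t b;  // 4 bytes
    std::int8_t  c;  // 1 byte
    std::int8_t  d;  // 1 byte
};

// True if member b sits at an offset that is a multiple of its alignment.
constexpr bool b_is_aligned() {
    return offsetof(Layout, b) % alignof(std::int32_t) == 0;
}
```

On typical mainstream compilers b ends up at offset 4 rather than 2, with padding inserted after a, but the exact numbers are up to the implementation.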

It's quite hard to predict exactly, but there's certainly some padding going on.
Take these two programs, for example (I ran them on Coliru, a 64-bit machine):
#include<iostream>
#include <vector>
using namespace std;
//#pragma pack(push,1)
int main(){
int num1(5); // 4byte
int num2(3); // 4byte
char c1[2];
c1[0]='a';
c1[1]='a';
cout << &num1 << " " << &num2 << " " << endl;
cout << sizeof(c1) << " " << &c1 << endl;
}
//#pragma pack(pop)
#include<iostream>
#include <vector>
using namespace std;
//#pragma pack(push,1)
int main(){
int num1(5); // 4byte
int num2(3); // 4byte
char c1[1];
c1[0]='a';
cout << &num1 << " " << &num2 << " " << endl;
cout << sizeof(c1) << " " << &c1 << endl;
}
//#pragma pack(pop)
The first program outputs:
0x7fff3e1f9de8 0x7fff3e1f9dec
2 0x7fff3e1f9de0
While the second program outputs:
0x7fffdca72538 0x7fffdca7253c
1 0x7fffdca72537
You can see that padding is inserted in the first program. Looking at the addresses:
First program: CHAR | CHAR | 6-BYTE PADDING | INT | INT
Second program: CHAR | INT | INT
So for the basic question: yes, padding is probably applied by default.
I also tried #pragma pack to avoid the padding but, in contrast to the struct case, the outputs were exactly the same. That is expected: #pragma pack only controls the layout of struct, class, and union members, not the placement of local variables.
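For comparison, here is a small sketch of the struct case, where packing does change the layout. #pragma pack is a compiler extension (GCC, Clang, and MSVC accept this form), not standard C++, and the struct and function names are invented for the example.

```cpp
#include <cstddef>

struct Unpacked {
    char c;
    int  i;   // usually preceded by padding so that it is aligned
};

#pragma pack(push, 1)   // compiler extension: pack members on 1-byte boundaries
struct Packed {
    char c;
    int  i;   // placed directly after c, possibly misaligned
};
#pragma pack(pop)

// Bytes in each struct that do not belong to any member.
constexpr std::size_t padding_in_unpacked() {
    return sizeof(Unpacked) - sizeof(char) - sizeof(int);
}
constexpr std::size_t padding_in_packed() {
    return sizeof(Packed) - sizeof(char) - sizeof(int);
}
```

With packing in effect, padding_in_packed() should be 0, while padding_in_unpacked() is typically 3 on platforms with a 4-byte int.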

Related

Regarding Bit Fields in Structure and Union

When using bit fields inside structures like:
struct abc{
int a:3;
unsigned int b:1;
} t;
My question is: do bit-field members share the same storage even inside structures? When I looked at their representation, a appeared to take 3 bits (starting from the LSB) and b the next 1 bit (toward the MSB), all within a single 4-byte (32-bit) storage unit.
In case of Unions:
typedef union abc {
unsigned int a:32;
int f:1;
} t;
int main()
{ t q;
q.a = (unsigned int)(pow(2,32)-1);
q.f = 1;
cout << sizeof(t) << endl;
cout << q.a << " " << q.f << endl;
return 0;
}
First of all, the sizeof operator returns 4 bytes as the size of this union.
Since all 32 bits are already used by a, how can it even have space for f?
Thank you for reading till the end, I'll really appreciate your answers.
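One way to explore this is the sketch below. It assumes a 32-bit unsigned int and uses unsigned bit-fields throughout (the question's int f:1 has implementation-specific signedness); Flags, Overlay, and the helper functions are names I made up. The object representation is copied out with memcpy so that no inactive union member is read.

```cpp
#include <cstring>

// Two bit-fields that can be packed into one unsigned int storage unit.
struct Flags {
    unsigned int a : 3;
    unsigned int b : 1;
};

// Union members overlap: f does not need space of its own next to a.
union Overlay {
    unsigned int a : 32;
    unsigned int f : 1;
};

// Copies the bytes of the union into an unsigned int.
inline unsigned int raw_bits(const Overlay& u) {
    static_assert(sizeof(Overlay) >= sizeof(unsigned int),
                  "Overlay must hold at least one unsigned int");
    unsigned int raw = 0;
    std::memcpy(&raw, &u, sizeof raw);
    return raw;
}

// Zero the union through a, then set f and look at the stored bytes.
inline unsigned int raw_after_setting_f() {
    Overlay u{};   // value-initializes the first member, a, to 0
    u.f = 1;       // f reuses one of the bits that a occupied
    return raw_bits(u);
}
```

Which bit f maps to is implementation defined, but the result of raw_after_setting_f() being non-zero shows that f lives inside the same 4 bytes as a rather than beside them.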

Why is the memory size of std::stack bigger than usual in C++?

This is the code for testing my question.
#include <iostream>
#include <stack>
using namespace std;
int main(){
int num;
int Array[1];
stack<int> Stack;
cout << "Int size " << sizeof(num) <<endl; // Output: Int size 4
cout << "Array size " << sizeof(num) <<endl; // Output: Array size 4
cout << "Stack size " << sizeof(Stack) <<endl; // Output: Stack size 80
return 0;
}
I'm trying to understand memory space allocation. Normally the size of an int is 4 bytes. But when I create a std::stack of int, the size of the stack is 80 bytes.
Shouldn't it be 4? Why is std::stack taking 80 bytes? Or what is actually inside the stack that makes its size 80 bytes?
sizeof gets the static size of the object/type. stack dynamically allocates memory for its elements. So, there is no correlation between size of the elements and size of stack in general. So, why is it 80 bytes? This is highly implementation specific. Size of stack is usually the same as the underlying container. By default, the underlying container is a std::deque, so that's where we must have a look. I checked libstdc++ specifically, and it seems to have 1 pointer, 1 size_t for size and 2 iterators like so:
struct _Deque_impl_data
{
_Map_pointer _M_map;
size_t _M_map_size;
iterator _M_start;
iterator _M_finish;
//...
(std::deque derives from _Deque_base which has a single member of type _Deque_impl_data)
The pointer and the size_t are 8 bytes each, and each iterator is 32 bytes. This adds up to 8 + 8 + 32 + 32 = 80 bytes. I didn't investigate further, but since deque is a more complex structure, it's only natural that it needs some memory for its own book-keeping.
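A quick way to convince yourself of both points (the size follows the underlying container, and it does not depend on the number of elements) is a sketch like the following. The function names and the VectorStack alias are only illustrative, and the equality of sizeof between the adaptor and its container is what common implementations do, not something the standard promises.

```cpp
#include <cstddef>
#include <deque>
#include <stack>
#include <vector>

// A stack adaptor over std::vector instead of the default std::deque.
using VectorStack = std::stack<int, std::vector<int>>;

// sizeof of a stack object after n pushes; sizeof is a compile-time
// constant, so pushing elements cannot change it.
inline std::size_t object_size_after_pushes(int n) {
    std::stack<int> s;
    for (int i = 0; i < n; ++i) s.push(i);
    return sizeof(s);
}

// Number of elements after n pushes; this is what size() reports.
inline std::size_t element_count_after_pushes(int n) {
    std::stack<int> s;
    for (int i = 0; i < n; ++i) s.push(i);
    return s.size();
}
```

Comparing sizeof(std::stack<int>) with sizeof(std::deque<int>), and sizeof(VectorStack) with sizeof(std::vector<int>), should give matching pairs on the usual standard libraries.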
You may be confusing sizeof(Stack) with Stack.size() here. The sizeof operator returns the total size of the class object, which, in the case of std::stack, includes (of necessity) a number of internal data and control variables (bringing the size, in your case, to 80 bytes). However, a call to Stack.size() will return the number of items currently on the stack.
These 'internal variables' will include such things as a pointer to the allocated memory (likely 8 bytes), a value recording the current element count (also likely to be 8 bytes) and a number of other pointers and counters, to aid in manipulation of the stack and optimization of access to the contained data, such as the current capacity of the allocated space, etc.
The following modified code shows the difference:
#include <iostream>
#include <stack>
using namespace std;
int main()
{
int num;
int Array[1];
stack<int> Stack;
cout << "Int size " << sizeof(num) << endl; // Int size 4
cout << "Array size " << sizeof(Array) << endl; // Array size 4 (1 "int" element)
cout << "Stack size " << sizeof(Stack) << endl; // Size of a "std::stack<int>" instance
cout << "Stack size " << Stack.size() << endl; // Size (# entries) of stack = 0 (empty)
return 0;
}

Reading values from binary file stored in char array with reinterpret_cast (C++)

I am trying to read in a binary file in a known format. I want to find the most efficient way to extract values from it. My ideas are:
Method 1: Read each value into a new char array then get it into the correct data type. For the first 4 byte positive int, I bitshift the values accordingly and assign to an integer as below.
Method 2: Keep the whole file in a char array, then create pointers to different parts of it. In the code below I am trying to point to these first 4 bytes and use reinterpret_cast to interpret them as an integer when I dereference the variable 'bui'.
But the output from this code is:
11000000001100000000110000000011
3224374275
00000011000011000011000011000000
51130560
My questions are:
1. Why does the endianness get swapped using my method 2, and how do I point to it correctly?
2. Which method is more efficient? I need all of the file, and the file contains other data types too, so I will need to write different methods to interpret them if using method 1. I was assuming I could just define different type pointers if using method 2 without doing extra work!
Thanks
#include <iostream>
#include <bitset>
int main(void){
unsigned char b[4];
//ifs.read((char*)b,sizeof(b));
//let's pretend the following 4 bytes are read in representing the number 3224374275:
b[0] = 0b11000000;
b[1] = 0b00110000;
b[2] = 0b00001100;
b[3] = 0b00000011;
//method 1:
unsigned int a = 0; //4 byte capacity
a = b[0] << 24 | b[1] << 16 | b[2] << 8 | b[3];
std::bitset<32> xm1(a);
std::cout << xm1 << std::endl;
std::cout << a << std::endl;
//method 2;
unsigned int* bui = reinterpret_cast<unsigned int*>(b);
std::bitset<32> xm2(*bui);
std::cout << xm2 << std::endl;
std::cout << *bui << std::endl;
}
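The difference in the output is consistent with a little-endian host: there the byte at the lowest address is the least significant one, so reading the four bytes in place treats b[0] as the low byte. Below is a sketch of both readings, assuming the file stores the value most significant byte first. The function names are made up, and memcpy is used instead of dereferencing a reinterpret_cast pointer, since the latter can run into alignment and aliasing problems.

```cpp
#include <cstdint>
#include <cstring>

// Decodes 4 bytes stored most-significant-first (big-endian).
// The result does not depend on the byte order of the host.
inline std::uint32_t read_u32_be(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Copies 4 bytes exactly as they lie in memory.
// The result depends on the byte order of the host.
inline std::uint32_t read_u32_native(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

For the bytes in the question, read_u32_be gives 3224374275 everywhere, while read_u32_native gives 51130560 on a little-endian machine. If the file format has a fixed byte order, the shift-based version is the portable choice; compilers commonly turn it into a single load plus a byte swap, so it is unlikely to be the bottleneck.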

Why does conceptual storage allocation differ from the actual? [duplicate]

This question already has answers here:
Pointer subtraction confusion
(8 answers)
Closed 6 years ago.
I have a puzzling question (at least for me)
Say I declare an integer array:
int arr[3];
Conceptually, what happens in the memory is that, at compile time, 12 bytes are allocated to store 3 consecutive integers, right? (Here's an illustration)
Based on the illustration, the sample addresses of
arr[0] is 1000,
arr[1] is 1004, and
arr[2] is 1008.
My question is:
If I output the difference between the addresses of arr[0] and arr[1]:
std::cout << &arr[1] - &arr[0] << std::endl;
instead of getting 4,
I surprisingly get 1.
Can anybody explain why it resulted in that output?
PS: On my computer, an int is 4 bytes.
Subtracting two pointers yields the distance in elements, not in bytes: the byte difference is divided by the size of the pointed-to type. So this is not surprising at all, since one would expect 4 / 4, which is 1. Cast to unsigned char * to see the difference in bytes.
#include <iostream>
int
main(void)
{
int arr[2];
std::cout << &arr[1] - &arr[0] << std::endl;
std::cout << reinterpret_cast<unsigned char *>(&arr[1]) -
reinterpret_cast<unsigned char *>(&arr[0]) << std::endl;
return 0;
}

Size of byte when accessed via pointer

I'm working on an Arduino project. I'm trying to pass a byte pointer to a function, and let that function calculate the size of the data that the pointer refers to. But when I let the pointer refer to a byte, sizeof() returns 2. I wrote the following snippet to try to debug:
byte b;
byte *byteptr;
byteptr = &b;
print("sizeof(b): ");
println(sizeof(b));
print("sizeof(*byteptr) pointing to byte: ");
println(sizeof(*byteptr));
print("sizeof(byteptr) pointing to byte: ");
println(sizeof(byteptr));
the printed result is:
sizeof(b): 1
sizeof(*byteptr) pointing to byte: 1
sizeof(byteptr) pointing to byte: 2
So the size of a byte is 1, but via the pointer it's 2??
It appears that on Arduino, pointers are 16 bit. I believe your confusion stems from what * means in this context.
sizeof(*byteptr) is equivalent to the sizeof(byte). The * does not indicate a pointer type, it indicates dereferencing the pointer stored in byteptr. Ergo, it is 1 byte, which you would expect from the type byte.
sizeof(byteptr) does not dereference the pointer, and as such, is the size of the pointer itself, which on this system seems to be 2 bytes/16 bits.
Consider the following:
#include <iostream>
using namespace std;
int main()
{
char a = 1;
char* aP = &a;
cout << "sizeof(char): " << sizeof(char) << endl;
cout << "sizeof(char*): " << sizeof(char*) << endl;
cout << "sizeof(a): " << sizeof(a) << endl;
cout << "sizeof(aP): " << sizeof(aP) << endl;
cout << "sizeof(*aP): " << sizeof(*aP) << endl;
}
Output (on a 64 bit OS/compiler):
sizeof(char): 1
sizeof(char*): 8
sizeof(a): 1
sizeof(aP): 8
sizeof(*aP): 1
@Maseb I think you've gotten a good discussion of the differences between the size of a dereferenced pointer and the size of the pointer itself. I'll just add that sizeof(byte_pointer) must be large enough that every address where a byte value could be stored fits into the pointer's width. For example, if there are 32,000 bytes of storage on your Arduino, a pointer may need to hold any address up to 32,000. Since 2^15 = 32,768, you need 15 bits to give each storage location a unique address. Pointer sizes come in whole bytes, so that rounds up to 16 bits. Therefore, your Arduino has a 16-bit address space for pointers and sizeof(byte_pointer) is 2 bytes.
With that said, I'll go ahead and answer your other question too. If you need to pass an array and a size, just create your own struct that includes both of those data elements. Then you can pass a pointer to this templated struct, which includes the size (this is similar in spirit to the C++ std::array container).
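The arithmetic can be checked with a couple of small helpers. This is only an illustration; both function names are invented, and real pointer widths are fixed by the platform rather than computed like this.

```cpp
#include <climits>

// Smallest number of bits that can give `count` locations distinct addresses.
inline unsigned bits_for_addresses(unsigned long count) {
    const unsigned max_bits = sizeof(unsigned long) * CHAR_BIT;
    unsigned bits = 0;
    while (bits < max_bits && (1UL << bits) < count) ++bits;
    return bits;
}

// Rounds a bit count up to whole bytes.
inline unsigned bytes_for_bits(unsigned bits) {
    return (bits + 7) / 8;
}
```

For 32,000 locations, bits_for_addresses returns 15 and bytes_for_bits(15) returns 2, which matches the 2-byte pointers seen on the Arduino.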
I've written the short code sample below to demonstrate how to create your own template for an array with a size element and then use that size element to iterate over the elements.
template<int N>
struct My_Array{
int size = N;
int elem[N];
};
//create the pointer to the struct
My_Array<3>* ma3 = new My_Array<3>;
void setup() {
//now fill the array element
for(int i=0; i < ma3->size; i++) {
ma3->elem[i]=i;
}
Serial.begin(9600);
//now you can use the size value to iterate over the elements
Serial.print("ma3 is this big: ");
Serial.println(ma3->size);
Serial.println("The array values are:");
Serial.print("\t[");
for(int i=0; i<ma3->size; i++) {
Serial.print(ma3->elem[i]);
if(i < ma3->size-1) Serial.print(", ");
}
Serial.println("]");
}
void loop() {
while(true) { /* do nothing */ }
}
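A different technique, shown here only as a sketch, is to pass the array by reference to a function template so that the compiler deduces its length and no separate size member is needed. The names element_count and byte_count are made up, and this only works where the real array (not a pointer to it) is visible. It is written as standard C++; some Arduino toolchains may not ship <cstddef>, in which case <stddef.h> and plain size_t should work.

```cpp
#include <cstddef>

// Number of elements in a built-in array, deduced at compile time.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

// Total size of a built-in array in bytes.
template <typename T, std::size_t N>
constexpr std::size_t byte_count(const T (&)[N]) {
    return N * sizeof(T);
}
```

Passing a plain pointer such as byteptr to these functions will not compile, which is the point: once an array has decayed to a pointer, its length is gone and has to be carried along some other way, for example in a struct like My_Array.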