For an object of this example class
class example
{
public:
int x;
};
4 bytes of memory would be allocated, because an int takes 4 bytes.
How much memory would be allocated to an object of the following class -
class node
{
public:
int data;
node *prev, *next;
};
The int would take four bytes, but what about the 'next' and 'prev' pointers? What about the total size of an object of the class?
The total size of the object is sizeof(int) + 2*sizeof(node*) + any padding the compiler might add between the members. Using sizeof(node) is the only portable and reliable way to find that out.
Pointers are typically 4 bytes on a 32-bit x86 system and 8 bytes on an x64 system.
So the members of node add up to 4 + 4 + 4 = 12 bytes on x86, or 4 + 8 + 8 = 20 bytes on x64.
Because of padding, however, the actual size of the class on x64 will usually be 24 bytes: the pointers must be 8-byte aligned, so the compiler inserts 4 bytes of padding after the int.
As mentioned by Oliver Charlesworth, you can also do std::cout << sizeof(node) << "\n";, which will tell you the exact size of class node.
Could someone explain how the order of member declarations inside a class determines the size of that class?
For example:
class temp
{
public:
int i;
short s;
char c;
};
The size of the above class is 8 bytes.
But when the order of the member declarations is changed as below
class temp
{
public:
char c;
int i;
short s;
};
then the size of the class is 12 bytes.
How?
The reason behind this behavior is data structure alignment and padding. If you create a 4-byte variable, e.g. an int, it will be aligned to a 4-byte boundary, i.e. it will start at a memory address that is a multiple of 4. The same applies to other data types: a 2-byte short should start at an even memory address, and so on.
Hence, if you have a 1-byte char declared before the int (assumed to be 4 bytes here), 3 unused bytes are left between them. These unused bytes are called padding.
Data structure alignment
Another good pictorial explanation
Reason for alignment
Padding allows faster memory access: for the CPU, accessing aligned memory is faster. For example, reading a 4-byte aligned integer might take a single memory read, whereas an integer located at an unaligned address range (say addresses 0x0002 to 0x0005) could take two memory reads.
One way to tell the compiler to skip this padding (specific to gcc/g++) is to apply the packed attribute to the structure, written __attribute__((packed)). The same mechanism also lets you enforce alignment to a specific boundary of your choice (2, 4, 8 etc.) using the aligned attribute.
Best practice
It is a good idea to order the members of your class/struct so that they are already aligned with minimum padding. This reduces the overall size of the class. The compiler will not do this for you: it lays members out in declaration order rather than rearranging them to save space. Also, always access member variables by name rather than reading a specific byte of the structure on the assumption that a value is located there.
Another useful SO question on performance advantage of alignment
For the sake of completeness, the following would still have a size of 8 bytes in your scenario (32-bit machine), but it cannot get any better: all 8 bytes are now occupied and there is no padding.
class temp
{
public:
int i;
short s;
char c;
char c2;
};
class temp
{
public:
int i; //size 4 alignment 4
short s; //size 2 alignment 2
char c; //size 1 alignment 1
}; //Size 8 alignment max(4,2,1)=4
temp[i[0-4]; s[4-6]; c[6-7]] -> 8
Padding in (7-8)
class temp
{
public:
char c; //size 1 alignment 1
int i; //size 4 alignment 4
short s; //size 2 alignment 2
};//Size 12 alignment max(4,2,1)=4
temp[c[0-1]; i[4-8]; s[8-10]] -> 12
Padding in (1-4) and (10-12)
I have code written to create a linked list of dynamically created objects:
#include <iostream>
using namespace std;
struct X {
int i;
X* x;
};
void birth(X* head, int quant){
X* x = head;
for(int i=0;i<quant-1;i++){
x->i = i+1;
x->x = new X;
x = x->x;
}
x->i = quant;
x->x = 0;
}
void kill(X* x){
X* next;
while(1==1){
cout << x->i << endl;
cout << (long)x << endl;
next = x->x;
delete x;
if(next == 0){
break;
} else {
x = next;
}
}
}
int main(){
cout << (long)sizeof(X) << endl;
X* x = new X;
birth(x, 10);
kill(x);
return 0;
}
Which seems to be working, except for the fact that when you look at the addresses of each of the objects...
16
1
38768656
2
38768688
3
38768720
4
38768752
5
38768784
6
38768816
7
38768848
8
38768880
9
38768912
10
38768944
They seem to be created 32 bytes apart despite the size of X being only 16 bytes. Is there an issue with how I am creating the objects, or is this just a consequence of how dynamic allocation works?
The reason is stated in 7.22.3 Memory management functions of the C Standard:
The order and contiguity of storage allocated by successive calls to
the aligned_alloc, calloc, malloc, and realloc functions is
unspecified. The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any type
of object with a fundamental alignment requirement and then used to
access such an object or an array of such objects in the space
allocated
Since the memory must be "suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement", memory returned by malloc et al tends to start on distinct, platform-dependent multiples - usually 8- or 16-byte boundaries.
And because new is usually implemented with malloc, this applies to C++ new also.
Addresses of allocated memory blocks are controlled by the heap manager. Only the heap manager's interface is defined (new/delete, malloc/free), not its implementation. The application has to accept the provided addresses and work with them.
In other words, it is theoretically possible to implement a heap manager that allocates memory blocks at random-like addresses. The application, however, has to work equally well also in this case.
The new operator does not guarantee contiguous allocation. Here is a more convincing example:
#include <iostream>
int main()
{
for (int i = 0 ; i < 32 ; ++i)
std::cout << std::hex << new int() << std::endl;
}
Output on a 64bit CPU:
0x22cac20
0x22cac40
0x22cac60
0x22cac80
...
0x22cafe0
0x22cb000
You are working in an environment with 8 bytes of allocation overhead and minimum dynamic memory alignment of 16 bytes. So each 16 byte allocation has 8 bytes of allocation overhead and 8 bytes of alignment padding.
If you try again with a 24 byte object (making sure sizeof really is 24 not 32) you will find only 8 bytes of overhead and not an additional 8 bytes of alignment padding.
There is a minimum size (including overhead) of 32 bytes. So if you try with a tiny object, you get a total of 32, not 16. If you try with a 40 byte object, you get a total of 48 demonstrating the lack of 32 byte alignment.
That is all specific to the environment in which you are running. The C++ standard allows for a much wider range of possible behavior.
The 8 bytes immediately preceding the 16-byte aligned chunk returned by the allocator must hold the size of the allocation plus at least one status bit indicating whether the previous chunk is free. That is the minimum overhead a 64-bit allocator needs and while the chunk is in use it is all the overhead needed. But once a chunk is free, there is significant overhead at the beginning of the chunk to support consolidating adjacent free chunks and to support quickly finding a good size free chunk for new allocations. That overhead wouldn't fit if the total were just 16 bytes.
I'm using two different libraries to perform atomic operations. I create a binary tree node structure with a key (8 bytes) and pointer to left and right children (8 each).
The expected node size is 24 bytes.
If I use Intel TBB library I get the expected behaviour. But if I use HP's atomic_ops library I see the node size as 32.
Compilers used:
gcc4.6, gcc4.8, icc 2013
Machine arch: x86_64
Code:
#include<stdio.h>
#include<stdlib.h>
#include<tbb/atomic.h>
#include<atomic_ops.h>
struct node24
{
unsigned long key; //size 8
tbb::atomic<struct node*> child[2]; //size 2*8=16
};
struct node32
{
unsigned long key; // size 8
AO_double_t child; // size 16
};
int main()
{
printf("TBB node size: %zu\n",sizeof(node24));
printf("HP atomicOps node size: %zu\n",sizeof(node32));
}
Output
$ ./foo.o
TBB node size: 24
HP atomicOps node size: 32
EDIT
My assumption is for node24 the size is rounded up to the nearest 8 and for node32 the size is rounded up to the nearest 16 (size of AO_double_t). So I added an extra value variable (8 bytes) to make the node size as 32. Now I expected the size of node32 to be 32 but it becomes 48. I don't understand why the extra 16 bytes of padding when it is already aligned at 32.
There is little reason why non-standard implementations of atomics should agree on the data types they use, so size and alignment can differ. Depending on compiler flags, one library could even use a locked version in some cases where the other uses a native instruction. Just don't mix them.
Modern C and C++ have atomics built into the languages. Use them if you can; they are even designed to be compatible between the two.
As pointed out in the comments, node32::child is defined using
typedef __m128 double_ptr_storage;
and it has an alignment of 16. Because key is only 8 bytes, the compiler has to insert another 8 bytes of padding after it so that child starts on a 16-byte boundary. When you added the third field (I assume at the end?), the compiler had to add further padding after it so that the struct size stays a multiple of 16, which keeps every element of an array of nodes correctly aligned.