align cache for non-POD structure

align cache for non-POD structure - c++

Suppose there is non-POD struct below, is the alignment taken effect? If not, what would be expected?
struct S1
{
string s;
int32_t i;
double d;
} __attribute__ ((aligned (64)));
EDIT: the output of the sample code below is 64 even s is set to a long string.
int main(int argc,char *argv[])
{
S1 s1;
s1.s = "123451111111111111111111111111111111111111111111111111111111111111111111111111";
s1.i = 100;
s1.d = 20.123;
printf("%ld\n", sizeof(s1));
return 1;
}

Size of objects in C++ never changes. It's compile time constant. It has to be, because the compiler needs to know how much space to allocate for the object. In fact, sizeof(s1) is just alias for sizeof(S1) and no instance is involved with the later.
The std::string is a smallish object that wraps pointer to array of characters and reallocates that array to accommodate the value you set. So the string itself is stored outside the S1 object.
The __attribute__ ((aligned (64))) is not standard C++, it is GCC extension. It is honored even for non-POD objects. It simply tells the compiler to round the size of the structure up to next multiple of 64 bytes.
While it is honored on non-POD objects, I can't think of a reason to use it for one and can only think of very few reasons to use it for POD object. The default alignment is reasonable. You don't need to tweak it.

Yes, the alignment takes effect in GCC.
Example code:
#include <string>
#include <iostream>
#include <cstddef>
using namespace std;
struct S1
{
string s;
int i;
double d;
} __attribute__ ((aligned (64)));
struct S2
{
string s;
int i;
double d;
} __attribute__ ((aligned)); // let the compiler decide
struct S3
{
string s;
int i;
double d;
};
int main() {
cout << "S1 " << sizeof(S1) << endl;
cout << "S2 " << sizeof(S2) << endl;
cout << "S3 " << sizeof(S3) << endl;
}
Output:
S1 64
S2 16
S3 16
Also: You have observed that S1 is non-POD. Note that std::string is allowed to store its data externally (it usually does so because the data can be of arbitrary length, so a memory buffer must be allocated dynamically).
Remember that sizeof is only calculated during compilation, it can't depend on runtime values. Specifically, you can't ever query a size of dynamically allocated memory this way.
Also remember that every element in an array is always of the same type, so of the same size too.

Related

Zero sized array in struct managed by shared pointer

Consider the following structure:
struct S
{
int a;
int b;
double arr[0];
} __attribute__((packed));
As you can see, this structure is packed and has Zero sized array at the end.
I'd like to send this as binary data over the network (assume I took care of endianity).
In C/C++ I could just use malloc to allocate as much space as I want and use free later.
I'd like this memory to be handled by std::shared_ptr.
Is there a straight forward way of doing so without special hacks?

I'd like this memory to be handled by std::shared_ptr.
Is there a straight forward way of doing so without special hacks?
Sure, there is:
shared_ptr<S> make_buffer(size_t s)
{
auto buffer = malloc(s); // allocate as usual
auto release = [](void* p) { free(p); }; // a deleter
shared_ptr<void> sptr(buffer, release); // make it shared
return { sptr, new(buffer) S }; // an aliased pointer
}
This works with any objects that are placed in a malloced buffer, not just when there are zero-sized arrays, provided that the destructor is trivial (performs no action) because it is never called.
The usual caveats about zero-sized arrays and packed structures still apply as well, of course.

double arr[0];
} __attribute__((packed));
Zero sized arrays are not allowed as data members (nor as any other variable) in C++. Furthermore, there is no such attribute as packed in C++; it is a language extension (as such, it may be considered to be a special hack). __attribute__ itself is a language extension. The standard syntax for function attributes uses nested square brackets like this: [[example_attribute]].
I'd like to send this as binary data over the network
You probably should properly serialise the data. There are many serialisation specifications although none of them is universally ubiquitous and none of them is implemented in the C++ standard library. Indeed, there isn't a standard API for network commnication either.
A straightforward solution is to pick an existing serialisation format and use an existing library that implements it.

First, let me explain again why I have this packed structure:
it is used for serialization of data over the network so there's a header file with all network packet structures.
I know it generates bad assembly due to alignment issues, but I guess that this problem persists with regular serialization (copy to char * buffer with memcpy).
Zero size arrays are supported both by gcc and clang which I use.
Here's an example of a full program with a solution to my question and it's output (same output for gcc and g++).
compiled with -O3 -std=c++17 flags
#include <iostream>
#include <memory>
#include <type_traits>
#include <cstddef>
struct S1
{
~S1() {std::cout << "deleting s1" << std::endl;}
char a;
int b;
int c[0];
} __attribute__((packed));
struct S2
{
char a;
int b;
int c[0];
};
int main(int argc, char **argv)
{
auto s1 = std::shared_ptr<S1>(static_cast<S1 *>(::operator
new(sizeof(S1) + sizeof(int) * 1e6)));
std::cout << "is standart: " << std::is_standard_layout<S1>::value << std::endl;
for (int i = 0; i < 1e6; ++i)
{
s1->c[i] = i;
}
std::cout << sizeof(S1) << ", " << sizeof(S2) << std::endl;
std::cout << offsetof(S1, c) << std::endl;
std::cout << offsetof(S2, c) << std::endl;
return 0;
}
This is the output:
is standart: 1
5, 8
5
8
deleting s1
Is there anything wrong with doing this?
I made sure using valgrind all allocations/frees work properly.

Storing heterogeneous data continuously in memory as a sequence of chars

I have a large number of strings and some data associated with each string. For simplicity, lets assume that the data is an int for each string. Lets assume I have an std::vector<std::tuple<std::string, int>>. I want to try to store this data continuously in memory with a single heap allocation. I will not need to worry about adding or deleting strings in the future.
A simple example
Constructing an std::string requires a heap allocation, and accessing entry chars of the std::string requires a dereference. If I have a bunch of strings, I may make better use of memory by storing all of the strings in one std::string and storing each string's starting index and size as a separate variable. If I want, I could try to store the starting index and size within the std::string itself.
Back to my problem
One idea I had was to store everything in an std::string or std::vector<char>. Each entry of the std::vector<std::tuple<std::string, int>> would be laid out in memory like this:
length of next string (int or size_t)
sequence of chars representing the string (chars)
some number zero chars for correct int alignment (chars)
data (int)
This requires being able to interpret a sequence of chars as an int. There have been questions about this before, but it seems to me that trying to do this can result in undefined behavior. I believe that I can help this slightly by checking the sizeof(int).
Another option I have is to create a union
union CharInt{
char[sizeof(int)] some_chars;
int data;
}
here, I would need to be careful that the number of chars per int used is determined at compile-time based on the result of sizeof(int). I would then store an std::vector<CharInt>. This seems more "C++" than using reinterpret_cast. One downside of this is that accessing the second char member of a CharInt would require an additional pointer addition (the pointer to the CharInt + 1). This cost still seems small relative to the benefit of making everything contiguous.
Is this the better option? Are there other options available? Are there pitfalls I need to account for using the union method?
Edit:
I wanted to provide clarity about how CharInt would be used. I provided an example below:
#include <iostream>
#include <string>
#include <vector>
class CharIntTest {
public:
CharIntTest() {
my_trie.push_back(CharInt{ 42 });
std::string example_string{ "this is a long string" };
my_trie.push_back(CharInt{ example_string, 5 });
my_trie.push_back(CharInt{ 106 });
}
int GetFirstInt() {
return my_trie[0].an_int;
}
char GetFirstChar() {
return my_trie[1].some_chars[0];
}
char GetSecondChar() {
return my_trie[1].some_chars[1];
}
int GetSecondInt() {
return my_trie[2].an_int;
}
private:
union CharInt {
// here I would need to be careful that I only insert sizeof(int) number of chars
CharInt(std::string s, int index) : some_chars{ s[index], s[index+1], s[index+2], s[index+3]} {
}
CharInt(int i) : an_int{ i } {
}
char some_chars[sizeof(int)];
int an_int;
};
std::vector<CharInt> my_trie;
};
Note that I do not access the first or third CharInts as though they were chars. I do not access the second CharInt as though it were an int. Here is the main:
int main() {
CharIntTest tester{};
std::cout << tester.GetFirstInt() << "\n";
std::cout << tester.GetFirstChar() << "\n";
std::cout << tester.GetSecondChar() << "\n";
std::cout << tester.GetSecondInt();
}
which produces the desired output
42
i
s
106

What is the memory lay out of a C++ structure?

I was learning about the structures in C++ and got to know that if a structure in C++ has 3 variables (let each one be of some data type), then all of them are not allocated in a contiguous fashion. Is this correct?
If yes, then how much memory would be allocated to an object to that structure type?
For e.g. Let's say we have a structure like this:
struct a
{
int x;
int y;
char c;
};
Now how intuitively an object of type a, must occupy some space = sizeOf(int) + sizeOf(int) + sizeOf(char). But, if they are not allocated continuously, there could be some memory locations allocated for that object, that are just present for providing some padding - i.e. the memory allocated for that object could look something like this:
xxxx[4-bytes]xxxxx[4-bytes]xxxx[1-byte]xxx
(NOTE: In the above blockquote x corresponds to a memory location of size 1 byte. I also assumed that sizeOf(int) = 4-bytes and sizeOf(char) = 1- byte.)
So in the above one, we can see that the object a occupies more than 9-bytes (because there are some memory locations (x's) used for padding.)
So, does something like this happen?
Thanks for your replies!
P.S. Please let me know if I hadn't written something clearly.

When it comes to structs and classes, The layout is implementation-defined between compilers, os, and architecture... Some will use automatic alignment, others will use padding, some may even auto arrange it's members. If you need to know the size of a struct, use sizeof(Your Struct).
Here's a code snippet...
#include <iostream>
struct A {
char a;
float b;
int c;
};
struct B {
float a;
int b;
char c;
};
int main() {
std::cout << "Sizeof(A) = " << sizeof(A) << '\n';
std::cout << "Sizeof(B) = " << sizeof(B) << '\n';
return 0;
}
Output:
Sizeof(A) = 12
Sizeof(B) = 12
For my particular machine, I'm running Windows 7 - 64bit, It is an Intel Core2 Quad Extreme, and I'm using Visual Studio 2017 running it with C++17.
With my particular setup, both structures are being generated with a different layout, but have the same size in bytes.
In A's case...
char a; // 1 byte
// 3 bytes of padding
float b; // 4 bytes
int c; // 4 bytes (int is 32bit even on x64).
In B's case...
float a; // 4 bytes
int b; // 4 bytes
char c; // 1 byte
// 3 bytes of padding.
Also, your compiler flags and optimizations may have an effect. This isn't always guaranteed, as it is implementation-defined as stated in the standard.
--Edit--
Also, if you don't want this exact behavior there are some pragmas directives and macros such as pragma pack and alignas() that can be used to modify your implementation details. Here are a few references.
How to use alignas to replace pragma pack?
https://en.cppreference.com/w/cpp/preprocessor/impl
https://en.cppreference.com/w/cpp/language/alignas
https://learn.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=msvc-160
https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.cbclx01/pragma_pack.htm
https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.cbclx01/packqua.htm
https://www.iditect.com/how-to/57426535.html
https://downloads.ctfassets.net/oxjq45e8ilak/1LriV4eAdhNlu9Zv06H9NJ/53576095f772b5f6cddbbedccb7ebd8a/Alexander_Titov_Know_your_hardware_CPU_memory_hierarchy.pdf
https://cpc110.blogspot.com/2020/10/vs2019-alignas-in-struct-definition.html

So, structure Object variables take contiguous memory locations and the variables are stored in memory in the same order in which they are defined.
Please refer to the code below you will get an idea.
struct c{
int a;
int b;
int c;
};
int main()
{
c obj;
cout << &(obj.a) << endl; //0x7ffe128fe1c4
cout << &(obj.b) << endl; //0x7ffe128fe1c8
cout << &(obj.c) << endl; //0x7ffe128fe1cc
return 0;
}
For Structures Padding Concept Refer here

Padding in struct containing only one int array member in C++?

Giving simple structure (POD) containing only one array of shorts (bytes, ints from <cstdint>, etc) and no more fields will be added later:
#define FIXED_SIZE 128 // 'fixed' in long term, shouldn’t change in future versions
struct Foo {
uint16_t bar[FIXED_SIZE];
};
is it any possibility to end up with padding at the end of the structure added by compiler for any reason ?
It seems reasonable not to make any padding as it is no any obvious need of it, but is it any guarantees by standard (could you provide any links where it is explained)?
Later I would like to use arrays of Foo structs in simple serialization (IPC) within different platforms and don't want to use any libraries for this simple task (code simplified for demonstration):
#define FOO_ELEMS 1024
...
// sender
Foo *from = new Foo[FOO_ELEMS];
uint8_t *buff_to = new uint8_t[FOO_ELEMS * FIXED_SIZE * sizeof(uint16_t) ];
memcpy(buff_to, from, ...);
...
// receiver
uint8_t *buff_from = new uint8_t[ ... ];
Foo *to = new Foo[FOO_ELEMS];
memcpy(to, buff_from, ...);
I would like to use struct here instead of plain arrays as it will be some auxiliary methods within struct and it seems more convenient then to use plain functions + arrays pointers instead.
Intersects with this (plain C) question, but seems a little bit different for me:
Alignment of char array struct members in C standard

The various standards provide for padding to occur (but not at the start).
There is no strict requirement at all that it will only appear to align the members and the object in arrays.
So the truly conformant answer is:
Yes, there may be padding because the compiler can add it but not at the start or between array elements.
There is no standard way of forcing packing either.
However every time this comes up and every time I ask no one has ever identified a real compiler on a platform that pads structures for any other reason than for internal alignment and array alignment.
So for all know practical purposes that structure will not be packed on any known platform.
Please consider this yet another request for someone to find a real platform that breaks that principle.

Since we are already guaranteed that there will no padding at the beginning of the structure don't have to worry about that. At the end I could see padding being added if the sizeof of the array was not divisible by the word size of the machine.
The only way I could get any padding to be added to the struct though was to add an int member to the struct as well. In doing so the struct was padded to make them the same size.
#include <iostream>
#include <cstdint>
struct a
{
uint16_t bar[128];
};
struct b
{
uint16_t bar[127];
};
struct c
{
int test;
uint16_t bar[128];
};
struct d
{
int test;
uint16_t bar[127];
};
struct e
{
uint16_t bar[128];
int test;
};
struct f
{
uint16_t bar[127];
int test;
};
int main()
{
std::cout << sizeof(a) << "\t" << sizeof(b) << "\t" << sizeof(c) << "\t" << sizeof(d) << "\t" << sizeof(e) << "\t" << sizeof(f);
}
Live Example

How is memory allocated for stack variables?

On VS (release), I run the following:
int main(void)
{
char b[] = "123";
char a[] = "1234567";
printf("%x %x\n", b,a);
return 0;
}
I can see that, the mem address of a is b+3(the length of the string). Which shows that the memory are allocated with no gaps. And this guarantee that least memories are used.
So, I now kind of believe that all compilers will do so.
I want to make sure of this guess here. Can somebody give me an more formal proof or tell me that my guess is rooted on a coincidence.

No, it's not guaranteed that there will always be perfect packing of data.
For example, I compiled and runned this code on g++, and the difference is 8.
You can read more about this here.
tl;dr: Compilers can align objects in memory to only addresses divisible by some constant (always machine-word length) to help processor(for them it's easier to work with such addresses)
UPD: one interesting example about alignment:
#include <iostream>
using namespace std;
struct A
{
int a;
char b;
int c;
char d;
};
struct B
{
int a;
int c;
char b;
char d;
};
int main()
{
cout << sizeof(A) << " " << sizeof(B) << "\n";
}
For me, it prints
16 12

There is no guarantee what addresses will be chosen for each variable. Different processors may have different requirements or preferences for alignment of variables for instance.
Also, I hope there were at least 4 bytes between the addresses in your example. "123" requires 4 bytes - the extra byte being for the null terminator.

Try reversing the order of declaring a[] and b[], and/or increase the length of b.
You are making a very big assumption about how storage is allocated. Depending on your compiler the string literals might get stored in a literal pool that is NOT on the stack. Yet a[] and b[] do occupy elements on the stack. So, another test would be to add int c and compare those addresses.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

align cache for non-POD structure - c++

Related

Zero sized array in struct managed by shared pointer

Storing heterogeneous data continuously in memory as a sequence of chars

What is the memory lay out of a C++ structure?

Padding in struct containing only one int array member in C++?

How is memory allocated for stack variables?

Categories

Resources