How exactly do structure packing and padding work? - c++

How exactly are structs packed and padded in C++? The standard does not say anything about how it should be done (as far as I know), and compilers can do whatever they want. But there are tutorials showing how to pack structs efficiently using known rules (for example, that every variable must sit at an address that is a multiple of its size, and that padding is inserted when the end of the previous variable is not such a multiple), and with these rules we can pack structs by hand in the source. So which is it? Do we know how structs will be packed on modern machines (for example, PCs), or is this just a rule of thumb that may hold but should not be taken for granted?

How exactly are structs packed and padded in C++?
Short answer: in such a way that the alignment requirements are satisfied.
The standard does not say anything about how it should be done (as far as I know), and compilers can do whatever they want.
Within the bounds of the alignment requirements, this is indeed correct. This is also the answer to your question.
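As a hedged illustration of the answer above, the sketch below compares two hypothetical structs (Scattered and Grouped are names invented here) that hold the same members in a different order. The static_asserts check only what the alignment rules guarantee; the byte counts mentioned in the comments are what common desktop compilers typically produce, not something the standard promises.

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical example types, not taken from the question.
struct Scattered {
    char a;  // typically followed by 3 bytes of padding so that b is aligned
    int  b;
    char c;  // typically followed by 3 bytes of tail padding
};

struct Grouped {
    int  b;
    char a;
    char c;  // typically followed by 2 bytes of tail padding
};

// Guaranteed: each member sits at an offset that satisfies its alignment,
// and the size of a struct is a multiple of its alignment.
static_assert(offsetof(Scattered, b) % alignof(int) == 0, "b must be aligned");
static_assert(sizeof(Scattered) % alignof(Scattered) == 0,
              "size must be a multiple of alignment");

// Not guaranteed, but typical where int is 4 bytes: 12 bytes versus 8 bytes.
constexpr std::size_t scattered_size = sizeof(Scattered);
constexpr std::size_t grouped_size   = sizeof(Grouped);
```

Printing scattered_size and grouped_size on the target compiler is the reliable way to see what a given platform actually does.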

Related

Is there a reason not to use fixed width types?

I'm new to C++.
I was learning about types, their memory uses and the differences in their memory size based on architecture. Is there any downside to using fixed-width types such as int32_t?
The only real downside might be if you want your code to be portable to a system that doesn't have a 32-bit integer type. In practice those are pretty rare, but they are out there.
C++ provides the C99 (and newer) integer types via <cstdint>, including the int_leastN_t and int_fastN_t types, which might be the most portable way to get specific bit-widths into your code, should you really happen to care about that.
The original intent of the int type was for it to represent the natural size of the architecture you were running on; you could assume that any operations on it were the fastest possible for an integer type.
These days the picture is more complicated. Cache effects or vector instruction optimization might favor using an integer type that is smaller than the natural size.
Obviously if your algorithm requires an int of at least a certain size, you're better off being explicit about it.
E.g.
To save space, use int_least32_t
To save time, use int_fast32_t
But in actuality, I personally use long (at least 32-bit) and int (at least 16-bit) from time to time simply because they are easier to type.
(Besides, int32_t is optional, not guaranteed to exist.)
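A minimal sketch of the width guarantees mentioned above, assuming a C++11 compiler. The helper bits_of is a name made up for this example; the static_asserts are expected to hold on any conforming implementation, while the exact widths are left to the platform.

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t
#include <cstdint>  // std::int_least32_t, std::int_fast32_t

// Hypothetical helper: number of bits occupied by an object of type T.
template <typename T>
constexpr std::size_t bits_of() {
    return sizeof(T) * CHAR_BIT;
}

// Smallest type with at least 32 bits: the choice when space matters.
static_assert(bits_of<std::int_least32_t>() >= 32, "at least 32 bits");

// Fastest type with at least 32 bits: may be wider than 32 bits.
static_assert(bits_of<std::int_fast32_t>() >= 32, "at least 32 bits");

// The built-in types only promise minimum widths.
static_assert(bits_of<long>() >= 32, "long has at least 32 bits");
static_assert(bits_of<int>() >= 16, "int has at least 16 bits");
```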

Can Qt synchronization classes be used with the MSVC compiler?

In the Qt Synchronizing threads documentation listed here:
http://doc.qt.io/qt-5/threads-synchronizing.html
They wrote:
Note: Qt's synchronization classes rely on the use of properly aligned pointers. For instance, you cannot use packed classes with MSVC.
The sentence is not clear.
What are the limitations of using Qt synchronization classes with the MSVC compiler?
You're facing a documentation bug, with the following fix:
Note: Qt's synchronization classes rely on the use of properly aligned pointers. For instance, you cannot use t̲h̲e̲m̲ i̲n̲ packed classes with MSVC.
MSVC is a red herring. Qt's synchronization classes don't work with packed structures period, on all platforms Qt is supported on - since all these platforms support packed structures, and know how to access unaligned members of such structures.
Alignment refers to a restriction on the addresses of objects of certain types. For the compilers supported by Qt, there's just one sort of alignment restriction: some addresses should be a multiple of 2, 4, or 8.
There are a few ways in which this restriction can be violated. In a packed class, when you have a float followed by a char followed by another float there will be no gaps between the three members (that's why they're called packed). As a result, the second float has an address that's 5 higher than the first. It's fairly obvious that one of the two addresses is not a multiple of 4 (the alignment of float).
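The float, char, float case can be sketched as follows. PaddedSample and PackedSample are hypothetical names, and #pragma pack is a compiler extension (accepted by MSVC, GCC and Clang) rather than standard C++, so treat this as an illustration under those assumptions.

```cpp
#include <cstddef>  // offsetof

// Default layout: the compiler pads after `tag` so that `second` is aligned.
struct PaddedSample {
    float first;
    char  tag;
    float second;
};

// Packed layout: no gaps between members.
#pragma pack(push, 1)
struct PackedSample {
    float first;
    char  tag;
    float second;  // starts sizeof(float) + 1 bytes after `first`
};
#pragma pack(pop)

static_assert(offsetof(PaddedSample, second) % alignof(float) == 0,
              "unpacked member is aligned");
static_assert(offsetof(PackedSample, second) == sizeof(float) + 1,
              "packed member follows the char directly");
```

With a 4-byte float, the packed offset is 5, so whatever address the struct starts at, first and second cannot both be 4-byte aligned.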
Another way in which this can occur is casting an arbitrary char* to float*. For the resulting pointer to be properly aligned, the last two bits of the char* address must be zero.
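A small sketch of that address check, assuming that converting a pointer to std::uintptr_t yields the plain address (implementation-defined, but the case on mainstream platforms). The names is_aligned_for and alignment_demo are invented for this example.

```cpp
#include <cassert>
#include <cstdint>  // std::uintptr_t

// Hypothetical helper: true if p is a suitably aligned address for a T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Usage sketch: the start of an aligned buffer passes the check, the next
// byte does not (unless alignof(float) is 1 on some unusual platform).
inline void alignment_demo() {
    alignas(float) char storage[2 * sizeof(float)] = {};
    assert(is_aligned_for<float>(storage));
    assert(alignof(float) == 1 || !is_aligned_for<float>(storage + 1));
}
```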
MSVC++ can deal with such unaligned data (it's just slightly slower), but it does so by having the CPU load the data in two operations. This breaks Qt's synchronization, which assumes that data is loaded in one operation, so that you get either the old or the new value. If the load is split into two operations, the first may see an old value and the second a new value. The result is that the register contains a mix of old and new bits (!).
It should not be an issue if you don't use packed classes (if you don't know what that is you are likely not to be using them).
See here for some information about what they are: Class contiguous data

Why does loki::flex_string's SmallStringOpt need alignment?

I'm reading the source code of flex_string and don't understand very well why the alignment is necessary. Is it just for performance reasons?
union
{
    mutable value_type buf_[maxSmallString + 1];
    Align align_;
};
Here is a link to the design document of flex_string:
http://www.drdobbs.com/generic-a-policy-based-basicstring-imple/184403784#4
The author said:
But what's that Align business? Well, when dealing with such "seated allocation," you must be careful with alignment issues.
Quoting from the linked article:
But what's that Align business? Well, when dealing with such "seated
allocation," you must be careful with alignment issues. Because there
is no portable way of figuring out what alignment requirements Storage
has, SmallStringOpt accepts a type that specifies the alignment and
stores it in the dummy align_ variable.
I believe this has to do with the Storage template parameter. In order to be as generic as possible, the class is trying to work with any container, even if that container has certain alignment requirements for its elements. This could be for performance reasons, or it could have to do with compatibility with a certain architecture. The point is, there is no reliable, portable way to ascertain the alignment requirements of whatever "Storage" ends up being.
Hence the parameter Align is intended to be some type whose size is equal to the alignment required by Storage. It is a dummy variable in the union: it is never written to or read. Only its size is used.
It can be seen from the code that the small string size is the higher of the configured maximum, and the alignment, making the alignment the minimum configurable small string size.
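The effect of the dummy member can be sketched with a simplified stand-in. SmallBuffer below is a hypothetical union written for this example, not the actual Loki code; it relies only on the language rule that a union is at least as strictly aligned, and at least as large, as each of its members.

```cpp
#include <cstddef>  // std::size_t

// Simplified, hypothetical stand-in for the union inside SmallStringOpt.
template <typename Align, std::size_t MaxSmallString>
union SmallBuffer {
    char  buf_[MaxSmallString + 1];  // raw bytes for the small string
    Align align_;                    // dummy member, never read or written
};

// The dummy member forces the alignment of the whole union, buf_ included.
static_assert(alignof(SmallBuffer<double, 3>) >= alignof(double),
              "union is aligned at least as strictly as Align");

// It also sets a lower bound on the size, even for a tiny character buffer.
static_assert(sizeof(SmallBuffer<double, 3>) >= sizeof(double),
              "union is at least as large as Align");
```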
Hope this helps!

Do Google's Protocol Buffers automatically align data efficiently?

In a typical C or C++ struct the developer must explicitly order data members in a way that provides efficient memory alignment and padding, if that is an issue.
Google's Protocol Buffers behave a lot like structs and it is not clear how the compilation of these affects memory layout. Does anyone know if this tendency to organize data in a specific order for the sake of efficient memory layout is automatically handled by the protocol buffer compiler? I have been unable to find any information on this.
I.e., the buffer might internally order the data differently from how it is specified in the message object of the protobuf.
In a typical C or C++ struct the developer must explicitly order data members in a way that provides efficient memory alignment and padding, if that is an issue.
Actually this is not entirely true.
It's true that most compilers (actually all I know of) tend to align struct elements to machine word addresses. They do this for performance reasons: it's usually cheaper to read from a word address and just mask away some bits than to read from the word address, shift the word so that the value you are looking for is right-aligned, and then mask away the bits not needed. (Of course, this depends on the architecture you are compiling for.)
So why is the statement I quoted above not true? Because compilers arrange elements as described above, they also offer the programmer the opportunity to influence this behavior. Usually this is done using a compiler-specific pragma.
For example, GCC and the Microsoft C compilers provide a pragma called "pack", which allows the programmer to change the alignment behavior of the compiler for specific structs. Of course, if you choose to set pack to '1', memory usage is improved, but this may affect your runtime behavior.
What never happens to my knowledge is a reordering of the members in a struct by the compiler.
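Both points from this answer can be checked at compile time in a short sketch. Record and PackedRecord are hypothetical types; #pragma pack is a compiler extension, and the packed size asserted here is what MSVC, GCC and Clang are expected to produce with pack set to 1.

```cpp
#include <cstddef>  // offsetof

struct Record {
    char   tag;
    double value;
    char   flag;
};

#pragma pack(push, 1)
struct PackedRecord {
    char   tag;
    double value;
    char   flag;
};
#pragma pack(pop)

// The compiler keeps members in declaration order; it only adds padding.
static_assert(offsetof(Record, tag) < offsetof(Record, value) &&
              offsetof(Record, value) < offsetof(Record, flag),
              "members stay in declaration order");

// With pack(1) the padding disappears, so the struct is the sum of its parts.
static_assert(sizeof(PackedRecord) == sizeof(double) + 2, "no padding");
static_assert(sizeof(PackedRecord) <= sizeof(Record), "packing never grows");
```

Moving the two char members next to each other by hand would shrink Record on typical platforms without any pragma, which is the manual reordering the question refers to.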

Where can I find documentation on C++ memory alignment across different platforms/compilers?

I'm looking for a good (comprehensive) doc about memory alignment in C++, typical approaches, differences between compilers, and common pitfalls. Just to check if my understanding of the topic is correct and to learn something new.
This question is inspired by my answer to another question where I used following construct:
char const buf[1000] = ...;
unsigned int i = *reinterpret_cast<unsigned int const*>(buf + shift); // shift can be anything
It was criticized as not conforming to memory alignment rules. Can you please explain, as a bonus, why this approach is flawed from a memory alignment point of view? An example of when it doesn't work would be highly appreciated. I know it's a bad approach in general, but I often use it in network protocol implementations, so it's more a practical question than a theoretical one.
Also please don't mention strict-aliasing here, it's for another question.
Non-heap-allocated arrays of char have no specific requirements on their alignment. So your buffer of a thousand characters could be at an odd offset. Trying to read an int from that offset (reinterpreted as an int pointer, obviously) would result in either poor performance or even a bus error on some hardware, if the compiler doesn't split it up into separate read+bitmask operations.
Heap-allocated arrays of char are guaranteed to be aligned suitably to store any object type, so this is always an option.
For non-heap based storage, use boost::aligned_storage which ensures that the space is aligned properly for general use.
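A common alternative that the answers here do not mention is to copy the bytes out with std::memcpy, which places no alignment requirement on its arguments. The sketch below uses hypothetical names (read_uint_at, read_demo) and returns the value in host byte order, so a network protocol would still need its own byte-order handling.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t
#include <cstring>  // std::memcpy

// Hypothetical helper: read an unsigned int starting at any byte offset.
// The caller must make sure that shift + sizeof(unsigned int) bytes exist.
inline unsigned int read_uint_at(const char* buf, std::size_t shift) {
    unsigned int value;
    std::memcpy(&value, buf + shift, sizeof value);  // byte-wise copy, no aligned load needed
    return value;
}

// Usage sketch: store a value at an odd offset and read it back.
inline void read_demo() {
    char buf[1000] = {};
    const unsigned int expected = 0xCAFEu;
    std::memcpy(buf + 1, &expected, sizeof expected);
    assert(read_uint_at(buf, 1) == expected);
}
```

Compilers commonly turn such a fixed-size memcpy into a single load on hardware that tolerates unaligned access, so the copy is usually not a performance concern.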
You can find an overview on Wikipedia.
More in depth on the IBM site: Data alignment: Straighten up and fly right
Imagine a case where addresses must be 16-byte aligned, as for example on the PS3.
And then imagine that shift == 1.
The result would certainly not be a 16-byte-aligned pointer, so it would not work on this machine.