After much deliberation, I have reduced a problem down to the following simple example:
//__declspec(align(16)) class Vec4 {}; //For testing purposes on Windows
//class Vec4 {} __attribute__((__aligned__(16))); //For testing purposes on *nix
class Base { public:
Vec4 v; //16-byte-aligned struct
};
class Child : public Base {};
static_assert(alignof( Base)>=16,"Check 1");
static_assert(alignof(Child)>=16,"Check 2");
Check 1 passes; Check 2 fails. My question: Why?
In practice, allocating a Child (i.e. new Child) can leave v only 8-byte-aligned (v is used with SSE, so this in turn causes a crash).
The compiler is Intel C++ Compiler 2016. I tried g++ and Clang, and they seem fine. Could this be a compiler bug?
I have asked Intel directly. Although it could initially be reproduced by their team, after a few days, it couldn't be. Neither I nor they had an explanation. So, magic does exist, and no harm done.
As to the actual substance of the question: yes, it appears that the alignment of the child should be at least as large as the alignment of the base, and a compiler that does not do this is in error.
N.B. stack allocations respect the class's alignof, but malloc/new do not (how could they know?). As the other answer says, custom heap allocators must be used. This problem is separate from a correct alignof per se.
According to http://en.cppreference.com/w/cpp/language/object#Alignment:
It is implementation-defined if new-expression, std::allocator::allocate, and std::get_temporary_buffer support over-aligned types. Allocators instantiated with over-aligned types are allowed to fail to instantiate at compile time, to throw std::bad_alloc at runtime, to silently ignore unsupported alignment requirement, or to handle them correctly.
As a workaround I would suggest defining custom operator new for your base class.
#include <iostream>
#include <type_traits>
#include <string>
int main()
{
std::cout << std::is_standard_layout<std::string>::value;
system("pause");
return 0;
}
I use Visual Studio 2017 to compile the code. In debug mode it outputs 0, but in release mode it outputs 1. What makes this happen?
The following is a somewhat simplified explanation, because most of the classes described below are templated, and I'm omitting such details for brevity.
In MSVC 2017 toolset version 14.16, the std::string class is derived from _String_alloc class, which has a single private data member _Mypair of type _Compressed_pair. The class _Compressed_pair is derived from an empty std::allocator<char> class, and has a single private _Myval2 data member of type _String_val. The class _String_val is derived from _Container_base class, plus it has three private data members: _Bx of a union type _Bxty (which is able to provide either an in-place buffer for small string optimizations or a pointer to the dynamically-allocated buffer for regular strings), _Mysize and _Myres, both of size_type, tracking the length and the reserved storage size for the string.
In release mode, when _ITERATOR_DEBUG_LEVEL is #defined as 0, the _Container_base is an alias for an empty struct _Container_base0. However, compiling in debug mode, when _ITERATOR_DEBUG_LEVEL is not 0, the _Container_base is an alias for a non-empty struct _Container_base12, which contains a single public data member _Myproxy (which is a pointer to a _Container_proxy structure).
Now, here's the catch: one of the requirements for the standard layout type is that all of its (non-static) data members shall be declared in the same class (i.e., either all declared in the most derived class itself or all declared in some particular base class, but not scattered among different classes). Thus, because of the aforementioned _Container_base12 helper structure (used for debugging purposes), _String_val does not satisfy this requirement any more, and therefore cannot be considered being a standard layout type. In particular, its _Myproxy data member is declared in _Container_base12, whereas _Bx, _Mysize and _Myres data members are declared directly in _String_val itself.
By implication, when using a non-zero _ITERATOR_DEBUG_LEVEL, the std::string class itself no longer satisfies the standard layout type requirements, since one of the rules is that all (non-static) data members and base classes shall themselves be standard layout types as well.
While this does answer your question about origins of the difference, please bear in mind that these are just undocumented, implementation-specific details, not to be relied upon. Microsoft may change/reorganize their library code in the future. For example, MSVC 2019 toolset version 14.28 already uses composition instead of inheritance: there, the std::string class is not derived from _String_alloc anymore, and instead directly contains its own single _Mypair data member of type _Compressed_pair. (Fortunately, ABI compatibility between toolsets v14.28 and v14.16 is still preserved this way, it seems, because the resultant memory layout of std::string remains the same.) And of course the reasoning is similar for MSVC 2019 / toolset v14.28 regarding std::string not being standard layout in debug mode. However, some future version may bring breaking changes, and then at some point perhaps the std::is_standard_layout<std::string>::value observed in debug mode may even change to true without notice.
Consider the following struct:
class Foo {
int a;
};
Testing in g++, I get that sizeof(Foo) == 4 but is that guaranteed by the standard? Would a compiler be allowed to notice that a is an unused private field and remove it from the in-memory representation of the class (leading to a smaller sizeof)?
I don't expect any compilers to actually do that kind of optimization but this question popped up in a language lawyering discussion so now I'm curious.
The C++ standard doesn't define a lot about memory layouts. The fundamental rule for this case is item 4 under section 9 Classes:
4 Complete objects and member subobjects of class type shall have nonzero size. [ Note: Class objects can be assigned, passed as arguments to functions, and returned by functions (except objects of classes for which copying or moving has been restricted; see 12.8). Other plausible operators, such as equality comparison, can be defined by the user; see 13.5. — end note ]
Now there is one more restriction, though: standard-layout classes (no virtual functions or virtual bases, and the same access control for all non-static data members, among other requirements). Section 9.2 Class members requires layout compatibility between different standard-layout classes, which prevents eliminating members from such classes.
For non-trivial, non-standard-layout classes I see no further restriction in the standard. The exact behavior of sizeof, reinterpret_cast, etc. is implementation-defined (e.g. 5.2.10: "The mapping function is implementation-defined.").
The answer is yes and no. A compiler could not exhibit exactly that behaviour within the standard, but it could do so partly.
There is no reason at all why a compiler could not optimise away the storage for the struct if that storage is never referenced. If the compiler gets its analysis right, then no program that you could write would ever be able to tell whether the storage exists or not.
However, the compiler cannot report a smaller sizeof() thereby. The standard is pretty clear that objects have to be big enough to hold the bits and bytes they contain (see for example 3.9/4 in N3797), and to report a sizeof smaller than that required to hold an int would be wrong.
At N3797 5.3.2:
The sizeof operator yields the number of bytes in the object representation of its operand.
I do not see how that "representation" can change according to whether the struct or a member is referenced.
As another way of looking at it:
struct A {
    int i;
};
struct B {
    int i;
};
void f() {
    A a;
    a.i = 0; // A::i is referenced; B::i never is
    static_assert(sizeof(A) == sizeof(B), "sizes must match");
}
I do not see that this assert can be allowed to fail in a standards-conforming implementation.
If you look at templates, you'll notice that such "optimization" often leaves nearly nothing in the output, even though the template files may be thousands of lines long...
I think the optimization you are talking about will nearly always occur when the object is used on the stack, is never copied or passed to another function, and the private field is never accessed (not even initialized, which could itself be viewed as a bug!).
I was wondering what the size limit for a class is. I did a simple test:
#define CLS(name,other) \
class name\
{\
public: \
name() {};\
other a;\
other b;\
other c;\
other d;\
other e;\
other f;\
other g;\
other h;\
other i;\
other j;\
other k;\
};
class A{
int k;
public:
A(){};
};
CLS(B,A);
CLS(C,B);
CLS(D,C);
CLS(E,D);
CLS(F,E);
CLS(G,F);
CLS(H,G);
CLS(I,H);
CLS(J,I);
It fails to compile with
"'J' : class is too large"
If I remove the final declaration - CLS(J,I);, it all compiles fine.
Is this a compiler-imposed restriction, or is it somewhere in the standard?
In C++11 this is Annex B. Implementations can impose limits, but they should be at least:
Size of an object [262 144].
Data members in a single class [16 384].
Members declared in a single class [4 096].
The third one isn't directly relevant to the kind of construction you're using; I mention it just because it indicates that the second one is indeed the total number of data members, presumably including those in bases (I'm not sure about members-of-members). It's not just about the members listed in a single class definition.
Your implementation appears to have given up either at 2^31 total data members or at size 2^32, since it accepts I but not J. It's obviously reasonable for a compiler to refuse to consider classes with size greater than SIZE_MAX, even if the program happens not to instantiate the type or apply sizeof to it. So even with the best possible effort on the part of the compiler, I wouldn't ever expect this to work on a 32-bit implementation.
Note that "these quantities are only guidelines and do not determine compliance", so a conforming implementation can impose an arbitrarily smaller limit even where it has sufficient resources to compile a program that uses larger numbers. There's no minimum limit for conformance.
There are various opportunities in the C++ standard for a conforming implementation to be useless due to ridiculously small resource limits, so there's no additional harm done if this is another one.
C++03 is more-or-less the same:
Size of an object [262 144].
Data members in a single class, structure, or union [16 384].
Members declared in a single class [4 096].
I wanted to mention another place where a class size limit appears: section 1.2 of the Itanium C++ ABI draft:
Various representations specified by this ABI impose limitations on conforming user programs. These include, for the 64-bit Itanium ABI: The offset of a non-virtual base subobject in the full object containing it must be representable by a 56-bit signed integer (due to the RTTI implementation). This implies a practical limit of 2**55 bytes on the size of a class.
I'm sure it's compiler-dependent. You can run your compiler in preprocess-only mode to see the generated output if you're curious. You might also want to look at template expansion rather than macros.
Taking the following snippet as an example:
struct Foo
{
typedef int type;
};
class Bar : private Foo
{
};
class Baz
{
};
As you can see, no virtual functions exist in this relationship. Since this is the case, are the following assumptions accurate as far as the language is concerned?
No virtual function table will be created in Bar.
sizeof(Bar) == sizeof(Baz)
Basically, I'm trying to figure out if I'll be paying any sort of penalty for doing this. My initial testing (albeit on a single compiler) indicates that my assertions are valid, but I'm not sure if this is my compiler's optimizer or the language specification that's responsible for what I'm seeing.
According to the standard, Bar is not a POD (plain old data) type because it has a base class. As a result, the standard gives C++ compilers wide latitude in what they do with such a type.
However, very few compilers are going to do anything insane here. The one thing you probably have to look out for is the empty base optimization (EBO). For various technical reasons, the C++ standard requires that any complete object occupy at least one byte of storage, so some compilers will allocate dedicated space for the Foo subobject inside Bar. Compilers that implement the empty base optimization (essentially all in modern use), however, fold the empty base away.
If a given compiler does not implement the EBO, then sizeof(Bar) can be as much as twice sizeof(Baz).
Yeah, without any virtual members or member variables, there shouldn't be a size difference.
As far as I know the compiler will optimize this correctly, if any optimizing is needed at all.
This is a question that was sparked by Rob Walker's answer here.
Suppose I declare a class/struct like so:
struct
{
char A;
int B;
char C;
int D;
};
Is it safe to assume that these members will be declared in exactly that order in memory, or is this a compiler dependent thing? I'm asking because I had always assumed that the compiler can do whatever it wants with them.
This leads into my next question. If the above example causes memory alignment issues, why can the compiler not just turn that into something like this implicitly:
struct
{
char A;
char C;
int B;
int D;
};
(I'm primarily asking about C++, but I'd be interested to hear the C answer as well)
C99 §6.7.2.1 clause 13 states:
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.
and goes on to say a bit more about padding and addresses. The C89 equivalent section is §6.5.2.1.
C++ is a bit more complicated. In the 1998 and 2003 standards, there is §9.2 clause 12 (clause 15 in C++11):
Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
The data members are arranged in the order declared. The compiler is free to intersperse padding to achieve the memory alignment it likes (and you'll find that many compilers have a boatload of alignment-specification options, useful when mixing bits compiled by different compilers).
See also Why doesn't GCC optimize structs?.
It appears that this answer is somewhat obsolete for C++. You learn something everyday. Thanks aib, Nemanja.
I cannot speak for C++, but in C the order is guaranteed to be the same order in memory as declared in the struct.
Basically, you can count on that only for classes with a standard layout. Strictly speaking, standard layout is a C++0x thing, but it is really just standardizing existing practice.
Aside from padding for alignment, no structure optimization is allowed by any compiler (that I am aware of) for C or C++. I can't speak for C++ classes, as they may be another beast entirely.
Consider your program is interfacing with system/library code on Windows but you want to use GCC. You would have to verify that GCC used an identical layout-optimization algorithm so all your structures would be packed correctly before sending them to the MS-compiled code.
While browsing the related topics at the right, I looked at this question. I figure this may be an interesting corner case when thinking about these issues (unless it's more common than I realize).
To paraphrase, if you have a struct in C that looks something like this:
struct foo{};
and subclass it like so in C++ (using a separate compilation unit):
extern "C" { struct foo {}; } // the C definition, seen with C language linkage
struct bar : public foo {};
Then the memory alignment won't necessarily be the same for the reasons aib mentions (even amongst compilers from the same vendor).