Is there any formal specification for the layout and memory alignment for the pseudo members of a tuple?
Is there anyway to modify the memory alignment of types in a tuple? Is it effected by a #pragma pack() directive?
For example:
typedef std::tuple<uint8_t, uint32_t> myTuple;
Is there any specification that says this will be in memory the same as:
#pragma pack() // Default packing
struct myStruct
{
uint8_t first;
uint32_t second;
}
Apologies if this is a stupid question but I don't entirely understand alignment when it comes to templates.
Edit: Example of what I'm trying to accomplish
Currently I have something along the lines of...
#pragma pack(push)
#pragma pack(4)
struct cTriangle
{
uint32 Index[3];
};
#pragma pack(pop)
template <class T>
inline bool Read(cFileStream& fStream, std::vector<T>& vec)
{
if (!vec.size())
return true;
// fStream.Read(void* pBuffer, size_t Size)
// Just a wrapper around a binary ifstream really
return fStream.Read(&vec[0], sizeof(T) * vec.size());
}
std::vector<cVector3> vPoint;
vPoint.resize(Verticies);
bool result = Read(FileStream, vPoint);
If I wanted to typedef cTriangle as std::tuple<uint32, uint32, uint32> for metaprogramming purposes would I still be able to read/write to the raw memory of the tuple (and thus a vector of tuples) or would that memory have unknown alignment?
Not only is there no requirement that the objects be arranged any particular way, but many tuple implementations actually put the second object before the first one.
Tuples are typically not standard-layout, because standard-layout classes can have at most one class in their inheritance hierarchy with non-static data members, and the typical way to implement variadic tuple is through recursive inheritance, with each level of recursion adding one data member. This allows tuple implementations to elide distinct empty members through the empty base class optimisation, which is not available for struct members.
If you check that sizeof(myTuple) == sizeof(myStruct), you are reasonably entitled to assume that the memory layout of the tuple contains the elements of the struct in some (consistent) order, but actually relying on that for aliasing will likely cause undefined behaviour.
If as you say you just want alias with tuple for metaprogramming, you'd be better off using a metaprogramming library such as Boost.Fusion that allows you to annotate the struct type with its members:
#pragma pack(push)
#pragma pack(4)
struct cTriangle {
uint32 Index[3];
};
#pragma pack(pop)
BOOST_FUSION_ADAPT_STRUCT(
cTriangle,
(uint32[3], Index))
As David points out, there is no guarantee according to the standard. However, if you are interested in a two value tuple, you might want to use std::pair<T1, T2>, which is standard-layout (if T1,T2 are standard-layout) and thus has predictable memory layout.
See also C++0x Tuples Store Elements Backwards
[EDIT]
Sorry, I had not seen ecatmur's answer before answering your comment. By the way: If all members of your struct have the same type, you can of course use std::array, which is standard layout and which allows element access with std::get similar to std::tuple.
Related
Following a snippet of code from Loki singleton implementation which shows what it calls "MaxAlign Trick". I assume it has something to do with alignment (duh!), but what's the purpose of trying to align with all the types mentioned inside the union? Will the placement new inside Create() break without it?
template <class T> struct CreateStatic
{
union MaxAlign
{
char t_[sizeof(T)];
short int shortInt_;
int int_;
long int longInt_;
float float_;
double double_;
long double longDouble_;
struct Test;
int Test::* pMember_;
int (Test::*pMemberFn_)(int);
};
static T* Create()
{
static MaxAlign staticMemory_;
return new(&staticMemory_) T;
}
// other code...
}
MaxAlign serves two purposes. First, it is an implementation of the C++11 std::max_align_t: "a trivial standard-layout type whose alignment requirement is at least as strict (as large) as that of every scalar type." (cppreference). Since the alignment of a type is the alignment of the data member with the highest alignment requirements, the definition of MaxAlign gives us exactly that: a type that is guaranteed to have the max. alignment for the platform of interest.
Second, it is also a buffer that is large enough to contain a T:
char t_[sizeof(T)];
Taking both aspects, MaxAlign provides the C++11 feature std::aligned_storage_t<size, alignment> (without taking over-alignment of T is into account - it probably didn't even exist back then).
But why is it needed: placement new requires the buffer to be suitably aligned for the instance that is being constructed. Without this "alignment trick", you might end up with undefined behaviour. T being an unkonwn type, Loki circumvents any risks by choosing the maximal alignment for the platform the code is being compiled for.
In modern code, you would probably not use placement new, but use a static object on the stack, e.g.
static T& Create() {
static T instance;
return instance;
}
But 20 years ago, this might not have worked properly across compilers and/or in multi-threaded environments (proper initialisation of T instance in the above is only guaranteed since C++11 IIRC).
I have tried union...
struct foo
{
union
{
struct // 2 bytes
{
char var0_1;
};
struct // 5 bytes
{
char var1_1;
int var1_2;
};
};
};
Problem: Unions do what I want, except they will always take the size of the biggest datatype. In my case I need struct foo to have some initialization that allows me to tell it which structure to chose of the two (if that is even legal) as shown below.
So after that, I tried class template overloading...
template <bool B>
class foo { }
template <>
class foo<true>
{
char var1;
}
template <>
class foo<false>
{
char var0;
int var1;
}
Problem: I was really happy with templates and the fact that I could use the same variable name on the char and int, but the problem was the syntax. Because the classes are created on compile-time, the template boolean variable needed to be a hardcoded constant, but in my case the boolean needs to be user-defined on runtime.
So I need something of the two "worlds." How can I achieve what I'm trying to do?
!!NOTE: The foo class/struct will later be inherited, therefore as already mentioned, size of foo is of utmost importance.
EDIT#1::
Application:
Basically this will be used to read/write (using a pointer as an interface) a specific data buffer and also allow me to create (new instance of the class/struct) the same data buffer. The variables you see above specify the length. If it's a smaller data buffer, the length is written in a char/byte. If it's a bigger data buffer, the first char/byte is null as a flag, and the int specifies the length instead. After the length it's obvious that the actual data follows, hence why the inheritance. Size of class is of the utmost importance. I need to have my cake and eat it too.
A layer of abstraction.
struct my_buffer_view{
std::size_t size()const{
if (!m_ptr)return 0;
if (*m_ptr)return *m_ptr;
return *reinterpret_cast<std::uint32_t const*>(m_ptr+1);
}
std::uint8_t const* data() const{
if(!m_ptr)return nullptr;
if(*m_ptr)return m_ptr+1;
return m_ptr+5;
}
std::uint8_t const* begin()const{return data();}
std::uint8_t const* end()const{return data()+size();}
my_buffer_view(std::uint_t const*ptr=nullptr):m_ptr(ptr){}
my_buffer_view(my_buffer_view const&)=default;
my_buffer_view& operator=(my_buffer_view const&)=default;
private:
std::uint8_t const* m_ptr=0;
};
No variable sized data anywhere. I coukd have used a union for size etx:
struct header{
std::uint8_t short_len;
union {
struct{
std::uint32_t long_len;
std::uint8_t long_buf[1];
}
struct {
std::short_buf[1];
}
} body;
};
but I just did pointer arithmetic instead.
Writing such a buffer to a bytestream is another problem entirely.
Your solution does not make sense. Think about your solution: you could define two independents classes: fooTrue and fooFalse with corresponding members exactly with the same result.
Probably, you are looking for a different solution as inheritance. For example, your fooTrue is baseFoo and your fooFalse is derivedFoo with as the previous one as base and extends it with another int member.
In this case, you have the polymorphism as the method to work in runtime.
You can't have your cake and eat it too.
The point of templates is that the specialisation happens at compile time. At run time, the size of the class is fixed (albeit, in an implementation-defined manner).
If you want the choice to be made at run time, then you can't use a mechanism that determines size at compile-time. You will need a mechanism that accommodates both possible needs. Practically, that means your base class will need to be large enough to contain all required members - which is essentially what is happening with your union based solution.
In reference to your "!!NOTE". What you are doing qualifies as premature optimisation. You are trying to optimise size of a base class without any evidence (e.g. measurement of memory usage) that the size difference is actually significant for your application (e.g. that it causes your application to exhaust available memory). The fact that something will be a base for a number of other classes is not sufficient, on its own, to worry about its size.
Is the layout in memory of test1 and test2 the same?
std::vector<std::pair<int,int> > test1;
std::vector<mystruct> test2;
where mystruct is defined as:
struct mystruct{
int a;
int b;
};
If you see on the cplusplus.com, you'll see that this is the struct of a pair:
template <class T1, class T2> struct pair
{
typedef T1 first_type;
typedef T2 second_type;
T1 first;
T2 second;
pair() : first(T1()), second(T2()) {}
pair(const T1& x, const T2& y) : first(x), second(y) {}
template <class U, class V>
pair (const pair<U,V> &p) : first(p.first), second(p.second) { }
}
Exactly the same, i would say, except some facts:
Well, beginning with the fact that pairs are compatible with the std containers and all that, for example, maps.
Also, the pairs are already made, and already have constructors for you.
EDIT:
I also forgot to mention that you'll have std::make_pair for you, that allow you skipping allocating memory and making your own pair in a struct and you too have some comparison and assignment operators defined.
Logically, std::pair<int, int> should be defined like that also.
There is however nothing about that on the standard and it is completely unguaranteed. You could take a look at the header files in your compiler to confirm that, but that doesn't prove anything.
Note: If you think it is absurd to be otherwise, I can give you an example how it could be defined otherwise. Imagine in the other stl template classes that use std::pair, they feel it would be convenient (for any reason), to have a pointer to the node containing the pair, in the pair. This way, they could, internally, add a pointer to the pair class while not violating any rules. Your assumption would then cause havoc. I would say it is unlikely for them to do such a thing, yes, but as long as they are not forced to that layout, anything can happen.
Yes, at least in Release mode it'll be the same size.
You shouldn't use this knowledge to do some memory trickery though, because of two reasons:
1)
The std uses a lot of additional debugging stuff. That makes sizes of classes differ in debug and release mode. So it might very well be that std::pair is larger in debug mode.
2)
The std might change internally. There is no guarantee that std::pair won't change it's memory layout in a different std version. So if you rely on that you have to live with the fear that one day it might not work anymore.
In any reasonable implementation of std::pair they will have the same layout. But this begs the question of why does it matter? You should not be doing a binary assignment from one to the other.
You can add a copy constructor to the struct, either directly or by subclassing it. You can also add a method that does the conversion the other way around.
struct mystruct{
int a;
int b;
mystruct(const std::pair<int,int> &rhs) { a=rhs.first; b=rhs.second; }
std::pair<int,int> as_pair() { return std::make_pair(a, b); }
};
What would be a decent approach to check at compile/run-time that a particular struct/class does not have any virtual functions. This check is required in order to ensure the proper byte alignment when doing placement new.
Having so much as a single virtual function will shift the entire data by a vtable pointer size, which will completely mess things up in conjunction with the placement new operator.
Some more details: I need something that works across all major compiler and platforms, e.g. VS2005, VC++10, GCC 4.5, and Sun Studio 12.1 on top of Windows, Linux, and Solaris.
Something that is guaranteed to work with the following scenario should suffice:
struct A { char c; void m(); };
struct B : A { void m(); };
Should someone decide to make this change:
struct A { char c; virtual void m(); };
struct B : A { void m(); };
It would be great to see a compile-time error that says struct A must not contain virtual functions.
There are facilities and tricks (depending on the version of C++ you are using) to get the proper alignment for a class.
In C++0x, the alignof command is similar to sizeof but returns the required alignment instead.
In C++03, the first thing to note is that the size is a multiple of the alignment, because elements need be contiguous in an array. This means that using the size as the alignment is over-zealous (and may waste space) but works fine. With some trickery you can get a better value:
template <typename T>
struct AlignHelper
{
T t;
char c;
};
template <typename T>
struct Alignment
{
static size_t const diff = sizeof(AlignHelper<T>) - sizeof(T);
static size_t const value = (diff != 0) ? diff : sizeof(T);
};
This little helper gives a correct alignment as a compile-time constant (suitable for template programming therefore). It may be larger than the minimal alignment required (*).
Normally though it should be fine to use placement new, unless you are actually using it on a "raw buffer". In this case, the size of the buffer should be determined with the following formula:
// C++03
char buffer[sizeof(T) + alignof(T) - 1];
Or you should make use of C++0x facilities:
// C++0x
std::aligned_storage<sizeof(T), alignof(T)> buffer;
Another trick to ensure a "right" alignment for virtual tables it to make use of the union:
// C++03 and C++0x
union { char raw[sizeof(T)]; void* aligner; } buffer;
The aligner parameter guarantees that the buffer is correctly aligned for pointers, and thus for virtual tables pointers as well.
EDIT: Additional explanations as suggested by #Tony.
(*) How does this work ?
To understand it we need to delve into the memory representation of a class. Each subelement of a class has its own alignment requirement, so for example:
struct A { int a; char b; int c; };
+----+-+---+----+
| a |b|xxx| c |
+----+-+---+----+
Where xxx denotes padding added so that c is suitably aligned.
What is the alignment of A ? Generally speaking, it is the stricter alignment of the subelements, so here, the alignment of int (which is often 4 since int is often a 32 bits integral).
To "guess" the alignment of an arbitrary type, we thus "trick" the compiler by using the AlignHelper template. Remember that sizeof(AlignHelper<T>) must be a multiple of the alignment because types should be laid out contiguously in an array, thus we hope our type will be padded after the c attribute, and the alignment will be the size of c (1 by definition) plus the size of the padding.
// AlignHelper<T>
+----------------+-+---+
| t |c|xxx|
+----------------+-+---+
// T
+----------------+
| t |
+----------------+
When we do sizeof(AlignHelper<T>) - sizeof(T) we get this difference. Surprisingly though, it could be 0.
The issue comes from the fact that if there is some padding (unused bytes) at the end of T, then a smart compiler could decide to stash c there, and thus the difference of size would be 0.
We could, obviously, try to recursively increase the size of c attribute (using a char array), until we finally get a non-zero difference. In which case we would get a "tight" alignment, but the simplest thing to do is to bail out and use sizeof(T), since we already know it is a multiple of the alignment.
Finally, there is no guarantee that the alignment we get with this method is the alignment of T, we get a multiple of it, but it could be bigger, since sizeof is implementation dependent and a compiler could decide to align all types on power of 2 boundaries, for example.
What would be a decent approach to
check at compile/run-time that a
particular struct/class does not have
any virtual functions
template<typename T>
struct Is_Polymorphic
{
struct Test : T { virtual ~Test() = 0; };
static const bool value = (sizeof(T) == sizeof(Test));
};
Above class can help you to check if the given class is polymorphic or not at compile time. [Note: virtual inheritance also have a vtable included]
You are almost certainly doing something wrong.
However, given that you have decided to do something wrong, you don't want to know if your tpe has no virtual functions. You want to know if it is okay to treat your type as an array of bytes.
In C++03, is your type POD? As luck would have it, there's a trait for that, aptly named is_pod<T>. This is provided by Boost/TR1 in C++03, although it requires a relatively modern compiler [gcc > 4.3, MSVC > 8, others I don't know].
In C++11, you can ease up your requirements by asking if your type is trivially copiable. Again, there's a trait for that: is_trivially_copyable<T>.
In either case, there is also is_polymorphic<T>, but as I said, that's really not what you want anyhow. If you are using an older compiler, it does have the advantage of working out of the box if you get it from Boost; it performs the sizeof test mentioned elsewhere, rather than simply reporting false for all user defined types as is the case with is_pod.
No matter what, you'd better be 120% sure your constructor is a noop; that's not something that can be verified.
I just saw your edit. Of what you listed, Sun Studio is the only one that might not have the necessary intrinsics for these traits to work. gcc and MSVC have both had them for several years now.
dynamic_cast is only allowed for polymorphic classes, so you can utilize that for a compile time check.
Use is_pod type trait from tr1?
There is no feature for you to determine whether a class has virtual functions are not.
Is the following legal C++ with well-defined behaviour?
class my_class { ... };
int main()
{
char storage[sizeof(my_class)];
new ((void *)storage) my_class();
}
Or is this problematic because of pointer casting/alignment considerations?
Yes, it's problematic. You simply have no guarantee that the memory is properly aligned.
While various tricks exist to get storage with proper alignment, you're best off using Boost's or C++0x's aligned_storage, which hide these tricks from you.
Then you just need:
// C++0x
typedef std::aligned_storage<sizeof(my_class),
alignof(my_class)>::type storage_type;
// Boost
typedef boost::aligned_storage<sizeof(my_class),
boost::alignment_of<my_class>::value>::type storage_type;
storage_type storage; // properly aligned
new (&storage) my_class(); // okay
Note that in C++0x, using attributes, you can just do this:
char storage [[align(my_class)]] [sizeof(my_class)];
As people have mentioned here, this won't necessarily work due to alignment restrictions. There are several ways to get the alignment right. First, if you have a C++0x-compliant compiler, you can use the alignof operator to try to force the alignment to be correct. Second, you could dynamically-allocate the character array, since memory from operator new is guaranteed to be aligned in such a way that anything can use it correctly. Third, you could try storing the character array in a union with some type that has the maximum possible alignment on your system; I believe that this article has some info on it (though it's designed for C++03 and is certainly not as good as the alignof operator that's coming out soon).
Hope this helps!
It is at least problematic due to alignment.
On most Non-Intel architecture the code will generate a "bus error" due to wrong alignment or be extremely slow because of processor traps needed to fix the unaligned memory access.
On Intel architecture this will normally just be a bit slower than usual. Except if some SSE operations are involved, then it may also crash.
In case anyone wants to avoid Boost or C++1x, this complete code works both in GCC and MSVC. The MSVC-specific code is based on Chromium's aligned_memory.h. It's a little more complex than the GCC version, because MSVC's __declspec(align(.)) only accepts literal alignment values, and this is worked around using template specialization for all possible alignments.
#ifdef _MSC_VER
template <size_t Size, size_t Align>
struct AlignedMemory;
#define DECLARE_ONE_ALIGNED_MEMORY(alignment) \
template <size_t Size> \
struct __declspec(align(alignment)) AlignedMemory<Size, alignment> { \
char mem[Size]; \
};
DECLARE_ONE_ALIGNED_MEMORY(1)
DECLARE_ONE_ALIGNED_MEMORY(2)
DECLARE_ONE_ALIGNED_MEMORY(4)
DECLARE_ONE_ALIGNED_MEMORY(8)
DECLARE_ONE_ALIGNED_MEMORY(16)
DECLARE_ONE_ALIGNED_MEMORY(32)
DECLARE_ONE_ALIGNED_MEMORY(64)
DECLARE_ONE_ALIGNED_MEMORY(128)
DECLARE_ONE_ALIGNED_MEMORY(256)
DECLARE_ONE_ALIGNED_MEMORY(512)
DECLARE_ONE_ALIGNED_MEMORY(1024)
DECLARE_ONE_ALIGNED_MEMORY(2048)
DECLARE_ONE_ALIGNED_MEMORY(4096)
#else
template <size_t Size, size_t Align>
struct AlignedMemory {
char mem[Size];
} __attribute__((aligned(Align)));
#endif
template <class T>
struct AlignedMemoryFor : public AlignedMemory<sizeof(T), __alignof(T)> {};
The char array may not be aligned correctly for the size of myclass. On some architectures, that means slower accesses, and on others, it means a crash. Instead of char, you should use a type whose alignment is equal to or greater than that of the struct, which is given by the largest alignment requirement of any of its members.
#include <stdint.h>
class my_class { int x; };
int main() {
uint32_t storage[size];
new(storage) my_class();
}
To allocate enough memory for one my_class instance, I think size ought to be sizeof(my_class) / sizeof(T), where T is whichever type you use to get the correct alignment.