There is already a question on this topic (notably "How to get address of some struct member in array of structures").
My question is the following:
When we use a struct to describe a hardware device, each structure member corresponds to a register of the device. How can we be sure that each member of the structure is mapped correctly onto its register?
The compiler's ABI dictates the alignment of the members, and the user can also make mistakes, so it seems the only way to be sure that the mapping is correct is to check at run time.
The map file (at least for GNU ld) does not provide any clue about the placement of structure members.
Is there a way to know at compile or link time where each structure member is located?
You can use offsetof along with, in C++, static_assert.
For example, the entirely arbitrary
#include <cstddef>
#include <cstdint>
struct iom { // off,len
uint32_t rx; // +0,4
uint32_t tx; // +4,4
uint64_t clk; // +8,8
uint16_t irq; // +16,2
};
static_assert(offsetof(iom,rx)==0);
static_assert(offsetof(iom,tx)==4);
static_assert(offsetof(iom,clk)==8);
static_assert(offsetof(iom,irq)==16);
If a static_assert fails, for example because your compiler aligns the members to 64-bit boundaries, you need a compiler-specific way to alter the alignment and padding. E.g., with gcc, put
} __attribute__((packed));
at the end of the struct definition.
NB. I've answered for C++17.
C++11 or 14, or C11 require an error message as the second argument to static_assert, although you can wrap the whole thing in a macro to compose a nice string for you.
The offsetof macro works in C as well.
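The macro wrapper mentioned above could look roughly like the following sketch. CHECK_OFFSET is a made-up name, not a standard facility; it just builds the message string from its arguments so the checks also compile as C++11/14.

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // fixed-width integer types

// Hypothetical helper: composes the static_assert message from its arguments,
// so it works where the message argument is mandatory (C++11/14).
#define CHECK_OFFSET(type, member, expected)                 \
    static_assert(offsetof(type, member) == (expected),      \
                  #type "::" #member " is not at offset " #expected)

struct iom {              // off,len
    std::uint32_t rx;     // +0,4
    std::uint32_t tx;     // +4,4
    std::uint64_t clk;    // +8,8
    std::uint16_t irq;    // +16,2
};

CHECK_OFFSET(iom, rx, 0);
CHECK_OFFSET(iom, tx, 4);
CHECK_OFFSET(iom, clk, 8);
CHECK_OFFSET(iom, irq, 16);
```

If a member moves, the build stops with a message naming the struct, the member and the offset that was expected.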
Related
I'm building a C++ library which uses many functions and struct's defined in a C library. To avoid porting any code to C++, I add the typical conditional preprocessing to the C header files. For example,
//my_struct.h of the C library
#include <complex.h>
#ifdef __cplusplus
extern "C" {
#endif
typedef struct {
double d1,d2,d3;
#ifdef __cplusplus
std::complex<double> z1,z2,z3;
std::complex<double> *pz;
#else
double complex z1,z2,z3;
double complex *pz;
#endif
int i,j,k;
} my_struct;
//Memory allocating + initialization function
my_struct *
alloc_my_struct(double);
#ifdef __cplusplus
}
#endif
The implementation of alloc_my_struct() is compiled in C. It simply allocates memory via malloc() and initializes members of my_struct.
Now when I do the following in my C++ code,
#include "my_struct.h"
...
my_struct *const ms = alloc_my_struct(2.);
I notice that *ms always has the expected memory layout, i.e., any access such as ms->z1 evaluates to the expected value. I find this really cool considering that (correct me if I'm wrong) the memory layout of my_struct during allocation is decided by the C compiler (in my case gcc -std=c11), while during access it is decided by the C++ compiler (in my case g++ -std=c++11).
My question is : Is this compatibility standardized? If not, is there any way around it?
NOTE : I don't have enough knowledge to argue against alignment, padding, and other implementation-defined specifics. But it is noteworthy that the GNU scientific library, which is C-compiled, is implementing the same approach (although their structs do not involve C99 complex numbers) for use in C++. On the other hand, I've done sufficient research to conclude that C++11 guarantees layout compatibility between C99 double complex and std::complex<double>.
C and C++ do share memory layout rules. In both languages structs are placed in memory in the same way. Note, however, that extern "C" {} only affects the linkage of functions and variables (name mangling); it does not by itself change or guarantee struct layout.
But what your code is doing relies on C++ std::complex and C99 complex being layout-compatible.
So see:
https://gcc.gnu.org/ml/libstdc++/2007-02/msg00161.html
C Complex Numbers in C++?
Your program has undefined behaviour: your definitions of my_struct are not lexically identical.
You're gambling that alignment, padding and various other things will not change between the two compilers, which is bad enough… but since this is UB anything could happen even if it were true!
It may not always be identical!
In this case it looks like sizeof(std::complex<double>) is identical to sizeof(double complex).
Also pay attention to the fact that compilers may (or may not) add padding to structs to align them to a specific value, depending on the compiler configuration. The padding may not always be identical, which results in different structure sizes (between C and C++).
Links to related posts:
C/C++ Struct memory layout equivalency
I would add compiler-specific attributes to "pack" the fields,
thereby guaranteeing all the ints are adjacent and compact. This is
less about C vs. C++ and more about the fact that you are likely using
two "different" compilers when compiling in the two languages, even if
those compilers come from a single vendor.
Adding a constructor will not change the layout (though it will make
the class non-POD), but adding access specifiers like private between
the two fields may change the layout (in practice, not only in
theory).
C struct memory layout?
In C, the compiler is allowed to dictate some alignment for every
primitive type. Typically the alignment is the size of the type. But
it's entirely implementation-specific.
Padding bytes are introduced so every object is properly aligned.
Reordering is not allowed.
Possibly every remotely modern compiler implements #pragma pack which
allows control over padding and leaves it to the programmer to comply
with the ABI. (It is strictly nonstandard, though.)
From C99 §6.7.2.1:
12 Each non-bit-field member of a structure or union object is aligned
in an implementation-defined manner appropriate to its type.
13 Within a structure object, the non-bit-field members and the units
in which bit-fields reside have addresses that increase in the order
in which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa.
There may be unnamed padding within a structure object, but not at its
beginning.
In general, C and C++ have compatible struct layouts, because the layout is dictated by the platform's ABI rules, not just by the language, and (for most implementations) C and C++ follow the same ABI rules for type sizes, data layout, calling conventions etc.
C++11 even defined a new term, standard-layout, which means the type will have a layout compatible with a similar type in C. That means it can't use virtual functions or virtual base classes, data members with mixed access control, or data members spread over several classes of an inheritance hierarchy (and a few other things). A C++ standard-layout type should have the same layout as an equivalent C type.
As noted in other answers, your specific code is not safe in general because std::complex<double> and complex double are not equivalent types, and there is no guarantee that they are layout-compatible. However GCC's C++ standard library ensures it will work because std::complex<double> and std::complex<float> are implemented in terms of the underlying C types. Instead of containing two double, GCC's std::complex<double> has a single member of type __complex__ double, which the compiler implements identically to the equivalent C type.
GCC does this specifically to support code like yours, because it's a reasonable thing to want to do.
So combining GCC's special efforts for std::complex with the standard-layout rules and the platform ABI, means that your code will work with that implementation.
This is not necessarily portable to other C++ implementations.
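Some of these assumptions can at least be checked on the C++ side at compile time. The sketch below uses a hypothetical my_struct_cpp that mirrors the C++ branch of the header; it cannot prove that the C compiler produces the same layout, it only catches a few ways the assumption could break.

```cpp
#include <complex>
#include <type_traits>

// Hypothetical mirror of the C++ view of my_struct, for illustration only.
struct my_struct_cpp {
    double d1, d2, d3;
    std::complex<double> z1, z2, z3;
    std::complex<double> *pz;
    int i, j, k;
};

// A standard-layout type is laid out like the equivalent C struct would be.
static_assert(std::is_standard_layout<my_struct_cpp>::value,
              "my_struct_cpp is not standard-layout");

// std::complex<double> must occupy exactly two doubles (real, imaginary)
// for it to overlay a C99 double complex.
static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
              "std::complex<double> is not the size of two doubles");
```

Comparing offsetof values printed by a C build of the same header would be a stronger check, since it involves both compilers.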
Also note that by using malloc() for a struct containing a C++ object (std::complex<double>) you skip the constructor, and this is also UB. Even if you expect the constructor to be empty or to just zero the value, so that skipping it looks harmless, you can't complain if this breaks. So your program works by pure luck.
I would like to avoid falling into the XY trap, so here is the original problem:
We have a small program which creates a shared memory segment on the PC. This program creates it by reading its structure from its header file (a bunch of individual and nested struct definitions). Basically just a .h and a .cpp file. This program will be compiled by g++.
We would like to create another program, a shared memory viewer, which displays the layout of this memory in a tree view. For that, we have to parse the previously mentioned header file and compute the offsets to read/manipulate the content of a specific part of the shared memory. We do not want to write a parser if it is not necessary, especially because the header file contains additional declarations and definitions too. This program will be compiled by the same version of g++ as the previous program.
Originally, we wanted to use gccxml in the second program to parse the header file, but it is based on gcc 4.2 and cannot parse the included header files which contain C++11 code. Another idea is to use libclang to get the structure of that header file. libclang provides size information too, but I do not know if the sizes of the types and the padding/alignment are the same for g++ and clang.
My question is: can you assume that the size of the C++ types and the padding/alignment of the structs will be the same when you compile the code with clang and g++? The environment (PC, OS) is the same. I am afraid we cannot, because the C++ standard does not specify the exact sizes of the types.
Do you know another solution to the original problem?
Short answer: Since clang has as a goal to "be compatible with gcc" (for both C and C++), I would say that you can expect it to generate the same offsets and sizes for the same code.
Long answer:
Assuming you are using only basic types (int, short, double, char and pointers to those types), and we're restricting to gcc and clang (and their C++ versions), keeping to the same OS and same bitness (32- or 64-bit on "both sides"), then subject to actual bugs in the compiler, it should have the same structure layout.
Of course, that is a long list of restrictions, and of course the "subject to actual bugs" is a never-ending concern in these cases.
You can make your case a bit easier if you use defined size types, such as uint32_t rather than int - conversely, if you put a class member in the structure, that has virtual members, you'd be seriously in trouble - but that doesn't work very well with shared memory anyway, as it's not guaranteed to be at the same place in different applications.
Be wary of STL functionality - you may not get the same C++ library for the two compilers (you may, or may not, depending on how you installed it).
I would double check, by adding some code to print the offset and size of important members (and run with both compilers, of course) - don't forget to do this for the members deep inside some struct, since it could well be that the overall size of a struct could be identical and the content could be at different offsets.
(As others have said, I have seen projects where some code is generated with a script that prints the offsets of the struct members, and this is used as input for other programs in the project)
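A minimal sketch of such an offset-printing helper is shown below. The structs Inner and Segment and the macro DUMP_MEMBER are invented for illustration; the real header would supply its own types. Building this with both g++ and clang and comparing the text output is one way to do the double check.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical shared-memory layout, used only for illustration.
struct Inner {
    std::uint8_t  flags;
    std::uint32_t counter;
};

struct Segment {
    std::uint16_t version;
    Inner         inner;
    double        value;
};

// Emits one line per member: name, offset from `base`, and size in bytes.
#define DUMP_MEMBER(os, base, type, member)                      \
    ((os) << #type "::" #member                                  \
          << " offset=" << ((base) + offsetof(type, member))     \
          << " size=" << sizeof(type::member) << '\n')

std::string dump_layout() {
    std::ostringstream os;
    os << "Segment size=" << sizeof(Segment) << '\n';
    DUMP_MEMBER(os, 0, Segment, version);
    DUMP_MEMBER(os, 0, Segment, inner);
    // Members of the nested struct are reported relative to the segment start.
    DUMP_MEMBER(os, offsetof(Segment, inner), Inner, flags);
    DUMP_MEMBER(os, offsetof(Segment, inner), Inner, counter);
    DUMP_MEMBER(os, 0, Segment, value);
    return os.str();
}
```

The returned text can be written to a file and diffed between compilers, or used as input for the viewer instead of parsing the header.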
Actually, in this particular case, you should be fine.
The memory layout of data structures is part of the ABI (Application Binary Interface), and gcc and clang both follow the Itanium ABI on x86 (and x86_64). Therefore, barring bugs, and provided they both compile for x86 or x86_64, they should end up with binary-compatible types.
In the general case, you would typically cheat:
Use packed data structure: struct X { ... } __attribute__((packed)) __attribute__((aligned (8))); and you completely control the structure memory layout
As mentioned by Alf, have one compiler spew the offset of each member and use that to feed the generation of structures for the second compiler
Other ?
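As a rough illustration of the first option, the sketch below compares a default-layout struct with a packed and aligned one. It assumes a GCC-compatible compiler and a platform where uint32_t has 4-byte alignment; the struct and member names are invented.

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: the compiler inserts padding before `value`.
struct Natural {
    std::uint8_t  tag;
    std::uint32_t value;
    std::uint16_t crc;
};

// GCC/Clang-specific: no padding between members, whole struct 8-byte aligned.
struct X {
    std::uint8_t  tag;
    std::uint32_t value;
    std::uint16_t crc;
} __attribute__((packed)) __attribute__((aligned(8)));

// Assumes alignof(std::uint32_t) == 4 on the target.
static_assert(offsetof(Natural, value) == 4, "expected 3 padding bytes after tag");

static_assert(offsetof(X, value) == 1, "packed: value follows tag directly");
static_assert(offsetof(X, crc) == 5, "packed: crc follows value directly");
static_assert(alignof(X) == 8, "aligned(8) sets the struct alignment");
// 7 bytes of members, rounded up to a multiple of the alignment.
static_assert(sizeof(X) == 8, "unexpected size for X");
```

Note that `value` inside X is misaligned, so accesses to it may be slower, and taking its address for use through an ordinary pointer is risky on some targets.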
Sizes of data types vary from platform to platform. Instead of hardcoding them, use the sizeof operator to find the size that applies on the target platform, for example:
sizeof(int)
sizeof(char)
sizeof(double)
etc.
If you use fixed width integer types (http://en.cppreference.com/w/cpp/types/integer) in a C-style struct and arrange members in decreasing order of size (i.e. largest members first), it should be pretty safe.
I think I understand your issue. This is what Chrome does
COMPILE_ASSERT(sizeof(double) == 8, Double_size_not_8);
It assumes the sizes will match but checks just to make sure.
COMPILE_ASSERT is a macro. You can find the definition here but the short version is it's just what it says. An assert that happens at compile time.
If the sizes did not match then one way to deal with it is to define your header in bytes only. Instead of for example
struct SomeBinaryFileHeader {
int version;
int width;
int height;
};
You might do this
struct SomeBinaryFileHeaderReadWriteVersion {
uint8_t version_0;
uint8_t version_1;
uint8_t version_2;
uint8_t version_3;
uint8_t width_0;
uint8_t width_1;
uint8_t width_2;
uint8_t width_3;
uint8_t height_0;
uint8_t height_1;
uint8_t height_2;
uint8_t height_3;
};
Etc., and then convert from one to the other, which will work even across machines of different endianness.
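The conversion step might be sketched as follows. This assumes the file stores each field least significant byte first and uses uint32_t instead of int; Le32, load_le32, store_le32, SomeBinaryFileHeaderBytes and decode are names invented for this example, and the four bytes of each field are grouped into a small array rather than spelled out one by one.

```cpp
#include <cstdint>

// Byte-only form of one 32-bit field, least significant byte first
// (the byte order is an assumption of this sketch).
struct Le32 {
    std::uint8_t b[4];
};

// Reassembles the value with shifts, so host endianness does not matter.
constexpr std::uint32_t load_le32(const Le32 &f) {
    return static_cast<std::uint32_t>(f.b[0])
         | (static_cast<std::uint32_t>(f.b[1]) << 8)
         | (static_cast<std::uint32_t>(f.b[2]) << 16)
         | (static_cast<std::uint32_t>(f.b[3]) << 24);
}

// Splits a value into its four bytes, least significant byte first.
constexpr Le32 store_le32(std::uint32_t v) {
    return Le32{{static_cast<std::uint8_t>(v & 0xFFu),
                 static_cast<std::uint8_t>((v >> 8) & 0xFFu),
                 static_cast<std::uint8_t>((v >> 16) & 0xFFu),
                 static_cast<std::uint8_t>((v >> 24) & 0xFFu)}};
}

// In-memory form, with whatever layout the compiler prefers.
struct SomeBinaryFileHeader {
    std::uint32_t version, width, height;
};

// Read/write form: only bytes, so there is nothing to pad or reorder.
struct SomeBinaryFileHeaderBytes {
    Le32 version, width, height;
};

constexpr SomeBinaryFileHeader decode(const SomeBinaryFileHeaderBytes &raw) {
    return SomeBinaryFileHeader{load_le32(raw.version), load_le32(raw.width),
                                load_le32(raw.height)};
}

static_assert(sizeof(SomeBinaryFileHeaderBytes) == 12,
              "byte-only header should contain no padding");
```

Only the byte-only struct is ever read from or written to the file; the rest of the program works with the decoded form.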
Working with a piece of code right now that features declarations of the form:
typedef PACKED(struct)
{
// some stuff in here
} struct_name;
Now, PACKED is a macro on our part. What the heck does this syntax mean? I don't understand the use of parentheses. This is not compiling, so I'm guessing this is probably incorrect. Is this close to some other valid syntax, or is it just nonsense?
If it is almost valid..how is this code actually supposed to be written and what is it supposed to mean?
The only form of typedef struct I've seen and can find online is:
typedef struct
{
// some stuff in here
} struct_name;
Solved: All I needed to realize was that struct was a parameter in a macro function. Thanks!
Usually something in all caps is a macro. In this case it's probably supposed to decorate the struct declaration with the syntax for creating a packed structure, which will vary based on the compiler used.
Chances are you're completely missing the definition of the PACKED macro.
Most compilers provide the functionality to specify how structures should be stored in memory. It's rarely used, but packed usually means *have this structure occupy the least space possible, even at a loss of speed in accessing its members*.
Given that your code doesn't compile, I'd say PACKED was most likely a macro now lost, or a compiler-specific keyword not available in the one you're using.
With that syntax, PACKED has to be a macro.
Probably referring to some manner of struct alignment
__attribute__((packed, aligned(1)))
#pragma pack(push,1)
//...
#pragma pack(pop)
though hard for us to tell you, as opposed to the reverse :))
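For illustration, one plausible definition of such a macro for GCC-compatible compilers is sketched below. The real PACKED macro from the question is unknown, so this definition and the struct members are assumptions; other compilers would need a different expansion, for example one built on #pragma pack.

```cpp
#include <cstdint>

// Hypothetical definition; the project's actual PACKED macro is not shown.
#if defined(__GNUC__)
#  define PACKED(decl) decl __attribute__((packed))
#else
#  error "PACKED is only sketched for GCC-compatible compilers here"
#endif

// Expands to: typedef struct __attribute__((packed)) { ... } struct_name;
typedef PACKED(struct)
{
    std::uint8_t  tag;
    std::uint32_t value;
} struct_name;

// Without packing this would typically be 8 bytes (3 bytes of padding).
static_assert(sizeof(struct_name) == 5, "struct_name is not packed");
```

So `struct` is simply the argument passed to the function-like macro, which then appends the compiler-specific decoration.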
Imagine a struct made up of 32-bit, 16-bit, and 8-bit member values, ordered so that each member is on its natural boundary.
struct Foo
{
uint32_t a;
uint16_t b;
uint8_t c;
uint8_t d;
uint32_t e;
};
Member alignment and padding rules are documented for Visual C++. On VC++, sizeof(Foo) for the above struct is predictably 12.
Now, I'm pretty sure the rule is that no assumption should be made about padding and alignment, but in practice, do other compilers on other operating systems make similar guarantees?
If not, is there an equivalent of "#pragma pack(1)" on GCC?
In practice, on any system where the uintXX_t types exist, you will get the desired alignment with no padding. Don't throw in ugly gcc-isms to try to guarantee it.
Edit: To elaborate on why it may be harmful to use attribute packed or aligned, it may cause the whole struct to be misaligned when used as a member of a larger struct or on the stack. This will definitely hurt performance and, on non-x86 machines, will generate much larger code. It also means it's invalid to take a pointer to any member of the struct, since code that accesses the value through a pointer will not be aware that it could be misaligned and thus could fault.
As for why it's unnecessary, keep in mind that attribute is specific to gcc and gcc-workalike compilers. The C standard does not leave alignment undefined or unspecified. It's implementation-defined which means the implementation is required to further specify and document how it behaves. gcc's behavior is, and always has been, to align each struct member on the next boundary of its natural alignment (the same alignment it would have when used outside of a struct, which is necessarily a number that evenly divides the size of the type). Since attribute is a gcc feature, if you use it you're already assuming a gcc-like compiler, but then by assumption you have the alignment you want already.
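The natural-alignment rule described above can be written down and compared against what the compiler actually does. In this sketch, align_up and the off_* constants are invented helpers; the static_asserts fail at compile time if a compiler lays Foo out differently from the rule.

```cpp
#include <cstddef>
#include <cstdint>

struct Foo {
    std::uint32_t a;
    std::uint16_t b;
    std::uint8_t  c;
    std::uint8_t  d;
    std::uint32_t e;
};

// Rounds `offset` up to the next multiple of `alignment`.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Offsets predicted by "place each member at the next naturally aligned boundary".
constexpr std::size_t off_a = 0;
constexpr std::size_t off_b = align_up(off_a + sizeof(std::uint32_t), alignof(std::uint16_t));
constexpr std::size_t off_c = align_up(off_b + sizeof(std::uint16_t), alignof(std::uint8_t));
constexpr std::size_t off_d = align_up(off_c + sizeof(std::uint8_t), alignof(std::uint8_t));
constexpr std::size_t off_e = align_up(off_d + sizeof(std::uint8_t), alignof(std::uint32_t));

static_assert(offsetof(Foo, b) == off_b, "b is not where the rule predicts");
static_assert(offsetof(Foo, c) == off_c, "c is not where the rule predicts");
static_assert(offsetof(Foo, d) == off_d, "d is not where the rule predicts");
static_assert(offsetof(Foo, e) == off_e, "e is not where the rule predicts");
// The total size is the end of the last member rounded up to the struct alignment.
static_assert(sizeof(Foo) == align_up(off_e + sizeof(std::uint32_t), alignof(Foo)),
              "unexpected tail padding in Foo");
```

Checks like these document the layout assumption without resorting to packing attributes.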
In general you are correct that it's not a safe assumption, although you will often get the packing you expect on many systems. You may want to use the packed attribute on your types when you use gcc.
E.g.
struct __attribute__((packed)) Blah { /* ... */ };
On systems that actually offer those types, it is highly likely to work. On, say, a 36-bit system those types would not be available in the first place.
GCC provides an attribute
__attribute__ ((packed))
With similar effect.
While reading about the function InterlockedIncrement I saw the remark that the variable passed must be aligned on a 32-bit boundary. Normally I have seen the code which uses the InterlockedIncrement like this:
class A
{
public:
A();
void f();
private:
volatile long m_count;
};
A::A() : m_count(0)
{
}
void A::f()
{
::InterlockedIncrement(&m_count);
}
Does the above code work properly in multi-processor systems or should I take some more care for this?
It depends on your compiler settings. However, by default, anything eight bytes and under will be aligned on a natural boundary. Thus an "int" will be aligned on a 32-bit boundary.
Also, the "#pragma pack" directive can be used to change alignment inside a compile unit.
I would like to add that the answer assumes Microsoft C/C++ compiler. Packing rules might differ from compiler to compiler. But in general, I would assume that most C/C++ compilers for Windows use the same packing defaults just to make working with Microsoft SDK headers a bit easier.
The code looks fine (variables will be properly aligned unless you specifically do something to break that - usually involving casting or 'packed' structures).
Yes, this will work fine. Compilers usually do align unless instructed otherwise.
Strictly speaking, it really depends on your usage of A - for instance, if you pack an "A" object within a shell ITEMIDLIST, or a struct with a bad "pragma pack" the data may not be properly aligned.
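The difference between the default case and a "bad pragma pack" can be made visible at compile time. The sketch below uses int32_t as a stand-in for the 32-bit Windows long and two invented structs; it assumes a compiler that supports #pragma pack and a target where int32_t has 4-byte alignment.

```cpp
#include <cstddef>
#include <cstdint>

// Default packing: the counter sits on a 32-bit boundary.
struct Counter {
    char tag;
    volatile std::int32_t m_count;
};

// A "bad" pack setting: the counter lands at offset 1.
#pragma pack(push, 1)
struct PackedCounter {
    char tag;
    volatile std::int32_t m_count;
};
#pragma pack(pop)

// Holds by default: member offset and struct alignment are multiples of 4.
static_assert(offsetof(Counter, m_count) % 4 == 0, "m_count is not 32-bit aligned");
static_assert(alignof(Counter) >= 4, "Counter objects may start off a 32-bit boundary");

// With pack(1) neither property holds any more, so the member would not be
// a safe argument for a function that requires 32-bit alignment.
static_assert(offsetof(PackedCounter, m_count) == 1, "expected a misaligned member");
static_assert(alignof(PackedCounter) == 1, "expected byte alignment");
```

A pair of asserts like the first two, placed next to the class that owns the counter, would flag a packing change before it turns into a runtime problem.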