For this simplified test case:
#include <map>
class Tester {
int foo;
std::map<int, int> smap;
};
int main() {
Tester test;
return 0;
}
I get the following compiler warning:
$ clang++ -std=c++98 -Weverything test.cc
test.cc:5:24: warning: padding class 'Tester' with 4 bytes to align 'smap' [-Wpadded]
std::map<int, int> smap;
^
Can anyone explain what this warning means, and how I should address it?
There's no real problem here. In C and C++, the compiler is allowed to insert padding after struct members to provide better alignment, and thus allow faster memory access. In this case, it looks like the compiler has decided to place smap on an 8-byte boundary. Since an int is almost certainly four bytes, the warning is telling you that there are four bytes of wasted space in the middle of the struct.
If there were more members of the struct, then one thing you could try would be to switch the order of the definitions. For example, if your Tester had members:
struct Tester {
int foo;
std::map<int, int> smap;
int bar;
};
then it would make sense to place the two ints next to each other to optimise alignment and avoid wasted space. However, in this case, you only have two members, and if you switch them around then the compiler will probably still add four bytes of padding to the end of the struct in order to optimise the alignment of Testers when placed inside an array.
I'm assuming you're compiling this on a 64-bit system.
On 64-bit systems, pointers are 8 bytes. Compilers will align structure members to natural boundaries, so an 8-byte pointer will start at an offset in a structure that is a multiple of 8 bytes.
Since int is only four bytes, the compiler inserted 4 bytes of "padding" after foo, so that smap is on an 8-byte boundary.
Edit: While smap is not a pointer, but a std::map, the same logic applies. I'm not sure what the exact rules for alignment of objects are, but the same thing is happening.
What to do? Nothing. Your code is perfectly fine, the compiler is just letting you know that this has taken place. There's absolutely nothing to worry about. -Weverything means turn on every possible warning, which is probably excessive for almost all compilations.
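If you want to see the numbers rather than take the warning's word for it, here is a minimal sketch that measures the padding. The struct mirrors the class from the question (made a struct so the members can be measured); the helper name padding_in_tester is made up for this example.

```cpp
#include <cstddef>
#include <map>

// Same members as the Tester class in the question.
struct Tester {
    int foo;
    std::map<int, int> smap;
};

// Hypothetical helper: bytes in Tester that belong to neither member.
constexpr std::size_t padding_in_tester() {
    return sizeof(Tester) - sizeof(int) - sizeof(std::map<int, int>);
}

// The struct is at least as strictly aligned as its most-aligned member,
// and its size is a whole multiple of that alignment.
static_assert(alignof(Tester) >= alignof(std::map<int, int>),
              "struct inherits the strictest member alignment");
static_assert(sizeof(Tester) % alignof(Tester) == 0,
              "size is a multiple of the alignment");
```

On a typical 64-bit toolchain I would expect padding_in_tester() to come out as 4, matching the warning, but treat that as something to check on your own system rather than a guarantee.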
Your compiler on your system chose to give pointers on your 64-bit system 8 bytes, while the int in the struct has 4 bytes. Similar problems/warnings have been occurring to me a lot these days while working with older code examples, so I had to dig deeper.
To make it short, int was defined in the 60's with no 64 bit system, no Gigabytes of storage nor GB of ram in mind.
To solve your error message use size_t (size type) instead of int when necessary - in your case with the map stl since it is programmed to run on multiple different systems.
With size_t your compiler can choose itself what byte size it needs if it compiles on a 32 bit system or a 64 bit system or arm or what ever and the message is gone and you won't even have to modify your code no matter what system you may compile your code for in the future.
Related
Assume I'm on Windows x64. Also assume I have this 9-Byte long example class:
class Example{
public:
double x;
bool y;
void someFunction();
};
If I go ahead and make an array of 4 Example objects, I will be using 36 bytes of memory. My questions are these:
Since I'm on a x64 architecture, does that mean I will have 4 unusable bytes in the end of the array? (36 + 4 = 40 = 5 * 8bytes) And by unusable I mean that my program is not going to use that place of memory, as long as the array exists.
If I compile my c++ program for x32 and the above is true... Do I still have 4 unusable bytes? Is that dependent on what architecture the program runs?
Are there any cases that objects would not use a length of memory that's equal to the size sum of their member variables?
Disclaimer: Not computer scientist / engineer. Easy answers please! Thank you!
Edit 1: The example class is not 9 bytes, it's 16 when used with sizeof(), but in array context, addresses of objects are 9 bytes apart.
The only thing you can be really sure of is that sizeof(Example) is a constant, and is large enough to (at least) contain the values.
When defining a class or struct you actually only specify two things: the types of the individual members, and their order. The compiler is basically free to do the memory representation in any way it wants, as long as it follows those two.
In most cases the compiler will add padding so all members are aligned for easy access, meaning for instance that the offset within the class of a double will be a multiple of 8 bytes.
("Easy access" can be a bit of a rabbit-hole to get into, which is outside of this answer).
Arrays are aligned with the same size as in non-array cases: sizeof(Example[4]) == sizeof(Example)*4
This also means that in most cases the size of Example will be padded to be a multiple of 8 bytes, because then all objects in an array are aligned for easy access.
Note that there are possibilities with preprocessor pragmas like #pragma pack to specify how the compiler should do all this, but they are all compiler-specific and not portable, so I suggest avoiding them.
In short: Don't assume anything about size, but instead use sizeof() where needed.
Even better: Avoid using the binary size anywhere, as the compiler will take care of it in most cases and it will often make the code more complicated than need be.
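The points above can be checked at compile time. This is only a sketch using the class from the question; tail_padding is a hypothetical helper name, and the concrete numbers depend on your platform.

```cpp
#include <cstddef>

// The class from the question (someFunction adds no per-object storage).
class Example {
public:
    double x;
    bool y;
    void someFunction() {}
};

// Array elements sit back to back: the array adds no padding of its own.
static_assert(sizeof(Example[4]) == 4 * sizeof(Example),
              "arrays add no extra padding");
// The size is rounded up so every array element stays aligned.
static_assert(sizeof(Example) % alignof(Example) == 0,
              "size is a multiple of the alignment");
// The object is at least as large as its data members combined.
static_assert(sizeof(Example) >= sizeof(double) + sizeof(bool),
              "members must fit");

// Hypothetical helper: unused bytes after the last member of each object.
constexpr std::size_t tail_padding() {
    return sizeof(Example) - (offsetof(Example, y) + sizeof(bool));
}
```

So any "unusable" bytes live inside each object (after y), not at the end of the array.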
[Not a duplicate of Structure padding and packing. That question is about how and when padding occurs. This one is about how to deal with it.]
I have just realized how much memory is wasted as a result of alignment in C++. Consider the following simple example:
#include <iostream>
using std::cout;

struct X
{
int a;
double b;
int c;
};
int main()
{
cout << "sizeof(int) = " << sizeof(int) << '\n';
cout << "sizeof(double) = " << sizeof(double) << '\n';
cout << "2 * sizeof(int) + sizeof(double) = " << 2 * sizeof(int) + sizeof(double) << '\n';
cout << "but sizeof(X) = " << sizeof(X) << '\n';
}
When using g++ the program gives the following output:
sizeof(int) = 4
sizeof(double) = 8
2 * sizeof(int) + sizeof(double) = 16
but sizeof(X) = 24
That's 50% memory overhead! In a 3-gigabyte array of 134'217'728 Xs 1 gigabyte would be pure padding.
Fortunately, the solution to the problem is very simple - we simply have to swap double b and int c around:
struct X
{
int a;
int c;
double b;
};
Now the result is much more satisfying:
sizeof(int) = 4
sizeof(double) = 8
2 * sizeof(int) + sizeof(double) = 16
but sizeof(X) = 16
There is however a problem: this isn't cross-compatible. Yes, under g++ an int is 4 bytes and a double is 8 bytes, but that's not necessarily always true (their alignment doesn't have to be the same either), so under a different environment this "fix" could not only be useless, but it could also potentially make things worse by increasing the amount of padding needed.
Is there a reliable cross-platform way to solve this problem (minimize the amount of needed padding without suffering from decreased performance caused by misalignment)? Why doesn't the compiler perform such optimizations (swap struct/class members around to decrease padding)?
Clarification
Due to misunderstanding and confusion, I'd like to emphasize that I don't want to "pack" my struct. That is, I don't want its members to be unaligned and thus slower to access. Instead, I still want all members to be self-aligned, but in a way that uses the least memory on padding. This could be solved by using, for example, manual rearrangement as described here and in The Lost Art of Packing by Eric Raymond. I am looking for a way to do this that is automated and as cross-platform as possible, similar to what is described in proposal P1112 for the upcoming C++20 standard.
(Don't apply these rules without thinking. See ESR's point about cache locality for members you use together. And in multi-threaded programs, beware false sharing of members written by different threads. Generally you don't want per-thread data in a single struct at all for this reason, unless you're doing it to control the separation with a large alignas(128). This applies to atomic and non-atomic vars; what matters is threads writing to cache lines regardless of how they do it.)
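For the case where you do keep per-thread data in one struct, here is a rough sketch of controlling the separation with alignas. The struct and member names are hypothetical, and 128 is an assumption taken from the alignas(128) mentioned above.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 128 bytes covers a pair of adjacent 64-byte cache lines.
constexpr std::size_t kSeparation = 128;

// Hypothetical example: two counters written by different threads.
struct SharedCounters {
    alignas(kSeparation) std::atomic<long> produced{0};  // written by thread A
    alignas(kSeparation) std::atomic<long> consumed{0};  // written by thread B
};

// Each member starts its own 128-byte block, so the padding here is deliberate.
static_assert(alignof(SharedCounters) == kSeparation,
              "struct takes the requested alignment");
static_assert(sizeof(SharedCounters) == 2 * kSeparation,
              "one block per counter");
```

This is the opposite trade from the rest of the answer: you spend padding on purpose to keep the two writers off each other's cache lines.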
Rule of thumb: largest to smallest alignof(). There's nothing you can do that's perfect everywhere, but by far the most common case these days is a sane "normal" C++ implementation for a normal 32 or 64-bit CPU. All primitive types have power-of-2 sizes.
Most types have alignof(T) = sizeof(T), or alignof(T) capped at the register width of the implementation. So larger types are usually more-aligned than smaller types.
Struct-packing rules in most ABIs give struct members their absolute alignof(T) alignment relative to the start of the struct, and the struct itself inherits the largest alignof() of any of its members.
Put always-64-bit members first (like double, long long, and int64_t). ISO C++ of course doesn't fix these types at 64 bits / 8 bytes, but in practice on all CPUs you care about they are. People porting your code to exotic CPUs can tweak struct layouts to optimize if necessary.
then pointers and pointer-width integers: size_t, intptr_t, and ptrdiff_t (which may be 32 or 64-bit). These are all the same width on normal modern C++ implementations for CPUs with a flat memory model.
Consider putting linked-list and tree left/right pointers first if you care about x86 and Intel CPUs. Pointer-chasing through nodes in a tree or linked list has penalties when the struct start address is in a different 4k page than the member you're accessing. Putting them first guarantees that can't be the case.
then long (which is sometimes 32-bit even when pointers are 64-bit, in LLP64 ABIs like Windows x64). But it's guaranteed at least as wide as int.
then 32-bit int32_t, int, float, enum. (Optionally separate int32_t and float ahead of int if you care about possible 8 / 16-bit systems that still pad those types to 32-bit, or do better with them naturally aligned. Most such systems don't have wider loads (FPU or SIMD) so wider types have to be handled as multiple separate chunks all the time anyway).
ISO C++ allows int to be as narrow as 16 bits, or arbitrarily wide, but in practice it's a 32-bit type even on 64-bit CPUs. ABI designers found that programs designed to work with 32-bit int just waste memory (and cache footprint) if int was wider. Don't make assumptions that would cause correctness problems, but for "portable performance" you just have to be right in the normal case.
People tuning your code for exotic platforms can tweak if necessary. If a certain struct layout is perf-critical, perhaps comment on your assumptions and reasoning in the header.
then short / int16_t
then char / int8_t / bool
(for multiple bool flags, especially if read-mostly or if they're all modified together, consider packing them with 1-bit bitfields.)
(For unsigned integer types, find the corresponding signed type in my list.)
A multiple-of-8 byte array of narrower types can go earlier if you want it to. But if you don't know the exact sizes of types, you can't guarantee that int i + char buf[4] will fill an 8-byte aligned slot between two doubles. But it's not a bad assumption, so I'd do it anyway if there was some reason (like spatial locality of members accessed together) for putting them together instead of at the end.
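As a small illustration of the largest-to-smallest rule, here is a sketch with two hypothetical structs holding the same members in different orders. The member names are invented for the example.

```cpp
#include <cstdint>

// Members declared in arbitrary order: padding is needed before `value`,
// before `id`, and at the end.
struct Unordered {
    char tag;
    double value;
    short count;
    std::int32_t id;
    bool flag;
};

// Same members, sorted from largest to smallest alignment.
struct Ordered {
    double value;
    std::int32_t id;
    short count;
    char tag;
    bool flag;
};

// Sorting never makes this struct bigger on a normal implementation.
static_assert(sizeof(Ordered) <= sizeof(Unordered),
              "sorted layout should not be larger");
```

On a normal x86-64 ABI I would expect this to be 32 bytes versus 16, but print sizeof() on your own target rather than trusting my arithmetic.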
Exotic types: x86-64 System V has alignof(long double) = 16, but i386 System V has only alignof(long double) = 4, sizeof(long double) = 12. It's the x87 80-bit type, which is actually 10 bytes but padded to 12 or 16 so it's a multiple of its alignof, making arrays possible without violating the alignment guarantee.
And in general it gets trickier when your struct members themselves are aggregates (struct or union) with a sizeof(x) != alignof(x).
Another twist is that in some ABIs (e.g. 32-bit Windows if I recall correctly) struct members are aligned to their size (up to 8 bytes) relative to the start of the struct, even though alignof(T) is still only 4 for double and int64_t.
This is to optimize for the common case of separate allocation of 8-byte aligned memory for a single struct, without giving an alignment guarantee. i386 System V also has the same alignof(T) = 4 for most primitive types (but malloc still gives you 8-byte aligned memory because alignof(max_align_t) = 8). But anyway, i386 System V doesn't have that struct-packing rule, so (if you don't arrange your struct from largest to smallest) you can end up with 8-byte members under-aligned relative to the start of the struct.
Most CPUs have addressing modes that, given a pointer in a register, allow access to any byte offset. The max offset is usually very large, but on x86 it saves code size if the byte offset fits in a signed byte ([-128 .. +127]). So if you have a large array of any kind, prefer putting it later in the struct after the frequently used members. Even if this costs a bit of padding.
Your compiler will pretty much always make code that has the struct address in a register, not some address in the middle of the struct to take advantage of short negative displacements.
Eric S. Raymond wrote an article The Lost Art of Structure Packing. Specifically the section on Structure reordering is basically an answer to this question.
He also makes another important point:
9. Readability and cache locality
While reordering by size is the simplest way to eliminate slop, it’s not necessarily the right thing. There are two more issues: readability and cache locality.
In a large struct that can easily be split across a cache-line boundary, it makes sense to put 2 things nearby if they're always used together. Or even contiguous to allow load/store coalescing, e.g. copying 8 or 16 bytes with one (unaligned) integer or SIMD load/store instead of separately loading smaller members.
Cache lines are typically 32 or 64 bytes on modern CPUs. (On modern x86, always 64 bytes. And Sandybridge-family has an adjacent-line spatial prefetcher in L2 cache that tries to complete 128-byte pairs of lines, separate from the main L2 streamer HW prefetch pattern detector and L1d prefetching).
Fun fact: Rust allows the compiler to reorder structs for better packing, or other reasons. IDK if any compilers actually do that, though. Probably only possible with link-time whole-program optimization if you want the choice to be based on how the struct is actually used. Otherwise separately-compiled parts of the program couldn't agree on a layout.
(@alexis posted a link-only answer linking to ESR's article, so thanks for that starting point.)
gcc has the -Wpadded warning that warns when padding is added to a structure:
https://godbolt.org/z/iwO5Q3:
<source>:4:12: warning: padding struct to align 'X::b' [-Wpadded]
4 | double b;
| ^
<source>:1:8: warning: padding struct size to alignment boundary [-Wpadded]
1 | struct X
| ^
And you can manually rearrange members so that there is less / no padding. But this is not a cross-platform solution, as different types can have different sizes / alignments on different systems (most notably, pointers are 4 or 8 bytes on different architectures). The general rule of thumb is to go from largest to smallest alignment when declaring members, and if you're still worried, compile your code with -Wpadded once (but I wouldn't keep it on generally, because padding is necessary sometimes).
As for why the compiler can't do it automatically: that is because of the standard ([class.mem]/19). It guarantees that, because this is a simple struct with only public members, &x.a < &x.c (for some X x;), so they can't be rearranged.
There really isn't a portable solution in the generic case. Barring minimal requirements the standard imposes, types can be any size the implementation wants to make them.
To go along with that, the compiler is not allowed to reorder class members to make the layout more efficient. The standard mandates that the objects must be laid out in their declared order (by access modifier), so that's out as well.
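The ordering guarantee can be observed directly. A minimal sketch using the struct X from the question; members_in_declaration_order is a hypothetical helper name.

```cpp
#include <cstddef>

struct X {
    int a;
    double b;
    int c;
};

// Later-declared members get higher offsets, so the compiler cannot
// move `c` in front of `b` to save space.
static_assert(offsetof(X, a) < offsetof(X, b), "a precedes b");
static_assert(offsetof(X, b) < offsetof(X, c), "b precedes c");

// Runtime view of the same guarantee, comparing addresses within one object.
inline bool members_in_declaration_order(const X& x) {
    const char* pa = reinterpret_cast<const char*>(&x.a);
    const char* pb = reinterpret_cast<const char*>(&x.b);
    const char* pc = reinterpret_cast<const char*>(&x.c);
    return pa < pb && pb < pc;
}
```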
You can use fixed width types like
struct foo
{
int64_t a;
int16_t b;
int8_t c;
int8_t d;
};
and this will be the same on all platforms, provided they supply those types, but it only works with integer types. There are no fixed-width floating point types and many standard objects/containers can be different sizes on different platforms.
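If you rely on a layout like this, you can at least make the assumptions fail loudly at compile time. A sketch, assuming the usual natural alignment for these types; the expected offsets are my own arithmetic, not something the standard promises.

```cpp
#include <cstddef>
#include <cstdint>

struct foo {
    std::int64_t a;
    std::int16_t b;
    std::int8_t c;
    std::int8_t d;
};

// Members are already sorted largest to smallest, so no interior padding
// is expected: each member starts where the previous one ends.
static_assert(offsetof(foo, a) == 0, "a at offset 0");
static_assert(offsetof(foo, b) == 8, "b directly after a");
static_assert(offsetof(foo, c) == 10, "c directly after b");
static_assert(offsetof(foo, d) == 11, "d directly after c");

// The total size can still differ between platforms, because tail padding
// depends on alignof(std::int64_t).
static_assert(sizeof(foo) % alignof(foo) == 0,
              "size is a multiple of the alignment");
```

A platform that lays the struct out differently then refuses to compile instead of silently disagreeing.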
Mate, in case you have 3GB of data, you should probably approach the issue in a way other than swapping data members.
Instead of using 'array of struct', 'struct of arrays' could be used.
So say
struct X
{
int a;
double b;
int c;
};
constexpr size_t ArraySize = 1'000'000;
X my_data[ArraySize];
is going to become
constexpr size_t ArraySize = 1'000'000;
struct X
{
int a[ArraySize];
double b[ArraySize];
int c[ArraySize];
};
X my_data;
Each element is still easily accessible: my_data.a[i] = 5; my_data.b[i] = 1.5; ...
There is no padding (except possibly a few bytes between arrays). The memory layout is cache friendly. The prefetcher handles reading sequential memory blocks from a few separate memory regions.
That's not as unorthodox as it might look at first glance. That approach is widely used for SIMD and GPU programming.
Array of Structures (AoS), Structure of Arrays
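To make the size difference concrete, here is a small sketch comparing the two layouts. XColumns and sum_b are hypothetical names, and kCount is a small stand-in for the huge array in the question.

```cpp
#include <cstddef>

constexpr std::size_t kCount = 1000;  // small stand-in for the real element count

// Array-of-structs element: padded per element.
struct X {
    int a;
    double b;
    int c;
};

// Struct-of-arrays layout: each field is stored contiguously.
struct XColumns {
    int a[kCount];
    double b[kCount];
    int c[kCount];
};

// The per-element padding of X is gone; at most a few bytes remain
// between or after the arrays.
static_assert(sizeof(XColumns) <= sizeof(X[kCount]),
              "SoA should not be larger than AoS");

// Summing one field only touches that field's array.
inline double sum_b(const XColumns& data) {
    double total = 0.0;
    for (std::size_t i = 0; i < kCount; ++i) {
        total += data.b[i];
    }
    return total;
}
```

For data of the size in the question you would allocate the arrays on the heap (for example with std::vector) rather than use fixed-size members like this.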
This is a textbook memory-vs-speed problem. The padding is to trade memory for speed. You can't say:
I don't want to "pack" my struct.
because pragma pack is the tool invented exactly to make this trade the other way: speed for memory.
Is there a reliable cross-platform way
No, there can't be any. Alignment is a strictly platform-dependent issue. The size of different types is a platform-dependent issue. Avoiding padding by reorganizing is platform-dependent squared.
Speed, memory, and cross-platform - you can have only two.
Why doesn't the compiler perform such optimizations (swap struct/class members around to decrease padding)?
Because the C++ specifications specifically guarantee that the compiler won't mess up your meticulously organized structs. Imagine you have four floats in a row. Sometimes you use them by name, and sometimes you pass them to a method that takes a float[3] parameter.
You're proposing that the compiler should shuffle them around, potentially breaking all the code since the 1970s. And for what reason? Can you guarantee that every programmer ever will actually want to save your 8 bytes per struct? I, for one, am sure that if I have a 3 GB array, I'm having bigger problems than a GB more or less.
Although the Standard grants implementations broad discretion to insert arbitrary amounts of space between structure members, that's because the authors didn't want to try to guess all the situations where padding might be useful, and the principle "don't waste space for no reason" was considered self-evident.
In practice, almost every commonplace implementation for commonplace hardware will use primitive objects whose size is a power of two, and whose required alignment is a power of two that is no larger than the size. Further, almost every such implementation will place each member of a struct at the first available multiple of its alignment that completely follows the previous member.
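The placement rule described above is easy to write down. This is a sketch of that common rule, not anything the Standard mandates; Member, round_up and predicted_size are names invented for the example, and it needs C++14 or later.

```cpp
#include <cstddef>
#include <initializer_list>

// Size and alignment of one member, in declaration order.
struct Member {
    std::size_t size;
    std::size_t align;
};

// Round `offset` up to the next multiple of `align`.
constexpr std::size_t round_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Place each member at the first multiple of its alignment that follows
// the previous member, then round the total up to the largest alignment.
constexpr std::size_t predicted_size(std::initializer_list<Member> members) {
    std::size_t offset = 0;
    std::size_t max_align = 1;
    for (const Member& m : members) {
        offset = round_up(offset, m.align) + m.size;
        if (m.align > max_align) {
            max_align = m.align;
        }
    }
    return round_up(offset, max_align);
}

// Struct from the earlier question, for comparison against sizeof.
struct X {
    int a;
    double b;
    int c;
};
```

With a 4-byte int and a naturally aligned 8-byte double, predicted_size({{4, 4}, {8, 8}, {4, 4}}) gives 24 and predicted_size({{4, 4}, {4, 4}, {8, 8}}) gives 16, which are the two sizes the question reports.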
Some pedants will squawk that code which exploits that behavior is "non-portable". To them I would reply
C code can be non-portable. Although it strove to give programmers the opportunity to write truly portable programs, the C89 Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler”: the ability to write machine specific code is one of the strengths of C.
As a slight extension to that principle, the ability of code which need only run on 90% of machines to exploit features common to that 90% of machines--even though such code wouldn't exactly be "machine-specific"--is one of the strengths of C. The notion that C programmers shouldn't be expected to bend over backward to accommodate limitations of architectures which for decades have only been used in museums should be self-evident, but apparently isn't.
You can use #pragma pack(1), but padding exists precisely because the compiler is optimizing: accessing a naturally aligned variable with a full-width load is faster than accessing a misaligned one.
Specific packing is only useful for serialization and intercompiler compatibility, etc.
As NathanOliver correctly added, this might even fail on some platforms.
I have the following innocuous looking code:
void myFunc(){
struct stance {
long double interval;
QString name;
};
// [...]
}
When I build this using standard version of gcc on Ubuntu 18.04 I get a warning like this:
MySource.cpp:12: warning: padding size of 'stance' with 8 bytes to
alignment boundary (-wpadded)
I know that this warning shows up because the compiler needs to adjust the padding for my struct to something that I might not have expected and is kind enough to warn me as the user about this.
However, I am trying to have a warning-free build and so the question is, how can I make it explicit in my code in a standard compliant way that the compiler does not need to issue this warning?
So to be clear, I don't want to suppress warnings in my build script, nor using #pragma or similar either. I want to change the code of this struct so that my alignment expectations are explicit and match whatever the compiler wants to do hence not needing the warning to be displayed.
Just disable the warning (or - well, don't enable it, I don't know of any warning set like -Wall or -Wextra that includes it). -Wpadded is not meant to be always enabled, unless you want to always manually specify the necessary padding explicitly.
-Wpadded
Warn if padding is included in a structure, either to align an element of the structure or to align the whole structure. Sometimes when this happens it is possible to rearrange the fields of the structure to reduce the padding and so make the structure smaller.
(emphasis added)
This is one case where it's not possible. long double holds 10 bytes of data, but on x86-64 Linux its sizeof is already 16 and it requires 16-byte alignment (sizeof 12 and 4-byte alignment on 32-bit x86 Linux); QString is effectively a pointer, so it needs 8-byte alignment (4 on 32-bit Linux). You can swap them however you want, but if you want to keep natural alignment (and thus best performance) you'll always carry the 6 unused bytes inside the long double plus 8 bytes of padding, either at the end of the struct or between the two members.
In general, adding padding is not a problem, happens all the time and there are cases such as this when it's unavoidable. The general rule to keep it at a minimum is to place elements in order of decreasing alignment requirements, but again, it cannot always be avoided.
As mentioned above, the only alternative (keeping good alignment) is making the padding explicit, but it doesn't make much sense (unless you are designing a file format or something and you want to make everything explicit, but in that case you wouldn't use a QString and would pack to 1 byte).
struct stance {
long double interval; // sizeof is already 16 on x86-64 Linux, including 6 unused bytes
QString name;
char unused[8]; // makes the 8 bytes of tail padding explicit
};
I want to change the code of this struct so that my alignment expectations are explicit
It looks like you want the alignof operator, used together with the alignas specifier. You need at least C++11 for that, and you might also want std::alignment_of.
We are currently trying to keep track of the variables stored in memory, but we have run into the following issue; maybe you can help us out.
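A minimal sketch of what that looks like, using a hypothetical struct rather than the one from the question (QString is left out so the example stands alone). Note that alignas states the alignment you expect; it does not remove padding, so on its own it may not silence -Wpadded.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical struct with an explicitly requested 16-byte alignment.
struct alignas(16) Sample {
    double interval;
    int id;
};

// alignof and std::alignment_of report the same value.
static_assert(alignof(Sample) == 16, "explicit alignment is honoured");
static_assert(std::alignment_of<Sample>::value == alignof(Sample),
              "trait and operator agree");
// The size is still rounded up to a multiple of the alignment.
static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "size is a multiple of the alignment");
```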
Currently we defined some global variables in our code, as follows
int x;
char y;
And we added the following lines of code
int main ( int argc, char *argv[ ] ){
printf("Memory of x %p\n", (void *)&x);
printf("Memory of y %p\n", (void *)&y);
system( "pause");
return 0;
}
The program returned the following address
Memory of x 0x028EE80
Memory of y 0x028EE87
If I make a sizeof x and a sizeof y I get 4 and 1 (the size of types integer and char)
What is then in between 0x028EE84 and 0x028EE86? Why did it take 7 positions to insert the char variable in memory, instead of inserting it at the 0x028EE81 memory position?
In general, the compiler will try to do something called alignment. This means that the compiler will try to place variables at addresses that are multiples of 2, 4, 8, 16, ..., depending on the type and the machine architecture. By doing this, memory reads and writes are faster.
There are a number of very good answers here already; however, I do not feel any of them reaches the very core of this issue. Where a compiler decides to place global variables in memory is not defined by C or C++. Though it may appear convenient to the programmer to store variables contiguously, the compiler has an enormous amount of information regarding your specific system and can thus provide a wide array of optimisations, perhaps causing it to use memory in ways which are not at first obvious.
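To check alignment yourself, something like the following sketch works. is_aligned is a hypothetical helper, and it assumes the alignment is a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// True if the address `p` is a multiple of `alignment`.
// Assumption: `alignment` is a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

Applied to the addresses in the question: 0x028EE80 is a multiple of 4, which suits an int, while 0x028EE87 is odd, which is fine for a char because a char only needs 1-byte alignment.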
Perhaps the compiler decided to place the int in an area of memory with other types of the same alignment and stuck the char among some strings which do not need to be aligned.
Still, the essence of this is that the compiler makes no obligations or promises of where it will store most types of variables in memory and short of reading the full sources of the compiler there is no easy way to understand why it did so. If you care about this so badly you should not be using separate variables, consider putting them into a struct which then has well defined memory placement rules (note padding is still allowed).
Because the compiler is free to insert padding in order to get better alignment.
If you absolutely must have them right next to each other in memory, put them in a struct and use #pragma pack to force the packing alignment to 1 (no padding).
#pragma pack(push, 1)
struct MyStruct
{
int x;
char y;
};
#pragma pack(pop)
This is technically compiler-dependent behavior (not enforced by the C++ standard) but I've found it to be fairly consistent among the major compilers.
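A quick way to confirm the effect on your compiler is to compare the packed struct against an unpacked twin. This is a sketch that relies on the same compiler-specific pragma; MyStructDefault is a hypothetical name for the comparison struct.

```cpp
#include <cstddef>

#pragma pack(push, 1)
struct MyStruct {
    int x;
    char y;
};
#pragma pack(pop)

// Same members with the default packing, for comparison.
struct MyStructDefault {
    int x;
    char y;
};

// Packed: exactly the sum of the members, with 1-byte alignment.
static_assert(sizeof(MyStruct) == sizeof(int) + sizeof(char),
              "no padding when packed to 1");
static_assert(alignof(MyStruct) == 1, "packed struct has alignment 1");
// Default: usually rounded up to a multiple of alignof(int).
static_assert(sizeof(MyStructDefault) >= sizeof(MyStruct),
              "default layout is never smaller");
```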
I am trying to add CUDA to an existing single threaded C program that was written sometime in the late 90s.
To do this I need to mix two languages, C and C++ (nvcc is a C++ compiler).
The problem is that the C++ compiler sees a structure as a certain size, while the C compiler sees the same structure as a slightly different size. That's bad. I am really puzzled by this because I can't find a cause for a 4 byte discrepancy.
/usr/lib/gcc/i586-suse-linux/4.3/../../../../i586-suse-linux/bin/ld: Warning: size of symbol `tree' changed from 324 in /tmp/ccvx8fpJ.o to 328 in gpu.o
My C++ looks like
#include <stdio.h>
#include <stdlib.h>
#include "assert.h"
extern "C"
{
#include "structInfo.h" //contains the structure declaration
}
...
and my C files look like
#include "structInfo.h"
...
with structInfo.h looking like
struct TB {
int nbranch, nnode, root, branches[NBRANCH][2];
double lnL;
} tree;
...
My make file looks like
PRGS = prog
CC = cc
CFLAGS=-std=gnu99 -m32
CuCC = nvcc
CuFlags =-arch=sm_20
LIBS = -lm -L/usr/local/cuda-5.0/lib -lcuda -lcudart
all : $(PRGS)
prog:
$(CC) $(CFLAGS) prog.c gpu.o $(LIBS) -o prog
gpu.o:
$(CuCC) $(CuFlags) -c gpu.cu
Some people asked me why I didn't use a different host compilation option. I think the host compilation option has been deprecated since two releases ago? Also, it never appeared to do what it said it would do.
nvcc warning : option 'host-compilation' has been deprecated and is ignored
GPUs require natural alignment for all data, e.g. a 4-byte int needs to be aligned to a 4-byte boundary and an 8-byte double or long long needs to have 8-byte alignment. CUDA enforces this for host code as well to make sure structs are as compatible as possible between the host and device portions of the code. x86 CPUs on the other hand do not generally require data to be naturally aligned (although performance penalty may result from a lack of alignment).
In this case, CUDA needs to align the double component of the struct to an 8-byte boundary. Since an odd number of int components precede the double, this requires padding. Switching the order of components, i.e. putting the double component first, does not help because in an array of such structs each struct would have to be 8-byte aligned and the size of the struct therefore must be a multiple of 8 bytes to accomplish that, which also requires padding.
To force gcc to align doubles in the same way CUDA does, pass the flag -malign-double.
Seems like different padding applied by 2 compilers: one is working with 4-byte alignment and the other with at least 8-byte alignment. You should be able to force the alignment you want by compiler-specific #pragma directives (check your compiler documentation about the specific #pragma).
There is no guarantee that two different C compilers will use the same representation for the same type -- unless they both conform to some external standard (an ABI) that specifies the representation in sufficient detail.
It's most likely a difference in padding, where one compiler requires a double to be 4-byte aligned and the other requires it to be 8-byte aligned. Both choices are perfectly valid as far as the C and C++ standards are concerned.
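The arithmetic supports that reading. In the sketch below, NBRANCH is not given in the question; 38 is purely my assumption, picked because it reproduces both reported sizes with a 4-byte int and an 8-byte double. round_up and tb_size are hypothetical helper names.

```cpp
#include <cstddef>

// Assumption: NBRANCH is not shown in the question. 38 is used only
// because it reproduces the 324 and 328 reported by the linker.
constexpr std::size_t NBRANCH = 38;

// Round `offset` up to the next multiple of `align`.
constexpr std::size_t round_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Predicted sizeof(struct TB), assuming 4-byte int and 8-byte double,
// when double is aligned to `double_align` bytes. The ints occupy
// (3 + 2 * NBRANCH) * 4 bytes, then lnL is placed, then the total is
// rounded up to the struct's alignment.
constexpr std::size_t tb_size(std::size_t double_align) {
    return round_up(round_up((3 + 2 * NBRANCH) * 4, double_align) + 8,
                    double_align > 4 ? double_align : 4);
}

static_assert(tb_size(4) == 324, "double aligned to 4 bytes");
static_assert(tb_size(8) == 328, "double aligned to 8 bytes");
```

With that assumption, the ints take 316 bytes; a 4-byte-aligned double follows immediately (324 total), while an 8-byte-aligned double needs 4 bytes of padding first (328 total).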
You can investigate this in more detail by printing out the sizes and offsets of all the members of your structure:
printf("nbranch: size %3u offset %3u\n",
(unsigned)sizeof tree.nbranch,
(unsigned)offsetof(struct TB, nbranch));
/* and similarly for the other members */
There may be a compiler-specific way to specify a different alignment, but such techniques are not always safe.
The ideal solution would be to use the same compiler for the C and C++ code. C is not a subset of C++, but it generally shouldn't be too difficult to modify existing C code so it compiles as C++.
Or you might be able to rearrange your structure definition so that both compilers happen to lay it out the same way. Placing the double member first is likely to work. This is still not guaranteed to work, and it could break with future versions of either compiler, but it's probably good enough.
Don't forget that there could also be padding at the very end of the structure; this is sometimes necessary to guarantee proper alignment for arrays of structures. Look at sizeof (struct TB) and compare it to the size and offset of the last declared member.
Another possibility: Insert explicit unused members to force a consistent alignment. For example, suppose you have:
struct foo {
uint16_t x;
uint32_t y;
};
and one compiler puts y at 16 bits, and the other puts it at 32 bits with 16 bits of padding. If you change the definition to:
struct foo {
uint16_t x;
uint16_t unused_padding;
uint32_t y;
};
then you're more likely to have x and y have the same offset under both compilers. You'll still have to experiment to make sure everything is consistent.
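Part of that experimenting can be automated: put the layout you expect into the shared header as compile-time checks, so whichever compiler disagrees fails to build. A sketch in C++ (in the C translation units the same idea would need C11's _Static_assert); the expected numbers are my assumption for ordinary platforms.

```cpp
#include <cstddef>
#include <cstdint>

struct foo {
    std::uint16_t x;
    std::uint16_t unused_padding;  // explicit filler, never read
    std::uint32_t y;
};

// With the filler in place, y should land at offset 4 and nothing should
// be left for the compiler to pad.
static_assert(offsetof(foo, x) == 0, "x at offset 0");
static_assert(offsetof(foo, y) == 4, "y at offset 4");
static_assert(sizeof(foo) == 8, "no hidden padding");
```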
Since the C and C++ code are going to be part of the same program (right?), you shouldn't have to worry about things like varying byte order. If you wanted to transmit values of your structure type between separate programs, say by storing them in files or transmitting them over a network, you might need to define a consistent way to serialize a structure value into a sequence of bytes and vice versa.
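For that last case, here is a minimal sketch of what such a serialization could look like for the two-member foo above. The wire format (x then y, little-endian, no padding) and the function names serialize and deserialize are choices made for this example only.

```cpp
#include <array>
#include <cstdint>

struct foo {
    std::uint16_t x;
    std::uint32_t y;
};

// Wire format chosen for this sketch: x then y, little-endian, no padding.
inline std::array<std::uint8_t, 6> serialize(const foo& v) {
    return {{
        static_cast<std::uint8_t>(v.x & 0xFF),
        static_cast<std::uint8_t>((v.x >> 8) & 0xFF),
        static_cast<std::uint8_t>(v.y & 0xFF),
        static_cast<std::uint8_t>((v.y >> 8) & 0xFF),
        static_cast<std::uint8_t>((v.y >> 16) & 0xFF),
        static_cast<std::uint8_t>((v.y >> 24) & 0xFF),
    }};
}

// Rebuild the struct from the 6-byte wire format.
inline foo deserialize(const std::array<std::uint8_t, 6>& b) {
    foo v;
    v.x = static_cast<std::uint16_t>(b[0] | (b[1] << 8));
    v.y = static_cast<std::uint32_t>(b[2]) |
          (static_cast<std::uint32_t>(b[3]) << 8) |
          (static_cast<std::uint32_t>(b[4]) << 16) |
          (static_cast<std::uint32_t>(b[5]) << 24);
    return v;
}
```

Because the bytes are written one field at a time, the result does not depend on how either compiler pads the struct or on the host byte order.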