Testing C++ code for endian-independence

How can I test or check C++ code for endian-independence? It's already implemented, I would just like to verify that it works on both little- and big-endian platforms.
I could write unit tests and run them on the target platforms, but I don't have the hardware. Perhaps emulators?
Are there compile time checks that can be done?

If you have access to an x86-based Mac then you can take advantage of the fact that Mac OS X has PowerPC emulation built in, as well as developer tool support for both x86 (little endian) and PowerPC (big endian). This enables you to compile and run big endian and little endian executables on the same platform, e.g.
$ gcc -arch i386 foo.c -o foo_x86 # build little endian x86 executable
$ gcc -arch ppc foo.c -o foo_ppc # build big endian PowerPC executable
Having built both big endian and little endian executables you can then run whatever unit tests you have available on both, which will catch some classes of endianness-related problems, and you can also compare any data generated by the executables (files, network packets, whatever) - this should obviously match.
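One cheap sanity check is a small C++ program that dumps the in-memory byte layout of a known value; build it with both -arch flags and compare the output. The test pattern below is arbitrary, purely for illustration:
#include <cstddef>
#include <cstdint>
#include <cstdio>

int main()
{
    // 0xAABBCCDD is an arbitrary pattern; its in-memory byte order differs
    // between the x86 (little endian) and PowerPC (big endian) builds.
    std::uint32_t value = 0xAABBCCDDu;
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof value; ++i)
        std::printf("%02X ", bytes[i]);
    std::printf("\n");
    return 0;
}
The x86 build should print DD CC BB AA and the PowerPC build AA BB CC DD; any serialized data your real code produces, however, should be byte-identical from both builds.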

You can set up an execution environment with the opposite endianness using qemu. For example, if you have access to little-endian amd64 or i386 hardware, you can set up qemu to emulate a PowerPC Linux platform and run your code there.

I read a story about using Flint (Flexible Lint) to diagnose this kind of error.
Don't know the specifics anymore, but let me google the story back for you:
http://www.datacenterworks.com/stories/flint.html
An Example: Diagnosing Endianness Errors
On a recent engagement, we were porting code from an old Sequent to a SPARC, and after the specific pointer issues we discussed in the Story of Thud and Blunder, we needed to look for other null pointer issues and also endian-ness errors.

I would suggest adapting a coding technique that avoids the problem all together.
First, you have to understand in which situations an endianness problem occurs. Then either find an endianness-agnostic way to write the code, or isolate the code that depends on byte order.
For example, a typical place where endianness issues occur is where you use memory accesses or unions to pick out parts of a larger value. Concretely, avoid:
long x;
...
char second_byte = *(((char *)&x) + 1);
Instead, write:
long x;
...
char second_byte = (char)(x >> 8);
Concatenation is another of my favorites, as many people tend to think it can only be done with strange tricks. Don't do this:
union uu
{
    long x;
    unsigned short s[2];
};
union uu u;
u.s[0] = low;
u.s[1] = high;
long res = u.x;
Instead write:
long res = (((unsigned long)high) << 16) | low;

I could write unit tests and run them on the target platforms, but I don't have the hardware.
You can set up your design so that unit tests are easy to run independently of actually having the hardware. You can do this using dependency injection. I abstract away things like hardware interfaces by providing a base interface class that the code I'm testing talks to.
class IHw
{
public:
    virtual void SendMsg1(const char* msg, size_t size) = 0;
    virtual void RcvMsg2(Msg2Callback* callback) = 0;
    ...
};
Then I can have the concrete implementation that actually talks to hardware:
class CHw : public IHw
{
public:
    void SendMsg1(const char* msg, size_t size);
    void RcvMsg2(Msg2Callback* callback);
};
And I can make a test stub version:
class CTestHw : public IHw
{
public:
    void SendMsg1(const char* msg, size_t size);
    void RcvMsg2(Msg2Callback* callback);
};
Then my real code can use the concrete CHw, while in test code I can simulate the hardware with CTestHw.
class CSomeClassThatUsesHw
{
public:
    explicit CSomeClassThatUsesHw(IHw* hw) : hw(hw) {}

    void MyCallback(const char* msg, size_t size)
    {
        // process msg 2
    }

    void DoSomethingToHw(Msg2Callback* callback)
    {
        hw->SendMsg1("some message", 12);   // whatever the real code needs to send
        hw->RcvMsg2(callback);              // callback forwards received data to MyCallback
    }

private:
    IHw* hw;
};
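For endianness specifically, the test stub can go one step further and record the raw bytes it is handed, so a unit test can assert the exact wire order on whatever host the tests run on. A rough sketch building on the IHw interface above (CRecordingTestHw, the hypothetical call and the expected bytes are my own illustration, not part of the answer's code):
#include <vector>

class CRecordingTestHw : public IHw
{
public:
    void SendMsg1(const char* msg, size_t size) override
    {
        // Capture exactly what would have gone out on the wire.
        sent.assign(msg, msg + size);
    }
    void RcvMsg2(Msg2Callback* callback) override
    {
        // Feed canned, known-byte-order data to the callback if the test needs it.
    }

    std::vector<unsigned char> sent;
};

// In a test, hand a CRecordingTestHw to the code under test, drive it, then
// check the recorded bytes against the documented wire order (big endian here):
//   CRecordingTestHw hw;
//   codeUnderTest.SendValueOverHw(&hw, 0x1234);   // hypothetical call
//   assert(hw.sent.size() == 2 && hw.sent[0] == 0x12 && hw.sent[1] == 0x34);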

IMO, the only answer that comes close to being correct is Martin's. There are no endianness concerns to address if you aren't communicating with other applications in binary or reading/writing binary files. What happens in a little endian machine stays in a little endian machine if all of the persistent data are in the form of a stream of characters (e.g. packets are ASCII, input files are ASCII, output files are ASCII).
I'm making this an answer rather than a comment to Martin's answer because I am proposing you consider doing something different from what Martin proposed. Given that the dominant machine architecture is little endian while network order is big endian, many advantages arise if you can avoid byte swapping altogether. The solution is to make your application able to deal with wrong-endian inputs. Make the communications protocol start with some kind of machine identity packet. With this info at hand, your program can know whether it has to byte swap subsequent incoming packets or leave them as-is. The same concept applies if the header of your binary files has some indicator that lets you determine the endianness of those files. With this kind of architecture at hand, your application(s) can write in native format and can know how to deal with inputs that are not in native format.
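A minimal sketch of that machine-identity idea, assuming the stream starts with a fixed 32-bit magic number written by the sender in its native byte order (the constant and helper names are illustrative, not an established protocol):
#include <cstdint>

constexpr std::uint32_t kMagic = 0x01020304u;   // sender writes this in its native order

inline std::uint32_t byteswap32(std::uint32_t v)
{
    return  (v >> 24)
         | ((v >>  8) & 0x0000FF00u)
         | ((v <<  8) & 0x00FF0000u)
         |  (v << 24);
}

// Decide once, from the header, whether the rest of the stream must be swapped.
inline bool needsSwap(std::uint32_t magicAsRead)
{
    return magicAsRead == byteswap32(kMagic);   // sender has the opposite endianness
}

// Every subsequent 32-bit field read from the stream then goes through:
//   value = mustSwap ? byteswap32(raw) : raw;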
Well, almost. There are other problems with binary exchange / binary files. One such problem is floating point data. The IEEE floating point standard doesn't say anything about how floating point data are stored. It says nothing regarding byte order, nothing about whether the significand comes before or after the exponent, nothing about the storage bit order of the as-stored exponent and significand. This means you can have two different machines of the same endianness that both follow the IEEE standard and you can still have problems communicating floating point data as binary.
Another problem, not so prevalent today, is that endianness is not binary. There are other options than big and little. Fortunately, the days of computers that stored things in 2143 order (as opposed to 1234 or 4321 order) are pretty much behind us, unless you deal with embedded systems.
Bottom line:
If you are dealing with a near-homogenous set of computers, with only one or two oddballs (but not too odd), you might want to think of avoiding network order. If the domain has machines of multiple architectures, some of them very odd, you might have to resort to the lingua franca of network order. (But do beware that this lingua franca does not completely resolve the floating point problem.)

I personally use Travis to test my software hosted on github and it supports running on multiple architectures [1], including s390x which is big endian.
I just had to add this to my .travis.yml:
arch:
- amd64
- s390x # Big endian arch
It's probably not the only CI offering this, but it's the one I was already using. I run both unit tests and integration tests on both systems, which gives me reasonable confidence that the code works fine regardless of endianness.
It's no silver bullet though; I'd like an easy way to test manually too, just to ensure there's no hidden error (e.g. I'm using SDL, so colors could be wrong; I use screenshots to validate the output, but the screenshot code could itself have an error that compensates for the display problem, so the tests could pass while the display is wrong).
[1] https://blog.travis-ci.com/2019-11-12-multi-cpu-architecture-ibm-power-ibm-z

Related

Are packed structs portable?

I have some code on a Cortex-M4 microcontroller and I'd like to communicate with a PC using a binary protocol. Currently, I'm using packed structs using the GCC-specific packed attribute.
Here is a rough outline:
struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
} __attribute__((__packed__));

struct TelemetryPacket {
    Sensor1Telemetry tele1;
    Sensor2Telemetry tele2;
    // etc...
} __attribute__((__packed__));
My question is:
Assuming that I use the exact same definition for the TelemetryPacket struct on the MCU and in the client app, will the above code be portable across multiple platforms? (I'm interested in x86 and x86_64, and need it to run on Windows, Linux and OS X.)
Do other compilers support packed structs with the same memory layout? With what syntax?
EDIT:
Yes, I know packed structs are non-standard, but they seem useful enough to consider using them.
I'm interested in both C and C++, although I don't think GCC would handle them differently.
These structs are not inherited and don't inherit anything.
These structs only contain fixed-size integer fields, and other similar packed structs. (I've been burned by floats before...)
Considering the mentioned platforms, yes, packed structs are completely fine to use. x86 and x86_64 have always supported unaligned access, and contrary to common belief, unaligned access on these platforms has had (almost) the same speed as aligned access for a long time; it is not the case that unaligned access is much slower. The only drawback is that the access may not be atomic, but I don't think that matters in this case. And there is agreement between compilers: packed structs will use the same layout.
GCC/clang supports packed structs with the syntax you mentioned. MSVC has #pragma pack, which can be used like this:
#pragma pack(push, 1)
struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
};
#pragma pack(pop)
Two issues can arise:
Endianness must be the same across platforms (your MCU must be using little-endian)
If you assign a pointer to a packed struct member, and you're on an architecture which doesn't support unaligned access (or use instructions which have alignment requirements, like movaps or ldrd), then you may get a crash using that pointer (gcc doesn't warn you about this, but clang does).
Here's the doc from GCC:
The packed attribute specifies that a variable or structure field
should have the smallest possible alignment—one byte for a variable
So GCC guarantees that no padding will be used.
MSVC:
To pack a class is to place its members directly after each other in
memory
So MSVC guarantees that no padding will be used.
The only "dangerous" area I've found, is the usage of bitfields. Then the layout may differ between GCC and MSVC. But, there's an option in GCC, which makes them compatible: -mms-bitfields
Tip: even, if this solution works now, and it is highly unlikely that it will stop working, I recommend you keep dependency of your code on this solution low.
Note: I've considered only GCC, clang and MSVC in this answer. There are compilers maybe, for which these things are not true.
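Whichever packing syntax you end up with, it is also cheap to verify at compile time that the layout is really what you expect; a small sketch, reusing the struct from the question:
#include <cstddef>
#include <cstdint>

struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
} __attribute__((__packed__));

// Fails to compile on any compiler/platform where padding sneaks back in,
// so layout mismatches are caught before anything goes over the wire.
static_assert(sizeof(Sensor1Telemetry) == 2 + 4 + 2,
              "Sensor1Telemetry must contain no padding");
static_assert(offsetof(Sensor1Telemetry, timestamp) == 2,
              "unexpected offset for timestamp");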
If endianness is not an issue, both compilers handle packing correctly, and the type definitions on both C implementations are accurate (standard compliant), then yes, "packed structures" are portable.
For my taste that is too many "if"s. Do not do this; it's not worth the hassle that may arise.
You could do that, or use a more reliable alternative.
For the hard core amongst the serialisation fanatics out there, there's CapnProto. This gives you a native structure to deal with, and undertakes to ensure that when it's transferred across a network and lightly worked on, it'll still make sense at the other end. To call it a serialisation is nearly inaccurate; it aims to do as little as possible to the in-memory representation of a structure. It might be amenable to porting to an M4.
There's Google Protocol Buffers, that's binary. More bloaty, but pretty good. There's the accompanying nanopb (more suited to microcontrollers), but it doesn't do the whole of GPB (I don't think it does oneof). Many people use it successfully though.
Some of the C asn1 runtimes are small enough for use on micro controllers. I know this one fits on M0.
You should never use structs across compile domains, nor against memory (hardware registers, picking apart items read from a file, or passing data between processors or between different software on the same processor, e.g. between an app and a kernel driver). You are asking for trouble, as the compiler has somewhat free will to choose alignment, and on top of that the user can make it worse by using modifiers.
No, there is no reason to assume you can do this safely across platforms, even if you use the same gcc compiler version, for example, against different targets (different builds of the compiler as well as the target differences).
To reduce your odds of failure, start with the largest items first (64-bit, then 32-bit, then 16-bit, then lastly any 8-bit items). Ideally align on 32 bits, perhaps 64, which one would hope ARM and x86 do, but that can always change, and the default can be modified by whomever builds the compiler from sources.
Now if this is a job security thing, sure, go ahead; you can do regular maintenance on this code, likely needing a definition of each structure for each target (so one copy of the source code for the structure definition for ARM and another for x86, or you will need this eventually if not immediately). And then every release, or every few product releases, you get to be called in to do work on the code... nice little maintenance time bombs that go off...
If you want to safely communicate between compile domains or between processors of the same or different architectures, use an array of some size: a stream of bytes, a stream of halfwords, or a stream of words. This significantly reduces your risk of failure and maintenance down the road. Do not use structures to pick apart those items; that just restores the risk and failure.
The reason folks seem to think this is okay is because they are using the same compiler or family against the same target or family (or compilers derived from other compilers' choices). As you understand the rules of the language and where the implementation-defined areas are, you will eventually run across a difference; sometimes it takes decades in your career, sometimes it takes weeks... It's the "works on my machine" problem...
If you want something maximally portable, you can declare a buffer of uint8_t[TELEM1_SIZE] and memcpy() to and from offsets within it, performing endianness conversions such as htons() and htonl() (or little-endian equivalents such as the ones in glib). You could wrap this in a class with getter/setter methods in C++, or a struct with getter-setter functions in C.
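A rough sketch of that buffer-plus-offsets approach; TELEM1_SIZE, the offsets and the field names below are invented here purely for illustration:
#include <cstdint>
#include <cstring>
#include <arpa/inet.h>   // htons/htonl (POSIX; Winsock provides the same names on Windows)

enum { TELEM1_SIZE = 8 };   // 2 + 4 + 2 bytes, matching the packed layout discussed above

void packSensor1(uint8_t (&buf)[TELEM1_SIZE],
                 int16_t temperature, uint32_t timestamp, uint16_t voltageMv)
{
    // Convert each field to a fixed (network/big-endian) byte order, then copy
    // it to its agreed offset; no struct layout or host endianness is involved.
    uint16_t t  = htons(static_cast<uint16_t>(temperature));
    uint32_t ts = htonl(timestamp);
    uint16_t v  = htons(voltageMv);
    std::memcpy(buf + 0, &t,  sizeof t);
    std::memcpy(buf + 2, &ts, sizeof ts);
    std::memcpy(buf + 6, &v,  sizeof v);
}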
It strongly depends on what the struct is; bear in mind that in C++ a struct is a class with default visibility public.
So you can inherit from it and even add virtual members, and this could break things for you.
If it is a pure data class (in C++ terms, a standard-layout class) this should work in combination with packed.
Also bear in mind that if you start doing this you might get problems with the strict aliasing rules of your compiler, because you will have to look at the byte representation of your memory (-fno-strict-aliasing is your friend).
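One way to stay on the right side of those aliasing rules without reaching for -fno-strict-aliasing is to go through memcpy rather than casting pointers when you need the byte representation; a small sketch:
#include <cstdint>
#include <cstring>

// Well-defined way to look at the bytes of a float: copy its object
// representation instead of aliasing it through a cast. Compilers typically
// optimise the memcpy away.
std::uint32_t bitsOf(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "unexpected float size");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}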
Note
That being said I would strongly advise against using that for serialization. If you use tools for this (i.e.: protobuf, flatbuffers, msgpack, or others) you get a ton of features:
language independence
rpc (remote procedure call)
data specification languages
schemas/validation
versioning
Speaking about alternatives, and considering your question Tuple-like container for packed data (on which I don't have enough reputation to comment), I suggest having a look at Alex Robenko's CommsChampion project:
COMMS is the C++(11) headers only, platform independent library, which makes the implementation of a communication protocol to be an easy and relatively quick process. It provides all the necessary types and classes to make the definition of the custom messages, as well as wrapping transport data fields, to be simple declarative statements of type and class definitions. These statements will specify WHAT needs to be implemented. The COMMS library internals handle the HOW part.
Since you're working on a Cortex-M4 microcontroller, you may also find interesting that:
The COMMS library was specifically developed to be used in embedded systems including bare-metal ones. It doesn't use exceptions and/or RTTI. It also minimises usage of dynamic memory allocation and provides an ability to exclude it altogether if required, which may be needed when developing bare-metal embedded systems.
Alex provides an excellent free ebook titled Guide to Implementing Communication Protocols in C++ (for Embedded Systems) which describes the internals.
Here is pseudo code towards an algorithm that may fit your needs for ensuring use with the proper target OS and platform.
If using the C language you will not be able to use classes, templates and a few other things, but you can use preprocessor directives to create the version of your struct(s) you need based on the OS, the architecture (CPU-GPU-hardware controller manufacturer: Intel, AMD, IBM, Apple, etc.), the platform (x86 or x64), and finally the endianness of the byte layout. Otherwise the focus here is on C++ and the use of templates.
Take your struct(s) for example:
struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
} __attribute__((__packed__));

struct TelemetryPacket {
    Sensor1Telemetry tele1;
    Sensor2Telemetry tele2;
    // etc...
} __attribute__((__packed__));
You could template these structs as such:
enum OS_Type {
    // Flag Bits - Windows: first 4 bits
    WINDOWS    = 0x01,  // 1
    WINDOWS_7  = 0x02,  // 2
    WINDOWS_8  = 0x04,  // 4
    WINDOWS_10 = 0x08,  // 8
    // Flag Bits - Linux: second 4 bits
    LINUX    = 0x10,  // 16
    LINUX_vA = 0x20,  // 32
    LINUX_vB = 0x40,  // 64
    LINUX_vC = 0x80,  // 128
    // Flag Bits - other OS: third byte
    OS    = 0x100,  // 256
    OS_vA = 0x200,  // 512
    OS_vB = 0x400,  // 1024
    OS_vC = 0x800   // 2048
    //....
};
enum ArchitectureType {
    ANDROID  = 0x01,
    AMD      = 0x02,
    ASUS     = 0x04,
    NVIDIA   = 0x08,
    IBM      = 0x10,
    INTEL    = 0x20,
    MOTOROLA = 0x40,
    //...
};
enum PlatformType {
    X86 = 0x01,
    X64 = 0x02,
    // Legacy - Deprecated Models
    X32 = 0x04,
    X16 = 0x08,
    // ... etc.
};

enum EndianType {
    LITTLE = 0x01,
    BIG    = 0x02,
    MIXED  = 0x04,
    // ....
};
// Struct to hold the target machine's properties & attributes: add this to your existing struct.
struct TargetMachine {
    unsigned int  os_;
    unsigned int  architecture_;
    unsigned char platform_;
    unsigned char endian_;

    TargetMachine() :
        os_(0), architecture_(0),
        platform_(0), endian_(0) {
    }
    TargetMachine( unsigned int os, unsigned int architecture,
                   unsigned char platform, unsigned char endian ) :
        os_(os), architecture_(architecture),
        platform_(platform), endian_(endian) {
    }
};
template<unsigned int OS, unsigned int Architecture, unsigned char Platform, unsigned char Endian>
struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
} __attribute__((__packed__));

template<unsigned int OS, unsigned int Architecture, unsigned char Platform, unsigned char Endian>
struct TelemetryPacket {
    TargetMachine targetMachine { OS, Architecture, Platform, Endian };
    Sensor1Telemetry<OS, Architecture, Platform, Endian> tele1;
    Sensor2Telemetry<OS, Architecture, Platform, Endian> tele2;
    // etc...
} __attribute__((__packed__));
With these enum identifiers you can then use class template specialization to set up this class according to its needs, depending on the above combinations. Here I would take all the common cases that seem to work fine with the default class declaration & definition and make that the main class's functionality. Then the special cases, such as a different endianness with byte order, or specific OS versions doing something a different way, or GCC versus MS compilers using __attribute__((__packed__)) versus #pragma pack(), can be the few specializations that need to be accounted for. You shouldn't need to specify a specialization for every possible combination; that would be too daunting and time consuming. You should only need to cover the few rare scenarios that can occur, to make sure you always have proper code instructions for the target audience. What also makes the enums very handy is that if you pass them as a function argument, you can set multiple at a time, as they are designed as bit flags. So if you want to create a function that takes this template struct as its first argument and supported OSes as its second, you can then pass in all available OS support as bit flags.
This may help to ensure that this set of packed structures is being "packed" and/or aligned correctly according to the appropriate target, and that it will always perform the same functionality to maintain portability across different platforms.
Now you may have to do this specialization twice, between preprocessor directives, for different supporting compilers: if the current compiler is GCC it defines the struct one way with its specializations, then Clang another, or MSVC, Code Blocks, etc. So there is a little overhead to get this initially set up, but it should help ensure that the code is being properly used for the specified scenario or combination of attributes of the target machine.
Not always. When you send data to a processor with a different architecture, you need to think about endianness, primitive data type sizes, etc. Better to use Thrift or Message Pack. If not, create your own Serialize and DeSerialize methods instead.

Convert a large project using bitfields to use something more portable

I'm working on a large software project (a few million lines of code) that has been cobbled together for more than 20 years. It's a mixture of Fortran/C/C++, currently targeted at Solaris & built with Sun Studio (though I have it compileable on clang/gcc).
We're moving the project to a Linux/x86 environment. One of the challenges we're facing is a few thousand bitfields spread liberally across the project.
The current thinking is to do, more or less, something along these lines:
1) Whether manually (ouch) or through some sort of refactoring tool (perhaps made with clang), change all the bitfield definitions to reverse the order of bitfield members; e.g.:
struct SomeStruct {
#if defined( BIG_ENDIAN )
    int x1 : size1;
    int x2 : size2;
    /* ... */
#else
    /* ... */
    int x2 : size2;
    int x1 : size1;
#endif
};
2) Any point at which we send/receive these bitfields we'd have to fix the byte order.
On the surface this seems like a passable (albeit not entirely portable) hack, but I'm very leery of it:
It's error prone (oops, forgot to fix byte order; oops, updated the Solaris side of that struct but not the Linux side)
Although Sun Studio & gcc appear to pack structs/bitfields the same way (when tested on the same platform) in the cases tested thus far, I won't be the least bit surprised to find there's corner cases, and knowing this code base it'll just so happen we have code using that corner case.
Historically developers have been allowed to get away with anything they want (e.g., when we started running static analysis tools we found lots of code casting away const and a variety of other badness). If there are /any/ corner cases involved here, some of the more questionable (existing) code may become even more brittle.
If we're already going to have to touch that much code to facilitate this move, my gut says we should just invest time in replacing them with something more portable that can serialize/deserialize itself so the bit/byte order is something dictated through the API as opposed to being a function of language/compiler nuances.
My question is: is the solution that stays with bitfields reasonable? Are there any better solutions? (keeping in mind that this could easily touch 30,000 lines of code or more)
my approach would be:
do not mess up the structure definitions
find all places where data is loaded or saved (or received or sent)
add at those places a conversion to a defined byte order (for convenience, use the byte order of your previous target)
for these conversions create functions that are implemented per platform, or even better use the htonX and ntohX routines to convert data to and from network byte order (which is, afaik, big endian); a sketch of such a boundary conversion follows below
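What such a boundary conversion can look like for an ordinary integer field (the struct and field names here are hypothetical; bitfield members would still need the reordered definitions discussed in the question):
#include <cstdint>
#include <cstdio>
#include <arpa/inet.h>   // htonl/ntohl (use the Winsock or <endian.h> equivalents elsewhere)

struct Record { uint32_t flags; };

// Convert only at the file boundary; in-memory code never sees a foreign byte order.
void saveRecord(const Record& r, std::FILE* f)
{
    uint32_t onDisk = htonl(r.flags);              // defined on-disk order: big endian
    std::fwrite(&onDisk, sizeof onDisk, 1, f);
}

bool loadRecord(Record& r, std::FILE* f)
{
    uint32_t onDisk = 0;
    if (std::fread(&onDisk, sizeof onDisk, 1, f) != 1)
        return false;
    r.flags = ntohl(onDisk);
    return true;
}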

Data conversion for ARM platform (from x86/x64)

We have developed a win32 application for the x86 and x64 platforms. We want to use the same application on an ARM platform. Endianness will vary on the ARM platform, i.e. the ARM platform uses big endian format in general. So we want to handle this in our application for our device.
For example: // In x86/x64, int nIntVal = 0x12345678
In ARM, int nIntVal = 0x78563412
How values will be stored for the following data types in ARM?
double
char array i.e. char chBuffer[256]
int64
Please clarify this.
Regards,
Raphel
Endianness only matters for register <-> memory operations.
In a register there is no endianness. If you put
int nIntVal = 0x12345678
in your code it will have the same effect on a machine of any endianness.
all IEEE formats (float, double) are identical in all architectures, so this does not matter.
You only have to care about endianess in two cases:
a) You write integers to files that have to be transferable between the two architectures.
Solution: Use the hton*, ntoh* family of converters, use a non-binary file format (e.g. XML) or a standardised file format (e.g. SQLite).
b) You cast integer pointers.
int a = 0x12345678;
char b = a;             /* always the least significant byte, 0x78 */
char c = *(char *)&a;   /* the first byte in memory */
if (b == c) {
    // You are working on little endian
}
The latter code, by the way, is a handy way of testing your endianness at runtime.
Arrays and the like: if you use the write/fwrite families of calls to transfer them you will have no problems, unless they contain integers: then look above.
int64_t: look above. Only care if you have to store them in binary files or cast pointers.
(Sergey L., above, says that you mostly don't have to care about the byte order. He is right, with at least one exception: I assumed you want to convert binary data from one platform to the other ...)
http://en.wikipedia.org/wiki/Endianness has a good overview.
In short:
Little endian means, the least significant byte is stored first (at the lowest address)
Big endian means the most significant byte is stored first
The order in which array elements are stored is not affected (but the byte order within array elements is, of course)
So
char array is unchanged
int64 - byte order is reversed compared to x86
With regard to the floating point format, consider http://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness. Generally it seems to obey the same rules of endianness as the integer format, but there are exceptions for older ARM platforms. (I've no first hand experience of that.)
Generally I'd suggest, test your conversion of primitive types by controlled experiments first.
Also consider that compilers might use different padding in structs (a topic you haven't addressed yet).
Hope this helps.
In 98% of cases you don't need to care about endianness. Unless you need to transfer data between systems of different endianness, or read/write an endian-sensitive file format, you should not bother with it. And even in those cases, you can write your code to perform properly when compiled under any endianness.
From Rob Pike's "The byte order fallacy" post:
Let's say your data stream has a little-endian-encoded 32-bit integer.
Here's how to extract it (assuming unsigned bytes):
i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);
If it's big-endian, here's how to extract it:
i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | (data[0]<<24);
Both these snippets work on any machine, independent of the machine's
byte order, independent of alignment issues, independent of just about
anything. They are totally portable, given unsigned bytes and 32-bit
integers.
The ARM is little endian; it has two big endian variants depending on architecture, but it is better to just run native little endian - the tools and the volumes of code out there are more thoroughly tested in little endian mode.
Endianness is just one factor in system engineering; if you do your system engineering, it all works out, no fears, no worries. Define your interfaces and code to that design. Assuming, for example, that one processor's endianness automatically results in having to byteswap is a bad assumption and will bite you eventually. You will end up having to swap an even number of times to undo other bad assumptions that cause a swap (ideally swapping zero times, of course, rather than 2 or 4 or 6, etc. times). If you have any endian concerns at all when writing code you should write it endian independent.
Since some ARMs have BE32 (word invariant) and the newer ARMs BE8 (byte invariant), you would have to do even more work to try to make something generic that also tries to compensate for little endian Intel, little endian ARM, BE32 ARM and BE8 ARM. XScale tends to run big endian natively but can be run as little endian to reduce the headaches. You may be assuming that because one ARM clone is big endian then all are big endian; that is another bad assumption.

How to write convertible code, 32 bit/64 bit?

A C++ specific question. So I read a question about what makes a program 32 bit/64 bit, and the answer it got was something like this (sorry, I can't find the question; it was some days ago I looked at it and I can't find it again :( ): as long as you don't make any "pointer assumptions", you only need to recompile it. So my question is, what are pointer assumptions? To my understanding there are 32 bit pointers and 64 bit pointers, so I figure it is something to do with that. Please show the difference in code between them. Any other good habits to keep in mind while writing code that help make it easy to convert between the two are also welcome :) though please share examples with them
Ps. I know there is this post:
How do you write code that is both 32 bit and 64 bit compatible?
but I thought it was kind of too general, with no good examples for new programmers like myself. Like what is a 32 bit storage unit etc. Kinda hoping to break it down a bit more (no pun intended ^^ ) ds.
In general it means that your program behavior should never depend on the sizeof() of any types (that are not made to be of some exact size), neither explicitly nor implicitly (this includes possible struct alignments as well).
Pointers are just a subset of them, and it probably also means that you should not try to rely on being able to convert between unrelated pointer types and/or integers, unless they are specifically made for this (e.g. intptr_t).
In the same way you need to take care with things written to disk, where you should also never rely on the size of e.g. built-in types being the same everywhere.
Whenever you have to (because of e.g. external data formats), use explicitly sized types like uint32_t.
For a well-formed program (that is, a program written according to syntax and semantic rules of C++ with no undefined behaviour), the C++ standard guarantees that your program will have one of a set of observable behaviours. The observable behaviours vary due to unspecified behaviour (including implementation-defined behaviour) within your program. If you avoid unspecified behaviour or resolve it, your program will be guaranteed to have a specific and certain output. If you write your program in this way, you will witness no differences between your program on a 32-bit or 64-bit machine.
A simple (forced) example of a program that will have different possible outputs is as follows:
#include <iostream>

int main()
{
    std::cout << sizeof(void*) << std::endl;
    return 0;
}
This program will likely have different output on 32- and 64-bit machines (but not necessarily). The result of sizeof(void*) is implementation-defined. However, it is certainly possible to have a program that contains implementation-defined behaviour but is resolved to be well-defined:
#include <iostream>

int main()
{
    int size = sizeof(void*);
    if (size != 4) {
        size = 4;
    }
    std::cout << size << std::endl;
    return 0;
}
This program will always print out 4, despite the fact it uses implementation-defined behaviour. This is a silly example because we could have just done int size = 4;, but there are cases when this does appear in writing platform-independent code.
So the rule for writing portable code is: aim to avoid or resolve unspecified behaviour.
Here are some tips for avoiding unspecified behaviour:
Do not assume anything about the size of the fundamental types beyond that which the C++ standard specifies. That is, a char is at least 8 bit, both short and int are at least 16 bits, and so on.
Don't try to do pointer magic (casting between pointer types or storing pointers in integral types).
Don't use an unsigned char* to read the value representation of a non-char object (for serialisation or related tasks).
Avoid reinterpret_cast.
Be careful when performing operations that may over or underflow. Think carefully when doing bit-shift operations.
Be careful when doing arithmetic on pointer types.
Don't use void*.
There are many more occurrences of unspecified or undefined behaviour in the standard. It's well worth looking them up. There are some great articles online that cover some of the more common differences that you'll experience between 32- and 64-bit platforms.
"Pointer assumptions" is when you write code that relies on pointers fitting in other data types, e.g. int copy_of_pointer = ptr; - if int is a 32-bit type, then this code will break on 64-bit machines, because only part of the pointer will be stored.
So long as pointers are only stored in pointer types, it should be no problem at all.
Typically, pointers are the size of the "machine word", so on a 32-bit architecture, 32 bits, and on a 64-bit architecture, all pointers are 64-bit. However, there are SOME architectures where this is not true. I have never worked on such machines myself [other than x86 with its "far" and "near" pointers - but let's ignore that for now].
Most compilers will tell you when you convert pointers to integers that the pointer doesn't fit into, so if you enable warnings, MOST of the problems will become apparent - fix the warnings, and chances are pretty decent that your code will work straight away.
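For example, the following shows the kind of fix those warnings push you toward when a pointer really does have to pass through an integer; uintptr_t is the pointer-sized integer type meant for this:
#include <cstdint>

void example(int* ptr)
{
    // Bad: int is 32 bits on the common 64-bit ABIs, so half the pointer is lost.
    // int copy_of_pointer = (int)ptr;

    // OK: uintptr_t is wide enough to hold a pointer on the platforms discussed here.
    std::uintptr_t copy_of_pointer = reinterpret_cast<std::uintptr_t>(ptr);
    int* roundTripped = reinterpret_cast<int*>(copy_of_pointer);
    (void)roundTripped;   // same value as ptr
}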
There will be no difference between 32bit code and 64bit code; the goal of C/C++ and other programming languages is portability, unlike assembly language.
The only difference will be the distrib you compile your code on; all the work is automatically done by your compiler/linker, so just don't think about that.
But: if you are programming on a 64bit distrib, and you need to use an external library for example SDL, the external library will have to also be compiled in 64bit if you want your code to compile.
One thing to know is that your ELF file will be bigger on a 64bit distrib than on a 32bit one, it's just logic.
What's the point with pointers? When you increment/change a pointer, the compiler will increment your pointer by the size of the pointed-to type.
The contained type size is defined by your processor's register size/the distrib you're working on.
But you just don't have to care about this; the compilation will do everything for you.
Sum: That's why you can't execute a 64bit ELF file on a 32bit distrib.
Typical pitfalls for 32bit/64bit porting are:
The implicit assumption by the programmer that sizeof(void*) == 4 * sizeof(char).
If you're making this assumption and e.g. allocate arrays that way ("I need 20 pointers so I allocate 80 bytes"), your code breaks on 64bit because it'll cause buffer overruns.
The "kitten-killer" , int x = (int)&something; (and the reverse, void* ptr = (void*)some_int). Again an assumption of sizeof(int) == sizeof(void*). This doesn't cause overflows but looses data - the higher 32bit of the pointer, namely.
Both of these issues are of a class called type aliasing (assuming identity / interchangability / equivalence on a binary representation level between two types), and such assumptions are common; like on UN*X, assuming time_t, size_t, off_t being int, or on Windows, HANDLE, void* and long being interchangeable, etc...
Assumptions about data structure / stack space usage (See 5. below as well). In C/C++ code, local variables are allocated on the stack, and the space used there is different between 32bit and 64bit mode due to the point below, and due to the different rules for passing arguments (32bit x86 usually on the stack, 64bit x86 in part in registers). Code that just about gets away with the default stacksize on 32bit might cause stack overflow crashes on 64bit.
This is relatively easy to spot as a cause of the crash but depending on the configurability of the application possibly hard to fix.
Timing differences between 32bit and 64bit code (due to different code sizes / cache footprints, or different memory access characteristics / patterns, or different calling conventions ) might break "calibrations". Say, for (int i = 0; i < 1000000; ++i) sleep(0); is likely going to have different timings for 32bit and 64bit ...
Finally, the ABI (Application Binary Interface). There's usually bigger differences between 64bit and 32bit environments than the size of pointers...
Currently, two main "branches" of 64bit environments exist, IL32P64 (what Win64 uses - int and long are int32_t, only uintptr_t/void* is uint64_t, talking in terms of the sized integers from ) and LP64 (what UN*X uses - int is int32_t, long is int64_t and uintptr_t/void* is uint64_t), but there's the "subdivisions" of different alignment rules as well - some environments assume long, float or double align at their respective sizes, while others assume they align at multiples of four bytes. In 32bit Linux, they align all at four bytes, while in 64bit Linux, float aligns at four, long and double at eight-byte multiples.
The consequence of these rules is that in many cases, bith sizeof(struct { ...}) and the offset of structure/class members are different between 32bit and 64bit environments even if the data type declaration is completely identical.
Beyond impacting array/vector allocations, these issues also affect data in/output e.g. through files - if a 32bit app writes e.g. struct { char a; int b; char c, long d; double e } to a file that the same app recompiled for 64bit reads in, the result will not be quite what's hoped for.
The examples just given are only about language primitives (char, int, long etc.) but of course affect all sorts of platform-dependent / runtime library data types, whether size_t, off_t, time_t, HANDLE, essentially any nontrivial struct/union/class ... - so the space for error here is large.
And then there's the lower-level differences, which come into play e.g. for hand-optimized assembly (SSE/SSE2/...); 32bit and 64bit have different (numbers of) registers, different argument passing rules; all of this affects strongly how such optimizations perform and it's very likely that e.g. SSE2 code which gives best performance in 32bit mode will need to be rewritten / needs to be enhanced to give best performance 64bit mode.
There's also code design constraints which are very different for 32bit and 64bit, particularly around memory allocation / management; an application that's been carefully coded to "maximize the hell out of the mem it can get in 32bit" will have complex logic on how / when to allocate/free memory, memory-mapped file usage, internal caching, etc - much of which will be detrimental in 64bit where you could "simply" take advantage of the huge available address space. Such an app might recompile for 64bit just fine, but perform worse there than some "ancient simple deprecated version" which didn't have all the maximize-32bit peephole optimizations.
So, ultimately, it's also about enhancements / gains, and that's where more work, partly in programming, partly in design/requirements comes in. Even if your app cleanly recompiles both on 32bit and 64bit environments and is verified on both, is it actually benefitting from 64bit ? Are there changes that can/should be done to the code logic to make it do more / run faster in 64bit ? Can you do those changes without breaking 32bit backward compatibility ? Without negative impacts on the 32bit target ? Where will the enhancements be, and how much can you gain ?
For a large commercial project, answers to these questions are often important markers on the roadmap because your starting point is some existing "money maker"...

Porting data serialization code from C++ linux/mac to C++ windows

I have a software framework compiled and running successfully on both mac and linux. I am now trying to port it to windows (using mingw). So far, I have the software compiling and running under windows, but it's inevitably buggy. In particular, I have an issue with reading data that was serialized in macos (or linux) into the windows version of the program (segfaults).
The serialization process serializes values of primitive variables (longs, ints, doubles etc.) to disk.
This is the code I am using:
#include <iostream>
#include <fstream>
template <class T>
void serializeVariable(T var, std::ofstream &outFile)
{
    outFile.write(reinterpret_cast<char *>(&var), sizeof(var));
}

template <class T>
void readSerializedVariable(T &var, std::ifstream &inFile)
{
    inFile.read(reinterpret_cast<char *>(&var), sizeof(var));
}
So to save the state of a bunch of variables, I call serializeVariable for each variable in turn. Then to read the data back in, calls are made to readSerializedVariable in the same order in which they were saved. For example to save:
::serializeVariable<float>(spreadx,outFile);
::serializeVariable<int>(objectDensity,outFile);
::serializeVariable<int>(popSize,outFile);
And to read:
::readSerializedVariable<float>(spreadx,inFile);
::readSerializedVariable<int>(objectDensity,inFile);
::readSerializedVariable<int>(popSize,inFile);
But in windows, this reading of serialized data is failing. I am guessing that windows serializes data a little differently. I wonder if there is a way in which I could modify the above code so that data saved on any platform can be read on any other platform...any ideas?
Cheers,
Ben.
Binary serialization like this should work fine across those platforms. You do have to honor endianness, but that is trivial. I don't think these three platforms have any conflicts in this respect.
You really can't use such loose type specifications when you do, though. int, float, size_t sizes can all change across platforms.
For integer types, use the strict sized types found in the cstdint header. uint32_t, int32_t, etc. Windows doesn't have the header available iirc, but you can use boost/cstdint.hpp instead.
Floating point should work as most compilers follow the same IEEE specs.
C - Serialization of the floating point numbers (floats, doubles)
Binary serialization really needs thorough unit testing. I would strongly recommend investing the time.
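One way to make templates like the ones in the question deterministic is to restrict them to fixed-width integers and commit to a single on-disk byte order; a sketch under those assumptions (little endian chosen arbitrarily, only the 32-bit case shown):
#include <cstdint>
#include <fstream>

// Write a 32-bit value in a fixed little-endian file order, regardless of the
// byte order of the machine doing the writing.
void serializeU32(std::uint32_t v, std::ofstream& outFile)
{
    unsigned char bytes[4] = {
        static_cast<unsigned char>(v),
        static_cast<unsigned char>(v >> 8),
        static_cast<unsigned char>(v >> 16),
        static_cast<unsigned char>(v >> 24)
    };
    outFile.write(reinterpret_cast<const char*>(bytes), sizeof bytes);
}

void readSerializedU32(std::uint32_t& v, std::ifstream& inFile)
{
    unsigned char bytes[4] = {};
    inFile.read(reinterpret_cast<char*>(bytes), sizeof bytes);
    v =  static_cast<std::uint32_t>(bytes[0])
       | static_cast<std::uint32_t>(bytes[1]) << 8
       | static_cast<std::uint32_t>(bytes[2]) << 16
       | static_cast<std::uint32_t>(bytes[3]) << 24;
}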
This is just a wild guess, sorry I can't help you more. My idea is that the byte order is different: big endian vs little endian. So anything larger than one byte will be messed up when loaded on a machine that has the order reversed.
For example I found this piece of code on MSDN:
int isLittleEndian() {
    long int testInt = 0x12345678;
    char *pMem;
    pMem = (char *) &testInt;   /* take the address of testInt (the original cast its value, which is a bug) */
    if (pMem[0] == 0x78)
        return(1);
    else
        return(0);
}
I guess you will have different results on linux vs windows. Best case would be if there is a flag option for your compiler(s) to use one format or the other. Just set it to be the same on all machines.
Hope this helps,
Alex
Just one more wild guess:
you forgot to open the file in binary mode, and on Windows file streams
convert the byte sequence 13,10 to 10.
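If that is the cause, opening both streams with std::ios::binary is the usual fix; a minimal sketch (the file name is just an example):
#include <fstream>

int main()
{
    // Without std::ios::binary, Windows streams translate CR/LF on the way in
    // and out, which corrupts binary data; Linux and macOS are unaffected.
    std::ofstream outFile("state.bin", std::ios::out | std::ios::binary);
    std::ifstream inFile("state.bin", std::ios::in | std::ios::binary);
    return 0;
}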
Did you consider using serialization libraries or formats, like e.g.:
XDR (supported by libc) or ASN1
s11n (a C++ serialization library)
Json, a very simple textual format with many libraries for it, e.g. JsonCpp, Jansson, Jaula, ...
YAML, a more powerful textual format, with many libraries
or even XML, which is often used for serialization purposes...
(And for serialization of scalars, the htonl and companion routines should help)