I'm creating a D&D engine for fun, just to practice my C++ skills and learn some of the more in-depth topics. Currently, I am working on building a system to save and load characters. I have a Stats class that holds all of the statistics for a character, and a Character class that currently just has a name and a Stats* pointing to the Stats object for that character.
So far, I've been able to successfully save the data using boost text archive, and now switched to boost binary archive. It appears to work when saving the data, but when I try to load the data I get this error:
"Exception Unhandled - Unhandled exception at [memory address] in VileEngine.exe Microsoft C++ exception: boost::archive::archive_exception at memory location [different mem address]"
I can skip past this error multiple times but when the program runs and loads, the data of the loaded character is way off so I know it has to be either in the way I'm saving it, or more likely in the way I'm loading it. I've tried reading through the boost docs but couldn't find a way to fix it. I also tried searching through other posts but couldn't find an answer, or maybe I just don't understand the answers. Any help is greatly appreciated.
Relevant code posted below. I can post all the code if needed but it's quite a bit for all the classes.
in Character.hpp
private:
friend class boost::serialization::access; //allows serialization saving
//template function used by boost to serialize this class's data
//serialize is called whenever this class is being saved or loaded
template<class Archive>
void serialize(Archive& ar, const unsigned int version) {
ar << name;
ar << *charStats;
ar << inventory;
}
/*********************************
* Data Members
***********************************/
std::string name;
Stats* charStats;
std::vector<std::string> inventory;
public:
Character();
void loadCharacter(std::string &charName); //loads all character details
void saveCharacter(); //saves all character details
in Character.cpp
/*********************************************
Functions to save and load character details
**********************************************/
void Character::saveCharacter() {
//save all details of character to charactername.dat file
//create filename of format "CharacterName.dat"
std::string fileName = name + ".dat";
std::ofstream saveFile(fileName);
//create serialized archive and save this characters data
boost::archive::binary_oarchive outputArchive(saveFile);
outputArchive << this;
saveFile.close();
}
void Character::loadCharacter(std::string &charName) {
//load details of .dat file into character using the characters name
std::string fileName = charName + ".dat";
std::ifstream loadFile(fileName);
boost::archive::binary_iarchive inputArchive(loadFile);
inputArchive >> name;
Stats* temp = new Stats;
inputArchive >> temp;
charStats = temp;
inputArchive >> inventory;
loadFile.close();
}
in Stats.hpp
private:
friend class boost::serialization::access; //allows serialization saving
//template function used by boost to serialize this class's data
//serialize is called whenever this class is being saved or loaded
template<class Archive>
void serialize(Archive& ar, const unsigned int version) {
ar & skillSet;
ar & subSkillMap;
ar & level;
ar & proficiencyBonus;
}
When you save, you ONLY write this (by pointer, which is an error, see below):
boost::archive::binary_oarchive outputArchive(saveFile);
outputArchive << this;
When you load, you read three separate things instead. The save side and the load side must match exactly. So:
void Character::saveCharacter() {
std::ofstream saveFile(name + ".dat", std::ios::binary); // binary archives need a binary-mode stream
boost::archive::binary_oarchive outputArchive(saveFile);
outputArchive << *this;
}
You save *this (the object itself, by reference) rather than this (a pointer), because you do not want deserialization to allocate a new instance of Character on the heap. If it did, loading could not be a member function that fills in an existing Character.
Regardless, your serialize function uses operator<< where it MUST use operator& because otherwise it will only work for save, not load. Your compiler would have told you, so clearly, your code is different from what you posted.
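To illustrate why a single operator& can serve both directions, here is a minimal sketch using hypothetical stand-in archive types (OutArchive, InArchive and Hero are made up for this illustration and are not Boost classes). The same serialize function writes when handed the output archive and reads when handed the input archive, which is the idea behind Boost's operator&:

```cpp
#include <sstream>
#include <string>

// Hypothetical output archive: operator& WRITES the value as text.
struct OutArchive {
    std::ostringstream out;
    template <class T>
    OutArchive& operator&(const T& value) { out << value << ' '; return *this; }
};

// Hypothetical input archive: operator& READS into the value.
struct InArchive {
    std::istringstream in;
    explicit InArchive(const std::string& data) : in(data) {}
    template <class T>
    InArchive& operator&(T& value) { in >> value; return *this; }
};

struct Hero {
    std::string name;  // a single word, to keep this toy text format simple
    int level = 0;

    // One function describes the layout; the archive type decides the direction.
    template <class Archive>
    void serialize(Archive& ar) {
        ar & name;
        ar & level;
    }
};

std::string save(Hero& hero) {
    OutArchive ar;
    hero.serialize(ar);
    return ar.out.str();
}

Hero load(const std::string& data) {
    InArchive ar(data);
    Hero hero;
    hero.serialize(ar);
    return hero;
}
```

Because both directions go through the same function, the fields are always written and read in the same order, so save and load cannot drift apart.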
See it live: Live On Coliru
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/access.hpp>
#include <boost/serialization/set.hpp>
#include <boost/serialization/map.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/serialization/string.hpp>
#include <fstream>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>
struct Stats{
private:
std::set<int> skillSet{1, 2, 3};
std::map<int, std::string> subSkillMap{
{1, "one"},
{2, "two"},
{3, "three"},
};
int level = 13;
double proficiencyBonus = 0;
friend class boost::serialization::access; //allows serialization saving
template <class Archive> void serialize(Archive& ar, unsigned)
{
ar & skillSet;
ar & subSkillMap;
ar & level;
ar & proficiencyBonus;
}
};
struct Character {
private:
friend class boost::serialization::access; // allows serialization saving
template <class Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar & name;
ar & *charStats;
ar & inventory;
}
/*********************************
* Data Members
*********************************/
std::string name;
Stats* charStats = new Stats{};
std::vector<std::string> inventory;
public:
Character(std::string name = "unnamed") : name(std::move(name)){}
~Character() { delete charStats; }
// rule of three (suggest to use no raw pointers!)
Character(Character const&) = delete;
Character& operator=(Character const&) = delete;
void loadCharacter(std::string const& charName);
void saveCharacter();
};
/*********************************************
Functions to save and load character details
**********************************************/
void Character::saveCharacter() {
// binary archives need a stream opened in binary mode (this matters on Windows)
std::ofstream saveFile(name + ".dat", std::ios::binary);
boost::archive::binary_oarchive outputArchive(saveFile);
outputArchive << *this;
}
void Character::loadCharacter(std::string const &charName) {
std::ifstream loadFile(charName + ".dat", std::ios::binary);
boost::archive::binary_iarchive inputArchive(loadFile);
inputArchive >> *this;
}
int main() {
{
Character charlie { "Charlie" }, bokimov { "Bokimov" };
charlie.saveCharacter();
bokimov.saveCharacter();
}
{
Character someone, someone_else;
someone.loadCharacter("Charlie");
someone_else.loadCharacter("Bokimov");
}
}
Saves two files and loads them back:
==== Bokimov.dat ====
00000000: 1600 0000 0000 0000 7365 7269 616c 697a ........serializ
00000010: 6174 696f 6e3a 3a61 7263 6869 7665 1300 ation::archive..
00000020: 0408 0408 0100 0000 0000 0000 0007 0000 ................
00000030: 0000 0000 0042 6f6b 696d 6f76 0000 0000 .....Bokimov....
00000040: 0003 0000 0000 0000 0000 0000 0001 0000 ................
00000050: 0002 0000 0003 0000 0000 0000 0000 0300 ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0001 ................
00000070: 0000 0003 0000 0000 0000 006f 6e65 0200 ...........one..
00000080: 0000 0300 0000 0000 0000 7477 6f03 0000 ..........two...
00000090: 0005 0000 0000 0000 0074 6872 6565 0d00 .........three..
000000a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000000b0: 0000 0000 0000 0000 0000 00 ...........
==== Charlie.dat ====
00000000: 1600 0000 0000 0000 7365 7269 616c 697a ........serializ
00000010: 6174 696f 6e3a 3a61 7263 6869 7665 1300 ation::archive..
00000020: 0408 0408 0100 0000 0000 0000 0007 0000 ................
00000030: 0000 0000 0043 6861 726c 6965 0000 0000 .....Charlie....
00000040: 0003 0000 0000 0000 0000 0000 0001 0000 ................
00000050: 0002 0000 0003 0000 0000 0000 0000 0300 ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0001 ................
00000070: 0000 0003 0000 0000 0000 006f 6e65 0200 ...........one..
00000080: 0000 0300 0000 0000 0000 7477 6f03 0000 ..........two...
00000090: 0005 0000 0000 0000 0074 6872 6565 0d00 .........three..
000000a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000000b0: 0000 0000 0000 0000 0000 00 ...........
I want to make a bitmask. The following defines are already taken.
#define SEC_NO_FLAGS 0x000
#define SEC_ALLOC 0x001
#define SEC_LOAD 0x002
#define SEC_RELOC 0x004
#define SEC_READONLY 0x008
#define SEC_CODE 0x010
#define SEC_DATA 0x020
#define SEC_ROM 0x040
Then, I initialize the uint32_t ptr = 0; and I can OR it with the defines:
ptr |= SEC_ALLOC;
Now, I want to extend the defines to:
#define SEC_CORE_1 0x080
#define SEC_CORE_2 0x0F0
#define SEC_CORE_3 0x110
#define SEC_CORE_4 0x120
#define SEC_CORE_5 0x140
#define SEC_CORE_6 0x180
How should I choose the defines above to have a unique bitmask?
But when I test the bitmask, it prints several c's:
std::string
ParseManager::mapFlags(uint64_t flag)
{
std::string tmp = "";
if (flag & SEC_ALLOC)
{
tmp.append("a");
}
if (flag & SEC_CODE)
{
tmp.append("x");
}
if (flag & SEC_READONLY)
{
tmp.append("r");
}
if (flag & SEC_DATA)
{
tmp.append("w");
}
if (flag & SEC_LOAD)
{
tmp.append("l");
}
if (flag & SEC_CORE_1)
{
tmp.append("c1");
}
if (flag & SEC_CORE_2)
{
tmp.append("c2");
}
if (flag & SEC_CORE_3)
{
tmp.append("c3");
}
if (flag & SEC_CORE_4)
{
tmp.append("c4");
}
if (flag & SEC_CORE_5)
{
tmp.append("c5");
}
if (flag & SEC_CORE_6)
{
tmp.append("c6");
}
return tmp;
}
The first block of defined bitmasks expands to binary representation as follows.
#define SEC_NO_FLAGS 0x000 // 0000 0000 0000 0000 0000
#define SEC_ALLOC 0x001 // 0000 0000 0000 0000 0001
#define SEC_LOAD 0x002 // 0000 0000 0000 0000 0010
#define SEC_RELOC 0x004 // 0000 0000 0000 0000 0100
#define SEC_READONLY 0x008 // 0000 0000 0000 0000 1000
#define SEC_CODE 0x010 // 0000 0000 0000 0001 0000
#define SEC_DATA 0x020 // 0000 0000 0000 0010 0000
#define SEC_ROM 0x040 // 0000 0000 0000 0100 0000
All of these have exactly one bit set, which is a different bit in every value. The second block of bitmasks looks as follows.
#define SEC_CORE_1 0x080 // 0000 0000 0000 1000 0000
#define SEC_CORE_2 0x0F0 // 0000 0000 0000 1111 0000
#define SEC_CORE_3 0x110 // 0000 0000 0001 0001 0000
#define SEC_CORE_4 0x120 // 0000 0000 0001 0010 0000
#define SEC_CORE_5 0x140 // 0000 0000 0001 0100 0000
#define SEC_CORE_6 0x180 // 0000 0000 0001 1000 0000
The newly defined bitmasks are different from the previously defined ones, but they share bits with them and with each other. For instance, SEC_CORE_2 (0x0F0) contains the bits of SEC_CODE, SEC_DATA, SEC_ROM and SEC_CORE_1, and SEC_CORE_3 (0x110) contains the bit of SEC_CODE, so a test such as flag & SEC_CORE_3 also succeeds when only SEC_CODE is set. If the values are to be used as independent bit masks, no two of them may share a bit, which can be achieved, for instance, with the following values.
#define SEC_CORE_1 0x0100 // 0000 0000 0001 0000 0000
#define SEC_CORE_2 0x0200 // 0000 0000 0010 0000 0000
#define SEC_CORE_3 0x0400 // 0000 0000 0100 0000 0000
#define SEC_CORE_4 0x0800 // 0000 0000 1000 0000 0000
#define SEC_CORE_5 0x1000 // 0000 0001 0000 0000 0000
#define SEC_CORE_6 0x2000 // 0000 0010 0000 0000 0000
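As a sketch of how such mistakes could be caught at compile time, the flags can be written as constexpr shifts and checked for overlap with a static_assert. The names kAllFlags, flagsAreDisjoint and coreNames are made up for this illustration, and the code assumes C++14 or later:

```cpp
#include <cstdint>
#include <string>

// Existing flags, written as shifts so that each one visibly owns a single bit.
constexpr std::uint32_t SEC_ALLOC    = 1u << 0;   // 0x0001
constexpr std::uint32_t SEC_LOAD     = 1u << 1;   // 0x0002
constexpr std::uint32_t SEC_RELOC    = 1u << 2;   // 0x0004
constexpr std::uint32_t SEC_READONLY = 1u << 3;   // 0x0008
constexpr std::uint32_t SEC_CODE     = 1u << 4;   // 0x0010
constexpr std::uint32_t SEC_DATA     = 1u << 5;   // 0x0020
constexpr std::uint32_t SEC_ROM      = 1u << 6;   // 0x0040
// New flags, using the values 0x0100 .. 0x2000.
constexpr std::uint32_t SEC_CORE_1   = 1u << 8;   // 0x0100
constexpr std::uint32_t SEC_CORE_2   = 1u << 9;   // 0x0200
constexpr std::uint32_t SEC_CORE_3   = 1u << 10;  // 0x0400
constexpr std::uint32_t SEC_CORE_4   = 1u << 11;  // 0x0800
constexpr std::uint32_t SEC_CORE_5   = 1u << 12;  // 0x1000
constexpr std::uint32_t SEC_CORE_6   = 1u << 13;  // 0x2000

constexpr std::uint32_t kAllFlags[] = {
    SEC_ALLOC, SEC_LOAD, SEC_RELOC, SEC_READONLY, SEC_CODE, SEC_DATA, SEC_ROM,
    SEC_CORE_1, SEC_CORE_2, SEC_CORE_3, SEC_CORE_4, SEC_CORE_5, SEC_CORE_6};

// Returns true when no two flags share a bit.
constexpr bool flagsAreDisjoint() {
    std::uint32_t seen = 0;
    for (std::uint32_t f : kAllFlags) {
        if (seen & f) return false;  // this bit is already used by an earlier flag
        seen |= f;
    }
    return true;
}
static_assert(flagsAreDisjoint(), "every flag must own its own bit");

// Lists the core flags that are set, e.g. "c1c6".
std::string coreNames(std::uint32_t flag) {
    static const std::uint32_t cores[] = {SEC_CORE_1, SEC_CORE_2, SEC_CORE_3,
                                          SEC_CORE_4, SEC_CORE_5, SEC_CORE_6};
    std::string out;
    for (int i = 0; i < 6; ++i) {
        if (flag & cores[i]) out += "c" + std::to_string(i + 1);
    }
    return out;
}
```

With the overlapping values from the question (for example 0x0F0 for SEC_CORE_2), the static_assert would fail and the build would stop.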
I'm trying to figure out the purpose of this piece of code, from the Tiled utility's map format documentation.
const int gid = data[i] |
data[i + 1] << 8 |
data[i + 2] << 16 |
data[i + 3] << 24;
It looks like there is some "or-ing" and shifting of bits, but I have no clue what the aim of this is, in the context of using data from the tiled program.
Tiled stores its layer "Global Tile ID" (GID) data in an array of 32-bit integers, base64-encoded and (optionally) compressed in the XML file.
According to the documentation, these 32-bit integers are stored in little-endian format -- that is, the first byte of the integer contains the least significant byte of the number. As an analogy, in decimal, writing the number "1234" in little-endian would look like 4321 -- the 4 is the least significant digit in the number (representing a value of just 4), the 3 is the next-least-significant (representing 30), and so on. The only difference between this example and what Tiled is doing is that we're using decimal digits, while Tiled is using bytes, which are effectively digits that can each hold 256 different values instead of just 10.
If we think about the code in terms of decimal numbers, though, it's actually pretty easy to understand what it's doing. It's basically reconstructing the integer value from the digits by doing just this:
int digit[4] = { 4, 3, 2, 1 }; // our decimal digits in little-endian order
int gid = digit[0] +
digit[1] * 10 +
digit[2] * 100 +
digit[3] * 1000;
It's just moving each digit into position to create the full integer value. (In binary, shifting left by 8 bits multiplies by 256, just as appending a zero multiplies by 10 in decimal; it moves a byte into the next 'significant digit' slot.)
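The same reconstruction for real bytes can be sketched as a small helper. The name readLittleEndian32 is made up here, and the data is assumed to be an array of unsigned char. Casting each byte to an unsigned 32-bit type before shifting avoids shifting a 1 into the sign bit of an int:

```cpp
#include <cstddef>
#include <cstdint>

// Assembles a 32-bit value from four bytes stored least-significant byte first,
// starting at index i of the (assumed) byte array.
std::uint32_t readLittleEndian32(const unsigned char* data, std::size_t i) {
    return  static_cast<std::uint32_t>(data[i])
         | (static_cast<std::uint32_t>(data[i + 1]) << 8)
         | (static_cast<std::uint32_t>(data[i + 2]) << 16)
         | (static_cast<std::uint32_t>(data[i + 3]) << 24);
}
```

For instance, the bytes 0x01 0x00 0x00 0x00 decode to 1, and 0x00 0x01 0x00 0x00 decode to 256. The result is the same on big-endian and little-endian machines, because the code works with values rather than with memory layout.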
More information on big-endian and little-endian, and why the difference matters, can be found in On Holy Wars And A Plea For Peace, an important (and entertainingly written) document from 1980 in which Danny Cohen argued for the need to standardise on a single byte ordering for network protocols. (Spoiler: big-endian won that fight for Internet protocols, where it is known as "network byte order", although file formats vary and many of them use little-endian. Because Tiled stores little-endian integers, you need code like the code you quoted to reliably convert the integers in the data file into the computer's native format, whatever the machine's own byte order is. Had the data been stored big-endian, you could have used the standard sockets function ntohl() to convert each integer to native format instead of writing and comprehending this sort of byte manipulation code manually.)
As you noted, the << operator shifts bits to the left by the given number.
This block takes the data[] array, which has four (presumably one byte) elements, and "encodes" those four values into one integer.
Example Time!
data[0] = 0x3A; // 0x3A = 58 = 0011 1010 in binary
data[1] = 0x48; // 0x48 = 72 = 0100 1000 in binary
data[2] = 0xD2; // 0xD2 = 210 = 1101 0010 in binary
data[3] = 0x08; // 0x08 = 8 = 0000 1000 in binary
int tmp0 = data[0]; // 00 00 00 3A = 0000 0000 0000 0000 0000 0000 0011 1010
int tmp1 = data[1] << 8; // 00 00 48 00 = 0000 0000 0000 0000 0100 1000 0000 0000
int tmp2 = data[2] << 16; // 00 D2 00 00 = 0000 0000 1101 0010 0000 0000 0000 0000
int tmp3 = data[3] << 24; // 08 00 00 00 = 0000 1000 0000 0000 0000 0000 0000 0000
// "or-ing" these together will set each bit to 1 if any of the bits are 1
int gid = tmp0 | // 00 00 00 3A = 0000 0000 0000 0000 0000 0000 0011 1010
tmp1 | // 00 00 48 00 = 0000 0000 0000 0000 0100 1000 0000 0000
tmp2 | // 00 D2 00 00 = 0000 0000 1101 0010 0000 0000 0000 0000
tmp3; // 08 00 00 00 = 0000 1000 0000 0000 0000 0000 0000 0000
gid == 147998778;// 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
Now, you've just encoded four one-byte values into a single four-byte integer.
If you're (rightfully) wondering, why would anyone want to go through all that effort when you can just use byte and store the four single-byte pieces of data directly into four bytes, then you should check out this question:
int, short, byte performance in back-to-back for-loops
Bonus Example!
To get your encoded values back, we use the "and" operator along with the right-shift >>:
int gid = 147998778; // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
// "and-ing" will set each bit to 1 if BOTH bits are 1
int tmp0 = gid & // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
0x000000FF; // 00 00 00 FF = 0000 0000 0000 0000 0000 0000 1111 1111
int data0 = tmp0; // 00 00 00 3A = 0000 0000 0000 0000 0000 0000 0011 1010
int tmp1 = gid & // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
0x0000FF00; // 00 00 FF 00 = 0000 0000 0000 0000 1111 1111 0000 0000
tmp1; //value of tmp1 00 00 48 00 = 0000 0000 0000 0000 0100 1000 0000 0000
int data1 = tmp1 >> 8; // 00 00 00 48 = 0000 0000 0000 0000 0000 0000 0100 1000
int tmp2 = gid & // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
0x00FF0000; // 00 FF 00 00 = 0000 0000 1111 1111 0000 0000 0000 0000
tmp2; //value of tmp2 00 D2 00 00 = 0000 0000 1101 0010 0000 0000 0000 0000
int data2 = tmp2 >> 16; // 00 00 00 D2 = 0000 0000 0000 0000 0000 0000 1101 0010
int tmp3 = gid & // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
0xFF000000; // FF 00 00 00 = 1111 1111 0000 0000 0000 0000 0000 0000
tmp3; //value of tmp3 08 00 00 00 = 0000 1000 0000 0000 0000 0000 0000 0000
int data3 = tmp3 >> 24; // 00 00 00 08 = 0000 0000 0000 0000 0000 0000 0000 1000
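The last "and-ing" for tmp3 isn't needed here, since the bits that "fall off" when shifting right are just lost and, for an unsigned or non-negative value like this one, the bits coming in are zero. (With a negative int, the right shift typically copies the sign bit in, which is one reason to prefer unsigned types for this kind of code.) So: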
gid; // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
int data3 = gid >> 24; // 00 00 00 08 = 0000 0000 0000 0000 0000 0000 0000 1000
but I wanted to provide a complete example.
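The decoding steps above can be condensed into one helper. This is a sketch with a made-up name (splitBytes), and it uses unsigned types so that right shifts always bring in zeros:

```cpp
#include <array>
#include <cstdint>

// Splits a 32-bit value into its four bytes, least significant byte first.
std::array<std::uint8_t, 4> splitBytes(std::uint32_t value) {
    return {{
        static_cast<std::uint8_t>( value        & 0xFFu),
        static_cast<std::uint8_t>((value >> 8)  & 0xFFu),
        static_cast<std::uint8_t>((value >> 16) & 0xFFu),
        static_cast<std::uint8_t>((value >> 24) & 0xFFu)
    }};
}
```

Shifting first and masking afterwards lets every byte use the same 0xFF mask instead of a different mask per position.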
I am quite new to bit masking and bit operations. Could you please help me understand this? I have three integers a, b, and c, and I have created a new number d with the operations below:
int a = 1;
int b = 2;
int c = 92;
int d = (a << 14) + (b << 11) + c;
How do we reconstruct a, b and c using d?
I have no idea of the range of your a, b and c. However, assuming 3 bits for a and b, and 11 bits for c we can do:
a = ( d >> 14 ) & 7;
b = ( d >> 11 ) & 7;
c = ( d >> 0 ) & 2047;
Update:
The value of the and-mask is computed as (2^NumberOfBits) - 1, where ^ means "to the power of": 3 bits give 7 and 11 bits give 2047.
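A round trip can be sketched as follows, under the same assumption of 3 bits each for a and b and 11 bits for c. The names Fields, mask, pack and unpack are made up for this illustration:

```cpp
#include <cstdint>

// Assumed layout: c in bits 0-10, b in bits 11-13, a in bits 14-16.
struct Fields {
    std::uint32_t a;
    std::uint32_t b;
    std::uint32_t c;
};

// (2 to the power of bits) - 1, e.g. mask(3) == 7.
constexpr std::uint32_t mask(unsigned bits) { return (1u << bits) - 1u; }

constexpr std::uint32_t pack(Fields f) {
    return (f.a << 14) | (f.b << 11) | f.c;
}

constexpr Fields unpack(std::uint32_t d) {
    return {(d >> 14) & mask(3), (d >> 11) & mask(3), d & mask(11)};
}

static_assert(mask(3) == 7 && mask(11) == 2047, "masks match the values used above");
```

The round trip only works while each value fits in its field. A value of c above 2047, for example, would spill into the bits that belong to b.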
a is 0000 0000 0000 0000 0000 0000 0000 0001
b is 0000 0000 0000 0000 0000 0000 0000 0010
c is 0000 0000 0000 0000 0000 0000 0101 1100
a<<14 is 0000 0000 0000 0000 0100 0000 0000 0000
b<<11 is 0000 0000 0000 0000 0001 0000 0000 0000
c is 0000 0000 0000 0000 0000 0000 0101 1100
d is 0000 0000 0000 0000 0101 0000 0101 1100
(a occupies bits 14 and up, b occupies bits 11-13, c occupies bits 0-10)
So a = d >> 14
b = (d >> 11) & 7
c = (d >> 0) & 2047
By the way, you should make sure that b <= 7 and c <= 2047.
Can we say that C++ is platform dependent?
I know that C++ uses a compiler, and that compilers are different for different platforms. When we compile C++ code with a compiler, for example on Windows, a file in .EXE format is created.
Why is an .EXE file OS/Platform dependent?
What is the format inside .EXE files?
Why can't we run it on other platforms?
This is actually a relatively extensive topic. For simplicity, it comes down to two things: operating system and CPU architecture.
First of all, *.exe is generally a Windows-only format (Portable Executable): it packages machine code in a layout that the Windows loader knows how to load and execute. The machine code inside is also specific to one CPU architecture (e.g. x86-64), and it calls into Windows system libraries, so another operating system or another kind of CPU cannot run it directly. Note that a lot more is going on, but this is a (very) high-level abstraction of what is going on.
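Executable files usually announce their format in their first few bytes: Windows .exe files begin with the characters "MZ", and Linux ELF files begin with the byte 0x7F followed by "ELF". As a rough sketch (guessFormat is a made-up helper, and a real loader checks far more than this), the format could be told apart like so:

```cpp
#include <cstddef>
#include <string>

// Guesses the executable format from the leading "magic" bytes of a file.
std::string guessFormat(const unsigned char* bytes, std::size_t size) {
    if (size >= 4 && bytes[0] == 0x7F && bytes[1] == 'E' &&
        bytes[2] == 'L' && bytes[3] == 'F') {
        return "ELF";
    }
    if (size >= 2 && bytes[0] == 'M' && bytes[1] == 'Z') {
        return "MZ (DOS/Windows executable)";
    }
    return "unknown";
}
```

An operating system that does not recognise the header simply has no idea how to load the file, even before the question of CPU instructions comes up.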
Now, compilers take C++ code and generate the corresponding assembly code for the target architecture (e.g. x86, MIPS, etc.). Usually the compiler also has an assembler (or one which it can rely on). The assembler turns the assembly into binary object code, and a linker then combines the object files into an executable which the hardware can run. For more information on this topic, look up code generation.
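One way to see that the source is portable while the compiled output is not: compilers predefine macros describing the target they generate code for. The sketch below (targetOs and targetCpu are made-up helpers, and the list of macros is far from exhaustive) compiles unchanged on different platforms but returns a different answer in each resulting binary:

```cpp
#include <string>

// The preprocessor picks one branch at compile time, depending on the target OS.
std::string targetOs() {
#if defined(_WIN32)
    return "Windows";
#elif defined(__APPLE__)
    return "Apple (macOS/iOS)";
#elif defined(__linux__)
    return "Linux";
#else
    return "unknown";
#endif
}

// The same idea for the CPU architecture the compiler is targeting.
std::string targetCpu() {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86 (32-bit)";
#else
    return "unknown";
#endif
}
```

The answer is baked in when the code is compiled, so copying the finished binary to another system does not change it.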
Additional Note
Consider Java which is not platform-dependent. Java compilers generate Java bytecode which is run on the Java virtual machine (JVM). It is important to notice that any time you wish to run a Java application you must run the Java virtual machine. Since the precompiled JVM knows how to operate on your operating system and CPU architecture, it can run its Java bytecode and effectively run the corresponding actions for your particular system.
In a compiled binary file (i.e. one from C++ code), you have native machine code. So the kind of instructions which the JVM carries out on your behalf are directly hard-coded into the .exe or whatever binary format you are using. Consider the following example:
Notice that this java code must eventually be run in the JVM and cannot stand-alone.
Java Code:
System.out.println("hello") (To be compiled)
Compiled Java bytecode:
Print "hello" (To be run in JVM)
JVM:
(... some translation, maybe to architecture code - I forget exactly ...)
system_print_code "hello" (JVM translation to CPU specific)
Versus the C++ (which can be run in stand-alone mode):
C++ Code:
cout<< "hello";
Architecture Code:
some_assembly_routine "hello"
Binary output:
system_print_code "hello"
A real example
If you're curious about how this may actually look in a real-life example, I have included one below.
C++ Source
I placed this in a file called hello.cpp
#include <iostream>
int main() {
using namespace std;
cout << "Hello world!" << endl;
return 0;
}
Assembly (Generated from C++ source)
Generated via g++ -S hello.cpp
.file "test.c"
.section .rodata
.type _ZStL19piecewise_construct, @object
.size _ZStL19piecewise_construct, 1
_ZStL19piecewise_construct:
.zero 1
.local _ZStL8__ioinit
.comm _ZStL8__ioinit,1,1
.LC0:
.string "Hello world!"
.text
.globl main
.type main, @function
main:
.LFB1493:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rsi
leaq _ZSt4cout(%rip), %rdi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@PLT
movq %rax, %rdx
movq _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_@GOTPCREL(%rip), %rax
movq %rax, %rsi
movq %rdx, %rdi
call _ZNSolsEPFRSoS_E@PLT
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1493:
.size main, .-main
.type _Z41__static_initialization_and_destruction_0ii, @function
_Z41__static_initialization_and_destruction_0ii:
.LFB1982:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
cmpl $1, -4(%rbp)
jne .L5
cmpl $65535, -8(%rbp)
jne .L5
leaq _ZStL8__ioinit(%rip), %rdi
call _ZNSt8ios_base4InitC1Ev@PLT
leaq __dso_handle(%rip), %rdx
leaq _ZStL8__ioinit(%rip), %rsi
movq _ZNSt8ios_base4InitD1Ev@GOTPCREL(%rip), %rax
movq %rax, %rdi
call __cxa_atexit@PLT
.L5:
nop
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1982:
.size _Z41__static_initialization_and_destruction_0ii, .-_Z41__static_initialization_and_destruction_0ii
.type _GLOBAL__sub_I_main, @function
_GLOBAL__sub_I_main:
.LFB1983:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $65535, %esi
movl $1, %edi
call _Z41__static_initialization_and_destruction_0ii
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1983:
.size _GLOBAL__sub_I_main, .-_GLOBAL__sub_I_main
.section .init_array,"aw"
.align 8
.quad _GLOBAL__sub_I_main
.hidden __dso_handle
.ident "GCC: (GNU) 7.2.1 20171128"
.section .note.GNU-stack,"",@progbits
Binary output (Generated from assembly)
This is the unlinked form (i.e. not yet fully populated with symbol locations) of the binary output generated via g++ -c in hexadecimal form. I generated the hexadecimal representation using xxd.
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0100 3e00 0100 0000 0000 0000 0000 0000 ..>.............
00000020: 0000 0000 0000 0000 0807 0000 0000 0000 ................
00000030: 0000 0000 4000 0000 0000 4000 0f00 0e00 ....@.....@.....
00000040: 5548 89e5 488d 3500 0000 0048 8d3d 0000 UH..H.5....H.=..
00000050: 0000 e800 0000 0048 89c2 488b 0500 0000 .......H..H.....
00000060: 0048 89c6 4889 d7e8 0000 0000 b800 0000 .H..H...........
00000070: 005d c355 4889 e548 83ec 1089 7dfc 8975 .].UH..H....}..u
00000080: f883 7dfc 0175 3281 7df8 ffff 0000 7529 ..}..u2.}.....u)
00000090: 488d 3d00 0000 00e8 0000 0000 488d 1500 H.=.........H...
000000a0: 0000 0048 8d35 0000 0000 488b 0500 0000 ...H.5....H.....
000000b0: 0048 89c7 e800 0000 0090 c9c3 5548 89e5 .H..........UH..
000000c0: beff ff00 00bf 0100 0000 e8a4 ffff ff5d ...............]
000000d0: c300 4865 6c6c 6f20 776f 726c 6421 0000 ..Hello world!..
000000e0: 0000 0000 0000 0000 0047 4343 3a20 2847 .........GCC: (G
000000f0: 4e55 2920 372e 322e 3120 3230 3137 3131 NU) 7.2.1 201711
00000100: 3238 0000 0000 0000 1400 0000 0000 0000 28..............
00000110: 017a 5200 0178 1001 1b0c 0708 9001 0000 .zR..x..........
00000120: 1c00 0000 1c00 0000 0000 0000 3300 0000 ............3...
00000130: 0041 0e10 8602 430d 066e 0c07 0800 0000 .A....C..n......
00000140: 1c00 0000 3c00 0000 0000 0000 4900 0000 ....<.......I...
00000150: 0041 0e10 8602 430d 0602 440c 0708 0000 .A....C...D.....
00000160: 1c00 0000 5c00 0000 0000 0000 1500 0000 ....\...........
00000170: 0041 0e10 8602 430d 0650 0c07 0800 0000 .A....C..P......
00000180: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000190: 0000 0000 0000 0000 0100 0000 0400 f1ff ................
000001a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001b0: 0000 0000 0300 0100 0000 0000 0000 0000 ................
000001c0: 0000 0000 0000 0000 0000 0000 0300 0300 ................
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0300 0400 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 0000 0000 0300 0500 ................
00000200: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000210: 0800 0000 0100 0500 0000 0000 0000 0000 ................
00000220: 0100 0000 0000 0000 2300 0000 0100 0400 ........#.......
00000230: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000240: 3200 0000 0200 0100 3300 0000 0000 0000 2.......3.......
00000250: 4900 0000 0000 0000 6200 0000 0200 0100 I.......b.......
00000260: 7c00 0000 0000 0000 1500 0000 0000 0000 |...............
00000270: 0000 0000 0300 0600 0000 0000 0000 0000 ................
00000280: 0000 0000 0000 0000 0000 0000 0300 0900 ................
00000290: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000002a0: 0000 0000 0300 0a00 0000 0000 0000 0000 ................
000002b0: 0000 0000 0000 0000 0000 0000 0300 0800 ................
000002c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000002d0: 7100 0000 1200 0100 0000 0000 0000 0000 q...............
000002e0: 3300 0000 0000 0000 7600 0000 1000 0000 3.......v.......
000002f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000300: 8000 0000 1000 0000 0000 0000 0000 0000 ................
00000310: 0000 0000 0000 0000 9600 0000 1000 0000 ................
00000320: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000330: ce00 0000 1000 0000 0000 0000 0000 0000 ................
00000340: 0000 0000 0000 0000 0901 0000 1000 0000 ................
00000350: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000360: 1a01 0000 1000 0000 0000 0000 0000 0000 ................
00000370: 0000 0000 0000 0000 3201 0000 1002 0000 ........2.......
00000380: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000390: 3f01 0000 1000 0000 0000 0000 0000 0000 ?...............
000003a0: 0000 0000 0000 0000 5701 0000 1000 0000 ........W.......
000003b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000003c0: 0074 6573 742e 6300 5f5a 5374 4c31 3970 .test.c._ZStL19p
000003d0: 6965 6365 7769 7365 5f63 6f6e 7374 7275 iecewise_constru
000003e0: 6374 005f 5a53 744c 385f 5f69 6f69 6e69 ct._ZStL8__ioini
000003f0: 7400 5f5a 3431 5f5f 7374 6174 6963 5f69 t._Z41__static_i
00000400: 6e69 7469 616c 697a 6174 696f 6e5f 616e nitialization_an
00000410: 645f 6465 7374 7275 6374 696f 6e5f 3069 d_destruction_0i
00000420: 6900 5f47 4c4f 4241 4c5f 5f73 7562 5f49 i._GLOBAL__sub_I
00000430: 5f6d 6169 6e00 5f5a 5374 3463 6f75 7400 _main._ZSt4cout.
00000440: 5f47 4c4f 4241 4c5f 4f46 4653 4554 5f54 _GLOBAL_OFFSET_T
00000450: 4142 4c45 5f00 5f5a 5374 6c73 4953 7431 ABLE_._ZStlsISt1
00000460: 3163 6861 725f 7472 6169 7473 4963 4545 1char_traitsIcEE
00000470: 5253 7431 3362 6173 6963 5f6f 7374 7265 RSt13basic_ostre
00000480: 616d 4963 545f 4553 355f 504b 6300 5f5a amIcT_ES5_PKc._Z
00000490: 5374 3465 6e64 6c49 6353 7431 3163 6861 St4endlIcSt11cha
000004a0: 725f 7472 6169 7473 4963 4545 5253 7431 r_traitsIcEERSt1
000004b0: 3362 6173 6963 5f6f 7374 7265 616d 4954 3basic_ostreamIT
000004c0: 5f54 305f 4553 365f 005f 5a4e 536f 6c73 _T0_ES6_._ZNSols
000004d0: 4550 4652 536f 535f 4500 5f5a 4e53 7438 EPFRSoS_E._ZNSt8
000004e0: 696f 735f 6261 7365 3449 6e69 7443 3145 ios_base4InitC1E
000004f0: 7600 5f5f 6473 6f5f 6861 6e64 6c65 005f v.__dso_handle._
00000500: 5a4e 5374 3869 6f73 5f62 6173 6534 496e ZNSt8ios_base4In
00000510: 6974 4431 4576 005f 5f63 7861 5f61 7465 itD1Ev.__cxa_ate
00000520: 7869 7400 0000 0000 0700 0000 0000 0000 xit.............
00000530: 0200 0000 0500 0000 fdff ffff ffff ffff ................
00000540: 0e00 0000 0000 0000 0200 0000 0f00 0000 ................
00000550: fcff ffff ffff ffff 1300 0000 0000 0000 ................
00000560: 0400 0000 1100 0000 fcff ffff ffff ffff ................
00000570: 1d00 0000 0000 0000 2a00 0000 1200 0000 ........*.......
00000580: fcff ffff ffff ffff 2800 0000 0000 0000 ........(.......
00000590: 0400 0000 1300 0000 fcff ffff ffff ffff ................
000005a0: 5300 0000 0000 0000 0200 0000 0400 0000 S...............
000005b0: fcff ffff ffff ffff 5800 0000 0000 0000 ........X.......
000005c0: 0400 0000 1400 0000 fcff ffff ffff ffff ................
000005d0: 5f00 0000 0000 0000 0200 0000 1500 0000 _...............
000005e0: fcff ffff ffff ffff 6600 0000 0000 0000 ........f.......
000005f0: 0200 0000 0400 0000 fcff ffff ffff ffff ................
00000600: 6d00 0000 0000 0000 2a00 0000 1600 0000 m.......*.......
00000610: fcff ffff ffff ffff 7500 0000 0000 0000 ........u.......
00000620: 0400 0000 1700 0000 fcff ffff ffff ffff ................
00000630: 0000 0000 0000 0000 0100 0000 0200 0000 ................
00000640: 7c00 0000 0000 0000 2000 0000 0000 0000 |....... .......
00000650: 0200 0000 0200 0000 0000 0000 0000 0000 ................
00000660: 4000 0000 0000 0000 0200 0000 0200 0000 @...............
00000670: 3300 0000 0000 0000 6000 0000 0000 0000 3.......`.......
00000680: 0200 0000 0200 0000 7c00 0000 0000 0000 ........|.......
00000690: 002e 7379 6d74 6162 002e 7374 7274 6162 ..symtab..strtab
000006a0: 002e 7368 7374 7274 6162 002e 7265 6c61 ..shstrtab..rela
000006b0: 2e74 6578 7400 2e64 6174 6100 2e62 7373 .text..data..bss
000006c0: 002e 726f 6461 7461 002e 7265 6c61 2e69 ..rodata..rela.i
000006d0: 6e69 745f 6172 7261 7900 2e63 6f6d 6d65 nit_array..comme
000006e0: 6e74 002e 6e6f 7465 2e47 4e55 2d73 7461 nt..note.GNU-sta
000006f0: 636b 002e 7265 6c61 2e65 685f 6672 616d ck..rela.eh_fram
00000700: 6500 0000 0000 0000 0000 0000 0000 0000 e...............
00000710: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000720: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000730: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000740: 0000 0000 0000 0000 2000 0000 0100 0000 ........ .......
00000750: 0600 0000 0000 0000 0000 0000 0000 0000 ................
00000760: 4000 0000 0000 0000 9100 0000 0000 0000 @...............
00000770: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000780: 0000 0000 0000 0000 1b00 0000 0400 0000 ................
00000790: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
000007a0: 2805 0000 0000 0000 0801 0000 0000 0000 (...............
000007b0: 0c00 0000 0100 0000 0800 0000 0000 0000 ................
000007c0: 1800 0000 0000 0000 2600 0000 0100 0000 ........&.......
000007d0: 0300 0000 0000 0000 0000 0000 0000 0000 ................
000007e0: d100 0000 0000 0000 0000 0000 0000 0000 ................
000007f0: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000800: 0000 0000 0000 0000 2c00 0000 0800 0000 ........,.......
00000810: 0300 0000 0000 0000 0000 0000 0000 0000 ................
00000820: d100 0000 0000 0000 0100 0000 0000 0000 ................
00000830: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000840: 0000 0000 0000 0000 3100 0000 0100 0000 ........1.......
00000850: 0200 0000 0000 0000 0000 0000 0000 0000 ................
00000860: d100 0000 0000 0000 0e00 0000 0000 0000 ................
00000870: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000880: 0000 0000 0000 0000 3e00 0000 0e00 0000 ........>.......
00000890: 0300 0000 0000 0000 0000 0000 0000 0000 ................
000008a0: e000 0000 0000 0000 0800 0000 0000 0000 ................
000008b0: 0000 0000 0000 0000 0800 0000 0000 0000 ................
000008c0: 0800 0000 0000 0000 3900 0000 0400 0000 ........9.......
000008d0: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
000008e0: 3006 0000 0000 0000 1800 0000 0000 0000 0...............
000008f0: 0c00 0000 0600 0000 0800 0000 0000 0000 ................
00000900: 1800 0000 0000 0000 4a00 0000 0100 0000 ........J.......
00000910: 3000 0000 0000 0000 0000 0000 0000 0000 0...............
00000920: e800 0000 0000 0000 1b00 0000 0000 0000 ................
00000930: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000940: 0100 0000 0000 0000 5300 0000 0100 0000 ........S.......
00000950: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000960: 0301 0000 0000 0000 0000 0000 0000 0000 ................
00000970: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000980: 0000 0000 0000 0000 6800 0000 0100 0000 ........h.......
00000990: 0200 0000 0000 0000 0000 0000 0000 0000 ................
000009a0: 0801 0000 0000 0000 7800 0000 0000 0000 ........x.......
000009b0: 0000 0000 0000 0000 0800 0000 0000 0000 ................
000009c0: 0000 0000 0000 0000 6300 0000 0400 0000 ........c.......
000009d0: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
000009e0: 4806 0000 0000 0000 4800 0000 0000 0000 H.......H.......
000009f0: 0c00 0000 0a00 0000 0800 0000 0000 0000 ................
00000a00: 1800 0000 0000 0000 0100 0000 0200 0000 ................
00000a10: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000a20: 8001 0000 0000 0000 4002 0000 0000 0000 ........@.......
00000a30: 0d00 0000 0e00 0000 0800 0000 0000 0000 ................
00000a40: 1800 0000 0000 0000 0900 0000 0300 0000 ................
00000a50: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000a60: c003 0000 0000 0000 6401 0000 0000 0000 ........d.......
00000a70: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000a80: 0000 0000 0000 0000 1100 0000 0300 0000 ................
00000a90: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000aa0: 9006 0000 0000 0000 7200 0000 0000 0000 ........r.......
00000ab0: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000ac0: 0000 0000 0000 0000 ........
These instructions correspond to an x86_64 machine. If you're interested in following along and matching the opcodes, you can look at this reference or download the Intel manual for completeness. Likewise, it is an ELF file, so it contains the things we would expect (e.g. a magic number starting with 0x7f).
In any case, once linked against the system (i.e. run g++ test.cpp or g++ test.s or g++ test.o), this executable runs directly on top of your OS. There are no additional translation layers between this and the OS. Even so, the OS still does OS things like abstracting hardware interfaces, managing system resources, etc.
Tying this back to the original question, the output binary will look very different on a Windows machine (for the same C++ code). At the very least, on a Windows machine you would expect to see the file in the Portable Executable (PE) format, which is distinctly not ELF.
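To make the ELF vs. PE difference concrete, here is a minimal sketch that guesses a file's container format from its leading bytes. The helper names (read_head, guess_format, check_known_magics) are made up for this illustration. It only looks at the magic numbers: ELF files start with 0x7f 'E' 'L' 'F', and PE files start with an MS-DOS header whose first two bytes are 'M' 'Z'. A real tool would check much more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read up to n leading bytes of a file (empty if it cannot be read).
std::vector<std::uint8_t> read_head(const std::string& path, std::size_t n) {
    std::ifstream in(path, std::ios::binary);
    std::vector<std::uint8_t> buf(n);
    in.read(reinterpret_cast<char*>(buf.data()), static_cast<std::streamsize>(n));
    // Keep only the bytes that were actually read.
    buf.resize(in ? n : static_cast<std::size_t>(in.gcount()));
    return buf;
}

// Hypothetical helper: classify a file by its magic number only.
std::string guess_format(const std::vector<std::uint8_t>& head) {
    // ELF: 0x7f followed by the ASCII letters "ELF".
    if (head.size() >= 4 && head[0] == 0x7f && head[1] == 'E' &&
        head[2] == 'L' && head[3] == 'F') {
        return "ELF";
    }
    // PE executables begin with an MS-DOS header, whose magic is "MZ".
    // Plain DOS executables share this magic, so this is only a first hint.
    if (head.size() >= 2 && head[0] == 'M' && head[1] == 'Z') {
        return "MZ";
    }
    return "unknown";
}

// Usage sketch with hand-written byte sequences instead of real files.
void check_known_magics() {
    assert(guess_format({0x7f, 'E', 'L', 'F'}) == "ELF");
    assert(guess_format({'M', 'Z', 0x90, 0x00}) == "MZ");
}
```

On a real file you would call something like guess_format(read_head("test.o", 4)), where test.o is the object file from the example above.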
This is unlike the following Java example which requires a JVM to run:
Java Source File
This is placed in a file called Test.java
package mytest;
public class Test {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}
Java Byte Code (Generated from Java Source)
This is generated by running javac -d . Test.java and running the output file (i.e. mytest/Test.class) through xxd
00000000: cafe babe 0000 0034 001d 0a00 0600 0f09 .......4........
00000010: 0010 0011 0800 120a 0013 0014 0700 1507 ................
00000020: 0016 0100 063c 696e 6974 3e01 0003 2829 .....<init>...()
00000030: 5601 0004 436f 6465 0100 0f4c 696e 654e V...Code...LineN
00000040: 756d 6265 7254 6162 6c65 0100 046d 6169 umberTable...mai
00000050: 6e01 0016 285b 4c6a 6176 612f 6c61 6e67 n...([Ljava/lang
00000060: 2f53 7472 696e 673b 2956 0100 0a53 6f75 /String;)V...Sou
00000070: 7263 6546 696c 6501 0009 5465 7374 2e6a rceFile...Test.j
00000080: 6176 610c 0007 0008 0700 170c 0018 0019 ava.............
00000090: 0100 0c48 656c 6c6f 2077 6f72 6c64 2107 ...Hello world!.
000000a0: 001a 0c00 1b00 1c01 000b 6d79 7465 7374 ..........mytest
000000b0: 2f54 6573 7401 0010 6a61 7661 2f6c 616e /Test...java/lan
000000c0: 672f 4f62 6a65 6374 0100 106a 6176 612f g/Object...java/
000000d0: 6c61 6e67 2f53 7973 7465 6d01 0003 6f75 lang/System...ou
000000e0: 7401 0015 4c6a 6176 612f 696f 2f50 7269 t...Ljava/io/Pri
000000f0: 6e74 5374 7265 616d 3b01 0013 6a61 7661 ntStream;...java
00000100: 2f69 6f2f 5072 696e 7453 7472 6561 6d01 /io/PrintStream.
00000110: 0007 7072 696e 746c 6e01 0015 284c 6a61 ..println...(Lja
00000120: 7661 2f6c 616e 672f 5374 7269 6e67 3b29 va/lang/String;)
00000130: 5600 2100 0500 0600 0000 0000 0200 0100 V.!.............
00000140: 0700 0800 0100 0900 0000 1d00 0100 0100 ................
00000150: 0000 052a b700 01b1 0000 0001 000a 0000 ...*............
00000160: 0006 0001 0000 0003 0009 000b 000c 0001 ................
00000170: 0009 0000 0025 0002 0001 0000 0009 b200 .....%..........
00000180: 0212 03b6 0004 b100 0000 0100 0a00 0000 ................
00000190: 0a00 0200 0000 0500 0800 0600 0100 0d00 ................
000001a0: 0000 0200 0e .....
As one would expect, the byte code output starts with 0xCAFEBABE.
The critical distinction here, however, is that this code cannot be run directly. It is still a binary output, but it's not intended to be executed directly by the operating system. If you tried to run this without a JVM on your system, you would just get an error. However, this code can be run on any operating system that contains a compatible JVM. The set of compatible JVMs depends on how you've set your source and target. By default, it's equivalent to the Java version you're using to compile. In this case, I used Java 8.
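As a rough illustration of how a tool (or a JVM) can tell which Java version a class file targets, here is a minimal sketch that reads the first 8 bytes of the dump above: the 0xCAFEBABE magic, then the minor and major version, all stored big-endian. The struct and function names are made up for this example. The "major minus 44" mapping is only meant for Java 5 and later, where major version 49 corresponds to Java 5, so 52 (0x34) corresponds to Java 8.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch.
struct ClassHeader {
    bool valid;                   // true if the magic number matched
    std::uint16_t minor_version;  // bytes 4-5 of the file
    std::uint16_t major_version;  // bytes 6-7 of the file
};

// Parses the fixed-size start of a class file: magic, minor, major (big-endian).
ClassHeader parse_class_header(const std::vector<std::uint8_t>& b) {
    if (b.size() < 8) {
        return ClassHeader{false, 0, 0};
    }
    const std::uint32_t magic = (static_cast<std::uint32_t>(b[0]) << 24) |
                                (static_cast<std::uint32_t>(b[1]) << 16) |
                                (static_cast<std::uint32_t>(b[2]) << 8) |
                                static_cast<std::uint32_t>(b[3]);
    if (magic != 0xCAFEBABEu) {
        return ClassHeader{false, 0, 0};
    }
    const std::uint16_t minor_v = static_cast<std::uint16_t>((b[4] << 8) | b[5]);
    const std::uint16_t major_v = static_cast<std::uint16_t>((b[6] << 8) | b[7]);
    return ClassHeader{true, minor_v, major_v};
}

// Maps a class-file major version to a Java release number (Java 5 and later only).
int java_release(std::uint16_t major_version) {
    assert(major_version >= 49 && "mapping below only covers Java 5+");
    return major_version - 44;
}
```

Feeding it the first row of the dump (ca fe ba be 00 00 00 34) gives major version 52, which java_release maps to 8, matching the Java 8 compiler used here.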
The way this works is that the JVM itself is compiled specifically for each system (similarly to the C++ example above) and translates the Java byte code into something your system can execute.
At the end of the day, there is no free lunch. As DanielKO mentioned in the comments, the JVM is still a "platform", but it's one level higher than the OS, so it can seem a bit more portable. Eventually, somewhere along the way, the code must translate to instructions valid for your specific operating system family and CPU architecture. However, in the case of Java and the JVM, you only have to compile a single application (i.e. the JVM itself) for all system flavors. At that point, everything written on top of the JVM has system support "for free" so to speak (as long as your application is entirely written in Java and isn't using native interfaces, etc.).
As I mentioned before, there are many caveats to this information :) This was a pretty straightforward example intended to illustrate what you might actually observe. That said, we didn't get into calling native code, using custom JVM agents, or anything else which may affect this answer slightly. In general, however, these more often fall into the category of "special cases" and you wouldn't often be using these things unless you understood why and (hopefully) their implications for portability.
C++ is not platform dependent; in fact, there is a standard that all vendors try to implement. What you mean is that the EXECUTABLE that is produced is platform dependent. That is because each OS has a different definition and requirements of what constitutes a valid executable file. Also, each OS has a different set of APIs used for implementing core services that need to be linked against by the C++ linker and compiler. But this has nothing to do with C++ as a language.
What makes a language, such as C++, "platform independent" is that it doesn't rely upon language constructs that are heavily favored by a given CPU architecture. Assembly language, for example and in contrast, is quite specific to a CPU architecture and instruction set. The front-end of a C++ compiler (parsing and semantic analysis) can be the same or basically the same for any computing platform it's targeted for. However, there still needs to be a platform or CPU specific code generator (e.g., for x86, ARM, etc).
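One way to see that the same source goes through a CPU-specific code generator is to ask the compiler which target it is building for. This is only a sketch: the function names are made up, and the macros checked are the predefined ones commonly provided by GCC/Clang (`__x86_64__`, `__aarch64__`, ...) and MSVC (`_M_X64`, `_M_ARM64`, ...). Other compilers may define different ones, in which case it falls back to "unknown".

```cpp
#include <cassert>

// Hypothetical helper: names the CPU family this translation unit was compiled for.
// The source text is identical everywhere; only the preprocessor branch taken differs.
const char* target_arch() {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86 (32-bit)";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM64";
#elif defined(__arm__) || defined(_M_ARM)
    return "ARM (32-bit)";
#else
    return "unknown";
#endif
}

// Usage sketch: on any target the helper yields some non-empty label.
void check_target_arch() {
    assert(target_arch()[0] != '\0');
}
```

Compiling this unchanged file with an x86_64 toolchain and with an ARM cross-compiler would produce two different binaries that report different labels.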
An EXE is a binary file specifically compiled and code-generated for the DOS/Windows platform. Its structure is known by the DOS/Windows system, and it contains information on how to locate the executable in memory as well as all of the CPU/platform-specific instruction codes it needs to run. As indicated by Oleksandr, its specific format can be found, for example, on Wikipedia.
Actually, C++ is not platform dependent, but the output it produces is an .exe or some other format that depends on the platform you are using. So, put simply, C++ code is independent of the platform; only the output produced by compilation is platform dependent.
C++ is not platform dependent per se, but it is possible to write platform-dependent code with C++ by calling Windows-only and/or Linux-only APIs. It is also possible to be locked to a particular platform if you use Microsoft-only C++ extensions etc.
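For illustration, here is a small sketch of what calling OS-specific APIs looks like in practice. The wrapper name page_size_bytes is made up. It uses GetSystemInfo from the Windows API when `_WIN32` is defined, and otherwise assumes a POSIX system and uses sysconf. Neither call is part of standard C++, which is exactly what ties this code to particular platforms.

```cpp
#include <cassert>

#if defined(_WIN32)
#include <windows.h>  // Windows-only header
#else
#include <unistd.h>   // POSIX-only header (assumed for every non-Windows target)
#endif

// Hypothetical wrapper: returns the memory page size reported by the OS.
long page_size_bytes() {
#if defined(_WIN32)
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    const long size = static_cast<long>(info.dwPageSize);
#else
    const long size = sysconf(_SC_PAGESIZE);
#endif
    // sysconf reports errors as -1, so treat a non-positive size as a failure.
    assert(size > 0 && "OS did not report a page size");
    return size;
}
```

If you deleted one branch of the #if, the file would only build on the other platform, even though it is still valid C++ as far as the language is concerned.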
The executable format for a given platform is a whole different matter.
There are two important aspects to any programming language (such as C++) that this question touches upon:
1. How you tell the computer to do what you want.
2. How the computer does what you told it to.
The second aspect will always be platform-dependent, because the computer must use the machine codes, system libraries, and so forth that work on the particular processor architecture and operating system of that computer.
The first aspect may or may not be platform-dependent, depending on the language and how you use it.
So a *.exe file is a platform-dependent thing, because it says exactly how the computer will do what you told it to do.
But the *.exe file is not C++; it could have been compiled from some other programming language.
A C++ program may or may not be platform-dependent.
If you call functions that are provided by the compiler on Windows and not on other operating systems, then your C++ program will compile only on Windows.
That is a platform dependence.
But if you avoid calling platform-dependent functions such as those, you can compile the C++ program anywhere.
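As a small sketch of the portable side of this, the function below pauses the calling thread using only the C++11 standard library (`<thread>` and `<chrono>`) rather than an OS-specific sleep call. The function name pause_for is made up for this example. The same source should compile with any conforming C++11 toolchain.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Pauses the calling thread for roughly d, then returns how long the pause took.
// Nothing here names an operating system; the standard library hides that part.
std::chrono::milliseconds pause_for(std::chrono::milliseconds d) {
    assert(d.count() >= 0 && "negative durations make no sense here");
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(d);
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed);
}
```

The platform-dependent work still happens, of course, but it happens inside the standard library implementation that ships with each compiler, not in your program.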
C++ is not platform dependent.
There are other platforms out there besides Windows.
There are other processors out there besides the x86 or Pentium that Windows runs on.
There is an area called "Embedded Systems" which uses the C++ language on many other kinds and brands of operating systems and processors. For example, there are DSPs, the good old 8051 and the ARM series.
The reason high-level languages were invented is so that a program can be written once but compiled (translated) for other platforms. For example, a platform-independent C++ program can be compiled for a PDP machine, Windows, Mac, Unix, Vrtx, Windriver, ARM processor, all without changing the program.
In general executables are platform dependent.