Same binary code on Windows and Linux (x86) - c++

I want to compile a bunch of C++ files into raw machine code and the run it with a platform-dependent starter written in C. Something like
fread(buffer, 1, len, file);
a=((*int(*)(int))buffer)(b);
How can I tell g++ to output raw code?
Will function calls work? How can I make it work?
I think the calling conventions of Linux and Windows differ. Is this a problem? How can I solve it?
EDIT: I know that PE and ELF prevent the DIRECT starting of the executable. But that's what I have the starter for.

There is one (relatively) simple way of achieving some of this, and that's called "position independent code". See your compiler documentation for this.
Meaning you can compile some sources into a binary which will execute no matter where in the address space you place it. If you have such a piece of x86 binary code in a file and mmap() it (or the Windows equivalent) it is possible to invoke it from both Linux and Windows.
Limitations already mentioned are of course still present - namely, the binary code must restrict itself to using a calling convention that's identical on both platforms / can be represented on both platforms (for 32bit x86, that'd be passing args on the stack and returning values in EAX), and of course the code must be fully self-contained - no DLL function calls as resolving these is system dependent, no system calls either.
I.e.:
You need position-independent code
You must create self-contained code without any external dependencies
You must extract the machine code from the object file.
Then mmap() that file, initialize a function pointer, and (*myblob)(someArgs) may do.
If you're using gcc, the -ffreestanding -nostdinc -fPIC options should give you most of what you want regarding the first two, then use objdump to extract the binary blob from the ELF object file afterwards.

Theoretically, some of this is achievable. However there are so many gotchas along the way that it's not really a practical solution for anything.
System call formats are totally incompatible
DEP will prevent data executing as code
Memory layouts are different
You need to effectively dynamically 'relink' the code before you can run it.
.. and so forth...

The same executable cannot be run on both Windows and Linux.
You write your code platform independently (STL, Boost & Qt can help with this), then compile in G++ on Linux to output a linux-binary, and similarly on a compiler on the windows platform.
EDIT: Also, perhaps these two posts might help you:
One
Two

Why don't you take a look at wine? It's for using windows executables on Linux. Another solution for that is using Java or .NET bytecode.
You can run .NET executables on Linux (requires mono runtime)
Also have a look at Agner's objconv (disassembling, converting PE executable to ELF etc.)
http://www.agner.org/optimize/#objconv

Someone actually figured this out. It’s called αcτµαlly pδrταblε εxεcµταblε (APE) and you use the Cosmopolitan C library. The gist is that there’s a way to cause Windows PE executable headers to be ignored and treated as a shell script. Same goes for MacOS allowing you to define a single executable. Additionally, they also figured out how to smuggle ZIP into it so that it can incrementally compress the various sections of the file / decompress on run.
https://justine.lol/ape.html
https://github.com/jart/cosmopolitan
Example of a single identical Lua binary running on Linux and Windows:
https://ahgamut.github.io/2021/02/27/ape-cosmo/

Doing such a thing would be rather complicated. It isn't just a matter of the cpu commands being issued, the compiler has dependencies on many libraries that will be linked into the code. Those libraries will have to match at run-time or it won't work.
For example, the STL library is a series of templates and library functions. The compiler will inline some constructs and call the library for others. It'd have to be the exact same library to work.
Now, in theory you could avoid using any library and just write in fundamentals, but even there the compiler may make assumptions about how they work, what type of data alignment is involved, calling convention, etc.
Don't get me wrong, it can work. Look at the WINE project and other native drivers from windows being used on Linux. I'm just saying it isn't something you can quickly and easily do.
Far better would be to recompile on each platform.

That is achievable only if you have WINE available on your Linux system. Otherwise, the difference in the executable file format will prevent you from running Windows code on Linux.

Related

C++ Versions, do they auto-detect the version of an exe?

Okay so I know that there are multiple c++ versions. And I dont really know much about the differences between them but my question is:
Lets say i made a c++ application in c++ 11 and sent it off to another computer would it come up with errors from other versions of c++ or will it automatically detect it and run with that version? Or am I getting this wrong and is it defined at compile time? Someone please tell me because I am yet to find a single answer to my question on google.
It depends if you copy the source code to the other machine, and compile it there, or if you compile it on your machine and send the resulting binary to the other computer.
C++ is translated by the compiler to machine code, which runs directly on the processor. Any computer with a compatible processor will understand the machine code, but there is more than that. The program needs to interface with a filesystem, graphic adapters, and so on. This part is typically handled by the operating system, in different ways of course. Even if some of this is abstracted by C++ libraries, the calls to the operating system are different, and specific to it.
A compiled binary for ubuntu will not run on windows, for example, even if both computers have the same processor and hardware.
If you copy the source code to the other machine, and compile it there (or use a cross-compiler), your program should compile and run fine, if you don't use OS-specific features.
The C++ version does matter for compilation, you need a C++11 capable compiler of course if you have C++11 source code, but once the program is compiled, it does not matter any more.
C++ is compiled to machine code, which is then runnable on any computer having that architecture e.g. i386 or x64 (putting processor features like SSE etc. aside).
For Java, to bring a counterexample, it is different. There the code is compiled to a bytecode format, that is machine independent. This bytecodeformat is read/understood by the Java Virtual Machine (JVM). The JVM then has to be available for your architecture and the correct version has to be installed.
Or am I getting this wrong and is it defined at compile time?
This is precisely the idea: The code is compiled, and after that the language version is almost irrelevant. The only possible pitfall would be if a newer C++ version would include a breaking change to the standard C++ library (the library, not the language itself!). However, since the vast majority of that library is template code, it's compiled along with your own code anyway. It's basically baked into your .exe file along with your own code, so it's just as portable as yours. Also, both the C and C++ designers take great care not to break old code; so you can expect even those parts that are provided by the system itself (the standard C library) not to break anything.
So, even though there are things that could break in theory, pure C++ code should run fine on all machines that understand the same .exe format as the machine it was compiled on.

Can Winelib link a DLL directly to the ELF executable?

There is a DLL (no source code, but no fancy stuff expected inside, hopefully). Going to write a Linux application to use it. So, GNU all the way: native Linux gcc/gdb/ELF, etc.
I've found here on SO some solutions: with WineLib it's possible to write a code that have access to the win32 LoadLibrary function, and that code still compiles into ELF binary. A bit of API forwarding and here is a *.so file that calls LoadLibrary on the dll and exposes its functions.
Is it correct?
Is it possible to automate it? Is there an example with winedump and winegcc that are probably the tools for this job?
Sounds all perfectly reasonable. The DLL format is properly ancient, and not excessively complex (it had to work on the original 8086 CPU, and it got simpler with 32 bits Windows). Code is just that, x86 instructions, and data may be even more boring.
However, it sounds also very specialized, which probably explains why I've never heard of an actual implementation of this idea.

C++ Compile on different platforms

I am currently developing a C++ command line utility to be distributed as an open-source utility on Github. However, I want people who download the program to be able to easily compile and run the program on any platform (specifically Mac, Linux, and Windows) in as few steps as possible. Assuming only small changes have to be made to the code to make it compatible with the various platform-independent C++ compilers (g++ and win32), how can I do this? Are makefiles relevant?
My advice is, do not use make files, maintaining the files for big enougth projects is tedious and errors happen sometimes which you don't catch immediatly (because the *.o file is still there).
See this question here
Makefiles are indeed highly relevant. You may find that you need (at least) two different makefiles to compensate for the fact that you have different compilers.
It's hard to be specific about how you solve this, since it depends on how complex the project is. It may be easiest to write a script/batchfile, and just document "Use the command build.sh on Linux/Unix, and build.bat on Windows") - and then let the respective files deal with for example setting up the name of the compiler and flags, etc.
Or you can have an include into the makefile, which is determined by the architecture. Or different makefiles.
If the project is REALLY simple, it may be just enough to provide a basic makefile - but it's unlikely, as a compile of x.cpp on Linux/MacOS makes an object file is called x.o, on windows the object file is called x.obj. Libraries have different names, dll's have differnet names, and on Linux/MacOS, the final executable has no extension (typically) so it's called "myprog", where the executable under windows is called "myprog.exe".
These sorts of differences mean that the makefile needs to be different.

protecting C++ code in program

I realize this must be a somewhat naive question, but I have written C++ program for a client. He needs the program installed on his machine, but I don't want to give him the code obviously.
How can I protect the code so he doesn't have access to the source code? any suggestions to help me get started would be appreciated.
thanks!
Compile the program, and give him the compiled version? Like most computer programs?
Beyond that, I refer you to Protecting executable from reverse engineering?
You don't have to give your customer the source code of your program. Generally speaking, he should only need the executable program.
C++ is a compiled language. That means that after compilation, the compiler will generate a binary file which contains machine code - for example, a dll, a lib or an exe file under Windows. In windows, all you have to do is deliver the exe's and associated dll's, if they are not already present on the client's machine. There can be different versions of the binaries (depending on platforms, e.g. 32bit vs 64bit compilations) so you might have to run more compilations and let an installer utility handle the distribution.

g++ produces big binaries despite small project

Probably this is a common question. In fact I think I asked it years ago... but I can't remember the answer.
The problem is: I have a project that is composed of 6 source files. All of them no more than 200 lines of code. It uses many STL containers, stdlib.h and iostream. Now the executable is around 800kb in size.... I guess I shouldn't statically link libraries. How to do this with GCC? And in Eclipse CDT?
EDIT:
As I responses away from what I want I think it's the case for a clarification. What I want to know is why such a small program is so big in size and what is the relationship with static, shared libraries and their difference. If it's a too long story to tell feel free to give pointers to docs. Thank you
If you give g++ dynamic library names, and don't pass the -static flag, it should link dynamically.
To reduce size, you could of course strip the binary, and pass the -Os (optimize for size) optimization flag to g++.
One thing to remember is that using the STL results in having that extra code in your executable even if you are dynamically linking with the C++ library. This is by virtue of the fact that the STL is a bunch of templates that aren't actually compiled until you write and compile your code. Since the library can't anticipate what you might store in a container, there's no way for the library to already contain the code for that particular usage of the container. Same goes with algorithms and everything else in the STL.
I'm not saying this is definitely the reason your executable is so much larger than you expect. But it may be a factor.
Use -O3 and -s flags to produce the most optimized binary. Also see this link for some more information.
If you are building for Windows, consider using the Microsoft compiler. It always produces the smallest binary on that platform.
Eclipse should be linking dynamically by default, unless you've set the static flag on the linker in your makefile.
In response to your EDIT :
-when you link statically, the executable contains a full copy of each library you've linked to.
-when you link dynamically, the executable only contains references and hooks to the linked libraries, which is a much much smaller amount of code.
The executable has to contain more than just your code.
At the very least, it contains some startup code, setting up the environment and if necessary, loading any external libraries, before the program launches.
If you've statically linked the runtime library, you also get that included in your executable. Otherwise you only get a small stub, just big enough to redirect system calls to the external runtime.
It may, depending on compiler settings also include a lot of debugging info and other non-essential data. If optimizations are enabled, that may have increased code size as well.
The real question is why does this matter? 800KB still fits easily on a floppy disk!
Most of this is a one-time cost. it doesn't mean that if you write twice as much code, it'll take up 1600KB. More likely, it'll take 810KB or something like that.
Don't worry about one-time startup costs.
The size usually results in static libraries being linked into your application.
You can reduce the size of the compiled binary by compiling to RELEASE versions, with optimizations to binary size.
Another source of executable size are the libraries. You said that you don't use external libraries, except for STD, so I believe you're including the C Runtime with your executable, ie, linking statically. so check for dynamic linking.
IMO you shouldn't really worry about that, but if you're really paranoid, check this: Smallest x86 ELF Hello World
use Visual C++ 6.0
it supported with Windows 95 to Windows 7.
and can be compiled as x86 platforms but only for Windows.
so if you are a Windows user just stick with Windows Compilers other than GCC which is sux actually.most of people who say Visual C++ is sux cause they are Anti-Microsofters.
also remember use "Visual C++ 6.0" if you use a newer one probably you can't run your files on Windows 95. I have tested all those things that's why I said.
GCC produces largest binaries, but Visual C++ not ,Intel Compiler can use to save more than 30% of space but it demands a Intel processor unless performance would be horrible.
another thing u need to remember is when u use templates though you see small lines
when you compiles those functions would be expanded so the result is make larger binaries.
if you need smaller binaries I suggest move to C cause C is actually widely used but not OO
infact C is easy to use than C++
this does make sense then C++ example
cout << "Hello World" << endl;
printf("%s","Hello World");
second one say print field %s means you type a string so it's easy. :P