C++ has many features, enabled by default, that are not desirable or even functional on embedded bare-metal / freestanding platforms.
How can I configure g++ to compile C++ in a manner suitable for embedded bare-metal / freestanding?
The compiler option -ffreestanding exists explicitly for this purpose. It enables implementation-defined forms of main and disables various assumptions about standard library functions, which makes the compiler optimize code a bit differently in some situations.
Unfortunately, in terms of main(), C++ explicitly doesn't allow void main(void) as an implementation-defined form, unlike C. There is no sensible solution to this; it's a language flaw. Name it main_ or some such.
However, this doesn't disable or block any dangerous or unsuitable C++ features from being used; that burden lies on the programmer who picked C++. You have to manually ensure that things like heap allocation, RTTI, exceptions and so on aren't present. Removing the entire heap segment from your linker script is a sensible thing to do; it should stop things like std::vector or std::string from linking.
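For instance, a bare-metal build might combine options like these (a hedged sketch: the flags are real GCC options, but the exact set depends on your target, and my_target.ld, startup.cpp and firmware.elf are placeholder names):

    arm-none-eabi-g++ -ffreestanding -nostdlib \
        -fno-exceptions -fno-rtti \
        -fno-unwind-tables -fno-threadsafe-statics \
        -T my_target.ld -o firmware.elf startup.cpp main.cpp

-fno-exceptions and -fno-rtti reject those language features outright, while -fno-unwind-tables and -fno-threadsafe-statics drop run-time support data and code you usually don't want on bare metal.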
Then, depending on the target, there might be a suitable gcc port, such as, in the case of ARM, the toolchain called "arm-none-eabi" (e.g. arm-none-eabi-gcc / arm-none-eabi-g++), which is the suitable one for bare-metal Cortex-M microcontrollers.
At a minimum, you'll need a "C run-time" (CRT) for the target: either use one provided by the toolchain (generally recommended) or create one yourself (compiling with -nostdlib). It's not a beginner task to create one, particularly not for targets with advanced MMUs. Some general advice on how to do so here. Creating one for C++ is slightly more intricate than for C, since it also has to include the constructor calls for all static storage duration objects.
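For illustration, here is a minimal sketch of such a C++ start-up for a simple target without an MMU; the linker-script symbol names (__bss_start and friends) are assumptions, and real toolchains use different ones:

    #include <cstdint>

    extern "C" {
        // Symbols assumed to be provided by the linker script:
        extern std::uint32_t __bss_start, __bss_end;     // zero-initialized data
        extern std::uint32_t __data_start, __data_end;   // RAM copy of .data
        extern const std::uint32_t __data_load;          // flash image of .data
        using init_fn = void (*)();
        extern init_fn __init_array_start[], __init_array_end[];
        int main_();                                     // renamed main, as above
    }

    extern "C" void _start()
    {
        // Zero .bss
        for (std::uint32_t* p = &__bss_start; p < &__bss_end; ++p)
            *p = 0;
        // Copy initialized data from its load address to RAM
        const std::uint32_t* src = &__data_load;
        for (std::uint32_t* p = &__data_start; p < &__data_end; ++p)
            *p = *src++;
        // The C++-specific part: run the constructors of all static
        // storage duration objects before entering the program
        for (init_fn* f = __init_array_start; f != __init_array_end; ++f)
            (*f)();
        main_();
        for (;;) {}  // main_ should never return on bare metal
    }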
Related
I recently posted a question about stack segmentation and boost coroutines, but it seems like the -fsplit-stack approach only works with source files that are compiled with that flag; the runtime breaks down when you branch to another function that has not been compiled with -fsplit-stack.
This implies that the runtime uses a function-local technique to detect when the current stack has been surpassed, and not a "guard page signal" trick, where the end of the stack always has a guard page which will raise a signal on write or read, telling the runtime to allocate a new stack frame and branch to that.
Then what is the use of this flag? If I link to any other library that has not been built with it, the code will break (even with libstdc++ and libc), so how is this something people use practically with big projects?
From reading the gcc wiki about split stacks, it seems like calling a non-split-stack function from a split-stack function results in the allocation of a 64 KB stack frame. Good.
But it seems like calling a non-split-stack function through a function pointer has not yet been implemented to follow the above scheme.
What use is this flag then? If I proceed to call any virtual function, will my program break?
Further, from the answer below it seems that clang has not implemented split stacks?
You have to compile boost (at least boost.context and boost.coroutine) with segmented-stacks support AND your application too.
compile boost (boost.context and boost.coroutine) with b2 property segmented-stacks=on (enables special code inside boost.coroutine and boost.context).
your app has to be compiled with -DBOOST_USE_SEGMENTED_STACKS and -fsplit-stack (required by boost.coroutines headers).
see boost.coroutine documentation
boost.coroutine contains an example that demonstrates segmented stacks (in directory coroutine/example/asymmetric/ call b2 toolset=gcc segmented-stacks=on).
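Putting those steps together, the build might look roughly like this (hedged: boost_coroutine and boost_context are the usual library names, but check your boost installation):

    # build boost.context and boost.coroutine with segmented-stack support
    b2 toolset=gcc segmented-stacks=on

    # build the application with the flags required by the boost.coroutine headers
    g++ -DBOOST_USE_SEGMENTED_STACKS -fsplit-stack app.cpp \
        -lboost_coroutine -lboost_context -o app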
regarding your last question, the GCC Wiki states:
For calls from split-stack code to non-split-stack code, the linker will change the initial instructions in the split-stack (caller) function. This means that the linker will have to have special knowledge of the instructions that the compiler emits. The effect of the changes will be to increase the required framesize by a number large enough to reasonably work for a non-split-stack. This will be a target dependent number; the default will be something like 64K. Note that this large stack will be released when the split-stack function returns. Note that I'm disregarding the case of split-stack code in a shared library calling non-split-stack code in the main executable; that seems like an unlikely problem.
please note: while llvm supports segmented stacks, clang seems not to provide the __splitstack_<xyz> functions.
First, I'd say split-stack support is somewhat experimental in nature to begin with. It is not a widely supported thing, nor has a single implementation become accepted as the way to go. As such, part of the purpose of its existence in the compiler is to enable research through real use.
That said, one generally wants to use such a feature to enable lots of threads with small stacks that can grow if they need to. In some applications, the code that runs in these threads can be tightly controlled, e.g. fairly specialized request handlers that do not call general-purpose libraries such as Boost. High-performance systems work often involves tightening the constraints on what code is used in a given path, and this would be an example thereof. It certainly limits the applicability of the feature, but I wouldn't be surprised if someone is using it in production this way.
Note that similar issues exist with flags such as -fno-exceptions and -fno-rtti. Generally, C++ requires compiling everything that goes into an executable with a compatible set of flags. Sometimes one can mix and match, but it is often fragile. This is part of the motivation for building everything from source and for hermetic build tools like Bazel. Other languages have different approaches to non-source components, especially virtual-machine-based languages such as Java and the .NET family. In those worlds, things like split stacks are decided at a lower level of compilation, but typically one would not have any control over, or awareness of, them at the source-code level.
What are the chances that using shared libraries compiled with a different compiler version than your program would introduce problems?
What if the language standard my program uses is different from theirs?
i.e. could it be a problem if I link boost libraries compiled with gcc-4.8 and C++11 while compiling my code with gcc-6 and C++14, for instance?
If the ABI (and API) is the same, it will work fine, according to gcc.gnu.org's ABI Policy and Guidelines, "given an application compiled with a given compiler ABI and library API, it will work correctly with a Standard C++ Library created with the same constraints."
A shared library compiled with a different ABI could work, but there are some cases of which to be aware because they could cause major bugs which would be incredibly difficult to detect.
gcc-4.8 and gcc-6 have different ABIs (Application Binary Interfaces), and so could output different compiled code in very specific cases, and cause an application to break.
However, "the GNU C++ compiler, g++, has a compiler command-line option to switch between various different C++ ABIs." (According to ABI Policy and Guidelines.)
You can read more details about specific ABI issues at gcc.gnu.org.
Short answer: the chance is close to 100% that you'll have some issues, unless the library was designed with this in mind. Boost was not designed with this in mind at all; it will not work.
Long answer: it might work under certain very specific circumstances (not for boost, though). There are two main things that come into play:
ABI compatibility
Sub-library compatibility / inlined code
The ABI is the easy part. If, e.g., compiler A mangles names differently than compiler B, the code won't even link. Or if they have different calling conventions (e.g. how arguments are passed through registers/the stack, etc.), then it might link but will not work at all, or will crash in pretty obvious ways. Also, most compilers on the same platform have the same calling conventions (or can be configured appropriately), so this shouldn't be a huge problem.
Sub-library compatibility and inlined code are trickier. For example, let's say that you have a library that hands out an allocated object, and it's the client's job to deallocate it. If the library's allocator works differently from the one in the client, then this will cause problems (e.g. the library allocates with compiler A's new, and the main program frees with compiler B's delete).
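The usual defence, sketched here with hypothetical names, is for the library to own both sides of the allocation:

    extern "C" {
        struct Widget;                       // opaque to the client
        Widget* widget_create(void);         // allocates with the library's new
        void    widget_destroy(Widget* w);   // frees with the library's delete
    }

The client only ever holds the opaque pointer and hands it back; it never calls delete on it.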
Or there might be code in the headers (e.g. inline methods). The two compilers might compile them differently, which will cause issues.
Or the library might return a std::vector. The implementations of compiler A's vector might differ from compiler B's vector, so that won't work either.
Or there might be a struct or class passed around. The two compilers might not pack/pad them the same way, so they won't be laid out the same way in memory, and things will break.
So if the library is designed with this in mind, then it might work. Generally speaking that means that:
All calls have to be extern "C" to avoid name mangling (see the sketch after this list).
No passing around of any externally defined structures (e.g. the STL)
Even structs might cause issues unless the packing/padding is the same in both compilers
Everything allocated by the library must also be deallocated by the library
No throwing of exceptions (which is sort of implied by extern "C")
Probably a few more that I'm forgetting right now
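Here is a hedged sketch of a single boundary function that follows those rules (all names are made up for illustration):

    #include <cstring>
    #include <string>

    static std::string do_work(const char* in) {  // internal C++ is fine
        return std::string("processed: ") + in;
    }

    // extern "C" entry point: no name mangling, no STL types cross the
    // boundary, and no exception is allowed to escape (errors become codes).
    extern "C" int lib_process(const char* in, char* out, int out_len) {
        try {
            const std::string r = do_work(in);
            if (static_cast<int>(r.size()) >= out_len) return -1;
            std::memcpy(out, r.c_str(), r.size() + 1);
            return 0;
        } catch (...) {
            return -2;
        }
    }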
Please try not to judge me too harshly for this question.
Will I have problems if I try to use a DLL (written in C++) in a Visual Studio project that was compiled with the MinGW toolchain (g++)?
If the answer is yes, could someone explain to me why?
As far as code execution goes, if it targets Windows, then it should run.
For interoperability, it depends on the API exposed from the DLL.
If it is a "C" style API then you should be fine.
If it is a C++ API with classes and STL use, then you may run into trouble using different compilers (even different settings in the compilers). C++ doesn't have a standard ABI.
As noted in the comments, and probably worth touching on here, the "C" style API is more than just the function calls; compatibility includes (but is not limited to) things such as:
Memory allocation; general rule is "deallocation must be done by the same code as the allocation" - i.e. in the dll such that the matching allocation/deallocation is used (e.g. new and delete, malloc and free, CustomAlloc and CustomDealloc)
Data alignment and packing, not all compilers have the same defaults
Calling conventions; again, not all compilers have the same defaults. When in doubt, be explicit about what the convention should be (e.g. __stdcall; see the sketch after these lists)
As for the C++ ABI, in addition to all of the above, incompatibilities include:
Name mangling
STL class and container implementations
Separate and different implementations of new and delete, with custom memory pooling etc.
Exception mechanisms, e.g. generation, and stack unwinding
These are not comprehensive lists, but they go some way to motivating the general advice that when possible, use the same toolchains and same compiler/linker settings.
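As a small illustration of being explicit at the boundary (a sketch; both MSVC and MinGW g++ accept this syntax, and add_numbers is a made-up name):

    // Export a plain "C" entry point with an explicit calling convention,
    // so both toolchains agree on the symbol name and argument passing.
    extern "C" __declspec(dllexport) int __stdcall add_numbers(int a, int b) {
        return a + b;
    }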
The problem is not about the DLL itself. It's about how you describe that DLL to your compiler (i.e. header files and exports).
Those descriptions affect what your code expects about data layout (e.g. different compilers may have different default alignment and padding), and even which code path is taken (e.g. resolving to different inline functions and macros, as in the STL case).
Then, each compiler may have a different name-mangling scheme, so you may need a bit of a hack to glue the code together.
With very special care on the above issues, you should be fine using a DLL from a different compiler.
No, because it is all one standard. In fact, your C++ application compiled with gcc calls system functions from OS dll files. So whether you call them manually or automatically, it is all the same.
I'm working on learning more C++ and trying to understand the basics of different compilers and their technologies. I've googled this a lot, but every time I step through it I meet new terms which need more explanation. So, what do the terms in this topic, such as static compiling, dynamic linking and so on, actually mean in practice?
Some languages like C++ compile all of a program to "native machine code" understood by the CPU before it starts running (i.e. actually being used). That's "static compilation".
Other languages (e.g. Java) use "Just In Time" compilers to produce CPU-native code from some other "byte-code" representation of the program, but do so only once they start running. That's "runtime" compilation.
Many other languages (e.g. common Python, Perl, Ruby and Java implementations) use an "interpreter", which means they have native code that keeps consulting some manner of "byte-code" to work out what to do next. (Some very basic in-house or specialised interpreters even keep consulting the source code without ever generating a more compact byte-code representation thereof, but no popular language does that; it's awfully slow and clumsy.)
A single language may potentially use any combination of these approaches, but in general it's either static compilation, or an interpreter that might add a just-in-time compiler to speed up execution.
Sometimes, a single language has implementations that use different approaches, for example: there are limited C++ interpreters (like http://root.cern.ch/drupal/content/cint, but I've never heard of it being used "in anger"), and systems that compile python to native code.
For "dynamic linking": say you have a function "void f();" that does something wonderful. If you put that function in a library for use by many applications, you could "statically link" the function into a specific application to take a "snapshot" of f()'s functionality at the specific point in time your program's executable is being created. Then if f() changes later you have to re-link and redistribute your application to incorporate the changes to f(). Alternatively, you could put f() into a dynamic linked library, which means a separate library file containing f() is distributed alongside - or independently from - your program. Each time your program starts running, if looks to the dynamic library file for the code to use for f(). So, if you distribute an updated dynamic library you can update f() without redistributing all the application programs that call f(). Sometimes, that's just a better model for distributing updated software to your users, and avoiding getting each individual application program involved in the distribution of updates to f(). (Occasionally it's a disaster because a dynamic version of f() hasn't actually been tested with the application, and does something subtly differently that breaks the application).
About static or dynamic linking, read also Levine's Linkers and Loaders book.
About shared or static libraries, read the Program Library HowTo.
For shared objects (or libraries) on Linux, read Drepper's How To Write Shared Libraries paper.
You could load plugins with dlopen(3), but then read the C++ dlopen mini-howto.
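A hedged sketch of the dlopen route (the plugin file name and entry-point name are made up; link the host program with -ldl where needed):

    #include <dlfcn.h>
    #include <cstdio>

    int main() {
        void* h = dlopen("./libplugin.so", RTLD_NOW);
        if (!h) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }
        // C++ name mangling is why plugin entry points are usually extern "C".
        auto entry = reinterpret_cast<void (*)()>(dlsym(h, "plugin_entry"));
        if (entry) entry();
        dlclose(h);
        return 0;
    }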
Compilation is usually static, since it is done ahead of time (e.g. when compiling with GCC). Sometimes just-in-time compilation is done (e.g. by most JVMs).
If coding in C++ on Linux, you'll want to compile with g++ -Wall -g (and later use -O2 to ask GCC to optimize, once your program is debugged). See this and that for hints.
Also, learn C++11 and use the latest GCC 4.8.2 compiler (GCC 4.9 might be released in a few weeks, e.g. in March or April 2014).
I'm just starting to explore C++, so forgive the newbieness of this question. I also beg your indulgence for how open-ended this question is. I think it could be broken down, but I think this information belongs in the same place.
(FYI -- I am working predominantly with the QT SDK and mingw32-make right now and I seem to have configured them correctly for my machine.)
I knew that there was a lot in the language which is compiler-driven; I've heard about pre-compiler directives, but it seems like someone could write books about the different C++ compilers and their respective parameters. In addition, there are commands which apparently precede make (like qmake, for example; is this something only in Qt?).
I would like to know if there is any place which gives me an overview of what compilers are out there, and what their different options are. I'd also like to know how each of them views Makefiles (it seems that there is a difference in syntax between them?).
If there is no website along the lines of "everything you need to know about C++ compilers but were afraid to ask", what would be the best way to go about learning the answers to these questions?
Concerning the "numerous options of the various compilers"
A piece of good news: you needn't worry about the detail of most of these options. You will, in due time, delve into this, but only for the very compiler you use, and maybe only for the options that pertain to a particular set of features. As a novice, generally trust the default options or the ones supplied with the make files.
The broad categories of these features (and I may be missing a few) are:
pre-processor defines (now, you may need a few of these)
code generation (target CPU, FPU usage...)
optimization (hints for the compiler to favor speed over size and such)
inclusion of debug info (which is extra data left in the object/binary and which enables the debugger to know where each line of code starts, what the variables names are etc.)
directives for the linker
output type (exe, library, memory maps...)
C/C++ language compliance and warnings (compatibility with previous version of the compiler, compliance to current and past C Standards, warning about common possible bug-indicative patterns...)
compile-time verbosity and help
Concerning an inventory of compilers with their options and features
I know of no such list, but I'm sure one probably exists on the web. However, I suggest that, as a novice, you worry little about these "details" and use whatever free compiler you can find (gcc is certainly a great choice), building experience with the language and the build process. C professionals may likely argue, with good reason and at length, on the merits of various compilers and associated runtimes etc., but for generic purposes (and then some) the free stuff is all that is needed.
Concerning the build process
The most trivial applications, such as those made of a single unit of compilation (read: a single C/C++ source file), can be built with a simple batch file where the various compiler and linker options are hardcoded and the name of the file is specified on the command line.
For all other cases, it is very important to codify the build process so that it can be done
a) automatically and
b) reliably, i.e. with repeatability.
The "recipe" associated with this build process is often encapsulated in a make file or as the complexity grows, possibly several make files, possibly "bundled together in a script/bat file.
This (make file syntax) you need to get familiar with, even if you use alternatives to make/nmake, such as Apache Ant; the reason is that many (most?) source code packages include a make file.
In a nutshell, make files are text files that allow defining targets and the associated commands to build each target. Each target is associated with its dependencies, which allows the make logic to decide which targets are out of date and should be rebuilt, and, before rebuilding them, which of their dependencies should also be rebuilt. That way, when you modify, say, an include file (and if the make file is properly configured), any C file that uses this header will be recompiled, and any binary which links with the corresponding obj file will be rebuilt as well. make also includes options to force all targets to be rebuilt, which is sometimes handy to be sure that you truly have a current build (for example, in case some dependencies of a given object are not declared in the make file).
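A minimal make file sketch of those ideas (the file names are placeholders; note that each command line under a target must start with a tab):

    CXX      = g++
    CXXFLAGS = -Wall -g

    app: main.o util.o
    	$(CXX) -o app main.o util.o

    main.o: main.cpp util.h
    	$(CXX) $(CXXFLAGS) -c main.cpp

    util.o: util.cpp util.h
    	$(CXX) $(CXXFLAGS) -c util.cpp

    clean:
    	rm -f app *.o

Editing util.h makes main.o and util.o out of date, so make rebuilds them and then relinks app; running make clean forces a full rebuild.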
On the Pre-processor:
The pre-processor is the first step toward compiling, although it is technically not part of the compilation. The purposes of this step are:
to remove comments and extraneous whitespace
to substitute any macro reference with the relevant C/C++ syntax. Some macros, for example, are used to define constant values, such as, say, some email address used in the program; during pre-processing, any reference to this constant value (by the way, by convention such constants are named with ALL_CAPS_AND_UNDERSCORES) is replaced by the actual C string literal containing the email address.
to exclude all conditional-compilation branches that are not relevant (#ifdef and the like)
What's important to know about the pre-processor is that its directives are NOT part of the C language proper; they serve several important functions, such as the conditional compilation mentioned earlier (used, for example, to have multiple versions of the program, say for different operating systems, or indeed for different compilers).
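For example (the macro name and values are illustrative):

    // A constant macro: every use of ADMIN_EMAIL below is textually replaced
    // by the string literal during pre-processing.
    #define ADMIN_EMAIL "admin@example.com"

    // Conditional compilation: only one branch survives pre-processing.
    #ifdef _WIN32
    const char* platform = "Windows";
    #else
    const char* platform = "POSIX";
    #endif

    const char* contact = ADMIN_EMAIL;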
Taking it from there...
After this manifesto of mine... I encourage you to read just a little more, and then to dive into programming and building binaries. It is a very good idea to try and get a broad picture of the framework etc., but this can be overdone, a bit akin to the exchange student who stays in his/her room reading the Webster dictionary to be "prepared" for meeting native speakers, rather than just "doing it!".
Ideally you shouldn't need to care which C++ compiler you are using. Conformance to the standard has gotten much better in recent years (even from Microsoft).
Compiler flags obviously differ, but the same features are generally available; it's just a differently named option to, e.g., set the warning level on GCC and ms-cl.
The build system is independent of the compiler; you can use any make with any compiler.
That is a lot of questions in one.
C++ compilers are a lot like hammers: They come in all sizes and shapes, with different abilities and features, intended for different types of users, and at different price points; ultimately they all are for doing the same basic task as the others.
Some are intended for highly specialized applications, like high-performance graphics, and have numerous extensions and libraries to assist the engineer with those types of problems. Others are meant for general purpose use, and aren't necessarily always the greatest for extreme work.
The technique for using each type of hammer varies from model to model—and version to version—but they all have a lot in common. The macro preprocessor is a standard part of C and C++ compilers.
A brief comparison of many C++ compilers is here. Also check out the list of C compilers, since many programs don't use any C++ features and can be compiled as ordinary C.
C++ compilers don't "view" makefiles. The rules of a makefile may invoke a C++ compiler, but also may "compile" assembly language modules (assembling), process other languages, build libraries, link modules, and/or post-process object modules. Makefiles often contain rules for cleaning up intermediate files, establishing debug environments, obtaining source code, etc., etc. Compilation is one link in a long chain of steps to develop software.
Also, many development environments abstract the makefile into a "project file" which is used by an integrated development environment (IDE) in an attempt to simplify or automate many programming tasks. See a comparison here.
As for learning: choose a specific problem to solve and dive in. The target platform (Linux/Windows/etc.) and problem space will narrow the choices pretty well. Which you choose is often linked to other considerations, such as working for a particular company, or being part of a team. C++ has something like 95% commonality among all its flavors. Learn any one of them well, and learning the next is a piece of cake.