How does the C++ standard library work behind the scenes?

This question has been bothering me so much for the past couple of days. I was wondering how the standard library works, in terms of functionality. I couldn't find an answer anywhere, even by checking the source code provided by the LLVM compiler which is, for a beginner like me, a really complicated piece of code.
What I'm basically trying to understand here is how the C++ standard library works. For example, let's take the fstream header file, which consists of a bunch of functions that help to write to and read from files.
How does it work? Does it use the OS specific API (since the library is cross platform), or what? And, if the standard library can do it, aren't I supposed to be able to mess with some files as well without calling the standard fstream file (which to my experience I can't do)?
I apologize if my questions are unclear since I'm not a native English speaker: feel free to modify this text so as to make it clearer.

Does it use the OS specific API (since the library is cross platform), or what?
At some point, the OS specific API is used. The fstream implementation does not necessarily call an OS function directly. It might use other classes, which call functions inherited from C, etc., but eventually the call chain will lead to an OS call. (Yes, the details are often too complicated for an intermediate programmer to follow. So, as a self-described beginner, your findings are not surprising.)
The library is cross-platform in the sense that on your end (the C++ programmer), the interface is the same regardless of platform. It is not, however, the same library on every platform. Each platform has its own library, exposing the same interface on the C++ side, but making use of different OS calls. (In fact, the same platform might have multiple standard libraries, as the library implementation is provided by your toolchain, not by the standards committee.)
And, if the standard library can do it, aren't I supposed to be able to mess with some files as well without calling the standard fstream file (which to my experience I can't do)?
Yes, you are allowed to. Apparently, you have not been able to yet, but with some practice and guidance you should be able to. Everything in the standard library can be recreated in your own code. The point of the standard library (and most libraries, for that matter) is to save you time, not to enable something that was otherwise unavailable. For example, you don't have to implement a file stream for every program you write; it's in the standard library so you can focus on more interesting aspects of your project.
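To make the claim that everything in the standard library can be recreated a bit more concrete, here is a minimal sketch of a file writer that bypasses fstream. It assumes a POSIX system (Linux, macOS); MiniFileWriter is a made-up name for illustration, and a real library would add buffering, proper error reporting and a Windows implementation behind the same interface.

```cpp
// Sketch only: a hypothetical MiniFileWriter that calls the POSIX API directly.
#include <cassert>
#include <cstddef>
#include <string>
#include <fcntl.h>    // open and the O_* flags (POSIX)
#include <unistd.h>   // write, read, close (POSIX)

class MiniFileWriter {
public:
    // Create the file or truncate it, which is also what std::ofstream does by default.
    explicit MiniFileWriter(const char* path)
        : fd_(::open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)) {}

    ~MiniFileWriter() {
        if (fd_ >= 0) ::close(fd_);
    }

    // The descriptor is owned by exactly one object, so copying is disabled.
    MiniFileWriter(const MiniFileWriter&) = delete;
    MiniFileWriter& operator=(const MiniFileWriter&) = delete;

    bool is_open() const { return fd_ >= 0; }

    // write() may accept fewer bytes than requested, so loop until all are sent.
    bool write(const std::string& text) {
        assert(is_open());
        std::size_t done = 0;
        while (done < text.size()) {
            ssize_t n = ::write(fd_, text.data() + done, text.size() - done);
            if (n < 0) return false;  // real code would inspect errno here
            done += static_cast<std::size_t>(n);
        }
        return true;
    }

private:
    int fd_;  // POSIX file descriptor, or -1 if open() failed
};
```

Usage would look like `MiniFileWriter w("out.txt"); w.write("hello\n");`. The point is only that nothing here is off limits to ordinary user code.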

A compiler is just a program that creates an executable file or a library. You can use the compiler's default libraries to save time, or write your own. The default libraries communicate with the OS for file operations or memory allocation, and provide simple standard classes so that the developer can write a single codebase that works on every target platform supported by the compiler and its libraries. If you want to write your own, you have to write each function for every OS you target.

The standard library is cross-platform in a sense that its interface does not change between platforms but its implementation does - or in practical terms - if you only use C++ and its standard library, you can write your code the same way for Linux / Windows / MacOS / Android / Whatever and if you find a C++ compiler for one of those platforms that supports the language features you used, you will be able to compile your code for that platform without rewriting anything.
So while you can use std::vector or std::fstream or any other feature in the library independently of the platform you're writing for and expect the function definitions, type names, etc. to look the same, you cannot expect the executable which you compiled for a PC with Windows 10 to run on a phone with Android. You cannot even expect the same executable to run on the same PC under a different operating system. That is what I mean by "the implementation is different".
There are two main reasons for this difference:
Processors with different architectures (x86-64 and ARM for example) use different instruction sets and as such the C++ source would need to be compiled to a completely different machine code to run properly
Computers with processors of the same architecture which have a different operating system have different ways of dynamically allocating memory, creating files, creating streams, writing to console, creating and scheduling threads etc. - which is part of the system functionality that you use via the standard library
If you really wanted to, you could use HeapAlloc() instead of operator new() or CreateThread() instead of the standard library's std::thread, but that would force you to rewrite your program every time you wanted to compile it for something other than Windows, and to recompile it with the target platform's compiler (and by proxy learn its API). The standard library saves you from that trouble by abstracting away those system calls.
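As a rough illustration of "same interface, different implementation", the sketch below hides a platform-specific allocation call behind two functions. The names os_alloc and os_free are hypothetical; the Windows branch uses HeapAlloc/HeapFree as mentioned above, and the other branch assumes a POSIX system that provides mmap with MAP_ANONYMOUS (Linux, macOS). Real standard libraries are far more elaborate, but the preprocessor switch is the basic idea.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_WIN32)
  #include <windows.h>   // HeapAlloc, HeapFree, GetProcessHeap
#else
  #include <sys/mman.h>  // mmap, munmap (POSIX)
#endif

// Hypothetical portable interface: callers see these two functions on every platform.
void* os_alloc(std::size_t bytes);
void  os_free(void* p, std::size_t bytes);

#if defined(_WIN32)

// Windows implementation: ask the process heap.
void* os_alloc(std::size_t bytes) {
    return HeapAlloc(GetProcessHeap(), 0, bytes);
}

void os_free(void* p, std::size_t /*bytes*/) {
    HeapFree(GetProcessHeap(), 0, p);
}

#else

// POSIX implementation: map anonymous memory straight from the kernel.
void* os_alloc(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

void os_free(void* p, std::size_t bytes) {
    munmap(p, bytes);
}

#endif
```

Code that only calls os_alloc and os_free compiles unchanged on both kinds of system, which is the same bargain operator new and std::thread offer you.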
As for the fstream in particular, here is what it uses internally on most PCs nowadays.

Basically, on Linux and other POSIX systems, fstream, iostream and printf are all built on the kernel's write() system call. When your code calls printf (to take one example), the call eventually reaches write(), which lets the kernel do the actual I/O. After that, write() returns, printf returns, and your code continues.
So if you really want to know how printf works internally, you have to read the source code of the C library and, below that, of the kernel.
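A small sketch of that path, assuming a POSIX system: the function below does the formatting in user space, which is roughly the part printf is responsible for, and then hands the finished bytes to the kernel with one write() call. The name tiny_print_int is invented for this example; a real printf also buffers output and supports many more format specifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>     // std::snprintf
#include <unistd.h>   // write, pipe, read (POSIX)

// Formats "label=value\n" into a local buffer, then passes the bytes to the
// kernel through write(). Returns the number of bytes written, or -1 on error.
long tiny_print_int(int fd, const char* label, int value) {
    char buf[64];
    int len = std::snprintf(buf, sizeof buf, "%s=%d\n", label, value);
    if (len < 0 || static_cast<std::size_t>(len) >= sizeof buf) {
        return -1;  // formatting failed or the text did not fit
    }
    return static_cast<long>(::write(fd, buf, static_cast<std::size_t>(len)));
}
```

Calling `tiny_print_int(STDOUT_FILENO, "x", 42)` sends the text to the terminal without going through printf or iostream at all.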
But you shouldn't do that for now.
If you are a beginner, do not try to go deeper before you have a basic understanding of how a computer works. A computer is a project, just like a building, so the right way to learn it is level by level. First learn how to use bricks and cement to put up a building; that is what you should be doing for now. What you shouldn't do is pick up a brick for the first time, get curious about how bricks are manufactured, and start focusing on that instead of on building. That is the wrong way to learn IT.
If you are learning C/C++, just learn it. Remember, learn it level by level. For now, knowing how to use printf is enough.

Related

How does a language expand itself?

I am learning C++ and I've just started learning about some of Qt's capabilities to code GUI programs. I asked myself the following question:
How does C++, which previously had no syntax capable of asking the OS for a window or a way to communicate through networks (with APIs which I don't completely understand either, I admit) suddenly get such capabilities through libraries written in C++ themselves? It all seems terribly circular to me. What C++ instructions could you possibly come up with in those libraries?
I realize this question might seem trivial to an experienced software developer but I've been researching for hours without finding any direct response. It's gotten to the point where I can't follow the tutorial about Qt because the existence of libraries is incomprehensible to me.
A computer is like an onion: it has many, many layers, from the inner core of pure hardware to the outermost application layer. Each layer exposes parts of itself to the next outer layer, so that the outer layer may use some of the inner layer's functionality.
In the case of e.g. Windows, the operating system exposes the so-called WIN32 API to applications running on Windows. The Qt library uses that API to provide its own API to applications using Qt. You use Qt, Qt uses WIN32, WIN32 uses lower levels of the Windows operating system, and so on until it's electrical signals in the hardware.
You're right that in general, libraries cannot make anything possible that isn't already possible.
But the libraries don't have to be written in C++ in order to be usable by a C++ program. Even if they are written in C++, they may internally use other libraries not written in C++. So the fact that C++ didn't provide any way to do it doesn't prevent it from being added, so long as there is some way to do it outside of C++.
At a quite low level, some functions called by C++ (or by C) will be written in assembly, and the assembly contains the required instructions to do whatever isn't possible (or isn't easy) in C++, for example to call a system function. At that point, that system call can do anything your computer is capable of, simply because there's nothing stopping it.
C and C++ have 2 properties that allow all this extensibility that the OP is talking about.
C and C++ can access memory
C and C++ can call assembly code for instructions not in the C or C++ language.
In the kernel or in a basic non-protected mode platform, peripherals like the serial port or disk drive are mapped into the memory map in the same way as RAM is. Memory is a series of switches and flipping the switches of the peripheral (like a serial port or disk driver) gets your peripheral to do useful things.
In a protected mode operating system, when one wants to access the kernel from userspace (say, to write to the file system or to draw a pixel on the screen), one needs to make a system call. C does not have an instruction to make a system call, but C can call assembler code which can trigger the correct system call. This is what allows one's C code to talk to the kernel.
In order to make programming a particular platform easier, system calls are wrapped in more complex functions which may perform some useful function within one's own program. One is free to call the system calls directly (using assembler) but it is probably easier to just make use of one of the wrapper functions that the platform supplies.
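Here is a minimal sketch of the difference between the wrapper and the call underneath it. It is Linux-specific and relies on glibc's generic syscall() function, which is itself a thin routine containing the trap instruction; the two helper names are made up for this example.

```cpp
#include <cassert>
#include <sys/syscall.h>  // SYS_getpid (Linux system call numbers)
#include <unistd.h>       // syscall(), getpid()

// Ask the kernel for our process ID through the generic system call entry
// point, passing the system call number ourselves.
long pid_via_raw_syscall() {
    return syscall(SYS_getpid);
}

// Ask for the same thing through the ordinary libc wrapper function.
long pid_via_wrapper() {
    return static_cast<long>(getpid());
}
```

Both functions return the same number; the wrapper just saves you from knowing the system call number and its calling convention.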
There is another level of API that is a lot more useful than a raw system call. Take malloc, for example. Not only will it call the system to obtain large blocks of memory, it will also manage that memory by doing all the bookkeeping on which parts are in use.
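To give a feel for that split between "get a big block from the system" and "do the bookkeeping yourself", here is a deliberately tiny bump allocator. It is not how malloc is implemented: BumpArena is a hypothetical class, it never reuses freed memory, and it assumes a POSIX system with mmap and MAP_ANONYMOUS.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>  // mmap, munmap (POSIX)

class BumpArena {
public:
    // One system request up front for the whole block.
    explicit BumpArena(std::size_t capacity)
        : base_(nullptr), capacity_(capacity), used_(0) {
        void* p = mmap(nullptr, capacity, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p != MAP_FAILED) base_ = static_cast<char*>(p);
    }

    ~BumpArena() {
        if (base_) munmap(base_, capacity_);
    }

    BumpArena(const BumpArena&) = delete;
    BumpArena& operator=(const BumpArena&) = delete;

    // Hand out `bytes` bytes aligned to `align` (a power of two) without
    // talking to the kernel again. Returns nullptr when the block is exhausted.
    void* allocate(std::size_t bytes,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        if (!base_) return nullptr;
        std::size_t start = (used_ + align - 1) & ~(align - 1);  // round up
        if (start > capacity_ || bytes > capacity_ - start) return nullptr;
        used_ = start + bytes;  // the only bookkeeping: where the next piece starts
        return base_ + start;
    }

    std::size_t used() const { return used_; }

private:
    char* base_;            // start of the block obtained from the OS
    std::size_t capacity_;  // size of that block in bytes
    std::size_t used_;      // bytes handed out so far, including padding
};
```

A real malloc keeps free lists, size classes and per-thread caches on top of this idea, so that memory can be returned and reused.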
Win32 APIs wrap some graphic functionality with a common platform widget set. Qt takes this a bit further by wrapping the Win32 (or X Windows) API in a cross platform way.
Fundamentally, though, a C compiler turns C code into machine code, and since the computer is designed to run machine code, you should expect C to be able to accomplish the lion's share of what a computer can do. All that the wrapper libraries do is the heavy lifting, so that you don't have to.
Languages (like C++11) are specifications, on paper, usually written in English. Look inside the latest C++11 draft (or buy the costly final spec from your ISO vendor).
You generally use a computer with some language implementation (You could in principle run a C++ program without any computer, e.g. using a bunch of human slaves interpreting it; that would be unethical and inefficient)
Your C++ implementation generally works above some operating system and communicates with it (using some implementation-specific code, often in some system library). Generally that communication is done through system calls. Look for instance into syscalls(2) for a list of system calls available on the Linux kernel.
From the application point of view, a syscall is an elementary machine instruction like SYSCALL on x86-64 (or SYSENTER on 32-bit x86) with some conventions (ABI).
On my Linux desktop, the Qt libraries are above X11 client libraries communicating with the X11 server Xorg thru X Windows protocols.
On Linux, use ldd on your executable to see the (long) list of dependencies on libraries. Use pmap on your running process to see which ones are "loaded" at runtime. BTW, on Linux, your application is probably using only free software, you could study its source code (from Qt, to Xlib, libc, ... the kernel) to understand more what is happening
I think the concept you are missing is system calls. Each operating system provides an enormous amount of resources and functionality that you can tap into to do low-level operating system related things. Even when you call a regular library function, it is probably making a system call behind the scenes.
System calls are a low-level way of making use of the power of the operating system, but can be complex and cumbersome to use, so are often "wrapped" in APIs so that you don't have to deal with them directly. But underneath, just about anything you do that involves O/S related resources will use system calls, including printing, networking and sockets, etc.
In the case of windows, Microsoft Windows has its GUI actually written into the kernel, so there are system calls for making windows, painting graphics, etc. In other operating systems, the GUI may not be a part of the kernel, in which case as far as I know there wouldn't be any system calls for GUI related things, and you could only work at an even lower level with whatever low-level graphics and input related calls are available.
Good question. Every new C or C++ developer has this in mind. I am assuming a standard x86 machine for the rest of this post. If you are using Microsoft C++ compiler, open your notepad and type this (name the file Test.c)
int main(int argc, char **argv)
{
return 0;
}
And now compile this file (using developer command prompt) cl Test.c /FaTest.asm
Now open Test.asm in your notepad. What you see is the translated code - C/C++ is translated to assembler. Do you get the hint ?
_main PROC
push ebp
mov ebp, esp
xor eax, eax
pop ebp
ret 0
_main ENDP
C/C++ programs are designed to run on the metal. Which means they have access to lower level hardware which makes it easier to exploit the capabilities of the hardware. Say, I am going to write a C library getch() on a x86 machine.
Depending on the assembler I would type something this way :
_getch proc
xor AH, AH
int 16h
;AL contains the keycode (AX is already there - so just return)
ret
_getch endp
I run it over with an assembler and generate a .OBJ - Name it getch.obj.
I then write a C program (I don't #include anything)
extern char getch(void);
int main(void)
{
getch();
return 0;
}
Now name this file - GetChTest.c. Compile this file by passing getch.obj along. (Or compile individually to .obj and LINK GetChTest.Obj and getch.Obj together to produce GetChTest.exe).
Run GetChTest.exe and you would find that it waits for the keyboard input.
C/C++ programming is not just about language. To be a good C/C++ programmer you need to have a good understanding on the type of machine that it runs. You will need to know how the memory management is handled, how the registers are structured, etc., You may not need all these information for regular programming - but they would help you immensely. Apart from the basic hardware knowledge, it certainly helps if you understand how the compiler works (ie., how it translates) - which could enable you to tweak your code as necessary. It is an interesting package!
Many C and C++ compilers support inline assembly (for example, MSVC's __asm keyword when targeting 32-bit x86), which means you can mix in your own assembly language code too. Learning C and C++ will make you a better-rounded programmer overall.
It is not necessary to always link with Assembler. I had mentioned it because I thought that would help you understand better. Mostly, most such library calls make use of system calls / APIs provided by the Operating System (the OS in turn does the hardware interaction stuff).
How does C++ ... suddenly get such capabilities through libraries written in C++ themselves?
There's nothing magical about using other libraries. Libraries are simply big bags of functions that you can call.
Consider yourself writing a function like this
void addExclamation(std::string &str)
{
str.push_back('!');
}
Now if you include that file you can write addExclamation(myVeryOwnString);. Now you might ask, "how did C++ suddenly get the capability to add exclamation points to a string?" The answer is easy: you wrote a function to do that then you called it.
So to answer your question about how C++ can get capabilities to draw windows through libraries written in C++, the answer is the same. Someone else wrote function(s) to do that, and then compiled them and gave them to you in the form of a library.
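A slightly longer sketch of the same idea: once addExclamation exists, other functions can be layered on top of it, which is all a library does at larger scale. addEmphasis is a made-up helper added here purely for illustration.

```cpp
#include <cassert>
#include <string>

// The "library" function from above: it gives every caller a new capability.
void addExclamation(std::string& str) {
    str.push_back('!');
}

// A second, hypothetical layer built only from the first one. Callers of
// addEmphasis never need to know how a single exclamation mark gets added.
void addEmphasis(std::string& str, int count) {
    for (int i = 0; i < count; ++i) {
        addExclamation(str);
    }
}
```

Given `std::string s = "Hi";`, calling `addEmphasis(s, 3);` leaves `s` equal to "Hi!!!". A GUI library is the same pattern with many more layers, the lowest of which calls into the operating system.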
The other questions answer how the window drawing actually works, but you sounded confused about how libraries work so I wanted to address the most fundamental part of your question.
The key is the ability of the operating system to expose an API, together with a detailed description of how this API is to be used.
The operating system offers a set of APIs with calling conventions.
The calling convention defines how parameters are passed into the API, how results are returned, and how the actual call is executed.
Operating systems and the compilers that generate code for them play nicely together, so you usually don't have to think about it; you just use it.
There is no need for a special syntax for creating windows. All that is required is that the OS provides an API to create windows. Such an API consists of simple function calls for which C++ does provide syntax.
Furthermore C and C++ are so called systems programming languages and are able to access arbitrary pointers (which might be mapped to some device by the hardware). Additionally, it is also fairly simple to call functions defined in assembly, which allows the full range of operations the processor provides. Therefore it is possible to write an OS itself using C or C++ and a small amount of assembly.
It should also be mentioned that Qt is a bad example, as it uses a so-called meta-object compiler (moc) to extend C++'s syntax. This is, however, not related to its ability to call into the APIs provided by the OS to actually draw or create windows.
First, there's a little misunderstanding, I think.
How does C++, which previously had no syntax capable of asking the OS for a window or a way to communicate through networks
There is no syntax for doing OS operations. It's a question of semantics.
suddenly get such capabilities through libraries written in C++ themselves
Well, the operating system is written mostly in C. You can use shared libraries (.so, .dll) to call external code. Additionally, the operating system code can register system routines on syscalls* or interrupts, which you can call using assembly. Those shared libraries often just make the system calls for you, so you are spared from using inline assembly.
Here's the nice tutorial on that: http://www.win.tue.nl/~aeb/linux/lk/lk-4.html
It's for Linux, but the principles are the same.
How does the operating system perform operations on graphics cards, network cards, etc.? It's a very broad topic, but mostly you need to access interrupts or ports, or write some data to a special memory region. Since those operations are protected, you need to go through the operating system anyway.
In an attempt to provide a slightly different view to other answers, I shall answer like this.
(Disclaimer: I am simplifying things slightly, the situation I give is purely hypothetical and is written as a means of demonstrating concepts rather than being 100% true to life).
Think of things from the other perspective, imagine you've just written a simple operating system with basic threading, windowing and memory management capabilities. You want to implement a C++ library to let users program in C++ and do things like make windows, draw onto windows etc. The question is, how to do this.
Firstly, since C++ compiles to machine code, you need to define a way to use machine code to interface with C++. This is where functions come in, functions accept arguments and give return values, thus they provide a standard way of transferring data between different sections of code. They do this by establishing something known as a calling convention.
A calling convention states where and how arguments should be placed in memory so that a function can find them when it gets executed. When a function gets called, the calling function places the arguments in memory and then asks the CPU to jump over to the other function, where it does what it does before jumping back to where it was called from. This means that the code being called can be absolutely anything and it will not change how the function is called. In this case however, the code behind the function would be relevant to the operating system and would operate on the operating system's internal state.
So, many months later and you've got all your OS functions sorted out. Your user can call functions to create windows and draw onto them, they can make threads and all sorts of wonderful things. Here's the problem though: your OS's functions are going to be different from Linux's functions or Windows' functions. So you decide you need to give the user a standard interface so they can write portable code. Here is where Qt comes in.
As you almost certainly know, Qt has loads of useful classes and functions for doing the sorts of things that operating systems do, but in a way that appears independent of the underlying operating system. The way this works is that Qt provides classes and functions that are uniform in the way they appear to the user, but the code behind the functions is different for each operating system. For example, Qt's QApplication::closeAllWindows() would actually be calling each operating system's specialised window closing function depending on the version used. On Windows it would most likely end up calling DestroyWindow(hwnd), whereas on an OS using the X Window System it would potentially call XDestroyWindow(display,window).
As is evident, an operating system has many layers, all of which have to interact through interfaces of many varieties. There are many aspects I haven't even touched on, but to explain them all would take a very long time. If you are further interested in the inner workings of operating systems, I recommend checking out the OS dev wiki.
Bear in mind though that the reason many operating systems choose to expose interfaces to C/C++ is that they compile to machine code, they allow assembly instructions to be mixed in with their own code and they provide a great degree of freedom to the programmer.
Again, there is a lot going on here. I would like to go on to explain how libraries like .so and .dll files do not have to be written in C/C++ and can be written in assembly or other languages, but I feel that if I add any more I might as well write an entire article, and as much as I'd love to do that I don't have a site to host it on.
When you try to draw something on the screen, your code calls some other piece of code which calls some other code (etc.) until finally there is a "system call", which is a special instruction that the CPU can execute. These instructions can be either written in assembly or can be written in C++ if the compiler supports their "intrinsics" (which are functions that the compiler handles "specially" by converting them into special code that the CPU can understand). Their job is to tell the operating system to do something.
When a system call happens, a function gets called that calls another function (etc.) until finally the display driver is told to draw something on the screen. At that point, the display driver looks at a particular region in physical memory which is actually not memory, but rather an address range that can be written to as if it were memory. Instead, however, writing to that address range causes the graphics hardware to intercept the memory write, and draw something on the screen.
Writing to this region of memory is something that could be coded in C++, since on the software side it's just a regular memory access. It's just that the hardware handles it differently.
So that's a really basic explanation of how it can work.
Your C++ program is using the Qt library (also coded in C++). The Qt library will be using the Windows CreateWindowEx function (which is exported by user32.dll and was coded in C). Or under Linux it may be using Xlib (also coded in C), but it could as well be sending the raw bytes that in the X protocol mean "Please create a window for me".
Related to your catch-22 question is the historical note that “the first C++ compiler was written in C++”, although actually it was a C compiler with a few C++ notions, enough so it could compile the first version, which could then compile itself.
Similarly, the GCC compiler uses GCC extensions: it is first compiled to a version then used to recompile itself. (GCC build instructions)
The way I see it, this is actually a compiler question.
Look at it this way: you write a piece of code in assembly (you could do it in any language) which translates the new language you want to create, call it Z++, into assembly. For simplicity let's call it a compiler (it is a compiler).
Now you give this compiler some basic functions, so that you can write int, string, arrays, etc. In fact, you give it enough abilities that you can write the compiler itself in Z++. Now you have a compiler for Z++ written in Z++. Pretty neat, right?
What's even cooler is that you can now add abilities to that compiler using the abilities it already has, thus expanding the Z++ language with new features by using the previous features.
For example, if you write enough code to draw a pixel in any color, then you can build on it in Z++ to draw anything you want.
The hardware is what allows this to happen. You can think of the graphics memory as a large array (consisting of every pixel on the screen). To draw to the screen you can write to this memory using C++ or any language that allows direct access to that memory. That memory just happens to be accessible by or located on the graphics card.
On modern systems, accessing the graphics memory directly would require writing a driver because of various restrictions, so you use indirect means: libraries that create a window (really just an image like any other image) and then write that image to the graphics memory, which the GPU then displays on screen. Nothing has to be added to the language except the ability to write to specific memory locations, which is what pointers are for.
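The "large array of pixels" picture can be sketched in a few lines. Everything here is hypothetical: the Framebuffer struct, the 0x00RRGGBB pixel layout and the function names are assumptions, and the pointer is meant to be backed by ordinary RAM in this example. On real hardware the same pointer would refer to mapped video memory, which normally only a driver can obtain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a screen: dimensions plus a pointer to the
// first pixel. Each pixel is assumed to be one 32-bit value (0x00RRGGBB).
struct Framebuffer {
    int width;
    int height;
    std::uint32_t* pixels;
};

// Drawing a pixel is just a bounds check and a store through a pointer.
bool put_pixel(Framebuffer& fb, int x, int y, std::uint32_t color) {
    if (x < 0 || y < 0 || x >= fb.width || y >= fb.height) return false;
    fb.pixels[static_cast<std::size_t>(y) * fb.width + x] = color;
    return true;
}

// Once a pixel can be drawn, larger shapes are built on top of it.
// Draws the pixels from x0 to x1 inclusive on row y; off-screen pixels are skipped.
void draw_hline(Framebuffer& fb, int x0, int x1, int y, std::uint32_t color) {
    for (int x = x0; x <= x1; ++x) {
        put_pixel(fb, x, y, color);
    }
}
```

To try it, point `pixels` at a plain array of `width * height` values and inspect the array afterwards. No language feature beyond pointers is involved.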

Does C++ deprecate some parts of the Linux API?

I am in the middle of reading The Linux Programming Interface and Linux Programming by Example. Both are very good books and explain the Linux API very well. But quite often I find myself thinking that in real-world projects I would prefer the C++ standard library, Boost or some other good C++ library (there are many well-written and portable C++ libs) over the C API whenever possible. This naturally raises a question: why do I need to use the Linux API directly when a good C++ compiler and libs (Boost, TBB, etc.) are available on the target platforms? I guess the same could be said about the Windows API too, but I don't know much about Windows system programming.
This is not new to C++. In C, there have been two ways to open files for a long, long time:
// Only on POSIX
int fdes = open("file.txt", O_RDONLY);
Or:
// Any hosted C environment, POSIX or otherwise
FILE *fp = fopen("file.txt", "rb");
So why would anyone ever use the POSIX-specific version? The answer is simple -- there are a large number of system calls which work with POSIX file descriptors. For example, select. You can also make things other than files, like pipes and sockets, and you can pass them to other processes. There is a long tradition of using POSIX file descriptors, and we have a large number of books and references on how to do network programming with them.
So the trade-off is between the portable version and the powerful version. It always has been.
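The two layers are not sealed off from each other. As a small sketch, assuming a POSIX system, the function below creates a pipe (something only the descriptor API can do), writes to it with write(), and then wraps the read end in a FILE* with fdopen so the portable stdio calls can read it. The name round_trip_through_pipe is made up, and the text is assumed to be short enough to fit in the pipe's kernel buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <stdio.h>    // FILE, fdopen (POSIX), fread, fclose
#include <string>
#include <unistd.h>   // pipe, write, close (POSIX)

// Sends `text` through a pipe: raw POSIX write() on one end, stdio on the other.
// Returns what was read back, or an empty string on failure.
std::string round_trip_through_pipe(const std::string& text) {
    int fds[2];
    if (pipe(fds) != 0) return {};

    // POSIX layer: write to the descriptor directly.
    ssize_t n = write(fds[1], text.data(), text.size());
    close(fds[1]);  // closing the write end lets the reader see end-of-file
    if (n < 0) {
        close(fds[0]);
        return {};
    }

    // stdio layer: fdopen wraps an existing descriptor in a FILE*.
    FILE* in = fdopen(fds[0], "r");
    if (!in) {
        close(fds[0]);
        return {};
    }

    std::string out;
    char buf[64];
    std::size_t got;
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        out.append(buf, got);
    }
    fclose(in);  // also closes the underlying descriptor fds[0]
    return out;
}
```

Going the other way, fileno(fp) returns the descriptor behind a FILE*, which is how code that started with fopen can still call descriptor-based functions such as select.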
The other half of this is that every time you work with files on Linux, you are working with the POSIX interface. Libraries just hide it from you. Boost uses it, the C runtime uses it, the JRE uses it, and GHC uses it. Many (most?) language runtimes are written in C, and direct access to the system calls is preferred.
You should use a higher-level API whenever possible. It's usually faster to work with and makes it easier to port your code to another platform. However:
Due to the law of leaky abstractions, it's useful to know the underlying operating system API so you understand various quirks and performance issues that the higher-level API was not able to hide.
Some things are not doable with the portable API, usually because it's so different between operating systems that it's not easily possible.
All portable APIs incur some overhead. In a big project it's small compared to the rest of the code, but if you are doing something small, you might want to avoid that overhead, especially if you know you'll need to use the specific API somewhere anyway.
The C++ standard is not written for a particular platform; it is platform independent. So if you are going to use some platform-specific feature or functionality, you will have to use the platform-dependent interface, usually called the system API. In that sense, no, the C++ library does not deprecate the Linux or Windows API.

dynamic code compilation

I'm working on a program that renders iterated fractal systems. I wanted to add the functionality where someone could define their own iteration process, and compile that code so that it would run efficiently.
I currently don't know how to do this and would like tips on what to read to learn how to do this.
The main program is written in C++ and I'm familiar with C++. In fact given most of the scenarios I know how to convert it to assembly code that would accomplish the goal, but I don't know how to take the extra step to convert it to machine code. If possible I'd like to dynamically compile the code like how I believe many game system emulators work.
If it is unclear what I'm asking, tell me so I can clarify.
Thanks!
Does the routine to be compiled dynamically need to be in any particular language? If the answer to that question is "Yes, it must be C++", you're probably out of luck. C++ is about the worst possible choice for online recompilation.
Is the dynamic portion of your application (the fractal iterator routine) a major CPU bottleneck? If you can afford using a language that isn't compiled, you can probably save yourself an awful lot of trouble. Lua and JavaScript are both heavily optimized interpreted languages that only run a few times slower than native, compiled code.
If you really need the dynamic functionality to be compiled to machine code, your best bet is probably going to be using clang/llvm. clang is the C/Objective-C front end being developed by Apple (and a few others) to make online, dynamic recompilation perform well. llvm is the backend clang uses to translate from a portable bytecode to native machine code. Be advised that when this answer was written clang did not yet support much of C++, since that's such a difficult language to get right; it has since gained complete C++ support.
Some CPU emulators treat the machine code as if it was byte code and they do a JIT compile, almost as if it was Java. This is very efficient, but it means that the developers need to write a version of the compiler for each CPU their emulator runs on and for each CPU emulated.
That usually means it only works on x86 and is annoying to anyone who would like to use something different.
They could also translate it to LLVM or Java byte code or .Net CIL and then compile it, which would also work.
In your case I am not sure that sort of thing is the best way to go. I think that I would do this by using dynamic libraries. Make a directory that is supposed to contain "plugins" and let the user compile their own. Make your program scan the directory and load each DLL or .so it finds.
Doing it this way means you spend less time writing code compilers and more time actually getting stuff done.
If you can write your dynamic extensions in C (not C++), you might find the Tiny C Compiler to be of use. It's available under the LGPL, it works on Windows and Linux, and it's a small executable (or library) at ~100 KB for the preprocessor, compiler, linker and assembler, all of which it runs very fast. The downside, of course, is that it can't compare to the optimizations you can get with GCC. Another potential downside is that it's X86 only AFAIK.
If you did decide to write assembly, TCC can handle that -- the documentation says it supports a gas-like syntax, and it does support X86 opcodes.
TCC also fully supports ANSI C, and it's nearly fully compliant with C99.
That being said, you could either include TCC as an executable with your application or use libtcc (there's not too much documentation of libtcc online, but it's available in the source package). Either way, you can use tcc to generate dynamic or shared libraries, or executables. If you went the dynamic library route, you would just put in a Render (or whatever) function in it, and dlopen or LoadLibrary on it, and call Render to finally run the user-designed rendering. Alternatively, you could make a standalone executable and popen it, and do all your communication through the standalone's stdin and stdout.
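A rough sketch of the dlopen / LoadLibrary route. The Render signature below is purely an assumption for illustration; the real one is whatever you and your users agree on. On older Linux systems you may also need to link with -ldl.

```cpp
#include <cassert>
#if defined(_WIN32)
#  include <windows.h>
#else
#  include <dlfcn.h>
#endif

// Assumed plugin contract (hypothetical): the library exports
//   extern "C" void Render(unsigned char* pixels, int width, int height);
using RenderFn = void (*)(unsigned char* pixels, int width, int height);

// Loads the library at `path` and returns its Render function,
// or nullptr if the library or the symbol cannot be found.
// The library handle is intentionally never released: the code must stay
// mapped for as long as the returned function pointer is in use.
RenderFn load_render(const char* path) {
    assert(path != nullptr);
#if defined(_WIN32)
    HMODULE lib = LoadLibraryA(path);
    if (!lib) return nullptr;
    return reinterpret_cast<RenderFn>(GetProcAddress(lib, "Render"));
#else
    void* lib = dlopen(path, RTLD_NOW);
    if (!lib) return nullptr;
    return reinterpret_cast<RenderFn>(dlsym(lib, "Render"));
#endif
}
```

The caller checks the result for nullptr and, if it is valid, calls it with the pixel buffer to run the user-designed rendering.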
Since you're generating pixels to be displayed on a screen, have you considered using HLSL with dynamic shader compile? That will give you access to SIMD hardware designed for exactly this sort of thing, as well as the dynamic compiler built right into DirectX.
LLVM should be able to do what you want to do. It allows you to form a description of the program you'd like to compile in an object-oriented manner, and then it can compile that program description into native machine code at runtime.
Nanojit is a pretty good example of what you want. It generates machine code from an intermediate language. It's C++, and it's small and cross-platform. I haven't used it very extensively, but I enjoyed toying around just for demos.
Spit the code to a file and compile it as a dynamically loaded library, then load it and call it.
Is there a reason why you can't use a GPU-based solution? This seems to be screaming for one.

How do you create a freestanding C++ program?

I'm just wondering how you create a freestanding program in C++?
Edit: By freestanding I mean a program that doesn't run in a hosted environment (e.g. an OS). I want my program to be the first program the computer loads, instead of the OS.
Have a look at this article:
http://www.codeproject.com/KB/tips/boot-loader.aspx
You would need a little assembly start-up code to get you as far as main() but then you could write the rest in C++. You'd have to write your own heap manager (new/delete) if you wanted to create objects at runtime and your own scheduler if you wanted more than one thread.
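To give an idea of what "your own heap manager" can look like at its simplest, here is a sketch of a bump allocator over a fixed buffer, combined with placement new. The names BumpArena and arena_new are made up for this example, and a real kernel heap would also need a way to free blocks:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hands out memory from a caller-supplied buffer; individual blocks are never freed.
class BumpArena {
public:
    BumpArena(unsigned char* buffer, std::size_t size)
        : begin_(buffer), size_(size), used_(0) {}

    // Returns `bytes` of storage aligned to `align` (a power of two),
    // or nullptr when the buffer is exhausted.
    void* allocate(std::size_t bytes, std::size_t align) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(begin_);
        const std::uintptr_t current = base + used_;
        const std::uintptr_t aligned =
            (current + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t offset = static_cast<std::size_t>(aligned - base);
        if (offset > size_ || bytes > size_ - offset) return nullptr;
        used_ = offset + bytes;
        return begin_ + offset;
    }

    std::size_t used() const { return used_; }

private:
    unsigned char* begin_;
    std::size_t size_;
    std::size_t used_;
};

// Constructs a T inside the arena with placement new; nullptr if out of space.
template <typename T, typename... Args>
T* arena_new(BumpArena& arena, Args&&... args) {
    void* p = arena.allocate(sizeof(T), alignof(T));
    return p ? ::new (p) T(static_cast<Args&&>(args)...) : nullptr;
}
```

On a bare machine the buffer would typically be a static array or a memory region described by the linker script, and the global operator new could be routed to something like this.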
See this page: http://wiki.osdev.org/C++
It has everything necessary to start writing an OS using C++ as the core language using the more popular toolchains.
In addition this page should prove to be very helpful: http://wiki.osdev.org/C++_Bare_Bones. It pretty much walks you through getting to the C++ entry point of an OS.
Legacy Systems
Even with your clarification, the answer is that it depends -- the exact boot sequence depends on the hardware -- though there's quite a bit of commonality. The boot loader is typically loaded at an absolute address, and the file it's contained in is frequently read into memory exactly as-is. This means instead of a normal linker, you typically use a "linking locator". Where a typical linker produces an executable file ready for relocation and loading, a locator produces an executable that's already set up to run at one exact address, with all relocations already applied. For those old enough to remember them, it's typically pretty much like an MS-DOS .COM file.
Along with that, it has to (of course) statically link the whole run-time that the program depends upon -- it can't depend on something like a DLL or shared object library, because the code to load either of those hasn't itself been loaded yet.
EFI/UEFI
Current PCs (and Macs) use EFI/UEFI. I'm going to just refer to UEFI throughout the remainder of this article, but most of it applies about equally to EFI as well (but UEFI is much more common).
These provide quite a bit more support for boot code. This includes drivers for most devices (it supports installing device drivers), so your boot code can use networking and such, which is much more difficult to support in legacy mode.
Bootable code under EFI uses the same PE format as Windows executables. Libraries are also available so quite a bit of boot code can be written much more like normal code that runs inside an OS. I won't try to get into a lot of detail, but here are links to some information.
https://www.intel.com/content/www/us/en/developer/articles/tool/unified-extensible-firmware-interface.html
https://www.intel.com/content/dam/doc/guide/uefi-driver-network-boot-devices-guide.pdf
https://www.intel.com/content/dam/www/public/us/en/documents/guides/bldk-v2-uefi-standard-based-guide.pdf
And perhaps the most important one--the development kit:
https://github.com/tianocore/edk2
google 'embedded c++' for a start
Another idea is to start with the embedded systems emulators. For example, the Atmel AVR site has a nice IDE that emulates Atmel AVR systems and allows you to build raw code in C and load it into an emulated CPU. They use gcc as the toolchain (I think).
C++ is used in embedded systems programming, even to write OS kernels.
Usually you have at least a few assembler instructions early in the boot sequence. A few things are just easier to express that way, or there may be reference code from the CPU vendor you need to use.
For the initial boot process, you won't be able to use the standard library. No exceptions, RTTI, new/delete. It's back to "C with classes". Most people just use C here.
Once you have enough supporting infrastructure loaded though, you can use whatever parts of the standard library you can port.
You will need an environment that provides:
A working C library, or enough of it to do what you want
The parts of the C++ runtime that you intend to use. This is compiler-specific
In addition to any other libraries. If you have no dynamic linker on your platform (if you have no OS, you probably have no linker) then you will have to static-link it all.
In practice this means linking some small C++ runtime, and a C library appropriate for your platform. Then you can simply write a standalone C++ program.
If you were using BSD Unix, you would link with the standalone library. That included a basic IO system for disk and tty. Your source code looked the same as if it were to be run under Unix, but the binary could be loaded into a naked machine.
Yes, of course. The ISO standards for C and C++ support both hosted and free-standing environments, and the macro __STDC_HOSTED__ is used to distinguish between the two. C++ has provided this macro since C++11 (C since C99).
Since most of the replies I see here are in the dark on the fact that this is all standard terminology (and is supposed to be common knowledge for C and C++ users), I'll recap, here, the distinction between the two, laid out by the ISO, and refer you to the following link for more details:
https://en.cppreference.com/w/cpp/freestanding
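As a small illustration of how the __STDC_HOSTED__ macro can be used, here is a sketch that picks a logging routine at compile time. The report function is a made-up name, and the free-standing branch is only a placeholder for whatever output device the target really has:

```cpp
// __STDC_HOSTED__ is predefined by the implementation:
// 1 for a hosted implementation, 0 for a free-standing one.
#if __STDC_HOSTED__
#  include <cstdio>
// Hosted: the full standard library, including <cstdio>, is available.
inline void report(const char* msg) { std::puts(msg); }
#else
// Free-standing: <cstdio> is not guaranteed to exist, so this placeholder
// does nothing. A real target might write to a UART or a debug port here.
inline void report(const char*) {}
#endif

// Lets ordinary code ask the question without touching the preprocessor.
constexpr bool is_hosted() { return __STDC_HOSTED__ == 1; }
```

With GCC or Clang, compiling with -ffreestanding switches the macro to 0.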
Hosted:
"main" is defined, and is started up in the main thread. The static objects may be constructed and destructed at the start and end of the thread. Normally that means there is a host operating system (OS) that provides all the infrastructure required for an environment to run the program in. The POSIX standard, in particular, spells out a set of conditions expected for utilities that are called on a host system from the command-line; and the ISO C/C++ standards dovetail into each other and into the POSIX standard. (POSIX's support for C is at C99 at minimum.)
Free-Standing:
"main" may, but need not, be defined. Whether/how constructors are applied at start-up to static objects, and destructors to static objects on termination, depends on the implementation.
It's any environment where the routine made by the C or C++ program is not being called by some OS, but runs on its own. That's the typical case for embedded systems.
"Free-standing", of course, also includes OS's themselves, as a special case. An OS kernel is the epitome of a free-standing program. One of the respondents here actually went as far as to provide a link to a roll-your-own-OS kit for C++. In that vein, it would be an interesting exercise for you to test out the concept by rewriting the old 0.96 version of Linus' OS (which you can find in UNIX Archive) as a free-standing C++ program. His source code is actually filled with C++'isms (especially in the "fs" directory) that practically scream out "I'm a virtual function", "I'm a base class", "I'm a derived class" or "This is class inheritance"!
Libraries Required For Free-Standing Programs:
There are also minimum requirements on which of the standard libraries must be supported - and how much. Worthy of note: <new>, <exception> are mandatory, only partial support is required for <cstdlib> for start-up and termination, <atomic> (since 2011), <bit> and <coroutine> (both since 2020) are mandatory, as they had better be, if you're doing anything with embedded systems! <thread> is not mandatory.
A roll-your-own-OS kit would then provide templates for the mandatory libraries, and the required parts of the other semi-mandatory libraries. Any responsible embedded systems developer will be laying out their run-time system and environment, and compilers geared toward such users will be providing the under-the-box transparency needed to allow this to be done and to allow user-defined run-time systems to interface cleanly with the compiler.

How do I write a C++ program that will easily compile in Linux and Windows?

I am making a C++ program.
One of my biggest annoyances with C++ is its supposed platform independence.
You all probably know that it is pretty much impossible to compile a Linux C++ program in Windows and a Windows one to Linux without a deluge of cryptic errors and platform specific include files.
Of course you can always switch to some emulation like Cygwin and wine, but I ask you, is there really no other way?
The language itself is cross-platform but most libraries are not, but there are three things that you should keep in mind if you want to go completely cross-platform when programming in C++.
Firstly, you need to start using some kind of cross-platform build system, like SCons. Secondly, you need to make sure that all of the libraries that you are using are built to be cross-platform.
And a minor third point, I would recommend using a compiler that exists on all of your target platforms, gcc comes to mind here (C++ is a rather complex beast and all compilers have their own specific quirks).
I have some further suggestions regarding graphical user interfaces for you. There are several of these available to use, the three most notable are:
GTK+
Qt
wxWidgets
GTK+ and Qt are two APIs that come with their own widget sets (buttons, lists, etc.), whilst wxWidgets is more of a wrapper API around the native widget set of the platform it is running on. This means that programs using the former two might look a bit different from the rest of the system, whilst programs using the latter will look just like native programs.
And if you're into games programming there are equally many API's to choose from, all of them cross-platform as well. The two most fully featured that I know of are:
SDL
SFML
Both contain everything from graphics to input and audio routines, either through plugins or built-in.
Also, if you feel that the standard library in C++ is a bit lacking, check out Boost for some general purpose cross-platform sweetness.
Good Luck.
C++ is cross platform. The problem you seem to have is that you are using platform dependent libraries.
I assume you are really talking about UI components, in which case I suggest using something like GTK+, Qt, or wxWidgets, each of which has UI components that can be compiled for different systems.
The only solution is for you to find and use platform independent libraries.
And, on a side note, neither Cygwin nor Wine is emulation: they are 100% native implementations of the same functionality found on their respective systems.
Once you're aware of the gotchas, it's actually not that hard. All of the code I am currently working on compiles on 32 and 64-bit Windows, all flavors of Linux, as well as Unix (Sun, HP and IBM). Obviously, these are not GUI products. Also, we don't use third-party libraries, unless we're compiling them ourselves.
I have one .h file that contains all of the compiler-specific code. For example, Microsoft and gcc disagree on how to specify an 8-bit integer. So in the .h, I have
#if defined(_MSC_VER)
typedef __int8 int8_t;
#elif defined(__unix)
typedef signed char int8_t; /* plain char may be unsigned on some platforms */
#endif
There's also quite a bit of code that uniformizes certain lower-level function calls, for example:
#if defined(_MSC_VER)
#define SplitPath(Path__,Drive__,Dir__,Name__,Ext__) _splitpath(Path__,Drive__,Dir__,Name__,Ext__)
#elif defined(__unix)
#define SplitPath(Path__,Drive__,Dir__,Name__,Ext__) UnixSplitPath(Path__,Drive__,Dir__,Name__,Ext__)
#endif
Now in this case, I believe I had to write a UnixSplitPath() function - there will be times when you need to. But most of the time, you just have to find the correct replacement function. In my code, I'll call SplitPath(), even though it's not a native function on either platform; the #defines will sort it out for me. It takes a while to train yourself.
Believe it or not, my .h file is only 240 lines long. There's really not much to it. And that includes handling endian issues.
Some of the lower-level stuff will need conditional compilation. For example, in Windows, I use Critical Sections, but in Linux I need to use pthread_mutex's. CriticalSection's were encapsulated in a class, and this class has a good deal of conditional compilation. However, the upper-level program is totally unaware, the class functions exactly the same regardless of the platform.
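A sketch of what such an encapsulating class might look like. This is not the original code from the answer, just one possible shape of it; the class and method names are assumptions:

```cpp
#include <cassert>
#if defined(_WIN32)
#  include <windows.h>
#else
#  include <pthread.h>
#endif

// One lock class, two implementations selected at compile time.
class CriticalSection {
public:
    CriticalSection() {
#if defined(_WIN32)
        InitializeCriticalSection(&cs_);
#else
        const int rc = pthread_mutex_init(&mutex_, nullptr);
        assert(rc == 0);
        (void)rc;  // silences the unused warning when NDEBUG is defined
#endif
    }
    ~CriticalSection() {
#if defined(_WIN32)
        DeleteCriticalSection(&cs_);
#else
        pthread_mutex_destroy(&mutex_);
#endif
    }
    CriticalSection(const CriticalSection&) = delete;
    CriticalSection& operator=(const CriticalSection&) = delete;

    void Enter() {
#if defined(_WIN32)
        EnterCriticalSection(&cs_);
#else
        pthread_mutex_lock(&mutex_);
#endif
    }
    void Leave() {
#if defined(_WIN32)
        LeaveCriticalSection(&cs_);
#else
        pthread_mutex_unlock(&mutex_);
#endif
    }

private:
#if defined(_WIN32)
    CRITICAL_SECTION cs_;
#else
    pthread_mutex_t mutex_;
#endif
};

// RAII guard: the upper-level code only ever sees this, never the #if blocks.
class ScopedLock {
public:
    explicit ScopedLock(CriticalSection& cs) : cs_(cs) { cs_.Enter(); }
    ~ScopedLock() { cs_.Leave(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    CriticalSection& cs_;
};
```

One caveat: a Windows critical section can be re-entered by the thread that owns it, while a default pthread mutex cannot, so the two branches are only equivalent if the code never locks recursively. With C++11 or later, std::mutex makes this particular wrapper unnecessary.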
The other secret I can give you is: build your project on all platforms often (particularly at the beginning). It is a lot easier when you nip the compiler problems in the bud. Don't wait until you're done development before you attempt to go cross-platform.
Stick to ANSI C++ and libraries that are cross-platform and you should be fine.
Create some low-level layer that will contain all the platform-specific code in your project. Implement 2 versions of this layer - one for Windows, and one for Linux - with the same interface, and build them to 2 libraries. Access all platform-specific functionality in your project through that interface.
This layer can contain general classes for file access, printing, GUI, etc.
All the (now non-platform-specific) code that uses that layer can now be compiled once on Windows and once on Linux.
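A tiny sketch of such a layer, squeezed into one file for brevity. In a real project the two implementations would normally live in separate source files behind a shared header; the namespace and function names here are invented for the example:

```cpp
#if defined(_WIN32)
#  include <windows.h>
#else
#  include <unistd.h>
#endif

namespace platform {

// The interface the rest of the program uses: process_id() and path_separator().
// Only the bodies below differ between platforms.
#if defined(_WIN32)

inline unsigned long process_id() {
    return static_cast<unsigned long>(GetCurrentProcessId());
}
constexpr char path_separator() { return '\\'; }

#else

inline unsigned long process_id() {
    return static_cast<unsigned long>(getpid());
}
constexpr char path_separator() { return '/'; }

#endif

}  // namespace platform
```

Code above this layer calls platform::process_id() and never includes windows.h or unistd.h itself.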
Compile it in Windows and again in Linux. Unless you used platform specific libraries, it should work. It's not like Java, where you compile it once and it works everywhere. No one has made a virtual machine for C++, and probably never will. The code you write in C++ will work in any platform. You just have to compile it in every platform first.
Suggestions:
Use typedef's for ints. Or #include <stdint.h>. Some machines think int is 8 bytes, some 4. (It used to be 2 and 4. How the times have changed.)
Use encapsulation wherever possible. My last Windows compiler wanted %I64d where others use %lld, gave screwy return values for vsnprintf(), had similar issues with close() and sockets, etc.
Watch out for stack size / buffer size limits. I've run into an 8k UDP buffer limit under Windows, amongst other problems.
For some reason, my Windows C++ compiler wouldn't accept dynamically-sized allocations off the stack. E.g.: void foo(int a) { int b[a]; } Be aware of those sort of things. Plan how you will recode.
#ifdef can be your best friend. And your worst enemy! (At the same time!)
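A few of the suggestions above can be handled with standard headers alone, assuming a C++11 or later compiler. The sketch below uses the fixed-width types from <cstdint>, the PRId64 format macro from <cinttypes> instead of hard-coding %lld or %I64d, and std::vector in place of int b[a] (variable-length arrays come from C99 and are not standard C++, which is why some compilers reject them). The helper names are made up for the example:

```cpp
#include <cassert>
#include <cinttypes>  // PRId64
#include <climits>    // CHAR_BIT
#include <cstddef>
#include <cstdint>    // std::int8_t, std::int64_t
#include <cstdio>     // std::snprintf
#include <string>
#include <vector>

// Fixed-width types have the same width wherever they are provided.
static_assert(sizeof(std::int8_t) * CHAR_BIT == 8, "int8_t is 8 bits");
static_assert(sizeof(std::int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

// Formats a 64-bit integer; PRId64 expands to the right length modifier
// for the platform's printf family.
std::string format_i64(std::int64_t value) {
    char buf[32];
    const int n = std::snprintf(buf, sizeof buf, "%" PRId64, value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}

// Portable replacement for `int b[a];`: a zero-filled, runtime-sized buffer.
std::vector<int> make_buffer(int a) {
    assert(a >= 0);
    return std::vector<int>(static_cast<std::size_t>(a), 0);
}
```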
It can certainly be done. But compile and test early and often!
Also, Linux and Windows have different data models: 64-bit Linux uses LP64 (long is 64 bits), while 64-bit Windows uses LLP64 (long stays 32 bits).
See article: The forgotten problems of 64-bit programs development
Standard C++ is code that compiles without errors on any platform.
Try using Bloodshed Dev C++ on windows (instead of VC++ / Borland C++).
Bloodshed Dev C++ conforms to the C++ standard, so programs that compile with it will, in most cases, compile on Linux without errors.