How do you create a freestanding C++ program? - c++

I'm just wondering how you create a freestanding program in C++?
Edit: By freestanding I mean a program that doesn't run in a hosted envrioment (eg. OS). I want my program to be the first program the computer loads, instead of the OS.

Have a look at this article:
http://www.codeproject.com/KB/tips/boot-loader.aspx
You would need a little assembly start-up code to get you as far as main() but then you could write the rest in C++. You'd have to write your own heap manager (new/delete) if you wanted to create objects at runtime and your own scheduler if you wanted more than one thread.

See this page: http://wiki.osdev.org/C++
It has everything necessary to start writing an OS using c++ as the core language using the more popular toolchains.
In addition this page should prove to be very helpful: http://wiki.osdev.org/C++_Bare_Bones. It pretty much walks you through getting to the c++ entry point of an OS.

Legacy Systems
Even with your clarification, the answer is that it depends -- the exact boot sequence depends on the hardware -- though there's quite a bit of commonality. The boot loader is typically loaded at an absolute address, and the file it's contained in is frequently read into memory exactly as-is. This means instead of a normal linker, you typically use a "linking locator". Where a typical linker produces an executable file ready for relocation and loading, a locator produces an executable that's already set up to run at one exact address, with all relocations already applied. For those old enough to remember them, it's typically pretty much like an MS-DOS .COM file.
Along with that, it has to (of course) statically link the whole run-time that the program depends upon -- it can't depend on something like a DLL or shared object library, because the code to load either of those hasn't itself been loaded yet.
EFI/UEFI
Current PCs (and Macs) use EFI/UEFI. I'm going to just refer to UEFI throughout the remainder of this article, but most of it applies about equally to EFI as well (but UEFI is much more common).
These provide quite a bit more support for boot code. This includes drivers for most devices (it supports installing device drivers), so your boot code can use networking and such, which is much more difficult to support in legacy mode.
Bootable code under EFI uses the same PE format as Windows executables. Libraries are also available so quite a bit of boot code can be written much more like normal code that runs inside an OS. I won't try to get into a lot of detail, but here are links to some information.
https://www.intel.com/content/www/us/en/developer/articles/tool/unified-extensible-firmware-interface.html
https://www.intel.com/content/dam/doc/guide/uefi-driver-network-boot-devices-guide.pdf
https://www.intel.com/content/dam/www/public/us/en/documents/guides/bldk-v2-uefi-standard-based-guide.pdf
And perhaps the most important one--the development kit:
https://github.com/tianocore/edk2

google 'embedded c++' for a start
Another idea is to start with the embedded systems emulators, for example the atmel AVR site has a nice IDE the emulates atmel AVR systems, and allows you to build raw code in C and load it into an emulated CPU, they use gcc as toolchain (I think)

C++ is used in embedded systems programming, even to write OS kernels.
Usually you have at least a few assembler instructions early in the boot sequence. A few things are just easier to express that way, or there may be reference code from the CPU vendor you need to use.
For the initial boot process, you won't be able to use the standard library. No exceptions, RTII, new/delete. It's back to "C with classes". Most people just use C here.
Once you have enough supporting infrastructure loaded though, you can use whatever parts of the standard library you can port.

You will need an environment that provides:
A working C library, or enough of it to do what you want
The parts of the C++ runtime that you intend to use. This is compiler-specific
In addition to any other libraries. If you have no dynamic linker on your platform (if you have no OS, you probably have no linker) then you will have to static-link it all.
In practice this means linking some small C++ runtime, and a C library appropriate for your platform. Then you can simply write a standalone C++ program.

If you were using BSD Unix, you would link with the standalone library. That included a basic IO system for disk and tty. Your source code looked the same as if it were to be run under Unix, but the binary could be loaded into a naked machine.

Yes, of course. The ISO Standard for C and C++ support both hosted and free-standing environments, and the macro STDC_HOSTED is used to distinguish between the two. This has been the case since 2011.
Since most of the replies I see here are in the dark on the fact that this is all standard terminology (and is supposed to be common knowledge for C and C++ users), I'll recap, here, the distinction between the two, laid out by the ISO, and refer you to the following link for more details:
https://en.cppreference.com/w/cpp/freestanding
Hosted:
"main" is defined, and is started up in the main thread. The static objects may be constructed and destructed at the stand and end of the thread. Normally that means there is a host operating system (OS) that provides all the infrastructure required for an environment to run the program in. The POSIX standard, in particular, spells out a set of conditions expected for utilities that are called on a host system from the command-line; and the ISO C/C++ standards dovetail into each other and into the POSIX standard. (POSIX's support for C is at C99 at minimum.)
Free-Standing:
"main" may, but need not, be defined. Whether/how constructors are applied at start-up to static objects, and destructors to static objects on termination, depends on the implementation.
It's any environment where the routine made by the C or C++ program is not being called by some OS, but runs on its own. That's the typical case for embedded systems.
"Free-standing", of course, also includes OS's themselves, as a special case. An OS kernel is the epitome of a free-standing program. One of the respondents here actually went as far as to provide a link to a roll-your-own-OS kit for C++. In that vein, it would be an interesting exercise for you to test out the concept by rewriting the old 0.96 version of Linus' OS (which you can find in UNIX Archive) as a free-standing C++ program. His source code is actually filled with C++'isms (especially in the "fs" directory) that practically scream out "I'm a virtual function", "I'm a base class", "I'm a derived class" or "This is class inheritance"!
Libraries Required For Free-Standing Programs:
There are also minimum requirements on which of the standard libraries must be supported - and how much. Worthy of note: <new>, <exception> are mandatory, only partial support is required for <cstdlib> for start-up and termination, <atomic> (since 2011), <bit> and <coroutine> (both since 2020) are mandatory, as they had better be, if you're doing anything with embedded systems! <thread> is not mandatory.
A roll-your-own-OS kit would then provide templates for the mandatory libraries, and the required parts of the other semi-mandatory libraries. Any responsible embedded systems developer will be laying out their run-time system and environment, and compilers geared toward such users will be providing the under-the-box transparency needed to allow this to be done and to allow user-defined run-time systems to interface cleanly with the compiler.

Related

How does the C++ standard library work behind the scenes?

This question has been bothering me so much for the past couple of days. I was wondering how the standard library works, in terms of functionality. I couldn't find an answer anywhere, even by checking the source code provided by the LLVM compiler which is, for a beginner like me, a really complicated piece of code.
What I'm basically trying to understand here is how does the C++ standard library work. For example let's take the fstream header file which consist of a bunch of functions that help to write to and read from files.
How does it work? Does it use the OS specific API (since the library is cross platform), or what? And, if the standard library can do it, aren't I supposed to be able to mess with some files as well without calling the standard fstream file (which to my experience I can't do)?
I apologize if my questions are unclear since I'm not a native English speaker: feel free to modify this text so as to make it clearer.
Does it use the OS specific API (since the library is cross platform), or what?
At some point, the OS specific API is used. The fstream implementation does not necessarily call an OS function directly. It might use other classes, which call functions inherited from C, etc., but eventually the call chain will lead to an OS call. (Yes, the details are often too complicated for an intermediate programmer to follow. So, as a self-described beginner, your findings are not surprising.)
The library is cross-platform in the sense that on your end (the C++ programmer), the interface is the same regardless of platform. It is not, however, the same library on every platform. Each platform has its own library, exposing the same interface on the C++ side, but making use of different OS calls. (In fact, the same platform might have multiple standard libraries, as the library implementation is provided by your toolchain, not by the standards committee.)
And, if the standard library can do it, aren't I supposed to be able to mess with some files as well without calling the standard fstream file (which to my experience I can't do)?
Yes, you are allowed to. Apparently, you have not been able to yet, but with some practice and guidance you should be able to. Everything in the standard library can be recreated in your own code. The point of the standard library (and most libraries, for that matter) is to save you time, not to enable something that was otherwise unavailable. For example, you don't have to implement a file stream for every program you write; it's in the standard library so you can focus on more interesting aspects of your project.
A compiler is just a program which create executable file or library. You can use the compiler default libraries to gain time or write your own. The default libraries communicate with the os for file operation or memory allocation and provide a simple standard classes to allow the developper to write only one code which work on all target platforms supported by the compiler and the libraries. If you want to write your own you have to write each function for all your target os.
The standard library is cross-platform in a sense that its interface does not change between platforms but its implementation does - or in practical terms - if you only use C++ and its standard library, you can write your code the same way for Linux / Windows / MacOS / Android / Whatever and if you find a C++ compiler for one of those platforms that supports the language features you used, you will be able to compile your code for that platform without rewriting anything.
So while you can use std::vector or std::fstream or any other feature in the library independently of the platform you're writing for and expect the function definitions, type names, etc. to look the same, you cannot expect the executable which you compiled for PC with Windows 10 to run on a phone with Android. You cannot even expect the same executable to run on the same PC but with different system - that is what I mean by "the implementation is different"
There are two main reasons for this difference:
Processors with different architectures (x86-64 and ARM for example) use different instruction sets and as such the C++ source would need to be compiled to a completely different machine code to run properly
Computers with processors of the same architecture which have a different operating system have different ways of dynamically allocating memory, creating files, creating streams, writing to console, creating and scheduling threads etc. - which is part of the system functionality that you use via the standard library
If you really wanted to you could use HeapAlloc() instead of operator new() or CreateThread() instead of stdlib's std::thread but that would force you to both rewrite your program every time you wanted to compile it for something else than Windows and recompile it with the target platform's compiler (and by proxy learn its API). Standard library saves you from that trouble by abstracting away those system calls.
As for the fstream in particular, here is what it uses internally on most PCs nowadays.
Basically, fstream, iostream and printf works based on a kernel function write(). When your code call printf (we use printf as an example), it will finally call write() to let the kernel work on the IO stuff. After that, write() returns and printf returns and your code continues.
So if you really want to know how the printf works internally, you have to read the source code of the Kernel.
But you shouldn't do that for now.
For a beginner, do not try to go deeper when you haven't got a basic cognition about computer. A computer is a project, just like a building. So the right way to learn it is to learn it level by level. First, learning how to use brick and cement to build a building, this is what you should do for now. What you shouldn't do is that you are learning how to build a building and this is your first time to try to use brick, then you are interested in how to produce a brick and start to focus on brick, this is a wrong way to learn IT.
If you are learning C/C++, just learn it. Remember, learn it level by level. For now, knowing how to use printf is enough.

In C++, how does the system associate each of these objects with the window in which the program is executed?

Essentially, I am learning C++. Anyways, I have bought this book called C++ Primer 5th Edition, and in the first unit, when learning about istream and outstream, I came across this statement,
"Ordinarily, the system associates each of these objects with the window in which the program is executed."
Can someone please explain what this means, and how the system does this.
First, your book is wrong, or at least very simplifying. For example, you could compile and run valid and standard C++ code on a Linux server (that you have rented from some VPS provider) or a supercomputer and such a server has no screens and no windows. And the C++ code (compiled) that you use on servers running in datacenters at Google or Facebook (probably the most common computations you do) has no "windows".
Or you could cross-compile some C++ code for a small microcontroller like an arduino which has no screens and no windows.
Then, even if you assume to compile then run some C++ code on a laptop (running an ordinary operating system like Linux, Window, MacOSX, Android....) there are many layers of software involved (cumulating many millions of lines of code).
If you want to understand more, read about display servers, terminal emulators, processes and about operating systems.
You C++ implementation gives you (thru its C++ standard library and many other layers of software) the abstraction of console streams. How these are really implemented is a complex topic. Windows are unknown to the C++14 standard (but consider using a complex GUI framework library like Qt if you need some).
If you have access to some Linux system, try to strace(1) a hello-world program in C++ and read the tty demystified page; you'll be surprised by the amounts of system calls involved (and your terminal emulator is much more complex than hello-world, and so is the display server).
PS. At Paris 6 university, an full-year course (of several hours per week) is required to explain only the basics of the answer to your question
Magic.
No, I'm serious. The C++ standard dictates that the compiler needs to provide a certian environment to the C++ program. This environment is rather abstract -- it is called the abstract machine.
Specific compilers decide to attach that abstract machine to particular behaviors on the actual machine they run on. How this happens is outside the C++ language.
In practice, what happens is that a given C++ executable is compiled to run in a given operating system (and usually on given kinds of compatible hardware).
The operating system knows how to load your executable into memory, and "wires up" the parts of the executable to its own services (or directly to hardware services) that provide features, like console output/input, memory allocation, threading, etc etc etc.
The compiler was written with the ability to write that executable in mind. The operating system was written with the ability to load that executable in mind.
In some cases, the "operating system" is firmware on a chip or motherboard, and "loading the executable" consists of initializing a ROM with the bits that come out of the compiler.
In more common cases, the "operating system" is a modern desktop operating system. It specifies an executable layout, and provides ways for such an executable to talk to the OS/kernel/screen/etc.
The compiler could have libraries (dynamic or not) that are written to help the program interact with the operating system of your computer, which it links to your program implicitly.
The operating system itself is going to be written, usually, in a mixture of machine code, assembly and higher level languages (like C, C++ or even Java). It will often be running in a "different mode" than client code is running in, and have access to a very different environment.
In short, Magic.
C++ does not specify how this happens, it just demands the compiler do it. These things are usually not fully implemented in C++.

Can an x86 executable run on any x86 platform given the right runtime libraries?

While I did find similar-ish questions, they did not really answer this specific question.
Can a compiled x86 executable run on any x86 platform given the right runtime libraries?
Say I make a C++17 program without dependencies, could I run this program on Windows 95 or is there some sort of support required by the OS?
I also heard that RTTI (in the case of C++) may not be supported everywhere, is this only due to the processor having to support this feature or does the OS play a role in that? This would imply that new features would maybe not be supported by, e.g., Windows 95.
Edit
What I'm after is whether an executable (e.g., x86) can run on any platform supporting that instruction set or wether certain features, like RTTI, need specific OS support and thus are not available on all platforms supporting that instruction set.
In general you cannot, even if you restricted your universe to x86 hardware - at least not without some conversion of the binary or some platform-specific "loader" for each target platform.
For exmaple a typical binary emitted by a C or C++ compiler1 will have some minimal dependency on the OS and runtime, for example to load and do runtime linking on the executable. Different platforms have different binary formats (such as PE/COFF on Windows or ELF across various UNIX flavors and Linux) and there isn't any common "x86 format" that would work directly on any platform.
Furthermore, any non-trivial program and in many cases any program, trivial or not, is going to have platform-specific dependencies on the the langauge runtime. For example, even an empty main() function often requires runtime support to get from the OS-defined "start" method to the main method, and without unusual build options there are often calls at startup to initialize parts of the standard library.
Finally, as you alluded to with your comment about RTTI, various language or platform features may essentially be compiled into the binary and require OS support. RTTI probably doesn't obviously fall into this category, but things like position-independent code, thread-local storage and stack-unwinding support for exception handling often do. The compiled x86 code that uses such features may be quite different on different platforms since it needs to build in assumptions of how those work.
In principle, however, you could imagine this working, at least for some limited subset of programs. For example, while the various executable formats are in practice incompatible, they aren't that different and tools exist to convert between them. So you could certainly implement a minimal runtime on your platform of interest that takes an x86 executable compiled to whatever fixed format you choose and converts at runtime to the local format and runs it.
Beyond that actually trying to map even standard library calls would be quite difficult since different operating systems using different calling conventions, but it could be possible for "C" functions using some thunks to put things in the right place. C++ is pretty much right out because the ABI there is much more complex, compiler-and-platform specific and much of the implementation detail is already compiled-in for stuff implemented in headers.
In fact, the idea that (a subset of) x86 might provide a interesting intermediate language for cross-platform execution is exactly the idea behind exploited in Google's [NaCl project]. Essentially, the NaCl runtime provides platform agnostic "loading" capabilities which allow x86 code to run more-or-less natively on various platforms. Subsequently other native formats such as ARM were added, but it started as an x86 sandbox. A large part of the project deals with running code that provably safe (i.e., sandboxed) - but it shows that with some infrastructure you can write "portable" x86. A standard C or C++ compiler isn't going to emit NaCl compatible code directly, however.
1 Really, any compiler that compiles to a native format. I just call out C and C++ since they seem like the ones you are interested in and are widely familiar.
This question misses the point. C++ is, first and foremost, a language to describe the behaviour of a computer program.
Using a compiler to create a native binary executable file to produce that behaviour on an actual computer is the typical way of using the language.
Once you have the binary file, all traces of the source code used to produce it are gone (unless you have built a special version for debugging purposes). The compatibility of the binary file with specific hardware or operating systems is beyond the scope of C++ itself.
The same is true for C, or any other programming language which typically gets compiled to native binary code.
Or, to answer the question more briefly:
Can compiled C++/C code (i.e. an executable) run anywhere given the right runtime libraries?
No.
Can a compiled x86 executable run anywhere given the right runtime libraries?
No, it will only work on x86 hardware, or other hardware (or software, such as a virtual machine) that emulates the x86 instruction set (such as a x64 CPU). In practice, that's very likely to be a far cry from "anywhere."
And even if the hardware matches, an x86 executable will have operating system dependencies. A Windows binary won't run on Linux, even if the hardware is the same. There are various strategies that can make things like this "work" in some situations, Microsoft's Linux Subsystem for Windows is one recent example which allows Linux binaries to run unchanged on Windows. Again, a fry cry from "anywhere."

Are C++ applications cross-platform?

One of the first things I learned as a student was that C++ applications don't run on different operating systems. Recently, I read that Qt based C++ applications run everywhere. So, what is going on? Are C++ applications cross-platform or not?
Source code compatible. If I compile the source code, will it run everywhere?
API/ABI compatibility. Does the OS provide the interface to its components in a way that the code will understand?
Binary compatibility. Is the code capable of running on the target host?
Source code compatible
C++ is a standard which defines how structures, memory, files can be read and written.
#include <iostream>
int main( int argc, char ** argv )
{
std::cout << "Hello World" << std::endl;
}
Code written to process data (e.g. grep, awk, sed) is generally cross-platform.
When you want to interact with the user, modern operating systems have a GUI, these are not cross-platform, and cause code to be written for a specific platform.
Libraries such as qt or wxWidgets have implementations for multiple platforms and allow you to program for qt instead of Windows or iOS, with the result being compatible with both.
The problem with these anonymizing libraries, is they take some of the specific benefits of platform X away in the interest of uniformity across platforms.
Examples of this would be on Windows using the WaitForMultipleObjects function, which allows you to wait for different types of events to occur, or the fork function on UNIX, which allows two copies of your process to be running with significant shared state. In the UI, the forms look and behave slightly different (e.g. color-picker, maximize, minimize, the ability to track mouse outside of your window, the behaviour of gestures).
When the work you need to be done is important to you, then you may end up wanting to write platform specific code to leverage the advantages of the specific application.
The C library sqlite is broadly cross-platform code, but its low-level IO is platform specific, so it can make guarantees for database integrity (that the data is really written to disk).
So libraries such as Qt do work, they may produce results which are unsatisfactory, and you end up having to write native code.
API/ABI compatibility
Different releases of UNIX and Windows have some form of compatibility with each other. These allow a binary built for one version of the OS to run on other versions of the OS.
In UNIX the choice of your build machine defines the compatibility. The lowest OS revision you wish to support should be your build machine, and it will produce binaries compatible with subsequent minor versions until they make a breaking change (deprecate a library).
On Windows and Mac OS X, you choose an SDK which allows you to target a set of OS's with the same issues with breaking changes.
On Linux, each kernel revision is ABI incompatible with any other, and kernel modules need to be re-compiled for each kernel revision.
Binary compatibility
This is the ability of the CPU to understand the code. This is more complex than you might think, as the x64 chips, can be capable (depending on OS support) of running x86 code.
Typically a C++ program is packaged inside a container (PE executable, ELF format) which is used by the operating system to unpack the sections of code and data and to load libraries. This makes the final program have both binary (type of code) and API (format of the container) forms of incompatibilities.
Also today if you compile a x86 Windows Application (targeting Windows 7 on Visual Studio 2015), then the code may fail to execute if the processor does not have SSE2 instructions (about 10 years old CPU).
Finally when Apple changed from PowerPC to x86, they provided an emulation layer which allowed the old PowerPC code to run in an emulator on the x86 platform.
So in general binary incompatibility is a murky area. It would be possible to produce an OS which identified invalid instructions (e.g. SSE2) and in the fault, emulated the behaviour, this could be updated as new features come out, and keeps your code running, even though it is binary incompatible.
Even if your platform is incapable of running a form of instruction set, it could be emulated and behave compatibly.
Standard C++ is cross platform in the "write once, compile anywhere" sense, but not in the "compile once, run anywhere" sense.
That means that if you write a program in standard C++, you can compile and then run it on any target environment that has a standard conforming implementation of C++.
You can however not compile your program on your machine, ship the binary and then expect it to work on other targets. (At least not in general. One can of course distribute binaries from C++ code under certain conditions, but those depend on the actual target. This is a broad field.)
Of course, if you use extra, non-standard features like gcc's variable length arrays or third party libraries, you can only compile on systems that provide those extensions and libraries.
Some libraries like Qt and Boost are available on many systems (those two on Linux, Mac and Windows at least I believe), so your code will stay cross platform if you use those.
You can achieve that your source compiles on various platforms, giving you various binaries from the same source base.
This is not "compile once, run anywhere with an appropriate VM" as Java or C# do it, but "write once, compile anywhere with an appropriate environment" the way C has done it all the time.
Since the standard library does not provide everything you might need, you have to look for third-party libraries to provide that functionality. Certain frameworks -- like Boost, Qt, GTK+, wxWidgets etc. -- can provide that. Since these frameworks are written in a way that they compile on different platforms, you can achieve cross-platform functionality in the aforementioned sense.
There are various things to be aware of if you want your C++ code to be cross-platform.
The obvious thing is source that makes assumption on data types. Your long might be 32bit here and 64bit there. Data type alignment and struct padding might differ. There are ways to "play it safe" here, like size_t / size_type / uint16_t typedefs etc., and ways to get it wrong, like wchar_t and std::wstring. It takes discipline and some experience to "get it right".
Not all compilers are created equal. You cannot use all the latest C++ language features, or use libraries that rely on those features, if you require your source to compile on other C++ compilers. Check the compatibility chart first.
Another thing is endianess. Just one example, when you're writing a stream of integers to file on one platform (say, x86 or x86_64), and then read it back again on a different platform (say, POWER), you can run into problems. Why would you write integers to file? Well, UTF-16 is integers... again, discipline and some experience go a long way toward making this rather painless.
Once you've checked all those boxes, you need to make sure of the availability of the libraries you base your code on. While std:: is safe (but see "not all compilers are created equal" above), something as innocent as boost:: can become a problem if you're looking beyond the mainstream. (I helped the Boost guys to fix one or two showstoppers regarding AIX / Visual Age in past years simply because they didn't have access to that platform for testing new releases...)
Oh, and watch out for the various licensing schemes out there. Some frameworks that improve your cross-platform capabilities -- like Qt or Cygwin -- have their strings attached. That is not to say they are not a big help in the right circumstances, just that you need to be aware of copyleft / proprietary licensing requirements.
All that being said, there is Wine ("Wine is not emulation"), which makes executables compiled for Windows run on a variety of Unix-alike systems (Linux, OS X, *BSD, Solaris). There are certain limits to its capabilities, but it's getting better all the time.
Yes. No. Maybe. What is cross-platform C++ code? Cross-platform C++ code is such a code that can be compiled under different operation systems without the need to be modified.
That means, if you explicitly use any platform-dependant headers, your code is no longer cross-platform. Qt solves this problem in the following way: they provide wrappers for everything that is platform-specific. For example, imagine that you are using QFile to open/read/write a file. Your code looks like
QFile file(filename);
file.open(QFile::ReadOnly);
//other stuff
You can compile this code under any OS as long as you have a suitable compiler and Qt libraries for that OS. The code hidden under QFile will use the OS-appropriate file-handling functions, but that shouldn't concern you.
Also, if you only use the standard library, your code can be compiled anywhere where a C++ compiler is present.
The already-compiled applications, however, are not cross-platform in a way that, say, Java applications are - for instance, you can't compile an app for Windows and then run in in Linux, you will have to recompile your code under Linux instead.
C++ is cross-platform. You can use it to build applications that will run on many different operating systems.
What is not cross-platform is the compilers that translate C++ into object code. No single compiler, to my knowledge, has all the necessary features so that when you use it to compile a C++ program, it will automatically run on Windows, Linux and Mac OS.
Qt Creator is integrated with multiple compilers and has build automation. It makes it easy to switch between different setups and target platforms. It provides support for building, running and deploying C++ applications not only for desktop environments but also for mobile devices.
C++ is a programming language. Text. As such, it doesn't run anywhere.
Conforming Standard C++ code is expected to behave equally on any platform; "cross-platform" if you want. Writing (strictly) conforming C++ code requires pedantry because some assumptions often made have dependencies on details that are final to the actual implementation and this is inherited from the targets C++ itself aims to.
Notice we're still talking about C++ code, not C++ programs. Indeed, when we pass to term "program", we've no more guarantees because we aren't talking about C++ anymore; rather, the output of the compiler. This is where portability begins to fade away: executable format, ISA, ABI, low-level routines and so on.
Can you rely on that? If you can't, then you need to integrate your C++ program in the environment it will run on, by recompiling it or using platform-specific elements.

How does a language expand itself?

I am learning C++ and I've just started learning about some of Qt's capabilities to code GUI programs. I asked myself the following question:
How does C++, which previously had no syntax capable of asking the OS for a window or a way to communicate through networks (with APIs which I don't completely understand either, I admit) suddenly get such capabilities through libraries written in C++ themselves? It all seems terribly circular to me. What C++ instructions could you possibly come up with in those libraries?
I realize this question might seem trivial to an experienced software developer but I've been researching for hours without finding any direct response. It's gotten to the point where I can't follow the tutorial about Qt because the existence of libraries is incomprehensible to me.
A computer is like an onion, it has many many layers, from the inner core of pure hardware to the outermost application layer. Each layer exposes parts of itself to the next outer layer, so that the outer layer may use some of the inner layers functionality.
In the case of e.g. Windows the operating system exposes the so-called WIN32 API for applications running on Windows. The Qt library uses that API to provide applications using Qt to its own API. You use Qt, Qt uses WIN32, WIN32 uses lower levels of the Windows operating system, and so on until it's electrical signals in the hardware.
You're right that in general, libraries cannot make anything possible that isn't already possible.
But the libraries don't have to be written in C++ in order to be usable by a C++ program. Even if they are written in C++, they may internally use other libraries not written in C++. So the fact that C++ didn't provide any way to do it doesn't prevent it from being added, so long as there is some way to do it outside of C++.
At a quite low level, some functions called by C++ (or by C) will be written in assembly, and the assembly contains the required instructions to do whatever isn't possible (or isn't easy) in C++, for example to call a system function. At that point, that system call can do anything your computer is capable of, simply because there's nothing stopping it.
C and C++ have 2 properties that allow all this extensibility that the OP is talking about.
C and C++ can access memory
C and C++ can call assembly code for instructions not in the C or C++ language.
In the kernel or in a basic non-protected mode platform, peripherals like the serial port or disk drive are mapped into the memory map in the same way as RAM is. Memory is a series of switches and flipping the switches of the peripheral (like a serial port or disk driver) gets your peripheral to do useful things.
In a protected mode operating system, when one wants to access the kernel from userspace (say when writing to the file system or to draw a pixel on the screen) one needs to make a system call. C does not have an instruction to make a system calls but C can call assembler code which can trigger the correct system call, This is what allows one's C code to talk to the kernel.
In order to make programming a particular platform easier, system calls are wrapped in more complex functions which may perform some useful function within one's own program. One is free to call the system calls directly (using assembler) but it is probably easier to just make use of one of the wrapper functions that the platform supplies.
There is another level of API that are a lot more useful than a system call. Take for example malloc. Not only will this call the system to obtain large blocks of memory but will manage this memory by doing all the book keeping on what is take place.
Win32 APIs wrap some graphic functionality with a common platform widget set. Qt takes this a bit further by wrapping the Win32 (or X Windows) API in a cross platform way.
Fundamentally though a C compiler turns C code into machine code and since the computer is designed to use machine code, you should expect C to be able to accomplish the lions share or what a computer can do. All that the wrapper libraries do is do the heavy lifting for you so that you don't have to.
Languages (like C++11) are specifications, on paper, usually written in English. Look inside the latest C++11 draft (or buy the costly final spec from your ISO vendor).
You generally use a computer with some language implementation (You could in principle run a C++ program without any computer, e.g. using a bunch of human slaves interpreting it; that would be unethical and inefficient)
Your C++ implementation general works above some operating system and communicate with it (using some implementation specific code, often in some system library). Generally that communication is done thru system calls. Look for instance into syscalls(2) for a list of system calls available on the Linux kernel.
From the application point of view, a syscall is an elementary machine instruction like SYSENTER on x86-64 with some conventions (ABI)
On my Linux desktop, the Qt libraries are above X11 client libraries communicating with the X11 server Xorg thru X Windows protocols.
On Linux, use ldd on your executable to see the (long) list of dependencies on libraries. Use pmap on your running process to see which ones are "loaded" at runtime. BTW, on Linux, your application is probably using only free software, you could study its source code (from Qt, to Xlib, libc, ... the kernel) to understand more what is happening
I think the concept you are missing is system calls. Each operating system provides an enormous amount of resources and functionality that you can tap into to do low-level operating system related things. Even when you call a regular library function, it is probably making a system call behind the scenes.
System calls are a low-level way of making use of the power of the operating system, but can be complex and cumbersome to use, so are often "wrapped" in APIs so that you don't have to deal with them directly. But underneath, just about anything you do that involves O/S related resources will use system calls, including printing, networking and sockets, etc.
In the case of windows, Microsoft Windows has its GUI actually written into the kernel, so there are system calls for making windows, painting graphics, etc. In other operating systems, the GUI may not be a part of the kernel, in which case as far as I know there wouldn't be any system calls for GUI related things, and you could only work at an even lower level with whatever low-level graphics and input related calls are available.
Good question. Every new C or C++ developer has this in mind. I am assuming a standard x86 machine for the rest of this post. If you are using Microsoft C++ compiler, open your notepad and type this (name the file Test.c)
int main(int argc, char **argv)
{
return 0
}
And now compile this file (using developer command prompt) cl Test.c /FaTest.asm
Now open Test.asm in your notepad. What you see is the translated code - C/C++ is translated to assembler. Do you get the hint ?
_main PROC
push ebp
mov ebp, esp
xor eax, eax
pop ebp
ret 0
_main ENDP
C/C++ programs are designed to run on the metal. Which means they have access to lower level hardware which makes it easier to exploit the capabilities of the hardware. Say, I am going to write a C library getch() on a x86 machine.
Depending on the assembler I would type something this way :
_getch proc
xor AH, AH
int 16h
;AL contains the keycode (AX is already there - so just return)
ret
I run it over with an assembler and generate a .OBJ - Name it getch.obj.
I then write a C program (I dont #include anything)
extern char getch();
void main(int, char **)
{
getch();
}
Now name this file - GetChTest.c. Compile this file by passing getch.obj along. (Or compile individually to .obj and LINK GetChTest.Obj and getch.Obj together to produce GetChTest.exe).
Run GetChTest.exe and you would find that it waits for the keyboard input.
C/C++ programming is not just about language. To be a good C/C++ programmer you need to have a good understanding on the type of machine that it runs. You will need to know how the memory management is handled, how the registers are structured, etc., You may not need all these information for regular programming - but they would help you immensely. Apart from the basic hardware knowledge, it certainly helps if you understand how the compiler works (ie., how it translates) - which could enable you to tweak your code as necessary. It is an interesting package!
Both languages support __asm keyword which means you could mix your assembly language code too. Learning C and C++ will make you a better rounded programmer overall.
It is not necessary to always link with Assembler. I had mentioned it because I thought that would help you understand better. Mostly, most such library calls make use of system calls / APIs provided by the Operating System (the OS in turn does the hardware interaction stuff).
How does C++ ... suddenly get such capabilities through libraries
written in C++ themselves ?
There's nothing magical about using other libraries. Libraries are simple big bags of functions that you can call.
Consider yourself writing a function like this
void addExclamation(std::string &str)
{
str.push_back('!');
}
Now if you include that file you can write addExclamation(myVeryOwnString);. Now you might ask, "how did C++ suddenly get the capability to add exclamation points to a string?" The answer is easy: you wrote a function to do that then you called it.
So to answer your question about how C++ can get capabilities to draw windows through libraries written in C++, the answer is the same. Someone else wrote function(s) to do that, and then compiled them and gave them to you in the form of a library.
The other questions answer how the window drawing actually works, but you sounded confused about how libraries work so I wanted to address the most fundamental part of your question.
The key is the possibility of the operating system to expose an API and a detailed description on how this API is to be used.
The operating system offers a set of APIs with calling conventions.
The calling convention is defining the way a parameter is given into the API and how results are returned and how to execute the actual call.
Operating systems and the compilers creating code for them play nicely together, so you usually have not to think about it, just use it.
There is no need for a special syntax for creating windows. All that is required is that the OS provides an API to create windows. Such an API consists of simple function calls for which C++ does provide syntax.
Furthermore C and C++ are so called systems programming languages and are able to access arbitrary pointers (which might be mapped to some device by the hardware). Additionally, it is also fairly simple to call functions defined in assembly, which allows the full range of operations the processor provides. Therefore it is possible to write an OS itself using C or C++ and a small amount of assembly.
It should also be mentioned that Qt is a bad example, as it uses a so-called meta compiler to extend C++' syntax. This is however not related to it's ability to call into the APIs provided by the OS to actually draw or create windows.
First, there's a little misunderstading, I think
How does C++, which previously had no syntax capable of asking the OS for a window or a way to communicate through networks
There is no syntax for doing OS operations. It's the question of semantics.
suddenly get such capabilities through libraries written in C++ themselves
Well, the operating system is writen mostly in C. You can use shared libraries (so, dll) to call the external code. Additionally, the operating system code can register system routines on syscalls* or interrupts which you can call using assembly. That shared libraries often just make that system calls for you, so you are spared using inline assembly.
Here's the nice tutorial on that: http://www.win.tue.nl/~aeb/linux/lk/lk-4.html
It's for Linux, but the principles are the same.
How the operating system is doing operations on graphic cards, network cards etc? It's a very broad thema, but mostly you need to access interrupts, ports or write some data to special memory region. Since that operations are protected, you need to call them through the operating system anyway.
In an attempt to provide a slightly different view to other answers, I shall answer like this.
(Disclaimer: I am simplifying things slightly, the situation I give is purely hypothetical and is written as a means of demonstrating concepts rather than being 100% true to life).
Think of things from the other perspective, imagine you've just written a simple operating system with basic threading, windowing and memory management capabilities. You want to implement a C++ library to let users program in C++ and do things like make windows, draw onto windows etc. The question is, how to do this.
Firstly, since C++ compiles to machine code, you need to define a way to use machine code to interface with C++. This is where functions come in, functions accept arguments and give return values, thus they provide a standard way of transferring data between different sections of code. They do this by establishing something known as a calling convention.
A calling convention states where and how arguments should be placed in memory so that a function can find them when it gets executed. When a function gets called, the calling function places the arguments in memory and then asks the CPU to jump over to the other function, where it does what it does before jumping back to where it was called from. This means that the code being called can be absolutely anything and it will not change how the function is called. In this case however, the code behind the function would be relevant to the operating system and would operate on the operating system's internal state.
So, many months later and you've got all your OS functions sorted out. Your user can call functions to create windows and draw onto them, they can make threads and all sorts of wonderful things. Here's the problem though, your OS's functions are going to be different to Linux's functions or Windows' functions. So you decide you need to give the user a standard interface so they can write portable code. Here is where QT comes in.
As you almost certainly know, QT has loads of useful classes and functions for doing the sorts of things that operating systems do, but in a way that appears independent of the underlying operating system. The way this works is that QT provides classes and functions that are uniform in the way they appear to the user, but the code behind the functions is different for each operating system. For example QT's QApplication::closeAllWindows() would actually be calling each operating system's specialised window closing function depending on the version used. In Windows it would most likely call CloseWindow(hwnd) whereas on an os using the X Window System, it would potentially call XDestroyWindow(display,window).
As is evident, an operating system has many layers, all of which have to interact through interfaces of many varieties. There are many aspects I haven't even touched on, but to explain them all would take a very long time. If you are further interested in the inner workings of operating systems, I recommend checking out the OS dev wiki.
Bear in mind though that the reason many operating systems choose to expose interfaces to C/C++ is that they compile to machine code, they allow assembly instructions to be mixed in with their own code and they provide a great degree of freedom to the programmer.
Again, there is a lot going on here. I would like to go on to explain how libraries like .so and .dll files do not have to be written in C/C++ and can be written in assembly or other languages, but I feel that if I add any more I might as well write an entire article, and as much as I'd love to do that I don't have a site to host it on.
When you try to draw something on the screen, your code calls some other piece of code which calls some other code (etc.) until finally there is a "system call", which is a special instruction that the CPU can execute. These instructions can be either written in assembly or can be written in C++ if the compiler supports their "intrinsics" (which are functions that the compiler handles "specially" by converting them into special code that the CPU can understand). Their job is to tell the operating system to do something.
When a system call happens, a function gets called that calls another function (etc.) until finally the display driver is told to draw something on the screen. At that point, the display driver looks at a particular region in physical memory which is actually not memory, but rather an address range that can be written to as if it were memory. Instead, however, writing to that address range causes the graphics hardware to intercept the memory write, and draw something on the screen.
Writing to this region of memory is something that could be coded in C++, since on the software side it's just a regular memory access. It's just that the hardware handles it differently.
So that's a really basic explanation of how it can work.
Your C++ program is using Qt library (also coded in C++). The Qt library will be using Windows CreateWindowEx function (which was coded in C inside kernel32.dll). Or under Linux it may be using Xlib (also coded in C), but it could as well be sending the raw bytes that in X protocol mean "Please create a window for me".
Related to your catch-22 question is the historical note that “the first C++ compiler was written in C++”, although actually it was a C compiler with a few C++ notions, enough so it could compile the first version, which could then compile itself.
Similarly, the GCC compiler uses GCC extensions: it is first compiled to a version then used to recompile itself. (GCC build instructions)
How i see the question this is actually a compiler question.
Look at it this way, you write a piece of code in Assembly(you can do it in any language) which translates your newly written language you want to call Z++ into Assembly, for simplicity lets call it a compiler (it is a compiler).
Now you give this compiler some basic functions, so that you can write int, string, arrays etc. actually you give it enough abilities so that you can write the compiler itself in Z++. and now you have a compiler for Z++ written in Z++, pretty neat right.
Whats even cooler is that now you can add abilities to that compiler using the abilities it already has, thus expanding the Z++ language with new features by using the previous features
An example, if you write enough code to draw a pixel in any color, then you can expand it using the Z++ to draw anything you want.
The hardware is what allows this to happen. You can think of the graphics memory as a large array (consisting of every pixel on the screen). To draw to the screen you can write to this memory using C++ or any language that allows direct access to that memory. That memory just happens to be accessible by or located on the graphics card.
On modern systems accessing the graphics memory directly would require writing a driver because of various restrictions so you use indirect means. Libraries that create a window (really just an image like any other image) and then write that image to the graphics memory which the GPU then displays on screen. Nothing has to be added to the language except the ability to write to specific memory locations, which is what pointers are for.