Can C++ be compiled into platform independent code? Why Not? - c++

Is it possible to compile a C++ program into some intermediate stage (similar to bytecode in Java) where the output is platform independent, and then later compile/link it at runtime to run as native (platform-dependent) code?
If the answer is no, why?

It is indeed possible, see for example LLVM.

Of course. Keep in mind that the C++ standard only specifies behavior: What should happen when this program executes. It doesn't specify how it should be implemented.
C++ code can be compiled to an intermediate format and JIT'ed to machine code, or it can be interpreted or anything else you like.

This is trivial, and most compilers already do that. gcc compiles to RTL (register transfer language) which is then translated to the target CPU.
Similarly, managed C++ and C++/CLI are compiled to .NET.
Finally, you can consider the Church-Turing thesis, which is a statement of equivalence of programming languages, so C++ can be compiled/translated to your favorite platform-independent language (say, Perl, Lisp, C--, etc.).

C++ source code (with some restrictions) is a platform-independent bytecode.
Why is it not?
Indeed, the "bytecode" compilation procedure is then mere copying. The virtual machine that runs the "bytecode" is a C++ compiler and a wrapper script. Yes, it does some stuff that resembles compilation to machine code, but that's an implementation detail.
Here's a Linux implementation of such a "C++ virtual machine":
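#!/bin/sh
# Compile the source file given as $1, then run the result with the remaining arguments.
src=$1
shift
tmp=`mktemp`
g++ "$src" -o "$tmp" && "$tmp" "$@"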
Does it answer the question? I think it does, to the extent that the question is specific, because it clearly explains the theoretical possibility of compiling C++ into bytecode. Practical implementations also exist, for example LLVM.

Yes it is technically feasible. A bit of a plug for a former employer, but here's an implementation of exactly that: http://antixlabs.com/products/antixgamedevelopmentkit/. The packaging process is, roughly speaking, C/C++ -> (compiler) -> LLVM -> (backend) -> bespoke bytecode -> zip file. This is platform-independent. Once it's on the user's device the "player" converts bespoke bytecode -> (translator for that device) -> native elf file -> (loader/linker) -> fixed up code.
If the real question is, "does there exist any such industry-standard intermediate format which is widely supported on multiple platforms and suitable for all-purpose use, like Java bytecode?" then the answer is "no".
As for why, I'd say it's because there is no one organisation which has enough influence over C++ programmers, and no true necessity for Java-style deployment of C++ applications. Sun invented Java and a GUI library in one go, presented it to programmers, and didn't introduce the big proliferation of profiles until later.
C++ doesn't even have a standard GUI, and C++ environments are far more fragmented than Java. How do you tell a Windows app developer, a mobile phone developer, a smartcard implementer and a stock exchange backend implementer that they need to ditch their existing toolchain in favour of a platform-independent deployment mechanism for C++? You don't. And that's even before you get to the folks writing OSes and device drivers in C or C++ mixed with assembly. It's simply impossible to come up with a standard environment to support all of them.

The Parrot project will have C++ bytecode compilation and execution (see: parrot).
Visual Studio can compile C++ as bytecode (see: managed C++).

Related

Is There a C++ Command Line?

So Python has a sort of command line thing, and so does Linux bash (obviously), and I'm sure other programming languages do, but does C++? If not, why do C++ scripts have to be compiled first and then run?
If not, why do C++ scripts have to be compiled first and then run?
C++ code does not need to be compiled to be run. There are interpreters.
The reason most of us prefer compiled C++ is that the resulting executable is 'faster'.
Interpreted computer languages can do extra things to achieve similar performance (i.e. just-in-time compile), but generally, 'scripts' are not in the same league of fast.
Some developers think not having to edit, compile, link is a good thing ... just type in code and see what it does.
Anyway, the answer is, there is no reason that C++ "has to" be compiled. It is just the preferred tool for most C++ developers.
Should you want to try out C++ interpreters, search the net for CINT, Ch, and others.
Indeed there are interpreters for C++ that do what you want. Check out Cling.
To the commenters saying C++ can't have interpreters because it's a compiled language: yes, typically you use a compiler with C++. But that doesn't mean it's impossible to write an interpreter for it.
There is no command line for running C++ instructions. The code is compiled first, and target machine code is then generated (intermediate object code, which is linked) to run.
The reason is that this is a matter of language design, driven by considerations such as performance and error recovery. Compiled code is translated directly to target machine code and runs faster than interpreted code. A compiler takes the program as a whole and generates the target machine code, whereas an interpreter takes a few instructions at a time. An interpreted language requires an intermediate program to reach the final machine code, so it may be slow.
In a nutshell, it is a matter of language design evolution. When the first computers appeared, programming was done directly in machine language, and those programs ran instruction by instruction. Later, high-level languages appeared, in which machine language was abstracted behind human-friendly instructions, and compilers were designed to generate the equivalent machine code.
Later still, program design advanced and CPU speeds increased, so we could afford intermediate interpreters for writing safer programs.
The choice is wider now. Earlier, performance-centric applications demanded compiled code; now interpreted code is about as fast in common use cases.
While there are interpreters for C++-like languages, that is not really the point; C++ is a compiled language that is translated to native machine code. Conversely scripting languages are (typically) interpreted (albeit that there are also compilers for scripting languages to translate them to native code).
C++ is a systems-level capable language. You have to ask yourself - if all languages ran in a shell with a command line and were interpreted, what language is that shell or interpreter, or even the OS they are running on written in?
Ultimately you need a systems level language, and those are most often C, C++ and assembler.
Moreover because it is translated to machine level code at compilation, that code runs directly and stand-alone without the presence of any interpreter, and consequently can be simpler to deploy, and will execute faster.

In what languages are the C and C++ standard libraries written? Where is their source code?

I always wonder what language the C/C++ runtime and standard library were written in. At first I thought it was plain C/C++, but I doubt C/C++ is enough to be able to talk to the machine. Therefore, I think it may be assembly language instead. If it is either C/C++ or assembly, then why don't I see the source code floating around? Or maybe my searching skills are lacking...
They are typically written in their host language, interoperating with the operating system API to obtain things that can't be obtained natively. Many features are written in the pure language; for example, the containers and algorithms sections of the C++ Standard library pretty much have to be written in C++. In fact, nearly the entire C++ Standard library has to be written in C++, because it's templated. I don't know why you haven't found any source: the Microsoft CRT source is available to any dev, I think (I've certainly seen questions on here posting the CRT source), and the GNU libstdc++ is open source, I'm pretty sure.
All the run-time and standard libraries are written in C/C++. Most vendors ship the source code.
They are usually written in C with some assembly mixed in. Visual Studio ships with big chunks of source code for CRT (C RunTime) if not all of it. Linux glibc is obviously open source:
http://ftp.gnu.org/gnu/glibc/
Most of the runtime is written in C or C++, but there is an important exception: the code that calls into the kernel. On most operating systems this is done by generating a software interrupt using a special instruction (SWI or SVC on ARM). There is no equivalent for this in C or C++, and therefore assembler is used. Assembler is also typically used to implement highly optimized memcpy, memmove, memcmp and other similar functions.
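To make the kernel-call point a bit more concrete, here is a minimal sketch that assumes Linux with glibc (an assumption; the answers above don't name a platform). The `syscall()` function is the C-callable entry point; the part that actually executes the trap instruction sits behind it in the library and is not expressible in portable C or C++. The name `raw_getpid` is made up for illustration.

```cpp
#include <sys/syscall.h>  // SYS_getpid (Linux-specific system call number)
#include <unistd.h>       // syscall(), getpid()

// Ask the kernel for the process ID through the generic syscall() entry
// point instead of the dedicated getpid() wrapper. The step that loads the
// registers and traps into the kernel lives inside the C library.
long raw_getpid() {
    return syscall(SYS_getpid);
}
```

Calling `raw_getpid()` should give the same value as `getpid()`; everything on the C++ side is an ordinary function call.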

Is there a library that can compile C++ or C

I came here to ask this question because this site has been very useful to me in the past, seems to have very knowledgeable users who are willing to discuss a question even if it is metaphysical at times. And also because googling it did not work.
Java has a compiler and then it has a JDT library that can compile java on the fly (for example used in JasperReports to turn a report template into Java code).
My question: Does anyone know of a library/project that would offer compiling as a set of library classes in c/c++. For example: a suite of classes to perform Preprocessing, Parsing, CodeOptimization and of course Binary rendering to executable images such as ELF or Win format. Basically something that would allow one to compile c or c++ scriptlets as part of an application.
Yes: llvm. In particular, clang. At least, that's how they advertise the projects. Also, check this question. It might be relevant if you decide to use llvm.
You might be able to adapt something from the LLVM project to your needs.
You can just require that a compiler be installed, then call it. This is a fairly hefty requirement, but about the only way to truly "embed" C or C++. There are interpreters that you may be able to embed, but that currently seems a poor choice, not the least because any libraries used in the script must have development versions (i.e. headers and source/compiled-libraries) installed, and those libraries could be restricted to the feature set supported by the interpreter (e.g. quality of template implementation).
You're better off using a language like Python or Lua to embed.
There is the ch interpreter but I have not used it. Generally for scripting type applications a more natural scripted language is used.
Great. It looks like LLVM is what I was after.
Thanks a lot for your feedback.
I am not primarily after C++ as a scripting language. I have noticed that Python is used as an embedded script engine.
My primary reason is two fold:
Get rid of Make, CMake and the hell that is Autoconf, and replace them with something like SCons that binds into and interacts with all phases of compiling.
Hook into the compiling process after parsing and auto-generate code, specifically meta-related code. In my case I have been able to implement almost every feature of Java in C++ except one: reflection.
Why impose on your code unneeded bloat like RTTI, which is often inadequate? Instead, one could selectively generate added features, and the developer would get to choose when and how to use this extra code.

Do all C++ compilers generate C code?

Probably a pretty vague and broad question, but do all C++ compilers compile code into C first before compiling them into machine code?
Because C compilers are nearly ubiquitous and available on nearly every platform, a lot of (compiled) languages go through this phase in their development to bootstrap the process.
In the early phases of language development, when you want to see whether the language is feasible, the easiest way to get a working compiler out is to build one that converts your language to C and then let the native C compiler build the actual binary.
The trouble with this is that language-specific constructs are lost, so potential opportunities for optimization may be missed. Most languages therefore get, in phase two, their own dedicated compiler front end that understands language-specific constructs and can provide optimization strategies based on them.
C++ went through phase 1 and phase 2 over two decades ago, so it is easy to find a compiler front end that is dedicated to C++ and generates an intermediate format that is passed directly to a back end. But you can still find versions of C++ that are translated into C (as an intermediate format) before being compiled.
Nope. GCC for example goes from C++ -> assembler. You can see this by using the -S option with g++.
Actually, now that I think about it, I don't think any modern compiler goes to C before ASM.
No. C++ -> C was used only in the earliest phases of C++'s development and evolution. Most C++ compilers today compile directly to assembler or machine code. Borland C++ compiles directly to machine code, for example.
No. This is a myth, based around the fact that a very early version of Stroustrup's work was implemented that way. C++ compilers generate machine code in almost exactly the same way that C compilers do.
As of this writing in 2010, the only C++ compiler that I was aware of that created C code was Comeau*. However, that compiler hasn't been heard from in over 5 years now (2022). There may be one or two more for embedded targets, but it is certainly not a mainstream thing.
* - There's a link to their old website on this WP page. I'd suggest not clicking that unless your computer has all its shots up to date
This is not defined by the standard. Certainly, compiling to C-source is a reasonable way to do it. It only requires the destination platform to have a C-compiler with a reasonable degree of compliance, so it is a highly portable way of doing things.
The downside is speed. Probably compilation speed and perhaps also execution speed will suffer (due to loads of casts, e.g. for virtual functions, that prevent the compiler from optimising fully).
Not that long ago there was a company that had a very nice C++ compiler doing exactly that. Unfortunately, I do not remember the name of the company and a short google did not bring the name back. The owner of the company was an active participant in the ISO C++ committee and you could test your code directly on the homepage, which also had some quite decent resources about C++.
Edit: one of my fellow posters just reminded me. I was talking about Comeau, of course.
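To illustrate the casts mentioned above for virtual functions, here is a rough sketch of a small C++ hierarchy next to the kind of C-style code a C-generating translator would have to spell out. The `c_`-prefixed names and the layout are hypothetical; this is not what Comeau or any other real translator actually emits.

```cpp
// C++ original.
struct Shape {
    virtual ~Shape() {}
    virtual int area() const = 0;
};
struct Square : Shape {
    explicit Square(int s) : side(s) {}
    int area() const { return side * side; }
    int side;
};

// Hand-written C-style equivalent (illustrative layout only).
struct c_shape;
struct c_shape_vtable {
    int (*area)(const c_shape* self);
};
struct c_shape {
    const c_shape_vtable* vptr;
};
struct c_square {
    c_shape base;  // base sub-object first, so the pointer cast below is valid
    int side;
};

int c_square_area(const c_shape* self) {
    // Downcast from "base" to "derived": one of the casts in question.
    const c_square* sq = reinterpret_cast<const c_square*>(self);
    return sq->side * sq->side;
}

const c_shape_vtable c_square_vtable = { &c_square_area };

void c_square_init(c_square* sq, int side) {
    sq->base.vptr = &c_square_vtable;
    sq->side = side;
}

// A "virtual call": load the table, then call through a function pointer.
int c_shape_area(const c_shape* self) {
    return self->vptr->area(self);
}
```

Both versions compute the same area. In the second one the C compiler only sees a call through a function pointer plus a cast, with no knowledge that the object is really a square.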

Developing embedded software library, C or C++?

I'm in the process of developing a software library to be used for embedded systems like an ARM chip or a TI DSP (for mostly embedded systems, but it would also be nice if it could also be used in a PC environment). Obviously this is a pretty broad range of target systems, so being able to easily port to different systems is a priority. The library will be used for interfacing with a specific hardware and running some algorithms.
I am thinking C++ is the best option, over C, because it is much easier to maintain and read. I think the additional overhead is worth it for being able to work in the object oriented paradigm. If I was writing for a very specific system, I would work in C but this is not the case.
I'm assuming that these days most compilers for popular embedded systems can handle C++. Is this correct?
Are there any other factors I should consider? Is my line of thinking correct?
If portability is very important for you, especially on an embedded system, then C is certainly a better option than C++. While C++ compilers on embedded platforms are catching up, there's simply no match for the widespread use of C, for which any self-respecting platform has a compliant compiler.
Moreover, I don't think C is inferior to C++ where it comes to interfacing hardware. The amount of abstraction is sufficiently low (i.e. no deep class hierarchies) to make C just as good an option.
There is certainly good support of C++ for ARM. ARM have their own compiler and g++ can also generate EABI compliant ARM code. When it comes to the DSPs, you will have to look at their toolchain to decide what you are going to do. Be aware that the library that comes with a DSP may well not implement the full C or C++ standard library.
C++ is suitable for low-level embedded development and is used in the SymbianOS Kernel. Having said that, you should keep things as simple as possible.
Avoid exceptions which may demand more library support than what is present (therefore use new (std::nothrow) Foo instead of new Foo).
Avoid memory allocations as much as possible and do them as early as possible.
Avoid complex patterns.
Be aware that templates can bloat your code.
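A minimal sketch of the first two points in the list above: allocate once, early, and use new (std::nothrow) so that failure shows up as a null pointer rather than an exception. SensorBuffer and the function names are made up for illustration.

```cpp
#include <cstddef>
#include <new>  // std::nothrow

// Hypothetical driver state; the name and fields are illustrative only.
struct SensorBuffer {
    static constexpr std::size_t kSamples = 64;
    int samples[kSamples] = {};
    std::size_t count = 0;
};

// Allocate once at start-up. With std::nothrow, failure is reported as a
// null pointer instead of a std::bad_alloc exception.
SensorBuffer* create_buffer() {
    return new (std::nothrow) SensorBuffer;
}

// Store a sample; returns false when the buffer is missing or full.
bool push_sample(SensorBuffer* buf, int value) {
    if (buf == nullptr || buf->count >= SensorBuffer::kSamples) {
        return false;
    }
    buf->samples[buf->count++] = value;
    return true;
}

void destroy_buffer(SensorBuffer* buf) {
    delete buf;  // deleting a null pointer is a no-op
}
```

The caller checks the pointer once after create_buffer() and never allocates again, so no exception support is needed for the allocation path.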
I have seen many complaints that C++ is "bloated" and inappropriate for embedded systems.
However, in an interview with Stroustrup and Sutter, Bjarne Stroustrup mentioned that he'd seen heavily templated C++ code going into (IIRC) the braking systems of BMWs, as well as in missile guidance systems for fighter aircraft.
What I take away from this is that experts of the language can generate sophisticated, efficient code in C++ that is most certainly suitable for embedded systems. However, a "C With Classes"[1] programmer that does not know the language inside out will generate bloated code that is inappropriate.
The question boils down to, as always: in which language can your team deliver the best product?
[1] I know that sounds somewhat derogatory, but let me say that I know an awful lot of these guys, and they churn out an awful lot of relatively simple code that gets the job done.
C++ compilers for embedded platforms are much closer to 83's C with classes than 98's C++ standard, let alone C++0x. For instance, some platforms we use still compile with a special version of gcc made from gcc-2.95!
This means that your library interface will not be able to provide interfaces with containers/iterators, streams, or such advanced C++ features. You'll have to stick with simple C++ classes, that can very easily be expressed as a C interface with a pointer to a structure as first parameter.
This also means that within your library, you won't be able to use templates to their full power. If you want portability, you will still be restricted to generic containers use of templates, which is, I'm sure you'll admit, only a very tiny part of C++ templates power.
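For a sense of what "generic container" use of templates looks like under those restrictions, here is a small sketch written in deliberately conservative C++ (no standard containers, no exceptions, no heap). FixedStack is a hypothetical name, and whether a given old embedded compiler accepts even this much is something you would have to check against that compiler.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity stack: storage lives inside the object, so
// there is no heap allocation and no dependency on library containers.
template <typename T, std::size_t Capacity>
class FixedStack {
public:
    FixedStack() : size_(0) {}

    // Returns false instead of throwing when the stack is full.
    bool push(const T& value) {
        if (size_ == Capacity) return false;
        items_[size_++] = value;
        return true;
    }

    // Returns false when the stack is empty; otherwise writes the top to out.
    bool pop(T& out) {
        if (size_ == 0) return false;
        out = items_[--size_];
        return true;
    }

    std::size_t size() const { return size_; }

private:
    T items_[Capacity];
    std::size_t size_;
};
```

A declaration such as FixedStack<int, 8> costs roughly an array of eight ints plus a counter, and each distinct T/Capacity pair instantiates its own copy of the member functions, which is where the code-size concern comes from.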
C++ has little or no overhead compared to C if used properly in an embedded environment. C++ has many advantages for information hiding, OO, etc. If your embedded processor is supported by gcc in C then chances are it will also be supported with C++.
On the PC, C++ isn't a problem at all -- high quality compilers are extremely widespread and almost every C compiler is directly associated with a C++ compiler that's quite good, though there are a few exceptions such as lcc and the newly revived pcc.
Larger embedded systems like those based on the ARM are generally quite similar to desktop systems in terms of tool chain availability. In fact, many of the same tools available for desktop machines can also generate code to run on ARM-based machines (e.g., lots of them use ports of gcc/g++). There's less variety for TI DSPs (and a greater emphasis on quality of generated code than source code features), but there are still at least a couple of respectable C++ compilers available.
If you want to work with smaller embedded systems, the situation changes in a hurry. If you want to be able to target something like a PIC or an AVR, C++ isn't really much of an option. In theory, you could get (for example) Comeau to produce a custom port that generated code you could compile on that target's C compiler -- but chances are pretty good that even if you did, it wouldn't work out very well. These systems are really just too limited (especially on memory size) for C++ to fit them well.
Depending on what your intended use is for the library, I think I'd suggest implementing it first as C - but the design should keep in mind how it would be incorporated into a C++ design. Then implement C++ classes on top of and/or along side of the C implementation (there's no reason this step cannot be done concurrently with the first). If your C design is done with a C++ design in mind, it's likely to be as clean, readable and maintainable as the C++ design would be. This is somewhat more work, but I think you'll end up with a library that's useful in more situations.
While you'll find C++ used more and more on various embedded projects, there are still many that restrict themselves to C (and I'd guess this is more often the case than not) - regardless of whether or not the tools support C++. It would be a shame to have a nice library of routines that you could bring to a new project you're working on, but be unable to use them because C++ isn't being used on that particular project.
In general, it's much easier to use a well-designed C library from C++ than the other way around. I've taken this approach with several sets of code including parsing Intel Hex files, a simple command parser, manipulating synchronization objects, FSM frameworks, etc. I'm planning on doing a simple XML parser at some point.
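As a sketch of the "C first, C++ on top" layering described above: a tiny ring buffer with a C-style interface (a pointer to the structure as first parameter) and a thin C++ class wrapping the same functions. All names are hypothetical. It is shown as one C++ file for brevity; in a real library the C part would sit in its own .h/.c pair and use size_t and struct ring_buffer * spellings.

```cpp
#include <cstddef>

// --- C-style core (names are illustrative) ---
struct ring_buffer {
    unsigned char data[16];
    std::size_t head;   // index of the oldest byte
    std::size_t count;  // number of bytes stored
};

void rb_init(ring_buffer* rb) { rb->head = 0; rb->count = 0; }

// Returns 1 on success, 0 if the buffer is full.
int rb_put(ring_buffer* rb, unsigned char byte) {
    if (rb->count == sizeof rb->data) return 0;
    rb->data[(rb->head + rb->count) % sizeof rb->data] = byte;
    ++rb->count;
    return 1;
}

// Returns 1 on success, 0 if the buffer is empty.
int rb_get(ring_buffer* rb, unsigned char* out) {
    if (rb->count == 0) return 0;
    *out = rb->data[rb->head];
    rb->head = (rb->head + 1) % sizeof rb->data;
    --rb->count;
    return 1;
}

// --- Thin C++ layer over the same functions ---
class RingBuffer {
public:
    RingBuffer() { rb_init(&rb_); }
    bool put(unsigned char byte) { return rb_put(&rb_, byte) != 0; }
    bool get(unsigned char& out) { return rb_get(&rb_, &out) != 0; }
    std::size_t size() const { return rb_.count; }

private:
    ring_buffer rb_;
};
```

A C-only project uses the rb_ functions directly, while a C++ project gets the class. The logic exists only once, in the C layer.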
Here's an entirely different C++-vs-C argument: stable ABIs. If your library exports a C ABI, it can be compiled with any compiler that works on the system, because C ABIs are generally platform standards. If your library exports a C++ ABI, it can only be compiled with a matching compiler -- because C++ ABIs are usually not platform standards, and often differ from compiler to compiler and even version to version.
Interestingly, one of the rare exceptions to this is ARM; there's an ARM C++ ABI specification, and all compliant ARM compilers follow it. This is not true on x86; on x86, you're lucky if a C++ library compiled with a 4.1 version of GCC will link correctly with an application compiled with GCC 4.4, and don't even ask about 3.4.6.
Even if you export a C ABI, you can have problems. If your library uses C++ internally, it will then link to libstdc++ for things in the C++ std:: namespace. If your user compiles a C++ application that uses your library, they'll also link to libstdc++ -- and so the overall application gets linked to libstdc++ twice, and their libstdc++ may not be compatible with your libstdc++, which can (or so I understand) lead to odd errors from the intersection of the two. Considerably less likely, but still possible.
All of these arguments only apply because you're writing a library, and they're not showstoppers. But they are things to be aware of.
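One common way to get the C ABI mentioned above while still writing the internals in C++ is an opaque handle behind extern "C" functions. A minimal sketch, with hypothetical names (counter_create and friends); it addresses name mangling and class layout at the boundary, not the libstdc++ linking issue.

```cpp
#include <new>  // std::nothrow

// Internal C++ implementation; never visible to callers of the library.
namespace {
class Counter {
public:
    Counter() : value_(0) {}
    int increment() { return ++value_; }

private:
    int value_;
};
}  // namespace

// Exported surface: plain functions and an opaque handle type. extern "C"
// turns off C++ name mangling for these symbols.
extern "C" {

typedef struct counter_handle counter_handle;  // never defined: opaque

// Returns a null pointer if allocation fails.
counter_handle* counter_create(void) {
    return reinterpret_cast<counter_handle*>(new (std::nothrow) Counter);
}

int counter_increment(counter_handle* h) {
    return reinterpret_cast<Counter*>(h)->increment();
}

void counter_destroy(counter_handle* h) {
    delete reinterpret_cast<Counter*>(h);
}

}  // extern "C"
```

Callers only ever see counter_handle pointers and three functions with C linkage, so the class layout and mangled names never cross the library boundary.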