This question already has answers here:
Does a compiler always produce an assembly code?
(4 answers)
Closed 3 years ago.
What is the reason that languages like C, C++ and similar compile their code down to assembler code, instead of just producing the binary directly? Is it just too hard to infer the "correct" programming from the abstracted language? It seems to me that converting to something that will again be converted is not an optimal way of doing things, but there are probably good reasons for this that I am unaware of. Is this connected to every CPU architecture having different implementations?
Assembler code and the binary are logically equivalent; the assembler code is simply written using so-called mnemonics, which are a more human-readable form of the machine instructions.
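Not every compiler goes through textual assembly, though: some produce the binary (object code) directly, while others, such as GCC, write assembly and hand it to a separate assembler.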
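To make the mnemonic/binary equivalence concrete, here is a minimal sketch of a lookup-table "assembler" for four single-byte x86 instructions. The function name `assemble` and the table are made up for illustration; a real assembler also handles operands, addressing modes, labels and object file formats.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy "assembler" (hypothetical, for illustration only): translates a few
// operand-less x86 mnemonics into their one-byte machine encodings.
std::vector<std::uint8_t> assemble(const std::vector<std::string>& mnemonics) {
    static const std::map<std::string, std::uint8_t> opcodes = {
        {"nop",  0x90},  // no operation
        {"ret",  0xC3},  // near return
        {"int3", 0xCC},  // breakpoint trap
        {"hlt",  0xF4},  // halt
    };
    std::vector<std::uint8_t> bytes;
    for (const std::string& m : mnemonics) {
        // at() throws std::out_of_range for a mnemonic not in the table
        bytes.push_back(opcodes.at(m));
    }
    return bytes;
}
```

Calling `assemble({"nop", "ret"})` yields the two bytes `0x90 0xC3`, which is exactly what the CPU would see for those two instructions.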
This question already has answers here:
can i use c++ compiler to compile c source code? [duplicate]
(3 answers)
What is the difference between g++ and gcc?
(10 answers)
Closed 2 years ago.
It's what I believe to be a very simple question.
Context: I'm following a tutorial that allows me to run C++ code in Visual Studio Code, but I'm trying to run C code, not C++ code. The program I'm trying to run is a simple Hello World program (shown below), but this question applies to all C code.
#include <stdio.h>
int main() {
    printf("Hello World!");
    return 0;
}
C and C++ are different languages. Even though they share a similar syntax, the semantic meaning of certain constructs is different.
C++ incorporates a large part of C, but it also diverges. You cannot just assume that C code compiled as C++ will give the same result.
You can write code that is both valid C and valid C++ yet means different things in the two languages.
While C++ can be seen for the most part as a superset of C, there are some constructions that are invalid C++ and others that have different behavior.
Instead of dealing with that, tell your compiler to target C instead of C++. All the popular C++ compilers also support C (at least one version).
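As one illustration of code that is valid in both languages but means different things, consider the size of a character literal. The function name `char_literal_size` is made up for this sketch; the snippet is written so that it compiles as either C or C++.

```cpp
#include <stddef.h>

/* Valid as both C and C++, but the result differs:
 * in C a character literal such as 'a' has type int,
 * in C++ it has type char. */
size_t char_literal_size(void) {
    return sizeof('a');
}
```

Compiled as C++ this returns 1. Compiled as C it returns `sizeof(int)`, which is 4 on most current platforms.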
This question already has answers here:
Does a compiler always produce an assembly code?
(4 answers)
What do C and Assembler actually compile to? [closed]
(11 answers)
Closed 2 years ago.
I have started learning C++, and I have learned that a compiler turns source code from a program into machine code through compilation.
However, I've learned that C++ compilers actually translate the source code into Assembly as an interim step before translating the Assembly code into machine code. What is the purpose of this step?
Why don't they translate it directly into machine code?
First of all: There is no need to write an intermediate assembly language representation. Every compiler vendor is free to emit machine code directly.
But there are a lot of good reasons to "write" an intermediate assembly and pass it to an assembler to generate the final executable file. Note that the assembly does not actually have to be written to a file on some storage medium; the compiler's output can be piped directly into the assembler.
Some of the reasons why vendors are using intermediate assembly language:
The assembler is already available and "knows" how to generate some executable file formats (ELF, for example).
Some tasks can be postponed until the assembly level is reached, resolving jump targets for example. This is possible because the intermediate assembly is often not just a 1:1 representation of the machine code but a kind of "macro assembler", which can do a lot more than simply create bits from mnemonics.
The assembler stage is followed by the linker. Linking is also required if a compiler wants to create executable file formats directly, so a lot of work would be duplicated if it had to be coded again. For example, all the relocation of previously "unknown" addresses must be done on the way to an executable file. Simply use the assembler and linker and the job is done.
The intermediate assembly is always useful for debugging purposes. So there is a more or less hard requirement to be able to produce this intermediate step, even if it can be omitted when the user requests no debug output.
I believe there are a lot more...
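The "resolving jump targets" point in the list above can be sketched as a toy two-pass assembler. Everything here is an assumption made for illustration: the function name `resolve_labels`, the text-based instruction format, and a made-up machine in which every instruction occupies exactly one slot. Real assemblers work on variable-length encodings and emit relocations for the linker.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy two-pass label resolution for a hypothetical machine in which each
// instruction occupies one slot. Lines ending in ':' are labels.
std::vector<std::string> resolve_labels(const std::vector<std::string>& lines) {
    std::map<std::string, int> label_addr;
    std::vector<std::string> code;

    // Pass 1: strip labels and remember the slot each one points at.
    for (const std::string& line : lines) {
        if (!line.empty() && line.back() == ':') {
            label_addr[line.substr(0, line.size() - 1)] =
                static_cast<int>(code.size());
        } else {
            code.push_back(line);
        }
    }

    // Pass 2: rewrite "jmp <label>" as "jmp <slot number>".
    const std::string jmp = "jmp ";
    for (std::string& ins : code) {
        if (ins.compare(0, jmp.size(), jmp) == 0) {
            auto it = label_addr.find(ins.substr(jmp.size()));
            if (it != label_addr.end()) {
                ins = jmp + std::to_string(it->second);
            }
        }
    }
    return code;
}
```

Because labels are collected before any jump is rewritten, a forward reference such as `jmp end` appearing before `end:` resolves just as easily as a backward one. A compiler that emits assembly text gets this for free from the assembler.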
The downside is:
"Writing" a text representation and parsing the program from that text takes longer than passing the information directly to the linker.
Usually, compilers invoke the assembler (and the linker, or the archiver) on your behalf unless you ask them to do otherwise, because it is convenient.
But separating the distinct steps is useful because it allows you to swap the assembler (and linker and archiver) for another if you so desire or need to. And conversely, this assembler may potentially be used with other compilers.
The separation is also useful because assemblers already existed before the compiler did. By using a pre-existing assembler, there is no need to re-implement the machine code translation. This is still potentially relevant because occasionally there will be a need to boot-strap a new CPU architecture.
This question already has answers here:
What is the difference between 'asm', '__asm' and '__asm__'?
(4 answers)
Closed 3 years ago.
Several years ago I wrote some significant C++ code that accessed the hardware registers using a coding command that switches to assembler language. I lost the compiler and computer. Please tell me a C++ compiler that allows inline assembler in the middle of the C++ code. Intel CPU, Windows. Thank you.
It seems I lacked clarity in the question. My apologies. The answers given were a refresher of the code. Well done. The answers given today suggest the C++ compilers might not have been updated for 64 bit assemblers. Here is a clearer question which has been only partially answered. It needs an updated response.
I am thinking of buying an Intel i7 desk computer. I will write C++ code for i/o and setup. The inner loops will be written in assembler language to take advantage of the hardware register multiply and divide: two multiplicands in separate registers give a double register product. My experience years ago was that not all C++ compilers are alike. Which of the many brands of C++ software out there give a good link to assembler, __asm, and make full advantage of 64 bit machines?
I feel this question has not been asked. Thanks for the great answers so far.
I once used Microsoft Visual Studio to write inline assembly, like this:
// --- Get current frame pointer
ADDR oriFramePtr = 0;
_asm mov DWORD PTR [oriFramePtr], ebp
Unfortunately, this only worked for 32-bit builds, because at that time Microsoft's 64-bit compiler didn't support inline assembly (I haven't checked recently).
Standard C++ provides the asm declaration for embedding assembly:
7.4 The asm declaration [dcl.asm]
1 An asm declaration has the form
asm-definition:
asm ( string-literal ) ;
The asm declaration is conditionally-supported; its meaning is implementation-defined. [ Note: Typically it is used to pass information through the implementation to an assembler. — end note ]
GCC appears to support asm based on the above article on asm, but I couldn't find anything besides its support in C
MSVC does support assembly, but not via the asm keyword; one must use __asm:
The __asm keyword invokes the inline assembler and can appear wherever a C or C++ statement is legal.
Visual C++ support for the Standard C++ asm keyword is limited to the fact that the compiler will not generate an error on the keyword. However, an asm block will not generate any meaningful code. Use __asm instead of asm.
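For the 64-bit case asked about (two 64-bit multiplicands giving a double-register product), here is a hedged sketch using GCC/Clang extended inline assembly rather than MSVC's `__asm`. The names `U128` and `mul_64x64` are made up for this example. The assembly path is only compiled for GCC-compatible compilers targeting x86-64; anything else takes a portable fallback. With 64-bit MSVC one would typically reach for a compiler intrinsic such as `_umul128` instead of inline assembly.

```cpp
#include <cstdint>

// Hypothetical helper type: a 128-bit product split into two 64-bit halves.
struct U128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

U128 mul_64x64(std::uint64_t a, std::uint64_t b) {
    U128 r;
#if defined(__GNUC__) && defined(__x86_64__)
    // x86-64 MUL: RDX:RAX = RAX * operand.
    // "a" pins an operand to RAX, "d" to RDX, "r" to any general register.
    __asm__("mulq %3"
            : "=a"(r.lo), "=d"(r.hi)
            : "a"(a), "r"(b)
            : "cc");
#else
    // Portable fallback: schoolbook multiplication on 32-bit halves.
    const std::uint64_t mask = 0xFFFFFFFFu;
    std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    std::uint64_t b_lo = b & mask, b_hi = b >> 32;
    std::uint64_t p0 = a_lo * b_lo;
    std::uint64_t p1 = a_lo * b_hi;
    std::uint64_t p2 = a_hi * b_lo;
    std::uint64_t p3 = a_hi * b_hi;
    std::uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);
    r.lo = (mid << 32) | (p0 & mask);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
#endif
    return r;
}
```

For example, multiplying `1ULL << 32` by itself gives `hi == 1` and `lo == 0`, a result that would not fit in a single 64-bit register.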
Closed 7 years ago.
What language is a C or C++ header file written in? Another big doubt that I have is, since a computer understands only binary, how do people actually write a programming language and a compiler that the computer can actually understand?
The C and C++ header files are written as C and C++ source code respectively. The code in these [1] is compiled and linked into your executable file.
A compiler is written in some arbitrary language - these days, you typically take an existing C or C++ compiler for some platform that you have available, but ages ago, the process was basically to write a very basic compiler in assembler [or some other available and suitable language], and then use that to bootstrap into a higher language compiler.
Of course, if you have just invented "Chip X", and haven't got a portable assembler, you'd also have to write an assembler. Hopefully you have some OTHER computer with a programming language - but if we pretend that no computers are available, then we'd have to come up with binary code "by hand", and then enter that into the ROM of the computer. That code would perhaps be able to perform some really simple task such as printing "Hello" to some output device. Once that works, we'd expand it to have a loader, so we can load a binary file (or add new commands some other way). A very simple editor to edit files, and a file-storage would be very useful to have. And then we could start writing some code that can read human readable instructions (assembler code) and produce binary from that. Once we have an assembler, we can write a program in assembler that takes (very simple) C input and outputs assembler. Assemble that code, and we have a (very simple) C compiler. Now we can use the simple C compiler to write a better C compiler in "simple C". Keep at this for a while, and you end up with a decent C compiler... But it's probably a few years worth of work unless you have done this sort of thing many times...
Any language that can read text files and compare strings and output binary files in "free format" is pretty much usable to write a compiler. It's of course not trivial.
I have written a compiler for Pascal which uses the LLVM compiler framework to produce the actual code. That means I've done the simple part of the compiler. The hard part in a good-quality compiler is the code-generation pass, and I only generate LLVM Intermediate Representation (IR). That is the whole idea of LLVM: the IR is a simplified machine code "language", and LLVM then provides IR -> machine code for your language. My compiler is currently about 13400 lines of C++ code; the code generation and optimisations in LLVM are millions of lines, much of which I don't even know the workings of [beyond the simple overview of what each function does according to its description].
[1] There are typically also libraries, which contain larger functions that aren't suitable to store in the headers directly. These are built using the same compiler [or one compatible with it] that you use to build your source into a binary file.
This question already has answers here:
Do all C++ compilers generate C code?
(5 answers)
Closed 8 years ago.
I have read that the original implementation of C++ by Bjarne Stroustrup was using a compiler named Cfront that converted C++ to C during the compilation process.
Is this still the case with modern compilers (or most of them)?
I couldn't find a good answer using Google (or I couldn't find the right search terms).
edit: This is not an exact duplicate because I'm asking for current/modern ones. But both questions & answers apply.
Absolutely not. The Cfront way of doing things became untenable long ago. There are some C++ constructs with no C interpretation, especially exceptions, and stamping out literal C source for every template instantiation is a bit ridiculous. The entire reason Bjarne stopped making Cfront is that it was impossible.
It is, however, common to lower the code to a more useful IR before converting to machine code: LLVM IR is one example, and GCC also has its own internal IR.
Short answer: no. Modern C++ compilers generate native code directly.
There's no reason why you can't compile C++ to C, there's just no real reason to do so either any more, so you're adding an extra stage in the compilation process that could just as easily not exist. However, there are still a couple of options if you really need C code output for some reason: the Comeau C++ compiler emits C code with the aim of porting your C++ to platforms where a C++ compiler may not exist (which these days, is very few), and Clang uses LLVM as a backend code generator, which has C as one of its many target instruction languages. (edit: of these options, the first is outdated and the second is no longer maintained)
In neither case does the C look anything like the code you put in: it's significantly less readable than machine code would be. The days of converting method calls to function calls with a this are certainly long gone - it's very much a case of "compiling" rather than "converting".
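For readers who have not seen the old style of translation mentioned above, here is a rough sketch of what "converting method calls to function calls with a this" means. The names `Counter_c` and `Counter_add` are invented for illustration and are not what Cfront or any real translator actually emitted.

```cpp
// C++ as the programmer writes it.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int add(int n) { value_ += n; return value_; }
private:
    int value_;
};

// Roughly the C-style equivalent a simple translator could produce:
// the object becomes a plain struct and the member function becomes a
// free function taking the object pointer ("this") as its first argument.
struct Counter_c {
    int value_;
};

int Counter_add(struct Counter_c* self, int n) {
    self->value_ += n;
    return self->value_;
}
```

A call such as `c.add(2)` then corresponds to `Counter_add(&c, 2)`. This one-to-one mapping works for simple classes, but it stops being straightforward once exceptions and templates are involved.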
No. Modern compilers, such as GCC and Clang (and others based on LLVM), generally have two parts: a front end and a back end.
The front end compiles the source language into some intermediate representation, such as LLVM IR.
The back end generates machine code for the target platform from that intermediate form, possibly applying optimisations along the way.
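The front end/back end split can be sketched with a deliberately tiny example. Everything in it is an assumption made for illustration: the `Ir` type, the function names `front_end` and `back_end`, the reverse Polish input format, and the made-up stack machine that the back end targets. It bears no resemblance to the real LLVM IR or GCC internals beyond the overall shape.

```cpp
#include <string>
#include <vector>

// Hypothetical intermediate representation: push a constant, add, or multiply.
struct Ir {
    enum Kind { Push, Add, Mul } kind;
    int value;  // only meaningful for Push
};

// "Front end": turns source tokens (reverse Polish notation) into IR.
// It knows the input language but nothing about the target machine.
std::vector<Ir> front_end(const std::vector<std::string>& rpn) {
    std::vector<Ir> ir;
    for (const std::string& tok : rpn) {
        if (tok == "+") {
            ir.push_back({Ir::Add, 0});
        } else if (tok == "*") {
            ir.push_back({Ir::Mul, 0});
        } else {
            ir.push_back({Ir::Push, std::stoi(tok)});
        }
    }
    return ir;
}

// "Back end": turns IR into text for a made-up stack machine.
// It knows the target but nothing about the source language.
std::string back_end(const std::vector<Ir>& ir) {
    std::string out;
    for (const Ir& i : ir) {
        switch (i.kind) {
            case Ir::Push: out += "push " + std::to_string(i.value) + "\n"; break;
            case Ir::Add:  out += "add\n"; break;
            case Ir::Mul:  out += "mul\n"; break;
        }
    }
    return out;
}
```

The point of the split is that a second source language only needs a new `front_end`, and a second target only needs a new `back_end`; the IR in the middle stays the same.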