C++ From the Compiler's Point of View [duplicate] - c++

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
What are the stages of compilation of a C++ program?
I find that understanding how a given software language is compiled can be key to understanding best practices and getting the most out of that language. This seems to be doubly true with C++. Is there a good primer or document (for mortals) that describes C++ from the point of view of the compiler? (Obviously every compiler is a little different.)
I thought there may be something along those lines in the beginning of Stroustrup's book.

Personally I like this one. Not a compiler's eye view exactly, but it tells you what's going on "under the hood" of a C++ program.
Inside the C++ Object Model

It depends on what you want to obtain. I find that the Itaium ABI is a good document to understand some of the intricacies of the C++ object model. It will not deal with optimizations or the like, but I found it quite useful to understand how things like virtual inheritance can be implemented, or things that seemed much more simple as constructors and destructors (did you know that the compiler might generate up to 3 versions of each constructor that you provide? 2 destructors?)
Disclaimer: the document is quite dense, you will probably need to go over the sections more than once, at least I did. And you will need a good understanding of the semantics of the language to actually grasp why the solutions are so complex.

I've head Compilers: Principles, Techniques, and Tools is solid.
http://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811/ref=sr_1_1?s=books&ie=UTF8&qid=1328013630&sr=1-1

I don't know of any such book, but if you wanted an understanding of how C++ is treated by the compiler the easiest way would be to write some code and get the compiler to spit out the annotated assembly listing and inspect that. This would give you an idea of how a particular compiler treats the code.
You could also get involved in a compiler project, perhaps something like the llvm's clang project?

Related

How to view the C++ program after it is converted to C (If so)?

Correct me if I am wrong, but in one book I have read that every C++ program is converted to C while it is going through different compiling phases.
I just want to see the C code.
Can any one tell me how to view that code?
Search for CFront for the answer to your question; Wikipedia has a good summary
C++ started out as a C code generator called CFront, but this was abandoned in 1993. Since then, all C++ compilers have been normal compilers, not C front-ends. Exceptions were the original difficulty, but there are weird corners like the subtle difference in the meaning of "void" that would be awkward too.
It is a good approach to learning C++ to think "What would the C equivalent of this be?", but you can no longer generate it from the compiler, sorry.
Edit: some people are commenting that there are products available to do what you want. I was unaware of these. I would say that although this is what you want, it probably isn't what you need. If your aim is to understand C++, read about C++.

How should the C++ standard be used

I have this classic question of how should the C++ Standard (I mean the actual official document of the finalized ones) e.g. C++98, C++03 be used to learn and teach C++. My idea is only from the point of view of an average C++ user and not from the point of view of the language lawyers or someone who wishes to be in the Standards committee, compiler writers and so on.
Here are my personal thoughts:
a) It is aweful place to start learning C++. Books like "C++ in a Nutshell", "The C++ programming Language" etc do a very good job on that front while closely aligning with the Standard.
b) One needs to revert to the Standard only when
a compiler gives a behavior which is not consistent with what the common books say or,
a certain behavior is inconsistent across compilers e.g. GCC, VS, Comeau etc. I understand the fact that these compilers could be inconsistent is in very few cases / dark corners of the language e.g. templates/exception handling etc. However one really comes to know about the possible different compiler behaviors only when either one is porting and/or migrating to a different environment or when there is a compiler upgrade e.g.
if a concept is poorly explained / not explained in the books at hand e.g. if it is a really advanced concept
Any thoughts/ideas/recommendation on this?
The C++ language standard would be an absolutely terrible place to start learning the language. It is dense, obtuse, and really long. Often the information you are looking for is spread across seven different clauses or hidden in a half of a sentence in a clause completely unrelated to where you think it should be (or worse, a behavior is specified in the sentence you ignored because you didn't think it was relevant).
It does have its uses, of course. To name a few,
If you think you've found a bug in a compiler, it's often necessary to refer to the standard to make sure you aren't just misunderstanding what the specified behavior is.
If you find behavior that is inconsistent between compilers, it's handy to be able to look up which is correct (or which is more correct), though often you'll need to write workarounds regardless.
If you want to know why things are the way they are, it is often a good reference: you can see how different features of the language are related and understand how they interact. Things aren't always clear, of course, but they often are. There are a lot of condensed examples and notes demonstrating and explaining the normative text.
If you reference the C++ standard in a post on Stack Overflow, you get more a lot more upvotes. :-)
It's very interesting to learn about the language. It's one thing to write code and stumble through getting things to compile and run. It's another thing altogether to go and try to understand the language as a whole and understand why you have to do things a certain way.
The standard should be used to ensure portability of code.
When writing basic c++ code you shouldn't need to refer to the standards, but when using templates or advanced use of the STL, reference to the standard is essential to maintain compatibility with more than one compiler, and forward compatibility with future versions.
I use g++ to compile my C++ programs and there I use the option -std=c++0x (earlier, -std=c++98) to make sure that my code is always standard compliant. If I get any warning or error regarding standard compliance, I research on that to educate myself and fix my code.

Why do programmers sometimes refer to "C++/STL" like it's a separate language?

This may seem a trivial question, but it's one that's bothered me a lot lately. Why do some programmers refer to "C++/STL" like it's a different language? The STL is part of the C++ standard library -- and therefore is part of the language, "C++". It's not a separate component, and it does not live alone in the scope of things C++. Yet some continually act like it's a different language altogether. Why?
It's possible to be a competent and experienced C++ programmer and never use the STL. You may be using Boost or ACE, or been an MFC windows programmer for 10 years.
If you want someone experienced in using the STL, asking for someone who knows C++ is no guarantee that you'll get one.
Also for my mind, writing code that's heavily dependent on the STL feels very different to writing, say, MFC code. They might as well be different languages. They certainly won't look particularly similar.
An understanding of the STL isn't necessary to understand C++. It's useful to have when you need ADTs, but you can go (could have gone?) through your whole C++ career without needing it.
The above answers are really good; I'm only going to add to their content in a broader context.
Developers might refer to language/api|library e.g. C/Win32, Java/Struts, Java/Spring, C#/.net MVC because there are in essence two knowledge bases - knowledge of the language in question and knowledge of how to use that specific library, API or framework. Something like Win32 is pretty huge, as is say Django, which I'm currently learning. Django itself works in a very specific way and knowing that is what I'm learning, not Python.
The same is true of C++/MFC or C++/Boost or C++/STL. The language is C++ - the API/library you're using is MFC, Boost or STL.
Probably because STL came a little late to the C++ game, and many people have written code that does not use any STL. For example, think early win32 programming with MFC.
Guess:
When C++ was first released, the STL did not exist. It came into existence later as an optional addition and then was incorporated into the standard.
When writing a resume, people would often list C/C++ as a language, which, in many cases means they don't know either.
Sometime resumes would list "Visual C++" as a language, trying to indicate they don't know what a language is.
This, together with "great knowledge of C++ and PHP" statements, go strait into recycle bin at my firm. Not because they are necessarily bad programmers - but because the interview time waste is not worth it.

How do C/C++ compilers work?

After over a decade of C/C++ coding, I've noticed the following pattern - very good programmers tend to have detailed knowledge of the innards of the compiler.
I'm a reasonably good programmer, and I have an ad-hoc collection of compiler "superstitions", so I'd like to reboot my knowledge and start from the basics.
Can anyone recommend links to online resources or favorite books? I'm particularly interested in C/C++ compiling, optimization, GCC and LLVM.
Start with the dragon book....(stress more on code optimization and code generation)
Go onto write a toy compiler for an educational programming language like Decaf or Cool.., you may use parser generators (lex and yacc) for your front end(to make life easier and focus on more imp stuff)....
Then read gcc internals book along with browsing gcc source code.
GCC Internals Manual.
CPP Internals Manual
LLVM Documentation
Compiler Text are good, but they are a bit heavy for teaching yourself. Jack Crenshaw has a "Book" that was a series of articles you can download and read call "Lets Build a Compiler." It follows a "Learn By Doing" methodology that is great if you didn't get anything out of taking formal classes on the subject, or it's been WAY too many years since took it (that's my case). It holds your hand and leads you through writting a compiler instead of smacking you around with Lambda Calculus and deep theoretical issues that only academia cares about. It was a good way to stir up those brain cells that only had a fuzzy memory of writting something on the Vax (YEAH, that right a VAX!) many many moons ago at school. It's written very conversationally and easy to just sit down and read, unlike most text books which require several pots of coffee just to get past the first chapter. Once you have a basis for understanding then more traditional text such as the Dragon book are great references to expand on your understanding. (And personal I like the Dead Tree versions, I printed out Jack's, it's much easier to read in a comfortable position than on a laptop. And the Ebook readers are too expensive for something that doesn't actually feel like you're reading a real book yet.)
What some might call a "downside" is that it's written in Pascal, but I thought that just made me think about it more than if someone had given me a working C program to start with. Appart from that it was written with the 68000 in mind, which is only being used in embedded systems at this point time. Again for me this wasn't a problem, I knew 68000 asm and 68000 asm is easier to read than some other asm.
If you want dead-tree edition, try The Art of Compiler Design: Theory and Practice.
As noted by Pete Eddy, Jack Crenshaw's tutorial is excellent for newbies. But if you want to see how to a real, production C compiler works—one which was designed by brilliant engineers instead of created by throwing code at the wall until something stuck—get yourself a copy of Fraser and Hanson's A Retargetable C Compiler: Design and Implementation, which contains the source code to the very clean lcc compiler. Explanations of the design and implementation are mixed in with the code. It is not a first book for a beginner, but it will repay careful study, and you can get a used copy for $35.
For a longer blurb about lcc, see Compile C Faster on Linux.
The lcc web page also has links to a number of good textbooks. I don't know of an intro text that I really like, however.
P.S. Sorry you got ripped off at Uni.
http://se-radio.net/podcast/2007-07/episode-61-internals-gcc
see Fabrice Bellard's otcc source code
http://bellard.org/otcc/
Depending on what you exactly want to know, you should have a look at pipes&filter pattern, because as far as I know this (or something similar) is used in a lot of compilers in the last years.
When my compiler knowledge is not too outdated it works like this:
Parse sourcecode into symbolic representation
Clean up symbolic representation, do some normalization
Optimization of the symbolic tree based on certain rules
write out executable code based on symbolic tree
Of course dependencies etc. have to be resolved too.
And of course having a look at gcc or javac sourcecode may help in getting more detailed understanding.
It may also be valuable to pick up and read the source code to a compiler. I doubt that GCC is the best first choice, since it is burdened with full compatibility to more than 20 years of evolution of the language. But I'm also sure that a reading of its source, guided by one of the internal reference manuals, would be educational.
I'd seriously consider looking at the source to a scripting language that is internally compiled to a bytecode for a virtual machine. Several languages fit that description, but I would start with Lua. The language is small, and the VM is novel. The source code is also small and the bits I've looked at have been very clear although lightly commented.
have a look on Kaleidoscope.
You can write your own compiler in just a few days with LLVM.

Isn't saying "C/C++" wrong? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
I've seen a lot of questions around that use improperly the expression "C/C++".
The reasons in my opinion are:
Newbie C and C++ programmers probably don't understand the difference between the two languages.
People don't really care about it since they want a generic, quick and "dirty" answer
While C/C++ could sometimes be interpreted as "either C or C++", I think it's a big error. C and C++ offer different approaches to programming, and even if C code can be easily implemented into C++ programs I think that referring to two separate languages with that single expression ( C/C++ ) is wrong.
It's true that some questions can be considered either as C or C++ ones, anyway.
What do you think about it?
C/C++ is a holdout from the early days of C++, where they were much more similar than they were today. It's something that wasn't really wrong at first, but is getting more-so all the time.
The basic structure is similar enough that most simple questions do still work between the two, though. There is an entire Wikipedia article on this topic: http://en.wikipedia.org/wiki/Compatibility_of_C_and_C%2B%2B
The biggest fallacy that comes from this is that because someone is well-versed in C, they will be equally good at C++.
Please remember that the original implementations of C++ were simply as a pre-compiler that output C code for the 'real' compiler. All C++ concepts can be manually coded (but not compiler-enforced) in plain C.
"C/C++" is also valid when referring to compilers and other language/programming tools. Virtually every C++ compiler available will compile either - and are thus referred to as "C/C++" compilers. Most have options on whether to treat .C and .CPP files based on the extension, or compile all of them as C or all of them as C++.
Also note that mixing C and C++ source in a single compiler project was possible from the very first C/C++ compiler. This is probably the key factor in blurring the line between the languages.
Many language/programming tools that are created for C++ also work on C because the language syntax is virtually identical. Many language tools have separate Java, C#, Python versions - but they have a single "C/C++" version that works for C and C++ due to the strong similarities.
We in our company have noticed the following curious fact: if a job applicant writes in his CV about "advanced C/C++ knowledge", there is usually a good chance that he really knows neither ;)
The two languages are distinct, but they have a lot in common. A lot of C code would compile just fine on a C++ compiler. At the early-student level, a lot of C++ code would still work on a C compiler.
Note that in some circumstances the meaning of the code may differ in very subtle ways between the two compilers, but I suppose that's true in some circumstances even between different brands of C++ compiler if you're foolish enough to rely on undefined or contested/non-conformant behavior.
Yes and no.
C and C++ share a lot in common (in fact, the majority of C is a subset of C++).
But C is more oriented "imperating programming", whereas C++, in addition to C paradigm, has more paradigms easily accessible, like functional programing, generic programing, object oriented programing, metaprograming.
So I see the "C/C++" item saying either as "the intersection of C and C++" or "familiarity with C programing as well as C++ programing", depending on the context.
Now, the two languages are really different, and have different solutions to similar problems. A C developer would find it difficult to "parse/understand" a C++ source, whereas a C++ developer would not easily recognize the patterns used in a C source.
Thus, if you want to see how far the C is from the C++ in the "C/C++" expression, a good comparison would be the GTK+ C tutorials, and the same in C++ (GTKmm):
C : GTK+ Hello World: http://library.gnome.org/devel/gtk-tutorial/stable/c39.html#SEC-HELLOWORLD
C++ : GTKmm Hello World: http://www.gtkmm.org/docs/gtkmm-2.4/docs/tutorial/html/sec-helloworld.html
Reading those sources is quite enlightening, as they are, as far as I parsed them, producing exactly the same thing, the "same" way (as far as the languages are concerned).
Thus, I guess the C/C++ "expression" can quite be expressed by the comparison of those sources.
:-)
The conclusion of all this is that it is Ok if used on the following contexts:
describing the intersection of C and C++
describing familiarity with C programing as well as C++ programing
describing compatible code
But it would not be for:
justifying keeping to code in a subset of C++ (or C) for candy compatibility with C (or C++) when compatibility is not desired (and in most C++ project, it is not desired because quite limitating).
asserting that C and C++ can/should be coded the same way (as NOT shown by the GTK+/GTKmm example above)
I think it's more of the second answer - they want something that's easily integrated into their project.
While a C answer may not be idiomatic C++ (and vice versa), I think that's one of C++'s big selling points - you can basically embed C into it. If an idiomatic answer is important, they can always specify C/C++/C++ with STL/C++ with boost/etc.
An answer in lisp is going to be pretty unusable. But an answer in either C or C++ will be directly usable.
Yeah, C/C++ is pretty useless. It seems to be a term mostly used by C++ newbies. We C-only curmudgeons just say "C" and the experienced C++ folks know how much it has diverged from C and so they properly say "C++".
Even if C is (nearly) a subset of C++, this doesn't really have any bearing on their actual usage. Practically every interesting C feature is frowned upon in modern C++ code:
C pointers (use iterators/smart pointers/references instead), macros (use templates and inline functions instead), stdio (use iostreams instead), etc. etc.
So, as Alex Jenter put it, it's unlikely that anyone who knows either language well would say C/C++. Saying that you know how to program in "C/C++" is like saying you know how to program in "Perl/PHP"... sure they've got some significant similarities, but the differences in how they are actually used are vast.
C/C++ often means a programiing style, which is like C and classes, or C and STL :-) Technicaly it is C++, but minimum of its advantages are used.
I agree. I read the C tag RSS feed, and I see tons of C++ questions come through that really don't have anything to do with C.
I also see this exchange a lot:
Asker: How do you do this in C?
Answer: Use the X library for C++.
Asker: OK, how about someone actually answer my question in C?
I use that term myself, and it is because it is my style, I don't use boost, stl or some other things, not even standard C++ libs, like "cout" and "cin", I program C but using classes, templates and other (non-library) features to my advantage.
I can say that I am not a master of C, neither a master of C++, but I am really good at that particular style that I use since 10 years ago. (and I am still improving!)
I was under the impression that all c code is valid c++ code.
Isn’t saying “C/C++” wrong?
No, it isn't. Watcom International Corporation for example, founded more than 25 years ago, called their set of C and C++ compilers and tools "Watcom C/C++", and this product is still developed and available in the open-source form as OpenWatcom C/C++
If it is a complex question needing to write more than one function, yes, it can be wrong.
If it is just to ask a detail about sprintf or bit manipulation, I think it can be legitimate (the latter can even be tagged C/C++/Java/C#, I suppose...).
Whoever is asking the question should write C, C++ or C/C++ depending on the question.
From what I've seen, you can write C++ code the C way or the C++ way. Both work, but C++ code that is not written C style is usually easier to maintain in the long run.
In the end, it all depends on the particular question. If someone is asking how to concatenate strings, than it is very important whether he wants C or C++ solution. Or, another example, if someone is asking for a qsort algorithm. To support different types, you might want to use macros with C and templates with C++, etc.
This was too long for a comment, so I had to make it an answer, but it's in response to Jeff B's answer.
Please remember that the original
implementations of C++ were simply as
a pre-compiler that output C code for
the 'real' compiler.
I have a friend (who writes C++ compilers -- yes, plural), who would take offense to your first sentence. A compiler whose object code is C source code is every bit as much a compiler as any other. The essence of a compiler is that it understands that syntax of the language, and generates new code based on that. A pre-processor has no knowledge of the language and merely reformats its input.
Remember that the C Compilers which would compile the output of those C++ compilers, would themselves output ASM code would would then be run through an assembler.
I tend to put C / C++ in my questions.
Typically I am looking for something that I can use in my c++ application.
If the code is in C or in C++ then I can use it, so I would rather not just limit the possible answers to one or the other.
Not only these two languages are different, but also the approaches are different. C++ is an OO language, while C is procedural language.
Do I have to mention templates?
Also, there are differences in C and C++ standards. If something is good in C, doesn't have to compile in C++