Are STL headers written entirely by hand? - c++

I am looking at various STL headers provided with compilers and I cant imagine the developers actually writing all this code by hand.
All the macros and the weird names of varaibles and classes - they would have to remember all of them! Seems error prone to me.
Are parts of the headers result of some text preprocessing or generation?

I've maintained Visual Studio's implementation of the C++ Standard Library for 7 years (VC's STL was written by and licensed from P.J. Plauger of Dinkumware back in the mid-90s, and I work with PJP to pick up new features and maintenance bugfixes), and I can tell you that I do all of my editing "by hand" in a plain text editor. None of the STL's headers or sources are automatically generated (although Dinkumware's master sources, which I have never seen, go through automated filtering in order to produce customized drops for Microsoft), and the stuff that's checked into source control is shipped directly to users without any further modification (now, that is; previously we ran them through a filtering step that caused lots of headaches). I am notorious for not using IDEs/autocomplete, although I do use Source Insight to browse the codebase (especially the underlying CRT whose guts I am less familiar with), and I extensively rely on grep. (And of course I use diff tools; my favorite is an internal tool named "odd".) I do engage in very very careful cut-and-paste editing, but for the opposite reason as novices; I do this when I understand the structure of code completely, and I wish to exactly replicate parts of it without accidentally leaving things out. (For example, different containers need very similar machinery to deal with allocators; it should probably be centralized, but in the meantime when I need to fix basic_string I'll verify that vector is correct and then copy its machinery.) I've generated code perhaps twice - once when stamping out the C++14 transparent operator functors that I designed (plus<>, multiplies<>, greater<>, etc. are highly repetitive), and again when implementing/proposing variable templates for type traits (recently voted into the Library Fundamentals Technical Specification, probably destined for C++17). IIRC, I wrote an actual program for the operator functors, while I used sed for the variable templates. The plain text editor that I use (Metapad) has search-and-replace capabilities that are quite useful although weaker than outright regexes; I need stronger tools if I want to replicate chunks of text (e.g. is_same_v = is_same< T >::value).
How do STL maintainers remember all this stuff? It's a full time job. And of course, we're constantly consulting the Standard/Working Paper for the required interfaces and behavior of code. (I recently discovered that I can, with great difficulty, enumerate all 50 US states from memory, but I would surely be unable to enumerate all STL algorithms from memory. However, I have memorized the longest name, as a useless bit of trivia. :->)

The looks of it are designed to be weird in some sense. The standard library and the code in there needs to avoid conflicts with names used in user programs, including macros and there are almost no restrictions as to what can be in a user program.
They are most probably hand written, and as others have mentioned, if you spend some time looking at them you will figure out what the coding conventions are, how variables are named and so on. One of the few restrictions include that user code cannot use identifiers starting with _ followed by a capital letter or __ (two consecutive underscores), so you will find many names in the standard headers that look like _M_xxx or __yyy and it might surprise at first, but after some time you just ignore the prefix...

Related

Why isn't the C++ standard library already pre-included in any C++ source?

In C++ the standard library is wrapped in the std namespace and the programmer is not supposed to define anything inside that namespace. Of course the standard include files don't step on each other names inside the standard library (so it's never a problem to include a standard header).
Then why isn't the whole standard library included by default instead of forcing programmers to write for example #include <vector> each time? This would also speed up compilation as the compilers could start with a pre-built symbol table for all the standard headers.
Pre-including everything would also solve some portability problems: for example when you include <map> it's defined what symbols are taken into std namespace, but it's not guaranteed that other standard symbols are not loaded into it and for example you could end up (in theory) with std::vector also becoming available.
It happens sometimes that a programmer forgets to include a standard header but the program compiles anyway because of an include dependence of the specific implementation. When moving the program to another environment (or just another version of the same compiler) the same source code could however stop compiling.
From a technical point of view I can image a compiler just preloading (with mmap) an optimal perfect-hash symbol table for the standard library.
This should be faster to do than loading and doing a C++ parse of even a single standard include file and should be able to provide faster lookup for std:: names. This data would also be read-only (thus probably allowing a more compact representation and also shareable between multiple instances of the compiler).
These are however just shoulds as I never implemented this.
The only downside I see is that we C++ programmers would lose compilation coffee breaks and Stack Overflow visits :-)
EDIT
Just to clarify the main advantage I see is for the programmers that today, despite the C++ standard library being a single monolithic namespace, are required to know which sub-part (include file) contains which function/class. To add insult to injury when they make a mistake and forget an include file still the code may compile or not depending on the implementation (thus leading to non-portable programs).
Short answer is because it is not the way the C++ language is supposed to be used
There are good reasons for that:
namespace pollution - even if this could be mitigated because std namespace is supposed to be self coherent and programmer are not forced to use using namespace std;. But including the whole library with using namespace std; will certainly lead to a big mess...
force programmer to declare the modules that he wants to use to avoid inadvertently calling a wrong standard function because standard library is now huge and not all programmers know all modules
history: C++ has still strong inheritance from C where namespace do not exist and where the standard library is supposed to be used as any other library.
To go in your sense, Windows API is an example where you only have one big include (windows.h) that loads many other smaller include files. And in fact, precompiled headers allows that to be fast enough
So IMHO a new language deriving from C++ could decide to automatically declare the whole standard library. A new major release could also do it, but it could break code intensively using using namespace directive and having custom implementations using same names as some standard modules.
But all common languages that I know (C#, Python, Java, Ruby) require the programmer to declare the parts of the standard library that he wants to use, so I suppose that systematically making available every piece of the standard library is still more awkward than really useful for the programmer, at least until someone find how to declare the parts that should not be loaded - that's why I spoke of a new derivative from C++
Most of the C++ standard libraries are template based which means that the code they'll generate will depend ultimately in how you use them. In other words, there is very little that could be compiled before instantiate a template like std::vector<MyType> m_collection;.
Also, C++ is probably the slowest language to compile and there is a lot parsing work that compilers have to do when you #include a header file that also includes other headers.
Well, first thing first, C++ tries to adhere to "you only pay for what you use".
The standard-library is sometimes not part of what you use at all, or even of what you could use if you wanted.
Also, you can replace it if there's a reason to do so: See libstdc++ and libc++.
That means just including it all without question isn't actually such a bright idea.
Anyway, the committee are slowly plugging away at creating a module-system (It takes lots of time, hopefully it will work for C++1z: C++ Modules - why were they removed from C++0x? Will they be back later on?), and when that's done most downsides to including more of the standard-library than strictly neccessary should disappear, and the individual modules should more cleanly exclude symbols they need not contain.
Also, as those modules are pre-parsed, they should give the compilation-speed improvement you want.
You offer two advantages of your scheme:
Compile-time performance. But nothing in the standard prevents an implementation from doing what you suggest[*] with a very slight modification: that the pre-compiled table is only mapped in when the translation unit includes at least one standard header. From the POV of the standard, it's unnecessary to impose potential implementation burden over a QoI issue.
Convenience to programmers: under your scheme we wouldn't have to specify which headers we need. We do this in order to support C++ implementations that have chosen not to implement your idea of making the standard headers monolithic (which currently is all of them), and so from the POV of the C++ standard it's a matter of "supporting existing practice and implementation freedom at a cost to programmers considered acceptable". Which is kind of the slogan of C++, isn't it?
Since no C++ implementation (that I know of) actually does this, my suspicion is that in point of fact it does not grant the performance improvement you think it does. Microsoft provides precompiled headers (via stdafx.h) for exactly this performance reason, and yet it still doesn't give you an option for "all the standard libraries", instead it requires you to say what you want in. It would be dead easy for this or any other implementation to provide an implementation-specific header defined to have the same effect as including all standard headers. This suggests to me that at least in Microsoft's opinion there would be no great overall benefit to providing that.
If implementations were to start providing monolithic standard libraries with a demonstrable compile-time performance improvement, then we'd discuss whether or not it's a good idea for the C++ standard to continue permitting implementations that don't. As things stand, it has to.
[*] Except perhaps for the fact that <cassert> is defined to have different behaviour according to the definition of NDEBUG at the point it's included. But I think implementations could just preprocess the user's code as normal, and then map in one of two different tables according to whether it's defined.
I think the answer comes down to C++'s philosophy of not making you pay for what you don't use. It also gives you more flexibility: you aren't forced to use parts of the standard library if you don't need them. And then there's the fact that some platforms might not support things like throwing exceptions or dynamically allocating memory (like the processors used in the Arduino, for example). And there's one other thing you said that is incorrect. As long as it's not a template class, you are allowed to add swap operators to the std namespace for your own classes.
First of all, I am afraid that having a prelude is a bit late to the game. Or rather, seeing as preludes are not easily extensible, we have to content ourselves with a very thin one (built-in types...).
As an example, let's say that I have a C++03 program:
#include <boost/unordered_map.hpp>
using namespace std;
using boost::unordered_map;
static unordered_map<int, string> const Symbols = ...;
It all works fine, but suddenly when I migrate to C++11:
error: ambiguous symbol "unordered_map", do you mean:
- std::unordered_map
- boost::unordered_map
Congratulations, you have invented the least backward compatible scheme for growing the standard library (just kidding, whoever uses using namespace std; is to blame...).
Alright, let's not pre-include them, but still bundle the perfect hash table anyway. The performance gain would be worth it, right?
Well, I seriously doubt it. First of all because the Standard Library is tiny compared to most other header files that you include (hint: compare it to Boost). Therefore the performance gain would be... smallish.
Oh, not all programs are big; but the small ones compile fast already (by virtue of being small) and the big ones include much more code than the Standard Library headers so you won't get much mileage out of it.
Note: and yes, I did benchmark the file look-up in a project with "only" a hundred -I directives; the conclusion was that pre-computing the "include path" to "file location" map and feeding it to gcc resulted in a 30% speed-up (after using ccache already). Generating it and keeping it up-to-date was complicated, so we never used it...
But could we at least include a provision that the compiler could do it in the Standard?
As far as I know, it is already included. I cannot remember if there is a specific blurb about it, but the Standard Library is really part of the "implementation" so resolving #include <vector> to an internal hash-map would fall under the as-if rule anyway.
But they could do it, still!
And lose any flexibility. For example Clang can use either the libstdc++ or the libc++ on Linux, and I believe it to be compatible with the Dirkumware's derivative that ships with VC++ (or if not completely, at least greatly).
This is another point of customization: if the Standard library does not fit your needs, or your platforms, by virtue of being mostly treated like any other library you can replace part of most of it with relative ease.
But! But!
#include <stdafx.h>
If you work on Windows, you will recognize it. This is called a pre-compiled header. It must be included first (or all benefits are lost) and in exchange instead of parsing files you are pulling in an efficient binary representation of those parsed files (ie, a serialized AST version, possibly with some type resolution already performed) which saves off maybe 30% to 50% of the work. Yep, this is close to your proposal; this is Computer Science for you, there's always someone else who thought about it first...
Clang and gcc have a similar mechanism; though from what I've heard it can be so painful to use that people prefer the more transparent ccache in practice.
And all of these will come to naught with modules.
This is the true solution to this pre-processing/parsing/type-resolving madness. Because modules are truly isolated (ie, unlike headers, not subject to inclusion order), an efficient binary representation (like pre-compiled headers) can be pre-computed for each and every module you depend on.
This not only means the Standard Library, but all libraries.
Your solution, more flexible, and dressed to the nines!
One could use an alternative implementation of the C++ Standard Library to the one shipped with the compiler. Or wrap headers with one's definitions, to add, enable or disable features (see GNU wrapper headers). Plain text headers and the C inclusion model are a more powerful and flexible mechanism than a binary black box.

Open-source C++ scanning library

Rationale: In my day-to-day C++ code development, I frequently need to
answer basic questions such as who calls what in a very large C++ code
base that is frequently changing. But, I also need to have some
automated way to exactly identify what the code is doing around a
particular area of code. "grep" tools such as Cscope are useful (and
I use them heavily already), but are not C++-language-aware: They
don't give any way to identify the types and kinds of lexical
environment of a given use of a type or function a such way that is
conducive to automation (even if said automation is limited to
"read-only" operations such as code browsing and navigation, but I'm
asking for much more than that below).
Question: Does there exist already an open-source C/C++-based library
(native, not managed, not Microsoft- or Linux-specific) that can
statically scan or analyze a large tree of C++ code, and can produce
result sets that answer detailed questions such as:
What functions are called by some supplied function?
What functions make use of this supplied type?
Ditto the above questions if C++ classes or class templates are involved.
The result set should provide some sort of "handle". I should be able
to feed that handle back to the library to perform the following types
of introspection:
What is the byte offset into the file where the reference was made?
What is the reference into the abstract syntax tree (AST) of that
reference, so that I can inspect surrounding code constructs? And
each AST entity would also have file path, byte-offset, and
type-info data associated with it, so that I could recursively walk
up the graph of callers or referrers to do useful operations.
The answer should meet the following requirements:
API: The API exposed must be one of the following:
C or C++ and probably is "C handle" or C++-class-instance-based
(and if it is, must be generic C o C++ code and not Microsoft- or
Linux-specific code constructs unless it is to meet specifics of
the given platform), or
Command-line standard input and standard output based.
C++ aware: Is not limited to C code, but understands C++ language
constructs in minute detail including awareness of inter-class
inheritance relationships and C++ templates.
Fast: Should scan large code bases significantly faster than
compiling the entire code base from scratch. This probably needs to
be relaxed, but only if Incremental result retrieval and Resilient
to small code changes requirements are fully met below.
Provide Result counts: I should be able to ask "How many results
would you provide to some request (and no don't send me all of the
results)?" that responds on the order of less than 3 seconds versus
having to retrieve all results for any given question. If it takes
too long to get that answer, then wastes development time. This is
coupled with the next requirement.
Incremental result retrieval: I should be able to then ask "Give me
just the next N results of this request", and then a handle to the
result set so that I can ask the question repeatedly, thus
incrementally pulling out the results in stages. This means I
should not have to wait for the entire result set before seeing
some subset of all of the results. And that I can cancel the
operation safely if I have seen enough results. Reason: I need to
answer the question: "What is the build or development impact of
changing some particular function signature?"
Resilient to small code changes: If I change a header or source
file, I should not have to wait for the entire code base to be
rescanned, but only that header or source file
rescanned. Rescanning should be quick. E.g., don't do what cscope
requires you to do, which is to rescan the entire code base for
small changes. It is understood that if you change a header, then
scanning can take longer since other files that include that header
would have to be rescanned.
IDE Agnostic: Is text editor agnostic (don't make me use a specific
text editor; I've made my choice already, thank you!)
Platform Agnostic: Is platform-agnostic (don't make me only use it
on Linux or only on Windows, as I have to use both of those
platforms in my daily grind, but I need the tool to be useful on
both as I have code sandboxes on both platforms).
Non-binary: Should not cost me anything other than time to download
and compile the library and all of its dependencies.
Not trial-ware.
Actively Supported: It is likely that sending help requests to mailing lists
or associated forums is likely to get a response in less than 2
days.
Network agnostic: Databases the library builds should be able to be used directly on
a network from 32-bit and 64-bit systems, both Linux and Windows
interchangeably, at the same time, and do not embed hardcoded paths
to filesystems that would otherwise "root" the database to a
particular network.
Build environment agnostic: Does not require intimate knowledge of my build environment, with
the notable exception of possibly requiring knowledge of compiler
supplied CPP macro definitions (e.g. -Dmacro=value).
I would say that CLang Index is a close fit. However I don't think that it stores data in a database.
Anyway the CLang framework offer what you actually need to build a tool tailored to your needs, if only because of its C, C++ and Objective-C parsing / indexing capabitilies. And since it's provided as a set of reusable libraries... it was crafted for being developed on!
I have to admit that I haven't used either because I work with a lot of Microsoft-specific code that uses Microsoft compiler extensions that i don't expect them to understand, but the two open source analyzers I'm aware of are Mozilla Pork and the Clang Analyzer.
If you are looking for results of code analysis (metrics, graphs, ...) why not use a tool (instead of API) to do that? If you can, I suggest you to take a look at Understand.
It's not free (there's a trial version) but I found it very useful.
Maybe Doxygen with GraphViz could be the answer of some of your constraints but not all,for example the analysis of Doxygen is not incremental.

Why STL implementation is so unreadable? How C++ could have been improved here?

For instance why does most members in STL implementation have _M_ or _ or __ prefix?
Why there is so much boilerplate code ?
What features C++ is lacking that would allow make vector (for instance) implementation clear and more concise?
Implementations use names starting with an underscore followed by an uppercase letter or two underscores to avoid conflicts with user-defined macros. Such names are reserved in C++.
For example, one could define a macro called Type and then #include <vector>. If vector implementations used Type as a template parameter name, it would break.
However, one is not allowed to define macros called _Type (or __type, type__ etc.). Therefore, vector can safely use such names.
Lots of STL implementations also include checking for debug builds, such as verifying that two iterators are from the same container when comparing them, and watching for iterators going out of bounds. This involves fairly complex code to track the container and validity of every iterator created, but is invaluable for finding bugs. This code is also all interwoven with the standard release code with #ifdefs - even in the STL algorithms. So it's never going to be as clear as their most basic operation. Sites like this one show the most basic functionality of STL algorithms, stating their functionality is "equivalent to" the code they show. You won't see that in your header files though.
In addition to the good reasons robson and AshleysBrain have already given, one reason that C++ standard library implementations have such terse names and compact code is that virtually every C++ program (compilation unit, really) includes a large number of the standard library headers, and they are thus repeatedly recompiled (remember that they're largely inlined and template-based, whereas the C standard library headers only contain a handful of function declarations). A standard library written to "industry standard" style guidelines would take longer to compile and thus lead to the perception that a particular compiler was "slow". By minimizing whitespace and using short identifier names, the lexer and parser have less work to do, and the whole compilation process completes a little bit faster.
Another reason worth mentioning is that many standard library implementations (e.g. Dinkumware, Rogue Wave (old), etc.) can be used with several different compilers with widely different standards compliance and quirks. There's frequently a lot of macro hackery aimed at satisfying each supported platform.

C++ Developer Tools: The Dark Areas [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
While C++ Standards Committee works hard to define its intricate but powerful features and maintain its backward compatibility with C, in my personal experience I've found many aspects of programming with C++ cumbersome due to lack of tools.
For example, I recently tried to refactor some C++ code, replacing many shared_ptr by T& to remove pointer usages where not needed within a large library. I had to perform almost the whole refactoring manually as none of the refactoring tools out there would help me do this safely.
Dealing with STL data structures using the debugger is like raking out the phone number of a stranger when she disagrees.
In your experience, what essential developer tools are lacking in C++?
My dream tool would be a compile-time template debugger. Something that'd let me interactively step through template instantiations and examine the types as they get instantiated, just like the regular debugger does at runtime.
In your experience, what essential developer tools are lacking in C++?
Code completion. Seriously. Refactoring is a nice-to-have feature but I think code completion is much more fundamental and more important for API discoverabilty and usabilty.
Basically, tools that require any undestanding of C++ code suck.
Code generation of class methods. When I type in the declaration you should be able to figure out the definition. And while I'm on the topic can we fix "goto declaration / goto definition" always going to the declaration?
Refactoring. Yes I know it's formally impossible because of the pre-processor - but the compiler could still do a better job of a search and replace on a variable name than I can maually. You could also syntax highlight local, members and paramaters while your at it.
Lint. So the variable I just defined shadows a higher one? C would have told me that in 1979, but c++ in 2009 apparently prefers me to find out on my own.
Some decent error messages. If I promise never to define a class with the same name inside the method of a class - do you promise to tell me about a missing "}". In fact can the compiler have some knowledge of history - so if I added an unbalanced "{" or "(" to a previously working file could we consider mentioning this in the message?
Can the STL error messages please (sorry to quote another comment) not look like you read "/dev/random", stuck "!/bin/perl" in front and then ran the tax code through the result?
How about some warnings for useful things? "Integer used as bool performance warning" is not useful, it doesn't make any performance difference, I don't have a choice - it's what the library does, and you have already told me 50 times.
But if I miss a ";" from the end of a class declaration or a "}" from the end of a method definition you don't warn me - you go out of your way to find the least likely (but theoretically) correct way to parse the result.
It's like the built in spell checker in this browser which happily accepts me misspelling wether (because that spelling is an archaic term for a castrated male goat! How many times do I write about soprano herbivores?)
How about spell checking? 40 years ago mainframe Fortran compilers had spell checking so if misspelled "WRITE" you didn't come back the next day to a pile of cards and a snotty error message. You got a warning that "WRIET" had been changed to WRITE in line X. Now the compiler happily continues and spends 10mins building some massive browse file and debugger output before telling you that you misspelled prinft 10,000 lines ago.
ps. Yes a lot of these only apply to Visual C++.
pps. Yes they are coming with my medication now.
If talking about MS Visual Studio C++, Visual Assist is a very handy tool for code completition, some refactorings - e.g. rename all/selected references, find/goto declaration, but I still miss the richness of Java IDEs like JBuilder or IntelliJ.
What I still miss, is a semantic diff tool - you know, one which does not compare the two files line-by-line, but statements/expressions. What I've found on the internet are only some abandoned tries - if you know one, please write in comment
The main problem with C++ is that it is hard to parse. That's why there are so very few tools out there that work on source code. (And that's also why we're stuck with some of the most horrific error messages in the history of compilers.) The result is, that, with very few exceptions (I only know doxygen and Visual Assist), it's down to the actual compiler to support everything needed to assist us writing and massaging the code. With compilers traditionally being rather streamlined command line tools, that's a very weak foundation to build rich editor support on.
For about ten years now, I'm working with VS. meanwhile, its code completion is almost usable. (Yes, I'm working on dual core machines. I wouldn't have said this otherwise, wouldn't I?) If you use Visual Assist, code completion is actually quite good. Both VS itself and VA come with some basic refactoring nowadays. That, too, is almost usable for the few things it aims for (even though it's still notably less so than code completion). Of course, >15 years of refactoring with search & replace being the only tool in the box, my demands are probably much too deteriorated compared to other languages, so this might not mean much.
However, what I am really lacking is still: Fully standard conforming compilers and standard library implementations on all platforms my code is ported to. And I'm saying this >10 years after the release of the last standard and about a year before the release of the next one! (Which just adds this: C++1x being widely adopted by 2011.)
Once these are solved, there's a few things that keep being mentioned now and then, but which vendors, still fighting with compliance to a >10 year old standard (or, as is actually the case with some features, having even given up on it), never got around to actually tackle:
usable, sensible, comprehensible compiler messages (como is actually pretty good, but that's only if you compare it to other C++ compilers); a linker that doesn't just throw up its hands and says "something's wrong, I can't continue" (if you have taught C++ as a first language, you'll know what I mean); concepts ('nuff said)
an IO stream implementation that doesn't throw away all the compile-time advantages which overloading operator<<() gives us by resorting to calling the run-time-parsing printf() under the hood (Dietmar Kühl once set out to do this, unfortunately his implementation died without the techniques becoming widespread)
STL implementations on all platforms that give rich debugging support (Dinkumware is already pretty good in that)
standard library implementations on all platforms that use every trick in the book to give us stricter checking at compile-time and run-time and more performance (wnhatever happened to yasli?)
the ability to debug template meta programs (yes, jalf already mentioned this, but it cannot be said too often)
a compiler that renders tools like lint useless (no need to fear, lint vendors, that's just wishful thinking)
If all these and a lot of others that I have forgotten to mention (feel free to add) are solved, it would be nice to get refactoring support that almost plays in the same league as, say, Java or C#. But only then.
A compiler which tries to optimize the compilation model.
Rather than naively include headers as needed, parsing them again in every compilation unit, why not parse the headers once first, build complete syntax trees for them (which would have to include preprocessor directives, since we don't yet know which macros are defined), and then simply run through that syntax tree whenever the header is included, applying the known #defines to prune it.
It could even be be used as a replacement for precompiled headers, so every header could be precompiled individually, just by dumping this syntax tree to the disk. We wouldn't need one single monolithic and error-prone precompiled header, and would get finer granularity on rebuilds, rebuilding as little as possible even if a header is modified.
Like my other suggestions, this would be a lot of work to implement, but I can't see any fundamental problems rendering it impossible.
It seems like it could dramatically speed up compile-times, pretty much rendering it linear in the number of header files, rather than in the number of #includes.
A fast and reliable indexer. Most of the fancy features come after this.
A common tool to enforce coding standards.
Take all the common standards and allow you to turn them on/off as appropriate for your project.
Currently just a bunch of perl scrips usullay has to supstitute.
I'm pretty happy with the state of C++ tools. The only thing I can think of is a default install of Boost in VS/gcc.
Refactoring, Refactoring, Refactoring. And compilation while typing. For refactorings I am missing at least half of what most modern Java IDEs can do. While Visual Assist X goes a long way, a lot of refactoring is missing. The task of writing C++ code is still pretty much that. Writing C++ code. The more the IDE supports high level refactoring the more it becomes construction, the more mallable the structure is the easier it will be to iterate over the structure and improve it. Pick up a demo version of Intellij and see what you are missing. These are just some that I remember from a couple of years ago.
Extract interface: taken a view classes with a common interface, move the common functions into an interface class (for C++ this would be an abstract base class) and derive the designated functions as abstract
Better extract method: mark a section of code and have the ide write a function that executes that code, constructing the correct parameters and return values
Know the type of each of the symbols that you are working with so that not only command completion can be correct for derived values e.g. symbol->... but also only offer functions that return the type that can be used in the current expression e.g. for
UiButton button = window->...
At the ... only insert functions that actually return a UiButton.
A tool all on it's own: Naming Conventions.
Intelligent Intellisense/Code Completion even for template-heavy code.
When you're inside a function template, of course the compiler can't say anything for sure about the template parameter (at least not without Concepts), but it should be able to make a lot of guesses and estimates. Depending on how the type is used in the function, it should be able to narrow the possible types down, in effect a kind of conservative ad-hoc Concepts. If one line in the function calls .Foo() on a template type, obviously a Foo member method must exist, and Intellisense should suggest it in the rest of the function as well.
It could even look at where the function is invoked from, and use that to determine at least one valid template parameter type, and simply offer Intellisense inside the function based on that.
If the function is called with a int as a template parameter, then obviously, use of int must be valid, and so the IDE could use that as a "sample type" inside the function and offer Intellisense suggestions based on that.
JavaScript just got Intellisense support in VS, which had to overcome a lot of similar problems, so it can be done. Of course, with C++'s level of complexity, it'd be a ridiculous amount of work. But it'd be a nice feature.

Where do I learn "what I need to know" about C++ compilers?

I'm just starting to explore C++, so forgive the newbiness of this question. I also beg your indulgence on how open ended this question is. I think it could be broken down, but I think that this information belongs in the same place.
(FYI -- I am working predominantly with the QT SDK and mingw32-make right now and I seem to have configured them correctly for my machine.)
I knew that there was a lot in the language which is compiler-driven -- I've heard about pre-compiler directives, but it seems like someone would be able to write books the different C++ compilers and their respective parameters. In addition, there are commands which apparently precede make (like qmake, for example (is this something only in QT)).
I would like to know if there is any place which gives me an overview of what compilers are out there, and what their different options are. I'd also like to know how each of them views Makefiles (it seems that there is a difference in syntax between them?).
If there is no website regarding, "Everything you need to know about C++ compilers but were afraid to ask," what would be the best way to go about learning the answers to these questions?
Concerning the "numerous options of the various compilers"
A piece of good news: you needn't worry about the detail of most of these options. You will, in due time, delve into this, only for the very compiler you use, and maybe only for the options that pertain to a particular set of features. But as a novice, generally trust the default options or the ones supplied with the make files.
The broad categories of these features (and I may be missing a few) are:
pre-processor defines (now, you may need a few of these)
code generation (target CPU, FPU usage...)
optimization (hints for the compiler to favor speed over size and such)
inclusion of debug info (which is extra data left in the object/binary and which enables the debugger to know where each line of code starts, what the variables names are etc.)
directives for the linker
output type (exe, library, memory maps...)
C/C++ language compliance and warnings (compatibility with previous version of the compiler, compliance to current and past C Standards, warning about common possible bug-indicative patterns...)
compile-time verbosity and help
Concerning an inventory of compilers with their options and features
I know of no such list but I'm sure it probably exists on the web. However, suggest that, as a novice you worry little about these "details", and use whatever free compiler you can find (gcc certainly a great choice), and build experience with the language and the build process. C professionals may likely argue, with good reason and at length on the merits of various compilers and associated runtine etc., but for generic purposes -and then some- the free stuff is all that is needed.
Concerning the build process
The most trivial applications, such these made of a single unit of compilation (read a single C/C++ source file), can be built with a simple batch file where the various compiler and linker options are hardcoded, and where the name of file is specified on the command line.
For all other cases, it is very important to codify the build process so that it can be done
a) automatically and
b) reliably, i.e. with repeatability.
The "recipe" associated with this build process is often encapsulated in a make file or as the complexity grows, possibly several make files, possibly "bundled together in a script/bat file.
This (make file syntax) you need to get familiar with, even if you use alternatives to make/nmake, such as Apache Ant; the reason is that many (most?) source code packages include a make file.
In a nutshell, make files are text files and they allow defining targets, and the associated command to build a target. Each target is associated with its dependencies, which allows the make logic to decide what targets are out of date and should be rebuilt, and, before rebuilding them, what possibly dependencies should also be rebuilt. That way, when you modify say an include file (and if the make file is properly configured) any c file that used this header will be recompiled and any binary which links with the corresponding obj file will be rebuilt as well. make also include options to force all targets to be rebuilt, and this is sometimes handy to be sure that you truly have a current built (for example in the case some dependencies of a given object are not declared in the make).
On the Pre-processor:
The pre-processor is the first step toward compiling, although it is technically not part of the compilation. The purposes of this step are:
to remove any comment, and extraneous whitespace
to substitute any macro reference with the relevant C/C++ syntax. Some macros for example are used to define constant values such as say some email address used in the program; during per-processing any reference to this constant value (btw by convention such constants are named with ALL_CAPS_AND_UNDERSCORES) is replace by the actual C string literal containing the email address.
to exclude all conditional compiling branches that are not relevant (the #IFDEF and the like)
What's important to know about the pre-processor is that the pre-processor directive are NOT part of the C-Language proper, and they serve several important functions such as the conditional compiling mentionned earlier (used for example to have multiple versions of the program, say for different Operating Systems, or indeed for different compilers)
Taking it from there...
After this manifesto of mine... I encourage to read but little more, and to dive into programming and building binaries. It is a very good idea to try and get a broad picture of the framework etc. but this can be overdone, a bit akin to the exchange student who stays in his/her room reading the Webster dictionary to be "prepared" for meeting native speakers, rather than just "doing it!".
Ideally you shouldn't need to care what C++ compiler you are using. The compatability to the standard has got much better in recent years (even from microsoft)
Compiler flags obviously differ but the same features are generally available, it's just a differently named option to eg. set warning level on GCC and ms-cl
The build system is indepenant of the compiler, you can use any make with any compiler.
That is a lot of questions in one.
C++ compilers are a lot like hammers: They come in all sizes and shapes, with different abilities and features, intended for different types of users, and at different price points; ultimately they all are for doing the same basic task as the others.
Some are intended for highly specialized applications, like high-performance graphics, and have numerous extensions and libraries to assist the engineer with those types of problems. Others are meant for general purpose use, and aren't necessarily always the greatest for extreme work.
The technique for using each type of hammer varies from model to model—and version to version—but they all have a lot in common. The macro preprocessor is a standard part of C and C++ compilers.
A brief comparison of many C++ compilers is here. Also check out the list of C compilers, since many programs don't use any C++ features and can be compiled by ordinary C.
C++ compilers don't "view" makefiles. The rules of a makefile may invoke a C++ compiler, but also may "compile" assembly language modules (assembling), process other languages, build libraries, link modules, and/or post-process object modules. Makefiles often contain rules for cleaning up intermediate files, establishing debug environments, obtaining source code, etc., etc. Compilation is one link in a long chain of steps to develop software.
Also, many development environments abstract the makefile into a "project file" which is used by an integrated development environment (IDE) in an attempt to simplify or automate many programming tasks. See a comparison here.
As for learning: choose a specific problem to solve and dive in. The target platform (Linux/Windows/etc.) and problem space will narrow the choices pretty well. Which you choose is often linked to other considerations, such as working for a particular company, or being part of a team. C++ has something like 95% commonality among all its flavors. Learn any one of them well, and learning the next is a piece of cake.