Say I have the below (very simple) code.
#include <iostream>
int main() {
    std::cout << std::stoi("12");
}
This compiles fine on both g++ and clang; however, it fails to compile on MSVC with the following error:
error C2039: 'stoi': is not a member of 'std'
error C3861: 'stoi': identifier not found
I know that std::stoi is part of the <string> header, which presumably the two former compilers include as part of <iostream> and the latter does not. According to the C++ standard [res.on.headers]
A C++ header may include other C++ headers.
Which, to me, basically says that all three compilers are correct.
This issue arose when one of my students submitted work, which the TA marked as not compiling; I of course went and fixed it. However, I would like to prevent future incidents like this. So, is there a way to determine which header files should be included, short of compiling on three different compilers to check every time?
The only way I can think of is to ensure that for every std function call, an appropriate include exists; but if you have existing code which is thousands of lines long, this may be tedious to search through. Is there an easier/better way to ensure cross-compiler compatibility?
Example with the three compilers: https://godbolt.org/z/kJhS6U
Is there an easier/better way to ensure cross-compiler compatibility?
This is always going to be a bit of a chore if you have a huge codebase and haven't been doing this so far, but once you've gone through fixing your includes, you can stick to a simple procedure:
When you write new code that uses a standard feature, like std::stoi, plug that name into Google, go to the cppreference.com article for it, then look at the top to see which header it's defined in.
Then include that, if it's not already included. Job done!
(You could use the standard for this, but that's not as accessible.)
Do not be tempted to sack it all off in favour of cheap, unportable hacks like <bits/stdc++.h>!
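Applied to the program from the question, that lookup points at <string> (which is where cppreference lists std::stoi), so the portable fix is a single extra include:

#include <iostream>
#include <string>   // std::stoi is declared here

int main() {
    std::cout << std::stoi("12");
}

With the include in place, the same source compiles on g++, clang, and MSVC alike.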
tl;dr: documentation
Besides reviewing documentation and doing that manually (painful and time-consuming), you can use some tools which can do that for you.
You can use ReSharper in Visual Studio, which is capable of organizing includes (in fact, VS without ReSharper is not very usable). If an include is missing, it recommends adding it, and if one is obsolete, the line with the include is shown in paler colors.
Or you can use CLion (available for all platforms), which also has this capability (in fact it comes from the same manufacturer, JetBrains).
There is also a tool called include-what-you-use, but its aim is to take advantage of forward declarations; I never used it personally (a teammate did that for our project).
In C++ the standard library is wrapped in the std namespace and the programmer is not supposed to define anything inside that namespace. Of course the standard include files don't step on each other's names inside the standard library (so it's never a problem to include a standard header).
Then why isn't the whole standard library included by default instead of forcing programmers to write for example #include <vector> each time? This would also speed up compilation as the compilers could start with a pre-built symbol table for all the standard headers.
Pre-including everything would also solve some portability problems: for example, when you include <map> it's defined which symbols are brought into the std namespace, but it's not guaranteed that other standard symbols are not loaded into it as well, and (in theory) you could end up with std::vector also becoming available.
It happens sometimes that a programmer forgets to include a standard header but the program compiles anyway because of an include dependence of the specific implementation. When moving the program to another environment (or just another version of the same compiler) the same source code could however stop compiling.
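A minimal sketch of that situation, mirroring the question at the top of this thread (whether it compiles depends entirely on what the implementation's <iostream> happens to include):

#include <iostream>   // note: <string> is NOT included

int main() {
    std::string s = "hello";   // compiles on implementations whose <iostream>
                               // happens to pull in <string>, fails on others
    std::cout << s << '\n';
}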
From a technical point of view I can imagine a compiler just preloading (with mmap) an optimal perfect-hash symbol table for the standard library.
This should be faster to do than loading and doing a C++ parse of even a single standard include file and should be able to provide faster lookup for std:: names. This data would also be read-only (thus probably allowing a more compact representation and also shareable between multiple instances of the compiler).
These are, however, just "shoulds", as I never actually implemented this.
The only downside I see is that we C++ programmers would lose compilation coffee breaks and Stack Overflow visits :-)
EDIT
Just to clarify: the main advantage I see is for programmers who today, despite the C++ standard library being a single monolithic namespace, are required to know which sub-part (include file) contains which function/class. To add insult to injury, when they make a mistake and forget an include file, the code may still compile or not depending on the implementation (thus leading to non-portable programs).
The short answer is that this is not the way the C++ language is supposed to be used.
There are good reasons for that:
namespace pollution - even if this could be mitigated, because the std namespace is supposed to be self-coherent and programmers are not forced to write using namespace std;. But including the whole library together with using namespace std; would certainly lead to a big mess...
forcing the programmer to declare the modules he wants to use avoids inadvertently calling the wrong standard function, because the standard library is now huge and not all programmers know all of its modules
history: C++ still inherits strongly from C, where namespaces do not exist and where the standard library is supposed to be used like any other library.
Along the lines you suggest, the Windows API is an example where you have only one big include (windows.h) that loads many other smaller include files. And in fact, precompiled headers allow that to be fast enough.
So IMHO a new language deriving from C++ could decide to automatically declare the whole standard library. A new major release could also do it, but it could break code that makes intensive use of the using namespace directive and has custom implementations using the same names as some standard modules.
But all common languages that I know of (C#, Python, Java, Ruby) require the programmer to declare the parts of the standard library that he wants to use, so I suppose that systematically making every piece of the standard library available is still more awkward than really useful for the programmer, at least until someone finds out how to declare the parts that should not be loaded - that's why I spoke of a new derivative of C++.
Most of the C++ standard library is template-based, which means that the code it generates ultimately depends on how you use it. In other words, there is very little that could be compiled before you instantiate a template like std::vector<MyType> m_collection;.
Also, C++ is probably the slowest language to compile, and there is a lot of parsing work that compilers have to do when you #include a header file that also includes other headers.
Well, first things first: C++ tries to adhere to "you only pay for what you use".
The standard-library is sometimes not part of what you use at all, or even of what you could use if you wanted.
Also, you can replace it if there's a reason to do so: See libstdc++ and libc++.
That means just including it all without question isn't actually such a bright idea.
Anyway, the committee are slowly plugging away at creating a module system (it takes lots of time, hopefully it will work for C++1z: C++ Modules - why were they removed from C++0x? Will they be back later on?), and when that's done most downsides to including more of the standard library than strictly necessary should disappear, and the individual modules should more cleanly exclude symbols they need not contain.
Also, as those modules are pre-parsed, they should give the compilation-speed improvement you want.
You offer two advantages of your scheme:
Compile-time performance. But nothing in the standard prevents an implementation from doing what you suggest[*] with a very slight modification: that the pre-compiled table is only mapped in when the translation unit includes at least one standard header. From the POV of the standard, it's unnecessary to impose potential implementation burden over a QoI issue.
Convenience to programmers: under your scheme we wouldn't have to specify which headers we need. We do this in order to support C++ implementations that have chosen not to implement your idea of making the standard headers monolithic (which currently is all of them), and so from the POV of the C++ standard it's a matter of "supporting existing practice and implementation freedom at a cost to programmers considered acceptable". Which is kind of the slogan of C++, isn't it?
Since no C++ implementation (that I know of) actually does this, my suspicion is that in point of fact it does not grant the performance improvement you think it does. Microsoft provides precompiled headers (via stdafx.h) for exactly this performance reason, and yet it still doesn't give you an option for "all the standard libraries"; instead it requires you to say what you want in it. It would be dead easy for this or any other implementation to provide an implementation-specific header defined to have the same effect as including all standard headers. This suggests to me that at least in Microsoft's opinion there would be no great overall benefit to providing that.
If implementations were to start providing monolithic standard libraries with a demonstrable compile-time performance improvement, then we'd discuss whether or not it's a good idea for the C++ standard to continue permitting implementations that don't. As things stand, it has to.
[*] Except perhaps for the fact that <cassert> is defined to have different behaviour according to the definition of NDEBUG at the point it's included. But I think implementations could just preprocess the user's code as normal, and then map in one of two different tables according to whether it's defined.
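For reference, the <cassert> quirk mentioned in the footnote is easy to see: assert is redefined according to the state of NDEBUG each time the header is included, so one translation unit can get both behaviours (a minimal sketch):

#include <cassert>        // NDEBUG not defined: assert is active
void checked(int x)   { assert(x > 0); }       // aborts if x <= 0

#define NDEBUG
#include <cassert>        // included again: assert now expands to ((void)0)
void unchecked(int x) { assert(x > 0); }       // no run-time check at all

int main() { checked(1); unchecked(-1); }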
I think the answer comes down to C++'s philosophy of not making you pay for what you don't use. It also gives you more flexibility: you aren't forced to use parts of the standard library if you don't need them. And then there's the fact that some platforms might not support things like throwing exceptions or dynamically allocating memory (like the processors used in the Arduino, for example). And there's one other thing you said that is incorrect: as long as it's not a template class, you are allowed to specialize std::swap in the std namespace for your own classes.
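As a hedged illustration of that last point (Widget is a made-up type; note that this blanket permission for function-template specializations was narrowed in C++20, so an ADL-findable swap in your own namespace is the more portable idiom today):

#include <utility>   // declares the primary template std::swap

struct Widget { int value = 0; };

namespace std {
    // Full specialization of std::swap for a program-defined, non-template
    // class -- allowed up to and including C++17.
    template <>
    void swap<Widget>(Widget& a, Widget& b) noexcept {
        int tmp = a.value;
        a.value = b.value;
        b.value = tmp;
    }
}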
First of all, I am afraid that having a prelude is a bit late to the game. Or rather, seeing as preludes are not easily extensible, we have to content ourselves with a very thin one (built-in types...).
As an example, let's say that I have a C++03 program:
#include <boost/unordered_map.hpp>
#include <string>
using namespace std;
using boost::unordered_map;
static unordered_map<int, string> const Symbols = ...;
It all works fine, but suddenly when I migrate to C++11:
error: ambiguous symbol "unordered_map", do you mean:
- std::unordered_map
- boost::unordered_map
Congratulations, you have invented the least backward compatible scheme for growing the standard library (just kidding, whoever uses using namespace std; is to blame...).
Alright, let's not pre-include them, but still bundle the perfect hash table anyway. The performance gain would be worth it, right?
Well, I seriously doubt it. First of all because the Standard Library is tiny compared to most other header files that you include (hint: compare it to Boost). Therefore the performance gain would be... smallish.
Oh, not all programs are big; but the small ones compile fast already (by virtue of being small) and the big ones include much more code than the Standard Library headers so you won't get much mileage out of it.
Note: and yes, I did benchmark the file look-up in a project with "only" a hundred -I directives; the conclusion was that pre-computing the "include path" to "file location" map and feeding it to gcc resulted in a 30% speed-up (after using ccache already). Generating it and keeping it up-to-date was complicated, so we never used it...
But could we at least include a provision that the compiler could do it in the Standard?
As far as I know, it is already included. I cannot remember if there is a specific blurb about it, but the Standard Library is really part of the "implementation" so resolving #include <vector> to an internal hash-map would fall under the as-if rule anyway.
But they could do it, still!
And lose any flexibility. For example Clang can use either libstdc++ or libc++ on Linux, and I believe it to be compatible with the Dinkumware derivative that ships with VC++ (or if not completely, at least greatly).
This is another point of customization: if the Standard Library does not fit your needs, or your platforms, by virtue of being mostly treated like any other library you can replace part or most of it with relative ease.
But! But!
#include <stdafx.h>
If you work on Windows, you will recognize it. This is called a pre-compiled header. It must be included first (or all benefits are lost) and in exchange instead of parsing files you are pulling in an efficient binary representation of those parsed files (ie, a serialized AST version, possibly with some type resolution already performed) which saves off maybe 30% to 50% of the work. Yep, this is close to your proposal; this is Computer Science for you, there's always someone else who thought about it first...
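A typical stdafx.h is nothing magical; it is just a hand-picked list of heavy, rarely-changing headers (the exact contents below are hypothetical), and every .cpp file in the project includes it first:

// stdafx.h -- hypothetical precompiled-header contents for an MSVC project
#pragma once
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>
// ...plus any third-party headers that rarely change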
Clang and gcc have a similar mechanism; though from what I've heard it can be so painful to use that people prefer the more transparent ccache in practice.
And all of these will come to naught with modules.
This is the true solution to this pre-processing/parsing/type-resolving madness. Because modules are truly isolated (ie, unlike headers, not subject to inclusion order), an efficient binary representation (like pre-compiled headers) can be pre-computed for each and every module you depend on.
This not only means the Standard Library, but all libraries.
Your solution, more flexible, and dressed to the nines!
One could use an alternative implementation of the C++ Standard Library to the one shipped with the compiler. Or wrap headers with one's definitions, to add, enable or disable features (see GNU wrapper headers). Plain text headers and the C inclusion model are a more powerful and flexible mechanism than a binary black box.
I have read from a codeforces blog that if we add #include <bits/stdc++.h> in a C++ program then there is no need to include any other header files. How does #include <bits/stdc++.h> work and is it ok to use it instead of including individual header files?
It is basically a header file that also includes every standard library and STL include file. The only purpose I can see for it would be for testing and education.
See e.g. the GCC 4.8.0 <bits/stdc++.h> source.
Using it would include a lot of unnecessary stuff and increases compilation time.
Edit: As Neil says, it's an implementation for precompiled headers. If you set it up for precompilation correctly it could, in fact, speed up compilation time depending on your project. (https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html)
I would, however, suggest that you take time to learn about each of the standard library headers and include them separately instead, and not use "super headers" except for precompilation purposes.
#include <bits/stdc++.h> is an implementation file for a precompiled header.
From a software engineering perspective, it is a good idea to minimize what you include. Using <bits/stdc++.h> actually includes a lot of files which your program may not need, thus increasing both compile time and program size unnecessarily. [edit: as pointed out by @Swordfish in the comments, the output program size remains unaffected. But still, it's good practice to include only the libraries you actually need, unless it's some competitive programming contest]
But in contests, using this file is a good idea, when you want to reduce the time wasted in doing chores; especially when your rank is time-sensitive.
It works in most programming contest environments, including ACM-ICPC (Sub-Regionals, Regionals, and World Finals) and many online judges.
The disadvantages of it are that it:
increases the compilation time.
uses an internal non-standard header file of the GNU C++ library, and so will not compile in MSVC, XCode, and many other compilers
That header file is not part of the C++ standard, is therefore non-portable, and should be avoided.
Moreover, even if there were some catch-all header in the standard, you would want to avoid it in favor of specific headers, since the compiler has to actually read in and parse every included header (including recursively included headers) every single time that translation unit is compiled.
Unfortunately that approach is not portable C++ (so far).
All standard names are in namespace std, and moreover you cannot know which names are NOT defined by including a header (in other words, it's perfectly legal for an implementation to declare the name std::string directly or indirectly when using #include <vector>).
Despite this however you are required by the language to know and tell the compiler which standard header includes which part of the standard library. This is a source of portability bugs because if you forget for example #include <map> but use std::map it's possible that the program compiles anyway silently and without warnings on a specific version of a specific compiler, and you may get errors only later when porting to another compiler or version.
In my opinion there are no valid technical excuses that explain why this is necessary for the general user: the compiler binary could have all standard namespace built in and this could actually increase the performance even more than precompiled headers (e.g. using perfect hashing for lookups, removing standard headers parsing or loading/demarshalling and so on).
The use of standard headers simplifies the life of those who build compilers or standard libraries, and that's all. It's not something to help users.
However, this is the way the language is defined and you need to know which header defines which names, so plan for some extra neurons to be burnt in pointless configurations to remember that (or try to find an IDE that automatically adds the standard headers you use and removes the ones you don't... a reasonable alternative).
A question was asked about whether namespace and folder structure would affect the performance of an assembly in C#. The answers were very useful, but were specific to C# and the CLR.
How will the namespace and folder structure affect the performance of an assembly if it is written in C++ with gcc? What's the situation on other OSes, such as Linux or Mac OS?
If there are any significant performance issues, what should I do or avoid doing to maximize performance?
Neither your directory hierarchy nor namespaces will affect your compiled code. The code your compiler will generate will be the same. This goes for all compilers and all OS's.
To expand a bit on what Kyle said:
Namespaces are nothing more than a syntactic way for the user and the compiler to put names in different buckets. They exist to allow you to use more common and appropriate names for things without having to worry about conflicts with someone else. std::vector is a different type from a mathematical vector class. They can share the same name so long as they're in different namespaces.
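A small sketch of that point (the math namespace and its vector are made-up names for illustration):

#include <vector>

namespace math {
    struct vector { double x, y, z; };   // a geometric vector
}

int main() {
    std::vector<int> v{1, 2, 3};          // the standard container
    math::vector w{1.0, 2.0, 3.0};        // an unrelated type with the same name
    return static_cast<int>(v.size() + w.x);
}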
As far as the compiler is concerned, a function in a namespace is no different from a function anywhere else. It just has a funny name. Indeed, compilers are allowed the freedom to do what's called "name mangling": when the compiler sees std::vector<int>, it can actually convert that into something like __std~~vector~t~~int32_t~~__ or whatever. The mangling algorithm is chosen so that no user-defined name in the global namespace can match the name used by the namespace mangler. Thus all namespace-scoped names are separate from names in other namespaces, even the global one.
Basically, the first step in the compilation process is to effectively eliminate namespaces. Therefore, later compiler steps have no clue what namespace something is even in. So they cannot generate code from it. And thus, namespaces cannot have any effect on the execution speed of compiled code.
Folders... cannot possibly matter. After compilation, you get a single executable, library, or DLL. If a compiler ever did any code generation based on the location of the source files, you would be well advised to avoid that compiler like the plague. The compiler writers would have to be trolling their users to make that happen.
All applicants to our company must pass a simple quiz using C as part of early screening process.
It consists of a C source file that must be modified to provide the desired functionality. We clearly state that we will attempt to compile the file as-is, with no changes.
Almost all applicants use "strlen" but half of them do not include "string.h", so it does not compile until I include it.
Are they just lazy or are there compilers that do not require you to include standard library files, such as "string.h"?
GCC will happily compile the following code as is:
main()
{
    printf("%u\n", strlen("Hello world"));
}
It will complain about incompatible implicit declaration of built-in function ‘printf’ and strlen(), but it will still produce an executable.
If you compile with -Werror it won't compile.
I'm pretty sure it's non-conformant for a compiler to include headers that aren't asked for. The reason for this is that the C standard says that various names are reserved, if the relevant header is included. I think this implies they aren't reserved if they aren't included, since compilers aren't allowed to reserve names the standard doesn't say are reserved (unless of course they include a non-standard header which happens to be provided by the compiler, and is documented elsewhere to reserve extra names. Which is what happens when you use POSIX).
This doesn't fully answer your question - there do exist non-conformant compilers. As for your applicants, maybe they're just used to including "windows.h", and so have never thought before about what header strlen might be defined in by the C standard. I assume without testing that MSVC does in principle require you to include "string.h". But since "windows.h" does that for you, for the vast majority of practical Windows programs you don't need to know that you have to include "string.h".
They might be lazy, but you can't tell. Standard library implementations (as opposed to compilers, though of course each compiler usually has "its own" stdlib impl.) are allowed to include other headers. For example, #include <stdlib.h> could include every other library described in the standard. (I'm talking in the context of "C/C++", not strictly C.)
As a result, programmers get accustomed to such things, even if not strictly guaranteed, and it's easy to forget whether some function comes from a general catch-all like stdlib.h or something else—many people forget that memcpy is from string.h too.
If they do not include any headers, I would count them as wrong. If you don't allow them to test it with a particular implementation, however, it's hard to say they're wrong. And if you don't provide them with man pages (which represent the resources they'll need to know how to use on the job), then you're wrong.
At that point, you can certainly say they don't follow the exact letter of the standard; but do you want coders that get things done and know how to fix problems when they see them, or coders that worry about minutiae that won't matter?
If you provide a C file to start working with, make it have all the headers that could be needed from the beginning and ask the applicants to remove the unused ones.
The most common engineering experience is to add (or delete) a few lines of code to/from an application with thousands of lines already working correctly. It would be extremely rare in such a case to need another header file when adding a call to printf() or strlen().
It would be interesting to look over the shoulder of experienced engineers—not just graduated from school, but with extensive experience in the trenches—to see if they simply add strlen() and try compiling, or if they check to see if stdlib.h or string.h is already included before compiling. I bet the overwhelming majority do the former.
C implementations usually still allow implicit function declarations.
Anyway, I wouldn't consider all the boilerplate a required part of an interview, unless you specifically ask for it (e.g. "please don't omit anything you'd normally have in a source file").
(And with Visual Assist's "add ... include" I know less and less where they come from ;))
Most compilers provide some kind of option to force headers inclusion.
E.g. the GCC compiler has the -include option, which is the equivalent of the #include preprocessor directive.
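For instance, assuming a source file named main.cpp that forgot its include, building with g++ -include cstring main.cpp forces the header in ahead of the first line:

// main.cpp -- deliberately omits <cstring>
#include <cstdio>

int main() {
    // std::strlen is not declared by anything included in this file,
    // but "g++ -include cstring main.cpp" injects <cstring> before the
    // first line, so it compiles anyway.
    std::printf("%zu\n", std::strlen("Hello world"));
}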
TCC will also happily compile a file such as the accepted answer's example:
int main()
{
    printf("%u\n", strlen("hello world"));
}
without any warnings (unless you pass -Wall); as an added bonus, you can just call tcc -run file.c to execute file.c without compiling to an output file first.
In C89 (the most commonly applied standard), a function that is not declared will be assumed to return an int and have unknown arguments, and will compile (probably with warnings). If it does not, the compiler is not compliant. If, on the other hand, you compiled the code as C++, it will fail, and must do so if the C++ compiler is compliant.
Why not simply specify in the question that all necessary headers must be included (and perhaps irrelevant ones omitted)?
Alternatively, sit the candidates at a machine with a compiler and have them check their own code, and state that the code must compile at the maximum warning level, without warnings.
I've been doing C/C++ for 20 years (I've even taught classes) and I guess there's a 50% probability that I'd forget the include too (99% of the time when I code, the include is already there somewhere). I know this isn't exactly answering your question, but if someone knows strlen() they will know within seconds what to do with that specific compiler error, so from a job-qualification standpoint the slip is practically irrelevant.
Rather than putting emphasis on stuff like that, checking for the subtleties that require real understanding of the language should be far more relevant, i.e. buffer overruns, strncpy not appending a final \0 when hitting the limits, asking someone to make a safe (truncating) copy to a buffer of limited length, etc. Especially in C/C++ programming, the errors that do not generate a compiler error are the ones which will cause you/your company the real trouble.
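For instance, a task in that spirit might ask the candidate to explain why the plain strncpy call below is not enough on its own (the helper name safe_copy is mine, just for illustration):

#include <cstdio>
#include <cstring>

// Copy src into dst (capacity bytes), truncating if necessary and always
// NUL-terminating -- the part strncpy does not guarantee when src is too long.
void safe_copy(char* dst, const char* src, std::size_t capacity) {
    if (capacity == 0) return;
    std::strncpy(dst, src, capacity - 1);  // may stop before writing any '\0'...
    dst[capacity - 1] = '\0';              // ...so terminate explicitly
}

int main() {
    char buf[8];
    safe_copy(buf, "this string is too long", sizeof buf);
    std::puts(buf);   // prints the truncated, well-formed "this st"
}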