All applicants to our company must pass a simple quiz in C as part of the early screening process.
It consists of a C source file that must be modified to provide the desired functionality. We clearly state that we will attempt to compile the file as-is, with no changes.
Almost all applicants use "strlen", but half of them do not include "string.h", so the file does not compile until I include it.
Are they just lazy or are there compilers that do not require you to include standard library files, such as "string.h"?
GCC will happily compile the following code as is:
main()
{
printf("%u\n",strlen("Hello world"));
}
It will complain about "incompatible implicit declaration of built-in function" for both printf() and strlen(), but it will still produce an executable.
If you compile with -Werror, it won't compile.
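For comparison, here is a sketch of a version that compiles cleanly even with -Wall -Werror (note that strlen returns size_t, so %zu, not %u, is the matching format specifier):

#include <stdio.h>  /* declares printf */
#include <string.h> /* declares strlen */

int main(void)
{
    printf("%zu\n", strlen("Hello world"));
    return 0;
}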
I'm pretty sure it's non-conformant for a compiler to include headers that weren't asked for. The reason is that the C standard says various names are reserved if the relevant header is included. I take this to imply they aren't reserved if it isn't, since compilers aren't allowed to reserve names the standard doesn't say are reserved (unless, of course, you include a non-standard header which happens to be provided by the compiler and is documented elsewhere to reserve extra names - which is what happens when you use POSIX).
This doesn't fully answer your question - there do exist non-conformant compilers. As for your applicants, maybe they're just used to including "windows.h" and so have never thought about which header the C standard actually declares strlen in. I assume, without testing, that MSVC does in principle require you to include "string.h"; but since "windows.h" does that for you, for the vast majority of practical Windows programs you don't need to know it.
They might be lazy, but you can't tell. Standard library implementations (as opposed to compilers, though of course each compiler usually has "its own" stdlib implementation) are allowed to include other headers. For example, #include <stdlib.h> could pull in every other header described in the standard. (I'm talking in the context of "C/C++", not strictly C.)
As a result, programmers get accustomed to such things, even if not strictly guaranteed, and it's easy to forget whether some function comes from a general catch-all like stdlib.h or something else—many people forget that memcpy is from string.h too.
If they do not include any headers, I would count them as wrong. If you don't allow them to test it with a particular implementation, however, it's hard to say they're wrong. And if you don't provide them with man pages (which represent the resources they'll need to know how to use on the job), then you're wrong.
At that point, you can certainly say they don't follow the exact letter of the standard; but do you want coders that get things done and know how to fix problems when they see them, or coders that worry about minutiae that won't matter?
If you provide a C file to start working with, make it have all the headers that could be needed from the beginning and ask the applicants to remove the unused ones.
The most common engineering experience is to add (or delete) a few lines of code to/from an application with thousands of lines already working correctly. It would be extremely rare in such a case to need another header file when adding a call to printf() or strlen().
It would be interesting to look over the shoulder of experienced engineers—not just graduated from school, but with extensive experience in the trenches—to see if they simply add strlen() and try compiling, or if they check to see if stdlib.h or string.h is already included before compiling. I bet the overwhelming majority do the former.
C implementations usually still allow implicit function declarations.
Anyway, I wouldn't consider all the boilerplate a required part of an interview, unless you specifically ask for it (e.g. "please don't omit anything you'd normally have in a source file").
(And with Visual Assist's "add ... include" I know less and less where they come from ;))
Most compilers provide some kind of option to force header inclusion.
E.g., GCC has the -include option, which is the equivalent of an #include preprocessor directive at the top of the file.
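For example (a sketch; quiz.c is a made-up file name), this compiles the applicant's file with string.h forced into the translation unit:

gcc -include string.h quiz.c -o quiz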
TCC will also happily compile a file such as the accepted answer's example:
int main()
{
printf("%u\n", strlen("hello world"));
}
without any warnings (unless you pass -Wall); as an added bonus, you can just call tcc -run file.c to execute file.c without compiling to an output file first.
In C89 (the most commonly applied standard), a function that is not declared will be assumed to return an int and have unknown arguments, and the code will compile (probably with warnings). If it does not, the compiler is not compliant. If, on the other hand, you compile the code as C++, it will fail, and must do so if the C++ compiler is compliant.
Why not simply specify in the question that all necessary headers must be included (and perhaps irrelevant ones omitted)?
Alternatively, sit the candidates at a machine with a compiler and have them check their own code, and state that the code must compile at the maximum warning level, without warnings.
I've been doing C/C++ for 20 years (I've even taught classes), and I'd guess there's a 50% probability that I'd forget the include too (99% of the time when I code, the include is already there somewhere). I know this isn't exactly answering your question, but if someone knows strlen(), they will know within seconds what to do with that specific compiler error, so from a job-qualification standpoint the slip is practically irrelevant.
Rather than putting emphasis on stuff like that, checking for the subtleties that require real understanding of the language would be far more relevant, e.g. buffer overruns, strncpy not appending a final \0 when hitting the limit, asking someone to make a safe (truncating) copy to a buffer of limited length (see the sketch below), etc. Especially in C/C++ programming, the errors that do not generate a compiler error are the ones that will cause you/your company the real trouble.
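One possible answer to that last exercise might look like the following sketch (the function name and contract are made up for illustration; it mimics the BSD strlcpy convention):

#include <string.h> /* strlen, memcpy, size_t */

/* Copy src into dst (capacity dstsize), always NUL-terminating and
   truncating if necessary. Returns strlen(src), so the caller can
   detect truncation by checking whether the result >= dstsize. */
size_t safe_copy(char *dst, size_t dstsize, const char *src)
{
    size_t srclen = strlen(src);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}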
Related
I figured one of the best ways I could learn and improve at programming is to look at existing source code. I was looking at Blender's source code and noticed something about the header files. Most of them used #ifndef include guards, where the macros were surrounded by double underscores (e.g. __BMESH_CLASS_H__).
It got me thinking, the whole "just don't make anything that starts with underscores at all" advice is good for beginners, but I would think that in order to progress further in programming I should learn when creating my own reserved identifiers is and isn't appropriate.
Reserved identifiers are reserved for the implementation, which means roughly the compiler, its runtime library, and maybe parts of the operating system.
So, it's appropriate to create your own when your progression has led to you writing your own compiler or operating system. That's pretty much it.
There is only one case I know of where creating reserved identifiers yourself is acceptable. All other uses are undefined behavior, and while they will most likely still work exactly the same, it's against the standard and shouldn't be done.
That said, reserved identifiers can be created by you when interacting with certain components of your development environment. For example, some compilers may support something like __FILE_NAME__ and others may not, and it may even vary from compiler version to compiler version. If you define it yourself for, say, backwards compatibility (i.e. adding a preprocessor definition for said macro), then it is 100% okay, so long as its implementation follows exactly what is expected of that identifier (e.g. it should expand to the file name for __FILE_NAME__, not something else).
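A sketch of that kind of backwards-compatibility shim (assuming a compiler where __FILE_NAME__ is a predefined extension, as in recent Clang and GCC; elsewhere we fall back to the standard __FILE__):

/* Provide __FILE_NAME__ on compilers that don't predefine it.
   Caveat: the fallback expands to the full path rather than just the
   file name, so it only approximates the real extension's behavior. */
#ifndef __FILE_NAME__
#define __FILE_NAME__ __FILE__
#endif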
Say I have the below (very simple) code.
#include <iostream>
int main() {
    std::cout << std::stoi("12");
}
This compiles fine on both g++ and clang; however, it fails to compile on MSVC with the following error:
error C2039: 'stoi': is not a member of 'std'
error C3861: 'stoi': identifier not found
I know that std::stoi is part of the <string> header, which presumably the two former compilers include as part of <iostream> and the latter does not. According to the C++ standard [res.on.headers]
A C++ header may include other C++ headers.
Which, to me, basically says that all three compilers are correct.
This issue arose when one of my students submitted work, which the TA marked as not compiling; I of course went and fixed it. However, I would like to prevent future incidents like this. So, is there a way to determine which header files should be included, short of compiling on three different compilers to check every time?
The only way I can think of is to ensure that for every std function call, an appropriate include exists; but if you have existing code which is thousands of lines long, this may be tedious to search through. Is there an easier/better way to ensure cross-compiler compatibility?
Example with the three compilers: https://godbolt.org/z/kJhS6U
Is there an easier/better way to ensure cross-compiler compatibility?
This is always going to be a bit of a chore if you have a huge codebase and haven't been doing this so far, but once you've gone through fixing your includes, you can stick to a simple procedure:
When you write new code that uses a standard feature, like std::stoi, plug that name into Google, go to the cppreference.com article for it, then look at the top to see which header it's defined in.
Then include that, if it's not already included. Job done!
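Applied to the example above, that procedure gives this sketch: cppreference lists std::stoi under <string>, so the portable version includes it explicitly.

#include <iostream>
#include <string>  // std::stoi is declared here per the standard

int main() {
    std::cout << std::stoi("12");
}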
(You could use the standard for this, but that's not as accessible.)
Do not be tempted to sack it all off in favour of cheap, unportable hacks like <bits/stdc++.h>!
tl;dr: documentation
Besides reviewing documentation and doing this manually (painful and time-consuming), you can use tools that do it for you.
You can use ReSharper in Visual Studio, which is capable of organizing includes (in fact, VS without ReSharper is not very usable). If an include is missing, it recommends adding it, and if an include is unused, its line is shown in paler colors.
Or you can use CLion (available for all platforms), which also has this capability (it comes from the same manufacturer, JetBrains).
There is also a tool called include-what-you-use, but its aim is to take advantage of forward declarations; I never used it personally - a teammate ran it on our project.
This is a follow-up question to this one, which says that
In C++, unlike C, standard headers are allowed to #include other standard headers.
Is there any way to know which headers were automatically included, since it may be difficult to guess which symbols are defined in which headers?
Motivation: My homework compiles and works correctly on my computer, but the TA told me it was not compiling and needed a couple of headers (mutex and algorithm) to compile. How can I be sure the code I submit in the future is bulletproof?
My compiler is not giving any warning about implicit declaration.
I'm using clang++ -std=c++11 to compile my code.
The standard lists the symbols made available by each header. There are no guarantees beyond that: a header may declare more symbols than required, and you cannot rely on any symbol it isn't required to declare. You'll need to include the appropriate header for every name you use; you should not rely on indirect includes.
On the positive side, there is no case in the standard library where a header requires you to include extra headers before using it.
If you want to know what other headers a particular header file pulls in, the easiest way to find out is to run the include file through the compiler's preprocessor phase only, instead of compiling it fully. For example, if you want to know what <iostream> pulls in, create a file containing only:
#include <iostream>
then preprocess it. With gcc, the -E option runs the preprocessor only, without compiling the file, and dumps the preprocessed file to standard output. The resulting output begins with:
# 1 "t.C"
That's my one-line source file.
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
Apparently, gcc automatically pulls in this header file, no matter what. This can be ignored.
# 1 "<command-line>" 2
# 1 "t.C"
# 1 "/usr/include/c++/6.2.1/iostream" 1 3
Ok, now we finally get to the actual #include statement in my one-line source file. That's where my <iostream> is:
# 36 "/usr/include/c++/6.2.1/iostream" 3
# 37 "/usr/include/c++/6.2.1/iostream" 3
# 1 "/usr/include/c++/6.2.1/x86_64-redhat-linux/bits/c++config.h" 1 3
Ok, so iostream itself #includes this "c++config.h" header file, obviously an internal compiler header.
If I keep going, I can see that <iostream> pulls in, unsurprisingly, <ios>, <type_traits>, as well as C header files like stdio.h.
It shouldn't be too hard to write a quick little script that takes a header file, runs the compiler in preprocessing phase, and produces a nice, formatted list of all header files that got pulled in.
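A minimal sketch of such a script, using the -H option that GCC and Clang provide for exactly this purpose (it prints every included header to standard error, indented by inclusion depth); probe.cpp is a made-up file name:

echo '#include <iostream>' > probe.cpp
g++ -H -fsyntax-only probe.cpp 2>&1 | grep '^\.'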
As far as I know, there is no way to do what you want.
If you try to compile your code on several example platforms, and it is successful, there is a greater chance that it will compile on any other platform, but there is no easy way to be sure.
In my experience, MinGW's C++ headers #include each other less than most, so MinGW can be a practical tool for checking portability.
This question has considerable overlap with Are there tools that help organizing #includes?, and the latter would nowadays be considered "off topic" because it asks for external tools/resources. Strictly speaking, this question is only about a way to find indirect includes, but the goal is certainly the same.
The problem of using "the right" include statements is distressingly hard. What was described in the question is mainly about indirect inclusion: one header happens to provide a certain symbol, but on another machine, with another compiler and another header, the symbol might not be included.
Some of the comments and other answers suggested seemingly pragmatic but unrealistic approaches:
Try it out with another compiler: This is brittle, and does not tell you anything about a possible third compiler
Read the included headers, to see which other headers they include: No. The idea of headers is exactly the opposite, namely not having to read them, but just to use the API that they offer
Generate preprocessor output and write a tool to solve the problem semi-automatically: Nope. This won't solve the problem of different header versions anyhow.
Explicitly include the headers that you need for the thing that you are using: This is basically impossible to maintain. When you move one function from your file to another during a refactoring, you never know which includes you can safely remove in one file, and which ones you have to add to the other one.
(And when you change your includes, you'll likely break third-party code that included your headers, because everybody is facing the same problem...)
All this does not cover the caveat that indirect inclusion is not always a problem, but sometimes intended: When you want to use std::vector, you'd #include <vector>, and not #include <bits/stl_vector.h>, even though the latter contains the definition...
So the only realistic solution for this is to rely on tools that support the developer. Some tools are mentioned in the answers of Are there tools that help organizing #includes? , and originally, I mentioned CDT in a comment to the question three years ago, but I think it's worth being mentioned here as well:
Eclipse CDT (C/C++ Development Tooling)
This is an IDE that offers an "Organize Includes" feature. The feature is described in the article at https://www.eclipse.org/community/eclipse_newsletter/2013/october/article3.php , which points out the difficulties and caveats (I mentioned some of them above) and how they are tackled in CDT.
I have tried this out (quite a while ago), although only in a very simple test project. So I can say that it basically works, but considering the complexity of the task, this comes with a disclaimer: It's close to magic, but there might still be cases where it fails.
C++ was never intended to be easy to parse and compile. As evidence: one of my previous answers here on Stack Overflow contained the line #define not certainly, which is enough to derail any tool that tries to parse the code.
The article also points to another tool:
Include What You Use
I have not tried this out, but it seems to be quite actively maintained. The documentation pages also list some of the difficulties and goals. From a first glance, it does not seem to be as powerful and configurable as CDT (and of course, does not come in an IDE), but some might want to try it out.
In C++ the standard library is wrapped in the std namespace, and the programmer is not supposed to define anything inside that namespace. Of course the standard include files don't step on each other's names inside the standard library (so it's never a problem to include a standard header).
Then why isn't the whole standard library included by default instead of forcing programmers to write for example #include <vector> each time? This would also speed up compilation as the compilers could start with a pre-built symbol table for all the standard headers.
Pre-including everything would also solve some portability problems: for example, when you include <map>, it's defined which symbols are placed in the std namespace, but it's not guaranteed that other standard symbols are not also loaded into it, and you could (in theory) end up with std::vector also becoming available.
It happens sometimes that a programmer forgets to include a standard header but the program compiles anyway because of an include dependence of the specific implementation. When moving the program to another environment (or just another version of the same compiler) the same source code could however stop compiling.
From a technical point of view, I can imagine a compiler just preloading (with mmap) an optimal perfect-hash symbol table for the standard library.
This should be faster to do than loading and doing a C++ parse of even a single standard include file and should be able to provide faster lookup for std:: names. This data would also be read-only (thus probably allowing a more compact representation and also shareable between multiple instances of the compiler).
These are, however, just "should"s, as I never implemented this.
The only downside I see is that we C++ programmers would lose compilation coffee breaks and Stack Overflow visits :-)
EDIT
Just to clarify: the main advantage I see is for programmers who today, despite the C++ standard library being a single monolithic namespace, are required to know which sub-part (include file) contains which function/class. To add insult to injury, when they make a mistake and forget an include file, the code may still compile or not depending on the implementation (thus leading to non-portable programs).
The short answer is that this is not the way the C++ language is supposed to be used.
There are good reasons for that:
namespace pollution - even if this could be mitigated, because the std namespace is supposed to be self-coherent and programmers are not forced to write using namespace std;, including the whole library combined with using namespace std; would certainly lead to a big mess...
forcing the programmer to declare the modules he wants to use avoids inadvertently calling the wrong standard function, because the standard library is now huge and not all programmers know all of its modules
history: C++ still has a strong inheritance from C, where namespaces do not exist and where the standard library is supposed to be used like any other library.
Going in your direction, the Windows API is an example where you have only one big include (windows.h) that loads many other smaller include files. And in fact, precompiled headers allow that to be fast enough.
So IMHO a new language deriving from C++ could decide to automatically declare the whole standard library. A new major release could also do it, but it could break code that makes intensive use of the using namespace directive or that has custom implementations using the same names as some standard modules.
But all common languages that I know of (C#, Python, Java, Ruby) require the programmer to declare the parts of the standard library that he wants to use, so I suppose that systematically making every piece of the standard library available is still more awkward than really useful for the programmer, at least until someone finds a way to declare the parts that should not be loaded - that's why I spoke of a new derivative of C++.
Most of the C++ standard library is template-based, which means that the code it generates ultimately depends on how you use it. In other words, there is very little that could be compiled before you instantiate a template, as in std::vector<MyType> m_collection;.
Also, C++ is probably the slowest language to compile, and there is a lot of parsing work that compilers have to do when you #include a header file that in turn includes other headers.
Well, first things first: C++ tries to adhere to "you only pay for what you use".
The standard-library is sometimes not part of what you use at all, or even of what you could use if you wanted.
Also, you can replace it if there's a reason to do so: See libstdc++ and libc++.
That means just including it all without question isn't actually such a bright idea.
Anyway, the committee are slowly plugging away at creating a module-system (It takes lots of time, hopefully it will work for C++1z: C++ Modules - why were they removed from C++0x? Will they be back later on?), and when that's done most downsides to including more of the standard-library than strictly neccessary should disappear, and the individual modules should more cleanly exclude symbols they need not contain.
Also, as those modules are pre-parsed, they should give the compilation-speed improvement you want.
You offer two advantages of your scheme:
Compile-time performance. But nothing in the standard prevents an implementation from doing what you suggest[*] with a very slight modification: that the pre-compiled table is only mapped in when the translation unit includes at least one standard header. From the POV of the standard, it's unnecessary to impose potential implementation burden over a QoI issue.
Convenience to programmers: under your scheme we wouldn't have to specify which headers we need. We do this in order to support C++ implementations that have chosen not to implement your idea of making the standard headers monolithic (which currently is all of them), and so from the POV of the C++ standard it's a matter of "supporting existing practice and implementation freedom at a cost to programmers considered acceptable". Which is kind of the slogan of C++, isn't it?
Since no C++ implementation (that I know of) actually does this, my suspicion is that in point of fact it does not grant the performance improvement you think it does. Microsoft provides precompiled headers (via stdafx.h) for exactly this performance reason, and yet it still doesn't give you an option for "all the standard libraries"; instead it requires you to say what you want in it. It would be dead easy for this or any other implementation to provide an implementation-specific header defined to have the same effect as including all standard headers. This suggests to me that, at least in Microsoft's opinion, there would be no great overall benefit to providing that.
If implementations were to start providing monolithic standard libraries with a demonstrable compile-time performance improvement, then we'd discuss whether or not it's a good idea for the C++ standard to continue permitting implementations that don't. As things stand, it has to.
[*] Except perhaps for the fact that <cassert> is defined to have different behaviour according to the definition of NDEBUG at the point it's included. But I think implementations could just preprocess the user's code as normal, and then map in one of two different tables according to whether it's defined.
I think the answer comes down to C++'s philosophy of not making you pay for what you don't use. It also gives you more flexibility: you aren't forced to use parts of the standard library if you don't need them. And then there's the fact that some platforms might not support things like throwing exceptions or dynamically allocating memory (like the processors used in the Arduino, for example). And there's one other thing you said that is incorrect: as long as it's not a template class, you are allowed to add a swap specialization to the std namespace for your own classes.
First of all, I am afraid that having a prelude is a bit late to the game. Or rather, seeing as preludes are not easily extensible, we have to content ourselves with a very thin one (built-in types...).
As an example, let's say that I have a C++03 program:
#include <boost/unordered_map.hpp>
using namespace std;
using boost::unordered_map;
static unordered_map<int, string> const Symbols = ...;
It all works fine, but suddenly when I migrate to C++11:
error: ambiguous symbol "unordered_map", do you mean:
- std::unordered_map
- boost::unordered_map
Congratulations, you have invented the least backward compatible scheme for growing the standard library (just kidding, whoever uses using namespace std; is to blame...).
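For what it's worth, a sketch of a version that survives the migration: drop the blanket using directives and qualify the names explicitly (the trailing ... mirrors the elision in the original example).

#include <boost/unordered_map.hpp>
#include <string>

// Explicit qualification stays unambiguous even after C++11
// introduces std::unordered_map alongside boost::unordered_map.
static boost::unordered_map<int, std::string> const Symbols = ...;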
Alright, let's not pre-include them, but still bundle the perfect hash table anyway. The performance gain would be worth it, right?
Well, I seriously doubt it. First of all because the Standard Library is tiny compared to most other header files that you include (hint: compare it to Boost). Therefore the performance gain would be... smallish.
Oh, not all programs are big; but the small ones compile fast already (by virtue of being small) and the big ones include much more code than the Standard Library headers so you won't get much mileage out of it.
Note: and yes, I did benchmark the file look-up in a project with "only" a hundred -I directives; the conclusion was that pre-computing the "include path" to "file location" map and feeding it to gcc resulted in a 30% speed-up (after using ccache already). Generating it and keeping it up-to-date was complicated, so we never used it...
But could we at least include a provision that the compiler could do it in the Standard?
As far as I know, it is already included. I cannot remember if there is a specific blurb about it, but the Standard Library is really part of the "implementation" so resolving #include <vector> to an internal hash-map would fall under the as-if rule anyway.
But they could do it, still!
And lose any flexibility. For example, Clang can use either libstdc++ or libc++ on Linux, and I believe it to be compatible with the Dinkumware derivative that ships with VC++ (or, if not completely, at least greatly).
This is another point of customization: if the Standard Library does not fit your needs or your platform, then by virtue of being treated mostly like any other library, you can replace part or most of it with relative ease.
But! But!
#include "stdafx.h"
If you work on Windows, you will recognize it. This is called a pre-compiled header. It must be included first (or all benefits are lost) and in exchange instead of parsing files you are pulling in an efficient binary representation of those parsed files (ie, a serialized AST version, possibly with some type resolution already performed) which saves off maybe 30% to 50% of the work. Yep, this is close to your proposal; this is Computer Science for you, there's always someone else who thought about it first...
Clang and gcc have a similar mechanism; though from what I've heard it can be so painful to use that people prefer the more transparent ccache in practice.
And all of these will come to naught with modules.
This is the true solution to this pre-processing/parsing/type-resolving madness. Because modules are truly isolated (ie, unlike headers, not subject to inclusion order), an efficient binary representation (like pre-compiled headers) can be pre-computed for each and every module you depend on.
This not only means the Standard Library, but all libraries.
Your solution, more flexible, and dressed to the nines!
One could use an alternative implementation of the C++ Standard Library to the one shipped with the compiler. Or wrap headers with one's definitions, to add, enable or disable features (see GNU wrapper headers). Plain text headers and the C inclusion model are a more powerful and flexible mechanism than a binary black box.
I have read from a codeforces blog that if we add #include <bits/stdc++.h> in a C++ program then there is no need to include any other header files. How does #include <bits/stdc++.h> work and is it ok to use it instead of including individual header files?
It is basically a header file that also includes every standard library and STL include file. The only purpose I can see for it would be for testing and education.
See e.g. the GCC 4.8.0 bits/stdc++.h source.
Using it would include a lot of unnecessary stuff and increases compilation time.
Edit: As Neil says, it's an implementation for precompiled headers. If you set it up for precompilation correctly it could, in fact, speed up compilation time depending on your project. (https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html)
I would, however, suggest that you take the time to learn about each of the standard library headers and include them separately instead, and not use "super headers" except for precompilation purposes.
#include <bits/stdc++.h> is an implementation file for a precompiled header.
From a software engineering perspective, it is a good idea to minimize includes. If you use <bits/stdc++.h>, it actually includes a lot of files your program may not need, thus increasing both compile time and program size unnecessarily. [edit: as pointed out by @Swordfish in the comments, the output program size remains unaffected. But still, it's good practice to include only the libraries you actually need, unless it's some competitive contest.]
But in contests, using this file is a good idea when you want to reduce the time wasted on chores, especially when your rank is time-sensitive.
It works in most programming contest environments, including ACM-ICPC (Sub-Regionals, Regionals, and World Finals) and many online judges.
The disadvantages of it are that it:
increases the compilation time.
uses an internal non-standard header file of the GNU C++ library, and so will not compile with MSVC, Xcode, and many other compilers
That header file is not part of the C++ standard, is therefore non-portable, and should be avoided.
Moreover, even if there were some catch-all header in the standard, you would want to avoid it in favor of specific headers, since the compiler has to actually read in and parse every included header (including recursively included headers) every single time that translation unit is compiled.
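In other words, prefer listing exactly what the translation unit uses. A sketch of a typical snippet written the portable way:

// #include <bits/stdc++.h>  // non-standard GNU catch-all: avoid
#include <algorithm>         // std::sort
#include <iostream>          // std::cout
#include <vector>            // std::vector

int main() {
    std::vector<int> v{3, 1, 2};
    std::sort(v.begin(), v.end());
    for (int x : v) std::cout << x << ' ';
}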
Unfortunately that approach is not portable C++ (so far).
All standard names are in namespace std, and moreover you cannot know which names are NOT defined by including a header (in other words, it's perfectly legal for an implementation to declare the name std::string directly or indirectly when you use #include <vector>).
Despite this, however, you are required by the language to know and tell the compiler which standard header provides which part of the standard library. This is a source of portability bugs, because if you forget, for example, #include <map> but use std::map, it's possible that the program compiles anyway, silently and without warnings, on a specific version of a specific compiler, and you may get errors only later when porting to another compiler or version.
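A sketch of that failure mode (whether this compiles depends entirely on the implementation's internal includes):

#include <iostream>  // on some implementations this indirectly pulls in <map>
// #include <map>    // forgotten: this is what the standard requires for std::map

int main() {
    std::map<int, int> m;  // may compile on one compiler and fail on another
    m[1] = 2;
    std::cout << m[1] << '\n';
}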
In my opinion there are no valid technical excuses for why this is necessary for the general user: the compiler binary could have the whole standard namespace built in, and this could actually increase performance even more than precompiled headers do (e.g. using perfect hashing for lookups, and removing standard-header parsing or loading/demarshalling, and so on).
The use of standard headers simplifies the life of those who build compilers or standard libraries, and that's all. It's not something that helps users.
However, this is the way the language is defined: you need to know which header defines which names, so plan for some extra neurons to be burnt in pointless configurations to remember that (or try to find an IDE that automatically adds the standard headers you use and removes the ones you don't... a reasonable alternative).