Why do projects use the -I include switch given the dangers?

Why do projects use the -I include switch given the dangers? - c++

Reading the fine print of the -I switch in GCC, I'm rather shocked to find that using it on the command line overrides system includes. From the preprocessor docs
"You can use -I to override a system header file, substituting your own version, since these directories are searched before the standard system header file directories."
They don't seem to be lying. On two different Ubuntu systems with GCC 7, if I create a file endian.h:
#error "This endian.h shouldn't be included"
...and then in the same directory create a main.cpp (or main.c, same difference):
#include <stdlib.h>
int main() {}
Then compiling with g++ main.cpp -I. -o main (or clang, same difference) gives me:
In file included from /usr/include/x86_64-linux-gnu/sys/types.h:194:0,
from /usr/include/stdlib.h:394,
from /usr/include/c++/7/cstdlib:75,
from /usr/include/c++/7/stdlib.h:36,
from main.cpp:1:
./endian.h:1:2: error: #error "This endian.h shouldn't be included"
So stdlib.h includes this types.h file, which on line 194 just says #include <endian.h>. My apparent misconception (and perhaps that of others) was that the angle brackets would have prevented this, but -I is stronger than I'd thought.
Though not strong enough, because you can't even fix it by sticking /usr/include in on the command line first, because:
"If a standard system include directory, or a directory specified with -isystem, is also specified with -I, the -I option is ignored. The directory is still searched but as a system directory at its normal position in the system include chain."
Indeed, the verbose output for g++ -v main.cpp -I/usr/include -I. -o main leaves /usr/include at the bottom of the list:
#include "..." search starts here:
#include <...> search starts here:
.
/usr/include/c++/7
/usr/include/x86_64-linux-gnu/c++/7
/usr/include/c++/7/backward
/usr/lib/gcc/x86_64-linux-gnu/7/include
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/7/include-fixed
/usr/include/x86_64-linux-gnu
/usr/include
Color me surprised. I guess to make this a question:
What legitimate reason is there for most projects to use -I considering this extremely serious issue? You can override arbitrary headers on systems based on incidental name collisions. Shouldn't pretty much everyone be using -iquote instead?

What legitimate reasons are there for -I over -iquote? -I is standardized (at least by POSIX) while -iquote isn't. (Practically, I'm using -I because tinycc (one of the compilers I want my project to compile with) doesn't support -iquote.)
How do projects manage with -I given the dangers? You'd have the includes wrapped in a directory and use -I to add the directory containing that directory.
filesystem: includes/mylib/endian.h
command line: -Iincludes
C/C++ file: #include "mylib/endian.h" //or <mylib/endian.h>
With that, as long as you don't clash on the mylib name, you don't clash (at least as far header names are concerned).

Looking back at the GCC manuals it looks like -iquote and other options were only added in GCC 4: https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Directory-Options.html#Directory%20Options
So the use of "-I" is probably some combination of: habit, lazyness, backwards compatibility, ignorance of the new options, compatibility with other compilers.
The solution is to "namespace" your header files by putting them in sub directories. For example put your endian header in "include/mylib/endian.h" then add "-Iinclude" to the command line and you can #include "mylib/endian.h" which shouldn't conflict with other libraries or system libraries.

I this your premise that it's -I that's dangerous is false. The language leaves the search for header files with either form of #include sufficiently implementation-defined that it's unsafe to use header files that conflict with the names of the standard header files at all. Simply refrain from doing this.

An obvious case is cross-compilation. GCC suffers a bit from a historical UNIX assumption that you're always compiling for your local system, or at least something that's very close. That's why the compiler's header files are in the system root. The clean interface is missing.
In comparison, Windows assumes no compiler, and Windows compilers do not assume you're targeting the local system. That's why you can have a set of compilers and a set of SDK's installed.
Now in cross-compilation, GCC behaves much more like a compiler for Windows. It no longer assumes that you intend to use the local system headers, but lets you specify exactly which headers you want. And obviously, the same then goes for the libraries you link in.
Now note that when you do this, the set of replacement headers is designed to go on top of the base system. You can leave out headers in the replacement set if their implementation would be identical. E.g. chances are that <complex.h> is the same. There's not that much variation in complex number implementations. However, you can't randomly replace internal implementation bits like <endian.h>.
TL,DR : this option if for people who know what they're doing. "Being unsafe" is not an argument for the target audience.

Related

How to find the library path that clang uses [duplicate]

How can I tell where g++ was able to find an include file? Basically if I
#include <foo.h>
g++ will scan the search path, using any include options to add or alter the path. But, at the end of days, is there a way I can tell the absolute path of foo.h that g++ chose to compile? Especially relevant if there is more than one foo.h in the myriad of search paths.
Short of a way of accomplishing that... is there a way to get g++ to tell me what its final search path is after including defaults and all include options?

g++ -H ...
will also print the full path of include files in a format which shows which header includes which

This will give make dependencies which list absolute paths of include files:
gcc -M showtime.c
If you don't want the system includes (i.e. #include <something.h>) then use:
gcc -MM showtime.c

Sure use
g++ -E -dI ... (whatever the original command arguments were)

If your build process is very complicated...
constexpr static auto iWillBreak =
#include "where/the/heck/is/this/file.h"
This will (almost certainly) cause a compilation error near the top of the file in question. That should show you a compiler error with the path the compiler sees.
Obviously this is worse than the other answers, but sometimes this kind of hack is useful.

If you use -MM or one of the related options (-M, etc), you get just the list of headers that are included without having all the other preprocessor output (which you seem to get with the suggested g++ -E -dI solution).

For MSVC you can use the /showInclude option, which will display the files that are included.
(This was stated in a comment of Michael Burr on this answer but I wanted to make it more visible and therefore added it as a separate answer.)
Usability note: The compiler will emit this information to the standard error output which seems to be suppressed by default when using the windows command prompt. Use 2>&1 to redirect stderr to stdout to see it nonetheless.

Is it possible to force compiler to use the name of "working" directory to construct all full names from includes?

Let us assume that in first.h we have #include "aaa/second.h" and in the aaa/second.h we have #include "bbb/third.h". I think that in the "default settings" the compiler will complain if "third.h" is not located in "aaa/bbb".
Is it possible to change this behavior in such a way that the directory, in which the first.cpp is located is used to construct the full names in all includes?
For example, if "first.h" is located in '/home/bucky/' then #include "bbb/third.h" (from "aaa/second.h") should be interpreted as /home/bucky/bbb/third.h and not as /home/bucky/aaa/bbb/third.h.
EDITS
I cannot change the whole source code. In the code quotation marks are used instead of angle brackets.
I compile using g++ -std=c++0x name.cpp -o name in the command line. I do it in two different terminals. It looks like in the first terminal the working directory is used to construct the full names and in the second terminal it is not the case. I am almost sure that it happens because of the environment variables but I do not know which ones. So, my question is, to larger extent, what environment variables can force the compiler to construct full names using the working directory.
EDIT 2
In my test.cpp file I include "first.h". This inclusion does not cause any problem (complier sees "first.h"). The "first.h" file includes "ppp/second.h". It also causes no problems. But "ppp/second.h" includes "ppp/third.h" and this is the place where the problem appears. I think that the reason of the problem is that "second.h" tries to find "third.h" in the "ppp" subdirectory of the directory where second.h is located. In other words, second.h tries to find the third.h in the "ppp/ppp" subdirectory (because second.h is located in the ppp subdirectory).
In another terminal, the same compilation command, in the same directory does not cause any problem. The reason, is obviously in the values of the environment varibales.

Yes. The exact mechanics depend on the compiler but the short and the long of it is that you need to configure your compiler to include the project path in the search path. For GCC and clang that’s done via the -I command line flag (-I path/of/first.cpp). This configuration would usually be done in the project settings (if you’re working with an IDE), a Makefile or similar.
Since you’re talking about environment variables: the flags that are passed to the g++ and c++ compiler are controlled by the CXXFLAGS and CFLAGS variables.

You should set up include paths for your project globally. In your example, you would pass some option like -I /home/bucky to your compiler (if it is GCC or Clang). MSVC has analogous options.
(All #includes are searched relative to the include paths. The difference between <...> and "..." is that the latter also searches the current directory.)

How Does G++ Find Headers I Don't Have?

I'm new to C++ and trying to understand how it is finding headers. Originally I was just trying to find out what classes are available for me to include in my source code. I believe that different compilers will use different include directories, and hence class availability will vary. My plan was to find the "include" directory that the compiler was using and assume I can include anything there. So I am just getting more confused as I go.
First, I am writing C++ code in Code::Blocks, on Windows 7. The IDE is set to use GNU GCC for compilation, which I learned means it uses the G++ compiler for C++ code. I found my compiler here: C:\MinGW\bin\mingw32-g++.exe, Code::Blocks settings point to that.
So I assumed that G++ must be using C:\MinGW\include recursively to find all its headers. To test my theory, I searched for "iostream.h". To my surprise, I do not even have "iostream.h" on my C drive. Despite that, my code compiles and works when I include that.
So my questions:
How is G++ finding the iostream header when my hard drive does not even have it?
Will all the standard C++ headers (as listed here: http://msdn.microsoft.com/en-us/library/ct1as7hw.aspx) be available to all C++ compilers? with the same exact name so I don't have to change my source code?

Regarding the second question, the standard does not require the headers to be available as files. It requires the #include directive to be present in the program and the compiler to behave as if the declarations that the standard required were present in the program. But the compiler is free to inject the declarations in any way it deems fit.
That being said, g++ in particular does have files backing up each one of the headers. Without knowing your particular configuration I cannot tell you where the headers will be but you can stop the compilation process after the preprocessor and examine the output:
$ cat test.cpp
#include <iostream>
int main() {
std::cout << "Hi\n";
}
$ g++ -E test.cpp | head -10
# 1 "test.cpp"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "test.cpp"
# 1 "/usr/include/c++/4.2.1/iostream" 1 3
# 42 "/usr/include/c++/4.2.1/iostream" 3
# 43 "/usr/include/c++/4.2.1/iostream" 3
The paths above are on a MacOSX Lion, and it shows that in this particular configuration iostream is included from /usr/include/c++/4.2.1/iostream. In windows head might not be available, but you can redirect the output to a file and read it from there.

Using quotation marks will search the current directory before moving to the ones it would search with angle brackets. There are other directories than the files listed in \include. That's where you'll find the ones without any extension. You may have a c++ folder in there with those files inside, but even searching for them from within the CodeBlocks directory using the Windows 7 search shows you where they're located.
Yes, they should all be accessible with the same name, using angle bracket includes to reach them. Of course there is always a small chance that some might be missing. If this is the case, that implementation would seem pretty unreliable from first sight. All major compilers today should definitely adhere to this.
To use your own headers with your classes in them, they should be in the same directory as the main cpp file, and you should use quotation includes.

1) The reason you can't find iostream.h is there's no such header, it's spelled iostream in standard C++. If you try #include <iostream.h> it will fail, so it isn't finding headers you don't have :)
If you run g++ -v test.cpp it will show all the paths it uses to look for headers when compiling test.cpp
2) Yes, they wouldn't be standard headers if they were missing or had different names depending on your compiler!

Answer 1: I've used Code::Blocks for very little time, but for what I remember in Dev-C++ and MSVC++, the headers are in the IDE's directory. (For example, my MSVC++ include directory is in C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include, and it HAS the iostream file.)
Answer 2: The most common header files are in the most common IDE's / compilers' set.
Tip: Did you not mean "iostream" instead of "iostream.h"?

Which directories does include statement search in C/C++?

test.c:
#include "file.h"
In the above statement, which directories will be searched ?
I suppose the directory where test.c locates will be searched, right?
But is that all?
BTW, what's the benefit to use a header file? Java doesn't require a header file...

#include <header_name>: Standard include file: look in standard paths (system include paths setup for the compiler) first
#include "header_name": Look in current path first, then in include path (project specific lookup paths)
The benefit of using a header file is to provide others with the interface of your library, without the implementation. Java does not require it because java bytecode or jar is able to describe itself (reflexion). C code cannot (yet) do it.
In Java you would only need the jar and have the correct use statement. In C you will (mostly) need a header and a lib file (or header and dll).
The other reason is the way c code is compiled. The compiler compiles translation units (a c/cpp file with all included headers) and the linker in a second step links the whole stuff. Declarations must not be compiled, and this saves time and avoid code to be useless generated for each compilation unit which the linker would have to cleanup.
This is just a general idea, I am not a compiler specialist but should help a bit.

It's controlled by your compiler. For example, gcc has a -I option to specify the include path.
C and C++ prototypes are required so the compiler (distinguished from the linker) can make sure functions are being called with the correct arguments. Java is different in that the compiler uses the binary .class files (which unlike C are in a standard bytecode format with all type information preserved) to check that calls are valid.

Using quotes after the #include instructs the preprocessor to look for include files in the same directory of the file that contains the #include statement, and then in the directories of any files that include (#include) that file. The preprocessor then searches along the path specified by the /I compiler option, then along paths specified by the INCLUDE environment variable.
If you use the angle bracket form, it instructs the preprocessor to search for include files first along the path specified by the /I compiler option, then, when compiling from the command line, along the path specified by the INCLUDE environment variable.

The C++ Standard doesn't really say which directories should be searched. This is how the C++ Standard describes what happens when #include "somefile.h" is encountered:
A preprocessing directive of the form
# include "q-char-sequence" new-line
causes the replacement of that
directive by the entire contents of
the source file identified by the
specified sequence between the "
delimiters. The named source file is
searched for in an
implementation-defined manner. If this
search is not supported, or if the
search fails, the directive is
reprocessed as if it read
# include <h-char-sequence> new-line
with the identical contained sequence
(including > characters, if any) from
the original directive.
So exactly what directories are searched are at the mercy of your specific C++ implementation.

strace, or truss, etc., may be helpful. For example, create a file foo.c with the single line #include "foo.h". Then on CYGWIN, the command:
strace /usr/bin/gcc-4.exe foo.c | grep 'src_path.*foo.h,' | sed 's/.*src_path //;s/foo.h.*//'
produces:
foo.c:1:22: error: foo.h: No such file or directory
/home/Joe/src/utilities/
/usr/lib/gcc/i686-pc-cygwin/4.3.4/include/
/usr/lib/gcc/i686-pc-cygwin/4.3.4/include-fixed/
/usr/include/
/usr/include/w32api/
I was actually surprised this list was so short: the last time I did this exercise, on SunOS 4 fifteen years ago, the search path had over a dozen directories.
The first directory, obviously, is where foo.c lives. See aslo http://gcc.gnu.org/onlinedocs/cpp/Environment-Variables.html for the environment variables CPATH and C_INCLUDE_PATH. But these are not set on my machine. (And I am unclear whether CYGWIN uses them, anyway.)
Edit: The simplest solution is to use cpp -v (not gcc -v). It gives:
COLLECT_GCC_OPTIONS='-E' '-v' '-mtune=generic' '-march=i686'
/usr/lib/gcc/i686-pc-cygwin/4.3.4/cc1.exe -E -quiet -v -D__CYGWIN32__ -D__CYGWIN__
-Dunix -D__unix__ -D__unix -idirafter /usr/lib/gcc/i686-pc-cygwin/4.3.4/../../../..
/include/w32api -idirafter
/usr/lib/gcc/i686-pc-cygwin/4.3.4/../../../../i686-pc-cygwin/lib/../../include/w32api -
-mtune=generic -march=i686
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/usr/lib/gcc/i686-pc-cygwin/4.3.4/../../../../i686-pc-cygwin/include"
ignoring duplicate directory "/usr/lib/gcc/i686-pc-cygwin/4.3.4/../../../../i686-pc-cygwin/lib/../../include/w32api"
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/i686-pc-cygwin/4.3.4/include
/usr/lib/gcc/i686-pc-cygwin/4.3.4/include-fixed
/usr/include
/usr/lib/gcc/i686-pc-cygwin/4.3.4/../../../../include/w32api
End of search list.

How do compilers know where to find #include <stdio.h>?

I am wondering how compilers on Mac OS X, Windows and Linux know where to find the C header files.
Specifically I am wondering how it knows where to find the #include with the <> brackets.
#include "/Users/Brock/Desktop/Myfile.h" // absolute reference
#include <stdio.h> // system relative reference?
I assume there is a text file on the system that it consults. How does it know where to look for the headers? Is it possible to modify this file, if so where does this file reside on the operating system?

When the compiler is built, it knows about a few standard locations to look for header file. Some of them are independent of where the compiler is installed (such as /usr/include, /usr/local/include, etc.) and some of the are based on where the compiler is installed (which for gcc, is controlled by the --prefix option when running configure).
Locations like /usr/include are well known and 'knowledge' of that location is built into gcc. Locations like /usr/local/include is not considered completely standard and can be set when gcc is built with the --with-local-prefix option of configure.
That said, you can add new directories for where to search for include files using the compiler -I command line option. When trying to include a file, it will look in the directories specified with the -I flag before the directories I talked about in the first paragraph.

The OS does not know where look for these files — the compiler does (or more accurately, the preprocessor). It has a set of search paths where it knows to look for headers, much like your command shell has a set of places where it will look for programs to execute when you type in a name. The GCC documentation explains how that compiler does it and how these search paths can be changed.

The location of the file is system dependent. Indeed, the file might be precompiled, or it may not even exist—the compiler may have it as a 'built-in'. On my macbook, I see that there's such a file in /usr/include/c++/4.2.1/iostream, but you shouldn't rely on it, and it's definitely a bad idea to edit it.

If you were using g++, you could do something like this to find out what include paths were searched:
touch empty.cpp
g++ -v empty.cpp
I don't know if there's an equivalent for Xcode. Maybe that will work since Xcode is based on GCC?

In Visual Studio, it's either in the project settings if you use the IDE, or in the %INCLUDE% environment variable if you use the command line.

You should avoid #include-ing files using absolute paths. The compiler searches for the include files in various directories and includes files, starting from each directory. For example;
#include <boost/tokenizer.hpp>
Works because the boost root directory contains a folder called 'boost' and that folder is either in your default include path or you did something like.
g++ -I$BOOST_ROOT {blah, blah}
It is C and C++ standard that the UNIX separator '/' will work the same way for all systems, regardless of what the host system actually uses to denote directories. As others of mentioned, occasionally #include doesn't actually include a real file at all.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js