I have a header file that I am trying to include from another source file using the #include preprocessor directive. I have tried both the quoted form and the angle-bracket form, but neither seems to do the job.
The file name is .>"hello.h and it sits in a directory that the compiler searches. I have tried to include it like this:
#include <.>"hello.h>
#include <.\>"hello.h>
#include <.\>\"hello.h>
#include ".>"hello.h"
#include ".>\"hello.h"
I also tried different C and C++ compilers — clang, gcc, clang++ and g++.
Obviously, none of the above worked or otherwise there would have been no question.
I thought that maybe the name is not legal according to the standard. Unfortunately, I have neither the C nor the C++ standard specification on hand. The only authoritative sources of information I could find were the MSDN page about the #include directive and the GNU C preprocessor documentation. GNU's documentation does not say much; MSDN, however, has the following clause:
The path-spec is a file name optionally preceded by a directory
specification. The file name must name an existing file. The syntax of
the path-spec depends on the operating system on which the program is
compiled.
I am curious as to what C and C++ standards say about this?
Where do I find those OS-specific rules for C and C++ header file naming requirements? I am particularly interested in OS X, Linux and FreeBSD.
Why does escaping the > and/or " characters not work?
How do I include my file?
I think you are out of luck with that file name. In the draft C99 standard, section 6.4.7 Header names, the grammar is as follows:
header-name:
< h-char-sequence >
" q-char-sequence "
h-char-sequence:
h-char
h-char-sequence h-char
h-char:
any member of the source character set except
the new-line character and >
q-char-sequence:
q-char
q-char-sequence q-char
q-char:
any member of the source character set except
the new-line character and "
You have both a " and > in the file name which excludes you from both the q-char and h-char specification. I don't think you have much choice but to change the file name.
The grammar is the same in the draft C++ standard, section 2.9 Header names.
In both C and C++, that's not a valid header name since it contains both > and ".
The syntax for header names allows those delimited by <> to contain "any member of the source character set except new-line and >", and those delimited by "" to contain "any member of the source character set except new-line and "". There is no concept of an escape sequence.
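The two character rules quoted above can be sketched as a pair of checks. This is only an illustration of the grammar, not how any particular compiler is implemented, and the function names is_h_char_sequence and is_q_char_sequence are made up for this example:

```cpp
#include <cassert>      // only needed if you want to assert on the results
#include <string_view>

// h-char-sequence: one or more characters, none of which is new-line or '>'.
constexpr bool is_h_char_sequence(std::string_view s) {
    if (s.empty()) return false;              // the grammar requires at least one h-char
    for (char c : s)
        if (c == '\n' || c == '>') return false;
    return true;
}

// q-char-sequence: one or more characters, none of which is new-line or '"'.
constexpr bool is_q_char_sequence(std::string_view s) {
    if (s.empty()) return false;              // the grammar requires at least one q-char
    for (char c : s)
        if (c == '\n' || c == '"') return false;
    return true;
}
```

With these checks, the name .>"hello.h fails both forms, because it contains both excluded characters, while a name such as weird>name passes the quoted form only.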
" and > are not valid characters for a filename in Windows. Your filename should be hello.h, or .\hello.h, or ..\hello.h, but not .>"hello.h.
#include "hello.h"
#include ".\hello.h"
#include "..\hello.h"
#include "c:/temp/hello.h"
Which is why you will not find anything in MSDN about it.
ext3 allows most characters (several have to be escaped when used), but it is HIGHLY recommended that you do not use them when naming your header and source files (if for no other reason than readability). For more information: http://pic.dhe.ibm.com/infocenter/compbg/v121v141/index.jsp?topic=%2Fcom.ibm.xlcpp121.bg.doc%2Flanguage_ref%2Fc99preprocessor.html
What is an acceptable filename is implementation defined.
I would have expected #include ".>\"hello.h" to work, at least
on systems where '>' and '"' are legal in filenames, but
there's no requirement that it work, and there are clearly
systems where it won't, because such names are not legal in the
system.
You might try forcing the issue:
#define NAME ".>\"hello.h"
#include NAME
But for practical purposes, you should limit your filenames to
alphanumerics, underscores, and maybe hyphens (but I'd be
leery of those as well). And with only one dot, before the
extension. If you go looking for trouble, don't be surprised if
you find it.
Which, of course, answers your last question: how do I include
the file. You rename it to something sensible.
Related
This is quite a contrived problem, I admit, but here it is.
Suppose you have a file with the > character in its name. This is possible on most Unix systems afaik:
$ touch 'weird>name'
$ ls -l
-rw-r--r-- 1 user user 0 28 Mag 11:05 weird>name
Now, suppose this file contains C/C++ code and you want to include it as a header:
#include <weird>name>
int main() {
return weird_function();
}
Clang gives me the following error:
test.cpp:1:10: fatal error: 'weird' file not found
#include <weird>name>
Of course, since the preprocessor parses the directive up to the first > and looks for the weird file. But, I wonder if some escaping mechanism exists to allow me to include the right file.
So, in C and/or C++, is there a way to include a header file which has the > character in its name?
Edit: Many asked why I don't just use #include "weird>name". I admit that the quoted syntax slipped my mind while writing the question, but the question remains valid because the two syntaxes may ask the compiler to search different paths (theoretically at least). So is there any escaping mechanism to let me include weird>name using the #include <> syntax?
So, in C and/or C++, is there a way to include a header file which has the > character in its name?
Yes:
#include "weird>name"
So is there any escaping mechanism to let me include weird>name using the #include <> syntax?
No. The characters between the < and > must be "any member of the source character set except new-line and >" ([lex.header]). Any escaped form of > would still be a way to represent the > character, which is not allowed. Edit: Implementations are allowed to support implementation-defined escape sequences there though (see [lex.header] p2 and its footnote).
The #include " q-char-sequence " form does allow the > character to appear, even though it might get reprocessed as #include <...> if searching as "..." fails ([cpp.include] p3).
The preprocessor also allows another form ([cpp.include] p4, http://eel.is/c++draft/cpp.include#4), but its effects are implementation-defined, and the implementations I tried do not allow joining weird and > and name into a single preprocessor-token that can then be included.
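To see why the angle form stops at weird, here is a minimal sketch of how a header name ends at the first closing delimiter. The helper scan_header_name is hypothetical and ignores macros, comments and everything else a real preprocessor handles:

```cpp
#include <cassert>      // only needed if you want to assert on the results
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: return the text between the opening delimiter and the
// FIRST matching closing delimiter, as the header-name grammar dictates.
std::optional<std::string> scan_header_name(std::string_view directive) {
    const std::size_t open = directive.find_first_of("<\"");
    if (open == std::string_view::npos) return std::nullopt;

    // '<' is closed by '>', '"' is closed by another '"'.
    const char close = (directive[open] == '<') ? '>' : '"';
    const std::size_t end = directive.find(close, open + 1);
    if (end == std::string_view::npos) return std::nullopt;

    return std::string(directive.substr(open + 1, end - open - 1));
}
```

Given the directive #include <weird>name>, this sketch yields weird, which matches the file name in the Clang diagnostic above; given #include "weird>name" it yields the full weird>name.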
Ask the author of your compiler.
The C and C++ standards grant a lot of leeway to implementations over the interpretation of #include directives. There's no requirement that #include <foo.h> causes the inclusion of a file called "foo.h". For instance, a compiler can choose to ROT13 all the source file names if it likes. And for non-alphanumeric characters, the implementation can identify and remap certain character sequences. So if there were a platform where > regularly showed up in filenames, it's likely that a compiler for that platform would specify that, say, \g or something would be remapped to >. But the standard doesn't mandate a particular encoding.
Incidentally, the implementation could also just choose to allow #include <weird>name>. Since that is not well-formed under the language standards, an implementation is free to define a meaning for it as an extension.
Try the syntax below:
#include "weird>name"
T.C. left an interesting comment to my answer on this question:
Why aren't include guards in c++ the default?
T.C. states:
There's "header" and there's "source file". "header"s don't need to be
actual files.
What does this mean?
Perusing the standard, I see plenty of references to both "header files" and "headers". However, regarding #include, I noticed that the standard seems to make reference to "headers" and "source files". (C++11, § 16.2)
A preprocessing directive of the form
# include < h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely
by the specified sequence between the < and > delimiters, and causes the replacement
of that directive by the entire contents of the header. How the places are specified
or the header identified is implementation-defined.
and
A preprocessing directive of the form
# include " q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source *file*
identified by the specified sequence between the " delimiters. The named source *file*
is searched for in an implementation-defined manner.
I don't know if this is significant. It could be that "headers" in a C++ context unambiguously means "header files" but the word "sources" would be ambiguous so "headers" is a shorthand but "sources" is not. Or it could be that a C++ compiler is allowed leeway for bracket includes and only needs to act as if textual replacement takes place.
So when are header (files) not files?
The footnote mentioned by T.C. in the comments below is quite direct:
174) A header is not necessarily a source file, nor are the sequences
delimited by < and > in header names necessarily valid source file
names (16.2).
For the standard header "files" the C++ standard doesn't really mandate that the compiler uses a file or that the file, if it uses one, actually looks like a C++ file. Instead, the standard header files are specified to make a certain set of declarations and definitions available to the C++ program.
An alternative implementation to a file could be a readily packaged set of declarations represented in the compiler as a data structure which is made available when using the corresponding #include directive. I'm not aware of any compiler which does exactly that, but clang started to implement a module system which makes the headers available from some already processed format.
They do not have to be files. Since the C and C++ preprocessors are nearly identical, it is reasonable to look into the C99 rationale for some clarity on this. The Rationale for International Standard—Programming Languages—C says in section 7.1.2 Standard headers (emphasis mine):
In many implementations the names of headers are the names of files in
special directories. This implementation technique is not required,
however: the Standard makes no assumptions about the form that a file
name may take on any system. Headers may thus have a special status if
an implementation so chooses. Standard headers may even be built into
a translator, provided that their contents do not become “known” until
after they are explicitly included. One purpose of permitting these
header “files” to be “built in” to the translator is to allow an
implementation of the C language as an interpreter in a free-standing
environment where the only “file” support may be a network interface.
It really depends on the definition of files.
If you consider any database which maps filenames to contents to be a filesystem, then yes, headers are files. If you only consider files to be that which is recognized by the OS kernel open system call, then no, headers don't have to be files.
They could be stored in a relational database. Or a compressed archive. Or downloaded over the network. Or stored in alternate streams or embedded resources of the compiler executable itself.
In the end, though, textual replacement is done, and the text comes from some sort of indexed-by-name database.
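As a rough illustration of "textual replacement from an indexed-by-name database", here is a toy expander that resolves angle-bracket includes from an in-memory map instead of a filesystem. The names HeaderStore and expand_includes are invented for this sketch, and it assumes one directive per line, no leading whitespace and no nested includes:

```cpp
#include <cassert>      // only needed if you want to assert on the results
#include <map>
#include <sstream>
#include <string>

// Hypothetical "header store": header names mapped to their text. No files involved.
using HeaderStore = std::map<std::string, std::string>;

// Replace each line of the form  #include <name>  with the stored text for name.
// Lines naming an unknown header are left untouched in this sketch.
std::string expand_includes(const std::string& source, const HeaderStore& store) {
    std::istringstream in(source);
    std::string line;
    std::string out;
    const std::string prefix = "#include <";

    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0 && line.back() == '>') {
            const std::string name =
                line.substr(prefix.size(), line.size() - prefix.size() - 1);
            const auto it = store.find(name);
            if (it != store.end()) {
                out += it->second;   // textual replacement of the directive
                out += '\n';
                continue;
            }
        }
        out += line;
        out += '\n';
    }
    return out;
}
```

The store could just as well be backed by a relational database, an archive or a network lookup; the expander only needs a mapping from names to text.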
Dietmar mentioned modules and loading already processed content... but this is generally NOT allowable behavior for #include according to the C++ standard (modules will have to use a different syntax, or perhaps #include with a completely new quotation scheme other than <> or ""). The only processing that could be done in advance is tokenization. But contents of headers and included source files are subject to stateful preprocessing.
Some compilers implement "precompiled headers" which have done more processing than mere tokenization, but eventually you find some behavior that violates the Standard. For example, in Visual C++:
The compiler ... skips to just beyond the #include directive associated with the .h file, uses the code contained in the .pch file, and then compiles all code after filename.
Ignoring the actual source code prior to #include definitely does not conform to the Standard. (That doesn't prevent it from being useful, but you need to be aware that edits may not produce the expected behavior changes)
I have thousands of files that include other files using forward slashes:
#include <this/thread.hpp>
Why? The original program was written in VS 2008.
This causes fatal error C1083.
If I change the path to #include "..\this\thread.hpp" it finds the file
Windows accepts both forward slash and backslash as path separators, at least since Windows XP.
I cannot read minds, but I would guess that the forward slash was used for the sake of (potential) portability and/or standards compliance, because a backslash in an include directive has undefined behaviour in C++03.
C++03 §2.8/2:
If either of the characters ' or \, or either of the character sequences /* or // appears in a q-char-sequence or a h-char-sequence, or the character " appears in a h-char-sequence, the behavior is undefined.
The wording was changed in C++11 according to the draft. The behaviour is no longer undefined, but it is still implementation-defined.
C++11 draft §2.9/2:
The appearance of either of the characters ' or \ or of either of the character sequences /* or // in a
q-char-sequence or an h-char-sequence is conditionally supported with implementation-defined semantics, as
is the appearance of the character " in an h-char-sequence.
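The rule quoted above can be turned into a small check that flags header names relying on this undefined (C++03) or conditionally supported (C++11 draft) territory. The function name has_conditionally_supported_chars and its angle_form parameter are assumptions of this sketch, not anything taken from a compiler:

```cpp
#include <cassert>      // only needed if you want to assert on the results
#include <cstddef>
#include <string_view>

// True if seq contains ' or \, or the pairs /* or //, or (angle form only) ".
// angle_form selects the h-char-sequence rule, i.e. a name written inside <...>.
constexpr bool has_conditionally_supported_chars(std::string_view seq, bool angle_form) {
    for (std::size_t i = 0; i < seq.size(); ++i) {
        const char c = seq[i];
        if (c == '\'' || c == '\\') return true;
        if (angle_form && c == '"') return true;   // " is only an issue inside <...>
        if (c == '/' && i + 1 < seq.size() &&
            (seq[i + 1] == '*' || seq[i + 1] == '/'))
            return true;                           // comment-like sequences /* and //
    }
    return false;
}
```

Under this check, ..\this\thread.hpp is flagged because of the backslashes, while this/thread.hpp is not.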
About your bug:
If I change the path to #include "..\this\thread.hpp" it finds the file
Pay close attention to your two different include directives. There's more difference than the path separator. Firstly, the forward-slash version doesn't refer to the parent path (../). Secondly, the path is enclosed in < >, which is wrong in this case since it appears that the path is intended to be relative to the current file. See https://stackoverflow.com/a/21594/2079303 for more details.
Error C1083 is "cannot open include file", which typically means that the compiler couldn't find the file.
#include <this/thread.hpp>
Is there a directory called 'this' anywhere in your include directory paths? That is far more likely to be the problem than the forward slash.
While not C++-specific, \ is commonly treated as an escape character, at least inside the <...> delimiters, so if you really wanted it as a path separator you would need to type \\. To avoid doubling every backslash, you can simply use /, which separates folders in the same way. This also reduces confusion for someone who doesn't understand escaping, who might otherwise paste an escaped path literally into an Explorer address bar and wonder why it does not lead to the right place.
Note that the <...> form specifies a system file, while the "..." form is meant for a locally generated one. The two forms differ in their escaping requirements.
This tutorial says the following about #include "filename":
#include "filename" tells the compiler to look for the file in
directory containing the source file
doing the #include. If that fails,
it will act identically to the angled
brackets case.
What is meant by the sentence in bold?
Thanks.
The bold bit simply means that, if the file specified inside quotes cannot be located using the " method, it will revert to the <> method.
I should mention that the bit about where it looks for the include files is actually incorrect. In both cases (quotes and angle brackets), the search locations are implementation defined.
From the lex.header section:
The sequences in both forms of header-names are mapped in an implementation-defined manner to headers or to external source file names as specified in 16.2.
The 16.2 section follows:
A #include directive shall identify a header or source file that can be processed by the implementation.
A preprocessing directive of the form
# include < h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
A preprocessing directive of the form
# include " q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
# include < h-char-sequence> new-line
with the identical contained sequence (including > characters, if any) from the original directive.
So the statement "... tells the compiler to look for the file in directory containing the source file doing the #include ..." is wrong. It's totally up to the implementation how it finds the files, in both cases.
Having said that, the rest is correct. If the method used by the " type does not locate the header, the method used by the <> type is then used. That's really all the bold bit means.
You just have to read the documentation for your particular implementation to see what those methods are.
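The fallback itself can be sketched as follows. Everything about where each form looks is an assumption here (the standard leaves it implementation-defined): this sketch imitates the common practice of trying the including file's directory first, and SearchConfig, find_angle and find_quote are hypothetical names:

```cpp
#include <cassert>      // only needed if you want to assert on the results
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical search configuration; real implementations define their own places.
struct SearchConfig {
    std::set<std::string> existing_files;   // stands in for the filesystem
    std::string including_dir;              // directory of the file doing the #include
    std::vector<std::string> system_dirs;   // places searched for the <...> form
};

// <...> form: walk the implementation-defined list of places.
std::optional<std::string> find_angle(const SearchConfig& cfg, const std::string& name) {
    for (const auto& dir : cfg.system_dirs) {
        const std::string candidate = dir + "/" + name;
        if (cfg.existing_files.count(candidate) != 0) return candidate;
    }
    return std::nullopt;
}

// "..." form: try the quote-specific search first; if that fails, the directive
// is reprocessed as if it had been written with <...>.
std::optional<std::string> find_quote(const SearchConfig& cfg, const std::string& name) {
    const std::string local = cfg.including_dir + "/" + name;
    if (cfg.existing_files.count(local) != 0) return local;
    return find_angle(cfg, name);
}
```

The one-directional nature of the rule shows up directly: find_quote can end up in the system places, but find_angle never looks in the including file's directory.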
While the exact details are implementation-dependent, there are a few common practices. In most common compilers, using the quotes #include "filename.h" searches the current directory by default. Using angle brackets #include <filename.h> searches system-defined library directories. What it is saying is that if the current directory doesn't have the file you need, it will search the system directories instead.
Note that some compilers may be different, and your compiler itself may have options to change these directories. There is also the possibility that system headers don't actually exist, but that #include <foo.h> is directly recognized by the compiler to enable certain built-in definitions.
For the purposes of this question, I am interested only in Standard-Compliant C++, not C or C++0x, and not any implementation-specific details.
Questions arise from time to time regarding the difference between #include "" and #include <>. The argument typically boils down to two differences:
Specific implementations often search different paths for the two forms. This is platform-specific, and not in the scope of this question.
The Standard says #include <> is for "headers" whereas #include "" is for a "source file." Here is the relevant reference:
ISO/IEC 14882:2003(E)
16.2 Source file inclusion [cpp.include]
1 A #include directive shall identify a header or source file that can be processed by the implementation.
2 A preprocessing directive of the form
# include < h-char-sequence > new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
3 A preprocessing directive of the form
# include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
# include < h-char-sequence > new-line
with the identical contained sequence (including > characters, if any) from the original directive.
(Emphasis in quote above is mine.) The implication of this difference seems to be that the Standard intends to differentiate between a 'header' and a 'source file', but nowhere does the document define either of these terms or the difference between them.
There are few other places where headers or source files are even mentioned. A few:
158) A header is not necessarily a source file, nor are the sequences delimited by < and > in header names necessarily valid source file names (16.2).
Seems to imply a header may not reside in the filesystem, but it doesn't say that source files do, either.
2 Lexical conventions [lex]
1 The text of the program is kept in units called source files in this International Standard. A source file together with all the headers (17.4.1.2) and source files included (16.2) via the preprocessing directive #include, less any source lines skipped by any of the conditional inclusion (16.1) preprocessing directives, is called a translation unit. [Note: a C++ program need not all be translated at the same time. ]
This is the closest I could find to a definition, and it seems to imply that headers are not the "text of the program." But if you #include a header, doesn't it become part of the text of the program? This is a bit misleading.
So what is a header? What is a source file?
My reading is that the standard headers, included by use of <> angle brackets, need not be actual files on the filesystem; e.g. an implementation would be free to enable a set of "built-in" operations providing the functionality of iostream when it sees #include <iostream>.
On the other hand, "source files" included with #include "xxx.h" are intended to be literal files residing on the filesystem, searched in some implementation-dependent manner.
Edit: to answer your specific question, I believe that "headers" are limited only to those #includeable facilities specified in the standard: iostream, vector and friends---or by the implementation as extensions to the standard. "Source files" would be any non-standard facilities (as .h files, etc.) the programmer may write or use.
Isn't this saying that a header may be implemented as a source file, but then again may not be? As for "what is a source file", it seems very sensible for the standard not to spell this out, given the many ways that "files" are implemented.
The standard headers (string, iostream) don't necessarily have to be files with those names, or even files at all. As long as when you say
#include <iostream>
a certain list of declarations comes into scope, the Standard is satisfied. Exactly how that comes about is an implementation detail. (When the Standard was being written, DOS could only handle 8.3 filenames, but some of the standard header names were longer than that.)
As your quotes say: a header is something included using <>, and a source file is the file being compiled, or something included using "". Exactly where the contents of these come from, and what non-standard headers are available, is up to the implementation. All the Standard specifies is what is defined if you include the standard headers.
By convention, headers are generally system-wide things, and source files are generally local to a project (for some definition of project), but the standard wisely doesn't get bogged down in anything to do with project organisation; it just gives very general definitions that are compatible with such conventions, leaving the details to the implementation and/or the user.
Nearly all of the standard deals with the program after it's been preprocessed, at which time there are no such things as source files or headers, just the translation units that your last quote defines.
Hmmm...
My casual understanding has been that the distinction between <> includes and "" includes was inherited from C and (though not defined by the standards) the de facto meaning was that <> searched paths for system and compiler-provided headers and "" also searched local and user-specified paths.
The definition above seems to agree in some sense with that usage, but restricts the use of "header" to things provided by the compiler or system, exclusive of code provided by the user, even if that code has the traditional "interface goes in the header" form.
Anyway, very interesting.