When __DATE__ or __TIME__ is used in a header file, the result of preprocessing that header can differ from one inclusion to the next.
Under which circumstances does using __DATE__ or __TIME__ in a header file violate the one-definition-rule?
As a follow-up: Does the assert header violate the one-definition-rule?
If __TIME__ gives different results for different translation units, then it must not be used in a context where the same result is required across translation units. This means e.g. initialising an object (e.g. a class member) to __TIME__, where that initialiser is part of a header that gets included in multiple translation units, is going to be problematic.
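As a rough sketch of the problematic pattern (the struct and member names here are invented for illustration):
// build_info.h - included from several .cpp files
struct BuildInfo {
    // __TIME__ expands to a string literal such as "12:34:56" at the point of
    // inclusion, so two translation units compiled at different times may see
    // different token sequences for this default member initialiser.
    const char* compile_time = __TIME__;
};
If a.cpp and b.cpp both include this header and happen to be compiled a second apart, the two definitions of BuildInfo are no longer token-for-token identical, which is exactly the problem described above.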
__DATE__ is less likely to give different results across translation units if you start a fresh build, but incremental builds, which only recompile the files that changed, make it likely to become a problem as well.
assert is a macro that expands differently depending on how NDEBUG was defined when its header was included, so either the whole project must agree on whether NDEBUG should be defined, or functions defined in headers should avoid using assert.
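A minimal sketch of the pattern to avoid (the function and header names are made up):
// checks.h
#include <cassert>

inline int checked_index(int i, int size)
{
    assert(i >= 0 && i < size);  // expands to different tokens depending on NDEBUG
    return i;
}
If one translation unit includes checks.h with NDEBUG defined and another includes it without, the two definitions of checked_index consist of different token sequences, and the one-definition rule is violated.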
The one-definition rule only applies to variables, functions, class types, enumerations, and templates (see e.g. Section 3.2, ISO/IEC 14882, 1998 C++ Standard). __DATE__ and __TIME__ are both predefined macros that expand to a string literal, which is not one of the things the one-definition rule applies to.
assert() is also a preprocessor macro. If its expansion defines a variable, function, class type, enumeration, or template, then its use could potentially violate the one-definition rule, if that definition differed between translation units. Pragmatically, it is difficult to envisage a situation in which an implementation would have an assert() macro that expanded to such a definition.
Related
Consider a C++ header file compiled both into my_lib.a and into my_prog, which links with my_lib.a. The library was compiled without NDEBUG, while my_prog was compiled with NDEBUG. Would this result in an ODR violation?
What if my_lib.so is a shared library? Of course, ODR is irrelevant here, because there are two separate executables, but could NDEBUG affect std (or other) classes in a way that would prevent passing their instances correctly via the SO interface? E.g. if an std::vector instance was created in my_prog, can it be passed as an argument to the SO? Can NDEBUG affect memory allocation, etc.?
Does the Standard specify this?
20.5.2.2 Headers [using.headers]
A translation unit may include library headers in any order (Clause 5). Each may be included more than once, with no effect different from being included exactly once, except that the effect of including either <cassert> or <assert.h> depends each time on the lexically current definition of NDEBUG.
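For example, this wording makes the following legitimate within a single translation unit (a sketch):
#include <cassert>       // NDEBUG not defined: assert is active

void f(int x) { assert(x > 0); }   // checked at run time

#define NDEBUG
#include <cassert>       // re-included: assert now expands to something like ((void)0)

void g(int x) { assert(x > 0); }   // this assert is compiled out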
It is guaranteed not to be an issue for standard headers; however, the issue you have highlighted does apply to functions defined in headers you provide yourself.
6.2 One-definition rule [basic.def.odr]
There can be more than one definition of a class [function/enum/variable/etc] provided the definitions satisfy the following requirements:
[...]
each definition of D shall consist of the same sequence of tokens;
Note that this token sequence is the one produced after preprocessing, so if the definition contains any use of assert, it must preprocess to the same token sequence in every translation unit, i.e. NDEBUG must have the same setting during each compilation.
Based on the answer by sbi to this question,
An identifier can be declared as often as you want (statement 1)
But isn't it true that
an include guard in C++ just prevents the function declarations from
showing up more than once in a single source file (statement 2)
?
My question is: why this contradiction? Or have I misunderstood either of the two statements?
Yes, you can declare (but not define) a function multiple times in a single translation unit. And yes, include guards usually prevent this, but that is not their only purpose. Headers often define classes, templates, and inline functions; the header guard is needed to prevent multiple definitions of those entities from appearing in a single translation unit. Header guards also help prevent an exponential blowup in the number of times a header gets pasted into a translation unit.
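A typical guard, for reference:
// widget.h
#ifndef WIDGET_H
#define WIDGET_H

class Widget {                      // a definition: at most once per translation unit
public:
    int size() const { return n; }  // implicitly inline, because defined in the class
private:
    int n = 0;
};

#endif // WIDGET_H
Without the guard, a translation unit that reaches this header twice (for example via two other headers that each include it) would contain two definitions of Widget and fail to compile.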
Based on the answer here, it is only necessary to define the method inside the header file in order to make it inline. So my question is: why is there an inline keyword at all?
For example, if you want to define a free function inside a header (not a member function of a class), you will need to declare it inline, otherwise including it in multiple translation units will cause an ODR violation.
The usual approach is to declare the function in the header and define it in a separately compiled .cpp file. However, defining it in the header as an inline function allows every translation unit that includes it to have its definition available, which makes it easier for that function to actually be inlined; this is important if the function is heavily used.
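A sketch of the two approaches just described (the file and function names are illustrative):
// Approach 1: declare in the header, define in exactly one .cpp file.
// math_utils.h
int twice(int x);
// math_utils.cpp
int twice(int x) { return 2 * x; }

// Approach 2: define in the header; inline allows every translation unit
// that includes the header to carry its own (identical) definition.
// math_utils.h
inline int twice(int x) { return 2 * x; }
With approach 2, a compiler working on any translation unit that calls twice() has the body available and can substitute it directly.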
The historical reason for the inline keyword is that older C compilers weren't as capable of optimising code as good quality modern C and C++ compilers are. It was therefore, originally, introduced to allow the programmer to indicate a preference to inline a function (effectively insert the function body directly into the caller, which avoids overheads associated with function calls). Since there was no guarantee that a compiler could inline the function (e.g. some compilers could only inline certain types of functions) the inline keyword was made a hint rather than a directive.
The other typical use of inline is not discretionary: if a function is defined in multiple compilation units (e.g. because the containing header is #included by more than one of them), then leaving off inline violates the one-definition rule (and therefore results in undefined behaviour). inline tells the compiler (which in turn probably emits instructions to the linker) to accept such repeated definitions and produce a working program (e.g. without the linker complaining about multiple definitions).
As to why it is still needed, functionally: the preprocessor only does text substitution ... such as replacing an #include directive with the contents of the included header.
Let's consider two scenarios. The first has two source files (compilation units) that do:
#include "some_header"
//some other functions needed by the program, not pertinent here
and
#include "some_header"
int main()
{
foo();
}
where some_header contains the definition
void foo() { /* whatever */ }
The second scenario, however, omits the header file (yes, people do that) and has the first source file as
void foo() { /* whatever */ }
//some other functions needed by the program, not pertinent here
and the second as:
void foo() { /* whatever */ }
int main()
{
foo();
}
In both scenarios, assume the two compilation units are compiled and linked together. The result is a violation of the one definition rule (and, practically, typically results in a linker error due to multiply defined symbols).
Under current rules of the language (specifically, the preprocessor only doing text substitution), the two scenarios are required to be EXACTLY functionally equivalent. If one scenario were to print the value 42, then so should the other. If one has undefined behaviour (as is the case here), so does the other.
But, let's say for sake of discussion, that there was a requirement that a function be magically inlined if it is defined in a header. The code in the two scenarios would no longer be equivalent. The header file version would have defined behaviour (no violation of the one definition rule) and the second would have undefined behaviour.
Oops. We've just broken the equivalence of the two scenarios. That may not seem like much, but programmers would, in practice, have trouble understanding why one version links and the other doesn't. And they would have no way of fixing that ... other than moving code into a header file.
That means we need some way to keep the two scenarios equivalent, i.e. something in the code itself that makes them equivalent. Enter the inline keyword: prefix it to the definitions of foo().
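That is, in both scenarios the definitions of foo() become
inline void foo() { /* whatever */ }
and the two scenarios are equivalent again: both are well-formed, provided every definition of foo() is identical.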
Now, okay, one might argue that the preprocessor should do something a bit more intelligent, i.e. more than simple text substitution. But now you are on a slippery slope: the C and C++ standards specify, at length, that the preprocessor does exactly that and no more, and changing it would introduce a cascade of other changes. Whether or not that would be a good idea (and, certainly, there is some advocacy for eliminating the preprocessor from C++ entirely), it is a much bigger change, with numerous ramifications for the language and for programmers (who, for good or ill, can rely on the preprocessor behaving as it does).
Short answer: it's not required for inlining. Compilers generally ignore the inline keyword when deciding what to inline.
More comprehensive answers are already given in other questions, not to mention the second answer in the question you linked.
The reason we have to define inline functions in the header is that each compilation unit where that function is called must have the entire definition in order to replace the call, or substitute it. My question is why are we forced to put a definition in a header file if the compiler can and does do its own optimisations of inlining, which would require it to dig into the cpp files where the functions are defined anyway.
In other words, the compiler seems to me to have the ability to see the function "declaration" in a header file, go to the corresponding cpp file, pull the definition from it and paste it in the appropriate spot in the other cpp. Given that this is the case, why the insistence on defining the function in the header, as if the compiler couldn't "see" into other cpp files?
The MSDN says this about the /Ob2 optimisation setting:
/Ob2 The default value. Allows expansion of functions marked as inline, __inline, or __forceinline, and any other function that the compiler chooses (my emphasis).
The reason we're forced to provide definitions of inline functions in header files (or at least in some form that is visible to the implementation when it inlines a function in a given compilation unit) is the requirements of the C++ standard.
However, the standard does not go out of its way to prevent implementations (e.g. the toolchain or parts of it, such as the preprocessor, compiler proper, linker, etc) from doing things a little smarter.
Some particular implementations do things a little smarter, and can actually inline functions even in circumstances where their definitions are not visible to the compiler. For example, in a basic "compile all the source files, then link" toolchain, a smart linker may realise that a function is small and only called a few times, and elect to (in effect) inline it, even though the call sites and the function definition are in separate compilation units and the compiler therefore could not have done the inlining itself.
The thing is, the standard does not prevent an implementation from doing that. It simply states the minimum set of requirements for behaviour of ALL implementations.
Essentially, the requirement that the compiler have visibility of a function to be inlined is the minimum requirement from the standard. If a program is written in that way (e.g. all functions to be inlined are defined in their header file) then the standard guarantees that it will work with every (standard compliant) implementation.
But what does this mean for our smarter tool-chain? The smarter tool-chain must produce correct results from a program that is well-formed - including one that defines inlined functions in every compilation unit which uses those functions. Our toolchain is permitted to do things smarter (e.g. peeking between compilation units) but, if code is written in a way that REQUIRES such smarter behaviour (e.g. that a compiler peek between compilation units) that code may be rejected by another toolchain.
In the end, every C++ implementation (the toolchain, standard library, etc) is required to comply with requirements of the C++ standard. The reverse is not true - one implementation may do things smarter than the standard requires, but that doesn't generate a requirement that some other implementation do things in a compatible way.
Technically, inlining is not limited to being a function of the compiler. It may happen in the compiler or the linker. It may also happen at run time - for example "Just In Time" technology can, in effect, restructure executable code after it has been run a few times in order to enhance subsequent performance [this typically occurs in a virtual machine environment, which permits the benefits of such techniques while avoiding problems associated with self-modifying executables].
The inline keyword isn't just about expanding the implementation at the point of the call; it is in fact primarily about declaring that identical definitions of a function may appear in multiple translation units.
This has been covered in other questions before, which explain it much better than I can :)
Why are class member functions inlined?
Is "inline" implicit in C++ member functions defined in class definition
No, compilers traditionally can't do this. In the classic model, the compiler 'sees' only one cpp file at a time and can't look into any other cpp files. From each cpp file the compiler produces a so-called object file in the platform's native format, which is then linked using what is effectively a linker from the 1970s, which is as dumb as a hammer.
This model is slowly evolving. With more and more effective link-time optimisation (LTO), linkers are becoming aware of what the C++ code is and can perform their own inlining. However, even with link-time optimisation, inlining and optimisation done by the compiler are still far more effective than those done at link time: a lot of important context is lost when C++ code is converted to an intermediate format suitable for linking.
It's much easier for the compiler to expand a function inline if it has seen the definition of that function. The easiest way to let the compiler see the definition of a function in every translation unit that uses that function is to put the definition in a header and #include that header wherever the function will be used. When you do that you have to mark the definition as inline so that the compiler (actually the linker) won't complain about seeing the definition of that function in more than one translation unit.
In this article from Guru of the Week, it is said: "It is illegal to #define a reserved word." Is this true? I can't find anything in the standard, and I have already seen programmers redefining new, for instance.
17.4.3.1.1 Macro names [lib.macro.names]
1 Each name defined as a macro in a header is reserved to the implementation for any use if the translation unit includes the header.164)
2 A translation unit that includes a header shall not contain any macros that define names declared or defined in that header. Nor shall such a translation unit define macros for names lexically identical to keywords.
By the way, new is an operator, and it can be overloaded (replaced) by users providing their own version.
The corresponding section from C++11:
17.6.4.3.1 Macro names [macro.names]
1 A translation unit that includes a standard library header shall not #define or #undef names declared in any standard library header.
2 A translation unit shall not #define or #undef names lexically identical to keywords.
Paragraph 1 from C++03 has been removed. The second paragraph has been split into two. The first half has now been changed to state specifically that it only applies to standard library headers. The second part has been broadened to cover any translation unit, not just those that include standard library headers.
However, the Overview for this section of the standard (17.6.4.1 [constraints.overview]) states:
This section describes restrictions on C++ programs that use the facilities of the C++ standard library.
Therefore, if you are not using the C++ standard library, then you're okay to do what you will.
So to answer your question in the context of C++11: you cannot define (or undefine) any names identical to keywords in any translation unit if you are using the C++ standard library.
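For example, under the C++11 wording, a program that uses the standard library may not do this in any of its translation units (a trick sometimes seen in test code; the header name is made up):
#define private public     // 'private' is lexically identical to a keyword
#include "some_class.h"    // exposes the class internals to the test
whereas a program that uses no standard library facilities at all is not covered by this clause.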
They're actually wrong there, or at least don't tell the whole story. The real reason it's disallowed is that it violates the one-definition rule (which, by the way, is also mentioned as the second reason why it's illegal).
To see that redefining keywords is actually allowed, at least if you don't use the standard library, you have to look at an entirely different part of the standard, namely the translation phases. It says that the input is decomposed only into preprocessing tokens before preprocessing takes place, and at that level there is no distinction between private and fubar: they are both just identifiers to the preprocessor. Later, when the input is decomposed into tokens, the replacement has already taken place.
It has been pointed out that there's a restriction on programs that use the standard library, but it's not evident that the example redefining private does that (as opposed to the "Person #4: The Language Lawyer" snippet, which uses it in code that outputs to cout).
It's mentioned in the last example that the trick doesn't get trampled on by other translation units, and doesn't trample on them either. With this in mind you should probably consider the possibility that the standard library is being used somewhere else, which would put this restriction into effect.
Here's a little thing you can do if you don't want someone to use goto's. Just drop the following somewhere in his code where he won't notice it.
#define goto { int x = *(int *)0; } goto
Now every time he tries to use a goto statement, his program will crash.
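For example, a statement like goto end; would expand (roughly) to
{ int x = *(int *)0; } goto end;
so the null-pointer dereference is executed before the jump is ever reached.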
It's not, as far as I'm aware, illegal: no compiler I've come across yet will generate an error if you do
#define true false
#defining certain keywords is likely to generate compilation errors for other reasons, but a lot of them will just result in very strange program behaviour.