Rules for Argument Dependent name lookup in C++

There have been some questions on SO recently about ADL that have got me thinking. Basically, I am confused about which header files the compiler can search when performing ADL. Is it only the ones included by the user's code, or can it also search other header files that declare names in a namespace the user's code uses? For example, the std namespace spans multiple header files, but I may include only a subset of them. Now if I define a function that is not declared in that subset of headers but does exist in the std namespace (in a header I have not included), would a call to it still be ambiguous? This doubt arose mostly from the discussion on this question.

Here's how it works... There are 3 basic steps to compiling source into an executable:
preprocessing
compilation
linking
There is zero overlap in these steps in C++. Include directives are preprocessor directives and therefore happen before compilation. Template instantiation is a part of compilation, therefore it happens after preprocessing. Compilers do not search outside of the current translation unit for anything. Thus no, ADL, a compile-time event, cannot search headers that were not included.
The problem with your code linked in the comment to Buzz is that you can't know what headers are included or not by the standard headers. (Well, you can know if you go looking in them to find out, but the standard doesn't say.) Any one of your headers could, and apparently did, include <algorithm>. Once that happens:
http://www2.roguewave.com/support/docs/leif/sourcepro/html/stdlibref/merge.html
Your version becomes ambiguous with one of the definitions in namespace std because of ADL.
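A minimal sketch of that ambiguity and of two ways around it. The namespace lib and the type Box are hypothetical stand-ins for namespace std and a standard type, so the example does not depend on which standard headers happen to include each other:

```cpp
#include <cassert>

namespace lib {
struct Box {};

// Stands in for a library function template such as std::merge (hypothetical).
template <class T>
int merge(T, T) { return 1; }
}  // namespace lib

// The user's own function template with the same name and signature.
template <class T>
int merge(T, T) { return 2; }

int call_qualified() {
    lib::Box a, b;
    // merge(a, b);        // error: ambiguous, because ADL also finds lib::merge
    return ::merge(a, b);  // qualified name: ADL is not performed
}

int call_parenthesized() {
    lib::Box a, b;
    return (merge)(a, b);  // a parenthesized function name also suppresses ADL
}

// Quick check that both calls reach the user's version.
void self_check() {
    assert(call_qualified() == 2);
    assert(call_parenthesized() == 2);
}
```

Qualifying the call is usually the more readable of the two options; the parenthesized form is mainly useful when the function lives in the current scope and has no convenient qualified name.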

No. As long as you avoid including the header that contains the definition (or including a header that includes another header that contains the definition), there would be no ambiguous call. You could define your own iostream, string, vector, etc. as long as the C++ standard version has not been included (even if you have included other parts of the std namespace).

ADL is purely about lookup rules. As with all name lookups, only entities that have been previously declared can be found. So if a header file is the only place where a certain declaration occurs, and that header file hasn't been included directly or indirectly (yet), then the name introduced by that declaration won't be visible, with or without ADL.
(This isn't quite true: if the name being looked up appears in a dependent expression in a template definition, the final lookup won't occur until a template specialization is instantiated, in which case subsequent declarations can influence the result of the lookup.)
All(!) ADL does is expand the set of namespaces searched when trying to match an unqualified-id in a function call expression, so that it includes namespaces 'related' to the types of the arguments of the function call expression.
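Both points can be sketched in a few lines. The namespace shapes and all function names here are made up for illustration: the first call shows plain ADL, the second shows a dependent call inside a template finding a function that was declared only after the template was defined.

```cpp
#include <cassert>

// A template defined before the function it ends up calling is declared.
template <class T>
int describe(const T& value) {
    // Dependent, unqualified call: the ADL part of the lookup happens
    // when the template is instantiated, not here.
    return tag_of(value);
}

namespace shapes {
struct Circle {};
int area_code(const Circle&) { return 1; }
int tag_of(const Circle&) { return 7; }  // declared after describe()
}  // namespace shapes

int plain_adl() {
    shapes::Circle c;
    // No qualification and no using-directive: area_code is found because
    // namespace shapes is associated with the argument type.
    return area_code(c);
}

int adl_at_instantiation() {
    return describe(shapes::Circle{});
}

// Quick check of both lookups.
void self_check() {
    assert(plain_adl() == 1);
    assert(adl_at_instantiation() == 7);
}
```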

The C++ standard defines which names each include file will bring in, but it does not say that only those names will be made available.
This means that, in theory, just including <vector> may make std::map available.
This is unfortunate because:
Code that is incorrect because of missing include files can compile anyway, thanks to unportable dependencies between include files in a specific implementation.
You can get lookup problems because of ADL if any of your names are also names in std. This can also show up as a portability problem.
To explicitly answer your question: ADL will only consider the include files that have been seen, but you cannot portably know which ones they are, because in any given implementation a standard header is allowed to include another standard header.
So ADL MAY look in all possible standard headers, but you cannot count on that.

The compiler works on a single translation unit at a time, that is, all the source code in the input after the preprocessor has made a pass over it. At that point all the headers have already been expanded recursively. The thing is that when you include a given library header file, you cannot assume which other files it includes. You can always check, but I'm pretty sure it's an "implementation detail".

Related

How to hide functions in C++ header files

I am writing a header-only template library in C++. I want to be able to write some helper functions inside that header file that will not be visible from a cpp file that includes this header library.
Any tips on how to do this?
I know the static keyword can be used in cpp files to limit visibility to that one translation unit. Is there something similar for header files?

There isn't really a way.
The convention is to use a namespace for definitions that are not meant to be public. Typical names for this namespace are detail, meaning implementation details, or internal meaning internal to your library.
And as mentioned in the comments, C++20 modules change this situation.
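A minimal sketch of that convention, assuming a hypothetical library namespace mylib. The helper stays reachable as mylib::detail::wrap_index, so this is a signal to users rather than real hiding:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace mylib {
namespace detail {
// Implementation helper, not part of the public interface.
// Precondition: size > 0.
inline std::size_t wrap_index(std::size_t i, std::size_t size) {
    return i % size;
}
}  // namespace detail

// Public API: element access that wraps around the end of the vector.
// Precondition: v is not empty.
template <class T>
const T& cyclic_at(const std::vector<T>& v, std::size_t i) {
    return v[detail::wrap_index(i, v.size())];
}
}  // namespace mylib

// Quick usage check.
void self_check() {
    const std::vector<int> v{10, 20, 30};
    assert(mylib::cyclic_at(v, 4) == 20);
}
```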

The easy answer is no.
Headers do not exist to the linker, so all functions in a header actually end up in each translation unit that includes it. Technically, static (or anonymous-namespace) functions in a header are local to the translation unit that included them. This might work, but you will end up with multiple copies of each function and bloated code size.
Because of this, you should always mark functions defined in header files inline, or use something that implies inline, such as constexpr, if possible.
Functions in headers usually rely on being either inline or templated. A templated function is "weak", meaning the linker assumes that all copies are the same, keeps an arbitrary one, and discards the others.
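The options mentioned above can be put side by side. This is only a sketch: the header name mathutil.h and the function names are hypothetical, and the header's contents are written inline here so the example fits in one file.

```cpp
#include <cassert>

// --- contents of a hypothetical header "mathutil.h" ---
#ifndef MATHUTIL_H
#define MATHUTIL_H

// inline: may be defined in every translation unit that includes the header.
inline int twice(int x) { return 2 * x; }

// constexpr functions are implicitly inline.
constexpr int square(int x) { return x * x; }

// Function templates may likewise be defined in every including translation unit.
template <class T>
T cube(T x) { return x * x * x; }

// static: each including translation unit gets its own private copy.
static int thrice(int x) { return 3 * x; }

#endif  // MATHUTIL_H
// --- end of header ---

static_assert(square(4) == 16, "constexpr functions can be evaluated at compile time");

// Quick usage check.
void self_check() {
    assert(twice(21) == 42);
    assert(cube(3) == 27);
    assert(thrice(5) == 15);
}
```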

How to throw a compile error if the include paths for a library were set up not as intended

Our C++ library contains a file with a name that is (considered) equal to one of the standard library's headers. In our case this is "String.h", which Windows considers to be the same as "string.h", but for the sake of this question it could be any other file name used in the standard library.
Normally, this file name ambiguity is not a problem since a user is supposed to set up the include paths to only include the parent of the library folder (therefore requiring to include "LibraryFolder/String.h") and not the folder containing the header.
However, sometimes users get this wrong and directly set the include path to the containing folder. This means that "String.h" will be included in place of "string.h" in both the user code and in the standard library headers, resulting in a lot of compile errors that may not be easy to resolve or understand for beginners.
Is it possible, at compile time, to detect such wrongly set up include paths in our library's headers and raise a #warning or #error right away via a directive, based on some sort of check of how the header was included?

There's no failsafe way. If the compiler finds another file, it won't complain.
However, you could make it so you can detect it. In your own LibraryName/string.h, you could define a unique symbol, like
#define MY_STRING_H412a55af_7643_4bd6_be5c_4315d3a1e6b7
Then later in dependent code you could check
#ifndef MY_STRING_H412a55af_7643_4bd6_be5c_4315d3a1e6b7
#error "Custom standard library path not configured correctly"
#endif
Likewise you could use this to detect when the wrong version of the library was included.

[edit - as per comments]
Header inclusion can be summarized as :
Parse #include line to determine header name to look up
Depending on <Foo.h> or "Foo.h" form, determine set of locations (usually directories) to search
Interpret the header name, in an implementation-dependent way. (usually as a relative path). Note that this is not necessarily as a string, e.g. MSVC doesn't treat \ as a string escape character.
If the header is found (usually, if a file is found), replace the #include line with the content of that file. If not, fail the compilation.
(The parenthesized "usually" apply to MSVC, GCC, clang, etc but theoretically a compiler could compile directly from a git repository instead of disk files)
The problem here is that the test imagined (spelling of header name) must be located in the included header file. This test would necessarily be part of the replaced #include line, which therefore no longer exists and cannot be tested.
C++17 introduces __has_include but this does not affect the analysis: It would still have to occur in the included header file, and would not have the character sequence from the #include "Foo.h" available.
[old]
Probably the easiest way, especially for beginners is to have a LibraryName/LibraryName.h. Hopefully that name is unique.
The benefit is that once that works, users can replace #include "LibraryName.h" with just #include "String.h" as you know the path is right.
That said, "String.h" is asking for problems. Windows isn't case sensitive.

Use namespaces. In your case this would translate into something like this:
MyString/String.h
namespace my_namespace {
class string {
    // ...
};
}
Now, to make sure std::string or any other class named string is not accidentally used instead of my_namespace::string (by any means, including but not limited to setting up your include paths incorrectly), you need to refer to your type using its fully qualified name, namely my_namespace::string. By doing this you avoid any naming clashes and are guaranteed to get a compile error if you don't include the correct header file (unless there actually exists another class called my_namespace::string that is not yours). There are other ways to avoid these clashes (such as using my_namespace::string) but I'd rather be explicit about the types I'm using. This solution is costly, however, because it probably needs changes all over your code base (changing all strings to my_namespace::string).
A somewhat less cumbersome alternative would be to change the name of the header String.h to something like MyString.h. This would quickly introduce compile errors, but requires changing all your includes from #include "String.h" into #include "MyString.h" (which should be much less effort than the first option).
I cannot think of any other way that requires less effort as of now. Since you were looking for a solution that would work in all similar scenarios, I'd go with the namespaces if I were you and solve the problem once and for all. This would prevent any other existing/future naming clashes that may be in your code.
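A small sketch of the namespace approach. The members of my_namespace::string shown here (the constructor and length) are invented for illustration; the point is only that the fully qualified name can never silently resolve to std::string:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

namespace my_namespace {
// Hypothetical library string class; the real one would live in MyString/String.h.
class string {
public:
    explicit string(const char* text) : text_(text) {}
    std::size_t length() const { return text_.size(); }

private:
    std::string text_;
};
}  // namespace my_namespace

// The two types are unrelated, so mixing them up is a compile-time error.
static_assert(!std::is_same<my_namespace::string, std::string>::value,
              "my_namespace::string and std::string are distinct types");

// Both types can be used side by side without a clash.
std::size_t demo() {
    my_namespace::string mine("abc");
    std::string theirs("hello");
    return mine.length() + theirs.length();
}

// Quick usage check.
void self_check() {
    assert(demo() == 8);
}
```

If the header that defines my_namespace::string is not the one that gets included, every use of the qualified name fails to compile, which is the early, readable error the question asks for.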

Why headers when there are namespaces?

All the entities (variables, types, constants, and functions) of the standard C++ library are declared within the std namespace.
Namespaces
Now if everything is 'declared' in the std namespace, then why are these fancy headers out there?
Have already checked this but it was not helpful enough.

Namespaces are a way to organize our types. Keep all your math functions in the MyMath namespace, etc. It's also a way to separate out your types so that they don't clash with other types. So you can have both a MyTypes::string and a std::string in your code. The std namespace is the one the standard library uses for its entities.
Header files contain the public interface of code. A header tells you what is available, which types and functions it declares, and hopefully documents in comments how to use them. If you try to use code without including its corresponding header file, your code won't compile because the compiler can't find the types. Headers may or may not contain code in namespaces.

As you may know, when you write #include "a.h" in b.cpp, the entire contents of a.h are placed where the #include "a.h" line was in b.cpp. So if a.h is 200 lines of code and your actual code is 10 lines, the code that gets compiled is about 210 lines. Compile time increases if a.h is big or if you include it several times. (Note that if a.h itself includes something, the same story repeats.)
Let's suppose that the whole std library were in one .h file, inside the std namespace. It would be really big, and in each file where you use one of the std classes, even the smallest one, you would have to include all of it. That would make your program really big and compilation really slow.

The snippet you have is a requirement. The header files are where the standard library entities are actually declared. There is more than one header due to the size and diversity of the entities contained in the standard library. Essentially, the declarations are grouped according to functionality.

Headers and namespaces are effectively unrelated; one does not have anything to do with the other.
As such, one also does not make the other obsolete in any way.
Consider this declaration of std::string that you might find in <string>:
namespace std {
typedef basic_string<char> string;
}
The namespace is a categorisation for the names created (and used) by the declaration;
The header file is a categorisation for where to store the declaration on your hard disk.
There's no "redeclaration" whatsoever because the two concepts are unrelated.
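One way to see that the two concepts are independent: a namespace can be reopened any number of times, so several headers can each contribute declarations to the same namespace. The sketch below writes two hypothetical headers back to back in one file; the names geometry, area.h and perimeter.h are made up.

```cpp
#include <cassert>

// --- as if from a hypothetical header "geometry/area.h" ---
namespace geometry {
inline int rect_area(int w, int h) { return w * h; }
}  // namespace geometry

// --- as if from a hypothetical header "geometry/perimeter.h" ---
namespace geometry {  // reopening the same namespace adds to it
inline int rect_perimeter(int w, int h) { return 2 * (w + h); }
}  // namespace geometry

// Quick usage check: both names live in the one namespace.
void self_check() {
    assert(geometry::rect_area(2, 3) == 6);
    assert(geometry::rect_perimeter(2, 3) == 10);
}
```

A file that needs only rect_area would include just the first header and never see rect_perimeter, even though both belong to namespace geometry.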

Is it possible to write header file without include guard, and without multiple definition errors?

Just out of curiosity, I wanted to know if there is a way to achieve this.
In C++ we learn that we should avoid using macros. But when we use include guards, we do use at least one macro. So I was wondering if there is a way to write a macro-free program.

It's definitely possible, though it's unimaginably bad practice not to have include guards. It's important to understand what the #include statement actually does: the contents of another file are pasted directly into your source file before it's compiled. An include guard prevents the same code from being pasted again.
Including a file only causes an error if it would be incorrect to type the contents of that file at the position you included it. As an example, you can declare (note: declare, not define) the same function (or class) multiple times in a single compilation unit. If your header file consists only of declarations, you don't need to specify an include guard.
IncludedFile.h
class SomeClassSomewhere;
void SomeExternalFunction(int x, char y);
Main.cpp
#include "IncludedFile.h"
#include "IncludedFile.h"
#include "IncludedFile.h"
int main(int argc, char **argv)
{
return 0;
}
While declaring a function (or class) multiple times is fine, it isn't okay to define the same function (or class) more than once. If a class or a non-inline function is defined twice in the same compilation unit, the compiler rejects the second definition as a redefinition; if a non-inline function is defined in several compilation units, the linker doesn't know which one to choose and gives up with a "multiply defined symbols" error.
In C++, it's very common for header files to include class definitions. An include guard prevents the #included file from being pasted into your source file a second time, which means your definitions will only appear once in each compilation unit, and the compiler won't reject them as redefinitions.
Rather than trying to figure out when you need to use them and when you don't, just always use include guards. Avoiding macros most of the time is a good idea; this is one situation where they aren't evil, and using them here isn't dangerous.
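To make the mechanism concrete, here is a sketch that pastes the contents of a hypothetical widget.h twice into one file, which is what two #include lines would do. The guard macro name WIDGET_H_INCLUDED is an arbitrary choice.

```cpp
#include <cassert>

// First paste of the hypothetical widget.h: the macro is not yet defined,
// so everything up to #endif is compiled.
#ifndef WIDGET_H_INCLUDED
#define WIDGET_H_INCLUDED
struct Widget { int id; };
inline int next_id(const Widget& w) { return w.id + 1; }
#endif  // WIDGET_H_INCLUDED

// Second paste: the macro is now defined, so the preprocessor skips the body.
// Without the guard, the second struct Widget would be a redefinition error.
#ifndef WIDGET_H_INCLUDED
#define WIDGET_H_INCLUDED
struct Widget { int id; };
inline int next_id(const Widget& w) { return w.id + 1; }
#endif  // WIDGET_H_INCLUDED

// Quick usage check.
void self_check() {
    assert(next_id(Widget{41}) == 42);
}
```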

It is definitely doable. I have used some early C++ libraries that followed an already misguided approach from C, which essentially required the user of a header to include certain other headers before it. This approach is based on thoroughly understanding what creates a dependency on what, and on using declarations rather than definitions wherever possible:
Declarations can be repeated multiple times, although they are obviously required to be consistent, and some entities can't be declared (e.g. an enum can only be defined; since C++ 2011 it is also possible to declare enums).
Definitions can't be repeated but are only needed when the definition is really used. For example, using a pointer or a reference to a class doesn't need its definition but only its declaration.
The approach to writing headers would, thus, essentially consist of avoiding definitions as much as possible and using declarations wherever possible: these can be repeated in a header file, or the corresponding headers can even be included multiple times. The primary need for definitions comes in when you need to derive from a base class: this can't be avoided and essentially means that the user would have to include the header for the base class before using any of the derived classes. The same is true for members defined directly in the class, but by using the pimpl-idiom the need for member definitions can be pushed to the implementation file.
Although there are a few advantages to this approach, it also has a few severe drawbacks. The primary advantage is that it more or less enforces a very thorough separation and dependency management. On the other hand, overly aggressive separation, e.g. using the pimpl-idiom for everything, also has a negative performance impact. The biggest drawback is that a lot of the implementation details are implicitly visible to the user of a header, because the headers this one depends on need to be included first, explicitly. At least the compiler enforces that you get the order of include files right.
From a usability and dependency point of view I think there is a general consensus that headers are best self-contained and that the use of include guards is the lesser evil.
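A short sketch of the declaration-only style described above, with hypothetical names (Engine, Fuel, Car). The declarations at the top could appear in a header any number of times; the definitions at the bottom are needed only where members are actually used.

```cpp
#include <cassert>

// Declarations: repeatable, and enough for pointers and references.
class Engine;
class Engine;                      // repeating a declaration is fine
enum class Fuel : int;             // C++11 opaque enum declaration
int horsepower(const Engine& e);   // reference to an incomplete type is fine

struct Car {
    Engine* engine;  // pointer to an incomplete type
    Fuel fuel;       // allowed: the enum's underlying type is already fixed
};

// Definitions: must appear exactly once, and only where they are really needed.
enum class Fuel : int { Petrol, Diesel };

class Engine {
public:
    explicit Engine(int hp) : hp_(hp) {}
    int hp() const { return hp_; }

private:
    int hp_;
};

int horsepower(const Engine& e) { return e.hp(); }

// Quick usage check.
void self_check() {
    Engine e(150);
    Car c{&e, Fuel::Diesel};
    assert(horsepower(*c.engine) == 150);
    assert(c.fuel == Fuel::Diesel);
}
```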

It is possible to do so if you ensure the same header file is not being included in the same translation unit multiple times.
Also, you could use:
#pragma once
if portability is not your concern.
However, you should avoid using #pragma once over Include Guards because:
It is not standard & hence non portable.
It is less intuitive and not all users might know of it.
It provides no big advantage over the classic and very well known Include Guards.

In short, yes, even without pragmas, but only if you can guarantee that every header file is included only once. However, given how code tends to grow, it becomes increasingly difficult to honour that guarantee as the number of header files increases. This is why not using header guards is considered bad practice.
Pre-processor macros are frowned upon, yes. However, header include guards are a necessary evil because the alternative is so much worse (#pragma once will only work if your compiler supports it, so you lose portability).
With regard to pre-processor macros, use this rule:
If you can come up with an elegant solution that does not involve a macro, then avoid them.

Does the non-portable, non-standard
#pragma once
work sufficiently well for you? Personally, I'd rather use macros for preventing reinclusion, but that's your decision.

How to define (non-method) functions in header libraries

When writing a header library (like Boost), can one define free-floating (non-method) functions without (1) bloating the generated binary and (2) incurring "unused" warnings?
When I define a function in a header that's included by multiple source files which in turn are linked into the same binary, the linker complains about redefinitions. One way around this is to make the functions static, but this reproduces the code in each translation unit (BTW, can linkers safely deduplicate these?). Furthermore, this triggers compiler warnings about the function being unused.
I was trying to look for an example of a free-floating function in Boost, but I couldn't find one. Is the trick to contain everything in a class (or template)?

If you really want to define the function (as opposed to declaring it), you'll need to use inline to prevent linker errors.
Otherwise, you can declare the function in the header file and provide its implementation separately in your source file.

You can use the inline keyword:
inline void wont_give_linker_errors(void)
{
// ...
}

Er... The answer to your question is simply don't. You just don't define functions in header files, unless they are inline.
'static' functions can also be defined in headers, but this is only useful for very specific, rare purposes. Using 'static' just to work around a multiple-definition problem is utter nonsense.
Again, header files are for non-defining function declarations. Why on Earth would you want to define functions there?
You said you are writing a "header library". What's a "header library"? Please note that Boost defines its "functions" in header files because its "functions" are not really functions; they are function templates. Function templates have to be defined in header files (well, almost). If that weren't the case, Boost wouldn't be doing something as strange as defining anything in header files.

Besides the already mentioned inline, with most compilers templates have to be defined in headers (and with all compilers it's allowed). Since boost is mostly templates, that explains why it is almost all headers.

People have suggested inline but that violates the very first part of your question i.e. it bloats the code as the full definition is inserted into the code at each call of the function. The answer to your overall question is therefore "No".
If you mark them as static then they are still defined in each source file as you rightly pointed out but only once and so that's a better option than inline if code size is the only issue. I don't know if linkers can, or are allowed to, spot the duplicates and merge them. I suspect not.
Edit:
Just to clear up any confusion as to whether I support the notion of using static and/or defining functions within headers files generally then rest assured I don't. This was simply meant as a technical response as to the differences between functions marked inline and static defined in header files. Nothing more.
