How should a C++ api be laid out?

Imagine I'm publishing a C++ library with its include files in a folder named api.
// file: api/mylib/fwd/foo.h
inline int mylib_foo();
// file: api/mylib/impl/foo.h
inline int mylib_foo() { return 42; }
In the context of this question, is it advisable for library builders to always use the 'full path' to their own (api) include files?
// file: api/mylib/all.h
#include "mylib/fwd/foo.h" // as opposed to "fwd/foo.h"
#include "mylib/impl/foo.h" // as opposed to "impl/foo.h"
Or could it be acceptable to rely on the fact that the preprocessor 'often' searches the including folder first?
If you don't want to add /home/xtofl/libs/mylib/api to the compiler's include path but rather ... #include "/home/xtofl/libs/mylib/api/mylib/all.h", or even just put mylib next to the client code.
// file: api/mylib/all.h
#include "fwd/foo.h"
#include "impl/foo.h"

N.B. this is nothing to do with how the project is laid out (as the title of your question says), because in all cases you are assuming that the headers are in a sub-directory called fwd. The question is about what kind of #include directives to use given a particular layout. Anyway ...
In the context of this question, is it advisable for library builders to always use the 'full path' to their own (api) include files?
// file: api/mylib/all.h
#include "mylib/fwd/foo.h" // as opposed to "fwd/foo.h"
#include "mylib/impl/foo.h" // as opposed to "impl/foo.h"
That assumes that the including code adds the api dir to its search paths, and you already mentioned two ways that assumption can fail:
If you don't want to add /home/xtofl/libs/mylib/api to the compiler's include path but rather ... #include "/home/xtofl/libs/mylib/api/mylib/all.h", or even just put mylib next to the client code.
So IMHO this is better:
Or could it be acceptable to rely on the fact that the preprocessor 'often' searches the including folder first?
Yes, I think it's better to rely on that, and do:
// file: api/mylib/all.h
#include "fwd/foo.h"
#include "impl/foo.h"
That handles the case where api is in the search path, and the case where it's not, and the case where mylib isn't in a directory called mylib at all.
It relies on the implementation-defined rule that searching for headers included with #include "..." starts in the including file's directory, but that is common to all compilers I know, and is a safer assumption than the other assumptions about where the files are installed.
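As a rough illustration of that lookup rule, here is a minimal sketch that models how a quoted include is typically resolved: the including file's directory first, then the configured search directories. The function name resolve_quoted_include and the set of paths standing in for the file system are assumptions made for this example; they are not part of any compiler.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of the usual lookup order for #include "name".
// `files` stands in for the file system (an assumption for this sketch).
// Returns the resolved path, or an empty string if nothing was found.
inline std::string resolve_quoted_include(const std::string& including_dir,
                                          const std::string& name,
                                          const std::vector<std::string>& search_dirs,
                                          const std::set<std::string>& files) {
    // 1. Look next to the file that contains the #include directive.
    std::string candidate = including_dir + "/" + name;
    if (files.count(candidate) != 0) {
        return candidate;
    }
    // 2. Fall back to the search directories (e.g. -I paths), in order.
    for (const std::string& dir : search_dirs) {
        candidate = dir + "/" + name;
        if (files.count(candidate) != 0) {
            return candidate;
        }
    }
    return std::string();
}
```

In this model, "fwd/foo.h" included from api/mylib/all.h resolves with no search directories at all, whereas "mylib/fwd/foo.h" only resolves once api is among the search directories.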

Related

Protecting certain include locations

I'm building a little language that will compile to C or C++, I haven't decided yet, however I have come across a dilemma concerning the #include keyword.
My language will come with a standard library that will be incorporated into the language, and be accessible much like that of C or C++ with the standard includes such as #include <string>.
My compiler can automatically tell the difference between user includes and standard library includes, but my issue lies in how the GCC compiler uses the -I flag.
Let's take Java as an example. One of the default packages (folder) is called java.util. If I try to make my own folder called java.util inside my project, I get the error:
The package java.util conflicts with a package accessible from another module: java.base
Meaning it is included by default.
I would like to do the same thing in C++, but am worried that a user could (hypothetically) do a relative path include and cause a conflict.
Take for example, I use the flag like so: -I ../some/folder.
However then the user could simply type #include "../some/folder" to access the same content. Is there any way I can restrict this, and like the title of the question suggests, "protect" the folder from being called like that?
Furthermore, suppose there is a file inside that folder called test.h and the user decides to create their own local file called test.h and include it. How will the conflict be resolved? Will the compiler pick the local file over the one made available via flags?
An example of a basic implementation is as follows: (General syntax, no specific language)
boolean userDefine = false;
string defineName = "foo";
// Do something to determine if <> define or "" define.
if (userDefine) {
// Returns #include "foo"
return "#include \"" + defineName + "\"";
} else {
// Returns #include "stdlib/foo"
return "#include \"stdlib/" + defineName + "\"";
}
But then again, the user could include the folder so that it satisfies the first condition and still gain access.
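One possible variation on the pseudocode above, sketched in C++: emit the angle-bracket form for standard-library names. With GCC, #include <...> does not search the including file's directory, only the -I and system directories, whereas the quoted form searches the including file's directory first, so a local test.h would win there. The function name make_include_directive and the stdlib/ prefix are assumptions carried over from the pseudocode, not an established convention.

```cpp
#include <string>

// Builds the #include line that the (hypothetical) code generator emits.
inline std::string make_include_directive(const std::string& name,
                                          bool is_user_include) {
    if (is_user_include) {
        // Quoted form: the including file's directory is searched first.
        return "#include \"" + name + "\"";
    }
    // Angle form: only -I and system directories are searched (GCC behaviour).
    return "#include <stdlib/" + name + ">";
}
```

This does not stop a user from writing the same path by hand; it only keeps a local file from silently shadowing the library header.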

It's pretty much the standard practice to put any #include files at the very beginning of the C++ source file, as the first order of business.
Of course, a #include can appear anywhere in the C++ source file, and there are situations when that happens but, if you were to grab some random C++ source from github, chances are pretty good that all the #include files will be at the beginning of the file.
So, all you have to do, is to make arrangements that your library's #include is always at the beginning, and use the standard #ifndef/#define guards in your header files. Then, manual inclusion of them subsequently will have no effect whatsoever, no matter what path is used.
Of course, this won't stop anyone from manually #undefing your guard, to create some chaos. However, C++ never had a reputation for reliably preventing you from shooting yourself in the foot, and is unlikely to earn that reputation in the foreseeable future; so what? Actually, most compilers implement #pragma once, which might be a slightly better foot self-shooting prevention approach...
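To make the guard idea concrete, here is a minimal single-file sketch. It pastes the same guarded header text twice, which is what the preprocessor effectively sees when a header is included a second time under a different path. The names MYLANG_STDLIB_FOO_H and stdlib_foo are made up for this example.

```cpp
// First inclusion: the guard macro is not defined yet, so the body is compiled.
#ifndef MYLANG_STDLIB_FOO_H
#define MYLANG_STDLIB_FOO_H
inline int stdlib_foo() { return 1; }
#endif

// Second inclusion (for example via a different relative path): the guard is
// already defined, so everything up to #endif is skipped.
#ifndef MYLANG_STDLIB_FOO_H
#define MYLANG_STDLIB_FOO_H
inline int stdlib_foo() { return 2; }  // never reaches the compiler
#endif
```

Without the guard, the second copy would be a redefinition error; with it, the later inclusion has no effect.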

How to throw a compile error if the include paths for a library were set up not as intended

Our C++ library contains a file with a name that is (considered) equal to one of the standard library's headers. In our case this is "String.h", which Windows considers to be the same as "string.h", but for the sake of this question it could be any other file name used in the standard library.
Normally, this file name ambiguity is not a problem since a user is supposed to set up the include paths to only include the parent of the library folder (therefore requiring to include "LibraryFolder/String.h") and not the folder containing the header.
However, sometimes users get this wrong and directly set the include path to the containing folder. This means that "String.h" will be included in place of "string.h" in both the user code and in the standard library headers, resulting in a lot of compile errors that may not be easy to resolve or understand for beginners.
Is it possible, during compile-time, to detect such wrongly set up include paths in our libraries' header and throw a compile #warning or #error right away via directive, based on some sort of check on how the inclusion path was?

There's no failsafe way. If the compiler finds another file, it won't complain.
However, you could make it so you can detect it. In your own LibraryName/string.h, you could define a unique symbol, like
#define MY_STRING_H412a55af_7643_4bd6_be5c_4315d3a1e6b7
Then later in dependent code you could check
#ifndef MY_STRING_H412a55af_7643_4bd6_be5c_4315d3a1e6b7
#error "Custom standard library path not configured correctly"
#endif
Likewise you could use this to detect when the wrong version of the library was included.
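A version check along the same lines might look like the sketch below. It is shown as a single file for brevity; the macro name MYLIB_STRING_H_VERSION and the version number 3 are assumptions for this example.

```cpp
// --- would live in LibraryFolder/String.h (hypothetical) ---
#define MYLIB_STRING_H_VERSION 3

// --- would live in dependent code, after including the header ---
#if !defined(MYLIB_STRING_H_VERSION)
#error "LibraryFolder/String.h was not the header that got included"
#elif MYLIB_STRING_H_VERSION < 3
#error "LibraryFolder/String.h is older than this code requires"
#endif

// Exposed as a constant so the check can also be inspected from C++ code.
constexpr int mylib_string_h_version = MYLIB_STRING_H_VERSION;
```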

[edit - as per comments]
Header inclusion can be summarized as :
Parse #include line to determine header name to look up
Depending on <Foo.h> or "Foo.h" form, determine set of locations (usually directories) to search
Interpret the header name, in an implementation-dependent way. (usually as a relative path). Note that this is not necessarily as a string, e.g. MSVC doesn't treat \ as a string escape character.
If the header is found (usually, if a file is found), replace the #include line with the content of that file. If not, fail the compilation.
(The parenthesized "usually" apply to MSVC, GCC, clang, etc but theoretically a compiler could compile directly from a git repository instead of disk files)
The problem here is that the test imagined (spelling of header name) must be located in the included header file. This test would necessarily be part of the replaced #include line, which therefore no longer exists and cannot be tested.
C++17 introduces __has_include but this does not affect the analysis: It would still have to occur in the included header file, and would not have the character sequence from the #include "Foo.h" available.
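For reference, a minimal sketch of what __has_include can test: whether a header name can be found at all, not how the current file was spelled in the #include that pulled it in. The macro HAVE_STD_STRING_HEADER and the constant below are made up for this example.

```cpp
#if defined(__has_include)
#  if __has_include(<string>)
#    define HAVE_STD_STRING_HEADER 1
#  else
#    define HAVE_STD_STRING_HEADER 0
#  endif
#else
   // Pre-C++17 compiler without the extension: the answer is unknown.
#  define HAVE_STD_STRING_HEADER -1
#endif

constexpr int have_std_string_header = HAVE_STD_STRING_HEADER;
```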
[old]
Probably the easiest way, especially for beginners is to have a LibraryName/LibraryName.h. Hopefully that name is unique.
The benefit is that once that works, users can replace #include "LibraryName.h" with just #include "String.h" as you know the path is right.
That said, "String.h" is asking for problems. Windows isn't case sensitive.

Use namespaces. In your case this would translate into something like this:
MyString/String.h
namespace my_namespace {
class string {
...
};
}
Now, to make sure that std::string or any other class named string is not accidentally used instead of my_namespace::string (by any means, including but not limited to setting up your include paths incorrectly), you need to refer to your type using its fully qualified name, namely my_namespace::string. By doing this you avoid any naming clashes and are guaranteed to get a compile error if you don't include the correct header file (unless there actually exists another class called my_namespace::string that is not yours). There are other ways to avoid these clashes (such as using my_namespace::string) but I'd rather be explicit about the types I'm using. This solution is costly, however, because it probably needs changes all over your code base (changing all strings to my_namespace::string).
A somewhat less cumbersome alternative would be to change the name of the header String.h to something like MyString.h. This would quickly introduce compile errors but requires changing all your includes from #include "String.h" into #include "MyString.h" (this should be much less effort compared to the first option).
I cannot think of any other way that requires less effort as of now. Since you were looking for a solution that would work in all similar scenarios, I'd go with the namespaces if I were you and solve the problem once and for all. This would prevent any other existing or future naming clashes that may be in your code.
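A minimal sketch of the namespace approach, assuming a small wrapper class; the members shown here and the helper name_length are invented for illustration and are not the asker's actual class.

```cpp
#include <cstddef>
#include <string>

namespace my_namespace {
// Hypothetical library string type; it cannot be confused with std::string
// as long as callers spell out the namespace.
class string {
public:
    explicit string(const char* text) : data_(text) {}
    std::size_t size() const { return data_.size(); }
private:
    std::string data_;
};
}  // namespace my_namespace

// Client code refers to the type by its fully qualified name. If the wrong
// header was picked up, my_namespace::string is undeclared and this fails
// to compile instead of silently using another string type.
inline std::size_t name_length() {
    my_namespace::string s("abc");
    return s.size();
}
```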

How Can I Save my Header File in the standard gallery of header files (or whatever it's called)?

So I do a lot of programming dealing in math, and I really hate having to write algorithms to find lists of prime numbers as well as check if numbers are prime, over and over again.
Now I want to make this file once, call it 'primes.h', and store it so that at any point I can just open up a program and just #include <primes> So what do I do with this header file? Also, what do I do with the 'primes.cpp' that goes along with it?
Thanks,
Live2Code

The names like <primes> are reserved for the standard library functionality, you should not use that sort of name - unless you convince the standards committee to include your code into the standard, that is.
You can certainly store a primes.h in a "central place where every project can get to it". I wouldn't put it in with the header files that come with your compiler, because that means you have to copy everything over again when you switch to Visual Studio 2013 or some such. Instead, add a new include directory to your project, which points at your own include directory. It's not very difficult to do, and it's a "once per project" thing.
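As one possible shape for such a file, here is a header-only sketch of primes.h, which sidesteps the question of what to do with primes.cpp because everything is inline. The namespace mymath and the function names are assumptions for this example; it would be included as #include "primes.h" once its directory is on the include path.

```cpp
// primes.h (hypothetical header-only sketch)
#ifndef MYMATH_PRIMES_H
#define MYMATH_PRIMES_H

#include <vector>

namespace mymath {

// Trial division; `d <= n / d` avoids overflow in `d * d`.
inline bool is_prime(unsigned long n) {
    if (n < 2) return false;
    for (unsigned long d = 2; d <= n / d; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// All primes in [2, limit], in increasing order.
inline std::vector<unsigned long> primes_up_to(unsigned long limit) {
    std::vector<unsigned long> result;
    for (unsigned long n = 2; n <= limit; ++n) {
        if (is_prime(n)) result.push_back(n);
    }
    return result;
}

}  // namespace mymath

#endif  // MYMATH_PRIMES_H
```

If the implementation grows, the function bodies could move into primes.cpp, which would then be compiled into a library and linked into each project.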

Create a library, add the directory of the library and the header file to "path".
The header file does not have to have a ".h" extension. This is the vector file from the STL:
#ifndef __SGI_STL_VECTOR
#define __SGI_STL_VECTOR
#include <stl_range_errors.h>
#include <stl_algobase.h>
#include <stl_alloc.h>
#include <stl_construct.h>
#include <stl_uninitialized.h>
#include <stl_vector.h>
#include <stl_bvector.h>
#endif /* __SGI_STL_VECTOR */
You can just create a header file named "primes" and
#include <primes>

Best practice for including from include files

I was wondering if there is some pro and contra having include statements directly in the include files as opposed to have them in the source file.
Personally I like to have my includes "clean" so, when I include them in some c/cpp file I don't have to hunt down every possible header required because the include file doesn't take care of it itself. On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards, the files have to be parsed first. Is this just a matter of taste, or are there any pros/cons over the other?
What I mean is:
sample.h
#ifndef ...
#include "my_needed_file.h"
#include ...
class myclass
{
};
#endif
sample.c
#include "sample.h"
my code goes here
Versus:
sample.h
#ifndef ...
class myclass
{
};
#endif
sample.c
#include "my_needed_file.h"
#include ...
#include "sample.h"
my code goes here

There's not really any standard best practice, but by most accounts you should include what you really need in the header, and forward-declare what you can.
If an implementation file needs something not required by the header explicitly, then that implementation file should include it itself.
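A small sketch of that rule, assuming a class with one by-value member and one pointer member. The Logger type and the member names are invented for this example.

```cpp
// sample.h (hypothetical)
#ifndef SAMPLE_H
#define SAMPLE_H

#include <string>   // required: std::string is stored by value
#include <utility>  // required: std::move

class Logger;       // a forward declaration is enough for a pointer member

class myclass {
public:
    explicit myclass(std::string name)
        : name_(std::move(name)), logger_(nullptr) {}

    const std::string& name() const { return name_; }
    bool has_logger() const { return logger_ != nullptr; }

private:
    std::string name_;
    Logger* logger_;  // only sample.c would need Logger's full definition
};

#endif  // SAMPLE_H
```

A source file that contains nothing but #include "sample.h" compiles on its own, and Logger's header is only pulled in by the files that actually call into it.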

The language makes no requirements, but the almost universally
accepted coding rule is that all headers must be self-sufficient;
a source file which consists of a single directive
including the header should compile without errors. The usual
way of verifying this is for the implementation file to include
its header before anything else.
And the compiler only has to read each include once. If it
can determine with certainty that it has already read the file,
and on reading it, it detects the include guard pattern, it has
no need to reread the file; it just checks if the controlling
preprocessor token is (still) defined. (There are
configurations where it is impossible for the compiler to detect
whether the included file is the same as an earlier included
file. In which case, it does have to read the file again, and
reparse it. Such cases are fairly rare, however.)

A header file is supposed to be treated like an API. Let us say you are writing a library for a client, you will provide them a header file for including in their code, and a compiled binary library for linking.
In such a scenario, adding a '#include' directive in your header file will create a lot of problems for your client as well as for you, because now you will have to provide unnecessary header files just to get stuff compiling. Forward declaring as much as possible enables a cleaner API. It also enables your client to implement their own functions over your header if they want.
If you are sure that your header is never going to be used outside your current project, then either way is not a problem. Compilation time is also not a problem if you are using include guards, which you should have been using anyway.

Having more (unwanted) includes in headers means having more (unwanted) symbols visible at the interface level. This can cause a lot of havoc: it may lead to symbol collisions and a bloated interface.

On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards
If your compiler doesn't remember which files have include guards and avoid re-opening and re-tokenising the file then get a better compiler. Most modern compilers have been doing this for many years, so there's no cost to including the same file multiple times (as long as it has include guards). See e.g. http://gcc.gnu.org/onlinedocs/cpp/Once_002dOnly-Headers.html
Headers should be self-sufficient and include/declare what they need. Expecting users of your header to include its dependencies is bad practice and a great way to make users hate you.
If my_needed_file.h is needed before sample.h (because sample.h requires declarations/definitions from it) then it should be included in sample.h, no question. If it's not needed in sample.h and only needed in sample.c then only include it there, and my preference is to include it after sample.h, that way if sample.h is missing any headers it needs then you'll know about it sooner:
// sample.c
#include "sample.h"
#include "my_needed_file.h"
#include ...
#include <std_header>
// ...
If you use this #include order then it forces you to make sample.h self-sufficient, which ensures you don't cause problems and annoyances for other users of the header.

I think the second approach is the better one, for the following reason.
Suppose you have a function template in your header file:
class myclass
{
template<class T>
void method(T& a)
{
...
}
};
Suppose you don't use it in myclass.cxx but do want to use it in xyz.cxx. If you go with your first approach, xyz.cxx ends up including all the files that are required for myclass.cxx, which are of no use to xyz.cxx.
That is the only difference I can think of right now. So I would say one should go with the second approach, as it makes your code easier to maintain in the future.

Custom headers higher than standard?

Is it reasonable to put custom headers higher in include section than standard headers?
For example include section in someclass.hpp:
#include "someclass.h"
#include "global.h"
#include <iostream>
#include <string>
Is it best practice? What is the profit if it is?

The reason is that if you forget to include a dependent header in someclass.h, then whatever implementation file includes it as the first header, will get a warning/error of undefined or undeclared type, and whatnot. If you include other headers first, then you could be masking that fact - supposing the included headers define the required types, functions, etc. Example:
my_type.h:
// Suppressed include guards, etc
typedef float my_type;
someclass.h:
// Suppressed include guards, etc
class SomeClass {
public:
my_type value;
};
someclass.cpp:
#include "my_type.h" // Contains definition for my_type.
#include "someclass.h" // Will compile because my_type is defined.
...
This will compile fine. But imagine you want to use SomeClass in your program. If you don't include my_type.h before including someclass.h, you'll get a compiler error saying my_type is undefined. Example:
#include "someclass.h"
int main() {
SomeClass obj;
obj.value = 1.0;
}

It is fairly common practice to #include "widget.h" as the first thing in widget.cpp. What this does is ensure that widget.h is self-contained, i.e. does not inadvertently depend on other header files.
Beyond that, I think it's essentially a matter of personal preference.

There are two important observations to be made before delving in the specifics:
When you develop a new header/source pair, it is important to check that the header is self-contained. The easiest way to do so is to include it first in a source file.
It is best not to include extraneous things before including a header you do not own, as this could create strange issues in case of conflicting macros or function overloads.
Therefore, the answer depends on whether you have unit tests or not.
A general rule of thumb is to include headers starting with the Standard Library, then 3rd party headers (including Open Source projects), then your own middleware, utilities, etc... and finally the headers local to this library. It more or less follows the order of dependencies to comply with observation 2.
The only exception I have seen was the one header corresponding to the current source file, which would be included first to make sure it is self-contained (observation 1)... but this only holds if you don't have unit tests, for if you do then the unit test source file is a very good place to check this.

While it is just personal choice, I would prefer to include standard headers first. A few reasons:
Any set of #ifdef..#define would be correctly mapped, rather than standard headers misinterpreting them. This goes for conditional compilation as well as the values of some macros while the standard headers are being compiled.
Any changed or new function in a standard header may conflict with your function, and the compiler would emit the error in the header file, which would be complicated to resolve.
All required standard headers should be placed in one header (preferably some precompiled header); include that header, and then include your custom header. This would reduce compilation time.

Start with the system headers.
If there are no dependencies between the headers both ways work, but since programming is essentially communication, not with the computer but with other humans, it is important to make it logical and easy to understand. And my opinion is that it is better to start with the system headers.
I base this on one of my very first programming courses (in 1984, I think), where we programmed in Lisp and were taught to think like this: you start with the normal Lisp language, and then you create a new language that is more useful for your application by adding some functions and data types. If you for example add dates and the ability to manipulate dates, this new language could be called Lisp-with-dates. Then you could use Lisp-with-dates to create a new language with calendar functionality, which could be called Lisp-with-calendars. Like layers in an onion.
Similarly, you can view C as having a "core" language, without any headers, and then you can for example expand this language into a new, bigger language with I/O functionality by #including stdio.h. You add more and more stuff to the core language by #including more headers. (I am aware that the term "C language" in other contexts refers to the entire standard, with all the standard headers, but bear with me here.) Each new #included header creates a new, bigger language, and an additional layer of the onion.
Now, to me it seems that the standard headers obviously should be the inner part of this onion, and therefore before the custom headers. You can create the language C-with-monsters by adding stuff to C-with-I/O, but the people who created C-with-I/O did not start with C-with-monsters.

Wherever you place an #include, the C++ compiler treats it the same way.
