Protecting certain include locations

I'm building a little language that will compile to C or C++ (I haven't decided yet), but I have come across a dilemma concerning the #include directive.
My language will come with a standard library that will be incorporated into the language, and be accessible much like that of C or C++ with the standard includes such as #include <string>.
My compiler can automatically tell the difference between user includes and standard library includes, but my issue lies in how the GCC compiler uses the -I flag.
Let's take Java as an example. One of the default packages (folder) is called java.util. If I try to make my own folder called java.util inside my project, I get the error:
The package java.util conflicts with a package accessible from another module: java.base
Meaning it is included by default.
I would like to do the same thing in C++, but am worried that a user could (hypothetically) do a relative path include and cause a conflict.
Take for example, I use the flag like so: -I ../some/folder.
However then the user could simply type #include "../some/folder" to access the same content. Is there any way I can restrict this, and like the title of the question suggests, "protect" the folder from being called like that?
Furthermore, suppose there is a file called test.h inside that folder, and the user creates their own local file called test.h and includes it. How will the conflict be resolved? Will the compiler pick the local file over the one found via the -I flag?
An example of a basic implementation is as follows: (General syntax, no specific language)
boolean userDefine = false;
string defineName = "foo";
// Do something to determine if <> define or "" define.
if (userDefine) {
    // Returns #include "foo"
    return "#include \"" + defineName + "\"";
} else {
    // Returns #include "stdlib/foo"
    return "#include \"stdlib/" + defineName + "\"";
}
But then again, the user could include the folder so that it satisfies the first condition and still gain access.
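To make the concern concrete, here is a minimal sketch of a check the little-language compiler itself could run before emitting a user include. It assumes the compiler knows the directory of the including file and the directory holding the standard library. The function name targets_protected_dir and the example paths are hypothetical, and the check is purely lexical, so it does not catch symlinks or extra -I flags.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns true when a quoted include written in a file
// located in `including_dir` would resolve to something inside `protected_dir`.
// Purely lexical: it does not touch the disk or follow symlinks.
bool targets_protected_dir(const fs::path& including_dir,
                           const std::string& include_name,
                           const fs::path& protected_dir) {
    const fs::path target = (including_dir / include_name).lexically_normal();
    const fs::path rel =
        target.lexically_relative(protected_dir.lexically_normal());
    // An empty result means no relative path exists; a leading ".." means
    // the target lies outside the protected directory.
    return !rel.empty() && *rel.begin() != fs::path("..");
}
```

A front end could reject the user's include with its own error message whenever this returns true, instead of passing the directive through to GCC.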

It's pretty much the standard practice to put any #include files at the very beginning of the C++ source file, as the first order of business.
Of course, a #include can appear anywhere in the C++ source file, and there are situations when that happens but, if you were to grab some random C++ source from github, chances are pretty good that all the #include files will be at the beginning of the file.
So all you have to do is arrange for your library's #include to always come at the beginning, and use the standard #ifndef/#define guards in your header files. Any subsequent manual inclusion of them will then have no effect whatsoever, no matter what path is used.
Of course, this won't stop anyone from manually #undefing your guard, to create some chaos. However, C++ never had a reputation for reliably preventing you from shooting yourself in the foot, and is unlikely to earn that reputation in the foreseeable future; so what? Actually, most compilers implement #pragma once, which might be a slightly better foot self-shooting prevention approach...
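As a rough illustration of the guard idea, the sketch below pastes the same header text twice into one file, which is what the preprocessor effectively does when a header is included twice. The header name stdlib/foo.h, the guard macro MYLANG_STDLIB_FOO_H and the function foo_version are made up for the example.

```cpp
// --- first inclusion of the (hypothetical) stdlib/foo.h ---
#ifndef MYLANG_STDLIB_FOO_H
#define MYLANG_STDLIB_FOO_H
inline int foo_version() { return 1; }
#endif

// --- a later, manual inclusion of the same header via some other path ---
#ifndef MYLANG_STDLIB_FOO_H  // already defined, so the body below is skipped
#define MYLANG_STDLIB_FOO_H
inline int foo_version() { return 1; }  // a redefinition error without the guard
#endif
```

The second copy contributes nothing to the translation unit, regardless of which path was used to reach it.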

Related

ignore a main in a header file

I'm trying to use (Ligra) in a project. The framework works as long as the chief header "ligra.h" is included. Trouble is, that header has an implementation of parallel_main, which is a macro wrapper around main with OpenMP trickery. So if I wanted to write a simple program:
#include "ligra.h"
#include <iostream>
int main(){
    std::cout<<"Hello World";
    return 0;
}
It would not compile. Redefinition of symbol main.
Also, I need a parallel_main, with the exact macro trickery done in the "parallel.h" header.
So I think I have two options:
1) Modify the file: add a pair of #ifdef LIGRA_MAIN's and not define the macro at compile time. Thus I can have my own main and not have a redefinition. Trouble is, I need my project to use the upstream version of Ligra, and Julian Shun, the original developer, has probably forgotten about his project (and GitHub, since he ignored more than one pull request).
2) Use/Write a #pragma that would strip that function out at the include stage.
I don't know how to do that last part, and would be very much in your debt if someone who did, reached out.

A solution that does not involve modifying library files (but is somewhat brittle) could be to do the following:
1) #include "ligra/parallel.h" (this does #define parallel_main main).
2) #undef parallel_main to prevent this rewriting of function names.
3) #include "ligra/ligra.h" as usual. Since parallel.h has an include guard, its repeated inclusion is prevented and parallel_main will not be redefined.
4) Proceed as normal.
You might also want to wrap this into a header so you only have to write it once.
Alternatively, you could do what @user463035818 suggests and redefine main only for the inclusion of ligra.h for very similar effect. The difference is in the names that the parallel_main function(s) from ligra will get.
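The #undef approach can be sketched in a single file as follows. The two guarded sections are stand-ins for the real Ligra headers, pasted inline so the example compiles on its own. The guard name SKETCH_PARALLEL_H and the body of parallel_main are assumptions, not Ligra's actual code.

```cpp
// --- stand-in for ligra/parallel.h (guard name is hypothetical) ---
#ifndef SKETCH_PARALLEL_H
#define SKETCH_PARALLEL_H
#define parallel_main main
#endif

// Cancel the renaming before the main header is seen.
#undef parallel_main

// --- stand-in for ligra.h, which pulls in parallel.h again ---
#ifndef SKETCH_PARALLEL_H  // guard already set: the #define is not repeated
#define SKETCH_PARALLEL_H
#define parallel_main main
#endif
int parallel_main(int argc, char* argv[]) {  // stays an ordinary function
    (void)argv;
    return argc;
}
```

Because the macro is gone by the time the function is defined, the program is free to supply its own main and call parallel_main explicitly if it wants to.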

You can simply not include ligra.h. If there is something useful in that file, then create a copy of the file - excluding the main function - and use that copy.
Sure, that means that if the upstream ligra.h is updated, your copy will not have the corresponding changes. However, given the premise "the original developer has probably forgotten about his project", this is probably not a problem. If the premise is wrong, then a better approach would be to create a pull request to make the framework usable as a library.

How to throw a compile error if the include paths for a library were set up not as intended

Our C++ library contains a file with a name that is (considered) equal to one of the standard library's headers. In our case this is "String.h", which Windows considers to be the same as "string.h", but for the sake of this question it could be any other file name used in the standard library.
Normally, this file name ambiguity is not a problem since a user is supposed to set up the include paths to only include the parent of the library folder (therefore requiring to include "LibraryFolder/String.h") and not the folder containing the header.
However, sometimes users get this wrong and directly set the include path to the containing folder. This means that "String.h" will be included in place of "string.h" in both the user code and in the standard library headers, resulting in a lot of compile errors that may not be easy to resolve or understand for beginners.
Is it possible, at compile time, to detect such wrongly set up include paths in our library's headers and raise a #warning or #error right away via a directive, based on some sort of check on how the header was included?

There's no failsafe way. If the compiler finds another file, it won't complain.
However, you could make it so you can detect it. In your own LibraryName/string.h, you could define a unique symbol, like
#define MY_STRING_H412a55af_7643_4bd6_be5c_4315d3a1e6b7
Then later in dependent code you could check
#ifndef MY_STRING_H412a55af_7643_4bd6_be5c_4315d3a1e6b7
#error "Custom standard library path not configured correctly"
#endif
Likewise you could use this to detect when the wrong version of the library was included.

[edit - as per comments]
Header inclusion can be summarized as:
1) Parse the #include line to determine the header name to look up.
2) Depending on the <Foo.h> or "Foo.h" form, determine the set of locations (usually directories) to search.
3) Interpret the header name in an implementation-dependent way (usually as a relative path). Note that this is not necessarily as a string, e.g. MSVC doesn't treat \ as a string escape character.
4) If the header is found (usually, if a file is found), replace the #include line with the content of that file. If not, fail the compilation.
(The parenthesized "usually" apply to MSVC, GCC, clang, etc., but theoretically a compiler could compile directly from a git repository instead of disk files.)
The problem here is that the test imagined (spelling of header name) must be located in the included header file. This test would necessarily be part of the replaced #include line, which therefore no longer exists and cannot be tested.
C++17 introduces __has_include but this does not affect the analysis: It would still have to occur in the included header file, and would not have the character sequence from the #include "Foo.h" available.
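For reference, this is roughly all that __has_include can tell you. It is a minimal sketch assuming a C++17 compiler. The macro SKETCH_HAS_STD_STRING and the constant has_std_string are made-up names.

```cpp
// __has_include only reports whether a header can be found; it says nothing
// about how the current file was itself spelled in an #include line.
#ifdef __has_include
#  if __has_include(<string>)
#    define SKETCH_HAS_STD_STRING 1
#  endif
#endif
#ifndef SKETCH_HAS_STD_STRING
#  define SKETCH_HAS_STD_STRING 0
#endif

constexpr bool has_std_string = SKETCH_HAS_STD_STRING != 0;
```

The check answers "is some header with this name reachable?", which is a different question from "which path was used to reach me?".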
[old]
Probably the easiest way, especially for beginners is to have a LibraryName/LibraryName.h. Hopefully that name is unique.
The benefit is that once that works, users can replace #include "LibraryName.h" with just #include "String.h" as you know the path is right.
That said, "String.h" is asking for problems. Windows isn't case sensitive.

Use namespaces. In your case this would translate into something like this:
// MyString/String.h
namespace my_namespace {
    class string {
        ...
    };
}
Now, to make sure std::string or any other class named string is not accidentally used instead of my_namespace::string (by any means, including but not limited to setting up your include paths incorrectly), you need to refer to your type using its fully qualified name, namely my_namespace::string. By doing this you avoid any naming clashes and are guaranteed to get a compile error if you don't include the correct header file (unless there actually exists another class called my_namespace::string that is not yours). There are other ways to avoid these clashes (such as using my_namespace::string), but I'd rather be explicit about the types I'm using. This solution is costly, however, because it probably needs changes all over your code base (changing all strings to my_namespace::string).
A somewhat less cumbersome alternative would be to change the name of the header String.h to something like MyString.h. This would quickly introduce compile errors but requires changing all your includes from #include "String.h" into #include "MyString.h" (this should be much less effort compared to the first option).
I cannot think of any other way that requires less effort as of now. Since you were looking for a solution that would work in all similar scenarios, I'd go with the namespaces if I were you and solve the problem once and for all. This would prevent any other existing/future naming clashes that may be in your code.
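A minimal sketch of the namespace approach, assuming a toy my_namespace::string that merely wraps std::string. The members shown and the helper length_of are invented for illustration.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

namespace my_namespace {
// Hypothetical stand-in for the library's own string class.
class string {
public:
    explicit string(const char* text) : data_(text) {}
    std::size_t size() const { return data_.size(); }
private:
    std::string data_;
};
}  // namespace my_namespace

// The two names never collide: each is reachable only through its namespace.
static_assert(!std::is_same<my_namespace::string, std::string>::value,
              "distinct types despite the shared unqualified name");

// Code that spells out the qualified name fails to compile if the
// library header (and therefore my_namespace::string) is missing.
std::size_t length_of(const my_namespace::string& s) { return s.size(); }
```

If the wrong header were picked up, my_namespace::string would simply not exist, and the error would point at the first use of the qualified name.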

C/C++ #include formatting best practice

In my time with C/C++ I have encountered different ways to handle the file path for the #include directive when including your .h file in your .cpp/.c file. The Google style guide alludes to using part of the file path in your #include. That being said, I currently work on a project (albeit a small one) where a nicely laid out Makefile (for G++) and structure was laid out for me when I "inherited" the code. Namely, there is a directory named /project_name and inside is the Makefile and several sub-directories. For example, /project_name/inc holds the .h files and /project_name/src holds the .cpp files. The Makefile is set to look into each sub-directory to compile the source code.
My question is, given the directory structure and the Makefile, what is the "preferred" method for #include. The two alternatives I am successful with using are listed below.
#include "mycode.h" // no knowledge of path, assumes structure that I described
#include "../../project_name/inc/mycode.h" // seems a bit convoluted, but shows the file structure better
Are there any other options that I'm missing?

Use neither. Rather, put all your public headers in some hierarchy with a single root. For instance if your project is foo, put all your public headers in, say, include/foo, but don't hesitate to group your headers per component:
include/foo/io/printer.hh
include/foo/io/reader.hh
include/foo/job/job.hh
include/foo/job/scheduler.hh
Then your code uses only <foo/io/printer.hh> and so forth, which requires that you pass the proper -I $(top_srcdir)/include flags when building your project. This set-up simplifies things if you have to install your headers, as your code and users' code will use the headers in exactly the same way.
If in addition you have private headers, use the same structure, but in another hierarchy, for instance:
src/io/parser.hh
You may, or may not, decide to use src/foo. The advantage of not using src/foo is that it is easier to see what are public and private headers.
But never use relative paths.

The first option appears to be the less constraining one.
If tomorrow the structure of the project directory changes, would you rather modify one makefile or change every single custom #include to take the change into account?
Using the second option will make changes to the directory structure take more time, and the time needed to adapt everything will scale with the project size (whereas with the makefile change, it's constant).

This is a subjective answer; if both work then both are correct, however I prefer to have no knowledge of the source tree within the source code, only in the project settings/Makefile, so for me the first option is best:
#include "mycode.h"

As trojanfoe says, it is very subjective but still I would go with this style below.
#include "mycode.h"
If there is ever a need to restructure the folders containing the .cpp/.h files, then the style below becomes fragile and is bound to fail. You will be forced to change the .cpp files to provide the correct relative path.
#include "../../project_name/inc/mycode.h"

It seems that encoding the source structure in #include (i.e. option 2) makes code less portable.
I recently ran into a problem when bringing a component (built with Jam) into CocoaPods to be consumed on iOS. The main problem was that the code contained includes like the following:
#include <impl/xxxxx.h>
When brought into CocoaPods, this would not compile because the "impl" path was not recognized, and I did not find a setting for that (I may be wrong; I'm just sharing what I found). Changing it to
#include "impl/xxxx.h"
made the pod compile successfully, but it still could not be used by clients, since client code has no idea about the "impl" structure. (If the public interface did not include this structure it would work, but that was not my case.)
I ended up removing all of these source-structure includes, which amounts to using option 1.
So my point is:
1) For public headers, avoid including the source structure.
2) For private headers/code, use "private/source/structure/xxxx.h" to keep the compile settings simple.
Please share if you have anything to add.

Custom headers higher than standard?

Is it reasonable to put custom headers higher in include section than standard headers?
For example include section in someclass.hpp:
#include "someclass.h"
#include "global.h"
#include <iostream>
#include <string>
Is it best practice? What is the benefit if it is?

The reason is that if you forget to include a dependent header in someclass.h, then whatever implementation file includes it as the first header, will get a warning/error of undefined or undeclared type, and whatnot. If you include other headers first, then you could be masking that fact - supposing the included headers define the required types, functions, etc. Example:
my_type.h:
// Suppressed include guards, etc
typedef float my_type;
someclass.h:
// Suppressed include guards, etc
class SomeClass {
public:
    my_type value;
};
someclass.cpp:
#include "my_type.h" // Contains definition for my_type.
#include "someclass.h" // Will compile because my_type is defined.
...
This will compile fine. But imagine you want to use SomeClass in your program. If you don't include my_type.h before including someclass.h, you'll get a compiler error saying my_type is undefined. Example:
#include "someclass.h"
int main() {
    SomeClass obj;
    obj.value = 1.0;
}
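The usual remedy is for someclass.h to include my_type.h itself. Here is a single-file sketch of that, with both headers pasted inline so it compiles on its own. The guard names MY_TYPE_H and SOMECLASS_H and the helper read_value are assumptions added for the example.

```cpp
// --- my_type.h ---
#ifndef MY_TYPE_H
#define MY_TYPE_H
typedef float my_type;
#endif

// --- someclass.h, now self-contained ---
#ifndef SOMECLASS_H
#define SOMECLASS_H
// In the real header this line would read: #include "my_type.h"
// (its content is pasted above in this single-file sketch).
class SomeClass {
public:
    my_type value;
};
#endif

// --- user code: including someclass.h alone is now enough ---
my_type read_value(const SomeClass& obj) { return obj.value; }
```

Including someclass.h first in someclass.cpp is what would have exposed the missing dependency in the first place.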

It is fairly common practice to #include "widget.h" as the first thing in widget.cpp. What this does is ensure that widget.h is self-contained, i.e. does not inadvertently depend on other header files.
Beyond that, I think it's essentially a matter of personal preference.

There are two important observations to be made before delving into the specifics:
1) When you develop a new header/source pair, it is important to check that the header is self-contained. The easiest way to do so is to include it first in a file.
2) It is best not to include extraneous things before including a header you do not own, as this could create strange issues in case of conflicting macros or function overloads.
Therefore, the answer depends on whether you have unit tests or not.
A general rule of thumb is to include headers starting with the Standard Library, then 3rd party headers (including Open Source projects), then your own middleware, utilities, etc... and finally the headers local to this library. It more or less follows the order of dependencies to comply with observation 2.
The only exception I have seen was the one header corresponding to the current source file, which would be included first to make sure it is self-contained (observation 1)... but this only holds if you don't have unit tests, for if you do then the unit test source file is a very good place to check this.

While it is just personal choice, I would prefer to include standard headers first. A few reasons:
Any set of #ifdef..#define would be correctly mapped, rather than standard headers misinterpreting them. This goes for conditional compilation as well as the values of some macros while standard headers are being compiled.
Any change/new function in a standard header may conflict with your function, and the compiler would emit the error in the header file, which would be complicated to resolve.
All required standard headers should be placed in one header (preferably some precompiled header); include that header, and then include your custom headers. This would reduce compilation time.

Start with the system headers.
If there are no dependencies between the headers both ways work, but since programming is essentially communication, not with the computer but with other humans, it is important to make it logical and easy to understand. And my opinion is that it is better to start with the system headers.
I base this on one of my very first programming courses (in 1984, I think), where we programmed in Lisp and were taught to think like this: you start with the normal Lisp language, and then you create a new language that is more useful for your application by adding some functions and data types. If you for example add dates and the ability to manipulate dates, this new language could be called Lisp-with-dates. Then you could use Lisp-with-dates to create a new language with calendar functionality, which could be called Lisp-with-calendars. Like layers in an onion.
Similarly, you can view C as having a "core" language, without any headers, and then you can for example expand this language into a new, bigger language with I/O functionality by #including stdio.h. You add more and more stuff to the core language by #including more headers. (I am aware that the term "C language" in other contexts refers to the entire standard, with all the standard headers, but bear with me here.) Each new #included header creates a new, bigger language, and an additional layer of the onion.
Now, to me it seems that the standard headers obviously should be the inner part of this onion, and therefore before the custom headers. You can create the language C-with-monsters by adding stuff to C-with-I/O, but the people who created C-with-I/O did not start with C-with-monsters.

Wherever you place an #include, the C++ compiler treats it the same way.

How to include header files

Does it matter how I import header files? I've seen double quotes as well as arrows used.
#include <stdlib.h>
#include "Some_Header.h"
Does it matter if they're capitalized a certain way as well? Experimenting around with this, it seems neither matters, but I figure there must be a reason for tutorials doing it the way they do.
Another question is, (coming from Java here), how do I access a class outside the file it was defined in? Say, I have one.cpp and two.cpp.
In one.cpp:
class Something {
...
In two.cpp:
class SomethingElse {
Something *example;
...
Like that? In Java you'd just preface a class name with "public." In C++, wrangling classes seems to be a bit more difficult..

Angle brackets in #include directives mean the search path is limited to the "system" include directories. Double quotes mean the search path includes the current directory, followed by the system include directories.
The case of the filename matters when your OS is using a filesystem that is case sensitive. It sounds like you might be using Windows or Mac OS X, where filenames are case insensitive by default.

Angle brackets look for the header in system header directories (e.g. /usr/include). Quotes take an absolute or relative pathname, such as /path/to/header.h or ../headers/abc.h.
For accessing classes from other files, just #include the other file with the class. Be sure to structure your program so that no file is included more than once.

First the simple question:
Does it matter if they're capitalized a certain way as well?
In most cases, includes refer to files, and the compiler must be able to locate the file that you are including in the system. For that reason capitalization matters on all systems where the filesystem is case sensitive. If you want to keep a minimum of portability you should be consistent in the names of the files and the includes. (Linux filesystems are case sensitive by default; macOS and Windows filesystems are case insensitive by default, although NTFS can be configured to be case sensitive.)
Now, does it actually matter how the file is named? No, it does not, as long as you are consistent in the inclusions. Also note that it is advisable to follow a pattern to ease the inclusion.
Does it matter how I import header files?
The standard is not really clear to this point, and different implementations followed different paths. The standard defines that they may be different, as the set of locations and order in which the compiler will search for the included file is implementation defined and can differ if the inclusion is with angle brackets or double quotes. If inclusion with quotes fails to locate the file, the compiler must fall back to process the include as if it had been written with angle brackets.
#include <x.h> // search in order in set1 of directories
#include "x.h" // search in order in set2 of directories
// if search fails, search also in set1
That implies that if a file is only present in set1, both types of includes will locate it. If a file is present in set2 but not set1 only the quote include will locate it. If different files with the same name are present in set1 and set2, then each inclusion type will find and include a different file. If two files with the same name are present in the common subset of set1 and set2, but the ordering of the sets is different each inclusion type can locate a different file.
Back to the real world, most compilers will include only the current directory in set2, with set1 being all the system include locations (which can usually be extended with compiler arguments) In these cases, if a file is only present in the current directory, #include "a.h" will locate it, but #include <a.h> will not.
Now, while that is the common behavior, there are some implied semantics that are idiomatic in C/C++. In general, angle brackets are used to include system headers and external headers, while double quotes are used to include local files. There is a grey zone on whether a library in the same project should be considered local or external. That is, even if always including with double quotes will work, most people will use angle brackets to refer to headers that are not part of the current module.
Finally, while no compiler that I know of does it, the standard allows an implementation (compiler) not to produce the standard headers as real files, but process the inclusion of standard headers internally. This is the only case where theoretically #include "vector" could fail to include the definitions of the std::vector class (or any other standard header). But this is not a practical issue, and I don't think it will ever be.

Question 1
Does it matter how I import header files?
Does it matter if they're capitalized a certain way as well?
It doesn't matter, but the usual practice is:
Use angle brackets for system headers.
Use double quotes for user-defined headers (your own headers).
Questions 2 & 3
Another question is, (coming from Java here), how do I access a class outside the file it was defined in?
You need to place class definition in a header file and include that header file wherever you want to use the class.
For your case, it would look like below.
//One.h
#ifndef ONE_H
#define ONE_H
class Something
{
public:
    void doSomething(){}
};
#endif
//Two.cpp
#include "One.h"
class SomethingElse
{
    Something *example;
};

Firstly, #include is a C preprocessor directive and not strictly part of the C++ language as such. You can find out more about it here, although this is specifically for the GNU C preprocessor, so it may be different from what you're using. I think you should always assume case sensitivity in include files. Not doing so might make it difficult to port your code to a case-sensitive OS such as UNIX.
The use of "" or <> is rather subtle, as explained above, and most of the time you will notice no difference. Using "" generally searches the current directory first. I tend not to use this as:
I know where my headers are - I always specify them with -I on the compile line.
I've been caught out before when a locally bodged copy of a header overrode the central copy I was hoping to pick up.
I've also noticed some side effects such as when using make to create dependency trees (I can't quite recall the issue - it did treat the different includes differently, following some and not others, but this was about 7 years ago)
Secondly, your question about how to reference functions in other files is answered here.
