In C++, why are cyclical directory dependencies bad? - c++

I'm asking this about a C++ project developed on Linux. Consider this:
I have two peer directories, dir1 and dir2. dir1 contains classA.h and classB.h. dir2 contains classC.h and classD.h. dir1/classA.h has an #include for dir2/classC.h. dir2/classD.h has an #include for dir1/classB.h. As a result, there is a cyclical dependency between directories dir1 and dir2. However, there are no cyclical dependencies between any classes.
I understand why cyclic dependencies are bad between classes. It seems intuitive to me that directories should also not have cyclical dependencies--however I can't figure out why this would be bad.
Anyone have an explanation?

They are not bad, at least not the way you stated the problem. Directories are meant to organize files, but programmatically they have no meaning.
However, if your directories represent separate modules (i.e. a library file is generated for each directory), you can run into build-order and linking problems.
Because classA depends on classC, you need to build the second module in order to compile the first one. But the second module needs the first module to be built first, since classD depends on classB.
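To make the build-order argument concrete, here is a minimal sketch that treats each directory as a module and tries to find an order in which to build them. The DependencyGraph alias and the buildOrder function are hypothetical names made up for this illustration; real build systems do something similar but far more elaborate.

```cpp
#include <map>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// Maps each module (directory) to the modules it depends on.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Returns an order in which every module comes after its dependencies,
// or std::nullopt if the graph contains a cycle (Kahn's algorithm).
std::optional<std::vector<std::string>> buildOrder(const DependencyGraph& deps) {
    std::map<std::string, int> unbuiltDeps;                      // module -> dependencies not built yet
    std::map<std::string, std::vector<std::string>> dependents;  // module -> modules that need it
    for (const auto& [module, needs] : deps) {
        unbuiltDeps[module] += 0;  // make sure every module has an entry
        for (const auto& dep : needs) {
            unbuiltDeps[dep] += 0;
            ++unbuiltDeps[module];
            dependents[dep].push_back(module);
        }
    }

    // Modules without dependencies can be built right away.
    std::queue<std::string> ready;
    for (const auto& [module, count] : unbuiltDeps)
        if (count == 0) ready.push(module);

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string module = ready.front();
        ready.pop();
        order.push_back(module);
        for (const auto& dependent : dependents[module])
            if (--unbuiltDeps[dependent] == 0) ready.push(dependent);
    }

    // Any module that never became ready is part of a cycle.
    if (order.size() != unbuiltDeps.size()) return std::nullopt;
    return order;
}
```

With dir1 depending on dir2 and dir2 depending on dir1, no module ever becomes ready, so the function returns std::nullopt. Remove either dependency and it returns a valid order.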

As with classes, cyclic dependencies between directories can hurt maintainability and reuse.
Maintainability: when a "module" (in this case a directory) depends on another module, any change to that other module can affect this one.
Reuse: when reusing a module, you must also reuse the modules it depends on.
So with cyclic dependencies, all modules in the cycle are affected. This isn't a real problem with a small number of modules, but it gets worse as the number of modules grows.
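The reuse point can be sketched in a few lines: to reuse a module you have to take along everything it depends on, directly or indirectly. The names DependencyGraph and requiredForReuse are hypothetical and exist only for this illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each module (directory) to the modules it depends on.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Collects the module itself plus everything it depends on,
// directly or indirectly (iterative depth-first traversal).
std::set<std::string> requiredForReuse(const DependencyGraph& deps,
                                       const std::string& module) {
    std::set<std::string> required;
    std::vector<std::string> pending{module};
    while (!pending.empty()) {
        std::string current = pending.back();
        pending.pop_back();
        if (!required.insert(current).second) continue;  // already visited
        auto it = deps.find(current);
        if (it == deps.end()) continue;                  // no known dependencies
        for (const auto& dep : it->second) pending.push_back(dep);
    }
    return required;
}
```

If dir1 depends on dir2 only, dir2 can be reused on its own. Once dir2 also depends on dir1, asking for either one returns both.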

Related

Why create an include/ directory in C and C++ projects?

When I work on my personal C and C++ projects I usually put file.h and file.cpp in the same directory and then file.cpp can reference file.h with a #include "file.h" directive.
However, it is common to find libraries and other kinds of projects (like the Linux kernel and FreeRTOS) where all .h files are placed inside an include/ directory, while .cpp files remain in another directory. In those projects, .h files are also included with #include "file.h" instead of #include "include/file.h" as I was expecting.
I have some questions about all of this:
What are the advantages of this file structure organization?
Why are .h files inside include/ included with #include "file.h" instead of #include "include/file.h"? I know the real trick is inside some Makefile, but is it really better to do it that way instead of making clear (in code) that the file we want to include is actually in the include/ directory?
The main reason to do this is that compiled libraries need headers in order to be consumed by the eventual user. By convention, the contents of the include directory are the headers exposed for public consumption. The source directory may have headers for internal use, but those are not meant to be distributed with the compiled library.
So when using the library, you link to the binary and add the library's include directory to your build system's header paths. Similarly, if you install your compiled library to a centralized location, you can tell which files need to be copied to the central location (the compiled binaries and the include directory) and which files don't (the source directory and so forth).
It used to be that <header> style includes were of the implicit path type, that is, to be found via an includes environment variable or a build macro, while "header" style includes were of the explicit form, that is, located exactly relative to wherever the including source file is. While some build tool chains still allow for this distinction, they often default to a configuration that effectively nullifies it.
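As a rough illustration of the two lookup styles, here is a small simulation that follows the rule GCC documents: the quoted form looks in the including file's directory first and then falls back to the search path, while the angle form uses the search path only. Other compilers may differ, and FileSet, resolveAngle and resolveQuoted are hypothetical names operating on a pretend file system.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// A pretend file system: just the set of header paths that "exist".
using FileSet = std::set<std::string>;

// Returns the first match for name along searchPath, in order.
std::optional<std::string> findInSearchPath(const FileSet& files,
                                            const std::vector<std::string>& searchPath,
                                            const std::string& name) {
    for (const auto& dir : searchPath) {
        std::string candidate = dir + "/" + name;
        if (files.count(candidate)) return candidate;
    }
    return std::nullopt;
}

// #include <name>: only the configured search path is consulted.
std::optional<std::string> resolveAngle(const FileSet& files,
                                        const std::vector<std::string>& searchPath,
                                        const std::string& name) {
    return findInSearchPath(files, searchPath, name);
}

// #include "name": the including file's directory first, then the search path.
std::optional<std::string> resolveQuoted(const FileSet& files,
                                         const std::vector<std::string>& searchPath,
                                         const std::string& includingDir,
                                         const std::string& name) {
    std::string local = includingDir + "/" + name;
    if (files.count(local)) return local;
    return findInSearchPath(files, searchPath, name);
}
```

Given src/util.h and include/util.h with include on the search path, the quoted form from a file in src picks src/util.h and the angle form picks include/util.h. If two search directories both contain file.h, whichever directory is listed first wins, so reordering the search path changes the result.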
Your question is interesting because it brings up the question of which really is better, implicit or explicit? The implicit form is certainly easier because:
Convenient groupings of related headers in hierarchies of directories.
You only need to include a few directories in the includes path and need not be aware of every detail with regard to exact locations of files. You can change versions of libraries and their related headers without changing code.
DRY.
Flexible! Your build environment doesn't have to match mine, but we can often get nearly the exact same results.
Explicit on the other hand has:
Repeatable builds. Reordering the paths in an includes macro or environment variable doesn't change which header files are found during the build.
Portable builds. Just package everything from the root of the build and ship it off to another dev.
Proximity of information. You know exactly where the header is with #include "\X\Y\Z". In the implicit form, you may have to go searching along multiple paths and might even find multiple versions of the same file. How do you know which one is used in the build?
Builders have been arguing over these two approaches for decades, but a hybrid of the two mostly wins out, because of the effort required to maintain builds based purely on the explicit form and the obvious difficulty of familiarizing oneself with code of a purely implicit nature. We all generally understand that our tool chains put certain common libraries and headers in particular locations so that they can be shared across users and projects, so we expect to find the standard C/C++ headers in one place. But lacking a well-documented local convention, we don't initially know anything about the structure of an arbitrary project, so we expect the code in those projects to be explicit about the non-standard bits that are unique to them and implicit about the standard bits.
It is a good practice to always use the <header> form of include for all the standard headers and other libraries that are not project specific and to use the "header" form for everything else. Should you have an include directory in your project for your local includes? That depends to some extent on whether those headers will be shipped as interfaces to your libraries or merely consumed by your code, and also on your preferences. How large and complex is your project? If you have a mix of internal and external interfaces or lots of different components, you might want to group things into separate directories.
Keep in mind that the directory structure your finished product unpacks to need not look anything like the directory structure under which you develop and build that product. If you have only a few .c/.cpp files and headers, it's OK to put them all in one directory, but eventually you're going to work on something non-trivial and will have to think through the consequences of your build environment choices, and hopefully document them for others to understand.
1. .hpp and .cpp files don't necessarily have a one-to-one relationship. Several .cpp files may implement the same .hpp under different conditions (e.g. different environments). For example, imagine a multi-platform library with a class that returns the version of the app, whose header looks like this:
Utilities.h
#include <string>
class Utilities {
public:
    static std::string getAppVersion();
};
main.cpp
#include <iostream>
#include "Utilities.h"
int main() {
    std::cout << Utilities::getAppVersion() << std::endl;
    return 0;
}
There may be one .cpp for each platform, and the .cpp files may be placed in different locations so that the right one is easily selected for the corresponding platform, e.g.:
.cpp for iOS (path:DemoProject/ios/Utilities.cpp):
#include "Utilities.h"
std::string Utilities::getAppVersion(){
//some objective C code
}
.cpp for Android (path:DemoProject/android/Utilities.cpp):
#include "Utilities.h"
std::string Utilities::getAppVersion(){
//some jni code
}
Of course, the two .cpp files would normally not be compiled into the same build.
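For a single-file picture of "one header, several implementations", here is a sketch in which preprocessor macros stand in for the build system choosing one of the Utilities.cpp files. The macros DEMO_PLATFORM_IOS and DEMO_PLATFORM_ANDROID and the returned placeholder strings are assumptions made up for this example; in a real project the selection would usually happen in the build configuration rather than in the source.

```cpp
#include <string>

// --- Utilities.h: shared by every platform ---
class Utilities {
public:
    static std::string getAppVersion();
};

// --- Exactly one of the following definitions gets compiled ---
#if defined(DEMO_PLATFORM_IOS)
// Stand-in for DemoProject/ios/Utilities.cpp
std::string Utilities::getAppVersion() { return "ios-placeholder"; }
#elif defined(DEMO_PLATFORM_ANDROID)
// Stand-in for DemoProject/android/Utilities.cpp
std::string Utilities::getAppVersion() { return "android-placeholder"; }
#else
// Fallback so that the sketch also builds on a desktop machine
std::string Utilities::getAppVersion() { return "desktop-placeholder"; }
#endif
```

Callers include only Utilities.h and call Utilities::getAppVersion(); they never need to know which definition was compiled in.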
2.
#include "file.h"
instead of
#include "include/file.h"
allows you to keep the source code unchanged when your headers are not placed in the "include" folder anymore.

Do "namespaced" include paths in CMake and C++ projects have benefits when integrating projects together?

While orienting myself in an open source C++ project, I found this line in the root CMakeLists.txt file:
include_directories(${PROJECT_SOURCE_DIR}/../include)
And then in one of the source files there is this line:
#include "someFolder/someFile.h"
someFolder is located in the include folder.
I have seen a different approach in another project,
in which the CMakeLists.txt has something like this:
include_directories(${PROJECT_SOURCE_DIR}/../include/someFolder)
then in the source file:
#include "someFile.h"
The first approach typically "namespaces" the include path by the name of the project the header belongs to. Are there common benefits to this when integrating multiple projects together? If so, what are those common benefits?
I prefer subdirectories for include files.
The main reason for this is to avoid file name conflicts. If dependency A has a file called someFile.h, and dependency B also has a file called someFile.h, you have a problem, because the compiler doesn't know which one to include.
So for the same reason you should use namespaces, you should also use subdirectories for include files when possible.
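Here is a small sketch of the collision argument. It takes, for each include directory on the search path, the header names that directory makes visible, and reports every name offered by more than one directory. IncludeDirs and findCollisions are hypothetical names, and the directory layout used in the usage note is made up.

```cpp
#include <map>
#include <string>
#include <vector>

// For each include directory on the search path: the header names it
// makes visible to #include, relative to that directory.
using IncludeDirs = std::map<std::string, std::vector<std::string>>;

// Returns every header name offered by more than one include directory,
// mapped to the directories that offer it.
std::map<std::string, std::vector<std::string>> findCollisions(const IncludeDirs& dirs) {
    std::map<std::string, std::vector<std::string>> providers;
    for (const auto& [dir, headers] : dirs)
        for (const auto& header : headers)
            providers[header].push_back(dir);

    std::map<std::string, std::vector<std::string>> collisions;
    for (const auto& [header, from] : providers)
        if (from.size() > 1) collisions[header] = from;
    return collisions;
}
```

If depA/include/depA and depB/include/depB are both on the search path, each exposes a plain someFile.h and the function reports a collision. If depA/include and depB/include are used instead, the visible names are depA/someFile.h and depB/someFile.h and nothing collides.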
Well, this is very opinion based, in my opinion...
I prefer the former approach, especially for larger libraries. It exhibits the logical structure of the library as intended by the authors.

Structuring C++ Application (directory and folders)

I'm coming from web development and I need to ask C++ programmers how do they manage their directory for a model-based project?
I have structured my project in Visual Studio C++ Solution Manager like this:
-> Header Files
--> Models
DatabaseEngine.interface.h
-> Resources
-> Source Files
--> Models
DatabaseEngine.cpp
--> Application
Core.cpp
Bootstrap.cpp
--> FrontController
---
I have made an exact duplicate of Model's directory under the Headers directory, and appended them with ".interface" name, since they are interfaces and the real implementation of them lies in the mirror path under the Sources.
I also have custom types such as DBConnection, and I don't know where to put them. Should I put them in a file named CustomTypes.cpp, or should I relate them to their associated parent model/class/object?
My concern is the convention and standards.
There is no standard; C++ is a very open-minded world, you will see ; )
It is all about doing what works best for you, but taking advice from people who have already experimented usually cannot hurt.
Personally, I try to follow this convention
/ProjectName
/src
/libs <- Libraries go here
/Models <- Assuming you want to make a library out of your models
User.h
User.cpp
... <- Putting header and implementations together is not a problem,
they should be edited in parallel oftentimes
/Utilities <- Should your library grow, you can make it more modular
by creating subdirectories
(that could contain subdirectories, etc.)
DBConnection.h
DBConnection.cpp
/apps <- define your applications here.
They probably rely on classes and functions defined in one or several of your libraries defined above.
/ApplicationA
Core.h
Core.cpp
Bootstrap.h
Bootstrap.cpp
/resources
/doc
# Below are 'environment specific' folders.
/vs <- Visual studio project files
/xcode <- Xcode project files
Remarks
Headers and implementations
Header files (.h, .hpp, or no extension) do indeed define the interface that is implemented in the implementation file (.cpp). Nonetheless, it is very common to give both the same basename and to distinguish them only by extension (or its absence). Adding an extra .interface part probably does not buy you much, and it could confuse your IDE (or other tools), which is otherwise able to relate a header file to its implementation file.
For the same reason (not confusing some tools), it can be easier to put both files in the same folder: they are very closely related anyway.
Additionally, if later on you need to change your folders structures (eg. to modularize), having only one place to make subfolders (instead of two in your approach) will also make life a bit easier.
Custom types
C++ offers classes for the programmer to define custom types. It is very common to define custom types in their own pair of header/implementation file. In your case, DBConnection.h would define a DBConnection class, whose (non-inline) methods would be implemented in DBConnection.cpp.
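A minimal sketch of such a pair, shown in one listing for brevity. Everything about this DBConnection (the connection string, open, close, isOpen) is a hypothetical interface invented for the example, not what the asker's class actually looks like.

```cpp
#include <string>
#include <utility>

// --- DBConnection.h: the interface ---
class DBConnection {
public:
    explicit DBConnection(std::string connectionString);
    void open();
    void close();
    bool isOpen() const { return open_; }  // short enough to stay inline
    const std::string& connectionString() const;

private:
    std::string connectionString_;
    bool open_ = false;
};

// --- DBConnection.cpp: the non-inline methods ---
DBConnection::DBConnection(std::string connectionString)
    : connectionString_(std::move(connectionString)) {}

void DBConnection::open() { open_ = true; }    // a real class would connect here
void DBConnection::close() { open_ = false; }  // ... and disconnect here

const std::string& DBConnection::connectionString() const {
    return connectionString_;
}
```

In a real project the first half would live in DBConnection.h and the second half in DBConnection.cpp, which would start with #include "DBConnection.h".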
Personally, I would not be afraid to create one pair of files per type, which makes it easier for future-you and other programmers to find the file defining a type. You can manage the growing number of files by making subfolders, which will force you to modularize your design.
Of course, sometimes you will need to define a very short class, tightly coupled to another class. It is up to you to include both classes in a common pair of files if you feel the link between them is strong enough.
Extensibility
It may not be a concern to all projects, but this directory structure is extensible in terms of environments and build management.
Keeping project files in separate folders at the top level, and defining out-of-source builds, makes it possible to create project files for other IDEs further down the line.
This hierarchy is also easily amenable to CMake build management, should you go that way. A CMakeLists.txt file is placed at the top level (under ProjectName/); this file invokes add_subdirectory(src), which in turn processes the CMakeLists.txt in ProjectName/src/, etc.

Header files dependencies between C++ modules

At my workplace we have a big C++ code base, and I think there's a problem with how header files are used.
There are many Visual Studio projects, but the problem is conceptual and is not related to VS. Each project is a module that performs a particular function. Each project/module is compiled to a library or a binary. Each project has a directory containing all source files - *.cpp and *.h. Some header files are the API of the module (I mean the subset of header files declaring the API of the created library), and some are internal to it.
Now to the problem - when module A needs to work with module B, A adds B's source directory to its include search path. Therefore all of module B's internal headers are visible to A at compilation time.
As a side effect, developers are not forced to think about what the exact API of each module is, which I consider a bad habit anyway.
I am considering how it should have been done in the first place. I thought about creating in each project a dedicated directory containing interface header files only. A client module wishing to use the module is permitted to include the interface directory only.
Is this approach OK? How is the problem solved at your place?
UPD At my previous workplace, development was done on Linux with g++/gmake, and we indeed used to install API header files to a common directory, as some of the answers propose. Now we have a Windows (Visual Studio)/Linux (g++) project using cmake to generate project files. How do I force the prebuild install of API header files in Visual Studio?
Thanks
Dmitry
It sounds like you're on the right track. Many third party libraries do this same sort of thing. For example:
3rdParty/myLib/src/ -contains the headers and source files needed to compile the library
3rdParty/myLib/include/myLib/ - contains the headers needed for external applications to include
Some people/projects just put the headers to be included by external apps in /3rdParty/myLib/include, but adding the additional myLib directory can help to avoid name collisions.
Assuming you're using the structure: 3rdParty/myLib/include/myLib/
In Makefile of external app:
---------------
INCLUDE =-I$(3RD_PARTY_PATH)/myLib/include
INCLUDE+=-I$(3RD_PARTY_PATH)/myLib2/include
...
...
In Source/Headers of the external app
#include "myLib/base.h"
#include "myLib/object.h"
#include "myLib2/base.h"
Wouldn't it be more intuitive to put the interface headers in the root of the project, and make a subfolder (call it 'internal' or 'helper' or something like that) for the non-API headers?
Where I work we have a delivery folder structure created at build time. Header files that define libraries are copied out to an include folder. We use custom build scripts that let the developer denote which header files should be exported.
Our build is then rooted at a subst'ed drive, which allows us to use absolute paths for include directories.
We also have a network based reference build that allows us to use a mapped drive for include and lib references.
UPDATE: Our reference build is a network share on our build server. We use a reference build script that sets up the build environment and maps (using net use) the named share on the build server (i.e. \\BLD_SRV\REFERENCE_BUILD_SHARE). Then during a weekly build (or manually) we set the share (using net share) to point to the new build.
Our projects then use a list of absolute paths for include and lib references.
For example:
subst'ed local build drive j:\
mapped drive to reference build: p:\
path to headers: root:\build\headers
path to libs: root:\build\release\lib
include path in project settings j:\build\headers; p:\build\headers
lib path in project settings j:\build\release\lib;p:\build\release\lib
This will take your local changes first; then, if you have not made any local changes (or at least haven't built them), it will use the headers and libs from the last build on the build server.
I've seen problems like this addressed by having a set of headers in module B that get copied over to the release directory along with the lib as part of the build process. Module A then only sees those headers and never has access to the internals of B. Usually I've only seen this in a large project that was released publicly.
For internal projects it just doesn't happen. What usually happens is that when they are small it doesn't matter, and when they grow up it's so messy to separate things out that no one wants to do it.
Typically I just see an include directory that all the interface headers get piled into. It certainly makes it easy to include headers. People still have to think about which modules they're taking dependencies on when they specify the modules for the linker.
That said, I kinda like your approach better. You could even avoid adding these directories to the include path, so that people can tell what modules a source file depends on just by the relative paths in the #includes at the top.
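Depending on how your project is laid out, this can be problematic when including them from headers, though. How a relative path in a quoted #include is resolved is implementation-defined: GCC and MSVC start from the directory of the file containing the #include, but if your compiler resolves it from the .cpp file being compiled instead, the .h file doesn't necessarily know where its .cpp files are.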
If your projects have a flat hierarchy, however, this will work. Say you have
base\foo\foo.cpp
base\bar\bar.cpp
base\baz\baz.cpp
base\baz\inc\baz.h
Now any header file can include
#include "..\baz\inc\baz.h"
and it will work since all the cpp files are one level deeper than base.
In a group I used to work in, everything public was kept in a module-specific folder, while private stuff (private headers, cpp files etc.) was kept in an _imp folder within it:
base\foo\foo.h
base\foo\_imp\foo_private.h
base\foo\_imp\foo.cpp
This way you could just look around within your project's folder structure and get the header you want. You could grep for #include directives containing _imp and have a good look at them. You could also grab the whole folder, copy it somewhere, and delete all _imp subfolders, knowing you'd have everything ready to release an API.
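The "grep for #include directives containing _imp" check could be sketched like this. findPrivateIncludes is a hypothetical helper for this example; it only handles directives written as #include with no space after the #, and it only looks for an _imp path component.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the #include lines of a source text that reach into an
// "_imp" directory (either slash style is accepted).
std::vector<std::string> findPrivateIncludes(const std::string& source) {
    std::vector<std::string> hits;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        // Skip leading whitespace, then require the line to start with #include.
        std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;
        if (line.compare(start, 8, "#include") != 0) continue;
        if (line.find("_imp/", start) != std::string::npos ||
            line.find("_imp\\", start) != std::string::npos) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

Feeding it the text of a file that contains both #include "foo/foo.h" and #include "foo/_imp/foo_private.h" returns only the second line, which is the one worth a closer look.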
Within projects, headers were usually included as
#include "foo/foo.h"
However, if the project had to use some API, then the API headers would be copied/installed by the API's build to wherever they were supposed to go on that platform and then be included as system headers:
#include <foo/foo.h>

C++ project source code layout

One of the popular ways to organize a project directory is more or less like this:
MyLib
  +--mylib_class_a.h
     mylib_class_a.cpp
     mylib_library_private_helpers.h
     mylib_library_private_helpers.cpp
MyApp
  +--other_class.h
     other_class.cpp
     app.cpp
app.cpp:
#include "other_class.h"
#include <mylib_class_a.h> // using library MyLib
All .h and .cpp files for the same library are in the same directory. To avoid name collisions, file names are often prefixed with the company name and/or library name. MyLib will be in MyApp's header search path, etc. I'm not a fan of prefixing filenames, but I like the idea of looking at the #include and knowing exactly where that header file belongs. I don't hate this approach of organizing files, but I think there should be a better way.
Since I'm starting a new project, I want to solicit some directory organization ideas. Currently I like this directory structure:
ProjA
  +--include
       +--ProjA
            +--mylib
                 +--class_a.h
            +--app
                 +--other_class.h
  +--src
       +--mylib
            +--class_a.cpp
               library_private_helpers.h
               library_private_helpers.cpp
       +--app
            +--other_class.cpp
               app.cpp
               util.h
app.cpp:
#include "util.h" // private util.h file
#include <ProjA/app/other_class.h> // public header file
#include <ProjA/mylib/class_a.h> // using class_a.h of mylib
#include <other3rdptylib/class_a.h> // class_a.h of other3rdptylib, no name collision
#include <class_a.h> // not ProjA/mylib/class_a.h
#include <ProjA/mylib/library_private_helpers.h> // error can't find .h
.cpp files and private (only visible to immediate library) .h files are stored under the src directory (src is sometimes called lib). Public header files are organized into a project/lib directory structure and included via <ProjectName/LibraryName/headerName.h>. File names are not prefixed with anything. If I ever needed to package up MyLib to be used by other teams, I could simply change my makefile to copy the appropriate binary files and the whole include/ProjA directory.
Once files are checked into source control and people start working on them it will be hard to change directory structure. It is better to get it right at the get-go.
Anyone with experience organizing source code like this? Anything you don't like about it? If you have a better way to do it, I would very much like to hear about it.
Well, it all depends on how big these projects are. If you've only got a few files, then whack them all in one folder.
Too many folders when you haven't got many files to manage is in my opinion overkill. It gets annoying digging in and out of folders when you've only got a few files in them.
Also, it depends on who's using this stuff. If you're writing a library and it's going to be used by other programmers, then it's good to organize the headers they want to use into an include folder. If you're creating a number of libraries and publishing them all, then your structure might work. But if they're independent libraries, the development isn't all done together, and they get versioned and released at different times, you'd be better off keeping all files for one project within one folder.
In fact, I would say keep everything in one folder until you get to a point where you find it's unmanageable, then reorganize into a clever scheme of dividing the source up into folders like you've done. You'll probably know how it needs to be organized from the problems you run into.
KISS is usually the solution in programming -> keep everything as simple as possible.
Why not do something like the first, only use the directory that MyLib resides in as a part of the include directive, which reduces the silly prefixing:
#include <MyLib/ClassA.h>
That tells you where they are from. As for the second choice, I personally get really annoyed when I have a header or source file open, and have to navigate around through the directory structure to find the other and open it. With your second example, if you had src/mylib/class_a.cpp open, and wanted to edit the header, in many editors you'd have to go back two levels, then into include/ProjA before finding the header. And how are we to know that the header is in the ProjA subdirectory without some other external clue? Plus, it's too easy for one file or the other to get moved into a different place that "better" represents how it is used, without the alternate file being moved. It just gives me headaches when I encounter it at my job (and we do have some parts of our codebase where people did every potential problem I've just mentioned).
I have tried both methods. Personally, I like the first better. I understand the urge to put everything in more specific directories, but it causes a lot of over-complication.
I usually use this rule: applications and in-house libraries use the first method. Public open source libraries use the second method. When you are releasing the code, it helps a lot if the include files are in a separate directory.