I'm trying to create a simple C++ incremental-build tool with a dependency resolver, and I've been confused by one problem in the C++ build process.
Imagine we have a library consisting of several files:
// h1.h
void H1();
// s1.cpp
#include "h1.h"
#include "h2.h"
void H1(){ H2(); }
// h2.h
void H2();
// s2.cpp
#include "h2.h"
#include "h3.h"
void H2(){ /*some implementation*/ }
void H3(){ /*some implementation*/ }
// h3.h
void H3();
When client code includes h1.h
// app1.cpp
#include "h1.h"
int main()
{
H1();
return 0;
}
there is an implicit dependency on the s2.cpp implementation:
our_src -> h1 -> s1 -> h2 -> s2. So we need to link with two object files:
g++ -o app1 app1.o s1.o s2.o
In contrast, when h3.h is included
// app2.cpp
#include "h3.h"
int main()
{
H3();
return 0;
}
there is only one source dependency:
our_src -> h3 -> s2
So when we include h3.h we only need s2.cpp compiled (even though s1.cpp also includes h2.h):
g++ -o app2 app2.o s2.o
This is a very simple example of the problem; in real projects we may have several hundred files, and chains of inefficient includes may involve many more of them.
So my question is: is there a way, or a tool, to find out which header inclusions could be omitted when we check dependencies (without parsing the .cpp files)?
I would appreciate any response.
In the case you stated, to see the implicit dependence on s2.cpp you need to parse the implementation module s1.cpp, because only there will you find that the s1 module uses s2. So to the question "can I solve this problem without parsing .cpp files?" the answer is clearly no.
By the way, as far as the language is concerned, there is no difference between what you can put in a header file and what you can put in an implementation file. The #include directive doesn't work at the C++ level; it's pure textual substitution with no understanding of the language.
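To see just how literal that substitution is: an #include can even appear inside a function body, and the preprocessor simply pastes the file's text at that point. A contrived but valid sketch (file names made up):
// body.inc -- any text at all; the preprocessor pastes it in verbatim
x += 1;
x *= 2;
// main.cpp
int main()
{
    int x = 1;
#include "body.inc"  // expands to the two statements above
    return x;        // returns 4
}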
Moreover, even parsing "just" C++ declarations is a true nightmare (the difficult part of C++ syntax is the declarations, not the statements and expressions).
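For a taste of why, consider the classic "most vexing parse", where a line that looks like an object definition is in fact a function declaration (the names here are invented for the example):
struct Timer {};
struct Widget
{
    Widget(Timer t);
};
Widget w(Timer());  // not an object: declares a function w taking a function pointer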
Maybe you can use the output of gccxml, which parses C++ files and returns an XML data structure that can be inspected.
This is not an easy problem. Just a couple of many things that make this difficult:
What if one header file is implemented in N>1 source files? For example, suppose class Foo is defined in foo.h but implemented in foo_ctor_dtor.cpp, foo_this_function.cpp, and foo_that_function.cpp.
What if the same capability is implemented in multiple source files? For example, suppose Foo::bar() has implementations in foo_bar_linux.cpp, foo_bar_osx.cpp, and foo_bar_sunos.cpp. The implementation to be used depends on the target platform.
One easy solution is to build a shared or dynamic library and link against that library. Let the toolchain resolve those dependencies. Problem #1 disappears entirely, and problem #2 does too if you have a smart enough makefile.
If you insist on bucking this easy solution, you are going to need to resolve those dependencies yourself. You can eliminate the above problems (not an exhaustive list) with a project rule of one header file == one source file. I have seen such a rule, but not nearly as often as I've seen a project rule that says one function == one source file.
You may have a look at how I implemented Wand. It uses a directive to add dependencies for individual source files. The documentation is not yet complete, but there are examples of Wand directives in the source code of Gabi.
Examples
Thread class include file
Thread.h needs thread.o at link time
#ifdef __WAND__
dependency[thread.o]
target[name[thread.h] type[include]]
#endif
Thread class implementation on windows (thread-win32.cpp)
This file should only be compiled when Windows is the target platform
#ifdef __WAND__
target[name[thread.o] type[object] platform[;Windows]]
#endif
Thread class implementation on GNU/Linux (thread-linux.cpp)
This file should only be compiled when GNU/Linux is the target platform. On GNU/Linux, the external library pthread is needed when linking.
#ifdef __WAND__
target
[
name[thread.o] type[object] platform[;GNU/Linux]
dependency[pthread;external]
]
#endif
Pros and cons
Pros
Wand can be extended to work for other programming languages
Wand will save all the data needed to successfully link a new program just by running the command wand
The project file does not need to mention any dependencies since these are stored in the source files
Cons
Wand requires extra directives in each source file
The tool is not yet widely used by library writers
Related
I am facing the following problem:
I have a project which contains math classes (.h and .cpp files).
I want to use these classes in a different project (not in the same solution), but I can't.
In the new project I add a path to the include files of my math project. When I try to #include them it works and I can see my classes, but upon trying to use them I get "Unresolved external". I haven't created a .dll or a .lib, so I really don't know what's causing this.
If you have any suggestions, I'll appreciate them.
Thank you very much in advance.
When I try to #include them it works and I can see my classes but upon trying to use them I get "Unresolved external". I haven't created a .dll or a .lib so I really don't know what's causing this.
That you have not created a library is precisely the reason why you get the error. The compilation units in your new project ("the *.cpp files") include the headers for your classes and make use of the class definitions, but the definitions of the members are missing.
For example, let's say you have a file called "c.h" in your old project:
#ifndef C_H
#define C_H
class C
{
public:
C();
void f();
};
#endif
Some *.cpp file in your new project includes the header and uses the class:
#include "somepath/oldproject/c.h"
void someFunction()
{
C c;
c.f();
}
This compiles fine, but it will cause linker errors, because the definitions of C::C and C::f will be missing.
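To make the missing pieces concrete: the definitions the linker is looking for live in the old project's c.cpp, which is not part of your new project's build. Something like:
// somepath/oldproject/c.cpp
#include "c.h"
C::C()
{
    // constructor body
}
void C::f()
{
    // whatever f does
}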
Now, the clean solution to this is certainly not to add somepath/oldproject/c.cpp from your old project to your new project, although that would fix the linker error, but to employ a library-based approach. Turn your math classes into a library project, let's call it "Math Utils", which produces a *.lib file (as you seem to be on Windows), for example math-utils.lib. Add the library's include and release paths to your global include and library paths. Then add the math-utils.lib file to the new project's linker dependencies.
And change the code in your new project to:
#include <math-utils/c.h>
void someFunction()
{
C c;
c.f();
}
Do the same thing in the old project! You will end up with three different projects. Two application projects and one library project, the former two depending on the latter.
Creating your own libraries for the first time can be a bit intimidating, but the benefits are worth the trouble.
See also The Linker Is not a Magical Program.
I'm developing a collection of C++ classes and am struggling with how to share the code in a way that maintains organization without compromising ease of compilation for a user of the collection.
Options that I have seen include:
Distribute compiled library file
Put the source in the header file (with implicit inline as discussed in this answer)
Use symbolic links to allow the compiler to find the files.
I'm currently using the third option: for each class I want to include, I symbolically link the class's header and source files (e.g. ln -s <path_to_class_folder>/myclass.cpp). This works well except that I can't move the project folder (it breaks all the symlinks) and I have to have all those symlinked files hanging around.
I like the second option (it has the appearance of Java), but I'm worried about code size bloat if everything is declared inline.
A user of the collection will create a project folder somewhere, and somehow include the collection into their compilation process.
I'd like a few things to be possible:
Easy compilation (something like gcc *.cpp from the project folder)
Easy distribution of library in uncompiled form.
Library organization by module.
Compiled code size is not bloated.
I'm not worried about documentation (Doxygen takes care of that) or compile time: the overall modules are small and even the largest projects on the slowest machines won't take more than a few seconds to compile.
I'm using the GCC compiler, if it makes any difference.
A library is the best option (in my opinion) of the three you raised. Then provide the header file(s) in the include path and the library in the linker path.
Since you also want to distribute the library in source code form, I would be inclined to provide a compressed archive (gzip, 7-zip, tarball, or other preferred format) in a central repository.
If I understand correctly, you do not want users to have to include the .cpp files in their build; instead you want them to use either (i) the headers directly or (ii) a compiled form of the lib.
Your requirements are a bit unusual, but they can be achieved. It seems to me that you could organize your code in the following manner. First, have a global define that dictates whether or not you are compiling the library:
// global.h
// ...
#define LIB_SOURCE
// ...
Then in every header file, you check whether that define is set: if the library is distributed as a static/shared lib, the definitions are not included; otherwise, the .cpp file is included from the header file.
// A.h
#ifndef A_H
#define A_H
#include "global.h"
#ifdef LIB_SOURCE
#include "A.cpp"
#endif
// ...
#endif
where 'A.cpp' would contain the actual implementation.
Again, this is a very strange way of doing things and I would actually advise against such a practice. A better way (which requires more work) is to always distribute a shared library. But to keep things independent of the compiler, write a C layer around it. This way, you have a portable, maintainable library.
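As a rough sketch of such a C layer (all names here are invented for illustration):
// mathlib_c.h -- the C facade: plain functions and an opaque handle
#ifndef MATHLIB_C_H
#define MATHLIB_C_H
#ifdef __cplusplus
extern "C" {
#endif
typedef struct mathlib_vec mathlib_vec;  /* opaque to callers */
mathlib_vec* mathlib_vec_create(void);
void mathlib_vec_push(mathlib_vec* v, double x);
double mathlib_vec_sum(const mathlib_vec* v);
void mathlib_vec_destroy(mathlib_vec* v);
#ifdef __cplusplus
}
#endif
#endif
// mathlib_c.cpp -- implemented in C++, exported through the C ABI
#include "mathlib_c.h"
#include <numeric>
#include <vector>
struct mathlib_vec { std::vector<double> data; };
extern "C" {
mathlib_vec* mathlib_vec_create(void) { return new mathlib_vec; }
void mathlib_vec_push(mathlib_vec* v, double x) { v->data.push_back(x); }
double mathlib_vec_sum(const mathlib_vec* v)
{
    return std::accumulate(v->data.begin(), v->data.end(), 0.0);
}
void mathlib_vec_destroy(mathlib_vec* v) { delete v; }
}
Since clients only ever see the opaque handle and plain C functions, the C++ ABI never leaks across the library boundary.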
As for some of the other requirements:
Keep the build process simple by providing a Makefile
If you worry about the code size of the compiled library, look into gcc's optimization options (-Os). If you worry about the code size of the library when distributed in source form via the headers, this is trickier: since the (inlined) code will actually be in the headers, the code will obviously grow with each inclusion in a .cpp file by the user.
I ended up using inline headers for all of the code. You can see the library here:
https://github.com/libpropeller/libpropeller/tree/master/libpropeller
The library is structured as:
library folder
    class A
        classA.h
        classA.test.h
    class B
        classB.h
        classB.test.h
    class C
    ...
With this structure I can distribute the library as source, and all the user has to do is add -I/path/to/library to their makefile and #include "library/classA/classA.h" in their source files.
And, as it turns out, having inline headers actually reduces the code size. I've done a full analysis of this, and inline code in the headers allows the compiler to make the final binary roughly 5% smaller.
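For reference, a header in this style might look like the following (a hypothetical classA.h, not one of the actual libpropeller classes):
// library/classA/classA.h -- header-only: declaration and definition together
#ifndef LIBPROPELLER_CLASSA_H_
#define LIBPROPELLER_CLASSA_H_
class A {
public:
    // Member functions defined inside the class body are implicitly inline,
    // so many .cpp files can include this header without violating the ODR.
    int Twice(const int x) { return 2 * x; }
};
#endif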
We are refactoring our code base, and trying to limit the direct dependencies between different components. Our source tree has several top level directories: src/a, src/b and src/c.
We want to enforce a set of restrictions:
Files in a cannot depend on files in b or c
Files in b can depend on files in a but not c
Files in c can depend directly on files in b but not a
Enforcing the first one is simple. I have an implicit rule like this:
build/a/%.o : src/a/%.cpp
$(CXX) -c -I src/a $(OTHER_FLAGS) -o $@ $<
If a file under a tries to include a header file from b or c, the build fails as the header is not found.
The second restriction is handled by a similar rule, which specifies src/a and src/b as include directories. The problem arises when building c. The following is allowed:
src/c/C.cpp
#include "b.h"
void C() { ... }
src/b/b.h
#include "a.h"
class B { ... };
src/a/a.h
class A { ... };
Here, a file from c includes a file from b (allowed), which in turn includes a file from a (also allowed). We want to prevent code like this:
src/c/C_bad.cpp
// Direct inclusion of a
#include "a.h"
src/c/c_bad.h
// Direct inclusion of a
#include "a.h"
For the allowed case to compile, the compile command for files in src/c must include -Isrc/a, but that allows the second case to compile as well.
I suspect that the answer to my problem is writing a script which looks at the dependencies generated from the compiler, finds potentially illegal dependencies and then looks at the source files to determine if this is a direct dependency. Is there a reasonable way to do this combining the compiler and/or makefile constructs?
If it matters, we are using GNU Make 3.81 and g++ 4.5.3, but would like to be portable if possible.
Update
We are looking for something where it takes effort to violate the rules, not one where it takes effort to follow the rules. (Past experience has shown that the latter is unlikely to work.) While there are some good ideas in the other answers, I'm accepting the one that says to write a script, since that is the approach that takes the most effort to work around.
Thanks to everyone for your answers.
Considering that you're applying this to an existing code base, I would opt for the "validation script" approach.
So instead of modifying the build process and severing dependencies one at a time as the build fails, you get presented with a list of files that are non-compliant. You can then refactor your codebase with the "big picture" in mind, and any changes you make will be built using the same Makefiles as before, thus simplifying testing and debugging.
Once refactored, the analysis script can continue to be used as a compliance checker to validate future updates.
A possible starting point for such an analysis would be to use makedepend or cpp -MM. For example, using the cpp/h files you've listed in the question:
[me#home]$ find .
.
./b
./b/b.h
./a
./a/a.h
./c
./c/C_bad.cpp
./c/C.cpp
./c/c_bad.h
[me#home]$ cpp -MM -Ia -Ib -Ic */*.cpp
C_bad.o: c/C_bad.cpp a/a.h
C.o: c/C.cpp b/b.h a/a.h
[me#home]$ # This also works for header files
[me#home]$ cpp -Ia -Ib -Ic -MM c/c_bad.h
c_bad.o: c/c_bad.h a/a.h
It should be reasonably straightforward to parse that output to determine the dependencies of each cpp file and flag those that are non-compliant.
The drawback to this approach is that it cannot differentiate between direct and indirect dependencies, so if that matters you may need to include an extra step to inspect the source and pick out direct dependencies.
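For that extra step, a small checker that scans a source file for direct includes of forbidden headers could be as simple as this sketch (the command-line interface is made up; a real tool would read the rules from configuration):
// check_includes.cpp -- flag direct #includes of forbidden headers
// usage: check_includes <file> <forbidden-header>...
#include <fstream>
#include <iostream>
#include <regex>
#include <string>
int main(int argc, char* argv[])
{
    if (argc < 3) {
        std::cerr << "usage: check_includes <file> <forbidden-header>...\n";
        return 2;
    }
    std::ifstream in(argv[1]);
    std::string line;
    // Matches #include "header" as well as #include <header>.
    std::regex inc(R"(^\s*#\s*include\s*[<"]([^">]+)[">])");
    int violations = 0;
    for (int lineno = 1; std::getline(in, line); ++lineno) {
        std::smatch m;
        if (!std::regex_search(line, m, inc))
            continue;
        for (int i = 2; i < argc; ++i) {
            if (m[1] == argv[i]) {
                std::cout << argv[1] << ":" << lineno
                          << ": direct include of " << argv[i] << "\n";
                ++violations;
            }
        }
    }
    return violations == 0 ? 0 : 1;
}
Running it over every file under src/c with a.h as the forbidden header would then mark the non-compliant files via a non-zero exit status.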
You can make the -I options target-specific:
build/b/%.o: CPPFLAGS += -Isrc/a
build/c/%.o: CPPFLAGS += -Isrc/b
This is specific to GNU Make, though, so it's not portable.
Yes. But it takes some manual effort and discipline.
When building C you can depend on headers in src/b/*.h.
Inside project B, any header files in the main directory should be self-contained and have no dependencies on other projects. You also need a subdirectory, src/b/detail. Header files in there are allowed to include src/a/*.h and src/b/*.h, but they are a private implementation detail, available only to the source files of project B.
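A sketch of that layout (contents invented for illustration):
// src/b/b.h -- public interface: self-contained, no includes from project a
#ifndef B_H
#define B_H
class B {
public:
    void Run();
};
#endif
// src/b/detail/b_impl.h -- private detail header: may include a's headers,
// but only b's own .cpp files are allowed to include it
#ifndef B_DETAIL_B_IMPL_H
#define B_DETAIL_B_IMPL_H
#include "a.h"  // fine here: this header never leaks into project c
#endif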
The easiest way is to change your include path to -Isrc for everything. Include statements then have the complete relative path
#include <a/a.h>
for example. This makes it much easier to check the code automatically (perhaps in a commit hook rather than the makefile).
Alternatively, you could do something nasty with macros in the A and B headers:
// src/a/a.h
#ifndef SRC_A_H
#define SRC_A_H
#ifndef ALLOW_A
#error "you're not allowed to include A headers here"
#endif
//...
and
// src/b/b.h
#ifndef SRC_B_H
#define SRC_B_H
#ifdef ALLOW_A_INDIRECT
#define ALLOW_A
#endif
#include <a/a.h>
//...
#ifdef ALLOW_A_INDIRECT
#undef ALLOW_A
#endif
#endif // include guard
Now these make rules will allow A and B to build ok:
build/a/%.o: CPPFLAGS += -DALLOW_A
build/b/%.o: CPPFLAGS += -DALLOW_A
and this will allow C access only via B (and the macros in B's headers)
build/c/%.o: CPPFLAGS += -DALLOW_A_INDIRECT
Note this requires some discipline, especially in B's headers, but I suppose if it sits alongside existing include guards, it's... OK, it's actually still pretty nasty.
I'm trying to access functions from another file for use inside my class definition:
// math.cpp
int Sum(int a, int b){
return (a + b);
}
// my_class.cpp
#include <math.cpp>
#include <my_class.h>
int ComputeSomething() {
...
return ::Sum(num1, num2);
}
Despite my best efforts, I can't get the compiler to say anything other than ::Sum has not been declared or Sum was not declared in this scope.
I'm trying to wrap my head around code organization in C++, any help appreciated.
It might be worth noting that I'm programming for Arduino.
To be able to access functions from a user-defined library, it is best to divide that library into a .h (or .hpp) file and a .cpp file. I understand you have actually done this, but then tried various options, among them the inclusion of the .cpp file, for the sake of finding a solution.
Still, to ensure things work as expected, the declarations of functions and classes should go into the .h file, best protected by something like
#ifndef MY_H_FILE
#define MY_H_FILE
/* ..Declarations.. */
#endif
Then to include the .h file (I'll assume it's named my.h), either use
#include "my.h" // path relative to build directory
or
#include <my.h> // path relative to any of the include paths
The latter only works if my.h is found on an include path known to the compiler (e.g. one specified with the -I command-line option in GCC). The former works if the path to the .h file is given relative to the directory you are building from.
Finally, do not use a file name that can be confused with a system library (such as "math.h"), especially if you are using the <...> syntax, as the include path will definitely include the system library header files.
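Putting this together for the code in the question, a working split might look like this (the file is renamed to avoid the math.h clash, and parameters are added since the original snippet left num1 and num2 undeclared):
// my_math.h -- declarations only, protected by an include guard
#ifndef MY_MATH_H
#define MY_MATH_H
int Sum(int a, int b);
#endif
// my_math.cpp -- the definition, compiled as its own translation unit
#include "my_math.h"
int Sum(int a, int b)
{
    return a + b;
}
// my_class.cpp -- include the header, never the .cpp
#include "my_math.h"
int ComputeSomething(int num1, int num2)
{
    return Sum(num1, num2);
}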
Have you followed the instructions given here?
User-created libraries as of version 0017 go in a subdirectory of your
default sketch directory. For example, on OSX, the new directory would
be ~/Documents/Arduino/libraries/. On Windows, it would be My
Documents\Arduino\libraries. To add your own library, create a new
directory in the libraries directory with the name of your library.
The folder should contain a C or C++ file with your code and a header
file with your function and variable declarations. It will then appear
in the Sketch | Import Library menu in the Arduino IDE.
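In other words, the layout would be something like this (OS X path shown; the library name MyMath is hypothetical):
~/Documents/Arduino/libraries/MyMath/my_math.h
~/Documents/Arduino/libraries/MyMath/my_math.cpp
The library should then appear in the Sketch | Import Library menu (the IDE may need a restart to pick it up).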
I have a large C++ file (SS.cpp) which I decided to split into smaller files so that I can navigate it without needing aspirin. So I created
SS_main.cpp
SS_screen.cpp
SS_disk.cpp
SS_web.cpp
SS_functions.cpp
and cut-pasted all the functions from the initial SS.cpp file to them.
And finally I included them in the original file :
#include "SS_main.cpp"
#include "SS_screen.cpp"
#include "SS_disk.cpp"
#include "SS_web.cpp"
#include "SS_functions.cpp"
This situation has persisted for some months now, and these are the problems I've had:
The Entire Solution search (Shift-Ctrl-F in VS) does not search in the included files, because they are not listed as source files.
I had to manually indicate them for Subversion inclusion.
Do you believe that including source files in other source files is an acceptable workaround when files get really big? I should say that splitting the implemented class into smaller classes is not an option here.
There are times when it's okay to include an implementation file, but this doesn't sound like one of them. Usually this is only useful when dealing with certain auto-generated files, such as the output of the MIDL compiler. As a workaround for large files, no.
You should just add all of those source files to your project instead of #including them. There's nothing wrong with splitting a large class into multiple implementation files, but just add them to your project; including them like that doesn't make much sense.
--
Also, as an FYI, you can add files to your projects, and then instruct the compiler to ignore them. This way they're still searchable. To do this, add the file to the project, then right-click it, and go to Properties, and under "General" set "Exclude from Build" to Yes.
Don't include cpp files in other files. You don't have to define every class function in one file; you can spread them across multiple files. Just add them individually to the project and have it compile all of them separately.
You don't include implementation (.cpp) files. Create header files for these implementation files containing the function/class declarations, and include those as required.
There are actually times you will want to include CPP files. There are several questions here about Unity Builds which discuss this very topic.
You need to learn about Separate compilation, linking, and what header files are for.
You need to create a header file for each of those modules (except possibly main.cpp). Each header file will contain the declarations from the corresponding .cpp source file, and the .cpp files themselves will contain the definitions. Each unit can then be separately compiled and linked. For example:
main.cpp
#include "function.h"
int main()
{
func1() ;
}
function.h
#if !defined FUNCTION_H
#define FUNCTION_H
extern void func1() ;
#endif
function.cpp
void func1()
{
// do stuff
}
Then function.cpp and main.cpp are separately compiled (by adding them to the sources for the project) and then linked. The header file is necessary so that the compiler is made aware of the interface to func1() without seeing its complete definition. The header should be added to the project's headers; then you will find that the source browser, auto-completion, etc. work correctly.
What bothers me about this question is its context.
A large cpp file has been created, large enough to warrant thinking about splitting it into smaller, more manageable files. The proposed split is:
SS_main.cpp
SS_screen.cpp
SS_disk.cpp
SS_web.cpp
SS_functions.cpp
This seems to indicate that there are separate units of functionality from a specification and design perspective. We can only guess at the coupling between these units of code.
However, it would be a start to define these code units such that each new cpp file has its own header file, thus defining the interfaces of these units and keeping the coupling between them low and the cohesion of each unit high.
We are refactoring here.
It is not acceptable to use included cpp files in this context, as it does not provide any advantages. The only time I've come across included cpp files is when one is included to provide debug code, an example being compiling non-inline versions of functions to make stepping through code in the debugger easier.