Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I have a question about how large C++ projects with many components are supposed to be managed (I guess that is the best term). For all intents and purposes I'm a beginning programmer. I understand the basics of compiling, header files, etc., but I've never really worked on anything bigger than homework assignments. So, let's take something like a game engine that has various components like a memory manager, renderer, physics simulation, and so on. How would one work on these components separately, but in a way that makes it easy to integrate them back into the whole? For example, would you make a separate Visual Studio project for each piece with its own main? If you have one big project for everything, how would you work on one component without another unfinished component potentially making every compile fail? I feel like I'm missing some major concept. Like, for projects with multiple programmers that have to check out portions to work on... do they grab all the code so they can compile, or do they set up their own temporary project to work on their bit? Both options sound wrong. You have to have a main function to compile, right?
I would very much appreciate anyone educating me on this topic, as I feel this is something I should have learned and just somehow missed completely.
When you are working with larger programs it is customary to have one source file with a main function, and the rest of the code (there can be many source files) is called from main. Then you need a build strategy. You can write a script that compiles each of your source files and then links them all together. Unfortunately this can lead to long build times, so professional programmers use makefiles, which rebuild only the files that have changed.
As a further refinement, you can organize groups of sources into libraries and build the libraries separately and then link them with your remaining compiled source files.
Try looking up gmake (GNU make, for Linux) to see how to build larger projects. I guess you are using Microsoft VC++, in which case compiled files have .obj extensions and libraries have .lib extensions. Microsoft has its own way of building libraries, which is slightly more complicated than using gmake.
When you look further you'll come across shared libraries (dynamic link libraries, or DLLs, on Windows).
This isn't really a great question for Stack Overflow's format. C++ does provide language facilities for managing large code bases, like namespaces, classes, and header files. But your question seems to suggest a lack of perspective as to what they are for, or a limited understanding of the technical framework and process for contributing code to a software project, which isn't a C++-specific issue.
When working on a living project, a primary concern is dealing with complexity. Or, in other words, reducing the number of things you have to think about at any one point in time. What that means is if another programmer is working on the user interface, ideally your code in the physics engine shouldn't have to change to reflect those changes. So interfaces, for forming abstractions and hiding information, are essential.
Granted I'm pretty green as well, so I can't give any real solid advice. I only mention this point to give some perspective as to how vague your question is. If I understand your question correctly, you might enjoy a book like Code Complete 2 by McConnell.
Large projects are separated into pieces. Normally, you should be able to compile each piece separately. The best practice that I know is to declare the interfaces among the various components, keeping dependencies as close to zero as possible, and then to build 'test' programs, which are small and serve two purposes: they test a small piece of code, and they provide a main().
The directory structure is usually:
yourlib/
    lib/
    ext-inc/
    test/
    other dirs/
    ...
The lib directory contains the output library object (.a, .so).
The ext-inc directory contains the headers that external code will use (sometimes called 'public' or just 'inc').
The test directory usually has a main.c (or .cpp) file and might have more files, as needed.
When you check out (svn) / clone (git) / sync (p4) / etc., you take everything but work only on your area. Once done, you merge/submit your changes into the main branch.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 4 years ago.
I have an application that depends on many libraries. I am building everything from source on an Ubuntu machine. I want to remove any function/class that is not required by the application. Is there any tool to help with that?
P.S. I want to remove source code from the library not just symbols from object files.
The standard strip utility was created for exactly this.
I have now researched this a bit in the context of my own project and decided this was worth a full answer rather than just a comment. This answer is based on Apple's toolchain on macOS (which uses clang, rather than gcc), but I think things work in much the same way for both.
The key to this is enabling 'link time optimization' when building your libraries and executable(s). The mechanics of this are actually very simple - just pass -flto to gcc and ld on the command line. This has two effects:
Code (functions / methods) in object files or archives that is never called is omitted from the final executable.
The linker performs the sort of optimisations that the compiler can perform (such as function inlining), but with knowledge that extends across compilation unit boundaries.
It won't help you if you are linking against a shared library, but it might help if that shared library links with other (static) libraries which contain code that the shared library never calls.
On the upside, this reduced the size of my final executable by about 5%, which I'm pleased about. YMMV.
On the downside, my object files roughly doubled in size and sometimes link times increased dramatically (by something like a factor of 100). Then, if I re-linked, it was much faster. This behaviour might be a peculiarity of Apple's toolchain however. Perhaps it is stashing away some build intermediates somewhere on the first link. In any case, if you only enable this option for release builds it should not be a major issue.
There are more details of the full set of gcc command line options that control optimisation here: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html. Search that page for flto to narrow down your search.
And for a glimpse behind the scenes, see: https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html
Edit:
A bit more information about link times. Apple's linker creates some huge files in a directory called LTOCache when you link. I've not seen these before today, so they look to be the build intermediates that speed up linking the second time around. As for my initial link being so slow, this may in part be due to the fact that, in my case, these are created on an SMB server. But then again, the CPU was maxed out, so maybe not.
OK, now that I understand the OP's requirements better I have another answer for this that I think might better suit his needs. I think the way to tackle this is with a code coverage tool. After all, the problem is identifying what you can safely get rid of. Actually stripping it out is easy.
My IDE (Visual Studio) has one of these built in but I think the OP is using gcc so the first port of call appears to be gcov. There are a number of commercial options, but they are expensive. There's also a potentially useful post here.
The other thing you need, of course, is a program that exercises all the parts of the library that you want to keep to give you a coverage report to work from, but it sounds like the OP already has that. A good IDE will also help as it makes navigating around the code so much easier. In Visual Studio, I find Jump to Definition and quick and easy 'bookmarking' to be key features.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I'm trying to find different ways to reuse my C++ functions in different applications. Say for example I have the following functions:
Function A(){} // this will do a complex math operation
Function B(){} // this will load a complex shape file
Function C(){} // Print the results.
I need to use the above 3 functions in 3 different C++ programs. They are completely independent and I'm trying to see what the best way is to use them in all of my applications rather than writing same code 3 times.
I am thinking about the following options:
Option A: Writing static library
Option B: Writing dynamic library
Option C: Windows Services
Option D: Same code and compile everywhere
Are there any other options? Or what would be the best option?
If the functions are only going to be called "in-house" by yourself and/or your co-workers (i.e. they aren't going to be exposed to people who don't have access to your source code repository) then option (D) is sufficient. Just keep the .cpp and .h files in a single well-known sub-directory of your source code repository and have each application's project file reference them as necessary. This is simple to implement and gives you maximum flexibility (since each project can compile the shared .cpp files with different compiler flags that best suit its own needs, if necessary -- with a library you'd have to figure out a single set of compiler flags that would work for all applications that want to link to the library, which isn't always convenient).
If you're writing an API for public consumption, OTOH, things get a little more complex, since after you release the code to the public you will no longer be in full control of which versions are getting used and where. In that case you will have to make a decision based on who your users are and what you think they would be most comfortable with.
Option C can probably be tossed out since it's overkill for this sort of thing, and carries the penalty of tying your code to a particular OS with no compensatory advantage.
It's option D (compile everywhere) all the way -- with the only exceptions being stand-alone libraries that are shared with many, many other people (or closed-source).
This makes it a lot easier to manage releases, because there really aren't any -- each copy of the library can be updated independently -- whenever is convenient.
This makes it easy for each project to debug into the library, with the particular version of the library that is in use.
This gives you the option of customizing the library for each project -- but use this capability judiciously to minimize merging complexity.
This choice is independent of whether or not you build the library into a separate binary package as part of your build process.
I would recommend using something like git-submodules to manage the code -- except that the git-submodules feature is kind of half-baked.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I would focus on libraries though it can be a general application installation as well.
When we install a library (say, for C++), a novice user like me probably expects that all the source code gets copied somewhere, with a few flags and path variables set, so that we can directly use #include statements in our own code and start using the library.
But by inspection I can say that the exact source files are not actually copied; instead, pre-compiled object forms of the files are copied, except for the so-called *.h header files. (I say this simply because I cannot find the source files anywhere on the hard disk, only the header files.)
My Questions:
What is the behind-the-scenes method when we "install" something? What are the typical locations that get affected in a Linux environment, and what is the typical importance/use of each of those locations?
What is the difference between "installing" a library and installing a new application into the Linux system via "sudo apt-get" or similar?
Finally, if I have a custom set of source files which are useful as a library, and want to send them to another system, how would I "install" my own library there, in the same way as above?
Just to clarify, my primary interest is to learn from your kind answers and literature pointers the bigger picture of a typical installation (of an application or a library), to a level where I can cross-check, learn, and redo it if I want to.
(Question was removed, question addressed difference between header and object files) This is more a question of general programming. A header file is just the declaration of classes/functions/etc, it does nothing. All a header file does is say "hey, I exist, this is what I look like." That is to say it's just a declaration of signatures used later in the actual code. The object code is just the compiled and assembled, but not linked code. This diagram does a good job of explaining the steps of what we generally call the "compilation" process, but would better be called the "compilation, assembling, and linking process." Briefly, linking is pulling in all necessary object files, including those needed from the system, to create a running executable which you can use.
(Now question 1) When you think about it, what is installation except the creation and modification of necessary files with the appropriate content? That's what installing is: just placing the new files in the appropriate place, and then modifying configuration files if necessary. As to what "locations" are typically affected, you usually see binaries placed in /bin, /usr/bin and /usr/local/bin; libraries are typically placed in /lib or /usr/lib. Of course this varies from system to system. I think you'd find this page on Linux system directories to be an educational read. Remember though, anything can be placed pretty much anywhere and still work appropriately as long as you tell other things where to find it; these directories are just used because they keep things organized and allow for assumptions about where items, such as binaries, will be located.
(Now question 2) The only difference is that apt-get generally makes it easier by installing the item you need and keeping track of installed items, also it allows for easy removal of installed items. In terms of the actual installation, if you do it correctly manually then it should be the same. A package manager such as apt-get just makes life easier.
(Now question 3) If you want to do that you could create your own package or, if it's less involved, you could just create a script that moves the files to the appropriate locations on the system. However you want to do it is fine, as long as you get the items where they need to be. If you want to create a package yourself, it'd be a great learning experience and there are plenty of tutorials online. Just find out what package system your flavor of Linux uses, then look for a tutorial on how to create packages of that type.
So the really big picture, in my opinion, of the installation process is just compilation (if necessary), then the moving of necessary files to their appropriate places on the system, and the modification of existing files on the system if necessary: Put your crap there, let the system know it's there if you need to.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I'm building a very basic library, and it's the first project that I plan on releasing for others to use if they'd like. As such, I'd like to know what some "best practices" are as far as organization goes. The only thing that may be unique about my project is that in order for it to be used in a project, users would be required to extend certain abstract classes, which leads me to my first question:
A lot of libraries I've seen consist of a .a file and a single .h file. Is this best practice? Wouldn't it be better to expose all the public .h files so that users can choose which ones to include? If this is the preferred way of doing things, how exactly is it accomplished? What goes into that single .h file?
My second question involves dependencies. For example my current project relies on OpenGL, GLFW, and GLEW. Should I package those in some way with my project, or just make it the user's responsibility to ensure that they are installed?
Edit: Someone asked about my target OS. All of my dependencies are cross platform so I'm (perhaps naively) hoping to make my library cross platform as well.
Thanks for any and all help!
It really depends on the circumstances. If you have some fairly complex functionality that is spread across a number of closely related functions, then one header is the right solution.
E.g. if you write a set of functions that draw something to the screen, and you need a few functions to configure/set up the environment, a few functions to define and place objects in the scene, a few functions to do the actual drawing/processing, and finally teardown, then using one header file is a good plan.
In the above case, it's also possible to have one "overall" header-file that includes several smaller ones. Particularly if you have fairly large classes, sticking them all in one file gets rather messy.
On the other hand, if you have one set of functions that deal with gases dissolved in liquids, another set of functions to calculate the strength/load capacity of a steel beam, and another set of functions to calculate the friction of a rubber tyre against a road surface, then they probably should have different headers, even if it is all functionality that could feasibly go in a "Physics/mechanics library".
It is rarely a good idea to supply third-party libraries with your library. Yes, if you want to offer two downloads, one with "all you need, just add water" and one "bare library", that's fine. But I don't want to spend three times longer than necessary downloading your library simply because it also contains three other libraries that your code is using, which are already on my machine. However, do document what libraries are needed, and what you need to do to install them on your supported platforms (and what the supported platforms are). Also document which versions of the libraries you have tested; there's nothing worse than "getting the latest", only to find that the version something needs is two steps back...
(And as Jason C points out, licensing gets very messy once you have a few different packages that your code depends on, because your license then has to be compatible with ALL the other licenses - sometimes that's not even possible...)
You have options and it really depends on how convenient you choose to make it for developers using your libraries.
As for the headers, the general method for libraries of average complexity is to have a single header that a developer can include to get everything they need. A good method is, if you have multiple headers, create a single header with the same name as your library (not required, just common) and have it #include all the individual headers. Then distribute the single header and individual headers. That way your users have the option of #including just one to get everything, or #including individual ones if necessary.
E.g. in mylibrary.h:
#ifndef MYLIBRARY_H
#define MYLIBRARY_H
#include <mylibrary/something.h>
#include <mylibrary/another.h>
#include <mylibrary/lastone.h>
#endif
Ensure that your individual headers can be included standalone (i.e. they #include everything they need) if you want to provide that option to developers.
As for dependencies, you will want to make it the user's responsibility to ensure they are installed. The user is compiling their code and linking to your library, and so it is also the user's responsibility to link to dependent libraries. If you package third-party dependencies with your library you run many risks:
Breaking the systems of users who already have the dependencies installed.
As mentioned in Mats Petersson's answer, forcing users to download dependencies they already have.
Violating licensing rights on third-party libraries.
The best thing for you to do is clearly document the required dependencies.
For this there are not really standard "best practices". Any sane practice would be a good practice.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
The source code of our application is hundreds of thousands of lines, thousands of files, and in places very old - the app was first written in 1995 or 1996. Over the past few years my team has greatly improved the quality of the source, but one issue remains that particularly bugs me: a lot of classes have a lot of methods fully defined in their header file.
I have no problem with methods declared inline in a header in some cases - a struct's constructor, a simple method where inlining measurably makes it faster (we have some math functions like this), etc. But the liberal use of inlined methods for no apparent reason is:
Messy
Makes it hard to find the implementation of a method (especially searching through a tree of classes for a virtual function, only to find one class had its version declared in the header...)
Probably increases the compiled code size
Probably causes issues for our linker, which is notoriously flaky for large codebases. To be fair, it has got much better in the past few years, but it's not perfect.
That last reason may now be causing problems for us and it's a good reason to go through the codebase and move most definitions to the source file.
Our codebase is huge. Is there an automated tool that can do (most of) this for us?
Notes:
We use Embarcadero RAD Studio 2010. In other words, the dialect of C++ includes VCL and other extensions, etc.
A few headers are standalone, but most are paired with a corresponding .cpp file, as you normally would. Apart from the extension the filename is the same, i.e., if there are methods defined in X.h, they can be moved to X.cpp. This also means the tool doesn't have to handle parsing the whole project - it could probably just parse individual pairs of .cpp/.h files, ignore the includes, etc, so long as it could reliably recognise a method with a body defined in a class declaration and move it.
You might try Lazy C++. I have not used it, but I believe it is a command line tool to do just what you want.
If the code is working then I would vote against any major automated rewrite.
Lots of work could be involved fixing it up.
Small iterative improvements over time are a better technique, as you will be able to test each change in isolation (and add unit tests). Anyway, your major complaint about not being able to find the code is not a real problem and is already solved. There are already tools that will index your code base so your editor will jump to the correct function definition without you having to search for it. Take a look at ctags or the equivalent for your editor.
Messy
Subjective
Makes it hard to find the implementation of a method (especially searching through a tree of classes for a virtual function, only to find one class had its version declared in the header...)
There are already tools available for finding the function. ctags will make a file that allows you to jump directly to the function from any decent editor (vim/emacs). I am sure your editor, if not one of these, has an equivalent tool.
Probably increases the compiled code size
Unlikely. The compiler will choose whether to inline based on internal metrics, not on whether the function is marked inline in the source.
Probably causes issues for our linker, which is notoriously flaky for large codebases. To be fair, it has got much better in the past few years, but it's not perfect.
Unlikely. If your linker is flaky then it is flaky; it is not going to make much difference where the functions are defined, as this has no bearing on whether they are inlined anyway.
XE2 includes a new static analyzer. It might be worthwhile to give the trial of the new version of C++Builder a spin.
You have a number of problems to solve:
How to regroup the source and header files ideally
How to automate the code modifications to carry this out
In both cases, you need a robust C++ parser with full name resolution to determine the dependencies accurately.
Then you need machinery that can reliably modify the C++ source code.
Our DMS Software Reengineering Toolkit with its C++ Front End could be used for this. DMS has been used for large-scale C++ code restructuring; see http://www.semdesigns.com/Company/Publications/ and track down the first paper, "Case Study: Re-engineering C++ Component Models Via Automatic Program Transformation". (There's an older version of this paper you can download from there, but the published one is better.) AFAIK, DMS is the only tool to have ever been applied to transforming C++ at large scale.
This SO discussion on reorganizing code addresses the problem of grouping directly.