My C++ project is growing larger, and we are also moving to CMake for building. I want to divide the application into libraries so that they can be linked for testing, preparing the application package, etc. Right now I would divide my code into libraries as follows:
core
GUI
utilities (these are used by core and other components)
io (XML parsing/outputting using print functions of classes in core)
tests (unit tests)
simulator (tests the core)
An alternative would be to divide based on the directory structure - one library for each directory. But from my past experience, that leads to too many libraries, and then library dependencies become tough to handle during linking.
Are there any best practices in this regard?
Sit down with a piece of paper and decide your library architecture.
The libraries should be designed as a set of levels.
A library on level A (the base) should depend only on system libraries and, only if it must, on other libraries at level A.
A library on level B can depend on libraries at level A and on system libraries, and, only if it must, on other libraries at level B.
And so on.
Each library should represent a complete job at its particular level. Things at lower levels generally have smaller jobs, but there are lots of them. A library at a higher level should represent a complete task, i.e. don't have one lib for window objects and another lib for events. At that level the job is handling all interaction with a window (which includes how it interacts with events).
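To make the one-way dependency rule concrete, here is a minimal compilable sketch that uses namespaces as stand-ins for the libraries in your list (utilities at the base, core above it, io above that); all class and function names here are made up for illustration:

```cpp
#include <iostream>
#include <string>

// Level A: utilities -- depends only on the standard library.
namespace utilities {
    std::string trim(const std::string& s) {
        const std::string::size_type b = s.find_first_not_of(" \t");
        const std::string::size_type e = s.find_last_not_of(" \t");
        return b == std::string::npos ? std::string() : s.substr(b, e - b + 1);
    }
}

// Level B: core -- may call down into utilities (and system libraries), never up into io or GUI.
namespace core {
    class Document {
    public:
        explicit Document(const std::string& name) : name_(utilities::trim(name)) {}
        const std::string& name() const { return name_; }
    private:
        std::string name_;
    };
}

// Level C: io -- knows about core and utilities; core never knows about io.
namespace io {
    std::string to_xml(const core::Document& d) {
        return "<document name=\"" + d.name() + "\"/>";
    }
}

int main() {
    core::Document d("  report  ");
    std::cout << io::to_xml(d) << "\n";   // prints <document name="report"/>
}
```

The point is only the direction of the arrows: io may include core and utilities headers, core may include utilities, and nothing at a lower level ever includes anything from a level above it.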
You seem to have identified some reasonable functional groups. The only one that seems a bit suspicious is io. If you truly have some generic IO routines that provide real functionality, fine. But if it is just grouping the IO for a bunch of different objects, then I would scrap it (it all depends on the usage).
So the next step is to identify the relationship between them.
As for using directory structures: usually everything in one directory will be present within the same library, but that does not exclude the possibility of other directories also being included. I would avoid putting half the classes in a directory in libA and the other half in libB, etc.
You should have a read of Large-Scale C++ Software Design by John Lakos.
You may not be able to read it before you start your work, but you should put this book on your list.
Otherwise Martin York's advice is sound.
One more thing though: I would recommend picking up a tool like doxygen that can give you dependency diagrams of your code base. If you're bothering to do this type of restructuring, you should rid yourself of circular dependencies between your libraries. Lakos describes a lot of ways to cut dependencies - some obvious, some less so.
I like starting from a package diagram that only has one way arrows.
http://www.agilemodeling.com/style/packageDiagram.htm
Your list looks like a good start.
Seems reasonable.
I'd query what your unit_tests library is supposed to do.
Given a collection of "projects" building libA, libB, libC... I'd expect to see some matching projects testA testB testC (whether they build libraries or executables depends whether the built tests run standalone or are loaded into some test runner).
I'm also slightly wary of "utilities" libraries. These seem to have a surprising ability to cause pain and suffering in the long run. For example, maybe your IO library has no other dependency than the utilities library. One day you want to reuse the IO library in another project on another platform. Only problem is, you now also have to port all of the utilities library (90% of which IO doesn't use), or disentangle the 10% of it which IO actually depends on. Sometimes it's better to have libraries be a bit more dependency free, at the cost of some code duplication.
I searched for this particular question but could not come up with any results, neither here nor on-line in general (maybe because it is a little harder to phrase for me). If it has already been asked, please point me in the right direction.
I am at a point where I would like my libraries/software to be pluggable. I see all these various libraries and systems where plugins are used extensively and the authors boastfully point out (in a good way!) that their software has plugin support.
So my question is, where do I start? Are there any books/online resources that break the ice and may guide one on the dos and don'ts of making your library pluggable, define best practices, etc.?
You have to understand some things before starting:
There is no support for modules (static or dynamic) in standard C++. Nope. Not yet. Maybe in 2015.
DLLs (or .so files on Unix-like systems) are dynamically loaded libraries that are compiler/OS dependent. So they are a pragmatic solution that fills the need.
So you'll have to use shared libraries (whatever the file extension, that's the keyword for searches on this subject) as plugin binaries. If your plugin should contain more than runtime code, like graphic resources, you can include those resources in the binary, or have a file format or compressed archive that contains the binary file.
However you set up your plugin files, in C++ the problem is the interface.
Depending on which compiler you use, you'll have different ways to "tag" functions or classes as exported/imported (meaning your plugin source code exports the code and the user of the plugin imports it).
Set up a clean and clear interface in C++ for the modules, with no templates (because they are compiler and compiler-configuration dependent). Those interfaces should be function declarations and class declarations with no inline code, marked exported/imported.
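As a hedged illustration of such an interface, the header below shows the usual export/import tagging pattern; the PLUGIN_API macro, the Plugin class and the factory function names are all hypothetical:

```cpp
// plugin_api.h -- hypothetical names throughout (PLUGIN_API, Plugin, create_plugin).
#pragma once

// Compiler-specific export/import "tagging".
#if defined(_WIN32)
  #if defined(BUILDING_PLUGIN)
    #define PLUGIN_API __declspec(dllexport)   // defined when compiling the plugin itself
  #else
    #define PLUGIN_API __declspec(dllimport)   // seen by the host application
  #endif
#else
  #define PLUGIN_API __attribute__((visibility("default")))
#endif

// A pure abstract interface: no templates, no inline implementation code.
class Plugin {
public:
    virtual ~Plugin() {}
    virtual const char* name() const = 0;
    virtual void run() = 0;
};

// Factory functions with C linkage so the exported symbol names are not mangled.
extern "C" {
    PLUGIN_API Plugin* create_plugin();
    PLUGIN_API void destroy_plugin(Plugin* p);   // destruction stays inside the plugin's module
}
```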
Now, once you've got this, you can use OS-specific APIs to load/unload dynamic library binaries while the application is running. Once that's done, you can get pointers to the exported functions, again using the OS-specific API. I'll let you search for the details.
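A rough sketch of that loading step, using LoadLibrary/GetProcAddress on Windows and dlopen/dlsym on POSIX systems (the plugin file names are assumptions, and the symbols come from the hypothetical header above):

```cpp
// loader.cpp -- a sketch only; create_plugin / destroy_plugin come from plugin_api.h above.
#include <iostream>
#include "plugin_api.h"

#if defined(_WIN32)
  #include <windows.h>
  typedef HMODULE LibHandle;
  static LibHandle open_lib(const char* path)          { return ::LoadLibraryA(path); }
  static void*     get_sym(LibHandle h, const char* n) { return reinterpret_cast<void*>(::GetProcAddress(h, n)); }
  static void      close_lib(LibHandle h)              { ::FreeLibrary(h); }
  static const char* kPluginFile = "myplugin.dll";       // assumed file name
#else
  #include <dlfcn.h>
  typedef void* LibHandle;
  static LibHandle open_lib(const char* path)          { return ::dlopen(path, RTLD_NOW); }
  static void*     get_sym(LibHandle h, const char* n) { return ::dlsym(h, n); }
  static void      close_lib(LibHandle h)              { ::dlclose(h); }
  static const char* kPluginFile = "./libmyplugin.so";    // assumed file name
#endif

int main() {
    LibHandle lib = open_lib(kPluginFile);
    if (!lib) { std::cerr << "could not load " << kPluginFile << "\n"; return 1; }

    // Fetch the factory functions exported by the plugin.
    typedef Plugin* (*CreateFn)();
    typedef void    (*DestroyFn)(Plugin*);
    CreateFn  create  = reinterpret_cast<CreateFn>(get_sym(lib, "create_plugin"));
    DestroyFn destroy = reinterpret_cast<DestroyFn>(get_sym(lib, "destroy_plugin"));
    if (!create || !destroy) { close_lib(lib); return 1; }

    Plugin* p = create();
    std::cout << "loaded plugin: " << p->name() << "\n";
    p->run();
    destroy(p);      // let the plugin free its own object
    close_lib(lib);
    return 0;
}
```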
Now, there are libraries that provide ways to abstract this in a cross-platform way. I haven't used them yet, and they are known to be imperfect because of the lack of definitions in the C++ standard, but they could be useful if you're planning to make your application cross-platform:
boost::extension: it's not yet a Boost library, nor even formally proposed, and its development is paused (until some new standard C++ implementations are done), so it might be a bad idea, but a lot of people say they use it with success.
The POCO libraries include a shared-library component that would be the equivalent of boost::extension. Again, a lot of people say it's useful, so I guess it's good enough to be used.
The other alternative, which is easy to set up if you don't need to support tons of target platforms, is to just write some wrapper code around the OS-specific APIs. That's what I did before knowing about boost::extension, for example.
I am about to attempt reorganizing the way my group builds a set of large applications that share about 90% of their source files. Right now, these applications are built without any libraries whatsoever involved, except for externally linked ones that are not under our control. The applications use the same common source files (we are not maintaining 5 versions of the same .h/.cpp files), but these are not built into any common library. So, at the moment, we are paying the price of building the same code over and over per application, each time we intend to release a version. To me, this sounds like a prime candidate for using libraries to capture the shared code and reduce build times. I do not have the option of using DLLs, so the approach is to use static libraries.
I would like to know what tips you would have for how to approach this task. I have limited experience with creating/organizing static libraries, so even the basic suggestions towards organization/gotchas are welcome. Maybe even a good book recommendation?
I have done a brief exercise by finding the entire subset of files that each application share in common. As a proof of concept, I took these files and placed them in a single "Common Monster" static library. Building the full application using this single static library certainly improves the build time for all of the applications, but should I leave it at this? The purpose of the library in this form is not very focused and seems like a lazy attempt at modularity. There is ongoing development with these applications, and I'm afraid this setup will cause problems further down the line.
It's very hard to give general guidelines in this area - how you structure libraries depends very much on how you use them. Perhaps if I describe my own code libraries this may help:
One general purpose library containing code that I expect all applications will have at least a 50/50 chance of needing to use. This includes string utilities, regexes, expression evaluation, XML parsing and ODBC support. Conceivably this should be split up a bit, but keeping it monolithic makes distributing my code in FOSS projects easier.
A library supporting multi-threading, providing wrappers around threads, mutexes, semaphores etc.
One supporting SQLite via its native interface, rather than via ODBC.
A C++ web server wrapper round the Mongoose C web server.
The general purpose library is used in all the stuff I write, the others in more specialised circumstances. Headers for each library are held in separate directories, as are the library binaries themselves (though they should probably be in a single lib directory).
Make sure that the dependencies of your libraries form a directed acyclic graph. While this is not necessarily a problem for static libs (I'm not sure, in fact), it will be a problem if you ever decide to switch to DLLs. Depending on your situation, this may require some redesign of interfaces.
Another thing I noticed (for sure on MSVC), which you may consider if build speed is an important concern: DLLs link much faster than static libraries. I assume this is because they don't have to be copied into the new executable and there's no need to search for and eliminate unused code. Even if it's not an option for production, you may use this trick while developing.
I am also in the habit of creating my solution files with CMake, because it is easier to get an overview of the entire build process than to click through an endless list of options in a GUI. It's up to you to decide if you want to walk that path.
I am developing a portable C++ application and looking for some best practices in doing that. This application will have frequent updates and I need to build it in such a way that parts of program can be updated easily.
For a frequently updated program, is splitting the program's parts into libraries the best practice? If the parts are in separate libraries, users can just replace a library when something changes.
If the answer to point 1 is "yes", what type of library should I use? On Linux, I know I can create a "shared library", but I am not sure how portable that is to Windows. What type of library should I use? I am aware of the DLL hell issues on Windows as well.
Any help would be great!
Yes, using libraries is good, but the idea of "simply" replacing a library with a new one may be unrealistic, as library APIs tend to change and apps often need to be updated to take advantage of, or even be compatible with, different versions of a library. With a good amount of integration testing, though, you'll be able to 'support' a range of different versions of the library. Or, if you control the library code yourself, you can make sure that changes to the library never break the application.
On Windows, DLLs are the direct equivalent of shared libraries (.so) on Linux, and if you compile both in a common environment (either cross-compiling or using MinGW on Windows) then the linker will just do it the same way. Presuming, of course, that all the rest of your code is cross-platform and configures itself correctly for the target platform.
IMO, DLL hell was really more of a problem in the old days when applications all installed their DLLs into a common directory like C:\WINDOWS\SYSTEM, which people don't really do anymore simply because it creates DLL hell. You can place your shared libraries in a more appropriate place where it won't interfere with other non-aware apps, or - the simplest possible - just have them in the same directory as the executable that needs them.
I'm not entirely convinced that separating out the executable portions of your program in any way simplifies upgrades. It might, maybe, in some rare cases, make the update installer smaller, but the effort will be substantial, and certainly not worth it the one time you get it wrong. Replace all executable code as one in most cases.
On the other hand, you want to be very careful about messing with anything your users might have changed. Draw a bright line between the part of the application that is just code and the part that is user data. Handle the user data with care.
If it is an application, my first choice would be to ship a statically linked single executable. I had the opportunity to work on a product that was shipped to 5 platforms (Win2K, WinXP, Linux, Solaris, Tru64 Unix), and believe me, maintaining shared libraries or DLLs with a large codebase is a hell of a task.
Suppose this is a non-trivial application which involves the use of a third-party GUI, threads, etc. Using C++, there is no single way of doing it on all platforms. This means you will have to maintain different codebases for different platforms anyway. Then there are some weird behaviours (bugs) of third-party libraries on different platforms. All this creates a burden if the application is shipped using different library versions, i.e. different versions attached to different platforms. I have seen people ship libraries to all platforms when the fix is only for a particular platform, just to avoid the versioning confusion. But it is not that simple; the customer often has a different angle on how he/she wants to upgrade/patch, which also has to be considered.
Of course, if the binary you are building is huge, then one can consider DLLs/shared libraries. Even if that is the case, what I would suggest is to build your application in the form of layers like:
Application-->GUI-->Platform-->Base-->Fundamental
So here, some libraries can have common code for all platforms. Only specific libraries like 'Platform' would be updated for platform-specific behaviours. This will make your life a lot easier.
IMHO a DLL/shared-library option is viable when you are building a product that acts as a complete solution rather than just an application. In such a case, different subsystems use common logic simultaneously within your product framework, and that logic can then be shared in memory using DLLs/shared libraries.
HTH,
As soon as you're trying to deal with both Windows and a UNIX system like Linux, life gets more complicated.
What are the service requirements you have to satisfy? Can you control when client systems get upgraded? How many systems will you need to support? How much of a backward-compatibility requirement do you have?
To answer your question with a question, why are you making the application native if being portable is one of the key goals?
You could consider moving to a virtual platform like Java or .NET/Mono. You can still write C++ libraries (shared libraries on Linux, DLLs on Windows) for anything that would be better as native code, but the bulk of your application will be genuinely portable.
We have a core library in the form of a DLL that is used by more than one client application. It has gotten somewhat bloated and some applications need only a tiny fraction of the functionality that this DLL provides. So now we're looking at dividing this behemoth into smaller components.
My question is this: can anyone recommend a path to take to divide this bloated DLL into a set of modules that have some interdependencies but do not necessarily require all other modules?
Here are the options as I see them but I'm hoping someone can offer other possibilities:
Create a "core" dll and several "satellite" dlls which use the core and possibly other satellite DLLs.
Subdivide the contents of the bloated DLL into static libraries that the main DLL uses (to maintain the same functionality), so that apps that don't want the bloated version can assemble just the static libraries they need into their own DLL or into the app itself.
I was hesitant to mention this but I think it may be important to note that the app uses MFC.
Thanks for your thoughts.
Somewhat related to your question is this question, about splitting up a very large C module into smaller ones.
How do you introduce unit testing into a large, legacy (C/C++) codebase?
It seems your question has to do with the larger question of breaking some large blob of code into a more modular system. The link above is definitely recommended reading.
Without having all the details it is a little hard to help, but here is what I would do in your situation:
Provide both static and DLL versions of whatever you release - for multi-threaded (MT) and single-threaded runtimes.
Try to glean from the disparate clients which items should be grouped together to provide reasonable segmentation - without having layers of dependencies.
Having a "core" module sounds like a good idea - and make sure you don't have too many levels of dependencies; you might want to keep it simple.
You may find after the exercise that one big dll is actually reasonable.
Another consideration is that maintaining multiple DLLs and both static libs and DLLs will hugely increase the complexity of maintenance.
Are you going to be releasing them all at once every time, or are they going to be mix and match? Be careful here - and know that you could create testing issues.
If no one is complaining about the size of the DLL then you might want to consider leaving it as is.
Assume you have five products, and all of them use one or more of the company's internal libraries, written by individual developers.
It sounds simple, but in practice I have found it to be very difficult to maintain.
How do you deal with the following scenarios:
A developer unintentionally introduces a bug and breaks everything in production.
Every library has to mature; that means the API needs to evolve. So how do you deploy the updated version to production if every developer needs to update/test their code while they are extremely busy on other projects? Is this a resource and time issue?
Version control, deployment, and usage. Would you store this in one global location or force each project to use, say, svn:externals to "tie" a library?
I've found that it is extremely hard to come up with a good strategy. My own pet theory is this:
Each common library has to have a super-thorough set of tests or else it should never be common, even if it means someone else duplicates the effort. Duplicate untested code is better than common untested code (you break only one project).
Each common library has to have a dedicated maintainer (can be offset by a really good test suite in a smaller team).
Each project should check out the version of the library that is known to work with it. This means a developer does not have to get pulled away to update API usage, as the common code gets updated. Which it will be. Every non-trivial piece of code evolves over months and years.
Thank you for your thoughts on this!
You have a competing set of goals here. First, a library of reusable components must be open enough that people from the other projects can easily add to it (or submit components to it). If it's too difficult for them to do that, they'll build their own libraries, and ignore the common one, leading to a lot of duplicate code and wasted effort. On the other hand, you want to control the development of the library enough that you can ensure its quality.
I've been in this position. There's no easy answer. However, there are some heuristics that can help.
Treat the library as an internal project. Release it on regular intervals. Ensure that it has a well-defined release procedure, complete with unit tests and quality assurance. And, most important, release often, so that new submissions to the library show up in the product frequently.
Provide incentives for people to contribute to the library, rather than just making their own internal libraries.
Make it easy for people to contribute to the library, and make the criteria clear-cut and well-defined (e.g., new classes must come with unit tests and documentation).
Put one or two developers in charge of the library, and (IMPORTANT!) allocate time for them to work on it. A library that is treated as an afterthought will quickly become an afterthought.
In short, model the development and maintenance of your internal library after a successful open source library project.
I don't agree with this:
Duplicate untested code is better than common untested code (you break only one project).
If you are all equally likely to create bugs by implementing the same thing, then you'll all have to fix potentially different bugs in each instance of the "duplicate" library.
It also seems that it'd be much faster/cheaper to write the library once and, instead of having multiple other teams write the same thing, have some resources allocated to testing.
Now to solve your actual problem: I'd mimic what we do with real third-party libraries. We use a particular version until we're ready, or compelled, to upgrade. I don't upgrade everything just because I can; there has to be a reason.
Once I see that reason (bug fix, new feature, etc.), then I upgrade with the risk that the new library may have new bugs or breaking changes.
So, your library project would continue development as necessary, without impacting individual teams until they were ready to "upgrade".
You could publish releases, or use svn branches/tags (or peg revisions), to help with all this.
If all teams have access to the bug tracker, they could easily see what known issues exist in the upgrade-candidate before they upgrade, too. Or, you could maintain that list yourself.
@Brian Clapper provides some excellent guidelines for how to run your library as a project in his answer.
I used to work in a similar situation to what you're describing, only my company had dozens of software products. I worked on the team that was responsible for maintaining and upgrading the core set of libraries that everyone else used.
We dealt with those scenarios as follows:
Test the heck out of the core libraries. Maintaining duplicate code is a nightmare. You're not just maintaining the core and one copy. Somewhere in your company's source control there are several copies of the same code. We had dozens of products, so that would have meant dozens of copies. Hunt them down and kill them.
We had a small team of 10-12 developers dedicated to maintaining the core library and its test suites. We were also responsible for fielding calls from the other 1100 developers in the company about how to use the core library, so as you can imagine, we were very busy.
Each other project needs to work with the version of the core library that it is known to work with. You can use version control branches to test new releases of the core library with old products to make sure you don't break code that works. If the core team does a thorough job of testing, this should go very smoothly. The only time this ever got really complicated for us was when the core API changed, or when we flat out screwed something up. Even if you're very confident in your core testing, use branches to test individual products.
I agree - this is difficult. In our small team (consulting, not a product company, which made it harder), we had one common component that stood out from the others. In this case the recipe for success was:
Make a good developer responsible for developing the component
Make a good developer the gatekeeper for maintaining the component
Make sure all upgrades (there were several) are backward compatible
Make sure there is some basic documentation (or a simple reference application) explaining how the component is to be used
Make sure all developers know that the component exists (!) and where they can find it (along with the code, if they wish to review it)
Give developers the ability to review the code and suggest better implementations or refactoring, but have the final mods go through an experienced gatekeeper. When the component was upgraded, older apps did not have to upgrade. If we did a new release, we evaluated whether we wanted to upgrade, and if we did, all we needed to do was swap the libraries - no code needed to change, unless we wanted to use some new features available through the upgrade. Resistance is inevitable, but sometimes it is a good sort of resistance when it comes from good developers who have better ideas for a new generation of, or a refactored, component.
Treat the development of the libraries like any other product. Each library has its own repository, its own releases and version numbers. The compiled and officially tested versions of the library are also kept in the repository. Document features and changes from version to version.
Then use the libraries like you would using third party libraries. Your product uses only fixed versions of the compiled libraries. Switch to a new version when you really need to and be aware that this involves more testing. Add the versions you use to your version control.
When you find a bug or require a new feature in a library, a new version or sub-version is created. Using a version control system like svn makes this easy. When you need the source code for debugging purposes, export it and include it in your projects, but do not change it there, but fix problems in the libraries' repositories.
This way, every team can contribute to the libraries without endangering the work of the other teams. Switching versions is done deliberately and not by accident.
Create an anti-corruption layer (DDD) for the existing library - this is nothing but a facade - and then write unit tests for this anti-corruption layer. Now, even if someone upgrades/updates the library, you will know if something is broken by running the unit tests.
These tests could also serve as documentation of the contract, and not every project that needs to use the library has to write this anti-corruption layer if they are using the exact same functionality.
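A minimal sketch of what such an anti-corruption layer and its contract test could look like (all names here are hypothetical, standing in for whichever library you are wrapping):

```cpp
#include <cassert>
#include <string>

namespace thirdparty {
    // Stand-in for the real third-party call that the rest of the code must not see directly.
    inline std::string parse_and_dump(const std::string& xml) { return xml; }
}

// The anti-corruption layer: the rest of the codebase talks only to this interface.
class XmlReader {
public:
    virtual ~XmlReader() = default;
    virtual std::string normalize(const std::string& xml) const = 0;
};

// The only class that knows about the third-party API.
class ThirdPartyXmlReader : public XmlReader {
public:
    std::string normalize(const std::string& xml) const override {
        return thirdparty::parse_and_dump(xml);
    }
};

int main() {
    ThirdPartyXmlReader reader;
    // Contract test: if a library upgrade changes this behaviour, the failure shows up here,
    // not scattered through every project that uses the library.
    assert(reader.normalize("<a/>") == "<a/>");
    return 0;
}
```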
"Duplication is the root of all evil"
Sounds to me like you need:
An artifact repository like Ivy so you can have the libraries shared and versioned with a distinction between versions that are API stable and ones that are "maturing"
Tests for the libraries
Tests for the projects using them
A continuous integration system so that when an incompatibility or bug is introduced both the project and the original library developer are notified
I think that one shared library is better than 3 duplicate ones (and 1 tested is definitely better than 3 untested). That's because when you find and fix a problem, it makes the whole application area more solid (and development and maintenance are more efficient).
BTW, that's one of the reasons (apart from contributing back to the community) why our company exposes our .NET shared libraries to the public as open source.
Plus, there's less code to write. And you can designate one dev to enforce good development practices on the library and its usages (i.e. through code contracts enforced via the unit tests within library consumers). This improves quality and reduces maintenance costs.
We store shared libraries as binaries in the solution. That comes from the logical requirement that any solution has to be atomic and independent (this rules out svn:externals links).
API compatibility is not an issue at all. Just let your integration server rebuild and retest the whole product stack (while updating all the inner references and propagating changes) and you'll always be sure that all internal APIs are solid. And whoever breaks the API has to either fix it or update the usages.
Duplication is the root of all evil
I would argue that unchecked government is the root of all evil :)
I do get a lot of flack for even suggesting that duplication should be an option. I understand why, but let me complicate this a bit.
Say you have a fairly large library that doesn't actually do anything in particular - it's just a collection of utilities. There are NO tests for this library - at all. You need only one function from it. Say, something that parses out a file's extension.
Pop quiz: do you just write something as small as this in your own project, or you bite the bullet and use the free-for-all untested set of utilities, which WILL break your application if someone breaks the function?
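For the sake of the pop quiz, the "something as small as this" might be no more than the following sketch (the function name and exact semantics are mine, not taken from any shared utilities library):

```cpp
#include <cassert>
#include <string>

// Returns the extension without the dot, or an empty string if there is none.
std::string file_extension(const std::string& path) {
    const std::string::size_type slash = path.find_last_of("/\\");
    const std::string::size_type dot   = path.find_last_of('.');
    if (dot == std::string::npos || (slash != std::string::npos && dot < slash))
        return "";
    return path.substr(dot + 1);
}

int main() {
    assert(file_extension("report.xml") == "xml");
    assert(file_extension("archive.tar.gz") == "gz");
    assert(file_extension("/tmp/no_extension").empty());
    return 0;
}
```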
Also, imagine you are in environment where writing tests is not part of the culture, since most projects are very intense and have a very short development span.
Duplicating large systems - such as client registration - would be dumb beyond belief, of course. However, aren't there cases where it is safer to duplicate something fairly small in your project if the alternative is not safe enough (no system for maintaining common code)?
Think of it this way - and this happens all the time - multiple contractors working on different projects, for the same company. They don't even know about each other.
My argument is this:
If a team cannot dedicate itself to maintaining a solid common codebase, or if the environment does not give them enough time to, it's best to let them work as separate "contractors".
You will STILL need to use large existing systems that simply cannot be duplicated.
Duplicating large systems - such as client registration - would be dumb beyond belief,
That's why those systems publish external interfaces.
If you define a library as shared code between projects: in my experience that's almost always bad. A project should be standalone, and updates for one project should not affect other projects.
Even if you start with libraries, you'll end up duplicating code anyway. Want to hotfix project 1? It was released with library 1.34, so to keep the hotfix as small as possible, you'll go back to library 1.34 and fix that. But hey, now you've done exactly what the library was supposed to avoid: you've duplicated the code.
Every developer uses Google to find code and copy it into his application. That's probably how they found Stack Overflow in the first place. Imagine what would happen if Stackoverflow published libraries instead of code snippets, and you'll get an idea of the problems that afflicts many well meaning library creators.
Libraries tend to be generic solutions to specific problems. Typically, the generic solution is more complex than the sum of the two specific solutions. This means you need one good programmer to solve a problem that could have been solved by two morons. Sounds like a bad tradeoff to me :D
I would like to point out a problem with the solutions suggested above: treating the library as an internal project with its own versioning scheme.
The problem
If your company has more than one product (let's say two teams, two products: A and B), then each product has its own release schedule. Let's give an example: team A is working on product A v1.6. Their release date is two weeks from now (say Oct 30th). Team B is working on product B v2.4. Their release date is 1.5 months from now - Nov 30th. Let's assume both are working against acme-commons-1.2-SNAPSHOT. Both are adding changes to acme-commons as they need them. A couple of days before Oct 30th, team B introduces a buggy change to acme-commons-1.2-SNAPSHOT. Team A goes into stress mode because they discover the bug one day before code freeze.
This scenario shows that treating a common library as a third-party library is almost impossible. The trivial, but problematic, solution is for each team to have its own copy of the version it is about to change. For example, team A (working on product A v1.6) would create a branch (and version) of acme-commons named "1.2-A-1.6". Team B would also create a branch in acme-commons called "1.2-B-2.4". Their development would never collide and they would be stress-free once they had tested their product.
Of course, someone will eventually have to merge their changes back to the original branch (let's say master or 1.2).
The problems I found with this solution is:
Branch inflation - the tree structure will be very puffy and it will be harder to understand the flow of changes/merges.
Merges back to 1.2 will probably never happen - unless a team/developer is dedicated to this library, the chances that team A or team B merges its code back to the 1.2 branch are slim. They will always stay focused on their own tasks, thus creating and using their own branch space. Allocating a developer/team is expensive, and thus not always a viable solution.
I'm still trying to figure this one out, so any thoughts on this matter are welcome.