Organizing solutions, projects and SVN - C++

I would like some help in setting up a project in SVN with regards to directory structure. I have read several answers regarding this on SO, but as I am new to this, most of them are difficult to understand.
I am building a couple of libraries, on which several other distinct projects depend:
I need the ability to export MyLibrary (headers and .lib only) easily for use by third parties.
MyLibrary1
    Depends on external libraries; should be able to manage different versions of these libraries!
MyLibrary2
    Depends on external libraries: fmod, glew, ...
Project 1, 2, 4, 5, 6 ...
    Depends on MyLibrary1, MyLibrary2, or both.
    Each project could need versions for multiple platforms (OS X, Windows, ...)
I would like to know of a good way to organize this; do keep in mind that I am rather new to this - a more pedantic answer would be helpful. For example, if you write something like /src, do explain what is supposed to go into it! I would be able to guess, but I won't be sure =)
Edit:
I can't put this into a comment, so here goes:
@J.N., thanks for the extensive reply. I would like to clarify some things; I hope I understood what you meant properly:
root
    library foo
        /branches        // old versions of foo
        /tags            // releases of foo
        /trunk           // current version
            /build           // stuff required by makefiles
            /tools           // scripts to launch tests etc.
            /data            // test data needed when running
            /output          // binaries, .exe files
            /dependencies    // libraries that foo needs
                /<lib name>
                    include
                    lib
            /docs            // documentation
            /releases        // generated archives
            /sample          // sample project that shows how to use foo
            /source          // *.h, *.cpp
    program bar
        /branches        // old versions of bar
        /tags            // releases of bar
        /trunk           // current version
            /build           // stuff required by makefiles
            /tools           // scripts to launch tests etc.
            /data            // test data needed when running
            /output          // binaries, .exe files
            /dependencies    // libraries that bar needs
                /<lib name>
                    include
                    lib
            /docs            // documentation
            /releases        // generated archives
            /sample          // sample project that shows how to use bar
            /source          // *.h, *.cpp
1) Where do the *.sln files go? In /build?
2) Do I need to copy foo/source into bar/dependencies/foo/include? After all, bar depends on foo.
3) Where do *.dll files go? If foo has dependencies on DLL files, then all programs using foo need access to the same DLLs. Should these go into root/dlls?

There are several levels to your questions: how to organize a single project's source tree, how to maintain the different projects together, how to maintain the dependencies of those projects, how to maintain different variants of each project, and how to package them.
Please keep in mind that whatever you do, your project will eventually grow large enough to outgrow whatever structure you choose. It's normal to change the structure several times in the lifetime of a project. You'll get the feeling that it isn't right anymore when that happens: it's usually when the setup is bothering you more than it helps.
1 - Maintaining the different variants of each project
Don't keep a separate copy of each project per variant; you won't handle several variants well by maintaining parallel versions or branches. Have a single source tree for every project/library that can be used for all variants. Don't manage different "OSes", manage different features. That is, have variants on things like "support posix sockets" or "support UI". That means that if a new OS comes along, you just need to choose the set of features it supports rather than starting a new version.
When variant-specific code is needed, create an interface (an abstract class in C++), and implement the behaviour behind it. That will isolate the problematic code and will help when adding new variants in the future. Use a macro to choose the proper implementation at compile time.
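For instance, a minimal sketch of that idea (the class and macro names below are hypothetical, keyed to a "support posix sockets" feature rather than an OS):

// A minimal sketch of the interface + compile-time selection idea.
class ISocket {
public:
    virtual ~ISocket() {}
    virtual bool connect(const char* host, int port) = 0;
};

#if defined(FOO_FEATURE_POSIX_SOCKETS)
class PosixSocket : public ISocket {
public:
    bool connect(const char* host, int port) { /* ::socket(), ::connect(), ... */ return false; }
};
typedef PosixSocket PlatformSocket;
#elif defined(FOO_FEATURE_WINSOCK)
class WinSocket : public ISocket {
public:
    bool connect(const char* host, int port) { /* WSAStartup(), ::connect(), ... */ return false; }
};
typedef WinSocket PlatformSocket;
#else
#error "No socket feature selected for this variant"
#endif

// The rest of the code only sees ISocket; the macro decides which
// implementation gets compiled in for a given variant.
ISocket* makeSocket() { return new PlatformSocket(); }

The rest of the code base never mentions an OS by name; adding a new platform is then a matter of picking the feature macros it supports.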
2 - Maintaining the dependencies of each project
Have a specific "dependencies" folder in which each subfolder contains everything needed for one dependency (that is, its includes and sub-dependencies). At the beginning, when the codebase is not too large, you don't need to care too much about automatically ensuring that all the dependencies are compatible with each other; save that for later.
Don't try to merge the dependencies from their root location higher in the SVN hierarchy. Formally deliver each new version to the teams needing it; it's up to them to update their own part of the SVN with it.
Don't try to use several versions of the same dependency at once. That will end badly. If you really need to (but try avoiding it as much as you can), branch your project for each version.
3 - Maintaining the different projects
I'd advise maintaining each project's repository independently (with SVN they could still be in the same repo, but in separate folders). Branches and tags should be specific to one project, not all of them. Try to limit the number of branches to a minimum; they don't solve problems (even with git). Use branches when you have to maintain different chronological versions in parallel (not variants), and fight back as much as you can before you actually do it; everybody will benefit from the use of the newer code.
Keeping projects separate will also allow you to impose security restrictions per project (not sure if that's feasible with vanilla SVN, but there are some freely available servers that support it).
I'd recommend sending email notifications to everybody potentially interested whenever someone commits on a project.
4 - Project source tree organization
Each project should have the following SVN structures:
trunk (current version)
branches (older versions, still in use)
tags (releases, used to create branches without thinking too much when patches are required)
When the project gets bigger, organize branches and tags in sub folders (for instance branches/V1.0/V1.1 and branches/V2.0/V2.1).
Have a root folder with the following subfolders: (some of this may be created by VC itself)
Build system (stuff required by your makefiles or others)
Tools (if any, like an XSLT tool or SOAP compiler, scripts to launch the tests)
Data (test data you need while running)
Output (where the build system put the binaries)
Temp Output (temporary files created by the compilation, optional)
Dependencies
Docs (if any ;), or generated docs
Releases (the generated archives; see later)
Sample (a small project that demonstrate how to use the project / library)
Source (I don't like to split headers and .cpp, but that's my way)
Avoid too many levels of subfolders; deep trees are hard to search, flat lists are easier
Define the build order of each folder properly (less necessary for VC, but still useful)
I make my namespaces match my folder names (an old Java habit, but it works)
Clearly define the "public" part that you need to export (see the export-header sketch just after this list)
If the project is large enough to hold several binaries / dlls each should have its own folder
Don't commit any binaries you generate, only the releases. Binaries like to conflict with each other and cause pain to the other people in the team.
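One common way to make that public part explicit in C++ is a small export header; the FOO_API macro below is a hypothetical example, not something this answer prescribes:

// foo_export.h -- hypothetical export header for a library called foo
#if defined(_WIN32)
  #if defined(FOO_BUILD_DLL)
    #define FOO_API __declspec(dllexport)   // set when building foo itself
  #else
    #define FOO_API __declspec(dllimport)   // consumers importing from the DLL
  #endif
#else
  #define FOO_API __attribute__((visibility("default")))  // GCC/Clang
#endif

// Only what is tagged FOO_API is part of the exported, public interface:
class FOO_API Renderer {
public:
    void draw();
};

Everything not tagged stays internal, which also makes it obvious which headers need to ship with the .lib.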
5 - Packaging the projects
First, make sure to include a text file with the SVN revision and the date; there's an automated way to do that with auto-props (SVN keyword expansion).
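As a sketch of one keyword-expansion route (file name hypothetical): a source file with the svn:keywords property set to "Revision Date" carries the information into the binary as well.

// version.cpp -- requires: svn propset svn:keywords "Revision Date" version.cpp
// (auto-props can apply this property automatically to matching files)
const char* kSvnRevision = "$Revision$";  // expanded by SVN in the working copy
const char* kSvnDate     = "$Date$";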
You should have a script to generate releases (if time allows). It will check that everything is committed, generate a new version number, etc. Create a zip/tar.gz archive that you must commit/archive, whose name contains the SVN revision, branch, and the current date (the format should be normalized across projects). The archive should have everything needed to run the app / use the library in a file structure. Create a tag so that you can start from it for emergency bug fixing.

Related

How to obtain statically linked dll dependencies automatically

When creating new projects I've always stumbled upon the issue that my final executable or DLL does not run properly because it is missing dependencies it uses.
In my life as a developer I've seen several approaches to handling that (of which I don't like any):
Set environment path to all those dependencies so that the OS can find them (bad, since very unportable, requires environment, won't work out of the box)
Copy every dependency of a potentially large package into the binary output directory (e.g. dozens of dlls regardless of need). For dependencies like OSG, Qt, etc this is quite odd as you typically won't link all dlls provided by larger packages and you may end up copying much more data than necessary.
Hand-pick individual dependencies (and optionally their PDBs) with a fine-grained xcopy/robocopy or similar task. (I don't like that because it needs attention: I add a dependency in Visual Studio and then I need to adjust some script.) There are tools like Dependency Walker which help with this, but it might still be unportable because your include path in VS may have a version in it; if you change that, you need to change it in the script as well, which is too much Department of Redundancy Department.
In larger projects or companies I would expect this issue to be quite common. Is there nothing in Visual Studio or in Windows that aids this? I need to pick my include paths and import lib paths properly in order to compile a binary, but then I need some other mechanism to actually make things work.
I am thinking about solving this problem in a more general way for more projects but really wonder if I am missing something.

Visual Studio Solution Dependencies

I'm working at an organization with a product suite based on several hundred Visual Studio solutions (mostly C++). Some of these solutions generate libraries that are used by other solutions, and there's also a common "include" folder containing headers that are shared by multiple modules.
The issue is that the dependencies are not explicitly stated anywhere, and the build system resolves dependencies by specifying a linear build order that makes sure the dependent modules get built at the right time. This works well for the build system but leaves developers at a disadvantage when trying to work on components with many direct and indirect external dependencies. For example, I might want to edit one of the library projects or shared headers and then build all the affected modules without necessarily knowing ahead of time which ones are affected. Another use case involves building a module after doing a fresh pull from TFS and having the modules it depends on built first without having to build the entire system.
I am wondering if there are any tools available that can automate dependency generation for building large projects. I have considered creating a few really big solutions that encapsulate the other solutions, but that seems really awkward and clumsy. Also, I don't like the idea of having developers manually specify dependencies, as it can be error prone, especially with such a large code base. I worked with scons a few years ago and really liked the way it could parse source files and automatically discover all the dependencies. Is there anything available today that can do the same thing with Visual Studio solutions?
This is not a duplicate of Visual Studio: how to handle project dependencies right?
I need to emphasize the magnitude of the problem I am trying to solve. This is a very large existing code base. In the main directory there are several hundred sub-folders, each one containing one or more VS solutions (not projects). Each solution, in turn, contains one or more projects. As I said before, I'm not trying to establish dependencies among a few projects in a solution. The problem is much bigger than that. I'm trying to find a way to establish dependencies among the solutions themselves (several hundred of them). For example, one solution may contain some projects that generate libraries for security, others for communications, etc. There may be, for example, dozens of solutions that use the communications libraries. So essentially I'm trying to create a directed acyclic graph with hundreds of nodes and potentially tens of thousands of edges.
You could use CMake (https://cmake.org/). With it, you can specify several libraries and apps to be built. Once configured, you can modify a project and the build will just update the dependent projects. CMake also provides a Visual Studio generator, so you can continue using that IDE.
A possible disadvantage for you is that, to configure, you must explicitly specify, for each project (library or executable), which projects it must be linked with and which folders it must include. There are ways to define some global includes and links, but whether they are useful will depend on your problem.
VS does track dependencies (by parsing source files). It doesn't make sense that something could automatically set the dependencies of your VS projects; in any other build tool you'd still have to specify, in some way, that to link project A.exe you need B.lib.
If you use newer VS versions you should simply add references to the libs in your exe/dll projects. If you manually added project dependencies, most likely you should remove them all; especially make sure you don't make static lib projects dependent on each other. VS allows you to do that (for example, if the build of one library generates source files that another static lib uses), but in general these shouldn't have any dependencies, and this allows VS to optimize builds by building them in parallel.
For example, commonly you could have some kind of Base.lib, then System.lib and Graphics.lib. All of these are used by your App.exe. System.lib uses code from Base.lib, Graphics.lib uses code from System.lib and Base.lib. So the dependency chain seems clear, and you go and set it in VS, and that's a mistake! In cases like this you should make the three libs independent in VS and only App.exe should depend on all of them (i.e. it should have references to all of them). VS will figure out the correct build order of these projects.
Regarding the CMake case: it simply generates VS projects and solutions; if you use VS, then CMake cannot do more than VS itself can.

g++: Use ZIP files as input

We have the Boost library on our side. It consists of a huge number of files which never change, and only a tiny portion of it is used. We swap the whole Boost directory when we change versions. Currently we have the Boost sources in our SVN, file by file, which makes checkout operations very slow, especially on Windows.
It would be nice if there were a notation / plugin to address C++ files inside ZIP files, something like:
// #ZIPFS ASSIGN 'boost' 'boost.zip/boost'
#include <boost/smart_ptr/shared_ptr.hpp>
Is there any support for compiler hooks in g++? Is there any effort regarding ZIP support? Other ideas?
I assume that make or a similar buildsystem is involved in the process of building your software. I'd put the zip file in the repository, and add a rule to the Makefile to extract it before the actual build starts.
For example, suppose your zip file is in the source tree at "external/boost.zip", and it shall be extracted to "external/boost", and it contains at its toplevel a file "boost_version.h".
# external/Makefile
unpack_boost: boost/boost_version.h

boost/boost_version.h: boost.zip
	unzip $<
I don't know the exact syntax of the unzip call, ask your manpage about this.
Then in other Makefiles, you can let your source files depend on the unpack_boost target in order to have make unpack Boost before a source file is compiled.
# src/Makefile (excerpt)
unpack_boost:
	make -C ../external unpack_boost

source_file.cpp: unpack_boost
If you're using a Makefile generator (or an entirely different buildsystem), please check the documentation for these programs for how to create something like the custom target unpack_boost. For example, in CMake, you can use the add_custom_command directive.
The fine print: the boost/boost_version.h file is not strictly necessary for the Makefile to work. You could just put the unzip command into the unpack_boost target, but then the target would effectively be phony, that is: it would be executed during each build. The file in between (which of course you need to replace with a file that is actually present in the zip archive) ensures that unzip only runs when necessary.
A year ago I was in the same position as you. We kept our source in SVN and, even worse, included boost in the same repository (same branch) as our own code. Trying to work on multiple branches was impossible, as it would take most of a day to check-out a fresh working copy. Moving boost into a separate vendor repository helped, but it would still take hours to check-out.
I switched the team over to git. To give you an idea of how much better it is than SVN, I have just created a repository containing the boost 1.45.0 release, then cloned it over the network. (Cloning copies all of the repository history, which in this case is a single commit, and creates a working copy.)
That clone took six minutes.
In the first six seconds a compressed copy of the repository was copied to my machine. The rest of the time was spent writing all of those tiny files.
I heartily recommend that you try git. The learning curve is steep, but I doubt you'll get much pre-compiler hacking done in the time it would take to clone a copy of boost.
We've been facing similar issues in our company. Managing boost versions in build environments is never going to be easy. With 10+ developers, all coding on their own system(s), you will need some kind of automation.
First, I don't think it's a good idea to store copies of big libraries like Boost in SVN or any SCM system for that matter; that's not what those systems are designed for, unless you plan to modify the Boost code yourself. But let's assume you're not doing that.
Here's how we manage it now, after trying lots of different methods; this works best for us.
For every version of boost that we use, we put the whole tree (unzipped) on a file server and we add extra subdirectories, one for each architecture/compiler-combination, where we put the compiled libraries.
We keep copies of these trees on every build system and in the global system environment we add variables like:
BOOST_1_48=C:\boost\1.48 # Windows environment var
or
BOOST_1_48=/usr/local/boost/1.48 # Linux environment var, e.g. in /etc/profile.d/boost.sh
This directory contains the boost tree (boost/*.hpp) and the added precompiled libs (e.g. lib/win/x64/msvc2010/libboost_system*.lib, ...)
All build configurations (vs solutions, vs property files, gnu makefiles, ...) define an internal variable, importing the environment vars, like:
BOOSTROOT=$(BOOST_1_48) # e.g. in a Makefile, or an included Makefile
and further build rules all use the BOOSTROOT setting for defining include paths and library search paths, e.g.
CXXFLAGS += -I$(BOOSTROOT)
LFLAGS += -L$(BOOSTROOT)/lib/linux/x64/ubuntu/precise
LFLAGS += -lboost_date_time
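To sanity-check that the wiring works, a trivial translation unit built against BOOSTROOT is enough (file name and build line here are just an example, not part of our actual setup):

// boost_check.cpp -- e.g.  g++ -I$(BOOSTROOT) boost_check.cpp -o boost_check
#include <boost/shared_ptr.hpp>
#include <iostream>

int main() {
    boost::shared_ptr<int> p(new int(42));   // header-only, no -lboost_* needed here
    std::cout << *p << std::endl;
    return 0;
}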
The reason for keeping local copies of boost is compilation speed. It takes up quite a bit of disk space, especially the compiled libs, but storage is cheap and a developer losing lots of time compiling code is not. Plus, this only needs to be copied once.
The reason for using global environment vars is that build configurations are transferable from one system to another, and can thus be safely checked in to your SCM system.
To smooth things a bit, we've developed a little tool that takes care of the copying and of setting the global environment. With a CLI, this can even be included in the build process.
Different working environments mean different rules and cultures, but believe me, we've tried lots of things and finally, we decided to define some kind of convention. Maybe ours can inspire you...
This is something you would not do in g++, because any other application that wants to do it would also have to be modified.
Store the files on a compressed filesystem. Then every application gets the benefit automatically.
It should be possible in an OS to allow transparent access to files inside a ZIP file. I know that I put it in the design of my own OS a long time ago (2004 or so) but never got it to a point where it was usable. The downside is that seeking backwards in a file inside a ZIP is slower as it's compressed (and you can't rewind the compressor state, so you have to seek from the start instead). This also makes using a zip-inside-a-zip slow for rewinding and reading. Fortunately, most cases just read a file sequentially.
It should also be retrofittable to current OSes, at least in user space. You can hook the filesystem access functions used (fopen, open, ...) and add a set of virtual file descriptors that your own software would return for a given filename. If it's a real file, just pass it on; if it's not, open the underlying archive (possibly again via this very function) and return a virtual handle. When accessing the file contents, read directly from the zip file without caching.
On Linux you would use an LD_PRELOAD to inject it into existing software (at usage time), on Windows you can hook the system calls or inject a DLL into the space of software to hook the same functions.
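A minimal sketch of that LD_PRELOAD approach on Linux (the zip lookup itself is left as a stub; a real interposer needs far more care):

// zipfs_preload.cpp -- build: g++ -shared -fPIC zipfs_preload.cpp -o zipfs_preload.so -ldl
// Use:   LD_PRELOAD=./zipfs_preload.so some_program
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // for RTLD_NEXT
#endif
#include <cstdio>
#include <cstring>
#include <dlfcn.h>

extern "C" FILE* fopen(const char* path, const char* mode) {
    typedef FILE* (*fopen_fn)(const char*, const char*);
    // Look up the real libc fopen the first time we are called.
    static fopen_fn real_fopen = (fopen_fn)dlsym(RTLD_NEXT, "fopen");

    if (std::strstr(path, ".zip/") != 0) {
        // Hypothetical: locate the member inside the archive and return a
        // FILE* backed by its (decompressed) contents instead of failing.
        // return open_from_zip(path, mode);
    }
    return real_fopen(path, mode);  // real file: pass it straight through
}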
Does anybody know if this already exists? I can't see any clear reason it wouldn't...

Managing library dependencies using git

I have a project which is built for multiple OSes (Linux and Windows for now, maybe OS X) and processors. For this project I have a handful of library dependencies, which are mainly external, but I also have a couple of internal ones, in source form, which I compile (cross-compile) for each OS/processor combination possible in my context.
Most of the external libraries are not changed very often, only for a local bugfix or for some feature/bugfix implemented in a newer version that I think may benefit the project. The internal libraries change quite often (1-month cycles) and are provided by another team in my company in binary form, although I also have access to the source code, and if I need a bug to be fixed I can do that and generate new binaries for my own use until the next release cycle. The setup I have right now is the following (filesystem only):
dependencies
    library_A_v1.0
        include
        lib
    library_A_v1.2
        include
        lib
    library_B
        include
        lib
    ...
The libraries are kept on a server and every time I make an update I have to copy any new binaries and header files on the server. The synchronization on the client side is done using a file synchronization utility. Of course any updates to the libraries need to be announced to the other developers and everyone has to remember to synchronize their "dependencies" folder.
Needless to say, I don't like this scheme very much. So I was thinking of putting my libraries under version control (git). Build them, pack them in a tgz/zip, and push them to the repo. Each library would have its own git repository so that I could easily tag/branch already-used versions and test-drive new versions. A "stream" of data for each library that I could easily get, combine, and update. I would like to have the following:
get rid of this plain-filesystem way of keeping the libraries; right now completely separate folders are kept and managed for each OS and each version, and sometimes they get out of sync, resulting in a mess
more control over it, to be able to have a clear history of which versions of the libs we used for which version of our project; much like what we can obtain from git(VCS) with our source code
be able to tag/branch the versions of the dependencies I'm using (for each and every one of them); I have my v2.0.0 tag/branch for library_A from which I normally take it for my project, but I would like to test-drive the 2.1.0 version, so I just build it, push it to the server on a different branch, and call my build script with this particular dependency pointing to the new branch
have simpler build scripts - just pull the sources from the server, pull the dependencies and build; that would also allow using different versions of the same library for different processor/OS combinations (we need that more often than not)
I tried to find some alternatives to the direct git based solution but without much success - like git-annex which kind of seems overly complicated for what I'm trying to do.
What I'm facing right now is the fact that there seems to be a very strong opinion against putting binary files under git or any VCS (although technically I would also have header files; I could also push the folder structure that I described directly to git to avoid the tgz/zip, but I would still have the library binaries), and some of my colleagues, driven by that shared strong opinion, are against this scheme of things. I perfectly understand that git tracks content and not files, but to some extent I would be tracking content as well, and I believe it would definitely be an improvement over the scheme we have right now.
What would be a better solution to this situation? Do you know of any alternatives to the git(VCS) based scheme of things? Would it be such a monstrous thing to have my scheme under git :)? Please share your opinions and especially your experience in handling these types of situations.
Thanks
An alternative, which would still fit your project, would be to use git-annex, which would allow you to track the header files while keeping the binaries stored elsewhere.
Then each git repo can be added as a submodule to your main project.

Organising .libs in a codebase of several C++ projects

Let's say you have several bespoke C++ projects in separate repositories or top-level directories in the same repository. Maybe 10 are library projects for stuff like graphics, database, maths, etc., and 2 are actual applications using those libraries.
What's the best way to organise those 2 application projects to have the .libs they need?
Each lib project builds the .lib in its own directory, developers have to copy these across to the application area manually and make sure to get the right version
Application projects expect lib projects to be in particular paths and look for .libs inside those locations
A common /libs directory is used by all projects
Something else
This is focused on C++, but I think it's pretty similar with other languages, for instance organising JARs in a Java project.
I'd suggest this approach:
Organise your code in a root folder. Let's call it code.
Now put your projects and libraries as subfolders (e.g. Projects and Libraries).
Build your libraries as normal and add a post-build step that copies the resulting headers and .lib files into a set of shared folders. For example, Libraries\include and Libraries\lib. It's a good idea to use subfolders or a naming convention (myLib.lib, myLib_d.lib) to differentiate different builds (e.g. debug and release) so that any lib reference explicitly targets a single file that can never be mixed up. It sucks when you accidentally link against the wrong variant of a lib!
You can also copy third-party libraries that you use into these folders as well.
Note: To keep them organised, include your files with #include "Math\Utils.h" rather than just "Utils.h". And put the headers for the whole Math library into include\Math, rather than dropping them all in the root of the include folder. This way you can have many libraries without name clashes. It also lets you have different versions of libraries (e.g. Photoshop 7, Photoshop 8) which allows you to multi-target your code at different runtime environments.
Then set up your projects to reference the libraries in one of two ways:
1) Tell your IDE/compiler where the libs are using its global lib/include paths. This means you set up the IDE once on each PC and never have to specify where the libs are for any projects.
2) Or, set each project to reference the libs with its own lib/include paths. This gives you more flexibility and avoids the need to set up every PC, but means you have to set the same paths in every new project.
(Which is best depends on the number of projects versus the number of developer PCs)
And the most important part: When you reference the includes/libs, use relative paths. e.g. from Projects\WebApp\WebApp.proj, use "..\..\Libraries\include" rather than "C:\Code\Libraries\Include". This will allow other developers and your buildserver to have the source code elsewhere (D:\MyWork instead of C:\Code) for convenience. If you don't do this, it'll bite you one day when you find a developer without enough disk space on C:\ or if you want to branch your source control.