Our codebase has thousands of lines of legacy code. Over time, different developers have coded to their own standards and preferences. One badly implemented piece is a common header that is declared and defined in different directories, to be linked into different binaries, with only small differences between the copies. For example:
dir1/xxx.h
class ABC{
public:
int init();
};
dir1/xxx.cpp
int ABC::init() { /* ... */ }
Similarly, dir2 has its own copy.
Developers wanted to keep different versions, primarily so they would know when they need to call the source code under dir1 versus dir2, with each copy being modified independently of the other.
Now the issue is the hierarchy of how we link code into our binary. The header file in question uses the same include guard (#ifndef .. #define .. #endif) in every copy. The code built from each copy gets archived into lib1.a, lib2.a, and so on. Hence, when we link our binary, if we happen to need the version from lib3.a, we have to make sure it is linked first:
ld .. lib1.a lib2.a lib3.a -- with this order, the definition we actually need does not get linked. Note that all the .a archives also have additional interfaces compiled and linked into them.
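To make the collision concrete, here is a minimal sketch (guard and file names are illustrative) of how two copies with the same include guard and the same symbol names end up depending on archive order:

// dir1/xxx.h and dir2/xxx.h -- both copies carry the identical guard,
// so whichever is found first on the include path hides the other:
#ifndef XXX_H
#define XXX_H
class ABC {
public:
    int init();
};
#endif

// dir1/xxx.cpp, archived into lib1.a
int ABC::init() { /* dir1 behaviour */ return 1; }

// dir2/xxx.cpp, archived into lib2.a
int ABC::init() { /* dir2 behaviour */ return 2; }

Both archives export the same symbol ABC::init(), so the linker satisfies the reference from whichever archive it searches first; nothing in the code itself says which variant you will get.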
The unfortunate part is that the header in question contains common declarations (each copy defines the same methods, but the implementations differ slightly).
How can we resolve this issue? Introducing namespaces would mean a lot of rework across our codebase. Is there a better way to do it?
What would be the best design for such a codebase, so that from now on no developer can accidentally include the wrong copy of these conflicting signatures?
Please help
There are several approaches to sharing code between developers:
Let everyone share the same code. Make a team responsible for the shared code, and if they make changes to the shared code, make sure that these changes (e.g. an extra argument to a method) are 'propagated' to all the applications using the shared code.
Alternatively, you can make 'everybody' responsible for the shared code, but even then, if the shared code is changed, the developer who made the change should propagate it to all other applications.
In this approach you can still choose to distribute the shared code as a LIB, or as a DLL.
Give everyone their own copy of the shared code. At the same time, make a 'central version' of the shared code, and make this 'central version' the 'trunk'. This means, whenever the shared code needs to be changed, it is this 'central version' that is changed. All the local copies of the shared code in the applications are not changed.
Additionally, assign the task of 'Integration Manager' to a member of every application team. He/She will be responsible for bringing in new versions of the shared code, from the central version to the local copy. He/She will have to make changes in the application to make sure that the application still works with the new copy, and make sure the application is re-tested with the new shared code version.
If at all possible, you really need to rethink the basic practice, and undo it if you can. If this same basic header and class (or set of classes) is used in all the different library projects, even with minor variations, there ought to be some way to harmonize those variations into a proper class hierarchy such that a single library, with subclasses implementing slightly different variations as necessary, replaces multiple copies.
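As a rough sketch of what such harmonization could look like, assuming the variants really only differ in behaviour and not in interface (names below are made up):

// common/xxx.h -- one shared header replacing dir1/xxx.h and dir2/xxx.h
#ifndef COMMON_XXX_H
#define COMMON_XXX_H

class ABC {
public:
    virtual ~ABC() {}
    virtual int init();        // shared default behaviour
};

class ABCDir1 : public ABC {   // what used to live in dir1
public:
    virtual int init();
};

class ABCDir2 : public ABC {   // what used to live in dir2
public:
    virtual int init();
};

#endif

Each binary then links a single library and instantiates the variant it needs; because every class name is unique, link order no longer decides which init() gets called, and no one can accidentally pick up the wrong definition.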
How can I add boost.asio to a Windows universal project's shared components?
Do I need to create a separate project and include the header files there, or is there a simpler way?
Thanks!
While I can't get into the specifics of Universal Apps too much (I'm not an authority on that subject), I can tell you this: boost::asio is a header-only library. That means that by simply including the headers in your C++ project, that code is merged directly into your main assembly. I highly recommend using it in this way.
If you're going to include this header-only library in another DLL that you then include in your main app, things are going to get messy. First, you have the headache of building binaries for each target (x86, x64 and ARM) and maintaining those dependencies, but beyond that, the real headache is what you need to go through to make boost::asio function when being loaded from a shared assembly.
In order to do this, you need to define a special static member of ::asio, called winsock_init, in your code. ::asio uses an internal, static, customized reference counter using interlocked exchanges to track its own usage. When the counter is incremented above zero, calls to things such as WSAStartup() are made to ensure that the library plays nice with Winsock. When the counter reaches zero again, WSACleanup() is called for the same reason.
The structure winsock_init circumvents this functionality, so it's up to you to correctly, manually call these functions from within your shared assembly, otherwise you're going to completely break ASIO AND your application will fail compliance testing for app store deployment.
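As a rough sketch of what "manually call these functions" can amount to, assuming your shared assembly exposes explicit start-up/shutdown entry points (the exported function names here are hypothetical):

// Inside the shared assembly that wraps ::asio.
#include <winsock2.h>
#pragma comment(lib, "ws2_32.lib")

extern "C" __declspec(dllexport) bool MyAsioDllStartup()
{
    WSADATA wsaData;
    // Request Winsock 2.2; every successful call must later be
    // balanced by WSACleanup().
    return WSAStartup(MAKEWORD(2, 2), &wsaData) == 0;
}

extern "C" __declspec(dllexport) void MyAsioDllShutdown()
{
    WSACleanup();
}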
Also, whenever you try to wrap ::asio into a shared assembly you need to include special source files one time only, within the dll and then you need to define a bunch of special boost config variables both in this DLL project and any project that uses this ::asio dll.
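For reference, the "special source file" and config defines usually boil down to something like the sketch below; treat the macro names as pointers into the asio documentation rather than a guaranteed recipe, and check them against your Boost version:

// asio_impl.cpp -- compiled exactly once, inside the DLL project only
#define BOOST_ASIO_SEPARATE_COMPILATION   // build asio's non-template code here
#include <boost/asio/impl/src.hpp>        // asio's separate-compilation source

The same BOOST_ASIO_SEPARATE_COMPILATION define (plus whatever dynamic-linking defines your setup needs) then has to appear in the preprocessor settings of the DLL project and of every project that consumes it, which is exactly the per-project bookkeeping described above.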
My advice again is to simply include the headers alone in your primary assembly and then you're not introducing all of these headaches. Another alternative is to simply use C++/CLI or Managed C++, whatever it's called these days, and directly access the .NET socket classes from within your mixed C++ code.
See here for more details about compiling ASIO into a separate assembly if you really want to suffer all the pain I've described.
What is the best practice for adding or modifying a single class method in a well-established C++ library like OpenCV, while still reusing the rest of the library code, preferably in .lib form?
At this point the only way I know is to copy all the source and header files that belong to the specific library (let's say OpenCV's core library) to the current source folder, modify that one function and recompile the module with the rest of the code. Ideally, I want to be able to link all the current .lib files the way they are, but simply define a new method (or modify a current method) for a class defined inside those libs in a way that my implementation of the method supersedes the implementation of the default library files.
Inheritance doesn't always seem to be an option, since sometimes the base class has private members that are required for the correct inherited class implementation.
I'm not aware of a clean way in C++ to accomplish what you're asking. What you're really asking to do (given that you need to use or modify private methods) is violate encapsulation, and the C++ language is designed to not let you do that.
A few options exist:
A .lib file is simply a collection of .obj files. Your compiler toolchain should have a command-line program for adding, deleting, and replacing .obj files in a .lib, so you could build one or two .obj files and merge them into the .lib. I suspect that this solution would be ugly and fragile.
If there's something that the library doesn't do and should do, then there's always a chance that you can submit a patch or feature request to the library authors to get that change made. Of course, this can take a while, if it works at all.
As #fatih_k suggests, adding your changes as friend classes would work. If the only change you make to OpenCV is to add a friend line to the header file, then the library's ABI will be unchanged, and you won't have to touch the .lib.
The cleanest option is to simply accept that you need to modify the OpenCV library, track its source code (along with your modifications) alongside the source code you develop yourself, and build it as part of your own build. This is a very common approach, and various patterns and techniques exist to help you do it; for example, Subversion has the concept of vendor branches. This approach is more work to set up but is definitely the cleanest in the long run.
If the library is already compiled, there is not much you can do portably and cleanly.
If you know the specific target architecture the program will be running on, you could get the pointer to the member function and monkey-patch the instructions with a jmp to your own version of the method. If the method is virtual, you can modify the vtable instead. Both require a lot of compiler-specific knowledge and would not be portable.
If the library ships as a link archive (.lib/.a), you could extract the archive, replace the object file containing the method with your own version, and repack the archive.
Another method is to copy the class's declaration from the header and add a friend declaration. Alternatively, you can #define private public or #define private protected before including the header file. These will give you access to its private members.
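A minimal sketch of the friend variant (the class here is a made-up stand-in, not actual OpenCV code):

// Patched copy of the library header:
class SomeLibClass {
    friend class SomeLibClassPatch;   // the only line added to the header
public:
    SomeLibClass() : hidden_state_(42) {}
private:
    int hidden_state_;
};

// Your own code, compiled outside the library:
class SomeLibClassPatch {
public:
    // May read (or modify) the private member thanks to the friend declaration.
    static int HiddenState(const SomeLibClass& obj) { return obj.hidden_state_; }
};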
With any of the above, you need to be careful that your changes do not modify the ABI of the library.
Well, OpenCV is licensed under BSD so you can make your changes without worries about republishing them.
You could always follow the Proxy design pattern and add the new method external to the library, and call into the library from there. That means you don't need to worry about maintaining your own version of OpenCV and distributing it as well. There's more information about Proxy patterns on Wiki to get you started.
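As an illustration, the proxy can be as thin as a wrapper that holds the library object and adds the missing operation next to the forwarded ones (the extra method below is hypothetical, not real OpenCV API):

#include <opencv2/core/core.hpp>

// Hypothetical proxy adding one operation on top of cv::Mat without
// touching the OpenCV sources or binaries.
class MatProxy {
public:
    explicit MatProxy(const cv::Mat& mat) : mat_(mat) {}

    // Existing behaviour is simply forwarded to the library...
    int rows() const { return mat_.rows; }
    int cols() const { return mat_.cols; }

    // ...while the "new method" lives entirely in your own code.
    bool isSquare() const { return mat_.rows == mat_.cols; }

private:
    cv::Mat mat_;   // cv::Mat copies share data, so this is cheap
};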
We have a common functionality we need to share among several applications. We already have a few internal libraries, into which we put common code with a well-defined interface. Sometimes, though, there are problems with some code (typically a single or a few .cpp files) as it doesn't fit into an existing library and it is too small to make a new one.
Our current version control system supports file sharing, so usually such files are just shared between the applications that use them. I tend to consider it a bad thing, but actually, it makes it quite clear, as you can see exactly in which applications they are used.
Now we are moving to svn, which does not have "real" file sharing; there is this svn:externals stuff, but will it still be simple to track the places where the files are shared when using it?
We could create a "garbage" library (or folder) and put such files there temporarily, but it's always the same problem: it complicates dependency tracking (which projects use this file?).
Otherwise, are there other good solutions? How does it work in your company?
Why don't you just create a folder in SVN called "Shared" and put your shared files into that? You can include the shared files into your projects from there.
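For example, wiring a project to that folder is one property away (the repository URL below is made up; point it at your own server):

# Run in the root of a project that needs the shared files.
svn propset svn:externals "Shared http://svn.example.com/repo/trunk/Shared" .
svn update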
Update:
Seems like you are looking for a 3rd party tool that tracks dependencies.
Subversion and dependencies
You can only find out where a file is used by looking at all repositories.
We have a (very large) existing codebase for a custom ActiveX control, and I'd like to integrate libkml into it for the sake of interacting with KML mapping data, rather than reinventing the wheel. The problem is, I'm a relatively new Windows developer, and coming from the Linux world, I'm really not sure what the right way of integrating a third party library is. Thankfully, libkml does provide MSVC projects for compiling it, so porting isn't a problem. I guess I have a couple choices that I can think of:
Build and link the library directly. We already have a solution with project files in it for the "main" project; I could add the libkml projects to that solution, but I'd rather not. It's very unlikely that the libkml code will change in relation to our app's code.
Statically link to the .lib files produced by the libkml build. This is unattractive, since there are six .lib files that come out of the libkml solution and it seems inelegant to manually specify them in the linker options, etc.
Package the code as-is in a DLL. Maybe with COM? It seems like if I did this without any translation, I'd end up with a lot of overhead, and since I'm fairly unfamiliar with COM, I don't know how much work would be involved in exposing all the functionality I'd like to use via COM. The library is fairly big, has a lot of classes it uses, and if I had to manually write code to expose it all, I'd be hesitant to go this route.
Write wrapper code to abstract the functionality I need, package that in a COM DLL, and interact with that. This seems sensible, I suppose, but it's difficult to determine how much abstraction I need since I haven't written the code that would use libkml yet.
Let me reiterate: I haven't written the code that will interact with libkml yet, so this is mostly experimental. Options 1 and 2 are also complicated by the fact that libkml relies additionally on three more external libraries that are also in .lib files (that I had to recompile anyways to get the code generation flags to line up). The goal obviously is to get the code to work, but maintainability and source tree organization are also goals, so I'm leaning towards options 3 and 4, but I don't know the best way to approach those on Windows.
Typing six file names, or using the declarative style with #pragma comment(lib, "foo.lib") is small potatoes compared to the work you'll have to do to turn this into a DLL or COM server.
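For what it's worth, the declarative form is only a handful of lines in one source file; the library names below are placeholders for whatever your libkml build actually produced:

// libkml_libs.cpp (or a precompiled header) in the main project.
// Names are illustrative -- use the .lib files your libkml solution outputs.
#pragma comment(lib, "libkmlbase.lib")
#pragma comment(lib, "libkmldom.lib")
#pragma comment(lib, "libkmlengine.lib")
// ...and so on for the remaining libkml .lib files and the three
// additional third-party dependencies.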
The distribution is heavily biased towards using this as a static link library. There are only spotty declarations available to turn this into a DLL with __declspec(dllexport), and they exist only in the 3rd-party dependencies. All use different #defines of course, so you'll be typing a bunch of names into the preprocessor definitions for the projects.
Furthermore, you'll have a hard time actually getting this DLL loaded at runtime since you are using it in a COM server. The search path for DLLs will be the client app's when COM creates your control instance, not likely to be anywhere near close to the place you deployed the DLL.
Making it a COM server is a lot of work, you'll have to write all the interface glue yourself. Again, nothing already in the source code that helps with this at all.
You can also wrap all the functionality you need in a non-COM-dll. Visual studio supports creating a static wrapper library which, when linked, will make your program use the dll. This way you only have one dependency to specify instead of six.
Other than that, what is wrong with specifying six dependencies? I would assume that there is a good reason these are six separate libraries instead of one, so it is prudent to specify exactly which parts you actually use.
Maybe I'm missing something here, but I really don't see what is wrong with (1). I think that even if you had multiple projects that were using libkml, just insert the project file for libkml into your solution file, specify the dependencies, and you should be done. It's dead simple. Even solution (2) is dead simple. If the libraries ever change, you rebuild - you're going to need to do that anyway.
I'm failing to see how (3) or (4) are necessary or even desired. To me, it sounds like a lot of work for goals (source tree organization and maintainability) that I'm not even sure that those options really meet. In fact, you said yourself that "It's very unlikely that the libkml code will change in relation to our app's code."
What I've found over the years is to just keep things simple. If rebuilding KML is potentially time consuming, grab the libs and just statically link to the libraries. Yes, there are other dependencies, but you'll set this up once and be done, hopefully never to worry about it again. Otherwise, stick it in the project and move on. I think that it's worthwhile to ask whether spending a lot of time on this issue is worth the trouble.
General question:
For unmanaged C++, what's better for internal code sharing?
Reuse code by sharing the actual source code? OR
Reuse code by sharing the library / dynamic library (+ all the header files)
Whichever it is: what's your strategy for reducing duplicate code (copy-paste syndrome), code bloat?
Specific example:
Here's how we share the code in my organization:
We reuse code by sharing the actual source code.
We develop on Windows using VS2008, though our project actually needs to be cross-platform. We have many projects (.vcproj) committed to the repository; some have their own repository, some are part of a larger one. For each deliverable solution (.sln) (e.g. something that we deliver to the customer), it will svn:externals all the necessary projects (.vcproj) from the repository to assemble the "final" product.
This works fine, but I'm quite worried that eventually the code size for each solution could get huge (right now our total code size is about 75K SLOC).
Also one thing to note is that we prevent all transitive dependencies. That is, each project (.vcproj) that is not an actual solution (.sln) is not allowed to svn:externals any other project, even if it depends on it. This is because you could have 2 projects (.vcproj) that might depend on the same library (i.e. Boost) or project (.vcproj); thus, when you svn:externals both projects into a single solution, svn:externals will pull it in twice. So we carefully document all dependencies for each project, and it's up to the person who creates the solution (.sln) to ensure all dependencies (including transitive ones) are svn:externals'd as part of the solution.
If we reuse code by using .lib / .dll files instead, this would obviously reduce the code size for each solution, as well as eliminate the transitive dependency problem mentioned above where applicable (exceptions are, for example, third-party libraries/frameworks that use DLLs, like Intel TBB and the default Qt build).
Addendum: (read if you wish)
Another motivation to share source code might be summed up best by Dr. GUI:
On top of that, what C++ makes easy is not creation of reusable binary components; rather, C++ makes it relatively easy to reuse source code. Note that most major C++ libraries are shipped in source form, not compiled form. It's all too often necessary to look at that source in order to inherit correctly from an object—and it's all too easy (and often necessary) to rely on implementation details of the original library when you reuse it. As if that isn't bad enough, it's often tempting (or necessary) to modify the original source and do a private build of the library. (How many private builds of MFC are there? The world will never know...)
Maybe this is why, when you look at libraries like the Intel Math Kernel Library, in their "lib" folder they have "vc7", "vc8", "vc9" folders for each Visual Studio version. Scary stuff.
Or how about this assertion:
C++ is notoriously non-accommodating when it comes to plugins. C++ is extremely platform-specific and compiler-specific. The C++ standard doesn't specify an Application Binary Interface (ABI), which means that C++ libraries from different compilers or even different versions of the same compiler are incompatible. Add to that the fact that C++ has no concept of dynamic loading and each platform provides its own solution (incompatible with others) and you get the picture.
What are your thoughts on the above assertion? Does something like Java or .NET face these kinds of problems? E.g., if I produce a JAR file from NetBeans, will it work if I import it into IntelliJ, as long as I ensure that both have compatible JRE/JDK versions?
People seem to think that C specifies an ABI. It doesn't, and I'm not aware of any standardised compiled language that does. To answer your main question, use of libraries is of course the way to go - I can't imagine doing anything else.
One good reason to share the source code: Templates are one of C++'s best features because they are an elegant way around the rigidity of static typing, but by their nature are a source-level construct. If you focus on binary-level interfaces instead of source-level interfaces, your use of templates will be limited.
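A tiny example of the point: a template like the one below only exists at the source level, and each client instantiates it with types the library author never saw, which is something a pre-built binary interface cannot anticipate.

// Shipped as a header -- i.e. as source -- or not at all.
template <typename T>
T clamp_to(T value, T lo, T hi)
{
    return value < lo ? lo : (hi < value ? hi : value);
}

// Each consumer stamps out its own instantiations at compile time:
//   clamp_to(0.75, 0.0, 1.0);     // double
//   clamp_to<short>(42, 0, 10);   // short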
We do the same. Trying to use binaries can be a real problem if you need to use shared code on different platforms, build environments, or even if you need different build options such as static vs. dynamic linking to the C runtime, different structure packing settings, etc.
I typically set projects up to build as much from source on-demand as possible, even with third-party code such as zlib and libpng. For those things that must be built separately, e.g. Boost, I typically have to build 4 or 8 different sets of binaries for the various combinations of settings needed (debug/release, VS7.1/VS9, static/dynamic), and manage the binaries along with the debugging information files in source control.
Of course, if everyone sharing your code is using the same tools on the same platform with the same options, then it's a different story.
I never saw shared libraries as a way to reuse code from an old project into a new one. I always thought it was more about sharing a library between different applications that you're developing at about the same time, to minimize bloat.
As far as copy-paste syndrome goes, if I copy and paste it in more than a couple places, it needs to be its own function. That's independent of whether the library is shared or not.
When we reuse code from an old project, we always bring it in as source. There's always something that needs tweaking, and it's usually safer to tweak a project-specific version than to tweak a shared version that can wind up breaking the previous project. Going back and fixing the previous project is out of the question because 1) it worked (and shipped) already, 2) it's no longer funded, and 3) the test hardware needed may no longer be available.
For example, we had a communication library that had an API for sending a "message", a block of data with a message ID, over a socket, pipe, whatever:
void Foo::Send(unsigned messageID, const void* buffer, size_t bufSize);
But in a later project, we needed an optimization: the message needed to consist of several blocks of data in different parts of memory concatenated together, and we couldn't (and didn't want to, anyway) do the pointer math to create the data in its "assembled" form in the first place, and the process of copying the parts together into a unified buffer was taking too long. So we added a new API:
void Foo::SendMultiple(unsigned messageID, const void** buffer, size_t* bufSize);
Which would assemble the buffers into a message and send it. (The base class's method allocated a temporary buffer, copied the parts together, and called Foo::Send(); subclasses could use this as a default or override it with their own, e.g. the class that sent the message on a socket would just call send() for each buffer, eliminating a lot of copies.)
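A sketch of that default implementation; since the signature above doesn't show how the number of buffers is conveyed, this version assumes the buffer array is terminated by a null pointer:

#include <cstring>   // std::memcpy
#include <vector>

class Foo {
public:
    virtual ~Foo() {}
    virtual void Send(unsigned messageID, const void* buffer, size_t bufSize);
    virtual void SendMultiple(unsigned messageID, const void** buffers, size_t* bufSizes);
};

// Default: copy the parts into one contiguous buffer, then reuse Send().
void Foo::SendMultiple(unsigned messageID, const void** buffers, size_t* bufSizes)
{
    size_t total = 0;
    for (size_t i = 0; buffers[i] != NULL; ++i)
        total += bufSizes[i];

    std::vector<unsigned char> assembled(total);
    size_t offset = 0;
    for (size_t i = 0; buffers[i] != NULL; ++i) {
        if (bufSizes[i] != 0)
            std::memcpy(&assembled[offset], buffers[i], bufSizes[i]);
        offset += bufSizes[i];
    }

    Send(messageID, assembled.empty() ? NULL : &assembled[0], assembled.size());
}

A socket-backed subclass can then override SendMultiple() to call send() once per part and skip the copy entirely, which is the optimization described above.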
Now, by doing this, we have the option of backporting (copying, really) the changes to the older version, but we're not required to backport. This gives the managers flexibility, based on the time and funding constraints they have.
EDIT: After reading Neil's comment, I thought of something that we do that I need to clarify.
In our code, we do lots of "libraries". LOTS of them. One big program I wrote had something like 50 of them. Because, for us and with our build setup, they're easy.
We use a tool that auto-generates makefiles on the fly, taking care of dependencies and almost everything. If there's anything strange that needs to be done, we write a file with the exceptions, usually just a few lines.
It works like this: The tool finds everything in the directory that looks like a source file, generates dependencies if the file changed, and spits out the needed rules. Then it makes a rule to take everything and ar/ranlib it into a libxxx.a file, named after the directory. All the objects and the library are put in a subdirectory that is named after the target platform (this makes cross-compilation easy to support). This process is then repeated for every subdirectory (except the object-file subdirs). Then the top-level directory gets linked with all the subdirs' libraries into the executable, and a symlink is created, again named after the top-level directory.
So directories are libraries. To use a library in a program, make a symbolic link to it. Painless. Ergo, everything's partitioned into libraries from the outset. If you want a shared lib, you put a ".so" suffix on the directory name.
To pull in a library from another project, I just use a Subversion external to fetch the needed directories. The symlinks are relative, so as long as I don't leave something behind it still works. When we ship, we lock the external reference to a specific revision of the parent.
If we need to add functionality to a library, we can do one of several things. We can revise the parent (if it's still an active project and thus testable), tell Subversion to use the newer revision and fix any bugs that pop up. Or we can just clone the code, replacing the external link, if messing with the parent is too risky. Either way, it still looks like a "library" to us, but I'm not sure that it matches the spirit of a library.
We're in the process of moving to Mercurial, which has no "externals" mechanism so we have to either clone the libraries in the first place, use rsync to keep the code synced between the different repositories, or force a common directory structure so you can have hg pull from multiple parents. The last option seems to be working pretty well.