Is there a way to programmatically enumerate a namespace and its members in C++?
I have a large C++ program which utilizes several namespaces. I am unfamiliar with the codebase, and would like to determine which functions/classes/variables are associated with which namespaces.
My current approach involves simply removing the 'using namespace' directives one by one and checking what breaks during compilation, but I assume there is a much better way to achieve the same goal.
This is not possible in C++.
However, you can use external tools, such as Doxygen, that will create documentation (HTML, and other formats) that will list all the members of your namespaces.
Unfortunately, introspection is NOT one of C++'s big features. There's no way (within the language) to do what you want. You'll need an external code analysis tool (something that can parse the code and build a reference) to do the job. I use cscope for a lot of analysis, but to my knowledge it doesn't really know about namespaces, so probably not the right tool for you.
You can use a C++ front-end (e.g. Elsa) to do the job for you.
Also consider using a good IDE that has 'Go To Definition' functionality (e.g. Microsoft Visual Studio).
You can start by running Doxygen to generate an index of all the functions/classes/namespaces defined in your project. Make sure to edit the settings to generate the index for undocumented symbols.
If you know which namespaces you're looking for, you can just generate a map file (g++ -Wl,-Map,MyMapFile.map). Then search for e.g. MyNamespace:: in the map file.
Related
I am using Emacs + Tuareg mode to do my OCaml project.
It works fine and I have gotten used to it.
However, as my project's source base grows, I find the project harder and harder to manage.
This is especially true for refactoring. If I change a module name or function name, I have to search everywhere for the parts that need to change accordingly, or compile again and again and let the compiler tell me where to go.
It is not convenient.
Can anyone suggest a good way to manage the source base?
thanks
A good option is TypeRex. This is an alternative Emacs mode created by OCamlPro that has a bunch of OCaml-aware features including proper support for refactoring (like renaming identifiers).
It also has a bunch of other nice features like good auto-complete, semantic grep and so on.
Unfortunately, this involves changing your build process to use some wrapper programs. These generate the additional information the mode needs to function. However, once you get the build set up, it's a really awesome editing environment.
Issue
I have recently found myself working with a large, unfamiliar, multi-department, C++ codebase in need of better organization. I would like to discover a way to map which symbols are used by which source files for any given header. This is in the hope that if only one department uses a given function, then it can be moved out of the shared area and into that department's area.
Attempts
My first thoughts were to use the symbol table: i.e. compile the project and dump the symbols for each object file. From there I figured I could simply write a script to check if the symbols from my header file were used. While this approach seems viable, it would require me to create a list of symbols I am looking for from the headers. With my limited knowledge, I am unsure of how to automate such a process, and with hundreds of header files to test, doing it manually is out of the question.
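The checking half of that idea could look roughly like the sketch below. It assumes each object file's symbols were dumped as text with `nm -C` (one symbol per line, undefined symbols marked with type U), and that the list of header symbols comes from somewhere else. The function names are invented for illustration. One caveat: inline functions and templates defined in a header will not normally show up as undefined symbols, so this only catches out-of-line functions and variables.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Return the names of the undefined ("U") symbols in `nm -C` style text,
// i.e. the symbols an object file uses but does not define itself.
std::set<std::string> undefinedSymbols(std::istream& nmOutput) {
    std::set<std::string> result;
    std::string line;
    while (std::getline(nmOutput, line)) {
        std::istringstream fields(line);
        std::string first;
        if (!(fields >> first)) continue;        // blank line
        std::string type = first;
        if (first.size() > 1) {                  // first field was an address
            if (!(fields >> type)) continue;
        }
        if (type != "U") continue;
        std::string name;
        std::getline(fields >> std::ws, name);   // demangled names may contain spaces
        if (!name.empty()) result.insert(name);
    }
    return result;
}

// Return the subset of headerSymbols that the object file references.
std::set<std::string> usedFromHeader(const std::set<std::string>& headerSymbols,
                                     std::istream& nmOutput) {
    std::set<std::string> used;
    for (const std::string& sym : undefinedSymbols(nmOutput)) {
        if (headerSymbols.count(sym) != 0) used.insert(sym);
    }
    return used;
}
```

Running usedFromHeader once per object file and grouping the results by department directory would give the "who uses what" map.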
Questions
Is my approach valid? If so..
What can I use to generate the symbol names from my header file?
If not..
What else can I do?
Additionally, while I am using Linux, most of the development teams work in Windows-only environments. What utilities could I use on both platforms?
Any and all help is greatly appreciated.
When I need to clean up APIs I sometimes use information from callcatcher. It basically builds a database of all symbols while compiling and allows you to determine what symbols are used in some build product.
I sometimes also use DXR (code on github, an example installation) to browse what code defined where is used how. In contrast to callcatcher, DXR lets you drill down to much finer detail. Setting up DXR is pretty heavy duty, but might be worth it if you have enough code to work with.
On the other side of the spectrum there are tools like cscope. Even though it doesn't work super nicely with C++ code it is still very useful. If you deal with more than a couple 100kloc you will quickly feel limited though.
If I had to pick only one of these tools and would be working on a large code base (>1Mloc) I would definitely pick DXR.
You can get a reasonable start on the information that you've described by using doxygen.
Even for source that doesn't contain the doxygen formatted comments the documentation created can contain a list of places (ie. source files) where a particular symbol is used.
And, as doxygen can be used to generate html documentation, navigating through your source tree becomes trivial. It can be even better if you enable the dot functionality to generate relationship diagrams for the classes in your source tree.
very old-school, simple, and possibly unix only, but are you aware of etags? there's also gnu global which i think is similar.
the gnu global link refers to the "comparison with similar tools" discussion here which might also be useful.
I need to parse function headers from a .i file used by SWIG which contains all sorts of garbage besides the function headers. (The final output would be a list of function declarations.)
The best option for me would be using the GNU toolchain (GCC, Binutils, etc.) to do so, but I might be missing an easy way of doing it with SWIG. If I am, please tell me!
Thanks :]
edit: I also don't know how to do that with the GCC toolchain; if you have an idea, that would be great.
I would try getting an XML dump of the abstract syntax tree either from clang or from gccxml. From there you can easily extract the function declarations you are interested in.
Our DMS Software Reengineering Toolkit provides general purpose program parsing, analysis, and transformation capability. It has front ends for a wide variety of languages, including C++.
It has been used to analyze and transform very complex C++ programs and their header files.
You aren't clear as to what you will do after you "parse the function headers"; normally people want to extract some information or produce another artifact. DMS with its C++ front end can do the parsing; you can configure DMS to do the custom stuff.
As a practical matter, this isn't usually an afternoon's exercise; DMS is a complex beast, because it has to deal with complex beasts such as C++. And I'd expect you to face the same kind of complexity for any tool that can handle C++. The GCC toolchain can clearly handle C++, so you might be able to do it with that (at that same level of complexity) but GCC is designed to be a compiler, and IMHO you will find it a fight to get it to do what you want.
Your "output function declarations" goal isn't clear. You want just the function names? You want a function signature? You want all the type declarations on which the function depends? You want all the type declarations on which the function depends, if they are not already present in an existing include file you intend to use?
The best way to extract function decls from the garbage which is C header files is to substitute out what constitutes the most smelly garbage: macros. You can do that with cpp, the C preprocessor.
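Once the macros are expanded, a crude line-based filter can pull out things that look like plain C function declarations. The sketch below is a heuristic, not a parser: it assumes each declaration sits on one line and ends with a semicolon, it ignores function pointers and anything with nested parentheses, and it would also be fooled by statements such as `return f(x);` if function bodies are present. The name functionNames is made up for illustration.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Return the names of single-line declarations shaped like
//   <type words and stars> <name> ( <args without parentheses> ) ;
// Preprocessor line markers ("# 1 \"file.h\"") and everything else are skipped.
std::vector<std::string> functionNames(std::istream& preprocessed) {
    static const std::regex decl(
        R"re(\s*[A-Za-z_][A-Za-z0-9_\s*]*[\s*]([A-Za-z_][A-Za-z0-9_]*)\s*\(([^()]*)\)\s*;\s*)re");
    std::vector<std::string> names;
    std::string line;
    while (std::getline(preprocessed, line)) {
        std::smatch m;
        if (std::regex_match(line, m, decl)) {
            names.push_back(m[1].str());   // group 1 is the function name
        }
    }
    return names;
}
```

For anything beyond simple C-style headers, a real front end (as suggested in the other answers) is the safer route.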
What packages do you use to handle command line options, settings and config files?
I'm looking for something that reads user-defined options from the command line and/or from config files.
The options (settings) should be dividable into different groups, so that I can pass different (subsets of) options to different objects in my code.
I know of boost::program_options, but I can't quite get used to the API. Are there light-weight alternatives?
(BTW, do you ever use a global options object in your code that can be read from anywhere? Or would you consider that evil?)
At Google, we use gflags. It doesn't do configuration files, but for flags, it's a lot less painful than using getopt.
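#include <gflags/gflags.h>

// Defines the variable FLAGS_server, settable with --server=...
DEFINE_string(server, "foo", "What server to connect to");

int main(int argc, char* argv[]) {
    // Newer gflags releases expose this as gflags::ParseCommandLineFlags.
    google::ParseCommandLineFlags(&argc, &argv, true);
    if (!FLAGS_server.empty()) {
        Connect(FLAGS_server);  // Connect() is assumed to be defined elsewhere
    }
}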
You put the DEFINE_foo at the top of the file that needs to know the value of the flag. If other files also need to know the value, you use DECLARE_foo in them. There's also pretty good support for testing, so unit tests can set different flags independently.
For command lines and C++, I've been a fan of TCLAP: Templatized Command Line Argument Parser.
http://sourceforge.net/projects/tclap/
Well, you're not going to like my answer. I use boost::program_options. The interface takes some getting used to, but once you have it down, it's amazing. Just make sure to do boatloads of unit testing, because if you get the syntax wrong you will get runtime errors.
And, yes, I store them in a singleton object (read-only). I don't think it's evil in that case. It's one of the few cases I can think of where a singleton is acceptable.
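A read-only singleton of that kind might be sketched as follows. This is not taken from any particular codebase; the class name Options and its members are hypothetical, and the values are kept as plain strings to stay independent of whichever parser fills them in.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical read-only options store: filled once at startup, then only read.
class Options {
public:
    // Install the parsed values once, early in main(); a second call is rejected.
    static void init(std::map<std::string, std::string> values) {
        Options& self = mutableInstance();
        if (self.initialized_) throw std::logic_error("Options already initialized");
        self.values_ = std::move(values);
        self.initialized_ = true;
    }

    // Everyone else only ever gets a const reference.
    static const Options& instance() { return mutableInstance(); }

    std::string get(const std::string& key, const std::string& fallback = "") const {
        const auto it = values_.find(key);
        return it == values_.end() ? fallback : it->second;
    }

private:
    Options() = default;
    static Options& mutableInstance() {
        static Options o;
        return o;
    }
    std::map<std::string, std::string> values_;
    bool initialized_ = false;
};
```

Because only init() can write and it refuses to run twice, the rest of the program cannot change the settings behind each other's backs, which removes most of the usual objection to globals.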
If Boost is overkill for you, GNU Gengetopt is probably, too, but IMHO, it's a fun tool to mess around with.
And, I try to stay away from global options objects, I prefer to have each class read its own config. Besides the whole "Globals are evil" philosophy, it tends to end up becoming an ever-growing mess to have all of your configuration in one place, and also it's harder to tell what configuration variables are being used where. If you keep the configuration closer to where it's being used, it's more obvious what each one is for, and easier to keep clean.
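One way to keep configuration close to where it is used, while still parsing it in one place, is to name options "group.key" and hand each class only its own group. The sketch below assumes that naming convention; Settings, groupOf and Logger are all hypothetical names.

```cpp
#include <map>
#include <string>

using Settings = std::map<std::string, std::string>;

// Return the entries whose key starts with "<group>.", with that prefix removed.
Settings groupOf(const Settings& all, const std::string& group) {
    const std::string prefix = group + ".";
    Settings subset;
    for (const auto& kv : all) {
        if (kv.first.compare(0, prefix.size(), prefix) == 0) {
            subset[kv.first.substr(prefix.size())] = kv.second;
        }
    }
    return subset;
}

// Hypothetical component that only ever sees its own group of settings.
class Logger {
public:
    explicit Logger(const Settings& s) : level_(valueOr(s, "level", "info")) {}
    const std::string& level() const { return level_; }

private:
    static std::string valueOr(const Settings& s, const std::string& key,
                               const std::string& fallback) {
        const auto it = s.find(key);
        return it == s.end() ? fallback : it->second;
    }
    std::string level_;
};
```

With this layout, `Logger logger(groupOf(all, "log"));` gives the logger its "log.*" options and nothing else, so it stays obvious which class reads which setting.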
(As to what I use, personally, for everything recently it's been a proprietary command line parsing library that somebody else at my company wrote, but that doesn't help you much, unfortunately)
I've been using TCLAP for a year or two now, but randomly I stumbled across ezOptionParser. ezOptionParser doesn't suffer from "it shouldn't have to be this complex"-syndrome the same way that other option parsers do.
I'm pretty impressed so far and I'll likely be using it going forward, specifically because it supports config files. TCLAP is a more sophisticated library, but the simplicity and extra features of ezOptionParser are very compelling.
Other perks from its website include (as of 0.2.0):
Pretty printing of parsed inputs for debugging.
Auto usage message creation in three layouts (aligned, interleaved or staggered).
Single header file implementation.
Dependent only on STL.
Arbitrary short and long option names (dash '-' or plus '+' prefixes not required).
Arbitrary argument list delimiters.
Multiple flag instances allowed.
Validation of required options, number of expected arguments per flag, datatype ranges, user defined ranges, membership in lists and case for string lists.
Validation criteria definable by strings or constants.
Multiple file import with comments.
Exports to file, either set options or all options including defaults when available.
Option parse index for order dependent contexts.
GNU getopt is pretty nice. If you want a C++ feel, consider getoptpp which is a wrapper around the native getopt.
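For reference, a small getopt_long wrapper might look like the sketch below. The option names (--server/-s and --verbose/-v) and the CliOptions struct are invented for illustration; only getopt_long itself and its globals (optind, opterr, optarg) come from the C library.

```cpp
#include <getopt.h>
#include <string>

// Hypothetical result type for this sketch.
struct CliOptions {
    std::string server = "localhost";
    bool verbose = false;
    bool ok = true;   // false if an unknown option or a missing argument was seen
};

CliOptions parseCli(int argc, char* argv[]) {
    static const struct option longOpts[] = {
        {"server",  required_argument, nullptr, 's'},
        {"verbose", no_argument,       nullptr, 'v'},
        {nullptr,   0,                 nullptr, 0},
    };
    CliOptions out;
    optind = 1;   // restart scanning so the function can be called more than once
    opterr = 0;   // report errors through `ok` instead of printing to stderr
    int c;
    while ((c = getopt_long(argc, argv, "s:v", longOpts, nullptr)) != -1) {
        switch (c) {
            case 's': out.server = optarg; break;
            case 'v': out.verbose = true;  break;
            default:  out.ok = false;      break;   // '?' for unknown or malformed options
        }
    }
    return out;
}
```

Calling `parseCli(argc, argv)` at the top of main() and checking `ok` before continuing is usually all the glue that is needed.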
As far as the configuration file is concerned, you should try to make the format as simple as possible so that parsing is easy. If you are a bit more ambitious, you might want to use yacc and lex, but that would really be overkill for small apps.
I would also suggest that you support both config files and command-line options in your application. Config files are better for options that change less frequently. Command-line options are good when you want to pass arguments that change from run to run (typically when you are creating an app that will be called by some other program).
If you are working with Visual Studio 2005 on x86 and x64 Windows, there are some good command-line parsing utilities in the SimpleLibPlus library. I have used it and found it very useful.
Not sure about command line argument parsing. I have not needed very rich capabilities in that area and have generally rolled my own to save adding more dependencies to my software. Depending upon what your needs are you may or may not want to try this route. The C++ programs I have written are generally not invoked from the command line.
On the other hand, for a config file you really can't beat an XML based format. It's readable, extensible, structured, etc... :) Plus there are lots of XML parsers out there. Despite the fact it is a C library, I tend to use libxml2 from xmlsoft.org.
Try Apache Ant. Its primary usage is Java projects, but there isn't anything Java about it, and it's usable for almost anything.
Usage is fairly simple and you've got a lot of community support too. It's really good at doing things the way you're asking.
As for global options in code, I think they're quite necessary and useful. Don't misuse them, though.
When you get a third-party library (C, C++), open-source (LGPL say), that does not have good documentation, what is the best way to go about understanding it to be able to integrate into your application?
The library usually has some example programs and I end up walking through the code using gdb. Any other suggestions/best practices?
For an example, I just picked one from sourceforge.net, but it's just a broad engineering/programming question:
http://sourceforge.net/projects/aftp/
I frequently use a couple of tools to help me with this:
GNU Global. It generates cross-referencing databases and can produce hyperlinked HTML from source code. Clicking function calls will take you to their definitions, and you can see lists of all references to a function. Only works for C and perhaps C++.
Doxygen. It generates documentation from Javadoc-style comments. If you tell it to generate documentation for undocumented methods, it will give you nice summaries. It can also produce hyperlinked source code listings (and can link into the listings provided by htags).
These two tools, along with just reading code in Emacs and doing some searches with recursive grep, are how I do most of my source reverse-engineering.
One of the better ways to understand it is to attempt to document it yourself. By going and trying to document it yourself, it forces you to really dive in and test and test and test and make sure you know what each statement is doing at what times. Then you can really start to understand what the previous developer may have been thinking (or not thinking for that matter).
Great question. I think that this should be addressed thoroughly, so I'm going to try to make my answer as thorough as possible.
One thing that I do when approaching large projects that I've either inherited or am contributing to is automatically generate their sources, UML diagrams, and anything that can ease the various amounts of A.D.D. encountered when learning a new project:)
I believe someone here already mentioned Doxygen, that's a great tool! You should look into it and write a small bash script that will automatically generate sources for the application you're developing in some tree structure you've setup.
One thing that I haven't seen people mention is BOUML! It's fantastic and free! It automatically generates reverse UML diagrams from existing sources and it supports a variety of languages. I use this as a way to really capture the big picture of what's going on in terms of architecture and design before I start reading code.
If you've got the money to spare, look into Understand for %language-here%. It's absolutely great and has helped me in many ways when inheriting legacy code.
EDIT:
Try out ack (betterthangrep.com), it is a pretty convenient script for searching source trees:)
Familiarize yourself with the information available in the headers. The functions you call will be declared there. Then try to identify the valid arguments and pre-/post-conditions of the functions, as those are your primary guidance (even if they are not documented!). The example programs are your next bet.
If you have code completion/intellisense, I like opening up the library and typing '.' or 'namespace::' and seeing what comes up. I always find it helpful: you can navigate through the objects/namespaces and see what functionality they have. This is of course assuming it's an OOP library with relatively good naming of functions/objects.
There really isn't a silver bullet other than just rolling up your sleeves and digging into the code.
This is where we earn our money.
Three things:
(1) Try to run the test or example apps available, set low debug levels, and walk through the logs.
(2) Use a source navigator tool / cscope (available on both Windows and Linux) and browse the code to understand the flow.
(3) In parallel, use gdb to step into the code while running the test/example apps.