How do you handle command line options and config files? - c++

What packages do you use to handle command line options, settings and config files?
I'm looking for something that reads user-defined options from the command line and/or from config files.
The options (settings) should be dividable into different groups, so that I can pass different (subsets of) options to different objects in my code.
I know of boost::program_options, but I can't quite get used to the API. Are there light-weight alternatives?
(BTW, do you ever use a global options object in your code that can be read from anywhere? Or would you consider that evil?)

At Google, we use gflags. It doesn't do configuration files, but for flags, it's a lot less painful than using getopt.
#include <gflags/gflags.h>
DEFINE_string(server, "foo", "What server to connect to");
int main(int argc, char* argv[]) {
google::ParseCommandLineFlags(&argc, &argv, true);
if (!server.empty()) {
Connect(server);
}
}
You put the DEFINE_foo at the top of the file that needs to know the value of the flag. If other files also need to know the value, you use DECLARE_foo in them. There's also pretty good support for testing, so unit tests can set different flags independently.

For command lines and C++, I've been a fan of TCLAP: Templatized Command Line Argument Parser.
http://sourceforge.net/projects/tclap/

Well, you're not going to like my answer. I use boost::program_options. The interface takes some getting used to, but once you have it down, it's amazing. Just make sure to do boatloads of unit testing, because if you get the syntax wrong you will get runtime errors.
And, yes, I store them in a singleton object (read-only). I don't think it's evil in that case. It's one of the few cases I can think of where a singleton is acceptable.

If Boost is overkill for you, GNU Gengetopt is probably, too, but IMHO, it's a fun tool to mess around with.
And, I try to stay away from global options objects, I prefer to have each class read its own config. Besides the whole "Globals are evil" philosophy, it tends to end up becoming an ever-growing mess to have all of your configuration in one place, and also it's harder to tell what configuration variables are being used where. If you keep the configuration closer to where it's being used, it's more obvious what each one is for, and easier to keep clean.
(As to what I use, personally, for everything recently it's been a proprietary command line parsing library that somebody else at my company wrote, but that doesn't help you much, unfortunately)

I've been using TCLAP for a year or two now, but randomly I stumbled across ezOptionParser. ezOptionParser doesn't suffer from "it shouldn't have to be this complex"-syndrome the same way that other option parsers do.
I'm pretty impressed so far and I'll likely be using it going forward, specifically because it supports config files. TCLAP is a more sophisticated library, but the simplicity and extra features from ezOptionParser is very compelling.
Other perks from its website include (as of 0.2.0):
Pretty printing of parsed inputs for debugging.
Auto usage message creation in three layouts (aligned, interleaved or staggered).
Single header file implementation.
Dependent only on STL.
Arbitrary short and long option names (dash '-' or plus '+' prefixes not required).
Arbitrary argument list delimiters.
Multiple flag instances allowed.
Validation of required options, number of expected arguments per flag, datatype ranges, user defined ranges, membership in lists and case for string lists.
Validation criteria definable by strings or constants.
Multiple file import with comments.
Exports to file, either set options or all options including defaults when available.
Option parse index for order dependent contexts.

GNU getopt is pretty nice. If you want a C++ feel, consider getoptpp which is a wrapper around the native getopt.
As far as configuration file is concerned, you should try to make it as stupid as possible so that parsing is easy. If you are bit considerate, you might want to use yaac&lex but that would be really a big bucks for small apps.
I also would like to suggest that you should support both config files and command line options in your application. Config files are better for those options which are to be changed less frequently. Command-line options are good when you want to pass the immediate changing arguments (typically when you are creating a app, which would be called by some other program.)

If you are working with Visual Studio 2005 on x86 and x64 Windows there is some good Command Line Parsing utilities in the SimpleLibPlus library. I have used it and found it very useful.

Not sure about command line argument parsing. I have not needed very rich capabilities in that area and have generally rolled my own to save adding more dependencies to my software. Depending upon what your needs are you may or may not want to try this route. The C++ programs I have written are generally not invoked from the command line.
On the other hand, for a config file you really can't beat an XML based format. It's readable, extensible, structured, etc... :) Plus there are lots of XML parsers out there. Despite the fact it is a C library, I tend to use libxml2 from xmlsoft.org.

Try Apache Ant. Its primary usage is Java projects, but there isn't anything Java about it, and its usable for almost anything.
Usage is fairly simple and you've got a lot of community support too. It's really good at doing things the way you're asking.
As for global options in code, I think they're quite necessary and useful. Don't misuse them, though.

Related

How to conveniently refactor OCaml project?

I am using Emacs + Tuareg mode to do my OCaml project.
It is working fine and I get used to it.
However, along with my project source base getting bigger and bigger, I find managing the project is getting harder and harder.
Especially for refactoring. If I change a module name or function name, I have to search everywhere for the part that need to changed accordingly or I just constantly compile again and again to let compiler tell me where I should go.
It is not convenient.
Anyone can suggest a good way for source base management?
thanks
A good option is TypeRex. This is an alternative Emacs mode created by OCamlPro that has a bunch of OCaml-aware features including proper support for refactoring (like renaming identifiers).
It also has a bunch of other nice features like good auto-complete, semantic grep and so on.
Unfortunately, this involves changing your build process to use some wrapper programs. These generate the additional information the mode needs to function. However, once you get the build set up, it's a really awesome editing environment.

Determine arguments accepted by an exe without having source code?

I have an application with which I work daily.
The developers have provided a variety of convenience arguments which when passed to the exe perform certain tasks. While debugging an issue the tech support guy told me to run the exe with some special args which reduced a lot of manual steps of my job. However, the developers are not kind enough to share a list of all such args. So I wanted to know whether there is any way to just determine the args which an exe accepts? The application is developed in C++.
The first thing I would do is run something like strings (under UNIX-like operating systems) on the executable to extract anything that looks like an option.
That won't tell you how to use a particular option but, if your strings command returns:
--option1
--option2
--run-faster
--use-less-cpu
--format-hard-disk
it's a pretty safe bet that those are valid options. Shorter options may not show up so easily since strings tends to be for obviously textual data.
Even if you don't have something like strings, there's a good chance all the options will be lumped together in the executable just because of the way many compilers and linkers work.
And, as Eugeny Loy kindly points out in a comment, the sysinternals suite from Microsoft has a strings utility as well.
By the way, I'd give serious pause before trying to test if --format-hard-disk is a valid option :-)

The best way to handle config in large C++ project

In order to start my C++ program, I need to read some configs, e.g. ip address, port number, file paths... These settings may change quite frequently (every week or everyday!), so hardcoding them into source files is not a good idea.
After some research, I'm confused about whether there is a best practice to load config settings from a file and made those configs available to other class/module/*.cpp in the same project.
static is bad; singleton is bad (an anti-pattern?) So, what other options do we have? Or, maybe the idea of "config file" is wrong?
EDIT: I have no problem of loading the config file. I'm worried about, after loading all those settings into a std::map< string, string > in memory, how to let other classes, functions access those settings.
EDIT 2: Thanks for everybody's input. I know these patterns that I listed here are FINE, and they are used by lots of programs. I'm curious about whether there is a (sort of) BEST pattern to handle configurations of a program.
Arguably, a configuration file is a legitimate use for a Singleton. The Singleton pattern is usually frowned upon because Singletons cause problems with race conditions in a multi-threaded environment, and since they're globally accessible, you run into the same problems you have with globals. But if your Singleton object is initialized once when you read in the config file, and never altered after that, I can't think of a legitimate reason to call it an "anti-pattern" other than some sort of cargo-cult mentality.
That being said, when I need to make a configuration file available as an object to my application, I don't use a Singleton. Usually I pass the configuration object around to those objects/functions which need it.
The best pattern I know of solving this is through an options class, that gets injected into your code on creation/configuration.
Steps:
create an options parser class
configure the parser on what parameters and options it should accept, and their default values (default values can be your "most probable" defaults)
write client code to accept options as parameters (instead of singleton and/or static stuff).
inject options when creating objects.
Have a look at boost.program_options for an already mature module for program options.
If you're familiar with python, have a look at the examples in the doc of argparse (same concept, implemented in python library). They are very easy to get the concept and interactions from.

Are there any lint tools for C and C++ that check formatting?

I have a codebase that is touched by many people. While most people make an effort to keep the code nicely formatted (e.g. consistent indentation and use of braces), some don't, and even those that do can't always do it because we all use different editors, so settings like spaces vs. tabs are different.
Is there any standard lint tool that checks that code is properly formatted, but doesn't actually change it (like indent but that returns only errors and warnings)?
While this question could be answered generally, my focus is on C and C++, because that's what this project is written in.
Google uses cpplint. This is their style guide.
The Linux kernel uses a tool that does exactly this - it's called checkpatch. You'd have to modify it to check your coding standards rather than theirs, but it could be a good basis to work from. (It is also designed for C code rather than C++).
Take a look at Vera++, it has a number of rules already available but the nice part is that you can modify them or write your own.
There are several programs that can do formatting for you automatically on save (such as Eclipse). You can have format settings that everyone can use ensuring the same formatting.
It is also possible to automatically apply such formatting when code is committed. When you use SVN, the system to do this is called svn hooks. This basically starts a program to process (or check and deny) the formatting when a commit happens.
This site explains how you can make your own. But also ones already exist to do this.

Reading/Understanding third-party code

When you get a third-party library (c, c++), open-source (LGPL say), that does not have good documentation, what is the best way to go about understanding it to be able to integrate into your application?
The library usually has some example programs and I end up walking through the code using gdb. Any other suggestions/best-practicies?
For an example, I just picked one from sourceforge.net, but it's just a broad engineering/programming question:
http://sourceforge.net/projects/aftp/
I frequently use a couple of tools to help me with this:
GNU Global. It generates cross-referencing databases and can produce hyperlinked HTML from source code. Clicking function calls will take you to their definitions, and you can see lists of all references to a function. Only works for C and perhaps C++.
Doxygen. It generates documentation from Javadoc-style comments. If you tell it to generate documentation for undocumented methods, it will give you nice summaries. It can also produce hyperlinked source code listings (and can link into the listings provided by htags).
These two tools, along with just reading code in Emacs and doing some searches with recursive grep, are how I do most of my source reverse-engineering.
One of the better ways to understand it is to attempt to document it yourself. By going and trying to document it yourself, it forces you to really dive in and test and test and test and make sure you know what each statement is doing at what times. Then you can really start to understand what the previous developer may have been thinking (or not thinking for that matter).
Great question. I think that this should be addressed thoroughly, so I'm going to try to make my answer as thorough as possible.
One thing that I do when approaching large projects that I've either inherited or contributing to is automatically generate their sources, UML diagrams, and anything that can ease the various amounts of A.D.D. encountered when learning a new project:)
I believe someone here already mentioned Doxygen, that's a great tool! You should look into it and write a small bash script that will automatically generate sources for the application you're developing in some tree structure you've setup.
One thing that I've haven't seen people mention is BOUML! It's fantastic and free! It automatically generates reverse UML diagrams from existing sources and it supports a variety of languages. I use this as a way to really capture the big picture of what's going on in terms of architecture and design before I start reading code.
If you've got the money to spare, look into Understand for %language-here%. It's absolutely great and has helped me in many ways when inheriting legacy code.
EDIT:
Try out ack (betterthangrep.com), it is a pretty convenient script for searching source trees:)
Familiarize yourself with the information available in the headers. The functions you call will be declared there. Then try to identify the valid arguments and pre-/post-conditions of the functions, as those are your primary guidance (even if they are not documented!). The example programs are your next bet.
If you have code completion/intellisense I like opening up the library and going '.' or 'namespace::' and seeing what comes up. I always find it helpful, you can navigate through the objects/namespaces and see what functionality they have. This is of course assuming its an OOP library with relatively good naming of functions/objects.
There really isn't a silver bullet other than just rolling up your sleeves and digging into the code.
This is where we earn our money.
Three things;
(1) try to run the test or example apps available, set low debug levels, and walk through logs.
(2) use source navigator tool / cscope ( available both on windows and linux) and browse the code to understand the flow.
(3) also in parallel use gdb to walk into code while running test/example apps.