How to generate empty definitions given a header file - c++

I have a 3rd-party library which for various reasons I don't wish to link against yet. I don't want to butcher my code though to remove all reference to its API, so I'd like to generate a dummy implementation of it.
Is there any tool I can use which spits out empty definitions of classes given their header files? It's fine to return nulls, false and 0 by default. I don't want to do anything on-the-fly or anything clever - the mock object libraries I've looked at appear quite heavy-weight? Ideally I want something to use like
$ generate-definition my_header.h > dummy_implementation.cpp
I'm using Linux, GCC4.1

This is a harder problem than you might like, as parsing C++ can quickly become a difficult task. Your best bet would be to pick an existing parser with a nice interface.
A quick search found this thread which has many recommendations for parsers to do something similar.
At the very worst you might be able to use SWIG --> Python, and then use reflection on that to print a dummy implementation.
Sorry this is only a half-answer, but I don't think there is an existing tool to do this (other than a mocking framework, which is probably the same amount of work as using a parser).

Write a small tool that reads the header file and generates the source file. The tool has to parse the header to learn the function names and signatures.
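To make the goal concrete, here is a minimal sketch of what the generated output could look like. The `vendor::Connection` class and `createConnection` function are hypothetical stand-ins for the third-party header, not a real API; the second half is the kind of dummy implementation the question asks for, returning false, 0 and null by default. It avoids `nullptr` so it should also build with an old compiler such as GCC 4.1.

```cpp
#include <string>

// --- Hypothetical third-party header (think vendor_api.h), shown inline ---
namespace vendor {
class Connection {
public:
    Connection();
    ~Connection();
    bool open(const std::string& host, int port);
    int send(const char* data, int length);
    const char* lastError() const;
    void close();
};
Connection* createConnection();
}  // namespace vendor

// --- Dummy implementation: one empty definition per declaration ---
namespace vendor {
Connection::Connection() {}
Connection::~Connection() {}
bool Connection::open(const std::string&, int) { return false; }
int Connection::send(const char*, int) { return 0; }
const char* Connection::lastError() const { return 0; }  // null pointer
void Connection::close() {}
Connection* createConnection() { return 0; }             // null pointer
}  // namespace vendor
```

Code that calls this API links without the real library, and every call is a no-op that reports failure or zero.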

Related

How to programmatically alter source code in C++ file?

I need a way to take a C/C++ source code file, inspect and perform some modifications to it, and then write the modified variant back to disk. Possible use cases that I have for it are:
Mutation testing, such as intentionally corrupting calculation in order to check if tests can catch it.
Altering visibility scope or annotating functions and methods. In order to split a large file into several smaller files but still being able to link them together, I want to turn some static functions into external functions so that the linker can find them later.
Generation of mock implementations of existing functions and methods. For all externally visible functions, create a function with an identical prototype but with an empty/dummy body so that other code can link against it.
Are there existing solutions for such workflow?
I am mostly interested in dealing with functions/methods. The rest of the information contained in a file, such as includes, type definitions etc., is less important to me, but it must be preserved in the output so that the end result remains syntactically correct.
A straightforward approach of applying a bunch of regular expressions to extract/modify the text is possible, but it is obviously not reliable in any way. I would like to avoid writing a full-blown C++ parser. Even having such a parser does not solve the follow-up task of saving the modified parse tree back to a file.
LibTooling and libclang are commonly used to develop such refactoring tools (clang-format, clang-tidy, etc.).

Discovering Symbol Usage

Issue
I have recently found myself working with a large, unfamiliar, multi-department, C++ codebase in need of better organization. I would like to discover a way to map which symbols are used by which source files for any given header. This is in the hope that if only one department uses a given function, then it can be moved out of the shared area and into that department's area.
Attempts
My first thought was to use the symbol table, i.e. compile the project and dump the symbols for each object file. From there I figured I could simply write a script to check if the symbols from my header file were used. While this approach seems viable, it would require me to create a list of the symbols I am looking for from the headers. With my limited knowledge, I am unsure of how to automate such a process, and with hundreds of header files to test, doing it manually is out of the question.
Questions
Is my approach valid? If so..
What can I use to generate the symbol names from my header file?
If not..
What else can I do?
Additionally, while I am using Linux, most of the development teams work in Windows only environments. What utilities could I use on both platforms?
Any and all help is greatly appreciated.
When I need to clean up APIs I sometimes use information from callcatcher. It basically builds a database of all symbols while compiling and allows you to determine what symbols are used in some build product.
I sometimes also use DXR (code on github, an example installation) to browse where code is defined and how it is used. In contrast to callcatcher, with DXR you can drill down to much finer detail. Setting up DXR is pretty heavy duty, but might be worth it if you have enough code to work with.
On the other side of the spectrum there are tools like cscope. Even though it doesn't work super nicely with C++ code it is still very useful. If you deal with more than a couple 100kloc you will quickly feel limited though.
If I had to pick only one of these tools and would be working on a large code base (>1Mloc) I would definitely pick DXR.
You can get a reasonable start on the information that you've described by using doxygen.
Even for source that doesn't contain doxygen-formatted comments, the generated documentation can contain a list of places (i.e. source files) where a particular symbol is used.
And, as doxygen can be used to generate HTML documentation, navigating through your source tree becomes trivial. It can be even better if you enable the dot functionality to generate relationship diagrams for the classes in your source tree.
Very old-school, simple, and possibly Unix-only, but are you aware of etags? There's also GNU GLOBAL, which I think is similar.
The GNU GLOBAL link refers to the "comparison with similar tools" discussion here, which might also be useful.

What generic template processor should I use?

This is a potentially dangerous question because interdisciplinary questions and answers will be biased, but I'll have a stab at it anyway. All in good spirit!
So, here we go. I'm writing a major editing mode for Emacs for a language that it has almost no support for yet, and I'm at the point where I have to decide on a way to generate project files. Below is the syllabus of the task ahead:
The templates have to represent project directory tree, not only single files.
The resulting files are of various formats, potentially including SGML-like languages, but not limited to this variety. The templates also have to generate C-like source code, eLisp source code, and plain text files such as a README.
The templates must be processed in a batch upon a user-initiated action (as in: the user wants to create a project, so several files must be created in the user-appointed directory). It may be beneficial to be able to supervise the creation, but this is less important than the ability to run the process entirely automatically.
Bonus features:
The template language has already a user base (with a potential of reuse of existing templates).
The templates can be used for code snippets (contain blanks which are filled interactively once the user invokes code-generating routine while editing the file).
Obvious things like cross-platform-ness, ease of use both through graphical interface and command line.
I did some research, but I won't share my results (yet) so that I don't bias the answers. The problem with answering this question is not that the answer is hard to find, but that it is hard to choose one from many.
I'm developing a system based on Mustache for exactly the use case that you've described. The template language itself is a very simple extension of Mustache called Groome.
I also released a command-line tool called Molt that renders Groome templates. I'd be curious to know if it does everything that you need. I'm still adding features to the tool and haven't yet announced it. Thanks.
I set out to solve a similar problem several years back, where I wanted to use Emacs to generate code out of a UML diagram (cogre), and also generate Makefiles from project specifications. I first tried to use Tempo, but when I tried to get the templates to nest, I ran into problems. I also looked into skeleton, but that didn't quite fit the plan either.
I ended up using Google Templates for a little while and liked the syntax, so I developed SRecode instead and borrowed the good bits from Google Templates. SRecode was written specifically for machine-generated code. Interactive template insertion (what Tempo was written for) isn't first class in SRecode. For generating code from a data structure, however, it is very robust, has a lot of features, and fills in many variables automatically. It works closely with your major mode and allows many nested templates, with control over the nested dictionary values. There is a subsystem that will use Semantic tags and generate code from them for a couple of languages. That means you can parse code in one language with Semantic and generate code in another language with SRecode using those tags. Nifty! Many parts of the CEDET reference manuals were built that way.
The templates themselves allow looping, if statements, and include statements. There are a couple examples in SRecode for making an 'application', such as the comment writer, and EDE uses it to create Makefiles, which is almost exactly what you are trying to do.
Another option is Generator, which offers “language-agnostic project bootstrapping with an emphasis on simplicity”. Installation requires Node.js and npm.
Generator’s emphasis on simplicity means it is very easy to learn how to make a template. Generator also saves you from having to reference templates by file paths – it looks for templates in ~/.generator.
However, there is no way to write README or LICENSE files for the template itself without those files being copied to the generated project. Also, post-generation commands written in the Makefile will be copied to the generated Makefile, even after they are no longer of use. Finally, the ad-hoc templating language doesn’t provide a way to escape its __lowercasevariables__ – though I can’t think of a language where that limitation would be a problem.

parser generator that generates stand-alone C++ code

Is there a LALR parser generator that produces stand-alone C++ code? I am hoping that it would generate two files named something like "Parser.cpp" and "Parser.hpp," and the generated parser is implemented in a single class (that I can wrap in whatever namespace) that I can use for my parsing needs.
I want to use it for fun (i.e. small personal projects), and I'd like the output to be stand-alone (without any headers) so that I know I can compile it wherever I have a C++ compiler.
The search so far:
I've looked at flex/bison, but AFAIK they both require special headers and libraries. I've also looked at ANTLR a little bit, but it is not obvious to me that it can generate stand-alone C++ code. If someone can confirm that it can, then I might look more into it.
GOLD Parser (Bart Kiers mentioned the list on Wikipedia) has support for C and C++ languages. It does not generate a completely self-contained C/C++ source code file. All it does is the generation of Lexer/Parser tables which can be consumed by the "parsing engine".
To accomplish your task (or something similar) I did the following:
Prepare your LALR grammar in Gold's format
Generate parsing tables (one binary file)
Use an old trick to convert the binary file into a header file like
unsigned char ParseTable[] = { ... };
Modify the loader from the "parsing engine" sources (or use the C version which supports in-memory loading, as I remember)
Combine the sources for the GPEngine (if it is a C++ version) into the .h/.cpp pair.
Append the ParseTable to .cpp
Sure, it's not that straightforward, but all the steps can in principle be done within a single "combine" script which can be used with a number of grammars.
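The "old trick" of turning the binary table file into a source array can be sketched in a few lines of C++. This is only an illustration of that one step, not part of GOLD or its engine: the function name `bytesToCArray` and the extra `...Size` variable are my own assumptions, and reading the table file into the byte vector is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Render raw bytes as C source text defining an unsigned char array,
// followed by a companion variable holding the array length.
// bytesToCArray is a hypothetical helper name.
std::string bytesToCArray(const std::vector<unsigned char>& bytes,
                          const std::string& name) {
    // A zero-length array is not valid C++, so require at least one byte.
    assert(!bytes.empty());

    std::ostringstream out;
    out << "unsigned char " << name << "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) out << ",";
        // Start a new indented line every 12 values to keep lines short.
        out << (i % 12 == 0 ? "\n    " : " ");
        out << static_cast<unsigned int>(bytes[i]);  // print as a number, not a char
    }
    out << "\n};\n";
    out << "unsigned int " << name << "Size = " << bytes.size() << ";\n";
    return out.str();
}
```

The returned text can be appended to the combined .cpp file, and the engine's loader can then read from the array instead of from a file on disk.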
I guess the major drawback is that GOLD is closed-source and Windows-only (meaning that you need a Windows machine to produce the parsing tables).
ANTLR can generate C++ code, although IMHO its C++ support is a bit weak; the output is more like C code. Still, it is a good environment to work with, and ANTLRWorks gives you a graphical representation of your syntax tree.
The output from flex+bison consists of two .c files and one .h file. These are completely stand-alone, in that they are all you need to compile into your application to make use of the parser. There are no additional libraries or headers needed (beside the standard C ones).
Unless I've misunderstood your requirements, you definitely can do what you want with flex+bison.

What library to use to *write* XML file in a C++ program?

What library to use to write XML file in a C++ program?
I've found two classes posted in CodeProject
http://www.codeproject.com/KB/stl/simple_xmlwriter.aspx
http://www.codeproject.com/KB/XML/XML_writer.aspx
but want to check if there is more standard option than these. I'm only concerned with writing, and not parsing XML.
I tried different libraries and finally decided on TinyXml. It's compact, fast, free (zlib license) and very easy to use.
Question: Are you ever going to update an XML file? Because while that sounds like it's just more writing, with XML it still requires a parser.
While xerces is large and bloated, it is fully standards compliant and it is DOM based. Should you ever have to cross platform or change language, there will always be a DOM based library for whatever language/platform you might move to so knowing how DOM based parsing/writing works is a benefit. If you are going to use XML, you may as well use it correctly.
Avoiding XML altogether is of course the best option. But short of that, I'd go with xerces.
You can use Xerces-C++, a library from the Apache Software Foundation. It lets you read, write and manipulate XML files.
Link: http://xerces.apache.org/xerces-c/
For my purposes, PugiXML worked out really nicely
http://pugixml.org/
The reason why I thought it was so nice was because it was simply 3 files, a configuration header, a header, and the actual source.
But as you stated, you aren't interested in parsing, so why even bother using a special class to write out XML? While maybe your classes are too complex for this, I found the easy thing to do is to use std::ostream and just write out standards-compliant XML this way. For example, say I have a class that represents a Company, which is a collection of Employee objects. Just make a method in each of the Company and Employee classes that looks something like the following pseudocode:
void Company::writeXML(std::ostream& out) {
    out << "<company>" << std::endl;
    // Iterate by reference to avoid copying each Employee.
    BOOST_FOREACH(Employee& e, employees) {
        e.writeXML(out);
    }
    out << "</company>" << std::endl;
}
and to make it even easier, your Employee's writeXML function can be declared virtual so that you can have a specific output for say a CEO, President, Janitor or whatever the subclasses should be.
I have been using the open-source libxml2 library for years; it works great for me.
I ran into the same problem and wound up rolling my own solution. It's implemented as a single header file that you can drop into your project: xml_writer.h
And it comes with a set of unit tests which also serve as documentation.
Roll your own
I've been in a similar situation. I had a program that needed to generate JSON. We did it two ways. First we tried jsoncpp, but in the end I just generated the JSON directly via a std::ofstream.
Afterward we ran the generated JSON through a validator to catch any syntax errors. There were a few but they were really easy to find and correct.
If I were to do it again I would definitely roll my own again. Somewhat unexpectedly, there was less code when using std::ofstream. Plus we didn't have to use/learn a new API. It was easier to write and is easier to maintain.