Boost unit test templates produce bloated code. How to avoid that? - c++

I have about a hundred simple tests written with the Boost unit test library. Not only are compile times very long (on the order of half a minute), but the resulting executable is also really big: 4 MB for just a hundred simple tests. If the tests are written without Boost.Test, the executable is a mere 120 kB.
How can I lessen the bloat? I am asking out of curiosity, not because I need the test code to have shiny performance :)
The debugging info is already stripped. I've tried all optimization options with no success.
EDIT:
Each test is basically as follows:
PlainOldDataObject a, b;
a = { ... initial_data ... };
a = some_simple_operation(a);
b = { ... expected_result ... };
BOOST_CHECK(std::memcmp(&a, &b, sizeof(PlainOldDataObject)) == 0);

I. Which usage variant do you employ? If you use the single-header variant of the unit test framework, you should switch to the offline variant (either the static or the dynamic library).
II. If you suspect that the BOOST_AUTO_TEST_CASE macro is at fault, you have several options:
Give up the single-assertion-per-test-case policy and use a number of "themed" test cases. I personally find this acceptable.
Use manual test case registration. You can probably automate it with your own macro to avoid tedious repetition.
Split the tests into multiple files. You might see at least some improvement in compilation time (or you might not).
III. If you suspect the BOOST_CHECK statements, there is not much you can do, but I'd be rather surprised to see this much overhead from them. Maybe you should investigate further.
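The "themed" test case idea from option II can be sketched without any framework: a table of input/expected pairs checked in one loop, so the assertion code is expanded once rather than once per test. This is only an illustration under assumptions: Pod, some_simple_operation and run_swap_cases are hypothetical stand-ins for the question's PlainOldDataObject and its operation, and under Boost.Test the loop would presumably sit inside a single BOOST_AUTO_TEST_CASE with one BOOST_CHECK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical POD type and operation standing in for the question's
// PlainOldDataObject and some_simple_operation.
struct Pod {
    int a;
    int b;
};

Pod some_simple_operation(Pod p) {
    return Pod{p.b, p.a};  // swaps the two fields
}

struct Case {
    Pod input;
    Pod expected;
};

// One "themed" test: every row goes through the same comparison, so the
// checking code exists once instead of once per case.
// Returns the number of failing rows.
std::size_t run_swap_cases() {
    static const Case cases[] = {
        {{1, 2}, {2, 1}},
        {{0, 0}, {0, 0}},
        {{-5, 7}, {7, -5}},
    };
    std::size_t failures = 0;
    for (const Case& c : cases) {
        const Pod actual = some_simple_operation(c.input);
        // Under Boost.Test this comparison would be the single BOOST_CHECK.
        if (std::memcmp(&actual, &c.expected, sizeof(Pod)) != 0) {
            ++failures;
        }
    }
    return failures;
}
```

Adding a test then means adding a row to the table, which costs a few bytes of data rather than another macro expansion.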

Try using the Loki library instead: it also has many commonly used generic components (including a static assertion macro, similar to BOOST_CHECK).
Loki is known to be lightweight, yet even more powerful than Boost, because it uses a policy-based approach to class design. However, it doesn't have the same variety of tools, only the most common ones: smart pointers, a small-object allocator, meta-programming helpers, factories and a few others. But if you don't need any of those monstrous Boost libs, like serialization for example, you may find it sufficient for your needs.

Related

Getting llvm::LoopInfo from (non-LLVM) code?

For the development of my own Pass I want to write unit tests. I have lots of 'pure' helper methods, so they seem ideal candidates for unit tests. But some of them require an instance of llvm::LoopInfo as an argument.
In my (Function-)Pass I just use
void getAnalysisUsage(llvm::AnalysisUsage &AU) const override {
AU.setPreservesCFG();
AU.addRequired<llvm::LoopInfoWrapperPass>();
}
...
llvm::LoopInfo &loopInfo = getAnalysis<LoopInfoWrapperPass>(F).getLoopInfo();
to get this information object.
In my unit test I currently parse my llvm::Function void foo() (that I want to run my analysis on) from disk like this:
llvm::SMDiagnostic Err;
llvm::LLVMContext Context;
std::unique_ptr<llvm::Module> module(parseIRFile(my_bc_filename, Err, Context));
llvm::Function* foo = module->getFunction("foo");
To finish my test, I would have to fill in the following stub:
llvm::LoopInfo &loopInfo = /* run LoopInfoWrapperPass on foo and return the LoopInfo element */;
My first attempts were based on using the PassManager<Function> (in Header "llvm/IR/PassManager.h"), AnalysisManager<Function>, and the class LoopInfoWrapperPass, but I couldn't find any example usage online for LLVM 4.0 - and older examples seemed to be using a previous version of PassManager, and I did not see how to make use of the LegacyPassManager. I tried to look into the sources for PassManager but could not make enough sense of the typedefs and template arguments (and they are increasing my irrational dislike for C++ as a language).
How do I fill in that stub? How do I call this Analysis Pass (and get LoopInfo) in my plain C++ code?
PS: There are more passes other than LoopInfoWrapperPass I need to use, but I'm assuming the way should be transferable to any Analysis Pass.
PPS: I'm using googletest as a unit test framework, with a CMake build configuration that makes the unit tests their own target, and I'm building my Pass out-of-tree against binary libs of LLVM 4.0.1, if any of that is somehow relevant.
I am not sure how you have your unit tests structured, but looking around in the LLVM source tree is a good idea.
One example can be found in CFGTest.cpp here.
You need to create the PassManager and the pipeline yourself. From my short experience with this, it works well for small tests, but once you need anything bigger or need to pass data in or out it becomes really restricting, since the LoopInfo data is only meaningful within the pipeline (i.e. the runOn() methods and friends).
Otherwise, you might want to opt (no pun intended) for the simpler, IMHO, method of creating the set of required analyses yourself (only dominators in the case of LoopInfo) without using the pass manager infrastructure. An example of this can be seen here.
Hope this helps.

Xcode C++ Unit testing with global variable

I've got a problem when unit testing my program.
The problem is simple, but I'm not sure why this is not working.
1 -> I build my whole program
2 -> I build my unit tests
3 -> the tests run.
Everything is fine except when it comes to reading global data from the data segment. It seems as if the variables are not initialized, or simply not found. So of course all my tests fail.
My question is:
Is it totally wrong to build an executable and then run the tests on it? Or must I compile all my code plus the unit tests together and then run that? Or is it just a limitation of the SenTesting framework?
I forgot to mention that this is a C++ const string. I don't know if that changes anything.
EDIT:
My assumption was wrong, but I still don't understand the magic behind it! It seems to be some C++ magic?
char cstring[] = "***";
std::string cppString = "***";
NSString* nstring = @"***";
- (void)testSync{
STAssertNotNil(nstring, nil); // fine
STAssertNotNil((id)strlen(cstring), nil); // fine
STAssertNotNil((id)cppString.size(), nil); // failed
}
EDIT 2:
Actually, it is normal that the C++ string is not initialized at this point in the code. If I run nm on my executable, my C and Obj-C globals turn out to be placed in the data segment. I thought my C++ string was in the same situation, but it is actually placed in the bss segment, which means it starts out uninitialized. The C++ compiler does some magic behind the scenes: the C++ string is initialized after the main() call and then behaves as if it were in the data segment.
I didn't know that the test suite has no main() call, so the C++ objects are never initialized. There are techniques for calling the constructor before the test suite runs, but I am too lazy to explain them and it is really a separate topic. I have just replaced my C++ string with a simple char array, and it works perfectly since my values are now POD.
By the way, there is nothing evil about global variables if they are just read-only. ;)
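One widely used way to avoid depending on when static constructors run is the construct-on-first-use idiom: the object lives in a function-local static and is built the first time the accessor is called. The sketch below only illustrates that idiom and is not necessarily the technique the asker had in mind; the accessor name marker is hypothetical.

```cpp
#include <cassert>
#include <string>

// Construct-on-first-use: the std::string is built the first time the
// accessor runs, so it does not rely on static constructors having been
// executed before the calling code. `marker` is a hypothetical name.
const std::string& marker() {
    static const std::string value = "***";
    return value;
}
```

A test would then call marker().size() instead of reading a global directly. Since C++11 the initialization of a function-local static is also thread-safe.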
OK, I can see a few faults here.
First of all, this code gives errors on my environment (Xcode 5) and for good reasons (with ARC enabled). I don't know how you got the thing to compile. The reason is that you are casting an integer (or long) to an object, and this will result in many errors, as it is normally an invalid operation. So, the real question is not why the third "assert" failed, but why the second one succeeded.
As far as the second part of your question is concerned, I have to admit that I do not completely understand your question, and you may have to explain it more thoroughly.
In general, unit testing is testing specific parts of your code. Therefore, you typically don't perform the tests on an actual final executable (this is not called unit testing, I believe), nor do you have to compile "all your c++ code + your unit tests at the same time".
Since you are using Xcode, I will give you some indications.
Write your application (or at least a part of it), and find the aspects / functions / objects you want to perform unit tests on.
In separate files, write unit tests that instantiate these objects and test their methods, call them and compare the inputs and outputs.
You should have a second target in your application, that will compile only the unit test source code and the relevant main program code.
Build this target, or press command-U and it will report successes and failures.
So, you need to separate your source code and isolate your classes / methods to make them testable like this. This needs a good architecture and design of the application on your part, and you may need to make some compromises in flexibility (that is up to you to decide). Oh, and I believe that in testable code you should avoid global variables in general, for various reasons. Global variables are helpful sometimes, but they generally make unit testing really difficult (and if misused may lead to spaghetti code, but that is a whole different story).
I hope I helped, even without fully understanding the second part of your post.

Insert text into C++ code between functions

I have the following requirements:
Adding text at the entry and exit points of any function.
Not altering the source code beyond the insertions described above (so no pre-processor or anything).
For example:
void fn(param-list)
{
ENTRY_TEXT (param-list)
//some code
EXIT_TEXT
}
But not only in such a simple case: it should also work with pre-processor directives!
Example:
void fn(param-list)
#ifdef __WIN__
{
ENTRY_TEXT (param-list)
//some windows code
EXIT_TEXT
}
#else
{
ENTRY_TEXT (param-list)
//some any-os code
if (condition)
{
return; //should become EXIT_TEXT
}
EXIT_TEXT
}
#endif
So my question is: Is there a proper way of doing this?
I already tried working with the parsers used by compilers, but since they all rely on running a pre-processor before parsing, they are useless to me.
Some of the token-generating parsers that do not need a pre-processor are also of little use, because they build an in-memory map of tokens, which then leads to completely new source code instead of just inserting the text.
One thing I am working on is trying it with FLEX (or JFlex); if this is a valid option, I would appreciate some input on it. ;-)
EDIT:
To clarify a little bit: The purpose is to allow something like a stack trace.
I want to trace every function call, and in order to follow the call hierarchy, I need to place a macro at the entry point and at the exit point of a function.
This builds a function-call trace. :-)
EDIT2: Compiler-specific options are not really suitable, since we have many different compilers to use, and many of them are probably not well supported by any tools out there.
Unfortunately, your idea is not only impractical (C++ is complex to parse), it's also doomed to fail.
The main issue you have is that exceptions will bypass your EXIT_TEXT macro entirely.
You have several solutions.
As has been noted, the first solution would be to use a platform-dependent way of computing the stack trace. It can be somewhat imprecise, especially because of inlining: i.e., when small functions are inlined into their callers, they do not appear in the stack trace because no function call was generated at assembly level. On the other hand, it's widely available, does not require any surgery on the code and does not affect performance.
A second solution would be to introduce something only on entry and use RAII to do the exit work. While much better than your scheme, as it automatically deals with multiple returns and exceptions, it suffers from the same issue: how to perform the insertion automatically. For this you will probably want to operate at the AST level, and modify the AST to introduce your little gem. You could do it with Clang (look up the C++11 migration tool for examples of rewrites at large) or with gcc (using plugins).
Finally, you also have manual annotations. While it may seem underpowered (and a lot of work), I would highlight that you do not leave logging to a tool... I see 3 advantages to doing it manually: you can avoid introducing this overhead in performance sensitive parts, you can retain only a "summary" of big arguments and you can customize the summary based on what's interesting for the current function.
I would suggest using LLVM libraries & Clang to get started.
You could also leverage the C++ language to simplify your process. You could insert a small object that is constructed on entry to the function scope and rely on the fact that it will be destroyed on exit. That should massively simplify recording the 'exit' of the function.
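A minimal sketch of such a scope object is shown here, assuming an in-memory log. ScopeTrace, trace_log and classify are hypothetical names chosen for illustration; a real tracer would probably write to a file or a logging facility instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory trace log; a real tool might write to a file.
std::vector<std::string>& trace_log() {
    static std::vector<std::string> log;
    return log;
}

// Records entry in the constructor and exit in the destructor, so every
// return path and any exception unwinding through the scope is covered.
class ScopeTrace {
public:
    explicit ScopeTrace(const char* name) : name_(name) {
        trace_log().push_back("enter " + name_);
    }
    ~ScopeTrace() { trace_log().push_back("exit " + name_); }
    ScopeTrace(const ScopeTrace&) = delete;
    ScopeTrace& operator=(const ScopeTrace&) = delete;

private:
    std::string name_;
};

int classify(int value) {
    ScopeTrace trace("classify");  // the only line inserted into the function
    if (value < 0) {
        return -1;  // early return still records the exit
    }
    return 1;
}
```

Only one line per function has to be inserted, at the top of the body, which is considerably easier to automate than finding every return statement.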
This does not really answer your question; however, for your initial need, you may use the backtrace() function from execinfo.h (if you are using GCC).
See: How to generate a stacktrace when my gcc C++ app crashes

Performance penalty for large C++ dll's with autogenerated C code

I am working on a piece of software that needs to call a family of optimisation solvers. Each solver is an auto-generated piece of C code, with thousands of lines of code. I am using 200 of these solvers, differing only in the size of optimisation problem to be solved.
All in all, these auto-generated solvers come to about 180 MB of C code, which I compile as C++ using the extern "C"{ /*200 solvers' headers*/ } syntax, in Visual Studio 2008. Compiling all of this is very slow (with the "maximum speed /O2" optimisation flag, it takes about 8 hours). For this reason I thought it would be a good idea to compile the solvers into a single DLL, which I can then call from a separate piece of software (which would have a reasonable compile time, and allow me to abstract away all this extern "C" stuff from higher-level code). The compiled DLL is then about 37 MB.
The problem is that when executing one of these solvers using the DLL, execution takes about 30 ms. If I compile only that single solver into a DLL and call it from the same program, execution is about 100x faster (<1 ms). Why is this? Can I get around it?
The DLL looks as below. Each solver uses the same structures (i.e. they have the same member variables), but they have different names, hence all the type casting.
extern "C"{
#include "../Generated/include/optim_001.h"
#include "../Generated/include/optim_002.h"
/*etc.*/
#include "../Generated/include/optim_200.h"
}
namespace InterceptionTrajectorySolver
{
__declspec(dllexport) InterceptionTrajectoryExitFlag SolveIntercept(unsigned numSteps, InputParams params, double* optimSoln, OutputInfo* infoOut)
{
int exitFlag;
switch(numSteps)
{
case 1:
exitFlag = optim_001_solve((optim_001_params*) &params, (optim_001_output*) optimSoln, (optim_001_info*) &infoOut);
break;
case 2:
exitFlag = optim_002_solve((optim_002_params*) &params, (optim_002_output*) optimSoln, (optim_002_info*) &infoOut);
break;
/*
...
etc.
...
*/
case 200:
exitFlag = optim_200_solve((optim_200_params*) &params, (optim_200_output*) optimSoln, (optim_200_info*) &infoOut);
break;
}
return exitFlag;
};
};
I do not know whether your code is inlined into each case in the example. If your functions are inline and everything ends up inside one function, it may be much slower, because the code is spread over a large region of virtual memory and the CPU has to jump around a lot as it executes. If it is not all inlined, then perhaps these suggestions might help.
Your solution might be improved by...
A)
1) Divide the project into 200 separate dlls. Then build with a .bat file or similar.
2) Make the export function in each dll called "MyEntryPoint", and then use dynamic linking to load in the libraries as they are needed. This will then be the equivalent of a busy music program with a lot of small dll plugins loaded. Take a function pointer to the EntryPoint with GetProcAddress.
Or...
B) Build each solution as a separate .lib file. This will then compile very quickly per solution and you can then link them all together. Build an array of function pointers to all the functions and call it via lookup instead.
result = SolveIntercept[WhichStep](/* arguments */);
Combining all the libs into one big lib should not take eight hours. If it takes that long then you are doing something very wrong.
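The lookup in option B could look roughly like the sketch below. It assumes the generated solvers can be given one common signature, which the question's code does not guarantee; SolverFn, solve_001, solve_002 and solve_intercept are hypothetical stand-ins for the generated optim_NNN_solve functions and the exported entry point.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the generated optim_NNN_solve functions,
// given a common signature so they fit into one table.
using SolverFn = int (*)(const double* params, double* solution);

int solve_001(const double* params, double* solution) {
    solution[0] = params[0];
    return 1;
}

int solve_002(const double* params, double* solution) {
    solution[0] = params[0] * 2.0;
    return 2;
}

// Index i holds the solver for numSteps == i + 1.
const SolverFn kSolvers[] = {solve_001, solve_002};
const std::size_t kSolverCount = sizeof(kSolvers) / sizeof(kSolvers[0]);

// Returns the solver's exit flag, or -1 when numSteps is out of range.
int solve_intercept(unsigned numSteps, const double* params, double* solution) {
    if (numSteps == 0 || numSteps > kSolverCount) {
        return -1;
    }
    return kSolvers[numSteps - 1](params, solution);
}
```

The range check replaces the implicit fall-through of a switch with no default case, so an unsupported numSteps yields a defined error value.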
AND...
Try putting the code into different actual .cpp files. Perhaps that specific compiler will do a better job if they are all in different units etc... Then once each unit has been compiled it will stay compiled if you do not change anything.
Make sure that you measure and average the timing over multiple calls to the optimizer, because it could be that there's a large setup overhead before the first call.
Then also check what that 200-branch conditional statement (your switch) is doing to your performance! Try eliminating that switch for testing, calling just one solver in your test project but linking all of them in the DLL. Do you still see slow performance?
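A small helper along the following lines could be used for that measurement. It is a sketch, not a rigorous benchmark: mean_call_time_us is a hypothetical name, runs is assumed to be greater than zero, and the first call is deliberately excluded so that one-time setup cost does not skew the average.

```cpp
#include <cassert>
#include <chrono>

// Calls `fn` once to absorb first-call setup cost, then returns the mean
// duration in microseconds over `runs` further calls (runs must be > 0).
template <typename Fn>
double mean_call_time_us(Fn fn, int runs) {
    using clock = std::chrono::steady_clock;
    fn();  // warm-up call, excluded from the measurement
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) {
        fn();
    }
    const std::chrono::duration<double, std::micro> total = clock::now() - start;
    return total.count() / runs;
}
```

Timing the warm-up call separately and comparing it with the returned mean would show whether the 30 ms is a per-call cost or mostly a first-call cost.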
I assume the reason you are generating the code is for better run-time performance, and also for better correctness.
I do the same thing.
I suggest you try this technique to find out what the run-time performance problem is.
If you're seeing a 100:1 performance difference, that means each time you interrupt it and look at the program's state, there is a 99% chance you will see what the problem is.
As far as build time goes, sure it makes sense to modularize it.
None of that should have much effect on run time, unless it means you're doing crazy I/O.

Make a variable unavailable in a portion of code

From time to time, I want, as a safety check, to check that a variable v is not used in some portion of code, or in the remainder of some function, even though it is still visible in the scope of this function/portion of code. For instance:
int x;
// do something with x
DEACTIVATE(x);
// a portion of code which should not use x
ACTIVATE(x);
// do something else with x
Is there a good way to perform that type of verification at compile time?
NOTE: I know that one should always use a scope that is as small as possible for each variable, but there are cases where pushing this practice to an extreme can become cumbersome, and such a tool would be useful.
Thanks!
The best way to achieve this is to actually have small scopes in your code, i.e. use short, focused methods which do one thing only. This way you tend to have few local variables per each individual method, and they go out of scope automatically once you don't need them.
If you have long legacy methods which make you worry about this problem, the best long-term solution is to refactor them by extracting smaller chunks of functionality into separate methods. Most modern IDEs have automated refactoring support which lowers the risk of introducing bugs with such changes - although the best is of course to have a proper set of unit tests to ensure you aren't breaking anything.
Recommended reading is Clean Code.
Use
#define v #
..
#undef v
This should do it: any use of v in between then expands to a stray #, which does not compile, and # is very unlikely to conflict with any other variable name, keyword or operator.
As far as I know, there is no such compile-time verification. Maybe you can verify it yourself using grep. I think the best way is to split your function into two functions: one uses the variable, and the other cannot see it. That's one of the reasons why we need functions.
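If C++11 is available, a lightweight variant of "split it into a function that cannot see the variable" is an immediately invoked lambda with an explicit capture list: locals that are not captured cannot be used inside the body. The sketch below uses hypothetical names (compute, x, y). It is a partial check only: uses that do not need a capture, such as sizeof(x) or reading a const integer with a constant initializer, still compile.

```cpp
#include <cassert>

int compute() {
    int x = 10;
    int y = 4;
    x += 1;  // x is usable here

    // Explicit capture list: only y is available inside the lambda.
    // Using x in the body would be a compile error, since x is not captured.
    const int middle = [&y] {
        y *= 2;
        return y + 1;
    }();

    return x + middle;  // x is usable again after the lambda
}
```

The capture list doubles as documentation of which locals the enclosed portion is allowed to touch.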