difference between code coverage and profiling

difference between code coverage and profiling - profiling

What is difference between code code coverage and profiling.
Which is the best open source tool for code coverage.

Code coverage is an assessment of how much of your code has been run. This is used to see how well your tests have exercised your code.
Profiling is used to see how various parts of your code perform.
The tools depend on the language and platform you are using. I'm guessing that you are using Java, so recommend CodeCover. Though you may find NoUnit easier to use.

Coverage is important to see which parts of the code have not been run.
In my experience it has to be accumulated over multiple use cases, because any single run of the software will only use some of the code.
Profiling means different things at different times. Sometimes it means measuring performance. Sometimes it means diagnosing memory leaks. Sometimes it means getting visibility into multi-threading or other low-level activities.
When the goal is to improve software performance, by finding so-called "bottlenecks" and fixing them, don't just settle for any profiler, not even necessarily a highly-recommended or venerable one.
It is essential to use the kind that gets the right kind of information and presents it to you the right way, because there is a lot of confusion about this.
More on that subject.
Added:
For a coverage tool, I've always done it myself. In nearly every routine and basic block, I insert a call like this: Utils.CovTest("file name, routine name, comment that tells what's being done here").
The routine records the fact that it was called, and when the program finishes, all those comments are appended to a text file.
Then there's a post-processing step where that file is "subtracted" from a complete list (gotten by a grep-like program).
The result is a list of what hasn't been tested, requiring additional test cases.
When not doing coverage testing, Utils.CovTest does nothing. I leave it out of the innermost loops anyway, so it doesn't affect performance much.
In C and C++, I do it with a macro that, during normal use, expands to nothing.

Related

How can I verify that refactoring preserves code flow, not just behavior?

Sometimes, I see if-statements that could be written in a better way. Usually these are cases where we have several layers of nested if-statements and I've identified a simpler way of rewriting the block of if-statements.
Of course the biggest concern is that the resulting code will have a different code flow in certain cases.
How can I compare the two code-blocks and determine if the code flow is the same or different?
Is there a way to support this analysis with static analysis tools? Are there any other techniques that might help?

Find some way to exercise all possible paths through the code that you want to refactor. You could
write unit tests by hand
use Daikon http://plse.cs.washington.edu/daikon/, which exercises code automatically and systematically to infer invariants (I haven't used it myself, but I have tried a commercial descendant targeted at Java)
Either way, use a code coverage tool to verify that you have complete statement and decision coverage. Use a coverage tool that reports the number of times each statement is executed during the coverage run. You might even be able to get trucov, which actually generates diagrams of code paths, to work.
Do your refactoring.
Run the coverage tool again and compare statement execution counts before and after the refactoring. If any statement execution count changed, the flow must have changed. The opposite isn't guaranteed to be true, but it's probably close enough to true for practical applications. Alternatively, if you got trucov to work, compare execution graphs before and after; that would be definitive.

How to extract the active code path from a complex algorithm

I have been puzzled lately by an intruiging idea.
I wonder if there is a (known) method to extract the executed source code from a large complex algorithm. I will try to elaborate this question:
Scenario:
There is this complex algorithm where a large amount of people have worked on for many years. The algorithm creates measurement descriptions for a complex measurement device.
The input for the algorithm is a large set of input parameters, lets call this the recipe.
Based on this recipe, the algorithm is executed, and the recipe determines which functions, loops and if-then-else constructions are followed within the algorithm. When the algorithm is finished, a set of calculated measurement parameters will form the output. And with these output measurement parameters the device can perform it's measurement.
Now, there is a problem. Since the algorithm has become so complex and large over time, it is very very difficult to find your way in the algorithm when you want to add new functionality for the recipes. Basically a person wants to modify only the functions and code blocks that are affected by its recipe, but he/she has to dig in the whole algorithm and analyze the code to see which code is relevant for his or her recipe, and only after that process new functionality can be added in the right place. Even for simple additions, people tend to get lost in the huge amount of complex code.
Solution: Extract the active code path?
I have been brainstorming on this problem, and I think it would be great if there was a way to process the algorithm with the input parameters (the recipe), and to only extract the active functions and codeblocks into a new set of source files or code structure. I'm actually talking about extracting real source code here.
When the active code is extracted and isolated, this will result in a subset of source code that is only a fraction of the original source code structure, and it will be much easier for the person to analyze the code, understand the code, and make his or her modifications. Eventually the changes could be merged back to the original source code of the algorithm, or maybe the modified extracted source code can also be executed on it's own, as if it is a 'lite' version of the original algorithm.
Extra information:
We are talking about an algorithm with C and C++ code, about 200 files, and maybe 100K lines of code. The code is compiled and build with a custom Visual Studio based build environment.
So...:
I really don't know if this idea is just naive and stupid, or if it is feasible with the right amount of software engineering. I can imagine that there have been more similar situations in the world of software engineering, but I just don't know.
I have quite some experience with software engineering, but definitely not on the level of designing large and complex systems.
I would appreciate any kind of answer, suggestion or comment.
Thanks in advance!

Other naysayers say you can't do this. I disagree.
A standard static analysis is to determine control and data flow paths through code. Sometimes such a tool must make assumptions about what might happen, so such analyses tend to be "conservative" and can include more code than the true minimum. But any elimination of irrelevant code sounds like it will help you.
Furthermore, you could extract the control and data flow paths for a particular program input. Then where the extraction algorithm is unsure about what might happen, it can check what the particular input would have caused to happen. This gives a more precise result at the price of having to provide valid inputs to the tool.
Finally, using a test coverage tool, you can relatively easily determine the code exercised for a particular input of interest, and the code exercised by another input for case that is not so interesting, and compute the set difference. This gives code exercised by the interesting case, that is not in common with the uninteresting case.
My company builds build program analysis tools (see my bio). We do static analysis to extract control and data flow paths on C++ source code, and could fairly easily light up the code involved. We also make C++ test coverage tools, that can collect the interesting- and uninteresting- sets, and show you the difference superimposed over the source code.

I'm afraid what you try is mathematically impossible. The problem is that this
When the algorithm is finished, a set of calculated measurement parameters will form the output.
is impossible to determine by static code analysis.
What you're running into is essentially a variant of the Halting Problem for which has been proven that there can not be an algorithm/program that can determine, if an algorithm passed into it will yield a result in finite time.

Profile optimised C++/C code

I have some heavily templated c++ code that I am working with. I can compile and profile with AMD tools and sleepy in debug mode. However without optimisation most of time spent concentrated in the templated code and STL. With optimised compilation, all the profile tools that I know produce garbage information. Does anybody know a good way to profile optimised native code
PS1:
The code that I am writing is also heavily templated. Most of the time spent in the unoptimised code will be optimized away. I am talking about 96-97% of the run time are spent in templated code without optimisation. This is going to corrupt the accuracy of the profiling. And yes I can change many templated code or at least what part of the templated code is introducing the most trouble and I can do better in those places.

You should focus on the code you wrote because that is what you can change, time spent in STL is irrelevant, just ignore it and focus on the callers of that code. If too much time is spent in STL you probably can call some other STL primitive instead of the current one.
Profiling unoptimized code is less interesting, but you can still get some informations. If used algorithms from some parts of code are totally flawed it will show up even there. But you should be able to get useful informations from any good profiling tool in optimized code. What tools do you use exactly and why do you call their output garbage ?
Also it's usually easy enough to instrument your code by hand and find out exactly which parts are efficient and which are not. It's just a matter of calling timer functions (or reading cycle count of processor if possible) at well chosen points. I usually do that from unit tests to have reproducible results, but all depends of the specifics of your program.
Tools or instrumenting code are the easy part of optimization. The hard part is finding ways to get faster code where it's needed.

What do you mean by "garbage information"?
Profiling is only really meaningful on optimized builds, so tools are designed to work with them -- thus if you're getting meaningless results, it's probably due to the profiler not finding the right symbols, or needing to instrument the build.
In the case of Intel VTune, for example, I found I got impossible results from the sampler unless I explicitly told it where to find the PDBs for the executable I was tuning. In the instrumented version, I had to fiddle with the settings until it was reliably putting probes into the function calls.

When #kriss says
You should focus on the code you wrote
because that is what you can change
that's exactly what I was going to say.
I would add that in my opinion it is easier to do performance tuning first on code compiled without optimization, and then later turn on the optimizer, for the same reason. If something you can fix is costing excess time, it will cost proportionally excess time regardless of what the compiler does, and it's easier to find it in code that hasn't been scrambled.
I don't look for such code by measuring time. If the excess time is, say, 20%, then what I do is randomly pause it several times. As soon as I see something that can obviously be improved on 2 or more samples, then I've found it. It's an oddball method, but it doesn't really miss anything. I do measure the overall time before and after to see how much I saved. This can be done multiple times until you can't find anything to fix. (BTW, if you're on Linux, Zoom is a more automated way to do this.)
Then you can turn on the optimizer and see how much it gives you, but when you see what changes you made, you can see there's really no way the compiler could have done it for you.

How to approach debugging a huge not so familiar code base?

Seldom during working on large scale projects, suddenly you are moved on to a project which is already in maintainance phase.You end up with having a huge code C/C++ code base on your hands, with not much doccumentation about the design.The last person who could give you some knowledge transfer about the code has left the company already and to add to your horrors there is not enough time to get acquainted with the code and develop an understanding of the overall module/s.In this scenario when you are expected to fix bugs(core dumps,functionality,performance problems etc) on the module/s what is the approach that you will take?
So the question is:
What are your usual steps for debugging a not so familiar C/C++ code base when trying to fix a bug?
EDIT: Enviornment is Linux, but code is ported on Windows too so suggestions for both will be helpful.

If possible, step through it from main() to the problematic area, and follow the execution path. Along the way you'll get a good idea of how the different parts play together.
It could also be helpful to use a static code analysis tool, like CppDepends or even Doxygen, to figure out the relations between modules and be able to view them graphically.

Use a pen and paper, or images/graphs/charts in general, to figure out which parts belong where and draw some arrows and so on.
This helps you build and see the image that will then be refined in your mind as you become more comfortable with it.
I used a similar approach attacking a hellish system that had 10 singletons all #including each other. I had to redraw it a few times in order to fit everything, but seeing it in front of you helps.
It might also be useful to use Graphviz when constructing dependency graphs. That way you only have to list everything (in a text file) and then the tool will draw the (often unsightly) picture. (This is what I did for the #include dependencies in above syste,)

As others have already suggested, writing unit-tests is a great way to get into the codebase. There are a number of advantages to this approach:
It allows you to test your
assumptions about how the code
works. Adding a passing test proves
that your assumptions about that
small piece of code that you are
testing are correct. The more
passing tests you write, the better
you understand the code.
A failing unit test that reproduces
the bug you want to fix will pass
when you fix the bug and you know
that you have succeeded.
The unit tests that you write act as
documentation for the future.
The unit tests you write act as
regression tests as more bugs are
fixed.
Of course adding unit tests to legacy code is not always an easy task. Happily, a gentleman by the name of Michael Feathers has written an excellent book on the subject, which includes some great 'recipes' on adding tests to code bases without unit tests.

Some pointers:
Debug from the part which seems more
relevant to the workflow.
Use debug
strings
Get appropriate .pdb and attach the
core dump in debuggers like Windbg
or debugdiag to analyze it.
Get a person's help in your
organization who is good at
debugging. Even if he is new to your
codebase, he could be very helpful.
I had prior experience. They would
give you valuable pointers.
Per Assaf Lavie's advice, you could use static code analyzers.
The most important thing: as you
explore and debug, document
everything as you progress. At least
the person succeeding you would
suffer less.

Three things i don't see yet:
write some unit tests which use the libraries/interfaces. demonstrate/verify your understanding of them and promote their maintainability.
sometimes it is nice to create an special assertion macro to check that the other engineer's assumptions are in line with yours. you could:
not commit their uses
commit their uses, converting them to 'real' assertions after a given period
commit their uses, allowing another engineer (more familiar with the project) to dispose or promote them to real assertions
refactoring can also help. code that is difficult to read is an indication.

The first step should be try to read the code. Try to see the code where the bug is. Follow the code from main to that point ans try to see what could be wrong. Read the comments from the code(if any). Normally the function names are useful. Understand what each function does.
Once you get some idea of the code then you can start debugging the code. Put breakpoints where you don't understand the code or where you think the error can be. Start following the code line by line. Debugging is like sex. Initially painful, but slowly you start to enjoy it.

cscope + ctags are available on both Linux and Windows (via Cygwin). If you give them a chance, these tools will become indispensable to you. Although, IDEs like Visual Studio also do an excellent job with code browsing facilities as well.
In a situation like yours, because of time constraints, you are driven by symptoms. I mean that you don't have time to reconstruct the big picture / design / architecture. So you focus on the symptoms and work outwards, and each time reconstruct as much of the big picture as you need for that particular problem. But do not make "local" decisions in a hurry. Have the patience to see as much of the big picture as needed to make a good quality decision. And don't get caught in the band-aid syndrome i.e. put any old fix in that will work. It is your job to preserve the underlying architecture / design (if there is one, and to whatever extent that you can discover it).
It will be a struggle at first, as your mind "hunts" excessively. But soon the main themes in the design / architecture will emerge, and all of it will start to make sense. Think, by not thinking, grasshoppa :)

You have to have a fully reliable IDE which has a lot of debbugging tools (breakpoints, watches, and the like). The best way to familiarize yourself with a huge code is to play around with it and see how data is passed from one method to another. Also, you can reverse engineer the code so could see the relationship of the classes. :D Good Luck!

For me, there is only one way to get to know a process - Interaction. Identify the interfaces of the process/system. Then identify the input/output relationship (these steps maybe not linear). Once you do that, you can start tinkering at the code with a fair amount of confidence because you know what it is "supposed to do" then it's just a matter of finding out "how it is actually being done". For me though, getting to know the interface (Not necessarily the user interface) of the system is the key. To put it bluntly - Never touch the code first!!!

Not sure about C/C++, but coming from Java and C#, unit testing will help. In Java there's JUnit and TestNG libraries for unit testing, in C# there's NUnit and mstest. Not sure about C/C++.
Read the book 'Refactoring: Improving the Design of Existing Code' by Martin Fowler, Kent Beck, et al. Will be quite a few tips in there I'm sure that will help, and give you some guidance to improving the code.
One tip: if it aint broke, don't fix it. Don't bother trying to fix some library or really complicated function if it works. Focus on parts where there's bugs.
Write a unit test to reproduce the scenario where the code should work. The test will fail at first. Fix the code until the unit test passes successfully. Repeat :)
Once a majority of your code, the important bits that are too complex to manually debug and fix, is under automated unit tests, you'll have a safety harness of regression tests that'll make you feel more confident at changing the existing code base.

while (!codeUnderstood)
{
Breakpoints();
Run();
StepInto();
if(needed)
{
StepOver();
}
}

I don't try to get an overview of the whole system as suggested by many here. If there is something which needs fixing I learn the smallest part of the code I can to fix the bug. The next time there is an issue I'm a little more familiar and a little less daunted and I learn a little more. Eventually I'm able to support the whole shebang.
If management suggests I do a major change to something I'm not familiar with I make sure they understand the time scales and if things a really messy suggest a rewrite.

Usually the program in question will produce some kind of output ( log, console printout, dialog box ).
Find the closest place to your
problem in the program output
Search through the code base and look for the text in that output
Start putting your own printouts, nothing fancy, just printf( "Calling xxx\n" );, so you can pinpoint exactly to the point where the problem starts.
Once you pinpointed the problem spot, put a breakpoint
When you hit the breakpoint, print a stacktrace
Now you can see what players you have and start the analysis of how you've got to the wrong place.
Hopefully the names of the methods on the call stack are more meaningful than a, b and c ( seen this ), and there is some sort of comments, method documentation more meaningful than calling a ( seen this many times ).
If the source is poorly documented, don't be afraid to leave your comments once you have figured out what's going on. If program design permits it create a unit test for the problem you've fixed.

Thanks for the nice answers, quite a number of points to take up. I have worked on such situation a number of times and here is the usual procedure i follow:
Check the crash log or trace log. Check relevant trace if just a simple developer mistake if cannot evaluate in one go, then move on to 2.
Reproduce the bug! This is the most important thing to do. Some bugs are rare to occur and if you get to reproduce the bug nothing like it. It means you have a better % of cracking it.
If you cant reproduce a bug, find a alternative use case, situation where in you can actually reproduce the bug. Being able to actually debug a scenario is much more useful than just the crash log.
Head to version control! Check if the same buggy behavior exists on previous few SW versions. If NOT..Voila! You can find between what two versions the bug got introduced and You can easily get the code difference of the two versions and target the relevant area.(Sometimes it is not the newly added code which has the bug but it exposes some old leftovers.Well, We atleast have a start I would say!)
Enable the debug traces. Run the use case of the bug, check if you can find some additional information useful for investigation.
Get hold of the relevant code area through the trace log. Check out there for some code introducing the bug.
Put some breakpoints in the relevant code. Study the flow. Check the data flows.Lookout for pointers(usual culprits). Repeat till you get a hold of the flow.
If you have a SW version which does not reproduce the bug, compare what is different in the flows. Ask yourself, Whats the difference?
Still no Luck!- Arghh...My tricks have exhausted..Need to head the old way. Understand the code..and understand the code and understand it till you know what is happening in the code when that particular use case is being executed.
With newly developed understanding try debugging the code and sure the solution is around the corner.
Most important - Document the understanding you have developed about the module/s. Even small knitty gritty things. It is sure going to help you or someone just like you, someday..sometime!

You can try GNU cFlow tool (http://www.gnu.org/software/cflow/).
It will give you graph, charting control flow within program.

Do very long methods always need refactoring?

I face a situation where we have many very long methods, 1000 lines or more.
To give you some more detail, we have a list of incoming high level commands, and each generates results in a longer (sometime huge) list of lower level commands. There's a factory creating an instance of a class for each incoming command. Each class has a process method, where all the lower level commands are generated added in sequence. As I said, these sequences of commands and their parameters cause quite often the process methods to reach thousands of lines.
There are a lot of repetitions. Many command patterns are shared between different commands, but the code is repeated over and over. That leads me to think refactoring would be a very good idea.
On the contrary, the specs we have come exactly in the same form as the current code. Very long list of commands for each incoming one. When I've tried some refactoring, I've started to feel uncomfortable with the specs. I miss the obvious analogy between the specs and code, and lose time digging into newly created common classes.
Then here the question: in general, do you think such very long methods would always need refactoring, or in a similar case it would be acceptable?
(unfortunately refactoring the specs is not an option)
edit:
I have removed every reference to "generate" cause it was actually confusing. It's not auto generated code.
class InCmd001 {
OutMsg process ( InMsg& inMsg ) {
OutMsg outMsg = OutMsg::Create();
OutCmd001 outCmd001 = OutCmd001::Create();
outCmd001.SetA( param.getA() );
outCmd001.SetB( inMsg.getB() );
outMsg.addCmd( outCmd001 );
OutCmd016 outCmd016 = OutCmd016::Create();
outCmd016.SetF( param.getF() );
outMsg.addCmd( outCmd016 );
OutCmd007 outCmd007 = OutCmd007::Create();
outCmd007.SetR( inMsg.getR() );
outMsg.addCmd( outCmd007 );
// ......
return outMsg;
}
}
here the example of one incoming command class (manually written in pseudo c++)

Code never needs refactoring. The code either works, or it doesn't. And if it works, the code doesn't need anything.
The need for refactoring comes from you, the programmer. The person reading, writing, maintaining and extending the code.
If you have trouble understanding the code, it needs to be refactored. If you would be more productive by cleaning up and refactoring the code, it needs to be refactored.
In general, I'd say it's a good idea for your own sake to refactor 1000+ line functions. But you're not doing it because the code needs it. You're doing it because that makes it easier for you to understand the code, test its correctness, and add new functionality.
On the other hand, if the code is automatically generated by another tool, you'll never need to read it or edit it. So what'd be the point in refactoring it?

I understand exactly where you're coming from, and can see exactly why you've structured your code the way it is, but it needs to change.
The uncertainty you feel when you attempt to refactor can be ameliorated by writing unit tests. If you've tests specific to each spec, then the code for each spec can be refactored until you're blue in the face, and you can have confidence in it.
A second option, is it possible to automatically generate your code from a data structure?
If you've a core suite of classes that do the donkey work and edge cases, you can auto-generate the repetitive 1000 line methods as often as you wish.
However, there are exceptions to every rule.
If the methods are a literal interpretation of the spec (very little additional logic), and the specs change infrequently, and the "common" portions (i.e. bits that happen to be the same right now) of the specs change at different times, and you're not going to be asked to get a 10x performance gain out of the code anytime soon, then (and only then) . . . you may be better off with what you have.
. . . but on the whole, refactor.

Yes, always. 1000 lines is at least 10x longer than any function should ever be, and I'm tempted to say 100x, except that when dealing with input parsing and validation it can become natural to write functions with 20 or so lines.
Edit: Just re-read your question and I'm not clear on one point - are you talking about machine generated code that no-one has to touch? In which case I would leave things as they are.

Refectoring is not the same as writing from scratch. While you should never write code like this, before you refactor it, you need to consider the costs of refactoring in terms of time spent, the associated risks in terms of breaking code that already works, and the net benefits in terms of future time saved. Refactor only if the net benefits outweigh the associated costs and risks.
Sometimes wrapping and rewriting can be a safer and more cost effective solution, even if it appears expensive at first glance.

Long methods need refactoring if they are maintained (and thus need to be understood) by humans.

As a rule of thumb, code for humans first. I don't agree with the common idea that functions need to be short. I think what you need to aim at is when a human reads your code they grok it quickly.
To this effect it's a good idea to simplify things as much as possible--but not more than that. It's a good idea to delegate roughly one task for each function. There is no rule as for what "roughly one task" means: you'll have to use your own judgement for that. But do recognize that a function split into too many other functions itself reduces readability. Think about the human being who reads your function for the first time: they would have to follow one function call after another, constantly context-switching and maintaining a stack in their mind. This is a task for machines, not for humans.
Find the balance.
Here, you see how important naming things is. You will see it is not that easy to choose names for variables and functions, it takes time, but on the other hand it can save a lot of confusion on the human reader's side. Again, find the balance between saving your time and the time of the friendly humans who will follow you.
As for repetition, it's a bad idea. It's something that needs to be fixed, just like a memory leak. It's a ticking bomb.
As others have said before me, changing code can be expensive. You need to do the thinking as for whether it will pay off to spend all this time and effort, facing the risks of change, for a better code. You will possibly lose lots of time and make yourself one headache after another now, in order to possibly save lots of time and headache later.

Take a look at the related question How many lines of code is too many?. There are quite a few tidbits of wisdom throughout the answers there.
To repost a quote (although I'll attempt to comment on it a little more here)... A while back, I read this passage from Ovid's journal:
I recently wrote some code for
Class::Sniff which would detect "long
methods" and report them as a code
smell. I even wrote a blog post about
how I did this (quelle surprise, eh?).
That's when Ben Tilly asked an
embarrassingly obvious question: how
do I know that long methods are a code
smell?
I threw out the usual justifications,
but he wouldn't let up. He wanted
information and he cited the excellent
book Code Complete as a
counter-argument. I got down my copy
of this book and started reading "How
Long Should A Routine Be" (page 175,
second edition). The author, Steve
McConnell, argues that routines should
not be longer than 200 lines. Holy
crud! That's waaaaaay to long. If a
routine is longer than about 20 or 30
lines, I reckon it's time to break it
up.
Regrettably, McConnell has the cheek
to cite six separate studies, all of
which found that longer routines were
not only not correlated with a greater
defect rate, but were also often
cheaper to develop and easier to
comprehend. As a result, the latest
version of Class::Sniff on github now
documents that longer routines may not
be a code smell after all. Ben was
right. I was wrong.
(The rest of the post, on TDD, is worth reading as well.)
Coming from the "shorter methods are better" camp, this gave me a lot to think about.
Previously my large methods were generally limited to "I need inlining here, and the compiler is being uncooperative", or "for one reason or another the giant switch block really does run faster than the dispatch table", or "this stuff is only called exactly in sequence and I really really don't want function call overhead here". All relatively rare cases.
In your situation, though, I'd have a large bias toward not touching things: refactoring carries some inherent risk, and it may currently outweigh the reward. (Disclaimer: I'm slightly paranoid; I'm usually the guy who ends up fixing the crashes.)
Consider spending your efforts on tests, asserts, or documentation that can strengthen the existing code and tilt the risk/reward scale before any attempt to refactor: invariant checks, bound function analysis, and pre/postcondition tests; any other useful concepts from DBC; maybe even a parallel implementation in another language (maybe something message oriented like Erlang would give you a better perspective, given your code sample) or even some sort of formal logical representation of the spec you're trying to follow if you have some time to burn.
Any of these kinds of efforts generally have a few results, even if you don't get to refactor the code: you learn something, you increase your (and your organization's) understanding of and ability to use the code and specifications, you might find a few holes that really do need to be filled now, and you become more confident in your ability to make a change with less chance of disastrous consequences.
As you gain a better understanding of the problem domain, you may find that there are different ways to refactor you hadn't thought of previously.
This isn't to say "thou shalt have a full-coverage test suite, and DBC asserts, and a formal logical spec". It's just that you are in a typically imperfect situation, and diversifying a bit -- looking for novel ways to approach the problems you find (maintainability? fuzzy spec? ease of learning the system?) -- may give you a small bit of forward progress and some increased confidence, after which you can take larger steps.
So think less from the "too many lines is a problem" perspective and more from the "this might be a code smell, what problems is it going to cause for us, and is there anything easy and/or rewarding we can do about it?"
Leaving it cooking on the backburner for a bit -- coming back and revisiting it as time and coincidence allows (e.g. "I'm working near the code today, maybe I'll wander over and see if I can't document the assumptions a bit better...") may produce good results. Then again, getting royally ticked off and deciding something must be done about the situation is also effective.
Have I managed to be wishy-washy enough here? My point, I think, is that the code smells, the patterns/antipatterns, the best practices, etc -- they're there to serve you. Experiment to get used to them, and then take what makes sense for your current situation, and leave the rest.

I think you first need to "refactor" the specs. If there are repetitions in the spec it also will become easier to read, if it makes use of some "basic building blocks".
Edit: As long as you cannot refactor the specs, I wouldn't change the code.
Coding style guides are all made for easier code maintenance, but in your special case the ease of maintenance is achieved by following the spec.
Some people here asked if the code is generated. In my opinion it does not matter: If the code follows the spec "line by line" it makes no difference if the code is generated or hand-written.

1000 thousand lines of code is nothing. We have functions that are 6 to 12 thousand lines long. Of course those functions are so big, that literally things get lost in there, and no tool can help us even look at high level abstractions of them. the code is now unfortunately incomprehensible.
My opinion of functions that are that big, is that they were not written by brilliant programmers but by incompetent hacks who shouldn't be left anywhere near a computer - but should be fired and left flipping burgers at McDonald's. Such code wreaks havok by leaving behind features that cannot be added to or improved upon. (too bad for the customer). The code is so brittle that it cannot be modified by anyone - even the original authors.
And yes, those methods should be refactored, or thrown away.

Do you ever have to read or maintain the generated code?
If yes, then I'd think some refactoring might be in order.
If no, then the higher-level language is really the language you're working with -- the C++ is just an intermediate representation on the way to the compiler -- and refactoring might not be necessary.

Looks to me that you've implemented a separate language within your application - have you considered going that way?

It has been my understanding that it's recommended that any method over 100 lines of code be refactored.

I think some rules may be a little different in his era when code is most commonly viewed in an IDE. If the code does not contain exploitable repetition, such that there are 1,000 lines which are going to be referenced once each, and which share a significant number of variables in a clear fashion, dividing the code into 100-line routines each of which is called once may not be that much of an improvement over having a well-formatted 1,000-line module which includes #region tags or the equivalent to allow outline-style viewing.
My philosophy is that certain layouts of code generally imply certain things. To my mind, when a piece of code is placed into its own routine, that suggests that the code will be usable in more than one context (exception: callback handlers and the like in languages which don't support anonymous methods). If code segment #1 leaves an object in an obscure state which is only usable by code segment #2, and code segment #2 is only usable on a data object which is left in the state created by #1, then absent some compelling reason to put the segments in different routines, they should appear in the same routine. If a program puts objects through a chain of obscure states extending for many hundreds of lines of code, it might be good to rework the design of the code to subdivide the operation into smaller pieces which have more "natural" pre- and post- conditions, but absent some compelling reason to do so, I would not favor splitting up the code without changing the design.

For further reading, I highly recommend the long, insightful, entertaining, and sometimes bitter discussion of this topic over on the Portland Pattern Repository.

I've seen cases where it is not the case (for example, creating an Excel spreadsheet in .Net often requires a lot of line of code for the formating of the sheet), but most of the time, the best thing would be to indeed refactor it.
I personally try to make a function small enough so it all appears on my screen (without affecting the readability of course).

1000 lines? Definitely they need to be refactored. Also not that, for example, default maximum number of executable statements is 30 in Checkstyle, well-known coding standard checker.

If you refactor, when you refactor, add some comments to explain what the heck it's doing.
If it had comments, it would be much less likely a candidate for refactoring, because it would already be easier to read and follow for someone starting from scratch.

Then here the question: in general, do
you think such very long methods would
always need refactoring,
if you ask in general, we will say Yes .
or in a
similar case it would be acceptable?
(unfortunately refactoring the specs
is not an option)
Sometimes are acceptable, but is very unusual, I will give you a pair of examples:
There are some 8 bit microcontrollers called Microchip PIC, that have only a fixed 8 level stack, so you can't nest more than 8 calls, then care must be taken to avoid "stack overflow", so in this special case having many small function (nested) is not the best way to go.
Other example is when doing optimization of code (at very low level) so you have to take account the jump and context saving cost. Use it with care.
EDIT:
Even in generated code, you could need to refactorize the way its generated, for example for memory saving, energy saving, generate human readable, beauty, who knows, etc..

There has been very good general advise, here a practical recommendation for your sample:
common patterns can be isolated in plain feeder methods:
void AddSimpleTransform(OutMsg & msg, InMsg const & inMsg,
int rotateBy, int foldBy, int gonkBy = 0)
{
// create & add up to three messages
}
You might even improve that by making this a member of OutMsg, and using a fluent interface, such that you can write
OutMsg msg;
msg.AddSimpleTransform(inMsg, 12, 17)
.Staple("print")
.AddArtificialRust(0.02);
which can be an additional improvement under circumstances.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js