Finding "dead code" in a large C++ legacy application [closed] - c++

I'm currently working on a large and old C++ application that has had many developers before me. There is a lot of "dead code" in the project, classes and functions that aren't used by anyone anymore.
What tools are available for C++ to analyze a large code base and detect dead code so it can be refactored away? Note: I'm not talking about test coverage tools like gcov.
How do you find dead code in your project?

You'll want to use a static analysis tool:
StackOverflow: What open source C++ static analysis tools are available?
Wikipedia: List of tools for static code analysis
The main gotcha I've run into is that you have to be careful about libraries that are used from somewhere you don't control or don't have access to. If your project builds a library and you delete a function from one of its classes, you can break an external consumer you didn't know used the code.
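A hypothetical illustration of the trap (file and function names are made up): a function exported from a library can look dead to any analysis of your own sources, because its only callers live in code you never see.

// legacy_api.h -- header shipped to external consumers (hypothetical example)
extern "C" int legacy_checksum(const char* data, int len);

// legacy_api.cpp
// No caller exists anywhere in *this* repository, so whole-program
// analysis flags it as dead -- but third-party code links against it.
extern "C" int legacy_checksum(const char* data, int len) {
    int sum = 0;
    for (int i = 0; i < len; ++i)
        sum = (sum * 31 + data[i]) % 65521;
    return sum;
}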

You can use Cppcheck for this purpose:
$ cppcheck --enable=unusedFunction .
Checking 2380153.c...
1/2 files checked 0% done
Checking main.c...
2/2 files checked 0% done
[2380153.c:1]: (style) The function '2380153' is never used.
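For illustration, a minimal translation unit of the kind this check flags (the function names are invented):

// dead.cpp -- checked with: cppcheck --enable=unusedFunction dead.cpp
static int still_used(int x) { return x + 1; }  // has a caller below: not reported

int never_called(int x) { return x * 2; }  // no caller anywhere: cppcheck reports
                                           // "The function 'never_called' is never used."

int main() { return still_used(41); }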

Caolán McNamara's callcatcher is very effectively used within the LibreOffice project (~6 MLOC) to find dead code.

I think your best bet would probably be a coverage tool. There are plenty for both *nix and Windows. If you have unit tests, it's easy: if you have low test coverage, then the uncovered code is either dead or not tested yet (you want both pieces of information anyway). If you don't have unit tests, build your app with instrumentation provided by one of those tools, run it through some (ideally all) execution paths, and read the report. You get the same information as with unit tests; it will just require a lot more work.
Since you're using Visual Studio, here are a couple of links you could consider:
Coverage meter
BullseyeCoverage
Neither of them is free, or even cheap, but the outcome is usually worth it.
On *nix-like platforms, gcov coupled with tools like zcov or lcov is a really great choice.

Nothing beats familiarity with the code. Except perhaps rigorous pruning as one goes along.
Sometimes what looks like dead wood is used as scaffolding for unit tests, or it appears to be alive simply because legacy unit tests exercise it, even though it is never exercised outside of the tests. A short while ago I removed over 1000 LOC that supported external CAD model translators; we had tests invoking those translators, but they had been unsupported for 8+ years and there was no way a user of the application, even if they wanted to, could ever invoke them.
Unless you are rigorous about getting rid of the dead wood, you will find your team maintaining the stuff for years.

One approach is to use the "Find All References" context menu item on class and function names. If a class/function is only referenced within itself, it is almost certainly dead code.
Another approach, based on the same idea, is to remove (comment out) files or functions from the project and see what errors you get.
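A less destructive variant of the comment-out idea, assuming C++14 or later, is to mark the suspect function [[deprecated]] and rebuild: every surviving call site produces a compiler warning, and a silent build is (weak) evidence that the function is dead. A sketch with made-up names:

// Mark the suspect instead of deleting it outright (requires C++14).
[[deprecated("candidate for removal -- contact the build team if you hit this")]]
int compute_legacy_discount(int price);

// Any remaining call site now emits a compiler warning:
int total(int price) {
    return compute_legacy_discount(price);  // warning: 'compute_legacy_discount' is deprecated
}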

See our SD C++ Test Coverage.
You need to do a lot of dynamic testing to exercise the code, to make sure you hit the maximum amount of coverage. Code "not covered" may or may not be dead; perhaps you simply didn't have a test case to exercise it.

Although not specifically aimed at dead code, I found Source Navigator
http://sourcenav.berlios.de/
quite useful, if cumbersome to set up and a bit buggy. That was a year ago on Linux (Fedora).

Testing workflow for small (i.e. one person) design in SystemVerilog [closed]

I started implementing a design in SystemVerilog, but I'm a bit lost as far as testing is concerned. I tried using straightforward SystemVerilog for verification, but it seems limited:
Errors are spotted by going through the log (even $error and failed assertions don't stop the simulation), so they can easily be missed.
I cannot (?) run all the tests, as Vivado allows only one test to be active at a time.
I could put everything in a single test simulation, but the waveform for debugging then gets too long, as it mixes various tests.
I can try to create my own framework, but that sounds like reinventing the wheel, which is a bad idea.
I know of SVUnit, but it seems to work with expensive simulators, not the xsim I have a license for. I'm trying to look at UVM, but I'm not sure the investment of time is worth it.
What would be a good SystemVerilog test workflow for a person coming from software (drivers), working on a personal, one-person FPGA project?
The free and open source VUnit provides a single-click (= single-command) solution that will find your test suites and test cases, (re)compile what's needed (no recompiles between tests), run the simulations, and then present the pass/fail results.
VUnit started as a VHDL unit testing framework, but since much of the top-level automation is language agnostic, it was updated to also support SystemVerilog. The difference between the VHDL and SV support is that VUnit provides a number of testbench support packages for VHDL which you won't find for SV. On the other hand, some of that functionality is already part of SV.
Find out the very basics here. The UART example above can be found in the examples directory.
VUnit supports simulators from Mentor, Aldec and Cadence and also the open source GHDL. It doesn't support Vivado today but it's being discussed. However, you can use the free ModelSim Altera Edition.
Disclaimer: I'm one of the authors for VUnit.
Running all tests usually isn't done in one simulator invocation. It is handled as multiple invocations by a different tool, which usually does more (distributing jobs across a compute farm, centralizing status, etc.).
Determining whether a test passed or failed is usually done by inspecting the log file. If an error was detected, it should show up in the log and you can grep for it. The simulator's exit code isn't used for this, since non-zero exit codes mean that something was wrong with the tool invocation, not with the simulation itself.
In your case, since you only have the simulator available, you have to build a lot of the infrastructure yourself. You'll need a script that can run a single test and determine whether it was a PASS or a FAIL (via grep, Perl, etc.). You can then write another script that loops over all of your tests, calls the first script, and produces a summary.
Have you tried VUnit? If you are interested in running UVM, we have a port of the UVM base classes that runs on the free ModelSim (with some limitations, such as no randomisation, coverage, or SVA) as part of Go2UVM (www.go2uvm.org).

Choosing a scripting/build tool [closed]

We are currently working on a project with both ActionScript and Java. Up to now, we were using Ant as our main build tool, but the sheer amount of duplication it implies and its lack of flexibility (we are building a fairly large number of small sub-projects, and copying all of the build files every time is a pain) are pushing us towards a change of tools.
EDIT3: I'm done rewriting all of our builds in Gant, and even though it's not perfect, it downsized our build files massively and made adding new projects much more straightforward, so I'd definitely recommend Gant to people who don't want to change their build philosophy and project structure, but are just looking for a more convenient tool than Ant. I might have a look at Gradle and/or Ivy one of these days.
EDIT2: After trying out Buildr, we ruled it out because it does far more than we actually need. I'm now trying Gant, which looks like just what we need, but the documentation is pretty thin. Is it worth moving all the way to Gradle, or is the project not mature enough yet?
EDIT: I'll try to clarify our problems with Ant. We have several sub-projects with similar layouts which we have to compile and run tests for. Once that's done, some of them need to be packaged together to produce executables (namely a client, a server, and some stand-alone demos). The work to describe our standard layout in Ant is pretty long, and it's awfully difficult to introduce small variations without rewriting the whole macro. (Say, one of the projects needs to grab its visual assets from a different repository.)
Gant, which would allow us to reuse the Ant tasks that are already out there for both Flash and Java
Gradle, for the same reasons, even though it looks slightly more complicated
Rake, which seems to be highly recommended; the downsides are the experimental ActionScript integration and our lack of Ruby knowledge
Buildr, which looks pretty cool, but here again: no knowledge of Ruby
SCons, which seems to have less momentum, but Python is a pretty cool scripting language
Maven was considered, but has been eliminated because of its inherent complexity and apparent error-proneness. We are currently leaning towards Gant. Do any of you have experience with several of these tools? How do they compare?
Our needs are pretty basic: compile and package projects, deploy them to several targets, and some scripting capability (to run project-specific performance tests, for instance). It may also be worth noting that we use Hudson for continuous integration.
I'm not sure switching to Gant will solve your problems. Gant is just writing build files in Groovy instead of XML. I think your issue lies more in the way you're using Ant. It's hard to say without more details, but phrases like "sheer amount of duplication" and "copying build files around" make me think you could be using Ant more efficiently.
If you haven't already, look at your Ant tasks and see if you can refactor them to eliminate that duplication. Also, check out the -find option to ant if you haven't seen it: it searches parent directories for the build file, so you shouldn't need to copy build files around.
BTW, Ivy is for dependency management, not building.
I know that the people in our company who do Java for a living swear by Ivy, but having no experience with it myself, I don't have enough facts to back this suggestion up with technical arguments. They did mention the lack of duplication as a plus compared to the plain Ant they used before. Caveat emptor.

What is the most common way of understanding a very large C++ application? [closed]

When a new C++ project is passed along to you, what is the standard way of stepping through it and becoming acquainted with the entire codebase? Do you just start at the top file and read through all x-hundred files? Do you use a tool to generate information for you? If so, which tool?
I use change requests/bug reports to guide my learning of a new project. It never makes a lot of sense to me to try to consume the entirety of something all at once. A change order or bug report gives me guidance to focus on one tendril of the system, tracing its activity through the code.
After a reasonable amount of these, I can get a good understanding of the fundamentals of the project.
Here's my general process:
Start by understanding what the application does and how it's used. (I see way too many developers completely skip this critical step.)
Search for any developer documentation related to the project. (Realize, however, that it will nearly always be wrong and out of date; it will still give helpful clues.)
Try to figure out the logic of the organization. How is the main architecture defined? What large-scale patterns are used? (e.g. MVC, MVP, IoC, etc.)
Try to figure out the main classes related to the "large" objects in the project. This helps for the point above.
Slowly start refactoring and cleaning up as you try to maintain the project.
Usually, that will get me at least somewhat up to speed. However, usually I end up given a project like this because something has to be fixed or enhanced, and timing isn't always realistic, in which case I often just have to jump in and pray.
Start working on it, perhaps by:
adding a small feature, or
stepping through application startup in the debugger.
You could try running it through doxygen to at least give a browsable set of documentation - but basically the only way is a debugger, some trace/std::cerr messages and a lot of coffee.
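If you go the trace-message route, a tiny macro keeps the noise cheap to add and remove again. A minimal sketch (the macro and function names are made up):

#include <iostream>

// Crude call tracing: define TRACE_CALLS while exploring, leave it off otherwise.
#ifdef TRACE_CALLS
#define TRACE() std::cerr << __FILE__ << ":" << __LINE__ << " " << __func__ << "\n"
#else
#define TRACE() ((void)0)
#endif

void load_config() {
    TRACE();  // prints e.g. "config.cpp:12 load_config" when tracing is on
    // ... existing logic ...
}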
The suggestion to write test cases is the basis of Working Effectively with Legacy Code and the point of the CppUnit test library. Whether you can take this approach depends on your team and your setup; if you are the new junior, you can't really rewrite the app to support testing.
Try writing unit tests for the various classes.
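With legacy code there is often no specification to test against, so a characterization test, which pins down whatever the code does today, is a practical starting point. A sketch using plain assert (the function and values are hypothetical):

#include <cassert>

// Stand-in for a real legacy function whose intended behaviour nobody remembers.
int legacy_rounding(int cents) { return (cents / 10) * 10 + (cents % 10 >= 5 ? 10 : 0); }

int main() {
    // Characterization tests: assert whatever the code does *today*,
    // so later refactoring can be checked against the pinned-down behaviour.
    assert(legacy_rounding(104) == 100);  // observed output, not a spec
    assert(legacy_rounding(105) == 110);  // observed output, not a spec
    return 0;
}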
There is one tool I know of that may help you: CppDepend, currently in beta, which helps you understand the relations between the classes and the projects in the solution.
Other than that you can try to understand the code by reading it:
Start with the header (.h/.hpp) files; reading them helps you understand the "interfaces" between the classes.
If the solution has several projects, try to understand the responsibility of each one.
Find someone who is familiar with the project and can give you an overview; 5 minutes with the right person can save you an hour with the debugger.
Understanding how the code is used is usually very helpful.
If this is a library, look at client code and unit tests. If there aren't any unit tests, write some.
If this is an application, understand how it works - in detail. Again read & write unit tests.
Essentially, it's all about the interfaces. Understand the interfaces and you'll go a long way towards understanding how the code works. By interface, I mean the API if it's a library, the UI if it's a graphical application, or the content of the inbound and outbound messages if it's a server.
Firstly, how large is large?
I don't think you can answer this without knowing the other half of the scenario. What is the requirement for changing the code?
Are you just supporting/fixing it when it goes wrong? Developing new functionality? Porting the code to a new platform? Upgrading the code for a new C++ compiler?
Depending on what your requirement is, I would start in different ways.
Here's how I approach the problem
Start by fixing easy bugs. Do extreme diligence on these bugs and use the debugger heavily to find the problem.
Code review every change that goes into the system. On an unbelievably large system, pick a smaller subset and review all of the changes to it.
And most importantly: Ask a lot of questions!
Things to do:
Look at what the sales brochure tells you it does, to set the scope of your expectations
Install it; see what options you have in the installer, and read the quick start/install guide
Find out what it does: does it even execute? Do you have multiple executables?
Is there a developer setup guide/wiki, with pointers to the VCS?
Get the code and make your build environment work; document the SDKs and build tools you need, if that isn't done already
Look at the build process and project dependencies; is there a build machine/CI service?
Look at the generated doc output (if there is any!)
Find an interesting piece of the solution and see how it works: what are the entry points, and what are the main classes and interfaces?
Replicate bugs; stop at interesting features in the program to get an overview, and work down to tracing code.
Start to fix things, but make sure you are really fixing them by having appropriate unit tests that show the behaviour is broken now and will pass once it is fixed.
I have been incorporating source code from some mid-sized projects. The most important lesson I learned from this process is that before going into the source, you must be sure which part of it interests you most. Then dig into that piece by grepping for logging/warning messages or by looking at class/function names. To understand the code, run it in a debugger or insert your own warning messages. In short, focus on the things you are interested in; the last thing you want is to read all of the source code.
Try generating documentation using Doxygen or something similar, if it hasn't been done already.
Walk through the API, and where something is unclear to you, look at the code; if you still don't get it, ask a developer who has worked on it before.
Always examine whatever you have to work on first.
Take a look at whatever UML documents you've got, if you don't have any:
Smack the developer/s who worked on it. It's a shame they didn't do something as basic as UML class diagrams.
Try to generate them from the code. They will not be accurate, but they will give you a head start.
If there is something specific that you don't understand or think is wrong, ask the team who developed it. They will probably know better.
Fixing bugs works just fine for any project, not just a C++ one.
Browse around the file hierarchy with Total Commander and try to get an overview of the structure. Try to identify where the main header files are located, and find the file where the main() function lives.
Ask a person who is already familiar with the codebase to outline the basic concepts that were used during development.
They don't need to explain every detail, but should give you a rough idea of how the software works and how the individual modules are connected with each other.
Additionally, what I've found useful in the past is to first set up a working development environment before starting to think about the code.
Read the documentation. If possible, speak with the former maintainer. Then, check out the code bases from the first commit and the first release from the VCS and spend some time looking at them. Don't go for full understanding yet, just skim and understand which are the major components and what they do. Then read the change logs and the release notes for each of the major releases. Then start breaking everything and see what breaks what. Do some bug fixes. Review the test suite and understand which component each test is focused on. Add some tests. Step through the code in a debugger. Repeat.
As already said, grab doxygen and build HTML documentation for source code.
If the code is well-designed, you'll easily see a nice class hierarchy, clear call graphs, and many other things that would otherwise take ages to uncover. When a certain part's behavior appears unclear, look at the unit tests or write your own.
However, if the structure appears to be flat, or messy, or both, you may find yourself in some sort of trouble.
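If you take this route on a codebase with no comments, a handful of Doxygen settings make the output far more useful. A minimal sketch, assuming the stock tooling (generate a default Doxyfile with doxygen -g, then override these keys; the project name and paths are placeholders):

# Doxyfile fragment for browsing an undocumented legacy codebase
PROJECT_NAME   = LegacyApp
INPUT          = src include
RECURSIVE      = YES
EXTRACT_ALL    = YES   # document everything, even entities without comments
HAVE_DOT       = YES   # requires Graphviz
CALL_GRAPH     = YES   # per-function call graphs
CALLER_GRAPH   = YES   # and who-calls-me graphs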
I'm not sure there is a standard way. There are some for-pay tools that will do C++ class diagrams/call graphs and provide some kind of code-level view. doxygen is a good free one. My low-tech approach is to find the top-level file and start to sort through what it provides and how...taking notes if needed.
In C++, the most common problem is that a lot of energy and time is wasted on low-level tasks such as memory management.
Things that are no-brainers in managed languages are a pain to do in C++.

How do you implement unit-testing in large scale C++ projects? [closed]

I believe strongly in using unit tests as part of building large multi-platform applications. We are currently planning on having our unit tests in a separate project. This has the benefit of keeping our code base clean. I think, however, that this separates the test code from the implementation of the unit. What do you think of this approach, and are there any tools like JUnit for C++ applications?
There are many unit test frameworks for C++.
CppUnit is certainly not the one I would choose (at least in its stable 1.x version), as it lacks many tests and requires a lot of redundant lines of code.
So far, my preferred framework is CxxTest, and I plan on evaluating Fructose some day.
Anyway, there are a few papers that evaluate C++ unit testing frameworks:
Exploring the C++ Unit Testing Framework Jungle, By Noel Llopis
an article in Overload Journal #78
That's a reasonable approach.
I've had very good results both with UnitTest++ and Boost.Test
I've looked at CppUnit, but to me, it felt more like a translation of the JUnit stuff than something aimed at C++.
Update: These days I prefer using Catch. I found it to be effective and simple to use.
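To give a flavour of it, a complete test file in the single-header Catch looks roughly like this (the function under test is a stand-in, not part of Catch):

#define CATCH_CONFIG_MAIN  // ask Catch to generate main() for us
#include "catch.hpp"

int add(int a, int b) { return a + b; }  // stand-in for real code under test

TEST_CASE("addition behaves like integer addition", "[math]") {
    REQUIRE(add(2, 2) == 4);
    REQUIRE(add(-1, 1) == 0);
}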
You should separate your base code into a shared (dynamic) library and then write the major part of your unit tests against this library.
Two years ago (2008) I was involved in the large LSB Infrastructure project run by The Linux Foundation. One of the aims of this project was to write unit tests for 40,000 functions from the Linux core libraries. In the context of this project we created the AZOV technology and a tool named API Sanity Autotest in order to automatically generate all the tests. You may try using this tool to generate unit tests for your base library (or libraries).
I think you're on the right path with unit testing, and it's a great plan for improving the reliability of your product.
Unit testing alone is not going to solve all your problems when porting your application to different platforms or even different operating systems, though. The reason lies in the process unit testing uses to uncover bugs in your application: it simply throws as many inputs as imaginable at your system and waits for a result on the other end. It's like getting a monkey to pound constantly at the keyboard and observing the results (beta testers).
To take it to the next step, beyond good unit testing, you need to focus on the internal design of your application. The best approach I found was to use a design process called "contract programming" or "design by contract". Another book that is very helpful for building reliability into your core design is:
Debugging the Development Process: Practical Strategies for Staying Focused, Hitting Ship Dates, and Building Solid Teams.
In our development team, we looked very closely at what we consider a programmer error, a developer error, and a design error, and at how we could build reliability into our software package through both unit testing and DbC, following the advice of Debugging the Development Process.
I use UnitTest++. The tests are built as a separate project, but the test sources live right next to the actual code: they sit in a folder under the section under test.
i.e.:
MyProject\src\ <- source of the actual app
MyProject\src\tests <- the source of the tests
If you have nested folders (and who doesn't) then they too will have their own \tests subdirectory.
CppUnit is a direct equivalent of JUnit for C++ applications:
http://cppunit.sourceforge.net/cppunit-wiki
Personally, I created the unit tests in a different project, and created a separate build configuration which built all the unit tests and the dependent source code. In some cases I wanted to test private member functions of a class, so I made the test class a friend of the class under test, but hid the friend declarations in "non-test" configurations behind the preprocessor.
I ended up doing these coding gymnastics because I was integrating tests into legacy code, however. If you are starting out with unit testing in mind, a better design may keep things simpler.
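A rough sketch of that trick (the class, macro, and test names are all hypothetical):

// account.h -- hypothetical class under test
class Account {
public:
    void deposit(int cents);
private:
    int balance_ = 0;
#ifdef UNIT_TEST
    // Only the test build configuration defines UNIT_TEST, so release
    // builds never grant the test access to the private members.
    friend class AccountTest;
#endif
};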
You can create a unit test project for each library in your source tree in a subdirectory of that library. You end up with a test driver application for each library, which makes it easier to run a single suite of tests. By putting them in a subdirectory, it keeps your code base clean, but also keeps the tests close to the code.
Scripts can easily be written to run all of the test suites in your source tree and collect the results.
I've been using a customized version of the original CppUnit for years with great success, but there are other alternatives now. GoogleTest looks interesting.
I use TUT: http://tut-framework.sourceforge.net/
It's very simple: header file only, no macros. It can generate XML results.
CxxTest is also worth a look: a lightweight, easy-to-use, cross-platform JUnit/CppUnit/xUnit-like framework for C++. We find it very straightforward to add and develop tests.
Aeryn is another C++ testing framework worth looking at.

how to handle code that is deemed dangerous to change, but stable? [closed]

What is the best way to handle a big team that has access to stable but not-so-pretty code that is easy to introduce bugs into?
I'm looking for something along the lines of SVN locking the file(s).
Write unit tests if you don't have them already. Then start refactoring, and keep doing regression tests upon every commit.
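One low-effort way to get that first safety net over stable-but-scary code is a golden-master test: capture the current output once, then fail on any change. A sketch with plain assert (the entry point and file name are hypothetical):

#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

std::string run_report() {
    // Stand-in for the legacy entry point; in reality this calls into the
    // code nobody dares to touch.
    return "INVOICE 42\nTOTAL 100\n";
}

int main() {
    // Golden-master test: compare today's output against a blessed copy
    // captured once, before any refactoring started.
    std::ifstream golden_file("report.golden.txt");
    std::stringstream expected;
    expected << golden_file.rdbuf();
    assert(run_report() == expected.str());
    return 0;
}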
Tell them to leave it alone.
It works; what is the benefit of changing it other than prettying it up, when the potential cost is high? You just need to explain the cost/benefit analysis.
I would hope your developers are smart enough to understand this; if not, you can use your source control system logs, rolled up tightly, to beat them to death :-).
SVN does have a setting for locking files to prevent concurrent access (similar to SourceSafe), but I would recommend building some automated unit tests and integration tests around the feared code instead. Hopefully you have a solid QA group as a safety net as well.
Write automated unit tests. If you have tests that test the code you are maintaining you can be assured that any modifications haven't broken it. Test frameworks such as JUnit can help.
Get a copy of Martin Fowler's classic book Refactoring and read it. Pay particular attention to the concept of code smells. This will point you to particular refactorings that will help with your situation.
Get a good IDE that has refactoring support built in. IDEs won't support all of the refactorings in the book but many of them will have a number of them. Eclipse and NetBeans in the Java world are free and support refactoring well.
Consider a continuous integration server like Hudson to track whether your tests are failing.
Yeah, lock it until you can write a more maintainable replacement for it.
Michael Feathers' book on legacy code will be a good read for that team. Of course, easier said than done, but that particular code can become a design debt for your software in the long run.
Black-box it in a library so it can't be messed with. Document the interface well.
Produce complete unit tests so that if it has to change, you know it still works.
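Hiding it behind a deliberately small facade might look like this (a sketch; all names are invented):

// billing_facade.h -- the only header clients are allowed to include.
// Everything behind it is the stable-but-untouchable legacy implementation.
#pragma once
#include <string>

namespace billing {
// Documented, stable entry point; legacy types never leak out of it.
std::string render_invoice(int customer_id);
}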
If you have Subversion, locking files isn't terribly practical unless the code is just a few files. Subversion doesn't let you lock sub-directories, just individual files. Plus, the lock can be broken.
What you probably want is a pre-commit hook script. You can do pretty much anything with one; I've used it to restrict access to certain subdirectories to specific people (branches, SQL scripts). Also, unless you have access to the server, you can't break a pre-commit hook.
See the Version Control with Subversion book on Implementing Repository Hooks. The Subversion distribution should include some good examples of how to do exactly this.
I'm thinking more along the lines of refactoring: if the code is hard to work with, then it needs to be redone. It may take some time, but it will likely be better in the long run, as you won't cause as many problems.
Set up automated builds and unit tests. Any kind of repository that tracks changes is good, but won't prevent bugs.
Also, only make changes that you can run right away. The Agile methodology that says release early and often helps here. That way, you can get a better understanding of the code as you get deeper into it.
Basically, if you can, start with refactoring that doesn't change the functionality. Then introduce new functionality on top of the refactored code. Do it slowly with small, deliberate changes.
Locking the source while changes are made probably won't help as much as communicating what is changing and where. Your best approach is to open up communication channels. Set up a forum using something like Slashcode where people can discuss things openly and ask questions, and leave a record.