What features are standard for a testing framework?

So, I've been developing a few programs for AutoCAD 2005 and constantly running into problems. Specifically, I've been working on a program that needs to draw lines based on absolute angles ("azimuths") and distances, converting from a special input format to degrees to radians and back. Like many other programmers, my code has gotten especially bulky and buggy; I've been stuck for almost a week and a half on a program/script that should have taken about three or four days.
I've been thinking about implementing a testing framework to make development much smoother, but unlike other languages, the one I'm working in supposedly has no libraries for it at all and, even better, it's an embedded scripting environment.
I have a few ideas about how the design might work out, but I need to explain a few things:
Executing commands/AutoLISP background
Most of the programs I write are in the form of console-like commands, much like a shell. For example, say I write a function x. In AutoLISP, it's expressed as (the slash is literally part of the syntax): (defun x (arguments / local variables) body of function). To expose it to the console, I would need to rename it from x to C:x.
So, most of my testing must be done directly from the console; I tend to avoid the inbuilt Visual Lisp editor in AutoCAD because it seems to be disjointed from the actual working environment that the program will most likely be used in, and it doesn't seem to have an actual debugger. So, I frequently have to use (print "string") or other methods to debug my code.
Ideas/Thoughts/Questions
1. What types of functions might I want to expose for a testing framework? I've heard about multiple paradigms, such as compile-time testing, asserts, test classes in Java, etc. Should I try to code an assert? Perhaps I should create tests that use reflection?
2. How and where might I want to inject tests? I've thought about writing a separate function and turning all local variables into globals to expose them to a possible testing context, but I'm still unsure how I might want to do this. AutoLISP lacks regular Lisp macros, I believe, but I think it still has very nice reflection capabilities, so it is possible for me to feed commands to the console in order to do things. I feel that an external, non-intrusive framework would make the most sense, but I'd like a more experienced answer on this.

Base functionality: with the same inputs, the system gives the same outputs.
Useful functionality: asserts. Given some setup in testing code, you run part of the program to be tested and make assertions about the output. If all of the assertions hold, print something minimal. If an assertion fails, print something more verbose to help track back what went wrong (a minimal sketch follows this list).
Incremental functionality: if something sneaks past your tests and you have to find a bug manually, write a test that will cover that bug next time.
Continual functionality: have the tests run at least once per submission to your source control system. They can run as a presubmit if failures are common but testing itself is quick, or as a postsubmit if failures are rare but testing is slow.
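As a rough sketch of the assert idea (in Python for brevity; the function names are made up, not from any existing framework, and the same shape should carry over to AutoLISP's defun/princ):

```python
# Minimal sketch of the assert idea; names are illustrative, not any real framework.
import math

failures = []

def assert_close(expected, actual, tol=1e-9, label=""):
    """Stay quiet on success, be verbose on failure."""
    if abs(expected - actual) <= tol:
        print(".", end="")                 # minimal output when things pass
    else:
        msg = f"FAIL {label}: expected {expected!r}, got {actual!r}"
        failures.append(msg)
        print("\n" + msg)                  # verbose output to help track the cause

def run_tests(tests):
    """Run a list of zero-argument test functions and print a summary."""
    for test in tests:
        test()
    print(f"\n{len(failures)} failure(s)")

# Example: the kind of unit-conversion check the azimuth program needs.
def test_degrees_to_radians():
    assert_close(math.pi, math.radians(180.0), label="180 degrees")

run_tests([test_degrees_to_radians])
```

The point of the harness is only the reporting discipline: passes are nearly silent, failures carry enough context to trace back.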

Related

What are some techniques I can use to debug my Clojure code?

I'm using CounterClockWise to develop my first Clojure project on Windows 7. Besides JavaScript (which I'm not too familiar with), this is my first dynamically typed language.
The hardest part of building my project was debugging the issues I had. The technique I've been using is to sprinkle printlns in places to confirm my inputs and outputs are what I want them to be.
Compared to Java, it seems that a lot of Clojure functions accept what I'd consider garbage input and happily return nil. As a result, the runtime exception you see can come from many functions away from the cause of the problem. My point being, it can be hard to even know where to put the printlns.
And these runtime exceptions were outputting compiled-code line numbers, so they weren't very informative. Most of my functions were short and side-effect free, but the problem is my inputs were webpages. Sometimes the input to a function was the raw HTML, sometimes it was the parsed HTML (by Enlive), sometimes it was a list of links (from using a CSS-like selector on the parsed HTML). These inputs could be deeply nested, complex structures (i.e. a list of maps of maps of lists of maps), so it wasn't easy to build them up by hand. With a stack trace that wasn't pointing to the issue, I pretty much had to debug half my program and figure out how to generate inputs to each part. It was pretty time consuming.
On the IRC channel someone informed me of the stacktrace library which made debugging that much easier. It still pointed many functions away from the source of the bad input, but it was still helpful. I'm looking for more techniques like this. What are some techniques I can use to debug my code better?
Since most functions in Clojure should be rather short (or decomposed to be short) and usually work without side effects, you can always try them separately from the Clojure REPL or write tests for them.
Also you can use Java debugger/breakpoints with La Clojure plugin for IntelliJ IDEA - see my answer to: How to run/debug compojure web app via counterclockwise (or la clojure) for more details on using IDEA to run & debug Clojure projects.
If garbage input/unexpected output is an issue for you when calling other functions, maybe you could restructure your code a bit so that these touch points are encapsulated in their own functions with pre/post conditions defined. e.g. http://blog.fogus.me/2009/12/21/clojures-pre-and-post/
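The linked post covers Clojure's :pre/:post conditions specifically; as a rough language-neutral sketch of the same idea (shown here in Python, with made-up data shapes), the point is simply to fail loudly at the boundary where bad data enters, rather than several functions downstream:

```python
# Sketch of pre/post conditions at a "touch point"; the data shapes are hypothetical.
def extract_links(parsed_html):
    # Precondition: fail here, not several functions downstream.
    assert parsed_html is not None, "extract_links: got None instead of parsed HTML"
    assert not isinstance(parsed_html, str), "extract_links: expected parsed HTML, got raw text"

    links = [node for node in parsed_html if isinstance(node, dict) and node.get("tag") == "a"]

    # Postcondition: callers can rely on getting a list, never None.
    assert isinstance(links, list)
    return links

# A bad call now fails at the boundary with a readable message:
# extract_links(None)  -> AssertionError: extract_links: got None instead of parsed HTML
```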
I'm sure you could make it much more of a pleasing thing to code with a little bit of macro support. Although that might make the stack traces harder to read.

Test driven development for signal processing libraries

I work with audio manipulation, generally using Matlab for prototyping, and C++ for implementation. Recently, I have been reading up on TDD. I have looked over a few basic examples and am quite enthusiastic about the paradigm.
At the moment, I use what I would consider a global 'test-assisted' approach. For this, I write signal processing blocks in C++, and then I make a simple Matlab mex file that can interface with my classes. I subsequently add functionality, checking that the results match up with an equivalent Matlab script as I go. This works OK, but the tests become obsolete quickly as the system evolves. Furthermore, I am testing the whole system, not just units.
It would be nice to use an established TDD framework where I can have a test suite, but I don't see how I can validate the functionality of the processing blocks without tests that are as complex as the code under test. How would I generate the reference signals in a C++ test to validate a processing block without the test being a form of self-fulfilling prophecy?
If anyone has experience in this area, or can suggest some methodologies that I could read into, then that would be great.
I think it's great to apply the TDD approach to signal processing (it would have saved me months of time if I knew about it years ago when I was doing signal processing myself). I think the key is to break down your system into the lowest level components that can be independently tested, eg:
FFTs: test signals at known frequencies (DC, Fs/Nfft, Fs/2) and different phases, etc. Check the peaks and phase are as you expect; check the normalisation constant is as you expect.
Peak picking: test that you correctly find maxima/minima.
Filters: generate input at known frequencies and check the output amplitude and phase are as expected.
You are unlikely to get exactly the same results from C++ and Matlab, so you'll have to supply error bounds on some of the tests. TDD is a great way not only of verifying the correctness of the code you have, but it is also really useful when trying out different implementations. For example, if you want to replace one FFT implementation with another, there are often slight differences in the way the data is packed, or in the normalisation constant that is used. TDD will give you a high degree of confidence that the new library is correctly integrated.
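As a hedged sketch of what one of these tests might look like, here in Python/NumPy rather than your C++ framework (the tolerance is chosen arbitrarily):

```python
import numpy as np

def test_fft_peak_at_known_frequency():
    """Feed a pure tone at bin k and check the peak lands there with the expected magnitude."""
    nfft = 1024
    k = 16                                   # tone exactly on a bin: frequency = k * Fs / Nfft
    n = np.arange(nfft)
    x = np.cos(2 * np.pi * k * n / nfft)

    spectrum = np.abs(np.fft.fft(x))

    # The peak should be at bin k (its mirror sits at nfft - k).
    assert spectrum[:nfft // 2].argmax() == k

    # With this normalisation convention, a unit cosine gives a peak of Nfft/2.
    # Use an error bound rather than exact equality, as you would against Matlab.
    assert abs(spectrum[k] - nfft / 2) < 1e-6 * nfft

test_fft_peak_at_known_frequency()
```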
I do something similar for heuristics detection, and we have loads and loads of capture files and a framework to load and inject them for testing. Could you capture the reference signals to a file and do the same?
As for my 2 cents regarding TDD, it's a great way to develop, but as with most paradigms, you don't always have to follow it to the letter; there are times when you should know how to bend the rules a bit so as not to write too much throw-away code/tests. I read about one approach that said absolutely no code should be written until a test is developed, which at times can be way too strict.
On the other hand, I always like to say: "If it's not tested, it's broken" :)
It's OK for the test to be as complex or more complex than the code under development. If you change (update, refactor, bug fix) the code and not the test, the unit test will warn you that something changed and needs to be reviewed (was a bug fix for mode A supposed to change mode B?, etc.)
Furthermore, you can maintain the APIs for the individual compute components, and not just for the entire end-to-end system.
I've only just started thinking about TDD in the context of signal processing, so I can only add a bit to the previous answers. What I've done is exploit a bit of superposition to test primitives. For example, when testing an IIR filter, I independently verified the b0, b1, and b2 elements with unit and scaled gains, and then verified the a1 and a2 elements, which followed easily modeled decays. My test signal was a combination of ramp functions for the numerator and impulse functions for the denominator. I know it's a trivial example, but the process should work for plenty of linear operations. Tests should also exercise unstable regions and show that outputs explode appropriately.
In general, I expect that impulse responses are going to do a lot of the work for me, since many situations will see them reduce to trigonometric functions, which can be independently calculated. Similarly, if your operation has a series expansion, your test function could perform the expansion to a relevant order and compare against your processing block. It'll be slow, but it should work.
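For what it's worth, a small sketch of the denominator-only case (a hand-rolled first-order recursion in Python rather than any particular filter library), checking the output against the closed-form decay:

```python
# Sketch: verify a first-order IIR denominator, y[n] = x[n] - a1*y[n-1],
# against its closed-form impulse response h[n] = (-a1)**n.
def iir_first_order(a1, x):
    y, prev = [], 0.0
    for sample in x:
        prev = sample - a1 * prev
        y.append(prev)
    return y

def test_impulse_response_decay():
    a1 = 0.5
    impulse = [1.0] + [0.0] * 15
    out = iir_first_order(a1, impulse)
    for n, value in enumerate(out):
        expected = (-a1) ** n              # easily modeled decay
        assert abs(value - expected) < 1e-12, (n, value, expected)

test_impulse_response_decay()
```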

Embedding Python into C++ application

Context:
An ongoing problem we have been facing is unit testing our market data applications. These applications sit and observe data being retrieved from feeds and do something with it. Some critical events are hard to trigger and rarely occur, and it is difficult for the testers to verify that our applications perform correctly under all situations, hence we have to rely on unit tests.
These systems generally work by issuing callbacks (into our application) when an event has occurred; it is then our task to deal with it.
Solution I envision:
Is it possible to embed Python, or extend it (I'm not 100% clear on the distinction), so that a tester could fire up a Python REPL and issue function calls akin to the callbacks, which are then handled by our C++ classes? Some form of dynamic manipulation of our objects at runtime.
I do something similar to this in one of my projects by using SWIG to generate python bindings for the relevant parts of the C++ code. Then I embed the interpreter as others have suggested. Having done that I can execute python code at will (e.g. PyRun_SimpleString), which can access C++ code. Normally I end up using something like a Singleton to make accessing specific C++ objects from python easier.
Also worth a mention are directors in SWIG Python modules, which allow virtual functions to be handled much more intuitively. Depending on quite what you're doing, you might find these very helpful.
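As a rough sketch of what the tester's side could look like once the bindings exist (every name below is a hypothetical stand-in for whatever SWIG would actually generate from your C++ headers):

```python
# Stand-in for what a SWIG-generated wrapper around the C++ handler might expose;
# all names here are hypothetical, for illustration only.
class FeedHandler:
    def __init__(self):
        self._last = {}
        self._reconnecting = False

    def on_quote(self, symbol, bid, ask):
        self._last.setdefault(symbol, None)

    def on_trade(self, symbol, price, size):
        self._last[symbol] = price

    def on_feed_disconnect(self):
        self._reconnecting = True          # the rare, hard-to-trigger event

    def last_price(self, symbol):
        return self._last[symbol]

    def is_reconnecting(self):
        return self._reconnecting

# What a tester could type at the embedded REPL: fire the same callbacks the live
# feed would issue, then inspect the resulting state.
handler = FeedHandler()
handler.on_quote("ABC", bid=10.05, ask=10.07)
handler.on_trade("ABC", price=10.06, size=500)
handler.on_feed_disconnect()

assert handler.last_price("ABC") == 10.06
assert handler.is_reconnecting()
```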
What you want to do is possible, though not trivial to get right. It sounds like you want to embed (rather than extend) Python. Both topics are covered in the tutorial here.
There's quite a lot of work in mapping from C++ classes to Python classes, and there are a number of things that can go wrong in subtle ways, particularly with memory leaks and multithreading (if your existing code is multi-threaded). However, if it's only for use in a testing situation and stability is not mission-critical then it might be less of a problem.
Yes, it is possible. See this for the how.

Based on your development stack, which is easier for you and why? Debugging or logging?

Please state if you are developing on the front end, back end, or if you are developing a mobile/desktop application.
List your development stack
Language, IDE, etc.
Unit Testing or no Unit Testing
Be sure to include any AOP frameworks if used.
Tell me whether it is easier for you to use a debugger or to use logging during development, and why you feel it is easier.
I'm just trying to get a feel for why people choose to use a debugger or logging based on their development stack.
[Front end and Back end. Desktop]
As usual: it depends....
Debugging is better if you are investigating behaviour at a distinct place in the code and/or you don't know what objects you will need to inspect, and you don't mind interfering with the natural speed/order of code flow.
Logging is better if there is a known variable or variables you need to monitor often over a wide swath of the flow AND when you want the code to run naturally without interruptions. Logging is also a useful addition to unit testing.
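For instance, a minimal sketch with Python's standard logging module (the reprice example is made up): the same variable is reported across the whole flow, execution is never paused, and the verbosity can be turned down later without touching the code paths.

```python
import logging

logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(levelname)s %(name)s: %(message)s")
log = logging.getLogger("pricing")

def reprice(orders):
    for order in orders:
        price = order["qty"] * order["unit_price"]
        # Monitor the same variable across the whole flow without pausing execution.
        log.debug("order %s repriced to %.2f", order["id"], price)
        order["total"] = price
    log.info("repriced %d orders", len(orders))
    return orders

reprice([{"id": 1, "qty": 3, "unit_price": 9.99}])
```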
It entirely depends on the type of problem. A lot of the work that I do currently is done on the back-end (C#, WCF-services). I typically find it easiest to use logging to get a rough idea on where and when a problem occurs, then I try to tailor a test that provokes the behaviour, and then use debugging in order to fix it.
I mainly use logging and unit testing, though I think my greatest weakness as a programmer is that I am not proficient in using gdb. I can do the basic stuff (breakpoints, watches) but don't know enough to really tap into the power it has.
I feel some discord in the question. Debugging, according to Wikipedia, is "a methodical process of finding and reducing the number of bugs, or defects, in a computer program."
Logging is the automatic writing of trace records while the program is running.
So I use logging as a part of debugging, and I think many people do. Otherwise, what were logs made for? Well, maybe for further numeric analysis, but that's another story.

How does unit testing work when the program doesn't lend itself to a functional style?

I'm thinking of the case where the program doesn't really compute anything, it just DOES a lot. Unit testing makes sense to me when you're writing functions which calculate something and you need to check the result, but what if you aren't calculating anything? For example, a program I maintain at work relies on having the user fill out a form, then opening an external program, and automating the external program to do something based on the user input. The process is fairly involved. There's like 3000 lines of code (spread out across multiple functions*), but I can't think of a single thing which it makes sense to unit test.
That's just an example though. Should you even try to unit test "procedural" programs?
*EDIT
Based on your description, these are the places I would look to unit test:
Does the form validation of user input work correctly?
Given valid input from the form, is the external program called correctly?
Feed user input to the external program and see if you get the right output.
From the sounds of your description, the real problem is that the code you're working with is not modular. One of the benefits I find with unit testing is that code that is difficult to test is either not modular enough or has an awkward interface. Try to break the code down into smaller pieces and you'll find places where it makes sense to write unit tests.
I'm not an expert on this, but I have been confused for a while for the same reason. Somehow the applications I'm writing just don't fit the examples given for unit testing (very asynchronous and random, depending on heavy user interaction).
I realized recently (and please let me know if I'm wrong) that it doesn't make sense to write a sort of global test, but rather a myriad of small tests for each component. The easiest approach is to build the tests at the same time as, or even before, creating the actual procedures.
Do you have 3000 lines of code in a single procedure/method? If so, then you probably need to refactor your code into smaller, more understandable pieces to make it maintainable. When you do this, you'll have those parts that you can and should unit test. If not, then you already have those pieces -- the individual procedures/methods that are called by your main program.
Even without unit tests, though, you should still write tests for the code to make sure that you are providing the correct inputs to the external program and testing that you handle the outputs from the program correctly under both normal and exceptional conditions. Techniques used in unit testing -- like mocking -- can be used in these integration tests to ensure that your program is operating correctly without involving the external resource.
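As a hedged sketch of that mocking idea in Python, assuming purely for illustration that the external program is launched via subprocess (your automation API will differ):

```python
import subprocess
import unittest
from unittest import mock

def export_form(form):
    """Hypothetical piece of the program: hand validated form data to the external tool."""
    return subprocess.run(["external_tool", "--name", form["name"]],
                          capture_output=True, text=True, check=True)

class ExportFormTest(unittest.TestCase):
    @mock.patch("subprocess.run")
    def test_external_program_called_with_user_input(self, fake_run):
        export_form({"name": "Smith"})
        # The external program never actually runs; we only check how it would be invoked.
        args = fake_run.call_args[0][0]
        self.assertEqual(args, ["external_tool", "--name", "Smith"])

if __name__ == "__main__":
    unittest.main()
```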
An interesting "cut point" for your application is you say "the user fills out a form." If you want to test, you should refactor your code to construct an explicit representation of that form as a data structure. Then you can start collecting forms and testing that the system responds appropriately to each form.
It may be that the actions taken by your system are not observable until something hits the file system. Here are a couple of ideas:
Set up something like a git repository for the initial state of the file system, run a form, and look at the output of git diff. It's likely this is going to feel more like regression testing than unit testing.
Create a new module whose only purpose is to make your program's actions observable. This can be as simple as writing relevant text to a log file or as complex as you like. If necessary, you can use conditional compilation or linking to ensure this module does something only when the system is under test. This is closer to traditional unit testing as you can now write tests that say upon receiving form A, the system should take sequence of actions B. Obviously you have to decide what actions should be observed to form a reasonable test.
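A rough sketch of such an observation module (all names invented; a real version might append to a log file instead of an in-memory list):

```python
# Sketch of a module whose only job is to make the program's actions observable.
class ActionRecorder:
    def __init__(self):
        self.actions = []

    def record(self, action, **details):
        self.actions.append((action, details))

recorder = ActionRecorder()

def open_external_program(path):
    recorder.record("open_external_program", path=path)
    # ... the real launch/automation work happens here in production ...

def send_keystrokes(text):
    recorder.record("send_keystrokes", text=text)

# A test can now say: upon receiving form A, the system takes sequence of actions B.
open_external_program(r"C:\tools\editor.exe")
send_keystrokes("hello")
assert [name for name, _ in recorder.actions] == ["open_external_program", "send_keystrokes"]
```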
I suspect you'll find yourself migrating toward something that looks more like regression testing than unit testing per se. That's not necessarily bad. Don't overlook code coverage!
A final parenthetical remark: in the bad old days of interactive console applications, Don Libes created a tool called Expect, which was enormously helpful in allowing you to script a program that interacted like a user. In my opinion we desperately need something similar for interacting with web pages. I think I'll post a question about this :-)
You don't necessarily have to implement automated tests that test individual methods or components. You could implement an automated unit test that simulates a user interacting with your application, and test that your application responds in the correct way.
I assume you are manually testing your application currently, if so then think about how you could automate that and work from there. Over time you should be able to break your tests into progressively smaller chunks that test smaller sections of code. Any sort of automated testing is usually a lot better than nothing.
Most programs (regardless of the language paradigm) can be broken into atomic units which take input and provide output. As the other responders have mentioned, look into refactoring the program and breaking it down into smaller pieces. When testing, focus less on the end-to-end functionality and more on the individual steps in which data is processed.
Also, a unit doesn't necessarily need to be an individual function (though this is often the case). A unit is a segment of functionality which can be tested using inputs and measuring outputs. I've seen this when using JUnit to test Java APIs. Individual methods might not necessarily provide the granularity I need for testing, though a series of method calls will. Therefore, the functionality I regard as a "unit" is a little greater than a single method.
You should at least refactor out the stuff that looks like it might be a problem and unit test that. As a rule, a function shouldn't be that long anyway. You might find something that is worth unit testing once you start refactoring.
Good object mentor article on TDD
As a few have answered before, there are a few ways you can test what you have outlined.
First, the form input can be tested in a few ways:
What happens if invalid data is entered? Valid data? Etc.
Then each of the functions can be tested to see whether it reacts in the proper manner when supplied with various forms of correct and incorrect data.
Next, you can mock the applications that are being called so that you can make sure your application sends and processes data to and from the external programs correctly. Don't forget to make sure your program deals with unexpected data from the external program as well.
Usually, the way I figure out how I want to write tests for a program I have been assigned to maintain is to see what I do manually to test the program, and then try to figure out how to automate as much of it as possible. Also, don't restrict your testing tools to just the programming language you are writing the code in.
I think a wave of testing paranoia is spreading :) It's good to examine things to see if tests would make sense; sometimes the answer is going to be no.
The only thing that I would test is making sure that bogus form input is handled correctly. I really don't see where else an automated test would help. I think you'd want the test to be non-invasive (i.e. no record is actually saved during testing), so that might rule out the other few possibilities.
If you can't test something how do you know that it works? A key to software design is that the code should be testable. That may make the actual writing of the software more difficult, but it pays off in easier maintenance later.