How to set up hermetic tests in Julia, with per-test and/or per-fixture setup/teardown methods? - unit-testing

Most test frameworks in languages like Java, C++, JavaScript, C#, and Python include, as a core tenet, that tests should be hermetic. To that end, they provide various ways to set up the state to a known configuration before each and every test, and then clean up after each and every test. Although this is not foolproof, programmers generally don't have to worry about one test affecting another test.
How do I do that in Julia?
The built-in Test module just runs the entire test script with utter disregard for hermetic concerns on the part of the framework. Other modules I've found like ReTest.jl and XUnit.jl add various bells and whistles, but I haven't found anything that adds setup/teardown machinery.
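For reference, this is the kind of per-test setup/teardown machinery the question is describing. It is sketched with Python's unittest rather than Julia, so it is not an answer for Julia itself; the class and file names are made up for illustration.

```python
import os
import shutil
import tempfile
import unittest


class ScratchDirTest(unittest.TestCase):
    """Each test gets its own scratch directory (hypothetical example)."""

    def setUp(self):
        # Runs before every test method: build a known starting state.
        self.workdir = tempfile.mkdtemp()

    def tearDown(self):
        # Runs after every test method, even when the test itself failed.
        shutil.rmtree(self.workdir)

    def test_writes_file(self):
        path = os.path.join(self.workdir, "out.txt")
        with open(path, "w") as handle:
            handle.write("hello")
        self.assertEqual(os.listdir(self.workdir), ["out.txt"])

    def test_starts_empty(self):
        # Passes regardless of test order because setUp recreates the directory.
        self.assertEqual(os.listdir(self.workdir), [])
```

Neither test can see the file the other one wrote, which is the hermetic property being asked for.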

Related

Can unit tests be environment-dependent?

I have this very generic question about unit test. Should we write environment dependent unit tests? The unit tests may depend on file system, a specific connection or presence of a file.
I am not sure if there is an answer to it. But I found it takes time to decouple the unit tests from the environment. Does that mean it is OK to create environment-dependent unit tests? Or does it always point to a design problem when we need to write environment-dependent unit tests?
TL;DR
Whether environment dependent tests are indeed a design flaw or not is strongly dependent on your application, programming language and specific problem. Keep in mind, that tests are never "complete" or "perfect" and try not to achieve perfection at the expense of more and more time. Using modern frameworks, making your tests re-usable and mocking your environment can take a lot of the pain out of testing.
The longer answer
IMHO, there's no perfect answer to this question. Unit Tests should be as generic and re-usable as possible without trying to cover environmental edge cases at any price.
Example: Most environments should have an AMD64 processor nowadays (not regarding phones), so in most cases it is not needed to create unit tests for X32 or ARM environments, ...
Your question is technology agnostic, so it's hard to tell, if the need for environment dependent unit tests is a design flaw. If you're using Java (see below), I'd rather think it is, using C++, it might not be...
Some rules of thumb I can give from my experience (and therefore a little opinion-based):
Follow the pareto principle
Don't try to achieve 100% of what is possible in theory, but concentrate on the roughly 80% of cases that are likely to be encountered. Let's say you expect your application to run on Windows clients. Then your tests should probably cover the environmental characteristics of Windows 10 and 11. Older Windows versions, Linux, macOS, ... should not be needed.
Or, if your application mainly has a Spanish audience, you should not care so much about how it deals with Chinese characters or Canadian data protection rules.
Exclude environmental specialties as much as sane and possible when coding your application
If you can manage to make your application code mostly independent from your environment, the same will "automatically" apply to your tests. (Also, see the point about not re-inventing the wheel below)
As you didn't ask for a specific programming language, I'll refer to Java here, but similar concepts exist in most modern languages.
E. g., Java abstracts very much from your environment, so you can use File.separator, if you don't know if your application will be running on Windows (\ as separator) or Linux (/).
Not using OS-specific APIs (for example via the Java Native Interface, JNI) avoids environment-related problems as well. Unless you are dealing closely with hardware access, there shouldn't be any need to use them.
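The same idea exists outside Java. As a minimal sketch in Python (the file name is made up), `os.sep` plays the role of `File.separator`, and `os.path.join` or `pathlib` keeps the separator out of your code entirely:

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

# os.sep is the platform separator: "\\" on Windows, "/" on Linux and macOS.
config_path = os.path.join("config", "app.ini")  # hypothetical file name
assert config_path == "config" + os.sep + "app.ini"

# The pure path classes show how the same parts render on each platform,
# whatever machine this runs on.
parts = ("config", "app.ini")
print(PureWindowsPath(*parts))  # config\app.ini
print(PurePosixPath(*parts))    # config/app.ini
```

Code written this way needs no separate Windows and Linux test cases for path handling.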
Adapt your tests
Like the "real" application, usually your tests are never "done". A new problem with your software arises? Fix it and add a specific test case for this kind of bug (trying to be quite generic). If it was a problem related to the 20% of cases or environments you did not consider before, be happy - now you've got a little hold of them as well.
Package your test data with your application
Referring to Java again, you could place your test files in the resources folder and load them dynamically while testing - no need to bother the real file system of the testing machine.
Also, if you're using a database, you could spin up an in-memory instance (like H2) during testing. This will not exactly mirror your production database system, but it makes testing very easy without the danger of breaking something. In any case, database abstraction is very sophisticated nowadays, so apart from some edge cases, you should not see any difference in behaviour if you're using a good database abstraction layer (which in turn makes you less environment-dependent).
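To illustrate the in-memory database idea without Java, here is a rough sketch that swaps in Python's built-in sqlite3 for H2. The `users` table and `count_users` function are invented for the example.

```python
import sqlite3
import unittest


def count_users(conn):
    """Hypothetical function under test: counts rows in a users table."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


class CountUsersTest(unittest.TestCase):
    def setUp(self):
        # ":memory:" creates a private database that vanishes on close.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (name TEXT)")
        self.conn.executemany(
            "INSERT INTO users (name) VALUES (?)", [("ana",), ("luis",)]
        )

    def tearDown(self):
        self.conn.close()

    def test_counts_all_rows(self):
        self.assertEqual(count_users(self.conn), 2)
```

Nothing touches the file system or a shared server, so the test behaves the same on any machine.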
Make your tests re-usable
Using a tool like Maven or Gradle, you can make your tests run on a bigger variety of environments, if they are packaged as just pointed out. The need for a specific test environment will decrease.
Mocking is your friend
As with the "fake" database mentioned before, you can mock a lot of things. There are frameworks like Mockito allowing white box and black box testing with mocked objects. "Helpers" like WireMock allow you to even mock the answer you get from an external service. And there are a lot more possibilities out there!
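As a small illustration of mocking an external service, here is a sketch using Python's `unittest.mock` instead of Mockito or WireMock. The `greeting_for` function and the `fetch_profile` method are hypothetical names.

```python
from unittest.mock import Mock


def greeting_for(user_id, client):
    """Hypothetical code under test; `client` wraps some external service."""
    profile = client.fetch_profile(user_id)
    return f"Hello, {profile['name']}!"


# The mock stands in for the real client, so no network call is made.
client = Mock()
client.fetch_profile.return_value = {"name": "Ana"}

assert greeting_for(42, client) == "Hello, Ana!"
client.fetch_profile.assert_called_once_with(42)
```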
Don't re-invent the wheel
If you use one of today's advanced application frameworks (like Spring, Symfony, Boost, ...) they'll do a lot of the work for you if you make proper use of them. You'll have to read the docs and burrow into them, but then you will realize how much easier they can make your life. This does not apply to frameworks only, but also to libraries, services, components, ... And the good thing is, all of that will also come into play when it comes to testing! Most of your environmental dependencies might vanish if you use a good software foundation.

Should hand-coded stubs be unit-tested?

In an environment where the use of a mocking framework is not an option, should hand-coded stubs and fakes (etc.) themselves be unit-tested?
For example, imagine I have a fake logger class that logs to memory rather than to file. The fake is extremely simple – it just keeps a record of log messages – but as it is hand-coded should it still be unit-tested?
No, there is no need to separately test them. Your unit tests and tested code act as tests for each other.
If you are using Test Driven Development (TDD), the red phase of the red-green-refactor cycle also tests your stubs.
For me, in general, any untested code is dead code waiting to happen. So I advise you to test every part of the code you write. And, in the spirit of test-driven development, you'll be sure after any change that this part of the code still works.
In Python, I test unittest utilities with doctests instead of real unittest.
It allows me to keep application tests separate from the tests of the test utilities, and it gives developers concrete help in the docstring.
Doctest official doc
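As a sketch of what that could look like for the in-memory logger from the question (the class and method names are my own invention), the doctest both documents the fake and checks it:

```python
import doctest


class FakeLogger:
    """In-memory logger for tests (a hypothetical hand-coded fake).

    >>> log = FakeLogger()
    >>> log.info("started")
    >>> log.error("disk full")
    >>> log.messages
    [('INFO', 'started'), ('ERROR', 'disk full')]
    """

    def __init__(self):
        self.messages = []

    def info(self, text):
        # Record the message instead of writing it to a file.
        self.messages.append(("INFO", text))

    def error(self, text):
        self.messages.append(("ERROR", text))
```

Running `python -m doctest -v` on the file that contains the class executes the example in the docstring.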
I think this is one of those scenarios where, annoyingly, it depends. Typically I do not write tests for hand rolled stubs, and would 100% agree with the point @Raedwald makes that unit tests and stubs should essentially act as tests for each other. However there have been certain situations where it felt correct to do so, for example:
When there is any kind of conditional logic involved that tests rely upon (e.g. the stub calculates output based on the input provided), then it can be useful to verify this separately. I'm not saying that putting conditional logic in stubs is a good testing practice mind you, but can be useful in some scenarios.
When the stubs are being used for larger scale integration tests, because in this environment determining the root cause of a test failure can be much more difficult, and it is particularly frustrating if the root cause turns out to be a poorly coded stub.
In summary, I don't think it's a bad practice to unit test them if your gut tells you it's a good idea, however it's certainly not the end of the world if you don't bother either. I wouldn't sweat the decision too much.

Learning About Unit Testing Using When and Should and TDD

The tests at my new job are nothing like the tests I have encountered before.
When they're writing their unit tests (presumably before the code), they create a class starting with "When". The name describes the scenario under which the tests will run (the fixture). They'll create subclasses for each branch through the code. All of the tests within the class start with "should" and they test different aspects of the code after running. So, they will have a method for verifying that each mock (DOC) is called correctly and for checking the return value, if applicable. I am a little confused by this method because it means the exact same execution code is being run for each test and this seems wasteful. I was wondering if there is a technique similar to this that they may have adapted. A link explaining the style and how it is supposed to be implemented would be great. It sounds similar to some approaches of BDD I've seen.
I also noticed that they've moved the repeated calls to "execute" the SUT into the setup methods. This causes issues when they are expecting exceptions, because they can't use built-in tools for performing the check (Python unittest's assertRaises). This also means storing the return value as a backing field of the test class. They also have to store many of the mocks as backing fields. Across class hierarchies it becomes difficult to tell the configuration of each mock.
They also test code a little differently. It really comes down to what they consider an integration test. They mock out anything that steals the context away from the function being tested. This can mean private methods within the same class. I have always limited mocking to resources that can affect the results of the test, such as databases, the file system or dates. I can see some value in this approach. However, the way it is being used now, I can see it leading to fragile tests (tests that break with every code change). I get concerned because without an integration test, in this case, you could be using a 3rd party API incorrectly but your unit tests would still pass. I'd like to learn more about this approach as well.
So, any resources about where to learn more about some of these approaches would be nice. I'd hate to pass up a great learning opportunity just because I don't understand the way they are doing things. I would also like to stop focusing on the negatives of these approaches and see where the benefits come in.
If I understood your explanation in the first paragraph correctly, that's quite similar to what I often do. (Depending on whether the testing framework makes it easy or not. Also many mocking frameworks don't support it, but spy frameworks like Mockito do better.)
For example see the stack example here which has a common setup (adding things to the stack) and then a bunch of independent tests which each check one thing. Here's still another example, this time one where none of the tests (@Test) modify the common fixture (@Before), but each of them focuses on checking just one independent thing that should happen. If the tests are very well focused, then it should be possible to change the production code to make any single test fail while all other tests pass (I wrote about that recently in Unit Test Focus Isolation).
The main idea is to have each test check a single feature/behavior, so that when tests fail it's easier to find out why it failed. See this TDD tutorial for more examples and to learn that style.
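Since the question mentions Python's unittest, here is a rough sketch of that style with a shared setup and one focused check per test. A plain list stands in for the stack, and the class name is only an illustration of the "When..." naming.

```python
import unittest


class WhenTwoItemsArePushed(unittest.TestCase):
    """Fixture named after the scenario; a plain list stands in for the stack."""

    def setUp(self):
        # Shared scenario: rebuilt from scratch before every test method.
        self.stack = []
        self.stack.append("a")
        self.stack.append("b")

    # unittest only collects methods prefixed with "test" by default,
    # hence "test_should_..." rather than a bare "should_...".
    def test_should_have_size_two(self):
        self.assertEqual(len(self.stack), 2)

    def test_should_pop_last_item_first(self):
        self.assertEqual(self.stack.pop(), "b")

    def test_should_raise_when_popped_past_empty(self):
        self.stack.clear()
        # Acting inside the test (not in setUp) keeps assertRaises usable.
        with self.assertRaises(IndexError):
            self.stack.pop()
```

The last test also shows one way around the exception problem from the question: keep the arrangement in `setUp`, but perform the failing call inside the test itself.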
I'm not worried about the same code paths executed multiple times, when it takes a millisecond to run one test (if it takes more than a couple of seconds to run all unit tests, the tests are probably too big). From your explanation I'm more worried that the tests might be too tightly coupled to the implementation, instead of the feature, if it's systematic that there is one test for each mock. The name of the test would be a good indicator of how well structured or how fragile the tests are - does it describe a feature or how that feature is implemented.
About mocking, a good book to read is Growing Object-Oriented Software Guided by Tests. One should not mock 3rd party APIs (APIs which you don't own and can't modify), for the reason you already mentioned, but one should create an abstraction over it which better fits the needs of the system using it and works the way you want it. That abstraction needs to be integration tested with the 3rd party API, but in all tests using the abstraction you can mock it.
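A minimal sketch of that advice in Python, assuming a made-up currency-rate vendor: the application owns `ExchangeRates`, and only that abstraction is mocked in unit tests. All names here are hypothetical.

```python
from abc import ABC, abstractmethod
from unittest.mock import create_autospec


class ExchangeRates(ABC):
    """Abstraction the application owns; it hides the third-party client."""

    @abstractmethod
    def rate(self, source: str, target: str) -> float:
        """Return how many units of `target` one unit of `source` buys."""


class PriceConverter:
    """Code under test; it only knows the abstraction."""

    def __init__(self, rates: ExchangeRates):
        self._rates = rates

    def convert(self, amount: float, source: str, target: str) -> float:
        return round(amount * self._rates.rate(source, target), 2)


# Unit test: mock the owned abstraction, never the vendor's client.
rates = create_autospec(ExchangeRates, instance=True)
rates.rate.return_value = 1.25
converter = PriceConverter(rates)
assert converter.convert(10.0, "EUR", "USD") == 12.5
rates.rate.assert_called_once_with("EUR", "USD")
```

The real implementation of `ExchangeRates`, the one that calls the vendor, is what gets the integration test.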
First, the pattern that you are using is based on Cucumber - here's a link. The style is from the BDD (Behavior-driven development) approach. It has two advantages over traditional TDD:
Language - one of the tenets of BDD is that the language you use influences the thoughts you have. By forcing you to speak in the language of the end user, it leads you to write different tests than you would write from the perspective of a programmer.
Tests lock code - BDD locks the code at the appropriate level. One problem common in testing is that you write a large number of tests, which makes your codebase more brittle as when you change the code you must also change a large number of tests too. BDD forces you to lock the behavior of your code, rather than the implementation of your code. This way, when a test breaks, it is more likely to be meaningful.
It is worth noting that you do not have to use the Cucumber style of testing to achieve these effects, and using it does add an extra layer of overhead. But very few programmers have been successful in keeping the BDD mindset while using traditional xUnit tools (TDD).
It also sounds like you have some scenarios where you would like to say 'When I do , then verify '. Because the current BDD xUnit frameworks only allow you to verify primitives (strings, ints, doubles, booleans....), this usually results in a large number of individual tests (one for each Assert). It is possible to do more complicated verifications using a Golden Master paradigm test tool, such as ApprovalTests. Here's a video example of this.
Finally, here's a link to Dan North's blog - he started it all.

How do I write useful unit tests for a mostly service-oriented app?

I've used unit tests successfully for a while, but I'm beginning to think they're only useful for classes/methods that actually perform a fair amount of logic - parsers, doing math, complex business logic - all good candidates for testing, no question. I'm really struggling to figure out how to use testing for another class of objects: those which operate mostly via delegation.
Case in point: my current project coordinates a lot of databases and services. Most classes are just collections of service methods, and most methods perform some basic conditional logic, maybe a for-each loop, and then invoke other services.
With objects like this, mocks are really the only viable strategy for testing, so I've dutifully designed mocks for several of them. And I really, really don't like it, for the following reasons:
Using mocks to specify expectations for behavior makes things break whenever I change the class implementation, even if it's not the sort of change that ought to make a difference to a unit test. To my mind, unit tests ought to test functionality, not specify "the methods needs to do A, then B, then C, and nothing else, in that order." I like tests because I am free to change things with the confidence that I'll know if something breaks - but mocks just make it a pain in the ass to change anything.
Writing the mocks is often more work than writing the classes themselves, if the intended behavior is simple.
Because I'm using a completely different implementation of all the services and component objects in my test, in the end, all my tests really verify is the most basic skeleton of the behavior: that "if" and "for" statements still work. Boring. I'm not worried about those.
The core of my application is really how all the pieces work together, so I'm considering ditching unit tests altogether (except for places where they're clearly appropriate) and moving to external integration tests instead - harder to set up, covering fewer possible cases, but actually exercising the system as it is meant to be run.
I'm not seeing any cases where using mocks is actually useful.
Thoughts?
If you can write integration tests that are fast and reliable, then I would say go for it.
Use mocks and/or stubs only where necessary to keep your tests that way.
Notice, though, that using mocks is not necessarily as painful as you described:
Mocking APIs let you use loose/non-strict mocks, which will allow all invocations from the unit under test to its collaborators. Therefore, you don't need to record all invocations, but only those which need to produce some required result for the test, such as a specific return value from a method call.
With a good mocking API, you will have to write little test code to specify mocking. In some cases you may get away with a single field declaration, or a single annotation applied to the test class.
You can use partial mocking so that only the necessary methods of a service/component class are actually mocked for a given test. And this can be done without specifying said methods in strings.
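The points above can be sketched with Python's `unittest.mock`, which may not be the API this answer has in mind. `ReportService` and its collaborators are invented for the example, and note that `patch.object` does name the method as a string.

```python
from unittest.mock import Mock, patch


class ReportService:
    """Hypothetical service that delegates to two collaborators."""

    def __init__(self, repo, mailer):
        self.repo = repo
        self.mailer = mailer

    def load_total(self):
        return sum(self.repo.fetch_amounts())

    def send_report(self, address):
        self.mailer.send(address, f"Total: {self.load_total()}")
        return True


# Loose mocks: any call is accepted without being recorded up front;
# only the return value the test depends on is configured.
repo, mailer = Mock(), Mock()
repo.fetch_amounts.return_value = [1, 2, 3]
service = ReportService(repo, mailer)
assert service.send_report("a@example.com") is True

# Partial mock: replace one method on the real object, keep the rest real.
with patch.object(service, "load_total", return_value=99):
    service.send_report("b@example.com")
mailer.send.assert_called_with("b@example.com", "Total: 99")
```

Changing how `send_report` gets its data internally would not break the first check, since nothing about call order or count is asserted there.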
To my mind, unit tests ought to test functionality, not specify "the methods needs to do A, then B, then C, and nothing else, in that order."
I agree. Behavior testing with mocks can lead to brittle tests, as you've found. State-based testing with stubs reduces that issue. Fowler weighs in on this in Mocks Aren't Stubs.
Writing the mocks is often more work than writing the classes themselves
For mocks or stubs, consider using an isolation (mocking) framework.
in the end, all my tests really verify is the most basic skeleton of the behavior: that "if" and "for" statements still work
Branches and loops are logic; I would recommend testing them. There's no need to test getters and setters, one-line pure delegation methods, and so forth, in my opinion.
Integration tests can be extremely valuable for a composite system such as yours. I would recommend them in addition to unit tests, rather than instead of them.
You'll definitely want to test the classes underlying your low-level or composing services; that's where you'll see the biggest bang for the buck.
EDIT: Fowler doesn't use the "classical" term the way I think of it (which likely means I'm wrong). When I talk about state-based testing, I mean injecting stubs into the class under test for any dependencies, acting on the class under test, then asserting against the class under test. In the pure case I would not verify anything on the stubs.
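To make the state-based flavour concrete, here is a minimal sketch in Python with a hand-written stub. `Account` and `StubRateSource` are hypothetical names.

```python
class StubRateSource:
    """Hand-written stub: returns a canned value and records nothing."""

    def current_rate(self):
        return 0.5


class Account:
    """Hypothetical class under test."""

    def __init__(self, rate_source):
        self._rate_source = rate_source
        self.balance = 100.0

    def apply_interest(self):
        self.balance += self.balance * self._rate_source.current_rate()


# Arrange: inject the stub. Act: call the class under test.
account = Account(StubRateSource())
account.apply_interest()

# Assert on the resulting state of the class under test, not on the stub.
assert account.balance == 150.0
```

The test says nothing about how often or in what order the dependency was called, so refactoring `apply_interest` does not break it as long as the resulting balance is the same.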
Writing Integration Tests is a viable option here, but should not replace Unit Tests. But since you stated you're writing mocks yourself, I suggest using an Isolation Framework (aka Mocking Framework), which I am pretty sure will be available for your environment too.
Being that you've posted several questions in one I'll answer them one by one.
How do I write useful unit tests for a mostly service-oriented app?
Do not rely on unit tests for a "mostly service-oriented app"! Yes I said that in a sentence. These types of apps are meant to do one thing: integrate services. It's therefore more pressing that you write integration tests instead of unit tests to verify that the integration is working correctly.
I'm not seeing any cases where using mocks is actually useful.
Mocks can be extremely useful, but I wouldn't use them on controllers. Controllers should be covered by integration tests. Services can be covered by unit tests but it may be wise to have them as separate modules if the amount of testing slows down your project.
Thoughts?
For me, I tend to think about a few things:
What is my application doing?
How expensive would it be to perform system level / integration tests?
Can I split my application up into modules that can be tested separately?
In the scenario you've provided, I'd say your application is an integration of many services. Therefore, I'd lean heavily on integration tests over unit tests. I'd bet most of the Mocks you've written have been for http related classes etc.
I'm a bigger fan of integration / system level tests wherever possible for the following reasons:
In this day and age of "moving fast", re-factoring the designs of yesterday happens at an ever increasing rate. Integration tests aren't concerned about implementation details at all so this facilitates rapid change. Dynamic languages are in full swing making mocks even more dangerous / brittle. With a static lang, mocks are much safer because your tests won't compile if they're trying to stub out a non existent or misspelled method name.
The amount of code written in an integration test is usually 60% less than the amount of code written in a unit test to achieve the same level of coverage so development time is less. "Yes but it takes longer to run integration tests..." that's where you need to be pragmatic until it actually slows you down to run integration tests.
Integration tests catch more bugs. Mocking is often contrived and removes the developer from the realities of what their changes will do to the application as a whole. I've allowed way more bugs into production under the "safety net" of 100% unit test coverage than I would have with integration tests.
If integration testing is slow for my application then I haven't split it up into separate modules. This is often an indicator early on that I need to do some extracting into separation.
Integration tests do way more for you than reach code coverage, they're also an indicator of performance issues or network problems etc.
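On the point about mocks being more brittle in dynamic languages: in Python, the risk can be reduced somewhat by building mocks from a spec. A small sketch, with a hypothetical `PaymentClient`:

```python
from unittest.mock import Mock, create_autospec


class PaymentClient:
    """Hypothetical collaborator with a single real method."""

    def charge(self, amount):
        raise NotImplementedError


# A bare Mock accepts a misspelled method name without complaint.
loose = Mock()
loose.chrage(10)  # the typo goes unnoticed

# A mock built from the class rejects names the class does not define.
strict = create_autospec(PaymentClient, instance=True)
strict.charge(10)
try:
    strict.chrage(10)
    typo_detected = False
except AttributeError:
    typo_detected = True
assert typo_detected
```

This only checks names and call signatures, so it narrows the gap to a static language rather than closing it.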

When Testing your MVC-based UI, how much of the test setup do you make common?

I'm trying to test a simple WebForms (asp.net) based UI, and follow the MVP pattern to allow my UI to be more testable.
As I follow the TDD methodology for backend algorithms, I find that there are some unit test refactorings that happen in the spirit of the DRY principle (Don't Repeat Yourself). As I try to apply this to the UI using Rhino Mocks to verify my interactions, I see many commonalities in the Controller tests when setting up the view or model expectations.
My question is: how far do you typically take this refactoring, if at all? I'm curious to see how other TDDer's test their MVC/MVP based UIs.
I would not refactor tests like standard code. Tests start to become more obscure as you refactor things into common base classes, helper methods, etc. Tests should be sufficiently clear on their own.
DRY is not a test concern.
That said, there are many plumbing things that are commonly done, and those should be abstracted away.
I use MVP, and on my tests I try to apply most of the refactoring I would in standard code. It normally doesn't work quite as well on the tests, due to the slight variations needed to test different scenarios, but within parts there can be commonality, and when possible I do consolidate. This does ease the needed changes later as the project evolves; just like in your standard code it is easier to change one place instead of 20.
I prefer to treat unit tests as purely functional programs, so that I don't have to test them. If an operation is common enough across tests, then I would consider moving it into the standard codebase, but even then I'd avoid refactoring tests, because I tend to have lots of them, especially for GUI-driven BL.
I use Selenium for functional testing and I'm using JUnit to test my controllers.
I'll mock out services or resources used by the controller and test to see what URI the controller is redirecting to, etc...
The only thing I'm not really testing at this point are the views. But I have employed functional testing to compensate.
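A rough sketch of that kind of controller test, written in Python with `unittest.mock` rather than JUnit. `LoginController`, its `authenticate` collaborator, and the URIs are all invented for the example.

```python
from unittest.mock import Mock


class LoginController:
    """Hypothetical controller; returns the URI to redirect to."""

    def __init__(self, auth_service):
        self.auth_service = auth_service

    def submit(self, username, password):
        if self.auth_service.authenticate(username, password):
            return "/dashboard"
        return "/login?error=1"


# The service is mocked; only the controller's routing decision is checked.
auth = Mock()
controller = LoginController(auth)

auth.authenticate.return_value = True
assert controller.submit("ana", "secret") == "/dashboard"

auth.authenticate.return_value = False
assert controller.submit("ana", "wrong") == "/login?error=1"
```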