How to Unit Test Against Remote Resources

I've been trying to learn how to properly unittest and set up unit tests for all of my code on a new project. The project I'm currently doing this for requires me running a lot of actions against Google BigQuery (i.e. create tables, insert, query, delete). I'm feeling like I can't truly test all of this functionality by mocking BigQuery because the actions I do against it are complicated and interdependent, and if there's a break in the middle somewhere, I want to catch it. Is it generally frowned upon to have something like an environment variable that specifies a test account built into my unit tests so they actually run against the remote service? This feels like the best way to truly test everything and hit tests that I couldn't hit with a mock. So, is this something people do? Are there some major downsides to doing things this way?

I tend to have a mix of unit and integration tests in my project. I believe both are equally valuable, but one thing to keep in mind when doing integration testing is to ensure that the tests are stable and repeatable.
There are several approaches, but I favor the approach of making the tests self sufficient by ensuring that all data dependencies are built in the test itself. This is important since you avoid failing tests due to failed assumptions about existing data in your data source.
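As a rough sketch of what "data dependencies built in the test itself" can look like, here is a Python `unittest` example. It uses an in-memory SQLite database as a stand-in for the real data source, and the `customers` table and `count_customers_in_region` function are hypothetical names invented for illustration.

```python
import sqlite3
import unittest


def count_customers_in_region(conn, region_id):
    """Hypothetical code under test: count customers in one region."""
    row = conn.execute(
        "SELECT COUNT(*) FROM customers WHERE region_id = ?", (region_id,)
    ).fetchone()
    return row[0]


class CustomerCountTest(unittest.TestCase):
    def setUp(self):
        # Every test gets a fresh in-memory database, so nothing is assumed
        # about data left behind by another test or another person.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE customers (name TEXT, region_id INTEGER)")

    def tearDown(self):
        self.conn.close()

    def test_counts_only_matching_region(self):
        # The test inserts exactly the rows it depends on.
        self.conn.executemany(
            "INSERT INTO customers VALUES (?, ?)",
            [("alice", 1), ("bob", 1), ("carol", 2)],
        )
        self.assertEqual(count_customers_in_region(self.conn, 1), 2)

    def test_empty_table_counts_zero(self):
        self.assertEqual(count_customers_in_region(self.conn, 1), 0)
```

Saved in a test module, this can be run with `python -m unittest`. Against a remote service the same idea applies: create the tables and rows in `setUp` and remove them in `tearDown`.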
A variation on this is to have a scaffolding script populate your data source with fixed test data. I find this to be less manageable since it can introduce dependencies between tests and changing the test data for one test may cause failure in another.

What you're looking to do is technically called integration tests but I do see your point. I myself am doing both as well currently. My interaction in my integration tests is with a database. I find that these integration tests often catch way more errors than true unit tests and are generally more beneficial. I will say however that unit tests are important as well.
I have found that integration tests can tend to take a much longer time since it's doing all this interaction and if this is a part of your nightly build process for example this can greatly increase the amount of time it takes for a build to complete. Some of our builds take close to an hour at this point to complete which is sometimes a problem for us.
I will say when you introduce things like environment variables into the mix you have to start making sure that every developer on the team has this environment variable if they want to run the tests. As a general rule of thumb I try to make it as simple as possible for everyone to build and run tests directly out of source control. There is nothing more frustrating than not being able to build source code or execute unit tests directly out of source control.
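One way to soften that problem, sketched here in Python, is to skip the remote tests when the variable is missing instead of failing. The variable name `BQ_TEST_PROJECT` is an assumption made up for this example, and the actual calls to the remote service are left out.

```python
import os
import unittest

# Hypothetical variable name; use whatever your team agrees on.
TEST_PROJECT_VAR = "BQ_TEST_PROJECT"


def remote_tests_enabled(environ=os.environ):
    """Return True only when a test account has been configured."""
    return bool(environ.get(TEST_PROJECT_VAR))


@unittest.skipUnless(remote_tests_enabled(), f"{TEST_PROJECT_VAR} is not set")
class RemoteServiceTest(unittest.TestCase):
    def test_round_trip(self):
        project = os.environ[TEST_PROJECT_VAR]
        # Calls against the real service would go here; omitted in this sketch.
        self.assertTrue(project)
```

A developer without the variable still gets a green run straight out of source control, with the remote tests reported as skipped.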

It's helpful to think of things like BigQuery as just implementation details; means to an end.
Something in your application currently says "I need x - I'll use BigQuery to get it." Instead of having explicit knowledge of BigQuery, this thing could instead have knowledge of "some entity capable of getting x". This is the location of a seam, and is where mocking would take place.
You mentioned that you don't want to mock all of the objects involved in creating a BigQuery request. You are absolutely right in avoiding this. That doesn't mean that you can't mock out BigQuery, though; you just need to move up a rung.
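A minimal sketch of "moving up a rung", written in Python. All names here (`RowSource`, `DailyReport`, `FakeRowSource`, `fetch_totals`) are hypothetical; the real seam depends on what "x" is in your application.

```python
from typing import Protocol


class RowSource(Protocol):
    """The seam: 'some entity capable of getting x'."""

    def fetch_totals(self, day: str) -> list[int]: ...


class DailyReport:
    """Application code that knows about RowSource, not about BigQuery."""

    def __init__(self, source: RowSource) -> None:
        self._source = source

    def total_for(self, day: str) -> int:
        return sum(self._source.fetch_totals(day))


class FakeRowSource:
    """Test double standing in for a hypothetical BigQuery-backed source."""

    def __init__(self, rows: dict[str, list[int]]) -> None:
        self._rows = rows

    def fetch_totals(self, day: str) -> list[int]:
        return self._rows.get(day, [])


report = DailyReport(FakeRowSource({"2024-01-01": [3, 4, 5]}))
print(report.total_for("2024-01-01"))  # 12
print(report.total_for("2024-01-02"))  # 0
```

The production implementation of `RowSource` would be the only class that talks to BigQuery, and it is the natural target for a small number of integration tests.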

Related

is it ok to write test cases for save/update/persist methods - whether it be mock or by calling real methods [duplicate]

I know what the advantages are and I use fake data when I am working with more complex systems.
What if I am developing something simple, I can easily set up my environment in a real database, the data being accessed is so small that the access time is not a factor, and I am only running a few tests?
Is it still important to create fake data or can I forget the extra coding and skip right to the real thing?
When I said real database I do not mean a production database, I mean a test database, but using a real live DBMS and the same schema as the real database.

The reasons to use fake data instead of a real DB are:
Speed. If your tests are slow you aren't going to run them. Mocking the DB can make your tests run much faster than they otherwise might.
Control. Your tests need to be the sole source of your test data. When you use fake data, your tests choose which fakes you will be using. So there is no chance that your tests are spoiled because someone left the DB in an unfamiliar state.
Order Independence. We want our tests to be runnable in any order at all. The input of one test should not depend on the output of another. When your tests control the test data, the tests can be independent of each other.
Environment Independence. Your tests should be runnable in any environment. You should be able to run them while on the train, or in a plane, or at home, or at work. They should not depend on external services. When you use fake data, you don't need an external DB.
Now, if you are building a small little application, and by using a real DB (like MySQL) you can achieve the above goals, then by all means use the DB. I do. But make no mistake, as your application grows you will eventually be faced with the need to mock out the DB. That's OK, do it when you need to. YAGNI. Just make sure you DO do it WHEN you need to. If you let it go, you'll pay.

It sort of depends what you want to test. Often you want to test the actual logic in your code not the data in the database, so setting up a complete database just to run your tests is a waste of time.
Also consider the amount of work that goes into maintaining your tests and test database. Testing your code with a database often means you are testing your application as a whole instead of the different parts in isolation. This often results in a lot of work keeping both the database and tests in sync.
And the last problem is that the test should run in isolation so each test should either run on its own version of the database or leave it in exactly the same state as it was before the test ran. This includes the state after a failed test.
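One common way to leave the database in the same state, even after a failed test, is to roll back in the teardown step. The sketch below shows the idea in Python with an in-memory SQLite database standing in for a shared test database; the `orders` table is a made-up example, and it relies on the `sqlite3` module's default behavior of opening a transaction before an INSERT.

```python
import sqlite3
import unittest


class RollbackPerTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # One database shared by every test in the class.
        cls.conn = sqlite3.connect(":memory:")
        cls.conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
        cls.conn.commit()

    @classmethod
    def tearDownClass(cls):
        cls.conn.close()

    def tearDown(self):
        # Runs after passing and failing tests alike, discarding any
        # uncommitted changes the test made.
        self.conn.rollback()

    def _count(self):
        return self.conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

    def test_insert_is_visible_inside_the_test(self):
        self.conn.execute("INSERT INTO orders (item) VALUES ('widget')")
        self.assertEqual(self._count(), 1)

    def test_table_starts_empty(self):
        # Passes whether it runs before or after the insert test.
        self.assertEqual(self._count(), 0)
```

This only works if the code under test does not commit on its own; if it does, each test needs its own database or an explicit cleanup step.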
Having said that, if you really want to test on your database you can. There are tools that help setting up and tearing down a database, like dbunit.
I've seen people try to create unit tests like this, but almost always it turns out to be much more work than it is actually worth. Most abandoned it halfway through the project, and most abandoned TDD completely, thinking the experience transfers to unit testing in general.
So I would recommend keeping tests simple and isolated, and encapsulating your code well enough that it becomes possible to test it in isolation.

As long as the real DB does not get in your way, and you can go faster that way, I would be pragmatic and go for it.
In unit-test, the "test" is more important than the "unit".

I think it depends on whether your queries are fixed inside the repository (the better option, IMO), or whether the repository exposes composable queries; for example - if you have a repository method:
IQueryable<Customer> GetCustomers() {...}
Then your UI could request:
var foo = GetCustomers().Where(x=>SomeUnmappedFunction(x));
bool SomeUnmappedFunction(Customer customer) {
return customer.RegionId == 12345 && customer.Name.StartsWith("foo");
}
This will pass for an object-based fake repo, but will fail for actual db implementations. Of course, you can nullify this by having the repository handle all queries internally (no external composition); for example:
Customer[] GetCustomers(int? regionId, string nameStartsWith, ...) {...}
Because this can't be composed, you can check the DB and the UI independently. With composable queries, you are forced to use integration tests throughout if you want it to be useful.

It rather depends on whether the DB is automatically set up by the test, also whether the database is isolated from other developers.
At the moment it may not be a problem (e.g. only one developer). However (for manual database setup) setting up the database is an extra impediment for running tests, and this is a very bad thing.

If you're just writing a simple one-off application that you absolutely know will not grow, I think a lot of "best practices" just go right out the window.
You don't need to use DI/IOC or have unit tests or mock out your db access if all you're writing is a simple "Contact Us" form. However, where to draw the line between a "simple" app and a "complex" one is difficult.
In other words, use your best judgment as there is no hard-and-set answer to this.

It is OK to do that for this scenario, as long as you don't see them as "unit" tests. Those would be integration tests. You also want to consider whether you will be manually testing through the UI again and again, as you might just automate your smoke tests instead. Given that, you might even consider not doing the integration tests at all, and just work at the functional/UI test level (as those tests will already be covering the integration).
As others have pointed out, it is hard to draw the line between complex and non-complex, and you would usually know only when it is too late :(. If you are already used to doing them, I am sure you won't get much overhead. If that were not the case, you could learn from it :)

Assuming that you want to automate this, the most important thing is that you can programmatically generate your initial condition. It sounds like that's the case, and even better you're testing real world data.
However, there are a few drawbacks:
Your real database might not cover certain conditions in your code. If you have fake data, you can cause that behavior to happen.
And as you point out, you have a simple application; when it becomes less simple, you'll want to have tests that you can categorize as unit tests and system tests. The unit tests should target a simple piece of functionality, which will be much easier to do with fake data.

One advantage of fake repositories is that your regression / unit testing is consistent since you can expect the same results for the same queries. This makes it easier to build certain unit tests.
There are several disadvantages to running against a real database if your code modifies data (that is, if it is not read-only):
- If you have an error in your code (which is probably why you're testing), you could end up breaking the production database. Even if you didn't break it.
- if the production database changes over time and especially while your code is executing, you may lose track of the test materials that you added and have a hard time later cleaning it out of the database.
- Production queries from other systems accessing the database may treat your test data as real data and this can corrupt results of important business processes somewhere down the road. For example, even if you marked your data with a certain flag or prefix, can you assure that anyone accessing the database will adhere to this schema?
Also, some databases are regulated by privacy laws, so depending on your contract and who owns the main DB, you may or may not be legally allowed to access real data.
If you need to run on a production database, I would recommend running on a copy, which you can easily create during off-peak hours.

If it's a really simple application, and you can't see it growing, I see no problem running your tests on a real DB. If, however, you think this application will grow, it's important that you account for that in your tests.
Keep everything as simple as you can, and if you require more flexible testing later on, make it so. Plan ahead though, because you don't want to have a huge application in 3 years that relies on old and hacky (for a large application) tests.

The downsides to running tests against your database are the lack of speed and the complexity of setting up your database state before running tests.
If you have control over this there is no problem in running the tests directly against the database; it's actually a good approach because it simulates your final product better than running against fake data. The key is to have a pragmatic approach and see best practice as guidelines and not rules.

Best practice approach for automated testing

This is a very strange request for advice for which I truly feel there is no real answer. In my project I have archiving routines on various objects that have been consumed for logical calculations. I archive these items for the sake of an audit trail and to check up on calculation errors or prove correctness at a later stage. I am working with Entity Framework, so things may be slightly different from your own project.
I consume the original object, modify it directly, create a clone of the modified item, revert the original item from store and save changes accordingly. An object is not reverted to original if never consumed by a calculation, in these instances, I save directly over that object along with the various relationships that exist with further objects.
This may sound long winded, but I assure you - it seems the easiest so far in terms of my workings with EF in my situation.
My trouble with these archiving routines is, that over time as I introduce further functionality - I sometimes, without knowing, break critical code to a point where I have to regression test the entire solution over, from beginning to end, to ensure that the archiving requirements remain intact.
Is there any unit test approach or automated methodology for testing these sorts of requirements? It would speed up deployment of packages by cutting down on my own manual testing.
Any advice or links to similar situations appreciated.

I think there are two pieces to this problem you are describing:
First you need some unit tests that you can build which will represent technical requirements of the system. Think of the unit tests as the rules which you have set up to technically accomplish the goal that the end user desires. In this way, I would craft unit tests that you can feel confident will break if a technical assumption you had made about the system fails because of a code change. Remember to keep the unit tests at the unit level so that you don't have a large amount of dependencies interacting to fail a test. A unit test should test exactly one thing. If you do this, when you make code changes you can run all your unit tests and immediately know what assumptions you had made about the system which are now not being met.
I would also set up some sort of integration functional tests which are automated. I think in your problem domain it would make sense to set up integrated tests which are similar to unit tests (you can use the same tool.) Here you will want to take bigger pieces of functionality, perhaps pipes which data flows through the system and test that the correct series of transformations occur on the data.

One best practice is to make sure the tests can be run in any order. You could separate the produce routines from the archive routines, perhaps by using "gold" data on the archive routines.

The number one best practice for unit tests is just do it! Beyond that, I'd like to recommend xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros.

What are the pros and cons of automated Unit Tests vs automated Integration tests?

Recently we have been adding automated tests to our existing java applications.
What we have
The majority of these tests are integration tests, which may cover a stack of calls like:-
HTTP post into a servlet
The servlet validates the request and calls the business layer
The business layer does a bunch of stuff via hibernate etc and updates some database tables
The servlet generates some XML, runs this through XSLT to produce response HTML.
We then verify that the servlet responded with the correct XML and that the correct rows exist in the database (our development Oracle instance). These rows are then deleted.
We also have a few smaller unit tests which check single method calls.
These tests are all run as part of our nightly (or adhoc) builds.
The Question
This seems good because we are checking the boundaries of our system: servlet request/response on one end and database on the other. If these work, then we are free to refactor or mess with anything inbetween and have some confidence that the servlet under test continues to work.
What problems are we likely to run into with this approach?
I can't see how adding a bunch more unit tests on individual classes would help. Wouldn't that make it harder to refactor as it's much more likely we will need to throw away and re-write tests?

Unit tests localize failures more tightly. Integration-level tests more closely correspond to user requirements and so are a better predictor of delivery success. Neither of them is much good unless built and maintained, but both of them are very valuable if properly used.
(more...)
The thing with unit tests is that no integration-level test can exercise all the code as much as a good set of unit tests can. Yes, that can mean that you have to refactor the tests somewhat, but in general your tests shouldn't depend on the internals so much. So, let's say for example that you have a single function to get a power of two. You describe it (as a formal methods guy, I'd claim you specify it)
long pow2(int p); // returns 2^p for 0 <= p <= 30
Your test and your spec look essentially the same (this is sort of pseudo-xUnit for illustration):
assertEqual(1073741824, pow2(30));
assertEqual(1, pow2(0));
assertException(domainError, pow2(-1));
assertException(domainError, pow2(31));
Now your implementation can be a for loop with a multiply, and you can come along later and change that to a shift.
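To make the point about swapping implementations concrete, here is a small Python sketch. It is not the original author's code: `ValueError` stands in for the `domainError` of the pseudo-xUnit above, and `check_spec` is a hypothetical helper that encodes the four assertions.

```python
def pow2_loop(p: int) -> int:
    """First implementation: a for loop with a multiply."""
    if not 0 <= p <= 30:
        raise ValueError("p must be in 0..30")
    result = 1
    for _ in range(p):
        result *= 2
    return result


def pow2_shift(p: int) -> int:
    """Later implementation: a single shift."""
    if not 0 <= p <= 30:
        raise ValueError("p must be in 0..30")
    return 1 << p


def check_spec(pow2) -> None:
    """The spec as a test; it knows nothing about how pow2 is implemented."""
    assert pow2(30) == 1073741824
    assert pow2(0) == 1
    for bad in (-1, 31):
        try:
            pow2(bad)
        except ValueError:
            pass
        else:
            raise AssertionError(f"pow2({bad}) should have raised")


for implementation in (pow2_loop, pow2_shift):
    check_spec(implementation)
print("both implementations satisfy the spec")
```

The same checks pass for both versions, which is what lets you change the loop to a shift without touching the test.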
If you change the implementation so that, say, it's returning 16 bits (remember that sizeof(long) is only guaranteed to be no less than sizeof(short)) then this test will fail quickly. An integration-level test should probably fail, but not certainly, and it's just as likely as not to fail somewhere far downstream of the computation of pow2(28).
The point is that they really test for different situations. If you could build sufficiently detailed and extensive integration tests, you might be able to get the same level of coverage and degree of fine-grained testing, but it's probably hard to do at best, and the exponential state-space explosion will defeat you. By partitioning the state space using unit tests, the number of tests you need grows much less than exponentially.

You are asking pros and cons of two different things (what are the pros and cons of riding a horse vs riding a motorcycle?)
Of course both are "automated tests" (~riding) but that doesn't mean that they are alternative (you don't ride a horse for hundreds of miles, and you don't ride a motorcycle in closed-to-vehicle muddy places)
Unit Tests test the smallest unit of the code, usually a method. Each unit test is closely tied to the method it is testing, and if it's well written it's tied (almost) only with that.
They are great to guide the design of new code and the refactoring of existing code. They are great to spot problems long before the system is ready for integration tests. Note that I wrote guide and all the Test Driven Development is about this word.
It does not make any sense to have manual Unit Tests.
What about refactoring, which seems to be your main concern? If you are refactoring just the implementation (content) of a method, but not its existence or "external behavior", the Unit Test is still valid and incredibly useful (you cannot imagine how useful until you try).
If you are refactoring more aggressively, changing methods existence or behavior, then yes, you need to write a new Unit Test for each new method, and possibly throw away the old one. But writing the Unit Test, especially if you write it before the code itself, will help to clarify the design (i.e. what the method should do, and what it shouldn't) without being confused by the implementation details (i.e. how the method should do the thing that it needs to do).
Automated Integration Tests test the biggest unit of the code, usually the entire application.
They are great to test use cases which you don't want to test by hand. But you can also have manual Integration Tests, and they are as effective (only less convenient).
Starting a new project today, it does not make any sense not to have Unit Tests, but I'd say that for an existing project like yours it does not make too much sense to write them for everything you already have and it's working.
In your case, I'd rather use a "middle ground" approach writing:
smaller Integration Tests which only test the sections you are going to refactor. If you are refactoring the whole thing, then you can use your current Integration Tests, but if you are refactoring only -say- the XML generation, it does not make any sense to require the presence of the database, so I'd write a simple and small XML Integration Test.
a bunch of Unit Tests for the new code you are going to write. As I already wrote above, Unit Tests will be ready as soon as you "mess with anything in between", making sure that your "mess" is going somewhere.
In fact your Integration Test will only make sure that your "mess" is not working (because at the beginning it will not work, right?) but it will not give you any clue on
why it is not working
if your debugging of the "mess" is really fixing something
if your debugging of the "mess" is breaking something else
Integration Tests will only give the confirmation at the end if the whole change was successful (and the answer will be "no" for a long time). The Integration Tests will not give you any help during the refactoring itself, which will make it harder and possibly frustrating. You need Unit Tests for that.

I agree with Charlie about Integration-level tests corresponding more to user actions and the correctness of the system as a whole. I do think there is a lot more value to Unit Tests than just localizing failures more tightly though. Unit tests provide two main values over integration tests:
1) Writing unit tests is as much an act of design as testing. If you practice Test Driven Development/Behavior Driven Development the act of writing the unit tests helps you design exactly what your code should do. It helps you write higher quality code (since being loosely coupled helps with testing) and it helps you write just enough code to make your tests pass (since your tests are in effect your specification).
2) The second value of unit tests is that if they are properly written they are very very fast. If I make a change to a class in your project can I run all the corresponding tests to see if I broke anything? How do I know which tests to run? And how long will they take? I can guarantee it will be longer than well written unit tests. You should be able to run all of your unit tests in a couple of minutes at the most.

Just a few examples from personal experience:
Unit Tests:
(+) Keeps testing close to the relevant code
(+) Relatively easy to test all code paths
(+) Easy to see if someone inadvertently changes the behavior of a method
(-) Much harder to write for UI components than for non-GUI
Integration Tests:
(+) It's nice to have nuts and bolts in a project, but integration testing makes sure they fit each other
(-) Harder to localize source of errors
(-) Harder to test all (or even all critical) code paths
Ideally both are necessary.
Examples:
Unit test: Make sure that input index >= 0 and < length of array. What happens when outside bounds? Should method throw exception or return null?
Integration test: What does the user see when a negative inventory value is input?
The second affects both the UI and the back end. Both sides could work perfectly, and you could still get the wrong answer, because the error condition between the two isn't well-defined.
The best part about Unit testing we've found is that it makes devs go from code->test->think to think->test->code. If a dev has to write the test first, [s]he tends to think more about what could go wrong up front.
To answer your last question, since unit tests live so close to the code and force the dev to think more up front, in practice we've found that we don't tend to refactor the code as much, so less code gets moved around - so tossing and writing new tests constantly doesn't appear to be an issue.

The question has a philosophical part for sure, but also points to pragmatic considerations.
Test driven design used as the means to become a better developer has its merits, but it is not required for that. Many a good programmer exists who never wrote a unit test. The best reason for unit tests is the power they give you when refactoring, especially when many people are changing the source at the same time. Spotting bugs on checkin is also a huge time-saver for a project (consider moving to a CI model and build on checkin instead of nightly). So if you write a unit test, either before or after you have written the code it tests, you are sure at that moment about the new code you've written. It is what can happen to that code later that the unit test ensures against - and that can be significant. Unit tests can stop bugs before they get to QA, thereby speeding up your projects.
Integration tests stress the interfaces between elements in your stack, if done correctly. In my experience, integration is the most unpredictable part of a project. Getting individual pieces to work tends not to be that hard, but putting everything together can be very difficult because of the types of bugs that can emerge at this step. In many cases, projects are late because of what happens in integration. Some of the errors encountered in this step are found in interfaces that have been broken by some change made on one side that was not communicated to the other side. Another source of integration errors are in configurations discovered in dev but forgotten by the time the app goes to QA. Integration tests can help reduce both types dramatically.
The importance of each test type can be debated, but what will be of most importance to you is the application of either type to your particular situation. Is the app in question being developed by a small group of people or many different groups? Do you have one repository for everything, or many repos each for a particular component of the app? If you have the latter, then you will have challenges with intercompatibility of different versions of different components.
Each test type is designed to expose the problems of different levels of integration in the development phase to save time. Unit tests drive the integration of the output of many developers operating on one repository. Integration tests (poorly named) drive the integration of components in the stack - components often written by separate teams. The class of problems exposed by integration tests is typically more time-consuming to fix.
So pragmatically, it really boils down to where you most need speed in your own org/process.

The thing that distinguishes Unit tests and Integration tests is the number of parts required for the test to run.
Unit tests (theoretically) require very few (or no) other parts to run.
Integration tests (theoretically) require lots (or all) other parts to run.
Integration tests test behaviour AND the infrastructure. Unit tests generally only test behaviour.
So, unit tests are good for testing some stuff, integration tests for other stuff.
So, why unit test?
For instance, it is very hard to test boundary conditions when integration testing. Example: a back end function expects a positive integer or 0, the front end does not allow entry of a negative integer, how do you ensure that the back end function behaves correctly when you pass a negative integer to it? Maybe the correct behaviour is to throw an exception. This is very hard to do with an integration test.
So, for this, you need a unit test (of the function).
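For illustration, a unit test for that boundary might look like the following Python sketch. The function `reserve_stock` and its choice of `ValueError` are assumptions invented for this example; the point is only that the test can pass a negative value directly, which the front end would never allow.

```python
import unittest


def reserve_stock(quantity: int) -> int:
    """Hypothetical back-end function: accepts 0 or a positive integer."""
    if quantity < 0:
        raise ValueError("quantity must be zero or positive")
    return quantity


class ReserveStockTest(unittest.TestCase):
    def test_zero_is_accepted(self):
        self.assertEqual(reserve_stock(0), 0)

    def test_positive_is_accepted(self):
        self.assertEqual(reserve_stock(7), 7)

    def test_negative_raises(self):
        # Bypasses the front end entirely and hits the boundary directly.
        with self.assertRaises(ValueError):
            reserve_stock(-1)
```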
Also, unit tests help eliminate problems found during integration tests. In your example above, there are a lot of points of failure for a single HTTP call:
the call from the HTTP client
the servlet validation
the call from the servlet to the business layer
the business layer validation
the database read (hibernate)
the data transformation by the business layer
the database write (hibernate)
the data transformation -> XML
the XSLT transformation -> HTML
the transmission of the HTML -> client
For your integration tests to work, you need ALL of these processes to work correctly. For a Unit test of the servlet validation, you need only one: the servlet validation (which can be independent of everything else). A problem in one layer becomes easier to track down.
You need both Unit tests AND integration tests.

Unit tests execute methods in a class to verify proper input/output without testing the class in the larger context of your application. You might use mocks to simulate dependent classes -- you're doing black box testing of the class as a stand alone entity. Unit tests should be runnable from a developer workstation without any external service or software requirements.
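As a small sketch of using a mock to simulate a dependent class, here is a Python example with the standard library's `unittest.mock`. The `Notifier` class and its `mailer` collaborator are hypothetical names; no real mail is sent and no external service is needed.

```python
from unittest.mock import Mock


class Notifier:
    """Hypothetical class under test; depends on a mailer collaborator."""

    def __init__(self, mailer) -> None:
        self._mailer = mailer

    def welcome(self, address: str) -> bool:
        if "@" not in address:
            return False
        self._mailer.send(address, "Welcome!")
        return True


# The mock replaces the real mailer, so the test runs on any workstation.
mailer = Mock()
notifier = Notifier(mailer)

assert notifier.welcome("a@example.com") is True
mailer.send.assert_called_once_with("a@example.com", "Welcome!")

# An invalid address must not reach the mailer at all.
mailer.reset_mock()
assert notifier.welcome("not-an-address") is False
mailer.send.assert_not_called()
```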
Integration tests will include other components of your application and third party software (your Oracle dev database, for example, or Selenium tests for a webapp). These tests might still be very fast and run as part of a continuous build, but because they inject additional dependencies they also risk injecting new bugs that cause problems for your code but are not caused by your code. Preferably, integration tests are also where you inject real/recorded data and assert that the application stack as a whole is behaving as expected given those inputs.
The question comes down to what kind of bugs you're looking to find and how quickly you hope to find them. Unit tests help to reduce the number of "simple" mistakes while integration tests help you ferret out architectural and integration issues, hopefully simulating the effects of Murphy's Law on your application as a whole.

Joel Spolsky has written a very interesting article about unit testing (it was a dialog between Joel and another person).
The main idea was that unit tests are a very good thing, but only if you use them in "limited" quantity. Joel doesn't recommend trying to reach a state where 100% of your code is covered by test cases.
The problem with unit tests is that when you want to change the architecture of your application, you'll have to change all the corresponding unit tests. That takes a lot of time (maybe even more than the refactoring itself). And after all that work, only a few tests will fail.
So, write tests only for code that can really cause trouble.
How I use unit tests: I don't like TDD, so I first write the code, then I test it (using the console or browser) just to be sure that it does the necessary work. Only after that do I add "tricky" tests; 50% of them fail on the first run.
It works and it doesn't take much time.

We have 4 different types of tests in our project:
Unit tests with mocking where necessary
DB tests that act similar to unit tests but touch db & clean up afterwards
Our logic is exposed through REST, so we have tests that do HTTP
Webapp tests using WatiN that actually use IE instance and go over major functionality
I like unit tests. They run really fast (100-1000x faster than #4 tests). They are type safe, so refactoring is quite easy (with good IDE).
Main problem is how much work is required to do them properly. You have to mock everything: Db access, network access, other components. You have to decorate unmockable classes, getting a zillion mostly useless classes. You have to use DI so that your components are not tightly coupled and therefore not testable (note that using DI is not actually a downside :)
I like tests #2. They do use the database and will report database errors, constraint violations and invalid columns. I think we get valuable testing using this.
#3 and especially #4 are more problematic. They require some subset of the production environment on the build server. You have to build, deploy and have the app running. You have to have a clean DB every time. But in the end, it pays off. WatiN tests require constant work, but you also get constant testing. We run tests on every commit and it is very easy to see when we break something.
So, back to your question. Unit tests are fast (which is very important, build time should be less than, say, 10 minutes) and they are easy to refactor. Much easier than rewriting the whole WatiN thing if your design changes. If you use a nice editor with a good find usages command (e.g. IDEA or VS.NET + Resharper), you can always find where your code is being tested.
With REST/HTTP tests, you get a good validation that your system actually works. But the tests are slow to run, so it is hard to have a complete validation at this level. I assume your methods accept multiple parameters or possibly XML input. To check each node in the XML or each parameter, it would take tens or hundreds of calls. You can do that with unit tests, but you cannot do that with REST calls, when each can take a big fraction of a second.
Our unit tests check special boundary conditions far more often than #3 tests. They (#3) check that main functionality is working and that's it. This seems to work pretty well for us.

As many have mentioned, integration tests will tell you whether your system works, and unit tests will tell you where it doesn't. Strictly from a testing perspective, these two kinds of tests complement each other.
"I can't see how adding a bunch more unit tests on individual classes would help. Wouldn't that make it harder to refactor as it's much more likely we will need to throw away and re-write tests?"
No. It will make refactoring easier and better, and make it clearer to see what refactorings are appropriate and relevant. This is why we say that TDD is about design, not about testing. It's quite common for me to write a test for one method and in figuring out how to express what that method's result should be to come up with a very simple implementation in terms of some other method of the class under test. That implementation frequently finds its way into the class under test. Simpler, more solid implementations, cleaner boundaries, smaller methods: TDD - unit tests, specifically - lead you in this direction, and integration tests do not. They're both important, both useful, but they serve different purposes.
Yes, you may find yourself modifying and deleting unit tests on occasion to accommodate refactorings; that's fine, but it's not hard. And having those unit tests - and going through the experience of writing them - gives you better insight into your code, and better design.

Although the setup you described sounds good, unit testing also offers something important. Unit testing offers fine levels of granularity. With loose coupling and dependency injection, you can pretty much test every important case. You can be sure that the units are robust; you can scrutinise individual methods with scores of inputs or interesting things that don't necessarily occur during your integration tests.
E.g. if you want to deterministically see how a class will handle some sort of failure that would require a tricky setup (e.g. network exception when retrieving something from a server) you can easily write your own test double network connection class, inject it and tell it to throw an exception whenever you feel like it. You can then make sure that the class under test gracefully handles the exception and carries on in a valid state.
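The injected-failure idea above can be sketched in a few lines. This is only an illustration under assumptions: `ProfileLoader`, `FailingConnection`, the `fetch` method and the URL are all hypothetical names, not part of any real library or of the original poster's code.

```python
class FailingConnection:
    """Test double: stands in for a real network connection and always fails."""

    def fetch(self, url):
        raise ConnectionError("simulated network failure")


class ProfileLoader:
    """Hypothetical class under test; the connection is injected."""

    def __init__(self, connection):
        self._connection = connection
        self.last_error = None

    def load(self, url):
        try:
            return self._connection.fetch(url)
        except ConnectionError as exc:
            # Record the failure and fall back to an empty result,
            # leaving the loader usable for the next call.
            self.last_error = str(exc)
            return {}


def test_loader_survives_network_failure():
    loader = ProfileLoader(FailingConnection())
    assert loader.load("http://example.invalid/profile") == {}
    assert loader.last_error == "simulated network failure"
```

Because the failure is triggered by the double rather than by a real outage, the test is deterministic and needs no network at all.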

You might be interested in this question and the related answers too. There you can find my addition to the answers that were already given here.

Should I mix my UnitTests and my Integration tests in the same project?

I am using NUnit to test my C# code and have so far been keeping unit tests (fast running ones) and integration tests (longer running) separate, and in separate project files. I use NUnit for doing both the unit tests and the integration tests. I just noticed the category attribute that NUnit provides, so that tests can be categorized. This begs the question, should I mix them together and simply use the category attribute to distinguish between them?

if it is not too difficult to separate them, do so now
unit tests should be run early and often (e.g. every time you change something, before check-in, after check-in), and should complete in a short time-span.
integration tests should be run periodically (daily, for example) but may take significant time and resources to complete
therefore it is best to keep them separate

Separate them if possible, because integration tests normally take much longer than unit tests.
Maybe your project grows and you end up with a great many tests, all of which take a short amount of time - except the integration tests - and you want to run your unit tests as often as possible...

I find that using separate projects for unit test and integration tests tends to create a little too many top level artifacts in the projects. Even though we're TDD and all, I still think the code being developed should be deserving at least half of the top-level of my project structure.

I don't think it really matters that much, but separating them sounds like a better idea, since isolation and automation will be so much easier. And the category feature is nice, but not that good from a usability point of view.

The original motivation behind [Category] was to solve the problem you mention. It was also intended to create broader test suites but that is kind of what you are doing.
Do be careful with [Category]. Not all test runners support it the same way the NUnit GUI does (or did, I haven't upgraded in a while). In the past some runners would ignore the attribute if it was on the class itself, or just ignore it altogether. Most seem to work now.

I would keep with whatever method you're currently using. It's more of an opinion thing, and you wouldn't want to have to re-tool your whole testing method.

What not to test when it comes to Unit Testing?

In which parts of a project is writing unit tests nearly or really impossible? Data access? FTP?
If there is an answer to this question then 100% coverage is a myth, isn't it?

Here I found (via haacked) something Michael Feathers says that can be an answer:
He says,
A test is not a unit test if:
It talks to the database
It communicates across the network
It touches the file system
It can't run at the same time as any of your other unit tests
You have to do special things to your environment (such as editing config files) to run it.
Again in same article he adds:
Generally, unit tests are supposed to be small, they test a method or the interaction of a couple of methods. When you pull the database, sockets, or file system access into your unit tests, they are not really about those methods any more; they are about the integration of your code with that other software.

That 100% coverage is a myth, which it is, does not mean that 80% coverage is useless. The goal, of course, is 100%, and between unit tests and then integration tests, you can approach it. What is impossible in unit testing is predicting all the totally strange things your customers will do to the product. Once you begin to discover these mind-boggling perversions of your code, make sure to roll tests for them back into the test suite.

Achieving 100% code coverage is almost always wasteful. There are many resources on this.
Nothing is impossible to unit test but there are always diminishing returns. It may not be worth it to unit test things that are painful to unit test.

The goal is not 100% code coverage, nor is it 80% code coverage. A unit test being easy to write doesn't mean you should write it, and a unit test being hard to write doesn't mean you should avoid the effort.
The goal of any test is to detect user-visible problems in the most affordable manner.
Is the total cost of authoring, maintaining, and diagnosing problems flagged by the test (including false positives) worth the problems that specific test catches?
If the problem the test catches is 'expensive' then you can afford to put effort into figuring out how to test it, and maintaining that test. If the problem the test catches is trivial then writing (and maintaining!) the test (even in the presence of code changes) better be trivial.
The core goal of a unit test is to protect devs from implementation errors. That alone should indicate that too much effort will be a waste. After a certain point there are better strategies for getting correct implementation. Also after a certain point the user visible problems are due to correctly implementing the wrong thing which can only be caught by user level or integration testing.

What would you not test? Anything that could not possibly break.
When it comes to code coverage you want to aim for 100% of the code you actually write - that is, you need not test third-party library code or operating system code, since that code will have been delivered to you tested. Unless it's not. In which case you might want to test it. Or if there are known bugs, in which case you might want to test for the presence of the bugs, so that you get a notification when they are fixed.

Unit testing of a GUI is also difficult, albeit not impossible, I guess.

Data access is possible because you can set up a test database.
Generally the 'untestable' stuff is FTP, email and so forth. However, they are generally framework classes which you can rely on and therefore do not need to test if you hide them behind an abstraction.
Also, 100% code coverage is not enough on its own.
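A test database does not have to be heavyweight. As a rough sketch, assuming a SQL-backed data layer, Python's standard `sqlite3` module can create a throwaway in-memory database for each test. The `users` table and the `add_user` and `count_users` helpers are made up for illustration.

```python
import sqlite3
import unittest


def add_user(conn, name):
    """Hypothetical data-access function under test."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


class UserTableTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test keeps the tests independent.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
        )

    def tearDown(self):
        self.conn.close()

    def test_insert_is_visible(self):
        add_user(self.conn, "alice")
        self.assertEqual(count_users(self.conn), 1)

    def test_duplicate_name_violates_constraint(self):
        add_user(self.conn, "alice")
        with self.assertRaises(sqlite3.IntegrityError):
            add_user(self.conn, "alice")
```

Tests like these do touch a database, so by the stricter definitions quoted earlier they are closer to integration tests, but they stay fast and repeatable because nothing outlives the test.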

@GarryShutler
I actually unit test email by using a fake SMTP server (Wiser). It makes sure your application code is correct:
http://maas-frensch.com/peter/2007/08/29/unittesting-e-mail-sending-using-spring/
Something like that could probably be done for other servers. Otherwise you should be able to mock the API...
BTW: 100% coverage is only the beginning... it just means that all code has actually been executed once... it says nothing about edge cases etc.

Most tests that need huge and expensive (in cost of resources or computation time) setups are integration tests. Unit tests should (in theory) only test small units of the code: individual functions.
For example, if you are testing email-functionality, it makes sense, to create a mock-mailer. The purpose of that mock is to make sure, your code calls the mailer correctly. To see if your application actually sends mail is an integration test.
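A mock mailer of this kind might look like the following sketch. It uses `unittest.mock.Mock` from the Python standard library; `send_welcome` and the mailer's `send` method with its keyword arguments are hypothetical, chosen only for the example.

```python
from unittest.mock import Mock


def send_welcome(mailer, address):
    """Hypothetical code under test: composes one message and hands it off."""
    mailer.send(to=address, subject="Welcome", body="Thanks for signing up.")


def test_send_welcome_calls_mailer_correctly():
    mailer = Mock()  # no SMTP server involved
    send_welcome(mailer, "new.user@example.com")
    # The unit test only checks that the mailer was called correctly.
    mailer.send.assert_called_once_with(
        to="new.user@example.com",
        subject="Welcome",
        body="Thanks for signing up.",
    )
```

Whether a message really arrives in an inbox is left to an integration test against a real or fake mail server.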
It is very useful to make a distinction between unit-tests and integration tests. Unit-tests should run very fast. It should be easily possible to run all your unit-tests before you check in your code.
However, if your test-suite consists of many integration tests (that set up and tear down databases and the like), your test-run can easily exceed half an hour. In that case it is very likely that a developer will not run all the unit-tests before she checks in.
So to answer your question: Do not unit-test things that are better implemented as an integration test (and also don't test getters/setters - it is a waste of time ;-) ).

In unit testing, you should not test anything that does not belong to your unit; testing units in their context is a different matter. That's the simple answer.
The basic rule I use is that you should unit test anything that touches the boundaries of your unit (usually class, or whatever else your unit might be), and mock the rest. There is no need to test the results that some database query returns, it suffices to test that your unit spits out the correct query.
This does not mean that you should omit stuff that is just hard to test; even exception handling and concurrency issues can be tested pretty well using the right tools.
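Checking that a unit "spits out the correct query" can be done with a small recording double. The sketch below is only one way to do it; `OrderRepository`, `RecordingDb`, the `orders` table and the `execute(sql, params)` shape are assumptions made for the example.

```python
class RecordingDb:
    """Test double that records what it is asked to execute."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=()):
        self.calls.append((sql, tuple(params)))
        return []  # no real rows; the query result is not what is being tested


class OrderRepository:
    """Hypothetical unit under test."""

    def __init__(self, db):
        self._db = db

    def find_by_customer(self, customer_id):
        return self._db.execute(
            "SELECT id, total FROM orders WHERE customer_id = ?", (customer_id,)
        )


def test_find_by_customer_issues_expected_query():
    db = RecordingDb()
    OrderRepository(db).find_by_customer(42)
    assert db.calls == [
        ("SELECT id, total FROM orders WHERE customer_id = ?", (42,))
    ]
```

The trade-off is that such a test pins the exact SQL text, so it has to change whenever the query is reworded, even if the behaviour stays the same.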

"What not to test when it comes to Unit Testing?"
* Beans with just getters and setters. Reasoning: Usually a waste of time that could be better spent testing something else.

Anything that is not completely deterministic is a no-no for unit testing. You want your unit tests to ALWAYS pass or fail with the same initial conditions - if weirdness like threading, or random data generation, or time/dates, or external services can affect this, then you shouldn't be covering it in your unit tests. Time/dates are a particularly nasty case. You can usually architect code to have a date to work with be injected (by code and tests) rather than relying on functionality at the current date and time.
That said though, unit tests shouldn't be the only level of testing in your application. Achieving 100% unit test coverage is often a waste of time, and quickly meets diminishing returns.
Far better is to have a set of higher level functional tests, and even integration tests to ensure that the system works correctly "once it's all joined up" - which the unit tests by definition do not test.
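The date-injection approach mentioned above could look roughly like this. `is_expired` and its `today` parameter are hypothetical names; the point is only that the test supplies the date instead of reading the system clock.

```python
from datetime import date


def is_expired(expiry, today=None):
    """Hypothetical function; `today` can be injected by tests."""
    if today is None:
        today = date.today()  # production default
    return today > expiry


def test_is_expired_is_deterministic():
    expiry = date(2020, 1, 31)
    # The same inputs give the same answers regardless of when the test runs.
    assert not is_expired(expiry, today=date(2020, 1, 31))
    assert is_expired(expiry, today=date(2020, 2, 1))
```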

Anything that needs a very large and complicated setup. Of course you can test FTP (the client), but then you need to set up an FTP server. For a unit test you need a reproducible test setup. If you cannot provide it, you cannot test it.

You can test them, but they won't be unit tests. A unit test is something that doesn't cross boundaries, such as going over the wire, hitting a database, running or interacting with a third party, touching an untested/legacy codebase, etc.
Anything beyond this is integration testing.
The obvious answer to the question in the title is: you shouldn't unit test the internals of your API, you shouldn't rely on someone else's behavior, and you shouldn't test anything that you are not responsible for.
The rest should be just enough to let you write your code inside it, no more, no less.

Sure 100% coverage is a good goal when working on a large project, but for most projects fixing one or two bugs before deployment isn't necessarily worth the time to create exhaustive unit tests.
Exhaustively testing things like forms submission, database access, FTP access, etc at a very detailed level is often just a waste of time; unless the software being written needs a very high level of reliability (99.999% stuff) unit testing too much can be overkill and a real time sink.

I disagree with quamrana's response regarding not testing third-party code. This is an ideal use of a unit test. What if bugs are introduced in a new release of a library? Ideally, when a new version of a third-party library is released, you run the unit tests that represent the expected behaviour of this library to verify that it still works as expected.

Configuration is another item that is very difficult to test well in unit tests. Integration tests and other testing should be done against configuration. This reduces redundancy of testing and frees up a lot of time. Trying to unit test configuration is often frivolous.

FTP, SMTP, I/O in general should be tested using an interface. The interface should be implemented by an adapter (for the real code) and a mock for the unit test.
No unit test should exercise the real external resource (FTP server etc)
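One possible shape for this interface/adapter/mock split is sketched below, assuming a simple upload use case. `FileUploader`, `FtpUploader`, `FakeUploader` and `publish_report` are illustrative names; only `ftplib.FTP` and its `storbinary` method come from the Python standard library.

```python
import ftplib
import io
from typing import Protocol


class FileUploader(Protocol):
    """The interface the application code depends on."""

    def upload(self, name: str, data: bytes) -> None: ...


class FtpUploader:
    """Adapter for the real code; unit tests never instantiate it."""

    def __init__(self, host: str, user: str, password: str):
        self._host, self._user, self._password = host, user, password

    def upload(self, name: str, data: bytes) -> None:
        with ftplib.FTP(self._host, self._user, self._password) as ftp:
            ftp.storbinary(f"STOR {name}", io.BytesIO(data))


class FakeUploader:
    """In-memory stand-in used by the unit tests."""

    def __init__(self):
        self.uploaded = {}

    def upload(self, name: str, data: bytes) -> None:
        self.uploaded[name] = data


def publish_report(uploader: FileUploader, lines):
    """Hypothetical application code that only knows the interface."""
    uploader.upload("report.txt", "\n".join(lines).encode("utf-8"))


def test_publish_report_uploads_joined_lines():
    fake = FakeUploader()
    publish_report(fake, ["a", "b"])
    assert fake.uploaded == {"report.txt": b"a\nb"}
```

The adapter itself stays thin enough that an occasional integration test against a real or emulated FTP server is sufficient to cover it.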

If the code to set up the state required for a unit test becomes significantly more complex than the code to be tested I tend to draw the line, and find another way to test the functionality. At that point you have to ask how do you know the unit test is right!

FTP, email and so forth can be tested with a server emulation. It is difficult but possible.
What is not testable is some error handling. All code contains error handling for errors that can never occur. For example, in Java you must catch many exceptions because they are part of an interface, even though the instance actually used will never throw them. Another example is the default case of a switch when a case block exists for every possible value.
Of course some of the unneeded error handling can be removed. But if a coding error is introduced in the future, its absence will hurt.

The main reason to unit test code in the first place is to validate the design of your code. It's possible to gain 100% code coverage, but not without using mock objects or some form of isolation or dependency injection.
Remember, unit tests aren't for users, they are for developers and build systems to use to validate a system prior to release. To that end, the unit tests should run very fast and have as little configuration and dependency friction as possible. Try to do as much as you can in memory, and avoid using network connections from the tests.
