How can I guarantee all unit tests pass before committing?

How can I guarantee all unit tests pass before committing? - unit-testing

We've had problems recently where developers commit code to SVN that doesn't pass unit tests, fails to compile on all platforms, or even fails to compile on their own platform. While this is all picked up by our CI server (Cruise Control), and we've instituted processes to try to stop it from happening, we'd really like to be able to stop the rogue commits from happening in the first place.
Based on a few other questions around here, it seems to be a Bad Idea™ to force this as a pre-commit hook on the server side mostly due to the length of time required to build + run the tests. I did some Googling and found this (all devs use TortoiseSVN):
http://cf-bill.blogspot.com/2010/03/pre-commit-force-unit-tests-without.html
Which would solve at least two of the problems (it wouldn't build on Unix), but it doesn't reject the commit if it fails. So my questions:
Is there a way to make a pre-commit hook in TortoiseSVN cause the commit to fail?
Is there a better way to do what I'm trying to do in general?

There is absolutely no reason why your pre-commit hook can't run the Unit tests! All your pre-commit hook has to do is:
Checkout the code to a working directory
Compile everything
Run all the unit tests
Then fail the hook if the unit tests fail.
It's completely possible to do. And, afterwords, everyone in your development shop will hate your guts.
Remember that in a pre-commit hook, the entire hook has to complete before it can allow the commit to take place and control can be returned to the user.
How long does it take to do a build and run through the unit tests? 10 minutes? Imagine doing a commit and sitting there for 10 minutes waiting for your commit to take place. That's the reason why you're told not to do it.
Your continuous integration server is a great place to do your unit testing. I prefer Hudson or Jenkins over CruiseControl. They're easier to setup, and their webpage are more user friendly. Even better they have a variety of plugins that can help.
Developers don't like it to be known that they broke the build. Imagine if everyone in your group got an email stating you committed bad code. Wouldn't you make sure your code was good before you committed it?
Hudson/Jenkins have some nice graphs that show you the results of the unit testing, so you can see from the webpage what tests passed and failed, so it's very clear exactly what happened. (CruiseControl's webpage is harder for the average eye to parse, so these things aren't as obvious).
One of my favorite Hudson/Jenkins plugin is the Continuous Integration Game. In this plugin, users are given points for good builds, fixing unit tests, and creating more passed unit tests. They lose points for bad builds and breaking unit tests. There's a scoreboard that shows all the developer's points.
I was surprised how seriously developers took to it. Once they realized that their CI game scores were public, they became very competitive. They would complain when the build server itself failed for some odd reason, and they lost 10 points for a bad build. However, the number of failed unit tests dropped way, way down, and the number of unit tests that were written soared.

There are two approaches:
Discipline
Tools
In my experience, #1 can only get you so far.
So the solution is probably tools. In your case, the obstacle is Subversion. Replace it with a DVCS like Mercurial or Git. That will allow every developer to work on their own branch without the merge nightmares of Subversion.
Every once in a while, a developer will mark a feature or branch as "complete". That is the time to merge the feature branch into the main branch. Push that into a "staging" repository which your CI server watches. The CI server can then pull the last commit(s), compile and test them and only if this passes, push them to the main repository.
So the loop is: main repo -> developer -> staging -> main.
There are many answers here which give you the details. Start here: Mercurial workflow for ~15 developers - Should we use named branches?
[EDIT] So you say you don't have the time to solve the major problems in your development process ... I'll let you guess how that sounds to anyone... ;-)
Anyway ... Use hg convert to get a Mercurial repo out of your Subversion tree. If you have a standard setup, that shouldn't take much of your time (it will just need a lot of time on your computer but it's automatic).
Clone that repo to get a work repo. The process works like this:
Develop in your second clone. Create feature branches for that.
If you need changes from someone, convert into the first clone. Pull from that into your second clone (that way, you always have a "clean" copy from subversion just in case you mess up).
Now merge the Subversion branch (default) and your feature branch. That should work much better than with Subversion.
When the merge is OK (all the tests run for you), create a patch from a diff between the two branches.
Apply the patch to a local checkout from Subversion. It should apply without problems. If it doesn't, you can clean your local checkout and repeat. No chance to lose work here.
Commit the changes in subversion, convert them back into repo #1 and pull into repo #2.
This sounds like a lot of work but within a week, you'll come up with a script or two to do most of the work.
When you notice someone broke the build (tests aren't running for you anymore), undo the merge (hg clean -C) and continue to work on your working feature branch.
When your colleagues complain that someone broke the build, tell them that you don't have a problem. When people start to notice that your productivity is much better despite all the hoops that you've got to jump, mention "it would be much more simple if we would scratch SVN".

The best thing to do is to work to improve the culture of your team, so that each developer feels enough of a commitment to the process that they'd be ashamed to check in without making sure it works properly, in whatever ways you've all agreed.

Related

How Do I Know That I'm Not Breaking Anything During Refactoring?

I've started my first experience in refactoring on huge system and writing unit tests for it, but I am just scared that I'm breaking the code without knowing it.
I studied the "the art of unit testing" and "working efficiently with legacy code" to find a solution, and my next plan is just stop refactoring for a while and write some integration testing(I have selected Fitnesse tool for integration testing purpose) to run them every time after I change some thing.
I am just wondering is there any other one with same experience? Do you think inetegration testing can be a good solution for this issue? Do you have any better idea?
I also checked this question (How can I check that I didn't break anything when refactoring?) but my situation is different with that, because there is no unit test available and I am here to write unit tests.

Integration testing is part of a good solution for refactoring. However some problems introduced by the refactoring will only show up when you have deployed the project.
A better idea would be to incorporate the integration testing into a continuous delivery strategy. This means you should have a clean and practical approach to build and deploy the project as often as possible to a near identical environment while refactoring it. The book Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment is a good resource. Here is one of the antipatterns it describes (Pages 7-9):
Antipattern: Deploying to a Production-like Environment Only after Development Is Complete
In this pattern, the first time the software is deployed to a
production-like environment (for example, staging) is once most of
the development work is done...
Once the application is deployed into staging, it is common for new
bugs to be found...
The remedy is to integrate the testing, deployment, and release
activities into the development process. Make them a normal and
ongoing part of development so that by the time you are ready to
release your system into production there is little to no risk,
because you have rehearsed it on many different occasions in a
progressively more production-like sequence of test environments. Make
sure everybody involved in the software delivery process, from the
build and release team to testers to developers, work together from
the start of the project.

At the end of the day, this is the problem of working with Legacy Code.
Integration Tests are your best bet, but to write those to correctly meet your needs, you would need to know the original intent of the original code, which often isn't as clear, because there are often hidden requirements.
There are no ideal solutions.

Although previous answers are very good, I'd like to add that unit tests are exactly for this. In our test project when we refactor each other components, its mandatory to run already existing unit tests prepared from initial developer + new ones before commit to the Version control. Besides - its a good approach to have smoke tests running on every check-in. An ofcourse - Integration, Regression etc. afterwards.
UPDATE
I'm in the exact same situation - chained to maintenance. Tools can vary greatly - depending of the needs. Starting from Web-, -Unit-Testing up to SOA- and Server-testing. If you provide more detailed info about your SUT I'll gladly try to help.

buildbot vs hudson/jenkins for C++ continuous integration

I'm currently using jenkins/hudson for continuous integration a large mostly C++ project. We have separate projects for trunk and every branch. Also, there are some related projects for the Java code, but the setup for those are fairly basic right now (we may do more later though). The C++ projects do the following:
Builds everything with options for whether to reconfigure, do a clean build, or use a fresh checkout
Optionally builds and runs all tests
Optionally runs all tests using Valgrind's memcheck
Runs cppcheck
Generates doxygen documentation
Publishes reports: unit tests, valgrind, cppcheck, compiler warnings, SLOC, open tasks, and code coverage (using gcov, gcovr, and the cobertura plugin)
Deploys code nightly or on demand to a test environment and a package repository
Everything is configurable for automatic builds and optional for on demand builds. Underneath, there's a bash script that controls much of this, which farther depends on our build system, which uses automake and autoconf along with custom bash scripts.
We started using Hudson (at the time) because that's what the Java guys were using and we just wanted nightly builds. Since then, we've added a lot more and continue to add more. In some ways Hudson is great, but certainly isn't ideal.
I've looked at other solutions and the only one that looks like it could be a replacement is buildbot. Would buildbot be better for this situation? Is the investment worth it since we're already using Hudson? Why?
EDIT: Someone asked why I haven't found Hudson/Jenkins to be ideal. The short answer is that everything can be improved. I'm simply wondering if Jenkins is the best current solution for my use case or whether there is something better (buildbot?) that would be easier to maintain in the long run even as new requirements come up.

Both are open source projects, but you do not need to change buildbot code to "extend" it, it is actually quite easy to import your own packages in its configuration in which you can sub-class most of the features with your own additions. Examples: your own compilation or test code, some parsing of outputs/errors to be given to the next steps, your own formating of alert emails etc. there are lots of possibilities.
Generally I would say that buildbot is the most "general purpose" automatic builds tools. Jenkins however might be the best related to running tests, especially for parsing and presenting results in nice ways (results, details, charts.. some clicks away), things that buildbot does not do "out-of-the-box". I'm actually thinking of using both to have sexier test result pages.. :-)
Also as a rule of thumb it should not be difficult to create a new tool's config: if the specification of what to do (configs, builds, tests) is too hard to switch from one tool to another, it is a (bad) sign that not enough configuration scripts are moved to the sources. Buildbot (or Jenkins) should only call simple commands. If it is simple to run tests, then developers will do it as well and this will improve the success rate, whereas if only the continuous integration system runs the tests, you will be running after it to fix the new code failures, and will loose its non-regression value, just my 0.02€ :-)
Hope it'll help.

The 'result integration' is also in jenkins/hudson, and you can relatively easily capture build products without having to 'copy them elsewhere'.
For our instance, the coverage reports and unit test metrics and javadoc for the java code is all integrated. For our C++ code, the plugins are a little lacking, but you can still get most of it.
we ran buildbot since pre 0.7, and are now running 0.8 and are only now seeing any real reason to switch, as buildbot 0.8 forgot about windows slaves for an extended period of time and the support was pretty poor.

There are many other solutions out there, besides Jenkins/Hudson/BuildBot:
TeamCity by Jetbrains
Bamboo by Atlassian
Go by Thoughtworks
Cruise Control
OpenMake Meister
The specifics about what you are doing are not so important, in fact, as long as the agents (aka nodes) that you are doing them on support those tasks.
The beauty of a CI server is noticing when the build changes to trigger a new build (and test), publish the artifacts, and publish test results.
When you compare CI tools like those we mentioned, consider features like the usability of its interface, how easy is branching (and features it might offer like automatic merging), notifications (like XMPP/jabber), or an information-radiator (like hooking up a monitor to always show status). Product support is another thing to consider - Jenkins' support is only as good as who is responding to community questions at the time you have questions.
My personal favorite is Bamboo, but it comes with a license fee.

I'm a long-time Jenkins user in the middle of evaluating Buildbot and would like to offer a few items for folks considering using Buildbot for multi-module solutions:
*) Buildbot doesn't have any out-of-the-box concept of file artifacts related to each build. It's not in the UI and it's not in any of the builtin "steps" modules as far as I can see:
http://docs.buildbot.net/current/manual/configuration/buildsteps.html
...and I see no third party plugin:
https://github.com/buildbot/buildbot/wiki/PluginList#steps
Buildbot does collect all the console output from a given build, but critically, you can't collect files related to it.
*) Given that artifacts are not supported, it's not easy to create "collector" projects that bring multiple modules into say, a single installer. Jenkins has a great feature that lets you parameterize a build with builds from other modules (the parameter type is a run).
*) Establishing dependencies between modules is trickier in Buildbot. Say you have a library that three binaries depend on, and you want those binaries to rebuild each time the library changes. Jenkins has triggers built into the UI. If you want to do triggers in Buildbot you have to script them using schedulers.Dependent, and it causes a lot of item congestion in the Schedulers UI.
*) When you're working in Buildbot, it seems that pretty much all of the configuration is done in master.cfg in code. This is awesome and frustrating.
*) Buildbot forces you to create a worker in addition to a master server. This is annoying for beginners and systems for which a single build server is sufficient.
My impression after two days of Buildbot evaluation is that we'll stick with Jenkins, primarily due to it having artifacts. Buildbot is a tool we'd only use if we had more extensive customization needs, and the time to do it.

On the subject of buildbot and artifacts -- I don't have enough user score to make a comment -- you can get artifacts from buildbot 2.x series pretty easy with built-in file/directory upload actions. However you rarely want to just move files. Typically you make a triggered buildstep that does deployment directly off the worker for best results. eg push to cloud storage, containers, thirdparty (steam uploads), etc.
This way you can get metrics on the uploads and conditionally control them better (or even mix and match artifacts across worker machines).

Preventing build breaks - using a pre-commit build

What methods are available for preventing sloppy developers from breaking builds.
Are there any version control systems which have a system of preventing check-in of code which breaks the build.
Thanks

Microsoft TFS Build has something called "gated check-ins" which provides this, by performing a private check-in (called Shelving) which is promoted to a normal check-in if the build succeeds.
http://blogs.msdn.com/b/patcarna/archive/2009/06/29/an-introduction-to-gated-check-in.aspx
TeamCity has the concept of "delayed commit"
http://www.jetbrains.com/teamcity/features/delayed_commit.html
I can wholeheartedly recommend TeamCity!

Get a LART and beat developers who break the build.
Have the build status on a great big screen where everyone can see it with the a red/green background and the name of the last person to commit.
Have the build server send an e-mail to the whole dev team fingering the developer who broke the build.
Honestly... why do people get so hung up on "making developers do X". Tell them it's the process, and fire them if they don't follow it.
EDIT: because the following was too long for a comment.
I work on a team with 12 or so developers. Some people consider that large, some consider it small
We have a large screen (32" flat screen TV on a 6 foot stand) that everyone can see that tells us all sorts of information - including (in the biggest box on the screen) the state of the "commit build".
Our process is to update from SVN and run the commit build locally (about 2-3 mins) before committing. If it passes, ship it. If not, fix it locally and repeat. Since we do TDD, this generally only happens if something you pulled out of SVN broke something you were working on.
If it fails in CI you either ignored the process or your commit collided with someone else's in a bad way. The screen goes red, someone shouts at you, you fix it and move on. This generally only happens to us about once a week or so; mostly it goes red because people try and cut corners ;-)
Nobody needs to "force developers" to do anything. We're creative, artistic individuals who are generally adult and professional enough to follow a process if that process makes sense. In this case, build locally and only commit if that passes.
Who cares if the build breaks in CI, as long as it's fixed quickly so it doesn't prevent the team from working?

What to do with non-regressive tests?

Not really a Ruby on Rails question, but that is the framework which we are working in.
We are migrating data from a legacy system into our own system, and have been testing the code that will do the data migrations. These tests live alongside the rest of the applications tests, and so ran against our build server on commits, etc.
Once we've migrated this data, these tests will seemingly be useless to us, since the code they are testing will never be run again. What's more, is the tests will most likely get stale, and might require maintenance, lest they break our build.
Should we just be throwing these tests out afterward? Tagging them in some way so that they don't get ran after we do things for real? Something else?

Get rid of them.*
*Which is to say, let them sit in source control if you ever need to refer to them.

If it were me, I would separate out the project that does the data migration along with its tests. That way the tests don't generate noise in your current build process, and you only have to modify them if you (for some reason) touch the migration project again.
If this isn't possible, then just rip all of it out once you are done. If you ever need to get it back it should be in source control... right!?!

What is your (simple) continuous integration solution for Django projects?

In one of my Django projects I have a suite of unit tests that are based on TransactionalTestCase class (it takes much longer than TestCase). It is impossible to run tests after each change in code because it takes more than 0.5 hour to run all tests. We looked some time ago for some easy contiuous integration tool that could allow us to (at least) run tests on tests server and send emails with errors to the team members (we have of course code repository and we don't need auto deployment at the momment). Do you have some working solutions or ideas how to accomplish this?
We wrote some 'super extra simple CI server' which does nothing more than running tests and sending email reports (it is integrated with our code repository). But since we had some problems with our not-ideal simple tool recently I'm wondering now if you have sucessfully completed similar scenarios in your working enviroment?
I'm looking for something ligthweight, easy to install and use.

Disclaimer: I don't know Django. But I do know that I use Hudson as my continuous integration tool for a number of languages and platforms. I found it easy to install and confgure on both Windows and Linux (set & forget) and was impressed with the number of plugins available.
Basically, if what you want to do can be automated by a sctript file, then you can use Hudson. It really is worth checking out.
It took me only a few minutes to set it so that I get an email if, and only if, something goes wrong, although you might want to do somethinng else (for which there probably exists a plugin). Hudson also plays well with other tools like BigZilla, all major version control tools, etc

Have you considered having two kinds of tests - basic and advanced and adding additional django command, that would run only basic tests, that are fast? This way you can do basic testing on small changes and run the full test suite only when you are about to commit/push changes?

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js