I'm teaching myself to code. I've read about TDD/BDD, which encourages programmers to write a failing test first and then write code to make the test pass.
But then I encountered the continuous integration (CI) workflow: after the code is committed and the tests pass, it is deployed to production immediately. How do the tests in the CI workflow differ from the ones developers write?
Except that it shouldn't necessarily get deployed to production immediately after a new commit that doesn't break CI (i.e. the tests pass).
The tests that the developers have written are the same ones that get run in CI, but in a different environment (to avoid the "it works on my machine" problem).
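For anyone new to the red/green rhythm mentioned in the question, here is a minimal sketch of the cycle in Python; the function and test names are invented for illustration:

```python
# Step 1: write the test first. Before add() exists, running this file fails,
# which is the "failing test" stage of TDD.
import unittest

def add(a, b):
    # Step 2: the minimal implementation, written only after seeing the test fail.
    return a + b

class AddTest(unittest.TestCase):
    def test_add_two_numbers(self):
        self.assertEqual(add(2, 3), 5)

if __name__ == "__main__":
    unittest.main()
```

The very same test file can then be executed unchanged by a CI server, which is the point of the answer above.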
When testing my Apache Spark application, I want to do some integration tests. For that reason I create a local Spark application (with Hive support enabled) in which the tests are executed.
How can I ensure that the Derby metastore is cleared after each test, so that the next test starts with a clean environment again?
What I don't want to do is restart the Spark application after each test.
Are there any best practices to achieve what I want?
I think that introducing application-level logic for integration testing rather breaks the concept of integration testing.
From my point of view the correct approach is to restart the application for each test.
Anyway, I believe another option is to start/stop the SparkContext for each test. That should clean up any relevant state.
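As a rough illustration of the start/stop-per-test option, here is a sketch assuming pytest and PySpark; pointing the warehouse at a temp directory is my own assumption, and whether this fully resets the Derby metastore depends on your configuration:

```python
# Sketch: create a fresh SparkSession for each test and stop it afterwards.
# Assumes pyspark and pytest are installed and Hive support is available.
import pytest
from pyspark.sql import SparkSession

@pytest.fixture
def spark(tmp_path):
    session = (
        SparkSession.builder
        .master("local[2]")
        .appName("integration-test")
        .config("spark.sql.warehouse.dir", str(tmp_path / "warehouse"))  # assumption: temp warehouse per test
        .enableHiveSupport()
        .getOrCreate()
    )
    yield session
    session.stop()  # tear down so the next test starts from a clean session

def test_count(spark):
    spark.range(3).createOrReplaceTempView("numbers")
    assert spark.sql("SELECT count(*) AS n FROM numbers").first()["n"] == 3
```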
UPDATE - answer to comments
Maybe it's possible to do a cleanup by deleting tables/files?
I would ask a more general question - what do you want to test with your test?
Software development defines unit testing and integration testing, and nothing in between. If you want to do something that is neither an integration test nor a unit test, then you're doing something wrong. Specifically, with your test you're trying to test something that is already tested.
For the difference between unit and integration tests, and the general idea behind them, you can read here.
I suggest you rethink your testing and, depending on what you want to test, write either an integration test or a unit test. For example:
To test application logic - unit test
To test that your application works in its environment - integration test. But here you shouldn't test WHAT is stored in Hive, only that the act of storing happened, because WHAT is stored should already be covered by a unit test.
So, the conclusion:
I believe you need integration tests to achieve your goals, and the best way to do that is to restart your application for each integration test. Because:
In real life your application will be started and stopped
In addition to your Spark stuff, you need to make sure that all the objects in your code are correctly deleted/reused - singletons, persistent objects, configurations - they can all interfere with your tests
Finally, consider the code that will support the integration tests - where is the guarantee that it will not break production logic at some point?
I'm just starting to mess about with continuous integration, so I wanted to set up Jenkins as well as SonarQube. While reading manuals/docs and tutorials I got a little bit confused.
For both systems there are descriptions of how to set up unit test runners. So where should unit tests ideally be run - in Jenkins, in SonarQube, or in both systems? Where do they belong in theory/best practice?
We have configured Jenkins to launch the unit tests, and the results are “forwarded” to Sonar to be interpreted as a post-build action.
The best practice would be to run the unit tests in Jenkins. This ensures the unit test cases are executed before we build/deploy.
SonarQube is normally used to ensure the quality of the code: it points out bad code based on guidelines/rules. It also reports on the unit test coverage, lines of code, etc.
Usually it's done in Jenkins, as you want to actually test your code before building the module.
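To make the division of labour concrete: Jenkins executes the tests and produces the report files, and SonarQube merely reads those reports. A hedged sketch of such a CI step in Python, assuming pytest and pytest-cov are installed; the module name `myapp` and the report paths are placeholders:

```python
# Run the unit tests in CI and emit reports for SonarQube to pick up later.
# Requires pytest and pytest-cov; "myapp" is a placeholder package name.
import subprocess
import sys

result = subprocess.run([
    "pytest",
    "--cov=myapp",                    # measure coverage of the app package
    "--cov-report=xml:coverage.xml",  # coverage report Sonar can ingest
    "--junitxml=test-results.xml",    # test results for the CI server
])
sys.exit(result.returncode)  # fail the build if any test fails
```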
Currently, our unit tests are committed along with the application code and executed by a build bot job on each commit. Likewise, code coverage is calculated. However, both the UTs and the coverage are - or can be - run by the developers before they commit new features to the repository, so the CI process does not seem to add any value.
What kinds of tests do you suggest should be executed on the CI server that are not already executed by the developers prior to commit?
so the CI process does not seem to add any value
No?
What happens if a developer doesn't run the tests before committing?
What happens if a developer commits a change which conflicts with a change committed by another developer?
What happens if the tests pass on the developer's machine but not on other machines?
What happens if another developer has changed a test?
...
CI isn't "continuous testing", it's "continuous integration". The tests are run as part of the integration build in order to validate that the changes committed can successfully integrate with what's already there. Whether or not the tests passed on the developer's local, potentially non-integrated workstation is immaterial.
The unit tests (any reasonably fast automated tests, really) should be executed by the CI server to validate the current state of the build. That state may be different from what is on any individual developer's workstation.
They may be the same tests that were run by the developer moments prior, but the context in which they run is physically different, and the need to run them is semantically different. Unless you're being charged by the clock cycle, there's no reason to omit running the tests.
David raises very good points. One thing that he didn't mention though is that automated testing in a continuously integrated environment can be even more powerful when it goes beyond unit tests. The CI process allows you to run integration and system level tests that would be too expensive to run on a dev box. For example, you may have unit tests for your persistence layer that run against an in-memory database. Your CI server however can run these same automated tests against a snapshot of your production database.
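A sketch of that idea: the same persistence test runs against an in-memory database on a dev box and against a fuller database on the CI server, selected by an environment variable. All the names here (TEST_DB_PATH, the users table) are invented for the example:

```python
import os
import sqlite3
import unittest

def get_connection():
    # Developers get a fast in-memory database by default; the CI server
    # can set TEST_DB_PATH to point at a snapshot instead.
    return sqlite3.connect(os.environ.get("TEST_DB_PATH", ":memory:"))

class PersistenceTest(unittest.TestCase):
    def setUp(self):
        self.conn = get_connection()
        self.conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")

    def tearDown(self):
        self.conn.close()

    def test_insert_and_read_back(self):
        self.conn.execute("INSERT INTO users VALUES ('alice')")
        rows = self.conn.execute("SELECT name FROM users").fetchall()
        self.assertIn(("alice",), rows)

if __name__ == "__main__":
    unittest.main()
```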
I completely agree with the previous posts. Since unit testing is mostly done by the devs, which tests should be executed as part of the CI process is pretty much opinion-based, depending on the goals of the team/project.
What is also important is that CI (the server) gives you a separate testing environment, so your test effort and execution can run independently. Your tests are executed in a clone of the production environment.
In my experience I've used the CI server mostly for system, integration, functional, and regression testing and UAT.
When developing, my team obviously uses development as our environment.
When we run automated tests, we use testing.
We also have staging and production environments, used respectively for our testers to check out features and for the final "live" product.
We're trying to setup an internal CI server to run our automated tests against and to eventually assist with automated deployments.
Since the CI server is really running automated tests, some think it should run in the testing environment. However, for the CI server to actually be useful, my thought is that it needs to run in production mode, with as close a mirror of the actual production environment as possible (without touching the production DB, obviously).
Is there an accepted environment that a CI server should be executed under? The production environment (with a different DB) seems the only logical answer to me, but I may be missing something...
Running any tests on the PROD environment, as you said,
seems the only logical answer
but that is not quite true. There are risks that your tests can seriously damage the actual environment/application to the point where you'll be facing a recovery operation. After all, the dark side of testing is to show/find that your software has more than just minor bugs and is not working as expected.
I can think of at least these 'why not test production' considerations:
when the product is launched, the customer relies on it, expecting that your software works (having already been tested). Your live environment should do its job and not be loaded with tests. If the product misbehaves (or does not perform), the technical team has to be sent in to cover the damage, fix the gaps and make it run hassle-free. Now this not only affects the product's cost, but delays the project deadlines in a major way. This has a knock-on effect on the vendor's profits and the next few projects.
when the production or development team completes a product's development at their end, they have to provide this test environment to the testing team prior to loading the newly developed product onto that environment for testing.
To me, no matter that you
also have staging and production environments
it is essential to use the Test one accordingly. Furthermore, the testing environment should be (configured) as close as it gets to production. Also, one person could be trying to test while another person breaks the thing he has been testing; without the two being separate there is no way to do proper testing.
Just to give a full answer: your STAGE environment can have different roles depending on the company.
One is that it can be the QA/STAGE environment that holds an exact copy of production, used for both QA and system testing (testing of the system when a lot of updates/changes or an upgrade is going to go into production).
UPDATE:
That was my point too. The QA environment should be a mirror of PROD. A possible solution to your issue with caching/pre-loading files onto staging/production is the creation of pre-/post-step .bat (let's assume) files.
In our current test project we use this approach. In the pre-steps we set up the files needed for test execution (like removing files from previous runs and downloading the latest copies/artifacts). In the post-steps we set up the reporting files needed. The advantage is that your files will be collected and synced before every execution.
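The post mentions .bat files; here is the same pre-step idea sketched in Python instead (the paths and artifact URL are placeholders, not from the original post):

```python
# Pre-step: remove leftovers from the previous run and fetch the latest
# artifact, so files are collected and synced before every execution.
import shutil
import urllib.request
from pathlib import Path

work_dir = Path("test-workdir")  # placeholder working directory

if work_dir.exists():
    shutil.rmtree(work_dir)      # clear files left over from the previous run
work_dir.mkdir()

artifact_url = "https://ci.example.com/artifacts/latest/app.zip"  # placeholder
urllib.request.urlretrieve(artifact_url, work_dir / "app.zip")
```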
About the
not on the same physical hardware
in my case we maintain a dedicated remote test server. The advantages are clear; the only thing that needs to be considered is that it will require maintenance (administration).
I'm familiar with TDD and use it in both my workplace and my home-brewed web applications. However, every time I have used TDD in a web application, I have had the luxury of full access to the web server. That means I can update the server and then run my unit tests directly on it. My question is: if you are using a third-party web host, how do you run your unit tests on it?
You could argue that if your app is designed well and your build process is sound and automated, running unit tests on your production server isn't necessary, but personally I like the peace of mind of knowing that everything is still "green" after a major update.
For everyone who has responded with "just test before you deploy" and "don't you have a staging server?": I understand where you're coming from. I do have a staging server and a CI process set up. My unit tests do run, and I make sure they all pass before an update goes to production.
I realize that in a perfect world I wouldn't be concerned with this. But I've seen it happen before: if a file is left out of an update or a SQL script isn't run, the effects are immediately apparent when running your unit tests, but can go unnoticed for quite some time without them.
What I'm asking here is: is there any way, if only to satisfy my own compulsive desires, to run a unit test on a server that I cannot install applications on or remote into (e.g. one which I will only have FTP access to in order to update files)?
I think I would probably have to argue that running unit tests on your production server isn't really part of TDD, because by the time you deploy to your production environment you are, technically speaking, past "development".
I'm quite a stickler for TDD, and when I'm preaching the benefits to clients I often find myself saying "you can't half adopt TDD, it's all or nothing"
What you probably should have is some form of automated testing that you perform after deployment, but those tests are not part of TDD.
Maybe you should look at your process again.
You could write functional tests in something like WATIR, WatiN or Selenium that test what is returned in the response page after posting certain form data or requesting specific URLs.
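For instance, a minimal Selenium sketch of such a functional test in Python (the URL, element IDs and expected text are placeholders):

```python
# Post a login form and assert on the response page, as described above.
# Assumes the Selenium Python bindings and a local Firefox/geckodriver.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
try:
    driver.get("https://example.com/login")                       # placeholder URL
    driver.find_element(By.ID, "username").send_keys("testuser")  # placeholder IDs
    driver.find_element(By.ID, "password").send_keys("secret")
    driver.find_element(By.ID, "submit").click()
    assert "Welcome" in driver.page_source  # assert on the returned page
finally:
    driver.quit()
```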
For clarification: what sort of access do you have to your web server? FTP or WebDAV only? From your question, I'm guessing ssh access isn't available - you're dropping files in a directory to deploy. Is that correct?
If so, the answer for unit testing is likely "do it before you deploy". You can set up functional testing driven by an automated tool like Selenium to test your app remotely via the web interface, but that's not really unit testing, in the sense that you're restricted to testing the system as a whole.
Have you considered setting up a staging server, perhaps as a VMWare instance, that mirrors or at least mimics your deployment environment?
What's preventing you from running unit tests on the server? If you can upload your production code and let it run there, why can't you upload this other code and run it as well?
I've written test tools for sites using Python and httplib/urllib2. Generally it would have been overkill, but it was suitable in these cases. Not sure it will be of general use though.
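For the curious, the same idea in Python 3 terms (httplib/urllib2 are the Python 2 names; urllib.request is the modern equivalent). The URLs and expected strings are placeholders:

```python
# Hit a few URLs over HTTP and check the status code and body,
# in the spirit of the httplib/urllib2 tools mentioned above.
import urllib.request

CHECKS = [
    ("https://example.com/", "Home"),      # placeholder URL / expected text
    ("https://example.com/status", "OK"),
]

for url, expected in CHECKS:
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        ok = resp.status == 200 and expected in body
        print(("PASS" if ok else "FAIL"), url)
```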