Development teams are often plagued by builds in version control being transiently broken. The entire team's productivity can come to a halt while trying to recover from a build broken by one person.
Is there software that would allow hosting Git in a way that prevents breaking builds in version control by not accepting commits that fail to pass tests in the first place? The usage scenario could for example look as follows:
The software runs on a server that continuously pulls revisions from Git repositories that developers have published.
For each pulled revision, the software builds the revision and tests if it passes the unit tests.
If it passes the tests, the revision is merged into the "stable" branch.
If it doesn't pass the tests, it is rejected and the revision is not merged into the "stable" branch. The developer is forced to correct the revision and resubmit it.
Developers by default pull from the "stable" branch that should never be broken—in the sense that tests do not fail—and are more productive as they spend less time being blocked by broken builds. And the usefulness of such a system grows with the team size.
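The scenario above can be sketched roughly as follows. This is only an illustration, assuming the server already has the candidate revision fetched into a local clone that it runs in; the names `gate`, `run`, and `runner`, and whatever test command is passed in, are hypothetical and not taken from any existing tool.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command (list of strings) and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command in the current directory and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def gate(revision: str, test_cmd: Sequence[str], runner: Runner = run,
         stable: str = "stable") -> bool:
    """Merge `revision` into `stable` only if the test command succeeds."""
    # Check out the candidate revision (detached HEAD) and test it there.
    if runner(["git", "checkout", "--detach", revision]) != 0:
        return False
    if runner(test_cmd) != 0:
        return False  # rejected: the stable branch is left untouched
    # Tests passed: switch to the stable branch and merge the revision.
    if runner(["git", "checkout", stable]) != 0:
        return False
    return runner(["git", "merge", "--no-ff", "--no-edit", revision]) == 0
```

Note that this tests the revision on its own, as in the scenario; a stricter variant would test the result of merging it into "stable" before publishing that merge.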
A few notes:
Git's pre-commit hooks and similar are not satisfactory in this case. The solution should be automatic and enforced on the server side for each commit.
Looking for a solution that has been implemented and thought out as far as possible, instead of writing a system like this myself from scratch.
I think this is more a build server feature that ties into a VCS such as Git. TeamCity does have support for this, but I've not tried it so I can't comment on how good it actually is.
http://www.jetbrains.com/teamcity/features/delayed_commit.html
The Hudson guys have been discussing it for a while, but I've yet to see it in a release.
http://wiki.hudson-ci.org/display/HUDSON/Designing+pre-tested+commit
We use Gerrit and Hudson. It is what Android and CyanogenMod use as well (along with many others).
Gerrit allows for code review and automatic building of every commit with automatic rejection of those that fail tests.
Hudson runs the tests.
Hudson: http://wiki.hudson-ci.org/display/HUDSON/Designing+pre-tested+commit
Gerrit: http://gerrit.googlecode.com
This system works well with the repo tool, which lets you keep a large number of small repositories; that reduces the merge conflicts that have to be handled manually via a rebase.
Note: it is quite a bit of work to get up and running if you have a large existing code base, but totally worth it.
It's a really good idea. Bamboo supports this quite naturally. Several Bamboo customers as well as teams at Atlassian have this exact methodology set up and working to great effect. Bamboo has event listeners which can tell (without polling) when a commit is pushed to a 'test verification' repo and then verify it by running the tests before pushing to the stable branch. www.atlassian.com/bamboo
I'm new to git. I've read the well-written intro book. But gee, it's still not a trivial topic. I've been bumbling around, experiencing various problems. I realized it might be because I'm unaware of workflow, and specifically, "what are the best practices for doing what I'm trying to do?"
I started out developing a django project on my win7 with Pycharm. Great way to get the initial 95% written.
But then I need to deploy it to my production machine at PythonAnywhere.
So I created a private Github repository, pushed my win7 codebase to github.
Then in pythonAnywhere, I cloned the github repository.
For now, no others work on this project. It will not be released to the public.
Now that the server is running on PythonAnywhere, I still need to tweak settings, which is best done on the PythonAnywhere codebase side. But there are other improvements (new pages, or views) that I'd rather do inside Pycharm IDE on my win7 than in vim on python anywhere.
So I've been kind of clumsily pushing and fetching these changes. It's been kind of ham-handed, and I've managed to lose some minor changes through ignorance.
So I'm wondering if anyone can point to a relatively simple workflow that would handle the various tasks I mentioned:
1) improving functionality of the site (best done in Pycharm IDE)
2) production server issues and tweaks (best done on PythonAnywhere)
3) keeping everything safely backed-up on Github
The other issue is that I have another django app that I want to build. It's easiest to temporarily hang it off the django project I've already built. But I'd prefer to keep it in its own repository.
So I have Original_Project, Original_App stored in Original_Repository
I want to make new_app, and have it, for the time being, run in Original_Project, but I want to version control it in New_Repository.
I think/hope that I could put a .gitignore in the Original_Repository, saying ignore the new_app/ directory. Then I git init new_app/ as its own repository. Is that sound or mad?
You should avoid editing your code on the production server as much as possible, and never commit from the production server. If you end up having to tweak things on the server (you shouldn't, but shit happens and sometimes it's indeed easier to first get the code working again on the server), then once it's working, manually port your edits back to your local repo, discard the changes on the server, and deploy the fixed code again. Here the github repo should be considered the "master" repository for deployments, i.e. you work on your local repo, push to github, and on the server pull from github. This makes sure the github repo stays in sync.
wrt/ the "improving functionality" (aka "features") vs "server issues and tweaks" (aka "hotfixes"), git flow is a (mostly) sane workflow IMHO but that's a bit opinion-based here (some dislike it and have sensible arguments too).
Finally, if you want to factor out one of your apps, the best is to have it in its own (github) repo with all the proper python packaging stuff and make it a requirement of your main project. On your local dev environment you install it as an editable package, and for the production setup you install it as a normal package pinned to the last stable version. Note that in both cases I assume you're using virtualenvs (and if you don't, well that's the very first issue you should address).
Update:
What are the downsides of editing directly on the production server and committing from the production server?
Well quite simply a production server is not the place for coding - "production" means that you have users trying to do something with your website and they don't want to have the site breaking on them, their data lost or whatever because you are "tweaking" things. You should only deploy stable, well tested code on production, and the one and only one case where editing anything on the server might be a last resort option is when it's already broken and you want to get it back online asap whatever it takes (case of "first make it work, then make it clean").
Point is, I'm a professional developer working on projects that are business-critical and a broken site is not an option, so I'm very strict on this - but even if it's a hobby project, your users deserve some respect (at least if you expect to see them back).
A proper production chain actually involves at least three environments: your local dev environment, a staging server (which should closely mirror the production server - system, system package versions, configurations etc.) to test out / showcase / eventually do minor config tweaks, and the production server which should only ever see stable tested code.
I have always struggled with git, knowing it well enough to get things working, but never being sure I am doing things well.
I would suggest installing git flow (it is probably available in your package manager if you are on Linux). It's a set of extensions that simplify a standard git workflow. Since I started using it, this has pretty much been all the documentation I have needed.
https://danielkummer.github.io/git-flow-cheatsheet/
This may seem like a very broad question, but I am really interested to know about possible approaches. Our team has a Django web app and we have a huge number of unit tests for our features. In github, we have a master branch, a develop branch, and individual feature/bug branches. The problem I want to solve is this:
Every time some code is merged into the develop branch, I want to run all (or a subset) of the unit tests against that branch. It would be cool to have it automated, i.e. I do not have to trigger the test run.
I have read and heard about Jenkins - http://michal.karzynski.pl/blog/2014/04/19/continuous-integration-server-for-django-using-jenkins/. It is currently one of the approaches I am leaning towards.
But I wanted to know if there are better approaches or tools which I can use.
Appreciate all your help.
For what it's worth, you can't really go wrong with Jenkins for the functionality you are looking to achieve.
Although Travis CI may be a better option given that it's meant to work seamlessly with Github and it appears all of your repositories have been moved to Github.
Really depends on your business needs though.
In my experience, getting Jenkins up and running has always gone very smoothly, and it gives you the benefit of keeping all data in house, since you can host Jenkins on your own private servers. Depending on your setup, though, it probably doesn't scale or run as efficiently as Travis CI does.
Travis CI will probably allow for an even more seamless approach because it's already hosted for you and tied directly into Github, but you won't get the same privacy as running Jenkins on your own servers. It appears there is a paid option for Travis CI, though, which again, depending on your business needs, may be a better option.
When developing, my team obviously uses development as our environment.
When we run automated tests, we use testing.
We also have staging and production environments, respectively used for our testers to check out features and the final "live" product.
We're trying to set up an internal CI server to run our automated tests against and to eventually assist with automated deployments.
Since the CI server is really running automated tests, some think it should be run in testing environment. However, in order for the CI server to actually be useful, my thoughts are that it needs to be run in production mode with as close-as-possible a mirror of the actual production environment (without touching the production DB, obviously).
Is there an accepted environment that a CI server should be executed under? production environment (with different DB) seems the only logical answer to me, but I may be missing something...
Running tests on the PROD environment, as you said, "seems the only logical answer", but that is not quite true. There is a risk that your tests seriously damage the actual environment/application, to the point where you have to perform a recovery. After all, the dark side of testing is that it may reveal that your software has more than minor bugs and is not working as expected.
I can think of at least these 'why not test production' considerations:
When the product is launched, the customer relies on it, expecting that your software is working (having already been tested). Your live environment should do its job and not be loaded with tests. If the product misbehaves (or does not perform), the technical team has to be sent in to cover the damage, fix the gaps and make it run hassle free. This not only affects the product cost, but also delays the project deadlines in a major way, which has a knock-on effect on the vendor's profits and on the next few projects.
When the production or development team completes development of a product on their end, they have to provide this test environment to the testing team before loading the newly developed product onto it for testing.
To me, no matter that you "also have staging and production environments", it is essential to use the Test one accordingly. Furthermore, the Testing environment should be configured as close as possible to Production. Also, one person could be trying to test while another person breaks the thing being tested. Without the two being separate, there is no way to do proper testing.
For completeness, your STAGE environment can have different roles depending on the company.
One is that it can be the QA/STAGE environment that has an exact copy of production which is used for both QA and system testing (testing of the system when a lot of updates/changes or upgrade is going to go into production).
UPDATE:
That was my point too. The QA environment should be a mirror of PROD. A possible solution to your issue with caching/pre-loading files onto staging/production is to create pre-/post-step scripts (.bat files, let's assume).
In our current Test project we use this approach. In the pre-steps we set up the files needed for test execution (removing files from previous runs and downloading the latest copies/artifacts). In the post-steps we set up the reporting files needed. The advantage is that your files will be collected and synced before every execution.
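A rough Python equivalent of such pre-/post-steps might look like the sketch below. It is only an assumption about how the steps could be organised: the function names, the `*.xml` report pattern, and the directory layout are hypothetical, and the artifact download is left out.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def pre_step(workdir: Path) -> None:
    """Remove leftovers from the previous run and recreate an empty work dir."""
    shutil.rmtree(workdir, ignore_errors=True)
    workdir.mkdir(parents=True)


def post_step(workdir: Path, report_dir: Path) -> List[str]:
    """Copy report files produced by the run into the report directory."""
    report_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for report in sorted(workdir.glob("*.xml")):
        shutil.copy2(report, report_dir / report.name)
        copied.append(report.name)
    return copied


def run_with_steps(workdir: Path, report_dir: Path,
                   tests: Callable[[Path], None]) -> List[str]:
    """Run `tests` between the pre-step and the post-step."""
    pre_step(workdir)
    try:
        tests(workdir)
    finally:
        # Collect reports even if the test run raised an exception.
        copied = post_step(workdir, report_dir)
    return copied
```

Because the pre-step wipes the work directory, reports from an earlier run cannot leak into the current one.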
About the "not on the same physical hardware" point: in my case we maintain a dedicated remote Test server. The advantages are clear; the only thing you need to consider is that it will require maintenance (administration).
I am a Qt/C++ developer. I would like to set up a continuous integration environment whereby committing the source code triggers a build process that builds the code for the 3 platforms I'm using:
Linux
OS X
Win32
If possible, how do I set up such an environment? Any hints or links are welcome.
I've read around about Jenkins, but I can't find any good tutorial for it.
I also suggest Jenkins for several reasons:
It will run on all of the platforms you listed.
It can be configured to start a build when the repository is updated (hint: configure the Job to "Poll SCM" and you won't have to muck with your SCM tool to get it to tell Jenkins to start building).
It provides good support (mostly through plugins) for Unit Testing. [Your project is doing unit testing, right?]
The price is right
A bigger issue you are going to have is that AFAIK, Qt doesn't really do cross-compiling for other platforms well. Using Jenkins (and the appropriate plugins), you should be able to solve this.
One method that comes quickly to mind is to have an instance of Jenkins on each platform. Each instance is responsible for building the version for its own platform. At the end of the build, the created artifacts are all put into a common, shared location.
Jenkins supports this feature via plugins for all major source control systems. If you are seriously considering using Jenkins (and I would highly recommend it), consider buying John Ferguson Smart's Jenkins: The Definitive Guide.
Two solutions come to mind:
BuildBot
BuildBot is a highly customizable continuous integration system written in Python. The master component offers a nice web-based GUI to monitor and trigger builds; slave components are put on the target machines (usually virtual machines but they could be the Mac laptop of one of the developers). Docs are good enough to build up a basic system, customization could be a little tricky (at least it was for me). Using commit/push hooks provided by VC systems you can easily activate the master and trigger builds across the slaves. It also supports incremental builds (a must if your project is big).
CDash
Developed by the authors of CMake, CDash is a web application collecting builds coming from across the network. It is not exactly what you asked for, but I think it's worth a try. Very powerful if you have a team of developers who could continuously submit build results from their machines to the server (and if you use CMake it's almost transparent). You cannot trigger builds from the server as Buildbot does, but you could set up a bunch of VMs with a cron job which checks for changes and, if there are any, performs the build and sends the results to CDash.
Sure it's possible. Most version control systems are able to execute a custom script on the server side. Some of them (git, for example) also have hooks to achieve the same locally. Have a look at git's post-commit hook.
All you need is to create a script that will trigger cross-platform builds.
Most version control systems allow post-commit hooks to allow you to kick off events like builds. Alternatively build systems can be configured to regularly poll a source control repository and manage their own build scheduling (this is how we use Jenkins).
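To illustrate the polling alternative, here is a minimal sketch assuming a Git repository; it is not how Jenkins implements "Poll SCM". The names `remote_head` and `poll_once`, the default branch name, and the `build` callback are all hypothetical.

```python
import subprocess
from typing import Callable, Optional


def remote_head(url: str, branch: str = "master") -> Optional[str]:
    """Return the commit hash of `branch` on the remote, or None if absent."""
    out = subprocess.run(
        ["git", "ls-remote", url, f"refs/heads/{branch}"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Each output line has the form "<hash>\t<ref>"; empty if no ref matches.
    return out.split()[0] if out.strip() else None


def poll_once(last_seen: Optional[str],
              get_head: Callable[[], Optional[str]],
              build: Callable[[str], None]) -> Optional[str]:
    """Trigger `build` if the head moved; return the new last-seen hash."""
    head = get_head()
    if head is not None and head != last_seen:
        build(head)
        return head
    return last_seen
```

A scheduler (cron, or a loop with a sleep) would call `poll_once` at a fixed interval, passing something like `lambda: remote_head(url)` as `get_head` and carrying the returned hash over to the next call.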
Something to bear in mind is how long it will take to do a complete build across platforms and the typical number of check-ins in that interval. You might find batching check-ins a better way of doing continuous integration builds if you have a fair-sized team or limited build server resources. Otherwise your build system could quickly end up trying to play catch up.
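The batching idea can be sketched as a small queue: check-ins accumulate while a build is running, and the next build picks up everything that arrived in the meantime. The class and method names here are hypothetical, and a real build server would add locking and persistence.

```python
from collections import deque
from typing import Callable, Deque, List


class BatchingBuilder:
    """Queue check-ins and build them together instead of one by one."""

    def __init__(self, build: Callable[[List[str]], None]) -> None:
        self._pending: Deque[str] = deque()
        self._build = build

    def commit_arrived(self, revision: str) -> None:
        """Record a check-in; no build is started here."""
        self._pending.append(revision)

    def build_if_pending(self) -> bool:
        """Call when the build machine is free; builds all queued check-ins."""
        if not self._pending:
            return False
        batch = list(self._pending)
        self._pending.clear()
        self._build(batch)
        return True
```

The trade-off is that a failing batch no longer points at a single check-in, so someone has to work out which revision in the batch broke the build.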
As for whether it is possible to build on all target platforms, that depends on your tool chain.
Joel seems to think highly of daily builds. For a traditional compiled application I can certainly see his justification, but how does this parallel over to web development -- or does it not?
A bit about the project I'm asking for --
There are 2 developers working on a Django (Python) web app. We have 1 svn repository. Each developer maintains a checkout and their own copy of MySQL running locally (if you're unfamiliar with Django, it comes bundled with its own test server, much the way ASP apps can run inside of Visual Studio). Development and testing are done locally, then committed back to the repository. The actual working copy of the website is an SVN checkout (I know about SVN export and it takes too long). The closest we have to a 'build' is a batch file that runs an SVN update on the working copy, does the django bits ('manage.py syncdb'), updates the search engine cache (solr), then restarts apache.
I guess what I don't see is the parallel to web apps.
Are you doing a source controlled web app with 'nightly builds' -- if so, what does that look like?
You can easily run all of your Django unit tests through the Django testing framework as your nightly build.
That's what we do.
We also have some ordinary unit tests that don't leverage Django features, and we run those, also.
Even though Python (and Django) don't require the kind of nightly compile/link/unit test that compiled languages do, you still benefit from the daily discipline of "Don't Break The Build". And a daily cycle of unit testing everything you own is a good thing.
We're in the throes of looking at Python 2.6 (which works perfectly for us) and running our unit tests with the -3 option to see which deprecated features we're using. Having a full suite of unit tests assures us that a change for Python 3 compatibility won't break the build. And running them nightly means that we have to be sure we're refactoring correctly.
Continuous integration is useful if you have the right processes around it. TeamCity from JetBrains is a great starting point if you want to build familiarity:
http://www.jetbrains.com/teamcity/index.html
There's a great article that relates directly to Django here:
http://www.ajaxline.com/continuous-integration-in-django-project
Hope this gets you started.
Web applications built in dynamic languages may not require a "compilation" step, but there can still be a number of "build" steps involved in getting the app to run. Your build scripts might install or upgrade dependencies, perform database migrations, and then run the test suite to ensure that the code is "clean" w.r.t. the actual checked-in version in the repository. Or, you might deploy a copy of the code to a test server, then run a set of Selenium integration tests against the new version to ensure that core site functionality still works.
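As a sketch of what such a build script could look like for a Django project: the steps run in order and the build stops at the first failure. The `requirements.txt` file name, the choice of steps, and the names `BUILD_STEPS` and `run_build` are assumptions for illustration; older Django versions would use `syncdb` rather than `migrate`.

```python
import subprocess
import sys
from typing import Callable, List, Optional, Sequence

# Hypothetical build steps: install dependencies, migrate, run the tests.
BUILD_STEPS: List[List[str]] = [
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    [sys.executable, "manage.py", "migrate", "--noinput"],
    [sys.executable, "manage.py", "test"],
]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_build(steps: Sequence[Sequence[str]],
              runner: Callable[[Sequence[str]], int] = run
              ) -> Optional[Sequence[str]]:
    """Run steps in order; return the first failing step, or None on success."""
    for step in steps:
        if runner(step) != 0:
            return step  # fail fast: later steps are skipped
    return None
```

A CI job could call `run_build(BUILD_STEPS)` from the project root and mark the build as broken whenever the result is not `None`.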
It may help to do some reading on the topic of Continuous Integration, which is a very useful practice for webapp dev teams. The more fast-paced and agile your development process, the more you need regular input from automated testing and quality metrics to make sure you fail fast and loud on any broken version of the code.
If it's really just you and one other developer working on it, nightly builds are probably not going to give you much.
I would say that the web app equivalent of nightly builds would be staging sites (which can be built nightly).
Where nightly builds to a staging area start paying real dividends is when you have clients, project managers, and QA people that need to be able to see an up to date, but relatively stable version of the app. Your developer sandboxes (if you're like me, at least) probably spend a lot of time in an unusable state as you're breaking things trying to get the next feature implemented. So the typical problem is that a QA person wants to verify that a bug is fixed, or a PM wants to check that some planned feature was implemented correctly, or a client wants to see that you've made progress on the issue that they care about. If they only have access to developer sandboxes, there's a good chance that when they get around to looking at it, either the sandbox version isn't running (since it means ./manage.py runserver is up in a terminal somewhere) or it's in a broken state because of something else. That really slows down the whole team and wastes a lot of time.
It sounds like you don't have a staging setup since you just automatically update the production version. That could be fine if you're way more careful and disciplined than I (and I think most developers) am and never commit anything that isn't totally bulletproof. Personally, I'd rather make sure that my work has made it through at least some cursory QA by someone other than me before it hits production.
So, in conclusion, the setup where I work:
each developer runs their own sandbox locally (same as you do it)
there's a "common" staging sandbox on a dev server that gets updated nightly from a cronjob. PMs, clients, and QA go there. They are never given direct access to developer sandboxes.
There's an automated (though manually initiated) deployment to production. A developer or the PM can "push" to production when we feel things have been sufficiently QA'd and are stable and safe.
I'd say the only downside (besides a bit of extra overhead setting up the nightly staging builds) is that it makes for a day of turnaround on bug verification, i.e. QA reports a bug in the software (based on looking at that day's nightly build), the developer fixes the bug and commits, then QA must wait until the next day's build to check that the bug is actually fixed. It's usually not that much of a problem since everyone has enough stuff going on that it doesn't affect the schedule. When a milestone is approaching, though, and we're in a feature-frozen, bugfix-only mode, we'll do more frequent manual updates of the staging site.
I've had great success using Hudson for continuous integration. Details on using Hudson with Python by Redsolo.
A few months ago, several articles espousing continuous deployment caused quite a stir online. IMVU has details on how they deploy up to 5 times a day.
The whole idea behind frequent builds (nightly or more frequent like in continuous integration) is to get immediate feedback in order to reduce the elapsed time between the introduction of a problem and its detection. So, building frequently is useful only if you are able to generate some feedback through compilation, (ideally automated) testing, quality checks, etc. Without feedback, there is no real point.