How do people deploy/version control cronjobs to production? I'm more curious about conventions/standards people use than any particular solution, but I happen to be using git for revision control, and the cronjob is running a python/django script.
If you are using Fabric for deploment you could add a function that edits your crontab.
def add_cronjob():
run('crontab -l > /tmp/crondump')
run('echo "#daily /path/to/dostuff.sh 2> /dev/null" >> /tmp/crondump')
run('crontab /tmp/crondump')
This would append a job to your crontab (disclaimer: totally untested and not very idempotent).
Save the crontab to a tempfile.
Append a line to the tmpfile.
Write the crontab back.
This is propably not exactly what you want to do but along those lines you could think about checking the crontab into git and overwrite it on the server with every deploy. (if there's a dedicated user for your project.)
Using Fabric, I prefer to keep a pristine version of my crontab locally, that way I know exactly what is on production and can easily edit entries in addition to adding them.
The fabric script I use looks something like this (some code redacted e.g. taking care of backups):
def deploy_crontab():
put('crontab', '/tmp/crontab')
sudo('crontab < /tmp/crontab')
You can also take a look at:
http://django-fab-deploy.readthedocs.org/en/0.7.5/_modules/fab_deploy/crontab.html#crontab_update
django-fab-deploy module has a number of convenient scripts including crontab_set and crontab_update
You can probably use something like CFEngine/Chef for deployment (it can deploy everything - including cron jobs)
However, if you ask this question - it could be that you have many production servers each running large number of scheduled jobs.
If this is the case, you probably want a tool that can not only deploy jobs, but also track success failure, allow you to easily look at logs from the last run, run statistics, allow you to easily change the schedule for many jobs and servers at once (due to planned maintenance...) etc.
I use a commercial tool called "UC4". I don't really recommend it, so I hope you can find a better program that can solve the same problem. I'm just saying that administration of jobs doesn't end when you deploy them.
There are really 3 options of manually deploying a crontab if you cannot connect your system up to a configuration management system like cfengine/puppet.
You could simply use crontab -u user -e but you run the risk of someone having an error in their copy/paste.
You could also copy the file into the cron directory but there is no syntax checking for the file and in linux you must run touch /var/spool/cron in order for crond to pickup the changes.
Note Everyone will forget the touch command at some point.
In my experience this method is my favorite manual way of deploying a crontab.
diff /var/spool/cron/<user> /var/tmp/<user>.new
crontab -u <user> /var/tmp/<user>.new
I think the method I mentioned above is the best because you don't run the risk of copy/paste errors which helps you maintain consistency with your version controlled file. It performs syntax checking of the cron tasks inside of the file, and you won't need to perform the touch command as you would if you were to simply copy the file.
Having your project under version control, including your crontab.txt, is what I prefer. Then, with Fabric, it is as simple as this:
#task
def crontab():
run('crontab deployment/crontab.txt')
This will install the contents of deployment/crontab.txt to the crontab of the user you connect to the server. If you dont have your complete project on the server, you'd want to put the crontab file first.
If you're using Django, take a look at the jobs system from django-command-extensions.
The benefits are that you can keep your jobs inside your project structure, with version control, write everything in Python and configure crontab only once.
I use Buildout to manage my Django projects. With Buildout, I use z3c.recipe.usercrontab to install cron jobs in deploy or update.
You said:
I'm more curious about conventions/standards people use than any particular solution
But, to be fair, the particular solution will depend in your environment and there is no universal elegant silver bullet. Given that you happen to be using Python/Django, I recommend Celery. It is an asynchronous task queue for Python, which integrates nicely with Django. And, on top of the features that it gives as an asynchronous task queue, it also has specific features for periodic tasks.
I have personally used the django-celery-beat integration and it integrates perfectly with Django settings and behaves correctly in distributed environments. If your periodic tasks are related to Django stuff, I strongly recommend to take a look at Celery I started using it only for certain asynchronous mailing and ended up using it for a lot of asynchronous tasks + periodic sanity checks and other web application maintenance stuff.
Related
I am working with a client that demands multi-stage server setup: development server, stage server and production/live server.
Stage should be as stable as it can be to test all those new features we develop at the development server and take this to the live server in the end.
We use git and github for version controlling. I use Ubuntu server edition as the OS.
The problem is, I have never working in such multi-stage server plan. What software/projects would you recommend to do a proper way of handling such setup, especially deployment and moving a new feature developed to the stage and then to the live server ?
We use two different methods of moving code from environment to environment. The first is to use branches and triggers with our source control system (mercurial in our case, though you can do the same thing with git). The other, is to use fabric, a python library for executing shell code across a number of servers.
Using source control, you can have several main branches, like production development staging. Say you want to move a new feature into staging. I'll explain in terms of mercurial, but you can port the commands over to git and it should be fine.
hg update staging
hg merge my-new-feature
hg commit -m 'my-new-feature > staging'
hg push
You then have your remote source control server push to all of your web servers using a trigger. A trigger on each web server will then do an update and reload the web server.
To move from staging to production, it's just as easy.
hg update production
hg merge staging
hg commit -m 'staging > production'
hg push
It's not the nicest method of deployment, and it makes rolling back quite hard. But it's quick and easy to set up, and still a lot better than manually deploying each change to each server.
I won't go through fabric, as it can get quite involved. You should read their documentation so you understand what it is capable of. There are plenty of tutorials around for fabric and django. I highly recommend the fabric route as it gives you lots more control, and only involves writing some python.
There is a nice branching model for git (as it is also used by github itself for example). You can easily apply this branching model using git-flow, which is a git extension that enables you to apply some high level repository operations that fit into this model. There's also a nice blogpost about this.
I do not know what exactly you want to automize in your deployment workflow, but if you apply the model mentioned above, most of the correct version handling is done by git.
To add some further automatic processing to this, fabric is a simple but great tool, and you will find many tutorials about its usage (also in combination with git).
For handling python dependencies using virtualenv and pip is for sure a very good way to go.
If you need something more complex,eg. to handle more than one django instance on one machine and for handling system wide dependencies etc checkout puppet or chef.
Try Gondor.io or Ep.io, they both make it pretty easy (gondor especially excels in this area) to have two+ instances with very similar code, from your VCS -- and to move data back and forth. (if you need an invite, ask either in IRC, but if I recall, they're both open now)
So I realize this question is broad, as I'm looking for a place to start.
Is it possible to have a server that for example runs PHP scripts every day, or every 2 minutes?
Pretty much, I don't want to put these tasks in the users script, because I'm scared they would get run a lot more than is necessary, and I'd prefer to keep it separate.
I'm looking to keep it javascript, PHP or SQL as I don't have the time to learn a new language atm, so if any of you have any good places for me to read up on this, I would greatly appreciate it.
There might be some way to run javascript on the server side, but I'd certainly stick to PHP. SQL is not a script, by the way...
You can execute your PHP scripts every x minutes / hours / days using cronjobs. Just google crontab. You need (root) access to the server though.
Javascript will not run server side as you are expecting. what I would suggest is you look into writing a cron job to run a bash file. It would take minimal research to complete.
Cron Intro
Bash Intro
Yes, it's completely possible. On Linux (Mac too?), you use the cron job scheduler to run your scripts.
For Windows, you can try the Windows Task Scheduler. It provides similar functionality.
You can run just about any script with cron, but not JavaScript (unless you use it for serverside stuff).
I asked a previous question getting a django command to run on a schedule. I got a solution for that question, but I still want to get my commands to run from the admin interface. The obstacle I'm hitting is that my custom management commands aren't getting recognized once I get to the admin interface.
I traced this back to the __init__.py file of the django/core/management utility. There seems to be some strange behavior going on. When the server first comes up, a dictionary variable _commands is populated with the core commands (from django/core/management/commands). Custom management commands from all of the installed apps are also pushed into the _commands variable for an overall dictionary of all management commands.
Somehow, though between when the server starts and when django-chronograph goes to run the job from the admin interface, the _commands variable loses the custom commands; the only commands in the dictionary are the core commands. I'm not sure why this is. Could it be a path issue? Am I missing some setting? Is it a django-chronograph specific problem? So forget scheduling. How might I run a custom management command from the django admin graphical interface to prove that it can indeed be done? Or rather, how can I make sure that custom management commands available from said interface?
I'm also using django-chronograph
and for me it works fine. I did also run once into the problem that my custom commands were not regognized by the auto-discovery feature. I think the first reason was because the custom command had an error in it. Thus it might be an idea to check whether your custom commands run without problems from the command line.
The second reason was indeed some strange path issue. I might check back with my hosting provider to provide you with a solution. Will post back to you in a few days..
i am the "unix-guy" mentioned above by tom tom.
as far as i remember there were some issues in the cronograph code itself, so it would be a good idea to use the code tom tom posted in the comments.
where on the filesystem is django-cronograph stored (in you app-folder, in an extra "lib-folder" or in your site-packages?
when you have it in site-packages or another folder that is in your "global pythonpath" pathing should be no issue.
the cron-process itself DOES NOT USE THE SAME pythonpath, as your django app. remember: you start the cron-process via your crontab - right? so there are 2 different process who do not "know" each other: the cron-process AND the django-process (initialized by the webserver) so i would suggest to call the following script via crontab and export pythonpath again:
#!/bin/bash
PYTHONPATH=/path/to/libs:/path/to/project_root:/path/to/other/libs/used/in/project
export PYTHONPATH
python /path/to/project/manage.py cron
so the cron-started-process has the same pythonpath-information as your project.
greez from vienna/austria
berni
How to make Django execute something automatically at a particular time.?
For example, my django application has to ftp upload to remote servers at pre defined times. The ftp server addresses, usernames, passwords, time, day and frequency has been defined in a django model.
I want to run a file upload automatically based on the values stored in the model.
One way to do is to write a python script and add it to the crontab. This script runs every minute and keeps an eye on the time values defined in the model.
Other thing that I can roughly think of is maybe django signals. I'm not sure if they can handle this issue. Is there a way to generate signals at predefined times (Haven't read indepth about them yet).
Just for the record - there is also celery which allows to schedule messages for the future dispatch. It's, however, a different beast than cron, as it requires/uses RabbitMQ and is meant for message queues.
I have been thinking about this recently and have found django-cron which seems as though it would do what you want.
Edit: Also if you are not specifically looking for Django based solution, I have recently used scheduler.py, which is a small single file script which works well and is simple to use.
I've had really good experiences with django-chronograph.
You need to set one crontab task: to call the chronograph python management command, which then runs other custom management commands, based on an admin-tweakable schedule
The problem you're describing is best solved using cron, not Django directly. Since it seems that you need to store data about your ftp uploads in your database (using Django to access it for logs or graphs or whatever), you can make a python script that uses Django which runs via cron.
James Bennett wrote a great article on how to do this which you can read in full here: http://www.b-list.org/weblog/2007/sep/22/standalone-django-scripts/
The main gist of it is that, you can write standalone django scripts that cron can launch and run periodically, and these scripts can fully utilize your Django database, models, and anything else they want to. This gives you the flexibility to run whatever code you need and populate your database, while not trying to make Django do something it wasn't meant to do (Django is a web framework, and is event-driven, not time-driven).
Best of luck!
In my django project I would like to be able to delete certain entries in the database automatically if they're too old. I can write a function that checks the creation_date and if its too old, deletes it, but I want this function to be run automatically at regular intervals..Is it possible to do this?
Thanks
This is what cron is for.
You will be better off reading this section of the Django docs http://docs.djangoproject.com/en/1.2/howto/custom-management-commands/#howto-custom-management-commands
Then you can create your function as a Django management command and use it in conjunction with cron on *nix (or scheduled tasks on Windows) to run it on a schedule.
See this for a good intro guide to cron http://www.unixgeeks.org/security/newbie/unix/cron-1.html
What you require is a cron job.
A cron job is time-based job scheduler. Most web hosting companies provide this feature which lets u run a service or a script at a time of your choosing. Most Unix-based OSes have this feature.
You would have better direct help asking this question on serverfault.com, the sister site of stackoverflow.