I am using django-background-tasks to run some code in the background. My project has been deployed, and I run the background tasks using cron. The problem is that when I make changes to my code, the ones related to the background tasks are not picked up. It seems the cron job is still using the old code.
This is my crontab
*/5 * * * * /home/.../venv/bin/python /home/.../manage.py process_tasks --duration 299
I think I need to kill the cron command and allow the code to update before running it again.
Related
We have a flask script get_logs.py that uses APScheduler and contains following job
scheduler.add_job(id="create_recommendation_entries", trigger='interval', seconds=60*10, func=create_entries)
Someone ran the script, and the logs now show that it is still running at a 10-minute interval even after being terminated.
The process ID is not listed, nor does it show up using grep, and we don't know whether it was executed using nohup or gunicorn.
How do I kill this job based on id="create_recommendation_entries" when I don't know any of its stats (port, PID, etc.)?
Rerunning the script creates a different thread that stops after Ctrl+C, but the previous one is still running.
I have been trying for a long time to create a periodic task in Django, but there are a lot of version constraints and no clear explanation.
I recommend Celery. What is Celery?
Celery supports scheduling tasks. Check this doc
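For a concrete picture, here is a minimal sketch of a Celery app with a beat schedule; the module name, the Redis broker URL, and the send_report task are placeholder assumptions, not something from the question:
# tasks.py - minimal Celery beat sketch (names and URLs are placeholders)
from celery import Celery
from celery.schedules import crontab

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def send_report():
    print('periodic task ran')

app.conf.beat_schedule = {
    'send-report-every-10-minutes': {
        'task': 'tasks.send_report',
        # crontab(minute='*/10') fires at every 10th minute
        'schedule': crontab(minute='*/10'),
    },
}
Starting a worker with the embedded beat, celery -A tasks worker -B, would then run send_report every 10 minutes.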
First of all, you want to create a management command following this guide.
https://docs.djangoproject.com/en/2.1/howto/custom-management-commands/
Say we want to run the closepoll command in the example every 5 minutes.
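For reference, a minimal version of such a command might look like the sketch below; the Poll model and its opened/expires_at fields are assumptions for illustration, not part of the guide:
# polls/management/commands/closepoll.py
from django.core.management.base import BaseCommand
from django.utils import timezone

from polls.models import Poll  # assumed example model

class Command(BaseCommand):
    help = 'Closes every poll whose deadline has passed'

    def handle(self, *args, **options):
        # update() returns the number of rows affected
        closed = Poll.objects.filter(opened=True, expires_at__lte=timezone.now()).update(opened=False)
        self.stdout.write(self.style.SUCCESS('Closed %d poll(s)' % closed))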
You'll then need to create a script to run this command.
Linux / MacOS:
#!/bin/bash -e
cd path/to/your/django/project
source venv/bin/activate # if you use venv
python manage.py closepoll # maybe you want to >> /path/to/log so you can log the results
Save the file as run_closepoll.sh and run chmod +x run_closepoll.sh on the command line.
Now we can use crontab to run our command
run crontab -e in your command line
add this line:
*/5 * * * * /path/to/run_closepoll.sh
Now the command will run every 5 minutes.
If you're not familiar with crontab, you can use this website
https://crontab-generator.org/
Windows:
Same content as the above example, but remove the first line, replace source venv/bin/activate with call venv\Scripts\activate, and save the file as run_closepoll.bat
In your start menu, search for Task Scheduler, follow the instructions on the GUI, it should be pretty simple from there.
For more info about the Task Scheduler, see here: https://learn.microsoft.com/en-us/windows/desktop/taskschd/using-the-task-scheduler
This blog explains it clearly:
https://medium.com/@yehandjoe/celery-4-periodic-task-in-django-9f6b5a8c21c7
Thanks!!!
I'm using django-cron and it works as expected. The only caveat is that you have to set up a cron job in the Linux system to run the command python manage.py runcrons.
I'm trying to set up a daily task for my Django application on Elastic Beanstalk. There doesn't appear to be an accepted way to set this up, as celery beat is the go-to solution for periodic tasks in Django, but isn't great for load-balanced environments.
I've seen some solutions doing things like setting up celery beat with leader_only=True, to only run one instance, but that leaves a single point of failure. I've seen other solutions that allow many instances of celery beat and use locks to make sure only one task goes through, but wouldn't this still eventually fail completely unless the failed instances were restarted? Another suggestion I've seen is to have a separate instance for running celery beat, but this would still be a problem unless it had some way of restarting itself if it failed.
Are there any decent solutions to this problem? I would much rather not have to babysit a scheduler, as it would be pretty easy to not notice that my task was not being run until a while later.
If you're using redis as your broker, look into installing RedBeat as the celery beat scheduler: https://github.com/sibson/redbeat
This scheduler uses locking in redis to make sure only a single beat instance is running. With this you can enable beat on each node's worker process and remove the use of leader_only=True.
celery worker -B -S redbeat.RedBeatScheduler
Let's say you have Worker A with beat lock and Worker B. If Worker A dies, Worker B will attempt to acquire the beat lock after a configurable amount of time.
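A minimal configuration sketch, assuming a local Redis and Celery 4+ lowercase settings (the URLs and the timeout value are placeholders):
# celery_app.py
from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')

# RedBeat stores its schedule and the beat lock in Redis
app.conf.redbeat_redis_url = 'redis://localhost:6379/1'
# Seconds a dead node holds the lock before another beat may take over
app.conf.redbeat_lock_timeout = 120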
I would suggest making a management command that runs with cron.
Using this method, you have your full Django ORM, all methods, etc. to work with. Wrapping your script in a try/except, you have the option to log failures in any way that you wish - email notifications, external logging systems like Sentry, straight to the DB, etc.
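A sketch of that pattern (the command name, job body, and logger name are illustrative, not from the answer):
# myapp/management/commands/nightly_batch.py
import logging

from django.core.management.base import BaseCommand

logger = logging.getLogger('batch')

def run_batch():
    pass  # your actual job, with full ORM access

class Command(BaseCommand):
    help = 'Runs the nightly batch job and logs any failure'

    def handle(self, *args, **options):
        try:
            run_batch()
        except Exception:
            # logger.exception records the traceback; your LOGGING config
            # decides where it goes (email, Sentry, DB, ...)
            logger.exception('nightly batch failed')
            raise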
I use supervisord to run cron and it works well. It relies on time-tested tools that won't let you down.
Finally, using a database singleton to keep track of whether a batch job has been run or is currently running, in an environment where you have multiple load-balanced instances of Django, isn't bad practice, even if you feel a little icky about it. The DB is a very reliable means of telling you whether a job is being processed.
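One way to implement that singleton, sketched here with an assumed JobLock model and select_for_update to serialize the load-balanced instances:
from django.db import models, transaction
from django.utils import timezone

class JobLock(models.Model):
    name = models.CharField(max_length=100, unique=True)
    running = models.BooleanField(default=False)
    started_at = models.DateTimeField(null=True, blank=True)

def try_acquire(name):
    # Row-level lock: only one instance can flip the flag at a time
    with transaction.atomic():
        lock, _ = JobLock.objects.select_for_update().get_or_create(name=name)
        if lock.running:
            return False
        lock.running = True
        lock.started_at = timezone.now()
        lock.save()
        return True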
The one annoying thing about cron is that it doesn't import the environment variables you may need for Django. I solved this with a simple Python script.
It writes the crontab on startup, with the needed environment variables included. This example is for Ubuntu on EBS but should be relevant elsewhere.
#!/usr/bin/env python
# run-cron.py
# Writes a crontab fragment containing the needed environment variables, then runs cron.
import os
from subprocess import call

from master.settings import IS_AWS

# Read the environment variables Django needs so they can be written
# into the generated crontab fragment.
eRDS_HOSTNAME = os.environ["RDS_HOSTNAME"]
eRDS_DB_NAME = os.environ["RDS_DB_NAME"]
eRDS_PASSWORD = os.environ["RDS_PASSWORD"]
eRDS_USERNAME = os.environ["RDS_USERNAME"]
try:
    eAWS_STAGING = os.environ["AWS_STAGING"]
except KeyError:
    eAWS_STAGING = None
try:
    eAWS_PRODUCTION = os.environ["AWS_PRODUCTION"]
except KeyError:
    eAWS_PRODUCTION = None
eRDS_PORT = os.environ["RDS_PORT"]

if IS_AWS:
    fto = '/etc/cron.d/stortrac-cron'
else:
    fto = 'test_cron_file'

with open(fto, 'w+') as file:
    file.write('# Auto-generated crontab that imports needed variables and runs a python script')
    file.write('\nRDS_HOSTNAME=')
    file.write(eRDS_HOSTNAME)
    file.write('\nRDS_DB_NAME=')
    file.write(eRDS_DB_NAME)
    file.write('\nRDS_PASSWORD=')
    file.write(eRDS_PASSWORD)
    file.write('\nRDS_USERNAME=')
    file.write(eRDS_USERNAME)
    file.write('\nRDS_PORT=')
    file.write(eRDS_PORT)
    if eAWS_STAGING is not None:
        file.write('\nAWS_STAGING=')
        file.write(eAWS_STAGING)
    if eAWS_PRODUCTION is not None:
        file.write('\nAWS_PRODUCTION=')
        file.write(eAWS_PRODUCTION)
    file.write('\n')
    # Process queue of jobs
    file.write('\n*/8 * * * * root python /code/app/manage.py queue --process-queue')
    # Every 5 minutes, double-check thing is done
    file.write('\n*/5 * * * * root python /code/app/manage.py thing --done')
    # Every 4 hours, do this
    file.write('\n8 */4 * * * root python /code/app/manage.py process_this')
    # etc.
    file.write('\n3 */4 * * * root python /code/app/manage.py etc --silent')
    file.write('\n\n')

if IS_AWS:
    # Run cron in the foreground so it stays attached to supervisord
    args = ["cron", "-f"]
    call(args)
And in supervisord.conf:
[program:cron]
command = python /my/directory/run-cron.py
autostart = true
autorestart = false
I have the exact same problem described in this post, but the answer doesn't help at all. In short, I am using Tivix django-cron, and the cron job is not running on a regular basis.
To illustrate the problem, the following cron job class is intended to send an email every minute once the runcrons command has been run. But in fact, it only sends out one email and no more. That defeats the purpose of cron... What am I missing?
from django.core.mail import send_mail
from django_cron import CronJobBase, Schedule

class TestCron(CronJobBase):
    schedule = Schedule(run_every_mins=1)
    code = 'test_cron_philip'

    def do(self):
        send_mail('cron test', 'body is test body', 'coach_zhong@163.com',
                  ['admin@dessert.webfactional.com'], fail_silently=False)
Yes, you are missing something ("runcrons" is not a background daemon). From the documentation:
"Now everytime you run the management command python manage.py
runcrons all the crons will run if required. Depending on the
application the management command can be called from the Unix crontab
as often as required. Every 5 minutes usually works for most of my
applications."
That means you have to put the "runcrons" command in your crontab.
Example:
You have some CronJob that does something every 30 minutes.
To get this running, you must edit your crontab (Linux, Mac) or Task Scheduler (Windows) to run "python manage.py runcrons" every, let's say, 1 minute.
If you get this running, your CronJob will be pinged every minute and run if necessary (every 30 minutes or whatever value you have set).
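For example, a crontab entry along these lines (the paths are placeholders for your own venv and project):
*/1 * * * * /path/to/venv/bin/python /path/to/project/manage.py runcrons >> /path/to/log/runcrons.log 2>&1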
Hope this helps.
I would like to generate periodical reports and have them sent to the users.
The user should be able to select the frequency and date/time of sending (e.g. every day at 9:00, every week on Monday, etc.), in addition to other info relevant to the report content itself.
What do you think would be a good solution to integrate this in Django?
I would recommend doing it using cron (the Unix job scheduler) if you are on a Unix system.
You can use django-cron, a Django module that wraps cron job scheduling.
But I usually write the task to be scheduled as a Django custom management command and schedule a regular cron job that calls it.
If you have Django installed in a virtual Python environment, you should run a script that activates the virtualenv and then calls the command (see the example below).
On a unix system, using virtualenv:
script example (script.sh):
#!/bin/bash
source /path/to/virtualenv/bin/activate
python /path/to/django/project/manage.py custom_command
add line in cron (command: crontab -e):
* * * * * /path/to/script.sh >> /path/to/log/file.log 2>&1
Replace the *s with the desired time and frequency (details are in the default crontab file).
To install the new scheduled task, simply save the crontab file.