Schedule a task in Django using the schedule package

I am trying to learn how to schedule a task in Django using the schedule package. Here is the code I have added to my view. I should mention that I only have one view, so I need to run the scheduler in my index view. I know there is a problem in the code logic: it only runs the scheduler and gets trapped in the loop, so the view never renders. Can you tell me how I can use it?
def job():
    print("this is scheduled job", str(datetime.now()))

def index(request):
    schedule.every(10).seconds.do(job)
    while True:
        schedule.run_pending()
        time.sleep(1)
    objs = objsdb.objects.all()
    template = loader.get_template('objtest/index.html')
    context = {'objs': objs}
    return HttpResponse(template.render(context, request))

You picked the wrong approach. If you want to schedule something that should run periodically, you should not do it within a web request. The request never ends because of the while loop, and browsers and web servers very much dislike this behavior.
Instead you might want to write a management command that runs on its own and is responsible for calling your tasks.
Additionally you might want to read Django - Set Up A Scheduled Job? - the answers there also cover other approaches such as AMQP and cron, but those would replace your choice of the schedule module.
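For illustration, here is a minimal, untested sketch of such a management command; the app name myapp and the command name runscheduler are placeholders:

# myapp/management/commands/runscheduler.py
import time
from datetime import datetime

import schedule
from django.core.management.base import BaseCommand

def job():
    print("this is scheduled job", str(datetime.now()))

class Command(BaseCommand):
    help = "Run the schedule loop in its own process, outside any web request."

    def handle(self, *args, **options):
        schedule.every(10).seconds.do(job)
        while True:  # blocking here is fine: this is not a web request
            schedule.run_pending()
            time.sleep(1)

You would then start it separately with python manage.py runscheduler, while the web server keeps serving requests.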

Related

Can celery-beat only trigger a Celery task, or also a normal task (Django)?

I am working on a Django project with Celery and celery-beat. My main use case is to use celery-beat to set up a periodic task as a background task, instead of using a front-end request to trigger it. I would save the results in a model, then pull the model into a front-end view to show the user.
My current problem is that no matter how I change the way I call my task, it always throws an error saying the task is not registered in Celery's task list.
I am trying to trigger a non-Celery function (which, inside, calls a Celery task) using the celery-beat module.
Below is the pseudo-code.
tasks.py:
from celery import shared_task

@shared_task
def longrunningtask(a):
    res = APIcall(a)
    return res
caller.py:
from .tasks import longrunningtask

def dosomething(input_list):
    res = []
    for ele in input_list:
        res.append(longrunningtask.delay(ele))
    return res
Periodic task:
schedule, created = CrontabSchedule.objects.get_or_create(hour=1, minute=34)
task = PeriodicTask.objects.create(crontab=schedule, name="XXX_task_", task='app.caller.dosomething')
return HttpResponse("Done")
Nothing special about the periodic task, but this never works for me. It errors with "not detected tasks" or "not registered tasks" unless I make dosomething() a Celery task.
The problem is that I do not want to make the caller function a Celery task, for two reasons:
Inside the for loop I pass a different parameter into each task() call, so several long-running Celery tasks get kicked off by the one loop; I want multiple sub-tasks instead of one giant running task (see the sketch below).
It should not be necessary, since longrunningtask is the task that needs to run as a Celery task; there is no need for its parent to be a Celery task as well.
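To illustrate the fan-out I am after, here is a rough, untested sketch using Celery's group primitive (names taken from my pseudo-code above):

from celery import group
from .tasks import longrunningtask

def dosomething(input_list):
    # One task signature per element; apply_async sends them all
    # to the workers as parallel sub-tasks.
    job = group(longrunningtask.s(ele) for ele in input_list)
    return job.apply_async()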
Can someone please help me out of this dilemma? It's super frustrating and has been blocking me for a while.
Any suggestion or idea for this use case is also super helpful!

How to stop/cancel a training session (Python)?

I have a function train_model which is called via a "train" (Flask) API. Once this API is triggered, training of a model starts; on completion it saves the model. But I want to introduce a "cancel" API which will stop the training and return a valid response for the "train" API.
You should probably consider using multiprocessing and running that model in a separate process, responding with a unique request_id which you store in a cache/DB. Whenever you want to cancel later, your API takes the request_id and stops that process; when model training completes, remove the id from the cache/DB before exiting the process, and if the id is not in the cache/DB, respond 404 accordingly. Since this is a lot of re-inventing the wheel, you could simply consider using Celery instead.
Sample untested code with a multiprocessing pool:
from multiprocessing import Pool

process_pool = Pool()  # module-level pool, so it outlives the request
jobs = {}

@app.route("/api/train", methods=['POST'])
def train_it():
    job = process_pool.apply_async(train_model_func, args=(your_inputs,))
    jobs['12455ABC'] = job
    return "OK I got you! Here is your request id: 12455ABC"

How do I schedule my desktop application (PyQt) to send emails at regular intervals

So, I have built this nice application (using Camelot) and do not know how to implement a scheduled job on it that can send emails, based on some conditions, at regular intervals. I am trying to implement it using schedule but don't know how to make my app call it automatically.
Here is my code:
import time, schedule

def job():
    print("I'm working...")

schedule.every(10).seconds.do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
When I start my application, nothing works. How do I make my application aware of this?
If possible, I would like this scheduled job to execute without even starting my application.
If you want to schedule jobs that will interact with the user, you have to use Camelot Actions, which have their own method for executing functions in the model thread, posting the result to the GUI thread, and back.
But you don't need that; you just need to run jobs (functions) that access the database, create emails, and send them, without interacting with the user. This can be done with a completely independent application, without a GUI.
In order to avoid creating two applications, you can change the behavior using a command line parameter. If you start the application without any params, it will open the GUI as usual, but if you run it with -background, it will only start the schedule loop.
Then you can hook your application into system start, executing it with -background, and you will have the schedule running without requiring your users to start the application.
If a user later starts the application, you'll have two instances running: the first one running the schedule loop, and the second one with the GUI.
main.py:
if __name__ == '__main__':
    import sys
    if "-background" in sys.argv:
        import background
        background.main()
    else:
        main()  # camelot's main
background.py:
from camelot.core.orm import Session
import time, schedule

def job():
    session = Session()
    # Use the session to access and manipulate the model
    print("I'm working...")
    session.flush()  # flush the session to the DB

def main():
    from camelot.core.conf import settings
    settings.setup_model()
    schedule.every(10).seconds.do(job)
    while True:
        schedule.run_pending()
        time.sleep(1)
You should run the app with python main.py -background.

HTTP call after a Celery task has changed state

I need a scheduler for my next project, and since I'm coding in Django I went for Celery.
What I am looking for is a way for a task to tell Django when it is done, so I can update the database and use SSE to tell the user. All this can be done fairly simply by just putting all the logic into the task. But what do I do when I am planning to have several Celery workers?
I found a bunch of info online covering the single-worker case, but not much covering the problem of having more than one worker.
What I thought about was using HTTP callbacks from the workers to the web server to let it know that the task is done. Looking at celery.task.http seemed promising, but it didn't do what I needed.
Is the solution to use signals and hook up manual HTTP calls? Or am I on the wrong path? Isn't this a common problem? How can it be solved more elegantly?
So, what do you mean by "tell Django"? If I understand you right, the Django request that initialized the Celery task is still alive when the task finishes? In that case you can check some storage (database, memcached, etc.) and send your SSE.
Look, there is one way to do that (sketched below):
1. Your Django view sends the task to Celery, after which it enters a loop (or a loop with a 60-second timeout?) and waits for results in memcached.
2. Celery executes the task and writes the results to memcached.
3. The Django view sees the new results, exits the loop, and sends your SSE.
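For illustration, a rough sketch of this first variant, polling the Celery result backend from the view; the task name some_task and a configured result backend are assumptions:

import time

from django.http import HttpResponse
from myapp.tasks import some_task  # hypothetical task

def start_and_wait(request):
    result = some_task.delay(request.POST["payload"])  # send the task to Celery
    deadline = time.time() + 60                        # 60-second timeout
    while time.time() < deadline:
        if result.ready():  # the worker has stored a result
            return HttpResponse("done: %s" % result.get())
        time.sleep(1)
    return HttpResponse("still running", status=202)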
The next variant is:
1. The Django view sends the task to Celery and returns immediately.
2. Celery executes the task and, after executing it, makes a simple HTTP request to your Django app.
3. Django receives the HTTP request from Celery, parses the params, and sends the SSE to your user.
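A minimal sketch of this second variant, where the task itself calls back over HTTP when it finishes; the /task_handler/ URL and do_work are assumptions, and requests is used for the call:

import requests
from celery import shared_task

@shared_task(bind=True)
def long_job(self, payload):
    result = do_work(payload)  # hypothetical work function
    # Tell the Django app this task finished so it can push an SSE to the user.
    requests.post(
        "http://localhost:8000/task_handler/%s" % self.request.id,
        data={"result": result},
        timeout=5,
    )
    return result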
Here is some code that seems to do what I want:
In the Django settings:
CELERY_ANNOTATIONS = {
    "*": {
        "on_failure": celery_handlers.on_failure,
        "on_success": celery_handlers.on_success,
    }
}
In the included celery_handlers.py file:
def on_failure(self, exc, task_id, *args, **kwargs):
    # Use urllib or similar to poke e.g. api-int.mysite.com/task_handler/TASK_ID
    pass

def on_success(self, retval, task_id, *args, **kwargs):
    # Use urllib or similar to poke e.g. api-int.mysite.com/task_handler/TASK_ID
    pass
And then you can just set up api-int to use something like:
from celery.result import AsyncResult
task_obj = AsyncResult(task_id)
# Logic to handle task_obj.result and related goes here....

Stopping/Purging Periodic Tasks in Django-Celery

I have managed to get periodic tasks working in django-celery by subclassing PeriodicTask. I tried to create a test task and set it running doing something useless. It works.
Now I can't stop it. I've read the documentation and I cannot find out how to remove the task from the execution queue. I have tried using celeryctl and using the shell, but registry.tasks() is empty, so I can't see how to remove it.
I have seen suggestions that I should "revoke" it, but for this I appear to need a task id, and I can't see how I would find the task id.
Thanks.
A task is a message, and a "periodic task" sends task messages at periodic intervals. Each of the tasks sent will have a unique id assigned to it.
revoke will only cancel a single task message. To get the id for a task you have to keep track of the id sent, but you can also specify a custom id when you send a task.
I'm not sure if you want to cancel a single task message, or if you want to stop the periodic task from sending more messages, so I'll list answers for both.
There is no built-in way to keep the id of a task sent with periodic tasks,
but you could set the id for each task to the name of the periodic task, that way
the id will refer to any task sent with the periodic task (usually the last one).
You can specify a custom id this way, either with the @periodic_task decorator:
@periodic_task(options={"task_id": "my_periodic_task"})
def my_periodic_task():
    pass
or with the CELERYBEAT_SCHEDULE setting:
CELERYBEAT_SCHEDULE = {
    name: {
        "task": task_name,
        "options": {"task_id": name},
    }
}
If you want to remove a periodic task you simply remove the @periodic_task decorator from the codebase, or remove the entry from CELERYBEAT_SCHEDULE.
If you are using the Django database scheduler you have to remove the periodic task
from the Django Admin interface.
PS1: revoke doesn't stop a task that has already been started. It only cancels
tasks that haven't been started yet. You can terminate a running task using
revoke(task_id, terminate=True). By default this will send the TERM signal to
the process, if you want to send another signal (e.g. KILL) use
revoke(task_id, terminate=True, signal="KILL").
PS2: revoke is a remote control command so it is only supported by the RabbitMQ
and Redis broker transports.
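For reference, a short sketch of issuing those revoke calls from Python, using the old-style import path from the django-celery era of this question (the custom task id set above is assumed):

from celery.task.control import revoke

revoke("my_periodic_task", terminate=True)                 # sends TERM
revoke("my_periodic_task", terminate=True, signal="KILL")  # sends KILL instead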
If you want your task to support cancellation you should do so by storing a cancelled
flag in a database and have the task check that flag when it starts:
from celery.task import Task

class RevokeableTask(Task):
    """Task that can be revoked.

    Example usage:

        @task(base=RevokeableTask)
        def mytask():
            pass
    """

    def __call__(self, *args, **kwargs):
        if revoke_flag_set_in_db_for(self.request.id):
            return
        super(RevokeableTask, self).__call__(*args, **kwargs)
Just in case this may help someone... We had the same problem at work, and despite some efforts we could not find any kind of management command to remove the periodic task. So here are some pointers.
You should probably first double-check which scheduler class you're using.
The default scheduler is celery.beat.PersistentScheduler, which simply keeps track of the last run times in a local database file (a shelve).
In our case, we were using the djcelery.schedulers.DatabaseScheduler class, the scheduler that ships with django-celery and stores the schedule in the Django database.
Although the documentation does mention a way to remove periodic tasks ("Using django-celery's scheduler you can add, modify and remove periodic tasks from the Django Admin"), we wanted to perform the removal programmatically, or via a (celery/management) command in a shell.
Since we could not find a command line, we used the django/python shell:
$ python manage.py shell
>>> from djcelery.models import PeriodicTask
>>> pt = PeriodicTask.objects.get(name='the_task_name')
>>> pt.delete()
I hope this helps!