Task queue in Django

I've only heard about tools like Celery, but I don't know whether it fits my needs or is the best solution available.
Imagine a game like Travian. We initiate a building and have to wait N seconds until the construction is finished. When and how should we complete the construction?
Solution 1: check whether there are active constructions every time a page loads, and complete any that are due. If queries like that take some time, we can make them asynchronous.
However, this way we are always waiting for the user to reload the page. Sure, we can use a cron job to complete constructions from time to time, but cron jobs run at most once a minute. Constructions, attacks, etc. must be executed as precisely as possible.
The solution above works, but it has some cons. What are better, RELIABLE ways to perform actions like those I mentioned?
Moreover, let's assume that resources need to be regenerated at a rate of X per hour, and we need to regenerate them very precisely and fairly often. How can I achieve this without waiting for the page to be refreshed?
Finally, the solution has to work on WebFaction hosting or any other shared hosting. I've heard that Celery doesn't work on WebFaction, or am I mistaken?

Yes, Celery has periodic tasks with second-level resolution:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html
You can also schedule tasks at specific times with Celery's crontab schedules:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html#crontab-schedules
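For illustration, a minimal sketch of both kinds of schedule (the game app and task names are hypothetical; beat_schedule is the newer name for the CELERYBEAT_SCHEDULE setting described in the linked docs):
from celery import Celery
from celery.schedules import crontab

app = Celery('game')

app.conf.beat_schedule = {
    # sub-minute precision: a float schedule is a period in seconds
    'complete-due-constructions': {
        'task': 'game.tasks.complete_due_constructions',
        'schedule': 5.0,
    },
    # crontab-style schedule: regenerate resources at the top of every hour
    'regenerate-resources': {
        'task': 'game.tasks.regenerate_resources',
        'schedule': crontab(minute=0),
    },
}
You would run the scheduler with something like celery -A game beat, alongside a worker.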
Also, if you need to check resource counts, I think that's a common part of every request, so your response should look like:
{
    "header": {"resources": {"wood": 1, "stone": 500}},
    "data": {...your real data should be here...}
}
You need to add a header to the response containing common information like resource counts, unread messages, etc., and handle it properly on the client.
To improve it you can use nginx + SSL + a memcache backend.
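A hypothetical sketch of that idea as Django middleware (the class and the resource lookup are placeholders, not an existing API):
import json

def get_resource_counts(user):
    # placeholder: look up the player's current resources in the DB
    return {'wood': 1, 'stone': 500}

class ResourceHeaderMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        # wrap every JSON response with the common "header" block
        if response.get('Content-Type', '').startswith('application/json'):
            body = json.loads(response.content)
            body['header'] = {'resources': get_resource_counts(request.user)}
            response.content = json.dumps(body)
        return response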

Related

django with heavy computation and long runtime - offline computation and send results

I have a Django app where the user sends a request, the server does an SQL lookup followed by computation on the results, and finally the results are shown to the user.
The SQL lookup and the subsequent computation can take a long time, maybe 30+ minutes. I have seen some webpages ask for your email in such cases and then send you a URL later, but I'm not sure how this can be done in Django or whether there are other options for this situation. Any pointer will be very helpful.
(I'm sorry, but as I said it's a rather general question; I don't know how I could provide a minimal runnable example for this.)
One way to accomplish this would be to use something like Celery, which is a distributed task queue. The processing task would go into the queue (synchronously or asynchronously), and when it completes it would call a function to email the user, alerting them that the results are ready.
Documentation: https://docs.celeryproject.org/en/stable/django/first-steps-with-django.html
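As a minimal sketch of that pattern (run_lookup_and_compute and the results URL are hypothetical placeholders for your own logic):
# tasks.py
from celery import shared_task
from django.core.mail import send_mail

@shared_task
def compute_and_notify(user_email, query_params):
    # the 30+ minute SQL lookup and computation happen in the worker
    result_id = run_lookup_and_compute(query_params)  # hypothetical helper
    send_mail(
        subject='Your results are ready',
        message='View them at https://example.com/results/%s' % result_id,
        from_email='noreply@example.com',
        recipient_list=[user_email],
    )
The view then just calls compute_and_notify.delay(request.user.email, params) and returns immediately.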

Returning the result of celery task to the client in Django template

So I'm trying to accomplish the following: a user browses the webpage while a task is running in the background. When the task completes, it should return its arguments, one of which is the flag True, in order to trigger JavaScript that shows a modal form.
I tested this before without async tasks and it works, but now with Celery it just stores the results in the database. I did some research on tornado-celery and related tools, but some of the components, like tornado-redis, are not maintained anymore, so in my opinion it would not be wise to use them.
So what are my options? Thanks.
If I understand you correctly, then you want to communicate something from the server side back to the client. You generally have three options for that:
1) Make a long-pending request to the server - kinda bad. Skipping over the details: it will bog down your web server if it's not configured to handle that, it will make your site score low on performance tests, and if the request fails, everything fails.
2) Poll the server with frequent requests on a short interval (0.2 s, something like that) - better. It will increase traffic, but the requests will be tiny and will not interfere with the site's performance very much. If you choose a long interval to avoid loading the server with pointless requests, users will see the data with a bit of a delay. On the upside, this will not fail (if written correctly) even if the connection is interrupted.
3) Websockets, where the server can just hit the client with a message whenever needed - nice, but it takes some time to get used to. If you want to try it, you can use django-channels, which is a nice library for Django websockets.
If I did not understand you correctly and you are instead figuring out how to get data back from a Celery task into Django, then you can store the Celery task IDs and use them to first check whether a task is completed and then query its data from Celery, as in the sketch below.
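A minimal sketch of that polling endpoint (option 2), assuming a Celery result backend is configured; URL wiring is omitted:
from celery.result import AsyncResult
from django.http import JsonResponse

def task_status(request, task_id):
    result = AsyncResult(task_id)
    if result.ready():
        # the task's return value can carry your flag, e.g. {'flag': True},
        # which the JavaScript uses to decide whether to show the modal
        return JsonResponse({'ready': True, 'data': result.result})
    return JsonResponse({'ready': False})
The page's JavaScript polls this URL on an interval until ready is true.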

In Amazon SWF, can I abuse a Decision task to actually perform the work

I need Amazon SWF to distribute some work, make sure it's done asynchronously, make sure it's stored reliably, and make sure it's automatically restarted. However, the workflow logic I need is extremely simple: it's just getting a single task executed.
I implemented it now the way it's supposed to be done:
Request a workflow execution.
The decider finds out about it and schedules an activity.
A worker finds out about the activity request, performs the work and returns the results.
The decider notices the result and copies it over into a workflow completion.
It seems to me that I can just have the decider do the work, as it were, and complete the workflow execution immediately. That would do away with a lot of code. (The activity might also fail, time out, etc., all things that I currently need to cater for.)
So back to my question: can I have a decider that performs the work itself and completes the 'workflow' immediately?
Yes. Actually, I think you came up with an interesting use case: using a minimal workflow as a centralized locking mechanism for one-off actions in a distributed system, such as cron jobs executed from a single host in a fleet of many (the hosts first undergo an election, and whichever wins the lock gets to execute the action). The same could be achieved with Amazon SWF and a minimal amount of code:
A small Python example using boto.swf follows (use step 1 from this post to set up the domain).
To code the decider:
#MyDecider.py
import boto.swf.layer2 as swf

class OneShotDecider(swf.Decider):

    domain = 'stackoverflow'
    task_list = 'default_tasks'
    version = '1.0'

    def run(self):
        # poll SWF for a decision task
        history = self.poll()
        if 'events' in history:
            # we got a decision task: do the work and close the workflow
            decisions = swf.Layer1Decisions()
            print 'got the decision task, doing the work'
            decisions.complete_workflow_execution()
            self.complete(decisions=decisions)
            return False
        # the poll timed out with no task; tell the caller to keep polling
        return True
To start the decider:
$ ipython -i decider.py
In [1]: while OneShotDecider().run(): print 'polling SWF for decision tasks'
Finally, to start the workflow:
$ ipython
In [1]: wf_type = swf.WorkflowType(domain='stackoverflow', name='MyWorkflow', version='1.0', task_list='default_tasks')
In [2]: wf_type.start()
Out[2]: <WorkflowExecution 'MyWorkflow-1.0' at 0x32e2a10>
Back in the decider window, you'll see something like:
polling SWF for decision tasks
polling SWF for decision tasks
got the decision task, doing the work
If your workflow is likely to evolve its business logic or grow in the number of activities, it's probably best to stick to the standard way of having deciders do the business logic and workers solve the tasks.
While yes, you can do this (as pointed out by the other answer), there are some things to consider before doing so:
Why are you using SWF to execute this task? Why bother setting it up as a workflow and paying for "StartWorkflow" executions if you can get the same benefit by invoking your code more directly? If you need to track execution submissions and completions, you can just use an SQS queue for this and get the same results for cheaper.
Your workflows might be extremely simple right now, but they often can and do evolve to be more complex over time. Designing it right from the start can save time in the long run. Do you want future developers working on your code to think they should just add more logic to the workflow? Will they know to look up how to use activities, or will they just follow the existing pattern you've started with? (Hint: they'll likely copy your pattern - developers are lazy. :))
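To illustrate the SQS alternative from the first point, a minimal sketch using boto3 (the queue name and handler are hypothetical):
import boto3

sqs = boto3.resource('sqs')
queue = sqs.get_queue_by_name(QueueName='one-off-tasks')

# producer: submit a unit of work
queue.send_message(MessageBody='{"job": "nightly-report"}')

# consumer: long-poll for work, process it, then delete the message
while True:
    for message in queue.receive_messages(WaitTimeSeconds=20):
        handle_job(message.body)  # hypothetical handler
        message.delete()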

Set timeout on Django View Execution

How can I set a time limit on the execution of a Django view? I.e., a view should never take more than, say, 10 seconds to execute, and if it does, it should abort partway through. My idea is that we could have a decorator, but I am not sure. Looking for a solution. Thanks in advance.
I would suggest considering Celery, which includes built-in time limit support for tasks and would keep your Django app and server responsive:
A single task can potentially run forever; if you have lots of tasks waiting for some event that will never happen, you will block the worker from processing new tasks indefinitely. The best way to defend against this scenario is enabling time limits.
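A minimal sketch of those time limits on a task (expensive_computation is a hypothetical placeholder): the soft limit raises an exception the task can catch and handle, while the hard limit terminates the worker process outright.
from celery import shared_task
from celery.exceptions import SoftTimeLimitExceeded

@shared_task(soft_time_limit=10, time_limit=15)
def bounded_work(data):
    try:
        return expensive_computation(data)  # hypothetical placeholder
    except SoftTimeLimitExceeded:
        # clean up and report the timeout instead of dying abruptly
        return {'error': 'computation exceeded 10 seconds'}
The view queues bounded_work.delay(data) and stays responsive regardless of how long the computation takes.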

implementing a timer in a django app

In my Django app, I need to implement this "timer-based" functionality:
The user creates some jobs and defines for each one when it will take place (in the same units the timer works in, probably seconds).
The user starts the timer.
The user may pause and resume the timer whenever they want.
A job is executed when its time is due.
This does not fit a typical cron scenario, as the time of execution is tied to a timer that the user can start, pause and resume.
What is the preferred way of doing this?
This isn't really a Django question; it is a system architecture problem. HTTP is stateless, so there is no notion of time in it.
My suggestion is to use a message queue such as RabbitMQ and use Carrot to interface with it. You can put the jobs on the queue, then create a separate consumer daemon that processes jobs from the queue. The consumer holds the logic about when to process.
If that is too complex a system, perhaps look at implementing the timer in JS and having it call a URL mapped to a view that processes a unit of work. The JS would be the timer.
Have a look at Pinax, especially the notifications app.
Once notifications are created they are pushed to the DB (the queue) and processed by the cron-jobbed email sender (the consumer).
In this scenario you can't stop a job once it has been fired.
That could be managed by some (AJAX) views that call a system process...
Edit: instead of cron jobs you could use a Twisted-based consumer:
write jobs with their time information to the DB
send a request to consume (or resume, pause, ...) to the Twisted server via a socket
do the rest in Twisted
You're going to end up with processes, separate from the web server, that monitor the queue and execute jobs. Consider how you would build that without Django, using command-line tools to drive it, and use Django models to access the database.
When you have that working, layer a web-based interface on top (using full Django) to manipulate the queue and report on job status.
I think that if you approach it this way the problem becomes much easier.
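A hypothetical sketch of such a consumer written as a Django management command (the Job model, its fields and its run() method are placeholders):
import time
from django.core.management.base import BaseCommand
from django.utils import timezone
from myapp.models import Job  # hypothetical app and model

class Command(BaseCommand):
    help = 'Run due jobs from the queue'

    def handle(self, *args, **options):
        while True:
            due = Job.objects.filter(state='pending', run_at__lte=timezone.now())
            for job in due:
                job.run()           # hypothetical: execute the unit of work
                job.state = 'done'
                job.save()
            time.sleep(1)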
I used probably the simplest (crudest is more appropriate, I'm afraid) approach possible: 1. wrote a model featuring the current position and state of the counter (active, paused, etc.), 2. a Django job that increments the counter if its state is active, 3. a cron entry that executes the job every minute.
Thanks everyone for the answers.
You can always use a client-side jQuery timer, but remember to initialize the timer with a value passed from your backend application, and make sure the end user can't edit the time (e.g. by inspecting the page).
So store the timer start time (the timer's initial value) and the timer end time or pause time in the backend (the DB itself).
Monitor the duration in the backend and trigger the job (in your case) from there.
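A hypothetical model sketch of that idea (the field names are made up); the backend stays the source of truth and the client-side timer is only a display of this state:
from datetime import timedelta
from django.db import models
from django.utils import timezone

class Timer(models.Model):
    started_at = models.DateTimeField()
    paused_at = models.DateTimeField(null=True, blank=True)
    paused_total = models.DurationField(default=timedelta)

    def elapsed(self):
        # freeze the clock while paused; subtract time already spent paused
        end = self.paused_at or timezone.now()
        return end - self.started_at - self.paused_total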
Hope this is clear.