testing django-tasks - django

I am trying to write a test that involves running a django-tasks task. The problem is I can't seem to get the tasks to go beyond the "scheduled" status.
I have set
DJANGOTASK_DAEMON_THREAD = True
in my settings, for simplicity.
ptask = djangotasks.task_for_function(f)
djangotasks.run_task(ptask)
while ptask.status != 'successful':
    # Re-fetch the task to pick up its latest status.
    ptask = djangotasks.task_for_function(f)
    print(ptask.status)
    time.sleep(5)
This is what I'm attempting, which works well outside of tests.

I think you haven't started a task worker. From your Django project directory, run:
> python manage.py taskd run
Your scheduled tasks will then be executed by this "taskd" process.
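When polling for completion inside a test, it can help to bound the wait, so that a task stuck in "scheduled" fails the test instead of hanging it. A minimal Python 3 sketch using only the standard library; `wait_for_status` and its `get_status` argument are hypothetical names, not part of django-tasks. In a real test, `get_status` would be something like `lambda: djangotasks.task_for_function(f).status`.

```python
import time


def wait_for_status(get_status, expected="successful", timeout=30.0, interval=0.5):
    """Poll get_status() until it returns `expected` or `timeout` seconds pass.

    Returns the last status seen, so the caller can assert on it.
    """
    deadline = time.monotonic() + timeout
    status = get_status()
    while status != expected and time.monotonic() < deadline:
        time.sleep(interval)
        status = get_status()
    return status


# Stand-in for a real task: reports "scheduled" twice, then "successful".
_statuses = iter(["scheduled", "scheduled", "successful"])
print(wait_for_status(lambda: next(_statuses), interval=0.01))
```

Returning the last status rather than raising keeps the helper usable with a plain `assertEqual` in the test body.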


Is it possible to pause the django background tasks?

For example, I have a django background task like this:
notify_user(user.id, repeat=3600, repeat_until=datetime(2020, 12, 12, 0, 0, 0))
It repeats every hour until the given datetime.
My question is:
Is it possible to pause and resume this task? (If resuming is not possible, restarting the task would also be fine.)
Does anyone have experience with django background tasks?
There doesn't appear to be a documented way of achieving this, but you can always delete the task from the DB.
For example:
from datetime import datetime
from background_task.models import Task
task = notify_user(user.id, repeat=3600, repeat_until=datetime(2020, 12, 12, 0, 0, 0))
instance = Task.objects.get(id=task.pk)
instance.delete()
Now just call the task again to restart it:
task = notify_user(user.id, repeat=3600, repeat_until=datetime(2020, 12, 12, 0, 0, 0))

Google App Engine deferred.defer task not getting executed

I have a Google App Engine Standard Environment application that has been working fine for a year or more, that, quite suddenly, refuses to enqueue new deferred tasks using deferred.defer.
Here's the Python 2.7 code that is making the deferred call:
# Find any inventory items that reference the product, and change them too.
# because this could take some time, we'll do it as a deferred task, and only
# if needed.
if upd:
    updater = deferredtasks.InvUpdate()
    deferred.defer(updater.run, product_key)
My app.yaml file has the necessary bits to support deferred.defer:
- url: /_ah/queue/deferred
  script: google.appengine.ext.deferred.deferred.application
  login: admin
builtins:
- deferred: on
And my deferred task has logging in it so I should see it running when it does:
#-------------------------------------------------------------------------------
# DEFERRED routine that updates the inventory items for a particular product. Should be called
# when ANY changes are made to the product, because it should trigger a re-download of the
# inventory record for that product to the iPad.
#-------------------------------------------------------------------------------
class InvUpdate(object):
    def __init__(self):
        self.to_put = []
        self.product_key = None
        self.updcount = 0
    def run(self, product_key, batch_size=100):
        updproduct = product_key.get()
        if not updproduct:
            logging.error("DEFERRED ERROR: Product key passed in does not exist")
            return
        logging.info(u"DEFERRED BEGIN: beginning inventory update for: {}".format(updproduct.name))
        self.product_key = product_key
        self._continue(None, batch_size)
...
When I run this in the development environment on my development box, everything works fine. Once I deploy it to the App Engine server, the inventory updates never get done (i.e. the deferred task is not executed), and there are no errors (and no other logging from the deferred task in fact) in the log files on the server. I know that with the sudden move to get everybody on Python 3 as quickly as possible, the deferred.defer library has been marked as not recommended because it only works with the 2.7 Python environment, and I planned on moving to task queues for this, but I wasn't expecting deferred.defer to suddenly stop working in the existing python environment.
Any insight would be greatly appreciated!
I'm pretty sure you can't pass an instance method to the App Engine task queue, because that instance will not exist when your task runs, since the task runs in a different process. I actually don't understand how your task ever worked when running remotely in the first place (and running locally is not an accurate representation of how things will run remotely).
Try changing your code to this:
if upd:
    deferred.defer(deferredtasks.InvUpdate.run_cls, product_key)
and then InvUpdate is the same but has a new function run_cls:
class InvUpdate(object):
    @classmethod
    def run_cls(cls, product_key):
        # Create the instance inside the task, then run it.
        cls().run(product_key)
I'm still in the process of migrating to Cloud Tasks, and my deferred tasks still work.
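The idea behind `run_cls` is that the callable handed to the queue is looked up on the class, and the instance is only created once the task actually runs. A minimal standard-library sketch of that pattern, with no App Engine imports: the string `product_key` and the body of `run` are placeholders assumed for illustration, not the original update logic.

```python
class InvUpdate(object):
    def __init__(self):
        self.product_key = None

    def run(self, product_key):
        # Placeholder for the real inventory update.
        self.product_key = product_key
        return "updated inventory for {}".format(product_key)

    @classmethod
    def run_cls(cls, product_key):
        # Build a fresh instance inside the task, then delegate to run().
        return cls().run(product_key)


# The entry point is a class attribute, so no pre-built instance is needed.
print(InvUpdate.run_cls("product-123"))
```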

Run celery task without workers

How can I run all Celery tasks without workers, that is, call them directly?
I can call a task with TaskName.run(), but I want to set this in the configuration. How can I do that?
Just set the CELERY_ALWAYS_EAGER setting to True; this forces Celery to skip the queue and run tasks synchronously in the current process.
If you want to be able to do it per specific task, you can run them with apply() or run() as you mentioned, instead of running them with apply_async() or delay().
So tl;dr:
CELERY_ALWAYS_EAGER = True
# The following two would do and act the same, processing synchronously
my_task.run()
my_task.delay()
But
CELERY_ALWAYS_EAGER = False
# These two won't be the same anymore.
my_task.run() # Runs synchronously
my_task.delay() # Passed to the queue and runs Asynchronously, in another process
If I understand you right, you want to call the task synchronously.
Just call the method as normal:
TaskName()
You only need to use delay when you want to send it to the worker.
As a complement to SpiXel's answer: CELERY_ALWAYS_EAGER has been renamed to CELERY_TASK_ALWAYS_EAGER in versions 4.0+. This worked for me with Django 1.11 and Celery 4.1.0. So:
CELERY_TASK_ALWAYS_EAGER = False # async
CELERY_TASK_ALWAYS_EAGER = True # serial

Flask.socket_io blocking calls when database queries are run

I am trying to use socket_io with my Flask application. The problem occurs when I run database queries, as in the route function below. The first time, the page loads properly, but on subsequent calls the process blocks. Even KeyboardInterrupt (Ctrl+C) terminates only one of the Python processes; I have to kill the other one manually.
One obvious solution would be to use a cache and use another script to run queries on database. Is there any other possible solution which could avoid running separate scripts?
@app.route('/status/<urlMap>')
def status(urlMap):
    dictResponse = {}
    data = models.Status.query.filter_by(urlmap=urlMap).first()
    if data.conversion == "DONE":
        dictResponse['conversion'] = 'success'
    if data.published == "DONE":
        dictResponse['publish'] = 'success'
    return render_template('status.html', status=dictResponse)
Also, after removing the flask.ext.socketio import and using app.run(host='0.0.0.0') instead of socketio.run(app, host='0.0.0.0'), the app runs perfectly. So I think it's the async gevent calls that are somehow blocking the process.
As @Miguel correctly pointed out, monkey patching the standard library solved the issue: calling monkey.patch_all() fixed the problem.

Prioritize some workflow executions over others

I've been using the flow framework for amazon swf and I want to be able to run priority workflow executions and normal workflow executions. If there are priority tasks, then activities should pick up the priority tasks ahead of normal priority tasks. What is the best way to accomplish this?
I'm thinking that the following might work but I wonder if there's a better/recommended approach.
I'll define two Activity Workers and two activity lists for the activity. One priority list and one normal list. Each worker will be using the same activity class.
Both workers will be run on the same host (ec2 instance).
On the workflow, I'll define two methods: startNormalWorkflow and startHighWorkflow. In the startHighWorkflow method, I can use ActivitySchedulingOptions to put the task on the high priority list.
The problem with this approach is that there is no guarantee that the high priority task is scheduled before normal tasks.
It's a good question, it had me scratching my head for a while.
Of course, there is more than one way to skin this cat and there exists a number of valid solutions. I focused here on the simplest possible that I could conceive of, namely, execution of tasks in order of priority within a single workflow.
The scenario goes as follows: I define one activity worker serving two task lists, default_tasks and urgent_tasks, with a trivial logic:
If there are pending tasks on the urgent_tasks list, then pick one from there,
Otherwise, pick a task from default_tasks
Execute any task selected.
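The selection rule above can be sketched without SWF at all. This is a minimal in-memory model, assuming two FIFO queues stand in for the two task lists; `pick_task` is a hypothetical helper, and the order in which SWF hands out tasks within a single list may differ from the strict FIFO order used here.

```python
from collections import deque


def pick_task(urgent_tasks, default_tasks):
    """Return the next task, preferring the urgent queue; None if both are empty."""
    if urgent_tasks:
        return urgent_tasks.popleft()
    if default_tasks:
        return default_tasks.popleft()
    return None


urgent = deque(["SomeActivity2", "SomeActivity4"])
default = deque(["SomeActivity1", "SomeActivity3", "SomeActivity5"])

# Drain both queues, always checking the urgent one first.
order = []
while True:
    task = pick_task(urgent, default)
    if task is None:
        break
    order.append(task)

print(order)
```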
The question is how to check if any high priority tasks are pending? CountPendingActivityTasks API comes to the rescue!
I know you use Flow for development. My example is written using boto.swf.layer2 as Python is so much easier for prototyping - but the idea remains the same and can be extended to a more complex scenario with high and low priority workflow executions.
So, to accomplish the above using boto.swf follow these steps:
Export credentials to the environment
$ export AWS_ACCESS_KEY_ID='your access key'
$ export AWS_SECRET_ACCESS_KEY='your secret key'
Get the code snippets
For convenience, you can fork it from github:
$ git clone git@github.com:oozie/stackoverflow.git
$ cd stackoverflow/amazon-swf/priority_tasks/
To bootstrap the domain and the workflow:
# domain_setup.py
import boto.swf.layer2 as swf
DOMAIN = 'stackoverflow'
VERSION = '1.0'
swf.Domain(name=DOMAIN).register()
swf.ActivityType(domain=DOMAIN, name='SomeActivity', version=VERSION, task_list='default_tasks').register()
swf.WorkflowType(domain=DOMAIN, name='MyWorkflow', version=VERSION, task_list='default_tasks').register()
Decider implementation:
# decider.py
import boto.swf.layer2 as swf
DOMAIN = 'stackoverflow'
ACTIVITY = 'SomeActivity'
VERSION = '1.0'
class MyWorkflowDecider(swf.Decider):
    domain = DOMAIN
    task_list = 'default_tasks'
    version = VERSION
    def run(self):
        history = self.poll()
        print(history)
        if 'events' in history:
            # Get a list of non-decision events to see what event came in last.
            workflow_events = [e for e in history['events']
                               if not e['eventType'].startswith('Decision')]
            decisions = swf.Layer1Decisions()
            last_event = workflow_events[-1]
            last_event_type = last_event['eventType']
            if last_event_type == 'WorkflowExecutionStarted':
                # At the start, schedule five activities across the two task lists.
                decisions.schedule_activity_task(ACTIVITY+'1', ACTIVITY, VERSION, task_list='default_tasks')
                decisions.schedule_activity_task(ACTIVITY+'2', ACTIVITY, VERSION, task_list='urgent_tasks')
                decisions.schedule_activity_task(ACTIVITY+'3', ACTIVITY, VERSION, task_list='default_tasks')
                decisions.schedule_activity_task(ACTIVITY+'4', ACTIVITY, VERSION, task_list='urgent_tasks')
                decisions.schedule_activity_task(ACTIVITY+'5', ACTIVITY, VERSION, task_list='default_tasks')
            elif last_event_type == 'ActivityTaskCompleted':
                # Complete workflow execution after 5 completed activities.
                closed_activity_count = sum(1 for wf_event in workflow_events if wf_event.get('eventType') == 'ActivityTaskCompleted')
                if closed_activity_count == 5:
                    decisions.complete_workflow_execution()
            self.complete(decisions=decisions)
            return True
Prioritizing worker implementation:
# worker.py
import boto.swf.layer2 as swf
DOMAIN = 'stackoverflow'
VERSION = '1.0'
class PrioritizingWorker(swf.ActivityWorker):
    domain = DOMAIN
    version = VERSION
    def run(self):
        # Poll the urgent list only if it reports pending tasks.
        urgent_task_count = swf.Domain(name=DOMAIN).count_pending_activity_tasks('urgent_tasks').get('count', 0)
        if urgent_task_count > 0:
            self.task_list = 'urgent_tasks'
        else:
            self.task_list = 'default_tasks'
        activity_task = self.poll()
        if 'activityId' in activity_task:
            print('{} urgent tasks in the queue. Executing {}'.format(urgent_task_count, activity_task.get('activityId')))
            self.complete()
            return True
Run the workflow from three instances of an interactive Python shell
Run the decider:
$ python -i decider.py
>>> while MyWorkflowDecider().run(): pass
...
Start an execution:
$ python -i decider.py
>>> swf.WorkflowType(domain='stackoverflow', name='MyWorkflow', version='1.0', task_list='default_tasks').start()
Finally, kick off the worker and watch the tasks as they're getting executed:
$ python -i worker.py
>>> while PrioritizingWorker().run(): pass
...
2 urgent tasks in the queue. Executing SomeActivity2
1 urgent tasks in the queue. Executing SomeActivity4
0 urgent tasks in the queue. Executing SomeActivity5
0 urgent tasks in the queue. Executing SomeActivity1
0 urgent tasks in the queue. Executing SomeActivity3
It turns out that using a separate task list that you have to check first doesn't work well.
There are a couple of problems.
First, the count API doesn't update reliably. So you may get 0 tasks even when there are urgent tasks in the queue.
Second, the call that polls for tasks hangs if there are no tasks available. So when you poll for the non-urgent tasks, that will "stick" for either 2 minutes, or until you have a non-urgent task to do.
So this can cause all kinds of problems in your workflow.
For this to work, SWF would have to implement a polling API that could return the first task from a list of task lists. Then it would be much easier.