Using MongoDB as message queue for Celery - django

I'm trying to use MongoDB as the message queue for Celery (in a Django app). The current development version of Celery (2.2.0rc2) is supposed to let you do this, but I can't seem to get any workers to pick up tasks I'm creating.
Versions:
celery v2.2.0rc3
mongodb 1.6.5
pymongo 1.9
django-celery 2.2.0rc2
In my settings, I have:
CELERY_RESULT_BACKEND = "mongodb"
CELERY_MONGODB_BACKEND_SETTINGS = {
    # Shouldn't need these - defaults are correct.
    "host": "localhost",
    "port": 27017,
    "database": "celery",
    "taskmeta_collection": "messages",
}
BROKER_BACKEND = 'mongodb'
BROKER_HOST = "localhost"
BROKER_PORT = 27017
BROKER_USER = ""
BROKER_PASSWORD = ""
BROKER_VHOST = ""
import djcelery
djcelery.setup_loader()
I've created a test tasks.py file as follows:
from celery.decorators import task
@task()
def add(x, y):
    return x + y
If I fire up celeryd in the background, it appears to start normally. I then open a python shell and run the following:
>>> from myapp.tasks import add
>>> result = add.delay(5,5)
>>> result
<AsyncResult: 7174368d-288b-4abe-a6d7-aeba987fa886>
>>> result.ready()
False
Problem is that no workers ever pick up the tasks. Am I missing a setting or something? How do I point celery to the message queue?

We had this same issue. While the doc says all tasks should be registered in Celery by calling
import djcelery
djcelery.setup_loader()
it wasn't working properly. So, we still used the
CELERY_IMPORTS = ('YOUR_APP.tasks',)
setting in settings.py. Also, make sure you restart Celery if you add a new task because Celery has to register the tasks when it first starts.
Django, Celerybeat and Celery with MongoDB as the Broker

Remember that Kombu works only with MongoDB 1.3+ because it needs the findandmodify command.
If you are on Ubuntu, the latest version in the repository is 1.2, which doesn't work.
You may also have to set
BROKER_VHOST = "dbname"
Keep me posted if it works
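The version requirement above can be checked before pointing Celery at a MongoDB server. This is a minimal sketch, assuming the server version is already available as a dotted string of plain integers. The helper names are hypothetical and the 1.3 threshold is simply the one stated in this answer:

```python
def parse_version(version):
    """Turn a dotted version string such as "1.6.5" into a tuple of ints.

    Assumes every component is a plain integer (no "-rc1" style suffixes).
    """
    return tuple(int(part) for part in version.split("."))


def supports_findandmodify(version, minimum=(1, 3)):
    """Return True if the given MongoDB version meets the stated minimum."""
    return parse_version(version) >= minimum


print(supports_findandmodify("1.6.5"))  # the version listed in the question
print(supports_findandmodify("1.2"))    # the version the answer says is too old
```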

Be sure to add this to your settings, or the workers can't find the task and will fail silently.
CELERY_IMPORTS = ("namespace", )

I had the same issue but when I upgraded to celery 2.3.3 everything worked like a charm.

Related

celery cannot find active tasks (locally and on Heroku)

I am trying to deploy an app on Heroku and I am using Celery and Redis to manage background tasks. I currently have a background task that collects data via FTP and puts it in the database. I also have a loading page that periodically refreshes until the task completes. However, I cannot retrieve the list of active tasks (inspect from celery.task.control returns None). I tried running this locally, and I can see that Celery receives the task (in the terminal). I can also see that Celery connects to Redis at the correct port during startup.
I have tried reinstalling several libraries, and ensuring that all variables in the settings.py file were set properly. I also tried checking the value of os.environ['REDIS_URL'], and it is correct.
Relevant code from settings.py:
CACHES = {
    "default": {
        "BACKEND": "redis_cache.RedisCache",
        "LOCATION": os.environ['REDIS_URL'],
    }
}
CELERY_BROKER_URL = os.environ['REDIS_URL']
CELERY_RESULT_BACKEND = os.environ['REDIS_URL']
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
celery.py:
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'genome.settings')
os.environ.setdefault('REDIS_URL', 'redis://localhost:6379/0')
app = Celery('genome_app')
app.conf.update(BROKER_URL=os.environ['REDIS_URL'],
                CELERY_RESULT_BACKEND=os.environ['REDIS_URL'])
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
(in the app's views.py)
from celery.task.control import inspect
...
i = inspect()
active_tasks = list(i.active().values())[0]
AttributeError: 'NoneType' object has no attribute 'values'
If you want to collect the args and task id of each active task into a new list of dicts:
from celery.task.control import inspect
i = inspect()
dictfile = i.active()
properties = []
for dictele in dictfile:
    for dictloop in dictfile[dictele]:
        jobid = dictloop['args']
        taskid = dictloop['id']
        # args is assumed to be a string like "('job',)"; strip the tuple syntax
        jobid = jobid.replace("('", "")
        jobid = jobid.replace("',)", '')
        # use a fresh dict per task; reusing one dict would make every entry identical
        details = {"jobid": jobid, "taskid": taskid}
        properties.append(details)
print(properties)
You can create your own task manager list using the details above.
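inspect().active() returns None when no worker replies in time, which is what triggers the AttributeError in the question. Below is a minimal sketch of a guard, written as a plain function over the reply so it can be tried without a broker. The name tasks_from_active_reply and the sample reply are hypothetical:

```python
def tasks_from_active_reply(reply):
    """Flatten the reply of inspect().active() into one list of task dicts.

    `reply` maps worker names to lists of task dicts, or is None when no
    worker answered.
    """
    if not reply:
        return []
    tasks = []
    for worker_tasks in reply.values():
        tasks.extend(worker_tasks)
    return tasks


# Hypothetical reply with the same shape as the dictionary walked through above.
sample = {"celery@host": [{"id": "abc", "args": "('job-1',)"}]}
print(tasks_from_active_reply(sample))
print(tasks_from_active_reply(None))
```

In the view this would be called as tasks_from_active_reply(inspect().active()), so an empty list is returned instead of an exception when no worker answers.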
I have been having the same problem for a while now. It seems though that the devs are aware of it (https://github.com/celery/kombu/issues/1081). I have found that by trying to force it to install an older version of kombu (4.5.0 seems to now work for me) it works again for the time being.

django celery SQS "No result backend is configured."

Not duplicate of Celery: No Result Backend Configured? because SQS is used.
Keep getting the following error:
No result backend is configured. Please see the documentation for more
information.
My production settings are the following:
CELERY_BROKER_URL = 'sqs://%s:%s@' % (
    urllib.parse.quote(env.str('TASK_QUEUE_USER_ID'), safe=''),
    urllib.parse.quote(env.str('TASK_QUEUE_USER_SECRET'), safe=''))
BROKER_URL = CELERY_BROKER_URL
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_RESULT_BACKEND = None # Disabling the results backend
RESULT_BACKEND = None # Disabling the results backend
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_DEFAULT_QUEUE = 'async_tasks'
SQS_QUEUE_NAME = 'async_tasks'
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_SEND_EVENTS = False
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'eu-west-2',
    'polling_interval': 3,
    'visibility_timeout': 3600,
}
CELERY_SEND_TASK_ERROR_EMAILS = True
#
# https://stackoverflow.com/questions/8048556/celery-with-amazon-sqs#8567665
#
CELERY_BROKER_TRANSPORT = 'sqs'
BROKER_TRANSPORT = 'sqs'
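The broker URL in these settings percent-encodes the credentials, since AWS secret keys can contain characters such as "/" and "+" that would otherwise be misread inside a URL. The following is a minimal sketch of that construction using only the standard library. The function name and the sample credentials are made up for illustration:

```python
from urllib.parse import quote


def build_sqs_broker_url(access_key_id, secret_access_key):
    """Build an sqs:// broker URL with percent-encoded credentials."""
    # safe="" makes quote() encode "/" as well, which it leaves alone by default.
    return "sqs://%s:%s@" % (
        quote(access_key_id, safe=""),
        quote(secret_access_key, safe=""),
    )


# Fake credentials, used only to show the encoding.
print(build_sqs_broker_url("AKIAEXAMPLE", "abc/def+ghi"))
```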
Running celery from the command line:
DJANGO_ENV=production celery -A async_tasks worker -l info
connects to SQS and polls, but when I try to do a demo call from the command line DJANGO_ENV=production python manage.py check_async:
from django.core.management.base import BaseCommand, CommandError
import async_tasks.tasks as tasks
class Command(BaseCommand):
    help = 'Check if infrastructure for async tasks has been setup correctly.'
    def handle(self, *args, **options):
        try:
            print('Sending async request.')
            t = tasks.add.apply_async((2, 4))
            out = t.get(timeout=1)
            print(out)
            print(t.status)
        except Exception as e:
            print(e)
            raise CommandError('Error occured')
I get the error above. I have tried this on a development machine with Redis and everything works well.
Any ideas?
You need a Celery Result Backend configured to be able to store and collect task results. Using Celery with an SQS broker w/o a result backend is ok for "fire and forget" patterns, but it's not enough if you want to be able to access the results of your tasks through methods like get().
Maybe this will help someone. The answer above is correct, but if you still want to use Django, SQS and Celery and still want to see the results you can use Django's ORM or Cache Framework as a backend by using the django-celery-results library.
Django-celery-results
Celery Documentation - ORM Cache Framework
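A sketch of the settings this would involve, assuming django-celery-results is installed and its migrations have been applied. The app label and the "django-db" backend name are taken from that library's documentation to the best of my knowledge and should be checked against the installed version:

```python
# settings.py fragment (sketch, not a complete settings module)
INSTALLED_APPS = [
    # ... existing apps ...
    "django_celery_results",
]

# Store task results through the Django ORM instead of disabling the backend.
CELERY_RESULT_BACKEND = "django-db"
```

With a result backend configured, calls such as t.get(timeout=1) in the management command have somewhere to read the result from.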

Django+Celery in Heroku not executing async task

I have Django+Celery in Heroku, and Celery is set up as:
import djcelery
djcelery.setup_loader()
BROKER_URL = "django://" # tell kombu to use the Django database as the message queue
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'
CELERY_ALWAYS_EAGER = False
CELERY_TIMEZONE = 'Europe/Madrid'
I have 2 tasks defined in tasks.py, one periodic and another that is executed on asynchronous calls:
@task
def test_one_shot():
    print "One shot"
@periodic_task(run_every=crontab(minute="*/5"))
def test_periodic():
    print "Periodic"
Heroku is configured with a main web worker and an auxiliary worker:
web: gunicorn config.wsgi:application ON
worker: python manage.py celery worker -B -l info ON
With this setup, I run the test_one_shot task as follows:
test_one_shot.apply_async(eta=datetime.now()+timedelta(minutes=2))
And although it appears as registered in the heroku logs:
Received task: test.tasks.test_one_shot[f29c609d-b6e8-45d4-808d-2ca690f029af] eta:[2016-08-07 00:09:30.262362+02:00]
It never executes. On the other hand, the periodic task test_periodic is executed as expected. What am I doing wrong?
Thanks!
EDIT: The task was executed, but it was not appearing in the logs because of a timezone-aware datetime issue. However, when the task is called programmatically, it is never executed.
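One common source of this kind of timezone issue is passing a naive datetime.now() as the eta while CELERY_TIMEZONE is set. Here is a minimal sketch of building a timezone-aware eta with the Python 3 standard library. The helper name eta_in is hypothetical, and the apply_async call is shown only as a comment because it needs a running Celery setup:

```python
from datetime import datetime, timedelta, timezone


def eta_in(minutes):
    """Return a timezone-aware UTC datetime `minutes` from now."""
    return datetime.now(timezone.utc) + timedelta(minutes=minutes)


eta = eta_in(2)
print(eta.isoformat())
# test_one_shot.apply_async(eta=eta)  # same call as in the question, with an aware eta
```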
I ended up changing the Celery broker to RabbitMQ on Heroku following this guide, and the problem was solved.
Basically, I installed RabbitMQ on Heroku:
$ heroku addons:add cloudamqp
And set the new configuration for it:
import djcelery
djcelery.setup_loader()
CELERY_TIMEZONE = 'Europe/Madrid'
BROKER_URL = env("CLOUDAMQP_URL", default="django://")
BROKER_POOL_LIMIT = 1
BROKER_CONNECTION_MAX_RETRIES = None
CELERY_TASK_SERIALIZER = "json"
CELERY_ACCEPT_CONTENT = ["json", "msgpack"]
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
CELERY_ALWAYS_EAGER = False
if BROKER_URL == "django://":
    INSTALLED_APPS += ("kombu.transport.django",)

Celery executing scheduled task a hundred times

I've configured celery in my django app in order to run a task every morning. The task simply sends an email to a group of users. The problem is that the same email is being sent a few hundred times!!
This is my celery config:
BROKER_URL = 'redis://127.0.0.1:6379/0'
BROKER_TRANSPORT = 'redis'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
from celery.schedules import crontab
CELERYBEAT_SCHEDULE = {
    'alert_user_is_not_buying-everyday-at-7': {
        'task': 'opti.tasks.alert_users_not_buying',
        'schedule': crontab(hour=7, minute=0),
    },
}
and the task is:
@app.task(bind=True)
def alert_user_is_not_buying(self):
    send_mail_to_users()
And I use these commands to start the worker and beat (I use supervisor for that):
exec celery --app=opti beat --loglevel=INFO
exec celery --app=opti worker --loglevel=INFO
I believe that there's no problem with my send_mail_to_users() method. It looks like the emails are sent every 30 seconds...
What is missing?
Your CELERYBEAT_SCHEDULE setting is likely going unused, as you have CELERYBEAT_SCHEDULER set to use the DatabaseScheduler. How is that scheduler configured? I would guess that's where the problem is coming from.

celery + django - how to write task state to database

I'm running Celery with Django and RabbitMQ and want to see the task states in the database table. Unfortunately no entries are written into the table djcelery_taskstate and I can't figure out why.
My settings:
CELERY_ENABLE_UTC = True
BROKER_URL = "amqp://guest:guest@localhost:5672/"
CELERY_RESULT_BACKEND = "database"
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
CELERY_TRACK_STARTED = True
CELERY_SEND_EVENTS = True
CELERY_IMPORTS = ("project_management.tasks", "accounting.tasks", "time_tracking.tasks", )
CELERY_ALWAYS_EAGER = False
import djcelery
djcelery.setup_loader()
My Task:
class TestTask(Task):
    def run(self, po_id):
        self.update_state(state=states.STARTED, meta={'total': 0, 'done': False})
        # do something..
        self.update_state(state=states.SUCCESS, meta={'total': 100, 'done': True})
I'm starting the task as follows in a view:
TestTask.apply_async(args=[], kwargs={})
I'm starting celery workers as follows.
python manage.py celeryd -v 1 -B -s celery -E -l INFO
Console gives me the following output:
[2013-05-19 11:10:03,774: INFO/MainProcess] Task accounting.tasks.TestTask[5463b2ed-0eba-451d-b828-7a89fcd36348] succeeded in 0.0538640022278s: None
Any idea what is wrong with my setup?
You need to start up the snapshot camera as well in order to see the results in the database.
python manage.py celerycam
Once you have that running, you will be able to see entries in the djcelery tables.