I have a problem using Celery, Redis and Django.
I am trying to use them to create a simple task.
However, an error occurs shortly after the task has been executed.
The relevant part of the code is below. Thank you for your attention.
CELERY_BROKER_URL = 'redis://:password@REDIS:6379/0'
CELERY_RESULT_BACKEND = 'redis://REDIS:6379/0'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
CELERY_TIMEZONE = 'America/Recife'
CELERY_BEAT_SCHEDULE = {
    'task-send': {
        'task': 'app.tasks.task_send_email',
        'schedule': crontab(hour=5, minute=44)
    }
}
Console Celery
[config]
app: sistema:0x7fa254a5d6f4
transport: redis://:**@redis:6379/0
results: redis://redis:6379/0
concurrency: 1 (prefork)
task events: OFF (enable -E to monitor tasks in this worker)
[queues]
exchange=celery(direct) key=celery
[tasks]
app.tasks.task_send_email
INFO/MainProcess] Connected to redis://:**@redis:6379/0
INFO/MainProcess] mingle: searching for neighbors
INFO/MainProcess] mingle: all alone
After the task executes, an error occurs:
RuntimeWarning: Exception raised outside body: ResponseError('NOAUTH Authentication required.',):
The task is not completed.
Your result backend URL does not include the authentication token, yet it points to the same server, which obviously expects one. What I believe is happening is the following: the task runs successfully (because the broker URL is correct), but once it finishes, Celery tries to store the result in the result backend. Since the result backend URL is missing the password (redis://redis:6379/0; it should look like the broker URL, i.e. redis://:**@redis:6379/1, ideally with a different database number), Redis rejects the command and Celery raises an exception (NOAUTH Authentication required comes from the Redis server).
Let's say your Redis server is redis.local, and your Redis authentication token is my53cr3tt0ken. Your Celery configuration should have these two:
broker_url = "redis://:my53cr3tt0ken@redis.local:6379/0"
result_backend = "redis://:my53cr3tt0ken@redis.local:6379/1"
Notice I use different databases for broker and result backend - I recommend you do the same.
If your Redis encrypts communication, then you should use rediss://....
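To keep the two URLs from drifting apart, they can be built from one set of values. This is only a sketch under stated assumptions: redis_url is a hypothetical helper, the host and token are the example values used above, and the setting names are the ones from the question's Django settings. It also assumes the client library decodes a percent-encoded password, as URL parsers normally do.

```python
from urllib.parse import quote


def redis_url(host, db, password=None, port=6379, tls=False):
    """Build a URL of the form redis://:password@host:port/db."""
    scheme = "rediss" if tls else "redis"
    # Percent-encode the password so characters such as '@' or '/'
    # are not mistaken for URL delimiters.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"{scheme}://{auth}{host}:{port}/{db}"


REDIS_HOST = "redis.local"        # example host from the answer
REDIS_PASSWORD = "my53cr3tt0ken"  # example token from the answer

# Same server and same password, but a different database for each role.
CELERY_BROKER_URL = redis_url(REDIS_HOST, 0, REDIS_PASSWORD)
CELERY_RESULT_BACKEND = redis_url(REDIS_HOST, 1, REDIS_PASSWORD)

print(CELERY_BROKER_URL)
print(CELERY_RESULT_BACKEND)
```

Passing tls=True switches the scheme to rediss:// for an encrypted connection.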
Pre-warning: there is A LOT I don't understand
My Requirement
I need to be able to get the result of a celery task. I need the status to change to 'SUCCESS' when completed successfully.
For example:
I need to be able to get the result of x + y after executing add.delay(1,2) on the task below.
myapp/tasks.py
from celery import shared_task
from time import sleep

@shared_task
def add(x, y):
    sleep(10)
    return x + y
Is AWS SQS the right tool for my needs?
I read Celery's Using Amazon SQS and understand at the bottom it says this about the results.
Results
Multiple products in the Amazon Web Services family could be
a good candidate to store or publish results with, but there’s no such
result backend included at this point.
Question:
Does this mean django-celery-results can't be used with AWS SQS?
More Context Below
What am I doing, step by step?
I look at my AWS queue (shows messages available as 3)
In my local terminal, I do celery -A ebdjango worker --loglevel=INFO (see celery output below)
In my PyCharm Python console connected to my Django project, I do r = add.delay(1,2)
r is an AsyncResult object:
>>> r = add.delay(1,2)
>>> r
<AsyncResult: b69c4287-5c82-4873-aa8c-227547511233>
In AWS, my "Messages available" went from 3 to 4
Locally, in my terminal, nothing happened (I expect SQS to send the message back to me locally? Is this wrong?)
I inspect r and see this:
>>> r.id
'b69c4287-5c82-4873-aa8c-227547511233'
>>> r.status
'PENDING'
>>> r.result
>>> type(r.result)
<class 'NoneType'>
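For context on the PENDING value above: as far as I know, Celery reports PENDING for any task ID it has no stored state for, so it can mean either that the task has not run yet or that no result was ever recorded. A minimal sketch of polling a result with a deadline follows. wait_for_result is a hypothetical helper, and it only assumes the object exposes ready() and get(timeout=...) the way Celery's AsyncResult does.

```python
import time


def wait_for_result(async_result, timeout=30.0, poll_interval=0.5):
    """Poll until the task finishes, then return its value.

    Raises TimeoutError if the task is still unfinished after `timeout`
    seconds, which is how a worker that never picks up the message
    looks from the caller's side.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if async_result.ready():
            # On a real AsyncResult, get() re-raises the task's exception
            # if the task failed.
            return async_result.get(timeout=1.0)
        time.sleep(poll_interval)
    raise TimeoutError(f"no result after {timeout} seconds")
```

In the console session above this would be called as wait_for_result(r, timeout=60).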
ebdjango/settings.py
...
AWS_ACCESS_KEY_ID = "XXXXXXXXXXXXXXXXXXX"
AWS_SECRET_ACCESS_KEY = "YYYYYYYYYYYYYYYYYYYYYYYYYYY"
CELERY_BROKER_URL = "sqs://"
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'us-west-2',
    'visibility_timeout': 3600,
    'predefined_queues': {
        'eb-celery-queue': {
            'url': 'https://sqs.us-west-2.amazonaws.com/12345678910/eb-celery-queue',
            'access_key_id': AWS_ACCESS_KEY_ID,
            'secret_access_key': AWS_SECRET_ACCESS_KEY,
        }
    }
}
CELERY_SEND_EVENTS = False
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_TASK_DEFAULT_QUEUE = 'eb-celery-queue'
CELERY_WORKER_CONCURRENCY = 1
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_CONTENT_ENCODING = 'utf-8'
CELERY_RESULT_BACKEND = 'django-db'  # Note: I have django-celery-results installed and set
Celery output at start:
(eb-virt) C:\Users\Jarad\Documents\PyCharm\DEVOPS\ebdjango>celery -A ebdjango worker --loglevel=INFO
[2021-08-27 14:35:31,914: WARNING/MainProcess] No hostname was supplied. Reverting to default 'None'
-------------- celery@Inspiron v5.1.2 (sun-harmonics)
--- ***** -----
-- ******* ---- Windows-10-10.0.19041-SP0 2021-08-27 14:35:31
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: ebdjango:0x1d64e4d4630
- ** ---------- .> transport: sqs://localhost//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> eb-celery-queue exchange=eb-celery-queue(direct) key=eb-celery-queue
[tasks]
. ebdjango.celery.debug_task
. homepage.tasks.add
. homepage.tasks.count_widgets
. homepage.tasks.cu
. homepage.tasks.mul
. homepage.tasks.rename_widget
. homepage.tasks.xsum
[2021-08-27 14:35:31,981: WARNING/MainProcess] No hostname was supplied. Reverting to default 'None'
[2021-08-27 14:35:31,981: INFO/MainProcess] Connected to sqs://localhost//
[2021-08-27 14:35:32,306: WARNING/MainProcess] ...
[2021-08-27 14:35:32,307: INFO/MainProcess] celery@Inspiron ready.
I do notice that 1) results: section shows empty (like it's not defined) and 2) the task events are OFF which might be because task events aren't supported for SQS, but I don't know for certain. It seems I can set CELERY_SEND_EVENTS and it has no effect on task events output here.
I am trying to use Celery to run a rather time-consuming algorithm on one of my models.
Currently in my home/tasks.py I have:
@shared_task(bind=True)
def get_hot_posts():
    return Post.objects.get_hot()

@shared_task(bind=True)
def get_top_posts():
    pass
Inside my Post model manager I have:
def get_hot(self):
    qs = (
        self.get_queryset()
        .select_related("author")
    )
    qs_list = list(qs)
    sorted_post = sorted(qs_list, key=lambda p: p.hot(), reverse=True)
    return sorted_post
This returns a list of the hot posts.
I have used django_celery_beat to set up a periodic task, which I have configured in my settings.py:
CELERY_BEAT_SCHEDULE = {
    'update-hot-posts': {
        'task': 'get_hot_posts',
        'schedule': 3600.0
    },
    'update-top-posts': {
        'task': 'get_top_posts',
        'schedule': 86400
    }
}
I do not know whether I can call functions on my models in Celery tasks, but my intention is to compute the top posts every hour and then simply use the result in one of my views. How can I achieve this? I have not been able to find out how to get the output of that task and use it in my views in order to render it in my template.
Thanks in advance!
EDIT
I am now caching the results:
settings.py:
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "IGNORE_EXCEPTIONS": True,
        }
    }
}
CACHE_TTL = getattr(settings, 'CACHE_TTL', DEFAULT_TIMEOUT)
@shared_task(bind=True)
def get_hot_posts():
    hot_posts = Post.objects.get_hot()
    cache.set("hot_posts", hot_posts, timeout=CACHE_TTL)
However, when accessing the objects in my view, cache.get returns None; it seems my tasks are not working.
@login_required
def hot_posts(request):
    posts = cache.get("hot_posts")
    context = {'posts': posts, 'hot_active': '-active'}
    return render(request, 'home/homepage/home.html', context)
How can I check whether my tasks are running properly, and whether the task is actually executing and caching the queryset result?
EDIT: Configuration in settings.py:
BROKER_URL = 'redis://localhost:6379'
BROKER_TRANSPORT = 'redis'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_BEAT_SCHEDULE = {
    'update-hot-posts': {
        'task': 'get_hot_posts',
        'schedule': 3600.0
    },
    'update-top-posts': {
        'task': 'get_top_posts',
        'schedule': 86400.0
    },
    'tester': {
        'task': 'tester',
        'schedule': 60.0
    }
}
I do not see any results when I go to my view, and cache.get returns None. I think my tasks are not running, but I cannot find the reason.
This is what happens when I run my worker:
celery -A register worker -E --loglevel=info
-------------- celery@apples-MacBook-Pro-2.local v4.4.6 (cliffs)
--- ***** -----
-- ******* ---- Darwin-16.7.0-x86_64-i386-64bit 2020-07-06 01:46:36
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: register:0x10f3da050
- ** ---------- .> transport: redis://localhost:6379//
- ** ---------- .> results: redis://localhost:6379/
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: ON
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. home.tasks.get_hot_posts
. home.tasks.get_top_posts
. home.tasks.tester
[2020-07-06 01:46:38,449: INFO/MainProcess] Connected to redis://localhost:6379//
[2020-07-06 01:46:38,500: INFO/MainProcess] mingle: searching for neighbors
[2020-07-06 01:46:39,592: INFO/MainProcess] mingle: all alone
[2020-07-06 01:46:39,650: INFO/MainProcess] celery@apples-MacBook-Pro-2.local ready.
Also for starting up beat I use:
celery -A register beat -l INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler
My suggestion is that you alter your model and make it taggable. Perhaps this: https://django-taggit.readthedocs.io/
Once you've done that you can modify your celery job that calculates hot posts. Once the new hot posts are calculated you can remove all the "hot" tags from all existing posts and then tag the newly-hot posts with the "hot" tag.
Then your view code can simply filter for posts with the hot tag.
EDIT
If you want to be sure that your code is actually executing, there are extensions you can use to do so. For example, the django-celery-results backend will store whatever data your @shared_task returns (usually JSON if that's your message encoding) in the database along with a timestamp and maybe even the input args. That way you can see whether your tasks are running as desired.
https://docs.celeryproject.org/en/stable/django/first-steps-with-django.html#django-celery-results-using-the-django-orm-cache-as-a-result-backend
You might also consider django-celery-beat to ensure that you have a nice visual way to see job schedules via the django admin
https://docs.celeryproject.org/en/stable/django/first-steps-with-django.html#django-celery-beat-database-backed-periodic-tasks-with-admin-interface
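While checking whether the tasks run, one detail in the question's code may be worth a look: both tasks are declared with bind=True but take no parameters. With bind=True, Celery passes the task instance as the first positional argument, so I would expect the call to fail with a TypeError. The sketch below reproduces only that calling convention in plain Python; task_instance is a stand-in object, not a real Celery task, and get_hot_posts_bound is a hypothetical name.

```python
def get_hot_posts():
    """Signature as written in the question: no parameters."""
    return ["post"]


def get_hot_posts_bound(self):
    """Signature a bind=True task needs: the task instance comes first."""
    return ["post"]


task_instance = object()  # hypothetical stand-in for the bound task object

# Calling the zero-argument version the way a bound task is called fails.
try:
    get_hot_posts(task_instance)
    failure = None
except TypeError as exc:
    failure = str(exc)

# The version that accepts `self` works with the same call.
result = get_hot_posts_bound(task_instance)

print(failure)
print(result)
```

Either dropping bind=True or adding self as the first parameter should avoid the mismatch.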
EDIT 2
If you're going to use the database scheduler (highly recommended!) then you'll need to login to the admin and add your tasks on the schedule that you want.
https://pinoylearnpython.com/wp-content/uploads/2019/04/Django-Celery-Beat-on-Admin-Site-Pinoy-Learn-Python-1024x718.jpg
EDIT 3
In your settings.py
CELERY_BEAT_SCHEDULE = {
    'update-hot-posts': {
        'task': 'get_hot_posts',
        'schedule': 3600.0
    },
    'update-top-posts': {
        'task': 'get_top_posts',
        'schedule': 86400.0
    },
    'tester': {
        'task': 'tester',
        'schedule': 60.0
    }
}
The third task there is called tester, which is supposed to run every 60s. I don't see that anywhere in your tasks. Because you have attempted to schedule a task which isn't defined anywhere as a @shared_task, Celery is getting confused and giving you the error messages about tester.
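Another thing that may be worth comparing: the worker output in the question lists the tasks under their full module path (home.tasks.get_hot_posts and so on), while the schedule refers to short names such as get_hot_posts. As far as I know, beat sends a task by the name given in the schedule, and a worker only runs names it has registered. Below is a small check with the names copied from the question; find_unregistered is a hypothetical helper, not part of Celery.

```python
def find_unregistered(schedule, registered):
    """Return the schedule's task names that the worker has not registered."""
    return sorted(
        entry["task"]
        for entry in schedule.values()
        if entry["task"] not in registered
    )


# Task names as printed in the worker's [tasks] section.
REGISTERED = {
    "home.tasks.get_hot_posts",
    "home.tasks.get_top_posts",
    "home.tasks.tester",
}

# Schedule as written in the question (short task names).
CELERY_BEAT_SCHEDULE = {
    "update-hot-posts": {"task": "get_hot_posts", "schedule": 3600.0},
    "update-top-posts": {"task": "get_top_posts", "schedule": 86400.0},
    "tester": {"task": "tester", "schedule": 60.0},
}

# Same schedule using the names the worker reports.
FIXED_SCHEDULE = {
    "update-hot-posts": {"task": "home.tasks.get_hot_posts", "schedule": 3600.0},
    "update-top-posts": {"task": "home.tasks.get_top_posts", "schedule": 86400.0},
    "tester": {"task": "home.tasks.tester", "schedule": 60.0},
}

mismatched = find_unregistered(CELERY_BEAT_SCHEDULE, REGISTERED)
still_mismatched = find_unregistered(FIXED_SCHEDULE, REGISTERED)
print(mismatched)
print(still_mismatched)
```

An alternative to changing the schedule would be giving each task an explicit name, for example @shared_task(name="get_hot_posts"), so that the short names match.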
I am trying to run a celery task in a Django view using my_task.delay(). However, the task is never executed and the code is blocked on that line and the view never renders. I am using AWS SQS as a broker with an IAM user with full access to SQS.
What am I doing wrong?
Running celery and Django
I am running celery like this:
celery -A app worker -l info
And I am starting my Django server locally in another terminal using:
python manage.py runserver
The celery command outputs:
-------------- celery@LAPTOP-02019EM6 v4.1.0 (latentcall)
---- **** -----
--- * *** * -- Windows-10-10.0.16299 2018-02-07 13:48:18
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: app:0x6372c18
- ** ---------- .> transport: sqs://**redacted**:**@localhost//
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ---- .> task events: OFF
--- ***** -----
-------------- [queues]
.> my-queue exchange=my-queue(direct) key=my-queue
[tasks]
. app.celery.debug_task
. counter.tasks.my_task
[2018-02-07 13:48:19,262: INFO/MainProcess] Starting new HTTPS connection (1): sa-east-1.queue.amazonaws.com
[2018-02-07 13:48:19,868: INFO/SpawnPoolWorker-1] child process 20196 calling self.run()
[2018-02-07 13:48:19,918: INFO/SpawnPoolWorker-4] child process 19984 calling self.run()
[2018-02-07 13:48:19,947: INFO/SpawnPoolWorker-3] child process 16024 calling self.run()
[2018-02-07 13:48:20,004: INFO/SpawnPoolWorker-2] child process 19572 calling self.run()
[2018-02-07 13:48:20,815: INFO/MainProcess] Connected to sqs://**redacted**:**@localhost//
[2018-02-07 13:48:20,930: INFO/MainProcess] Starting new HTTPS connection (1): sa-east-1.queue.amazonaws.com
[2018-02-07 13:48:21,307: WARNING/MainProcess] c:\users\nicolas\anaconda3\envs\djangocelery\lib\site-packages\celery\fixups\django.py:202: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2018-02-07 13:48:21,311: INFO/MainProcess] celery@LAPTOP-02019EM6 ready.
views.py
from django.http import HttpResponse

from .tasks import my_task

def index(request):
    print('New request')  # This is called
    my_task.delay()
    # Never reaches here
    return HttpResponse('test')
tasks.py
...
@shared_task
def my_task():
    print('Task ran successfully')  # never prints anything
settings.py
My configuration is the following:
import djcelery
djcelery.setup_loader()
CELERY_BROKER_URL = 'sqs://'
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'sa-east-1',
}
CELERY_BROKER_USER = '****************'
CELERY_BROKER_PASSWORD = '***************************'
CELERY_TASK_DEFAULT_QUEUE = 'my-queue'
Versions:
I use the following version of Django and Celery:
Django==2.0.2
django-celery==3.2.2
celery==4.1.0
Thanks for your help!
A bit late, but maybe you are still interested. I got Celery running with Django and SQS and don't see any errors in your code. Maybe you missed something in the celery.py file? Here is my code for comparison.
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'djangoappname.settings')
# do not use namespace because default amqp broker would be called
app = Celery('lsaweb')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks()
Have you also checked if SQS is getting messages (try polling in the SQS administration area)?
I'm using Celery 4.0.0 with RabbitMQ as messages broker within a django 1.9 project, using django-celery-results for results backend. I'm new to Celery and RabbitMQ. The python version is 2.7.5.
After following the instructions in the Celery docs for configuring and using Celery with Django, and before adding any real tasks, I tried calling a simple task from the Django shell (manage.py shell), sending the debug_task as defined in the Celery docs.
The task is sent OK, and looking at the RabbitMQ queue, I can see that a new message has arrived in the correct queue on the correct virtual host.
I run the worker and it looks like it starts OK, then it reaches the event loop and does nothing. No error is shown, neither in the worker output nor in the RabbitMQ logs.
On the other hand, celery status on the same machine returns that there are no active nodes.
I'm probably missing something here, but I don't know what it can be.
Don't know if this is relevant, but when I use 'celery purge' to clear the messages queue, it finds the message and purges it.
Celery configuration settings as added to django settings.py:
CELERY_BROKER_URL = 'amqp://user1:passwd1@rabbithost:5672/exp'
CELERY_TIMEZONE = TIME_ZONE # Using django's TZ
CELERY_TASK_TRACK_STARTED = True
CELERY_RESULT_BACKEND = 'django-db'
Task invocation in django shell:
>>> from project.celery import debug_task
>>> debug_task
<@task: project.celery.debug_task of project:0x23cad10>
>>> r = debug_task.delay()
>>> r
<AsyncResult: 33031998-4cd8-4dfe-8e9d-bda9398525bb>
>>> r.status
u'PENDING'
Celery worker invocation:
% celery -A project worker -l info -Q celery
-------------- celery@super9 v4.0.0 (latentcall)
---- **** -----
--- * *** * -- Linux-3.10.0-327.4.5.el7.x86_64-x86_64-with-centos-7.2.1511-Core 2016-11-24 18:15:27
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: project:0x25931d0
- ** ---------- .> transport: amqp://user1:**@rabbithost:5672/exp
- ** ---------- .> results:
- *** --- * --- .> concurrency: 24 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. project.celery.debug_task
[2016-11-24 18:15:28,984: INFO/MainProcess] Connected to amqp://user1:**@rabbithost:5672/exp
[2016-11-24 18:15:29,009: INFO/MainProcess] mingle: searching for neighbors
[2016-11-24 18:15:30,035: INFO/MainProcess] mingle: all alone
/dir/project/devel/python/devel-1.9-centos7/lib/python2.7/site-packages/celery/fixups/django.py:202: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-11-24 18:15:30,072: WARNING/MainProcess] /dir/project/devel/python/devel-1.9-centos7/lib/python2.7/site-packages/celery/fixups/django.py:202: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-11-24 18:15:30,073: INFO/MainProcess] celery@super9 ready.
Checking rabbitmq queue:
% rabbitmqctl list_queues -p exp
Listing queues ...
celery 1
Celery status invocation while the worker is "ready":
% celery -A project status
Error: No nodes replied within time constraint.
Thanks.
I'm using Django/Celery Quickstart... or, how I learned to stop using cron and love celery, and it seems the jobs are getting queued, but never run.
tasks.py:
from celery.task.schedules import crontab
from celery.decorators import periodic_task

# this will run every minute, see http://celeryproject.org/docs/reference/celery.task.schedules.html#celery.task.schedules.crontab
@periodic_task(run_every=crontab(hour="*", minute="*", day_of_week="*"))
def test():
    print "firing test task"
So I run celery:
bash-3.2$ sudo manage.py celeryd -v 2 -B -s celery -E -l INFO
/scratch/software/python/lib/celery/apps/worker.py:166: RuntimeWarning: Running celeryd with superuser privileges is discouraged!
'Running celeryd with superuser privileges is discouraged!'))
-------------- celery@myserver v3.0.12 (Chiastic Slide)
---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: django://localhost//
- ** ---------- . app: default:0x12120290 (djcelery.loaders.DjangoLoader)
- ** ---------- . concurrency: 2 (processes)
- ** ---------- . events: ON
- ** ----------
- *** --- * --- [Queues]
-- ******* ---- . celery: exchange:celery(direct) binding:celery
--- ***** -----
[Tasks]
. GotPatch.tasks.test
[2012-12-12 11:58:37,118: INFO/Beat] Celerybeat: Starting...
[2012-12-12 11:58:37,163: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 11:58:37,249: WARNING/MainProcess] /scratch/software/python/lib/djcelery/loaders.py:132: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn("Using settings.DEBUG leads to a memory leak, never "
[2012-12-12 11:58:37,348: WARNING/MainProcess] celery@myserver ready.
[2012-12-12 11:58:37,352: INFO/MainProcess] consumer: Connected to django://localhost//.
[2012-12-12 11:58:37,700: INFO/MainProcess] child process calling self.run()
[2012-12-12 11:58:37,857: INFO/MainProcess] child process calling self.run()
[2012-12-12 11:59:00,229: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 12:00:00,017: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 12:01:00,020: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 12:02:00,024: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
The tasks are indeed getting queued:
python manage.py shell
>>> from kombu.transport.django.models import Message
>>> Message.objects.count()
234
And the count increases over time:
>>> Message.objects.count()
477
There are no lines in the log file that seem to indicate the task is being executed. I'm expecting something like:
[... INFO/MainProcess] Task myapp.tasks.test[39d57f82-fdd2-406a-ad5f-50b0e30a6492] succeeded in 0.00423407554626s: None
Any suggestions how to diagnose / debug this?
I'm new to celery as well, but from the comments on the link you provided, it looks like there was an error in the tutorial. One of the comments points out:
At this command
sudo ./manage.py celeryd -v 2 -B -s celery -E -l INFO
You must add "-I tasks" to load tasks.py file ...
Did you try that?
You should check that you specify the BROKER_URL parameter in Django's settings.py.
BROKER_URL = 'django://'
You should also check that the time zones in Django, MySQL and Celery are the same.
That helped me.
P.s.:
[... INFO/MainProcess] Task myapp.tasks.test[39d57f82-fdd2-406a-ad5f-50b0e30a6492] succeeded in 0.00423407554626s: None
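Note that a line like this is logged by the worker once a task has actually run. The "Scheduler: Sending due task" lines from Beat only mean that the task was scheduled (not executed!).
Please check your config; I hope this helps.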
I hope someone could learn from my experience in hacking this.
After setting everything up according to the tutorial I noticed that when I call
add.delay(4,5)
nothing happens. The worker did not receive the task (nothing was printed on stderr).
The problem was with the RabbitMQ installation. It turns out the default free disk space requirement is 1GB, which was way too much for my VM.
What put me on track was reading the RabbitMQ log file.
To find it I had to stop and start the RabbitMQ server:
sudo rabbitmqctl stop
sudo rabbitmq-server
RabbitMQ prints the log file location to the screen. In the file I noticed this:
=WARNING REPORT==== 14-Mar-2017::13:57:41 ===
disk resource limit alarm set on node rabbit@supporttip.
**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
I then followed the instructions here in order to reduce the free disk limit:
Rabbitmq ignores configuration on Ubuntu 12
As a baseline I used the config file from git
https://github.com/rabbitmq/rabbitmq-server/blob/stable/docs/rabbitmq.config.example
The change itself:
{disk_free_limit, "50MB"}
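Before changing the limit, it can help to compare the machine's free disk space with the threshold. The sketch below does that with Python's standard library. It is not how RabbitMQ itself measures disk space; the helper names are hypothetical, the 1GB and 50MB figures are the ones mentioned above, and reading them as decimal units is an assumption.

```python
import shutil


def free_disk_bytes(path="."):
    """Free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def would_trigger_alarm(free_bytes, limit_bytes):
    """True when free space is below the limit, i.e. publishers would block."""
    return free_bytes < limit_bytes


ONE_GB = 1_000_000_000  # limit described above as the default (decimal units assumed)
FIFTY_MB = 50_000_000   # reduced limit from the config change above

free = free_disk_bytes(".")
print("free bytes:", free)
print("alarm with 1GB limit:", would_trigger_alarm(free, ONE_GB))
print("alarm with 50MB limit:", would_trigger_alarm(free, FIFTY_MB))
```

On a small VM the first check may come back True while the second is False, which would match the behaviour described in this answer.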