Celery tasks not returning results from redis - flask

How can I get the results of a task from Redis? I want the query results that the task returned, not the status of the task.
Log files verify that the task returns the results. According to these docs, get() can be used to return the task results, and according to these docs it should work. Something tells me I am not actually saving the results to the Redis backend.
Expected Behavior:
Run the task every 24 hours and store the db query results in redis. Use the redis cache to get those results on application calls.
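(For reference, running a task every 24 hours is usually wired up with a Celery beat schedule; below is a minimal sketch, assuming a Celery app instance named celery_app, which is an assumed name rather than code from this project. The task name comes from the decorator in the task function shown below.)
from celery.schedules import crontab

# Run the registered task once a day at midnight (task name taken from
# the @shared_task(name='get_top_ten_gainers') decorator; the app
# object name `celery_app` is assumed).
celery_app.conf.beat_schedule = {
    'get-top-ten-gainers-daily': {
        'task': 'get_top_ten_gainers',
        'schedule': crontab(hour=0, minute=0),
    },
}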
Here is my task function.
@shared_task(name='get_top_ten_gainers', ignore_result=False)
def get_top_ten_gainers():
    from collections import namedtuple
    query = db_session.execute(
        """WITH p AS (
            SELECT CompanyId,
                   100 * (MAX(CASE WHEN rn = 1 THEN CloseAdjusted END) / MAX(CASE WHEN rn = 2 THEN CloseAdjusted END) - 1) DayGain
            FROM (
                SELECT *, ROW_NUMBER() OVER (PARTITION BY CompanyId ORDER BY Date DESC) rn
                FROM DimCompanyPrice
            )
            WHERE rn <= 2
            GROUP BY CompanyId
            ORDER BY DayGain DESC
            LIMIT 10
        )
        SELECT *
        FROM p
        JOIN Company ON p.CompanyID = Company.ID"""
    )
    logger.info(query)
    Gain = namedtuple('Gain', query.keys())
    gains = [Gain(*q) for q in query.fetchall()]
    payload = [[g.Symbol, g.Security, g.DayGain] for g in gains]
    logger.info("`payload` of type {}: {}".format(type(payload), payload))
    return payload
Here is my flask view helper function:
def _top_ten_gainers():
    t0 = time.perf_counter()
    from project.tasks import get_top_ten_gainers
    result = get_top_ten_gainers.apply_async()
    logger.info("`result` of type {}: {}".format(type(result), result))
    logger.info("`result.get()` of type {}: {}".format(type(result.get()), result.get()))
    total_payload['gainers'] = result.get()
    logger.info('task finished in {}'.format(time.perf_counter() - t0))
Here is the Celery output:
-------------- celery@desktop v5.1.2 (sun-harmonics)
--- ***** -----
-- ******* ---- Linux-5.11.0-41-generic-x86_64-with-glibc2.29 2021-12-02 18:07:16
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: default:0x7fb6f8bb5b80 (.default.Loader)
- ** ---------- .> transport: redis://127.0.0.1:6379/0
- ** ---------- .> results: redis://127.0.0.1:6379/0
- *** --- * --- .> concurrency: 16 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[2021-12-02 18:07:17,276: INFO/MainProcess] Connected to redis://127.0.0.1:6379/0
[2021-12-02 18:07:17,287: INFO/MainProcess] mingle: searching for neighbors
[2021-12-02 18:07:18,305: INFO/MainProcess] mingle: all alone
[2021-12-02 18:07:18,320: INFO/MainProcess] celery@desktop ready.
[2021-12-09 15:08:01,235: INFO/MainProcess] Task get_top_ten_gainers[26ce0da8-662c-4690-b95e-9d1e42e79e34] received
[2021-12-09 15:08:05,068: INFO/ForkPoolWorker-15] <sqlalchemy.engine.cursor.CursorResult object at 0x7f40c876e970>
[2021-12-09 15:08:05,087: INFO/ForkPoolWorker-15] `payload` of type <class 'list'>: [['MA', 'Mastercard', 3.856085884230387], ['GM', 'General Motors', 3.194377894904976], ['LH', 'LabCorp', 2.6360513469535274], ['CCI', 'Crown Castle', 2.491537650518838], ['DHI', 'D. R. Horton', 2.451821208757954], ['PEAK', 'Healthpeak Properties', 2.3698069046225845], ['BDX', 'Becton Dickinson', 2.355495473352187], ['DD', 'DuPont', 2.19100399536023], ['HLT', 'Hilton Worldwide', 1.9683928319458088], ['JKHY', 'Jack Henry & Associates', 1.7102615694164935]]
[2021-12-09 15:08:05,090: INFO/ForkPoolWorker-15] Task get_top_ten_gainers[26ce0da8-662c-4690-b95e-9d1e42e79e34] succeeded in 3.8531467270004214s: None
Here is my log file output:
[2021-12-09 15:08:01,216][index ][INFO ] `result` of type <class 'celery.result.AsyncResult'>: 26ce0da8-662c-4690-b95e-9d1e42e79e34
[2021-12-09 15:08:05,068][tasks ][INFO ] <sqlalchemy.engine.cursor.CursorResult object at 0x7f40c876e970>
[2021-12-09 15:08:05,087][tasks ][INFO ] `payload` of type <class 'list'>: [['MA', 'Mastercard', 3.856085884230387], ['GM', 'General Motors', 3.194377894904976], ['LH', 'LabCorp', 2.6360513469535274], ['CCI', 'Crown Castle', 2.491537650518838], ['DHI', 'D. R. Horton', 2.451821208757954], ['PEAK', 'Healthpeak Properties', 2.3698069046225845], ['BDX', 'Becton Dickinson', 2.355495473352187], ['DD', 'DuPont', 2.19100399536023], ['HLT', 'Hilton Worldwide', 1.9683928319458088], ['JKHY', 'Jack Henry & Associates', 1.7102615694164935]]
[2021-12-09 15:08:05,090][index ][INFO ] `result.get()` of type <class 'NoneType'>: None

It looks like your task takes ~4 seconds and you're not waiting for it to finish.
You can wait for it like this:
from time import sleep

def _top_ten_gainers():
    from project.tasks import get_top_ten_gainers
    result = get_top_ten_gainers.delay()
    while not result.ready():
        sleep(1)
    gains = result.get()
    logger.info("Task results: {}".format(gains))
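Since the stated goal is to reuse results stored in the Redis backend across application calls, a later request can also fetch a previously stored result by task id instead of re-running the task. A minimal sketch, assuming the Celery app is importable as celery_app and the task id was saved somewhere; both names are assumptions, not code from the question:
from celery.result import AsyncResult

def fetch_top_ten_gainers(task_id):
    # Look up the value stored in the Redis result backend for an
    # already-submitted task; raises TimeoutError if it is not ready
    # within 5 seconds.
    result = AsyncResult(task_id, app=celery_app)
    return result.get(timeout=5)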

Related

Celery receives tasks from RabbitMQ, but does not execute them

I have a Django project and have set up Celery + RabbitMQ to do heavy tasks asynchronously. When I call the task, the RabbitMQ admin shows the task and Celery prints that the task is received, but the task is not executed.
Here is the task's code:
@app.task
def dummy_task():
    print("I'm Here")
    User.objects.create(username="User1")
    return "User1 Created!"
In this view I send the task to celery:
def task_view(request):
    result = dummy_task.delay()
    return render(request, 'display_progress.html', context={'task_id': result.task_id})
I run celery with this command:
$ celery -A proj worker -l info --concurrency=2 --without-gossip
This is the output of running Celery:
-------------- celery@DESKTOP-8CHJOEG v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- Windows-10-10.0.19044-SP0 2022-08-22 10:10:04
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: proj:0x23322847880
- ** ---------- .> transport: amqp://navid:**@localhost:5672//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. proj.celery.debug_task
. entitymatching.tasks.create_and_learn_machine
. entitymatching.tasks.dummy_task
[2022-08-22 10:10:04,068: INFO/MainProcess] Connected to amqp://navid:**@127.0.0.1:5672//
[2022-08-22 10:10:04,096: INFO/MainProcess] mingle: searching for neighbors
[2022-08-22 10:10:04,334: INFO/SpawnPoolWorker-1] child process 6864 calling self.run()
[2022-08-22 10:10:04,335: INFO/SpawnPoolWorker-2] child process 12420 calling self.run()
[2022-08-22 10:10:05,134: INFO/MainProcess] mingle: all alone
[2022-08-22 10:10:05,142: WARNING/MainProcess] C:\Users\Navid\PycharmProjects\proj\venv\lib\site-packages\celery\fixups\django.py:203: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments! warnings.warn('''Using settings.DEBUG leads to a memory
[2022-08-22 10:10:05,142: INFO/MainProcess] celery@DESKTOP-8CHJOEG ready.
[2022-08-22 10:10:05,143: INFO/MainProcess] Task entitymatching.tasks.dummy_task[97f8a2eb-0006-4d53-ba6a-7b9f8649c84a] received
[2022-08-22 10:10:05,144: INFO/MainProcess] Task entitymatching.tasks.dummy_task[17190479-0784-46b1-8dc6-870ead41e9c6] received
[2022-08-22 10:11:36,384: INFO/MainProcess] Task proj.celery.debug_task[af3d633f-7b9a-4441-b375-9ce217a40ab3] received
But "I'm Here" is not printed, and User1 is not created.
RabbitMQ shows that there are 3 "unack" messages in the queue:
You did not provide enough info, but I think you have a problem with your worker pool.
Try adding
--pool=solo
at the end of your run command, so it becomes:
celery -A proj worker -l info --concurrency=2 --without-gossip --pool=solo
In production, though, I recommend using gevent as your pool:
celery -A proj worker -l info --concurrency=2 --without-gossip --pool=gevent
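The pool can also be set in configuration rather than on the command line; a minimal sketch, assuming the app object defined in the project's celery module:
# Equivalent to passing --pool=solo on the command line;
# worker_pool is a standard Celery setting.
app.conf.worker_pool = 'solo'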

How to run parallel tasks with Celery and Django?

I am looking to run tasks in parallel with django celery.
Let's say the following task:
@shared_task(bind=True)
def loop_task(self):
    for i in range(10):
        time.sleep(1)
        print(i)
    return "done"
Each time a view is loaded, this task must be executed:
def view(request):
    loop_task.delay()
My problem is that I want to run this task multiple times in parallel, without a queueing system. Each time a user goes to a view, there should be no queue making them wait for a previous task to finish.
Here is the celery command I use :
celery -A toolbox.celery worker --pool=solo -l info -n my_worker1
-------------- celery@my_worker1 v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- Windows-10-10.0.22000-SP0 2022-08-01 10:22:52
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: toolbox:0x1fefe7286a0
- ** ---------- .> transport: redis://127.0.0.1:6379//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 8 (solo)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
I have already tried the solutions found here, but none of them seem to do what I ask: StackOverflow: Executing two tasks at the same time with Celery
I should have the following output:
0,1,2,...,9
If two users load the same page at the same time, we should see the previous output appear twice, interleaved.
Result :
0,0,1,1,2,2,...,9,9
I think this is simple to solve, but you will need to test it.
Basically, you need to run the task in async mode. For example, when you want to run a task that sends mass SMS to multiple users, you do it this way:
send_mass_sms.apply_async(
    [
        phone_numbers,
        instance.body,
        instance.id,
    ],
    eta=instance.when,
)
Your code needs to be fixed this way:
def view(request):
    loop_task.apply_async()
If you need to update data on the website, you can store the data in models and poll with AJAX, or implement the logic via websockets, but that is a topic for another question :)
You may also need to start multiple workers, but even that does not guarantee that all tasks run in parallel.
Some tasks may still sit unreceived in the queue; it depends on the number of workers and the speed of execution.
And if the result is always the same, you can store it in a cache, as sketched below.
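A minimal sketch of that caching idea, assuming Django's cache framework is configured; the cache key and timeout are arbitrary choices:
import time
from celery import shared_task
from django.core.cache import cache

@shared_task(bind=True)
def loop_task(self):
    # Reuse a previously computed result if it is still cached.
    cached = cache.get('loop_task_result')
    if cached is not None:
        return cached
    for i in range(10):
        time.sleep(1)
        print(i)
    # Cache the result for 5 minutes so identical runs can skip the loop.
    cache.set('loop_task_result', 'done', timeout=300)
    return 'done'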

Can AWS SQS be Used in a Django Project with django-celery-results Backend?

Pre-warning: there is A LOT I don't understand
My Requirement
I need to be able to get the result of a celery task. I need the status to change to 'SUCCESS' when completed successfully.
For example:
I need to be able to get the result of x + y after executing add.delay(1,2) on the task below.
myapp/tasks.py
from celery import shared_task
from time import sleep
@shared_task
def add(x, y):
    sleep(10)
    return x + y
Is AWS SQS the right tool for my needs?
I read Celery's Using Amazon SQS docs and understand that, at the bottom, they say this about results:
Results
Multiple products in the Amazon Web Services family could be
a good candidate to store or publish results with, but there’s no such
result backend included at this point.
Question:
Does this mean django-celery-results can't be used with AWS SQS?
More Context Below
What am I doing when I run this?
I look at my AWS queue (shows messages available as 3)
In my local terminal, I do celery -A ebdjango worker --loglevel=INFO (see celery output below)
In my PyCharm Python console connected to my Django project, I do r = add.delay(1,2)
r is an AsyncResult object:
>>> r = add.delay(1,2)
>>> r
<AsyncResult: b69c4287-5c82-4873-aa8c-227547511233>
In AWS, my "Messages available" went from 3 to 4
Locally, in my terminal, nothing happened (I expect SQS to send the message back to me locally? Is this wrong?)
I inspect r and see this:
>>> r.id
'b69c4287-5c82-4873-aa8c-227547511233'
>>> r.status
'PENDING'
>>> r.result
>>> type(r.result)
<class 'NoneType'>
ebdjango/settings.py
...
AWS_ACCESS_KEY_ID = "XXXXXXXXXXXXXXXXXXX"
AWS_SECRET_ACCESS_KEY = "YYYYYYYYYYYYYYYYYYYYYYYYYYY"
CELERY_BROKER_URL = "sqs://"
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'us-west-2',
    'visibility_timeout': 3600,
    'predefined_queues': {
        'eb-celery-queue': {
            'url': 'https://sqs.us-west-2.amazonaws.com/12345678910/eb-celery-queue',
            'access_key_id': AWS_ACCESS_KEY_ID,
            'secret_access_key': AWS_SECRET_ACCESS_KEY,
        }
    }
}
CELERY_SEND_EVENTS = False
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_TASK_DEFAULT_QUEUE = 'eb-celery-queue'
CELERY_WORKER_CONCURRENCY = 1
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_CONTENT_ENCODING = 'utf-8'
CELERY_RESULT_BACKEND = 'django-db'  # <-- Note: I have django-celery-results installed and set
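With the django-db backend, results are written to django-celery-results' TaskResult model, so a stored result can also be checked through the ORM; a minimal sketch, using the task id from the console session above:
from django_celery_results.models import TaskResult

# Fetch the stored row for a given Celery task id; status should read
# 'SUCCESS' and result should hold the serialized return value once the
# worker has finished and written it to the Django database.
row = TaskResult.objects.get(task_id='b69c4287-5c82-4873-aa8c-227547511233')
print(row.status, row.result)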
Celery output at start:
(eb-virt) C:\Users\Jarad\Documents\PyCharm\DEVOPS\ebdjango>celery -A ebdjango worker --loglevel=INFO
[2021-08-27 14:35:31,914: WARNING/MainProcess] No hostname was supplied. Reverting to default 'None'
-------------- celery@Inspiron v5.1.2 (sun-harmonics)
--- ***** -----
-- ******* ---- Windows-10-10.0.19041-SP0 2021-08-27 14:35:31
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: ebdjango:0x1d64e4d4630
- ** ---------- .> transport: sqs://localhost//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> eb-celery-queue exchange=eb-celery-queue(direct) key=eb-celery-queue
[tasks]
. ebdjango.celery.debug_task
. homepage.tasks.add
. homepage.tasks.count_widgets
. homepage.tasks.cu
. homepage.tasks.mul
. homepage.tasks.rename_widget
. homepage.tasks.xsum
[2021-08-27 14:35:31,981: WARNING/MainProcess] No hostname was supplied. Reverting to default 'None'
[2021-08-27 14:35:31,981: INFO/MainProcess] Connected to sqs://localhost//
[2021-08-27 14:35:32,306: WARNING/MainProcess] ...
[2021-08-27 14:35:32,307: INFO/MainProcess] celery@Inspiron ready.
I do notice that 1) the results: section shows empty (as if it's not defined) and 2) task events are OFF, which might be because task events aren't supported for SQS, but I don't know for certain. Setting CELERY_SEND_EVENTS seems to have no effect on the task events output here.

Celery worker does not consume messages

I'm using Celery 4.0.0 with RabbitMQ as messages broker within a django 1.9 project, using django-celery-results for results backend. I'm new to Celery and RabbitMQ. The python version is 2.7.5.
After following the instructions in the Celery docs for configuring and using Celery with Django, and before adding any real tasks, I tried a simple task call using the Django shell (manage.py shell), sending the debug_task as defined in the Celery docs.
The task is sent OK, and looking at the RabbitMQ queue, I can see a new message has arrived in the correct queue on the correct virtual host.
I run the worker and it looks like it starts OK, then it reaches the event loop and does nothing. No error is presented, neither in the worker output nor in the RabbitMQ logs.
On the other hand, celery status on the same machine reports that there are no active nodes.
I'm probably missing something here, but I don't know what it can be.
I don't know if this is relevant, but when I use celery purge to clear the message queue, it finds the message and purges it.
Celery configuration settings as added to django settings.py:
CELERY_BROKER_URL = 'amqp://user1:passwd1@rabbithost:5672/exp'
CELERY_TIMEZONE = TIME_ZONE # Using django's TZ
CELERY_TASK_TRACK_STARTED = True
CELERY_RESULT_BACKEND = 'django-db'
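For context, the Django integration described in those docs normally lives in a project-level celery module like the sketch below; the module path and settings module are assumed from the project name, not taken from the question:
# project/celery.py -- the standard layout from the Celery/Django docs
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

app = Celery('project')
# Read all CELERY_* settings from Django's settings.py.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Find tasks.py modules in installed Django apps.
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))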
Task invocation in django shell:
>>> from project.celery import debug_task
>>> debug_task
<@task: project.celery.debug_task of project:0x23cad10>
>>> r = debug_task.delay()
>>> r
<AsyncResult: 33031998-4cd8-4dfe-8e9d-bda9398525bb>
>>> r.status
u'PENDING'
Celery worker invocation:
% celery -A project worker -l info -Q celery
-------------- celery@super9 v4.0.0 (latentcall)
---- **** -----
--- * *** * -- Linux-3.10.0-327.4.5.el7.x86_64-x86_64-with-centos-7.2.1511-Core 2016-11-24 18:15:27
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: project:0x25931d0
- ** ---------- .> transport: amqp://user1:**@rabbithost:5672/exp
- ** ---------- .> results:
- *** --- * --- .> concurrency: 24 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. project.celery.debug_task
[2016-11-24 18:15:28,984: INFO/MainProcess] Connected to amqp://user1:**@rabbithost:5672/exp
[2016-11-24 18:15:29,009: INFO/MainProcess] mingle: searching for neighbors
[2016-11-24 18:15:30,035: INFO/MainProcess] mingle: all alone
/dir/project/devel/python/devel-1.9-centos7/lib/python2.7/site-packages/celery/fixups/django.py:202: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-11-24 18:15:30,072: WARNING/MainProcess] /dir/project/devel/python/devel-1.9-centos7/lib/python2.7/site-packages/celery/fixups/django.py:202: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn('Using settings.DEBUG leads to a memory leak, never '
[2016-11-24 18:15:30,073: INFO/MainProcess] celery@super9 ready.
Checking rabbitmq queue:
% rabbitmqctl list_queues -p exp
Listing queues ...
celery 1
Celery status invocation while the worker is "ready":
% celery -A project status
Error: No nodes replied within time constraint.
Thanks.

django-celery consumer can not receive tasks

Installation
I am using Django (1.4) and Celery (3.0.13) with RabbitMQ (v3.0.4); the backend DB is SQLite.
Celery was installed by pip install django-celery
Setting
In setting.py:
# For django-celery
import djcelery
djcelery.setup_loader()
BROKER_URL = 'amqp://user:pwd@sd5:5672/8086'
### and adding 'djcelery' to INSTALLED_APPS
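For context, a djcelery-era task such as utils.weixin.tasks.celery_add would typically be declared as in the sketch below; this is an assumed example, not the question's actual code:
# utils/weixin/tasks.py (assumed layout)
from celery import task

@task()
def celery_add(x, y):
    # Trivial task used to verify that the worker consumes messages.
    return x + y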
Running
After setting up the database with South, I start rabbitmq-server and manage.py celery worker --loglevel=debug
I could see the connection was established:
-------------- celery@sd5 v3.0.16 (Chiastic Slide)
---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: amqp://utils@sd5:5672/8086
- ** ---------- . app: default:0x8a5106c (djcelery.loaders.DjangoLoader)
- ** ---------- . concurrency: 2 (processes)
- ** ---------- . events: OFF (enable -E to monitor this worker)
- ** ----------
- *** --- * --- [Queues]
-- ******* ---- . celery: exchange:celery(direct) binding:celery
--- ***** -----
[Tasks]
. celery.backend_cleanup
. celery.chain
. celery.chord
. celery.chord_unlock
. celery.chunks
. celery.group
. celery.map
. celery.starmap
. utils.weixin.tasks.celery_add
[2013-03-19 19:50:00,460: WARNING/MainProcess] celery@sd5 ready.
[2013-03-19 19:50:00,483: INFO/MainProcess] consumer: Connected to amqp://utils@sd5:5672/8086.
[2013-03-19 19:50:00,498: DEBUG/MainProcess] consumer: Ready to accept tasks!
And in rabbit#sd5.log:
=INFO REPORT==== 19-Mar-2013::19:50:00 ===
accepting AMQP connection <0.1655.0> (127.0.0.1:50087 -> 127.0.0.1:5672)
Problem
Then I run my task utils.weixin.tasks.celery_add in manage.py shell:
>>> from utils.weixin.tasks import celery_add
>>> result = celery_add.delay(1,3)
>>> result.ready()
False
>>> result.get()
hangs here forever...
And nothing shows up in the Celery worker log or the RabbitMQ log, not even a 'received task' entry.
It seems that calling the task does not communicate with the worker.
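While debugging, calling get() with a timeout avoids hanging forever; it raises an exception instead if no result ever arrives. A minimal sketch:
from celery.exceptions import TimeoutError

try:
    # Wait at most 10 seconds for the worker to publish a result.
    print(result.get(timeout=10))
except TimeoutError:
    print('No result after 10s - the task was probably never consumed.')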
Question
What should I do to find out what I have done incorrectly? How should I fix this?
Appreciated!