DJCelery not storing task results in Django SQLite DB - django

DJCelery is not storing task results in my Django SQLite DB.
I have an existing Django project on which I have started setting up Celery with RabbitMQ. I started my RabbitMQ server. I can run Celery with python manage.py celeryd --verbosity=2 --loglevel=DEBUG and Celerybeat with python manage.py celerybeat --verbosity=2 --loglevel=DEBUG. Everything starts up without error and my periodic example task also runs without error.
I used pip install django-celery to install. I have djcelery in my installed apps and ran python manage.py migrate djcelery. I added CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend' to the end of my settings.py file.
When I run python manage.py celeryd --verbosity=2 --loglevel=DEBUG, the startup text shows:
...
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results:
- *** --- * --- .> concurrency: 1 (prefork)
...
The blank results line indicates to me that the configuration isn't right somehow, but I can't figure out how. I tried using app.conf.update in my celery.py file to set CELERY_RESULT_BACKEND, but got the same result. I tried leaving CELERY_RESULT_BACKEND out entirely, but that just defaulted to no result backend. I also tried putting 'database' instead of 'djcelery.backends.database:DatabaseBackend', but that indicated it was attempting to use SQLAlchemy instead of djcelery.
When I run python manage.py runserver I can see a DJCELERY section with tables Crontabs, Intervals, Periodic tasks, Tasks, and Workers. There isn't any data in my Tasks table though.
Can anyone point out what could be wrong or missing? Thank you for your time.

tutuDajuju led me in the right direction, but there's more to it, so I'll write it all up. I abandoned djcelery in favor of SQLAlchemy with a separate results database outside of Django.
Inside my venv I ran pip install sqlalchemy. I then put CELERY_RESULT_BACKEND = 'db+sqlite:///celery_results.sqlite3' in settings.py. This connected Celery to the new SQLite database to use for state/results.
Running celery -A <projectapp>.celery:app worker then showed the database in the startup message:
...
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: sqlite:///celery_results.sqlite3
- *** --- * --- .> concurrency: 1 (prefork)
...
At first I was worried because the database file wasn't created in my Django project dir. This was because I hadn't run a task yet. Once I ran my first task, the database and tables were created correctly.
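For anyone checking the same thing, kicking off a task and reading back the stored result looks roughly like this (a sketch only; my_example_task is a hypothetical task name, not one from this project):
from myapp.tasks import my_example_task  # hypothetical app and task

async_result = my_example_task.delay()   # send the task to the broker
print(async_result.id)                   # task id, used as the key in the result backend
print(async_result.get(timeout=10))      # blocks until the worker stores a result
print(async_result.status)               # 'SUCCESS' once the result is in the backend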
I verified task results were stored in the database by running a script:
from sqlalchemy import create_engine

engine = create_engine("sqlite:///celery_results.sqlite3")
connection = engine.connect()
result = connection.execute("select * from celery_taskmeta")
for row in result:
    print(row)
connection.close()
I found the table names with:
print(engine.table_names())
Hope this helps someone out.

The Celery docs mention a few different syntaxes; I'm not sure what you tried is valid. Try the following:
# use a connection string
CELERY_RESULT_BACKEND = 'db+sqlite:///foo.db'
Update:
As in your comment, the docs also mention it is possible to use the Django ORM/cache as a result backend. To do this, you must pass the setting you tried into your Celery app config:
app.conf.update(
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
)
Alternatively, the docs also explain:
If you have connected Celery to your Django settings then you can add this directly into your settings module (without the app.conf.update part)
This is a reference to the configuration of the Celery app detailed in the same page. This basically means that if you configured your celery app in a module, and you add the Django settings module as a configuration source for Celery, then setting CELERY_RESULT_BACKEND in your Django settings module, as you did, will also work.
file: proj/proj/celery.py
# important to pass the Django settings to your celery app
app = Celery('proj')
app.config_from_object('django.conf:settings')
file: proj/proj/settings.py
CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend'

Related

Django with Celery on Digital Ocean

The Objective
I am trying to use Celery in combination with Django. The objective is to set up Celery on a Django web application (deployed test environment) to send scheduled emails. The web application already sends emails. The ultimate objective is to add functionality to send out emails at a user-selected date and time. However, before we get there, the first step is to invoke the delay() function to prove that Celery is working.
Tutorials and Documentation Used
I am new to Celery and have been learning through the following resources:
First Steps With Celery-Django documentation: https://docs.celeryq.dev/en/stable/django/first-steps-with-django.html#using-celery-with-django
A YouTube video on sending email from Django through Celery via a Redis broker: https://www.youtube.com/watch?v=b-6mEAr1m-A
The Redis/Celery droplet was configured per the following tutorial https://www.digitalocean.com/community/tutorials/how-to-install-and-secure-redis-on-ubuntu-20-04
I have spent several days reviewing existing Stack Overflow questions on Django/Celery and have tried a number of suggestions. However, I have not found a question specifically describing this problem in the Django/Celery/Redis/Digital Ocean context. The current situation is described below.
What Is Currently Happening?
The current outcome, as of this post, is that the web application times out, suggesting that the Django app is not successfully connecting with Celery to send the email. Please note that towards the bottom of the post is the output of the Celery worker being successfully started manually from within the Django app's console, including a listing of the expected tasks.
The Stack In Use
Python 3.11 and Django 4.1.6: Running on the Digital Ocean App platform
Celery 5.2.7 and Redis 4.4.2 on Ubuntu 20.04: Running on a separate Digital Ocean Droplet
The Django project name is "Whurthy".
Celery Setup Code Snippets
The following snippets are primarily from the Celery-Django documentation: https://docs.celeryq.dev/en/stable/django/first-steps-with-django.html#using-celery-with-django
Whurthy/celery.py
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'Whurthy.settings')
app = Celery('Whurthy')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
    print(f'Request: {self.request!r}')
Whurthy/__init__.py
from .celery import app as celery_app
__all__ = ('celery_app',)
Application Specific Code Snippets
Whurthy/settings.py
CELERY_BROKER_URL = 'redis://SNIP_FOR_PRIVACY:6379'
CELERY_RESULT_BACKEND = 'redis://SNIP_FOR_PRIVACY:6379'
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT = 30 * 60
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = TIME_ZONE
I have replaced the actual IP with the string SNIP_FOR_PRIVACY for obvious reasons. However, if this were incorrect I would not get the output below.
I have also commented out the bind and requirepass redis configuration settings to support troubleshooting during development. This makes the URL as simple as possible and rules out either the incoming IP or password as being the cause of this problem.
events/tasks.py
from celery import shared_task
from django.core.mail import send_mail
@shared_task
def send_email_task():
    send_mail(
        'Celery Task Worked!',
        'This is proof the task worked!',
        'notifications@domain.com',
        ['my_email@domain.com'],
    )
    return
For privacy reasons I have changed the to and from email addresses. However, please note that this function works before adding .delay() to the following snippet. In other words, the Django app sends an email up until I add .delay() to invoke Celery.
events/views.py (extract)
from .tasks import send_email_task
from django.shortcuts import render
def home(request):
    send_email_task.delay()
    return render(request, 'home.html', context)
The above is just the relevant extract of a larger file to show the specific line of code calling the function. The Django web application is working until delay() is appended to the function call, and so I have not included other Django project file snippets.
Output from Running celery -A Whurthy worker -l info in the Digital Ocean Django App Console
Ultimately, I want to Dockerize this command, but for now I am running the above command manually. Below is the output within the Django App console, and it appears consistent with the tutorial and other examples of what a successfully configured Celery instance would look like.
<SNIP>
-------------- celery@whurthy-staging-b8bb94b5-xp62x v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- Linux-4.4.0-x86_64-with-glibc2.31 2023-02-05 11:51:24
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: Whurthy:0x7f92e54191b0
- ** ---------- .> transport: redis://SNIP_FOR_PRIVACY:6379//
- ** ---------- .> results: redis://SNIP_FOR_PRIVACY:6379/
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. Whurthy.celery.debug_task
. events.tasks.send_email_task
This appears to confirm that the Digital Ocean droplet is starting up a Celery worker successfully (suggesting that the code snippets above are correct) and that the Redis configuration is correct. The two tasks listed when starting Celery are consistent with expectations. However, I am clearly missing something, and cannot rule out that the way Digital Ocean runs droplets is getting in the way.
The baseline test is that the web application sends out an email through the function call. However, as soon as I add .delay() the web page request times out.
I have endeavoured to replicate all that is relevant. I welcome any suggestions to resolve this issue or constructive criticism to improve this question.
Troubleshooting Attempts
Attempt 1
Through the D.O. app console I ran python manage.py shell
I then entered the following into the shell:
>>> from events.tasks import send_email_task
>>> send_email_task
<@task: events.tasks.send_email_task of Whurthy at 0x7fb2f2348dc0>
>>> send_email_task.delay()
At this point the shell hangs/does not respond until I keyboard interrupt.
I then tried the following:
>>> send_email_task.apply()
<EagerResult: 90b7d92c-4f01-423b-a16f-f7a7c75a545c>
AND, the task sends an email!
So, the connection between Django-Redis-Celery appears to work. However, invoking delay() causes the web app to time out and the email to NOT be sent.
So either delay() isn't putting the task on the queue, or it is getting stuck. In either case, this does not appear to be a connection issue. However, because apply() runs the code in the caller's thread, it doesn't resolve the underlying problem.
That does suggest this may be an issue with the broker, which in turn may be an issue with the settings...
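A minimal sketch of the difference between the two calls, using the task from above:
from events.tasks import send_email_task

# apply() runs the task body synchronously in the calling process; it never
# touches the broker, so it working proves the task code, not the Redis link.
eager = send_email_task.apply()
print(eager.status)      # 'SUCCESS' immediately, no worker involved

# delay() serialises the task and pushes it to the broker; a worker has to
# pick it up, so a hang here points at the broker connection, not the task.
pending = send_email_task.delay()
print(pending.id)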
Made minor changes to broker settings in settings.py
CELERY_BROKER_URL = 'redis://SNIP_FOR_PRIVACY:6379/0'
CELERY_RESULT_BACKEND = 'redis://SNIP_FOR_PRIVACY:6379/1'
delay() still hangs in the shell.
Attempt 2
I discovered that on Digital Ocean the public IPv4 address does not work when used for the broker URL. By replacing it with the droplet's private IP in the CELERY_BROKER_URL setting, I was able to get delay() working within the Django app's shell.
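In settings.py that change is simply the following (the private address here is a hypothetical placeholder):
# Sketch: the droplet's private/VPC IP instead of the public IPv4
CELERY_BROKER_URL = 'redis://10.116.0.2:6379/0'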
However, while I can now get delay() working in the shell, returning to the original objective still fails. In other words, when loading the respective view, the web application hangs.
I am currently researching other approaches. Any suggestions are welcome. Given that I can now get Celery to work through the broker in the shell, but not in the web application, I feel like I have made some progress but am still without a solution.
As a side note, I am also trying to make this connection through a Digital Ocean Managed Redis DB, although that is presenting a completely different issue.
Ultimately, the answer I uncovered is a compromise: a workaround using a different Digital Ocean (D.O.) product. The workaround was to use a Managed Database (which simplifies things but gives you much less control) rather than a Droplet (which involves manual Linux/Redis installation and configuration, but gives you greater control). This isn't ideal for two reasons. First, it costs more ($15 base cost for the Managed Database vs $6 for a Droplet). Second, I would have preferred to work out how to set up Redis manually (and thus maintain greater control). However, I'll take a working solution over no solution for a very niche issue.
The steps to use a D.O. Managed Redis DB are:
1. Provision the managed Redis DB.
2. Use the Public Network Connection String (since the connection string includes the password, I store it in an environment variable).
3. Ensure that you have the appropriate SSL settings in the celery.py file (snippet below).
celery.py
import os
import ssl

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj_name.settings')

app = Celery(
    'proj_name',
    broker_use_ssl={'ssl_cert_reqs': ssl.CERT_NONE},
    redis_backend_use_ssl={'ssl_cert_reqs': ssl.CERT_NONE},
)
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print(f'Request: {self.request!r}')
settings.py
REDIS_URI = os.environ.get('REDIS_URI')
CELERY_BROKER_URL = f'{REDIS_URI}/0'
CELERY_RESULT_BACKEND = f'{REDIS_URI}/1'
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT = 30 * 60
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = TIME_ZONE
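For reference, the Public Network Connection String stored in REDIS_URI is a TLS rediss:// URL, which is why the ssl_cert_reqs options in celery.py are needed. A sketch of what reading it looks like, with a hypothetical placeholder value (the real string comes from the D.O. control panel):
import os

# Placeholder default; never hard-code the real connection string.
REDIS_URI = os.environ.get(
    'REDIS_URI',
    'rediss://default:<password>@<your-redis-host>:<port>',
)
CELERY_BROKER_URL = f'{REDIS_URI}/0'
CELERY_RESULT_BACKEND = f'{REDIS_URI}/1'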

Celery app ignoring beat_schedule option specified in settings.py file

I am trying to extend my Django app with Celery crontab functionality. For this purpose I created a celery.py file where I put the code as described in the official documentation.
Here is the code from project/project/celery.py:
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

app = Celery('project')
app.config_from_object('django.conf:settings', namespace='CELERY')
Then, inside my project/settings.py file, I specify the Celery-related configs as follows:
CELERY_TIMEZONE = "Europe/Moscow"
CELERYBEAT_SHEDULE = {
'test_beat_tasks':{
'task':'webhooks.tasks.adding',
'schedule':crontab(minute='*/1),
},
}
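For completeness, the task referenced above might look something like this (a sketch only; the actual webhooks/tasks.py is not shown in the question):
# webhooks/tasks.py -- hypothetical sketch of the task named in the schedule
from celery import shared_task

@shared_task
def adding(x=1, y=1):
    # Printing makes the run easy to spot in the worker/beat terminal output.
    print(f'adding {x} + {y} = {x + y}')
    return x + y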
Then I run the worker and celery beat in the same terminal with
celery -A project worker -B
But nothing happens; I don't see my celery beat task printing any output, although I expected my task webhooks.tasks.adding to execute.
Then I decided to check that the Celery configs were applied. To do this, I opened python manage.py shell and examined the celery.app.conf object:
# I imported app from the project.celery module
from project import celery
# then examined the app's config
celery.app.conf
Inside the huge output of the Celery config I saw that the timezone is set to None.
As I understand it, my problem is that the app initiated in project/celery.py is ignoring the CELERY_TIMEZONE and CELERY_BEAT_SCHEDULE configs in my project/settings.py, but why? What am I doing wrong? Please guide me.
After spending so much time researching this problem, I found that my mistake was in how I ran the worker and celery beat. Running the worker the way I did, it wouldn't show the task executing in the terminal. To see whether the task is executing, I should run it as follows: celery -A project worker -B -l INFO, or, if you want more detailed output, use DEBUG instead of INFO. Hope this helps someone.

Django celery daemon gives 'supervisor FATAL can't find command', but path is correct

Overview:
I'm trying to run Celery as a daemon for tasks to send emails. It worked fine in development, but not in production. I have my website up now, and every function works fine (no Django errors), but the tasks aren't going through because the daemon isn't set up properly, and I get this error on Ubuntu 16.04:
project_celery FATAL can't find command '/home/my_user/myvenv/bin/celery'
Installed programs / hardware, and what I've done so far:
I'm using Django 2.0.5, Python 3.5, Ubuntu 16.04, RabbitMQ, and Celery, all on a VPS. I'm using a venv for it all. I've installed supervisor too, and it's running when I check with sudo service --status-all because it has a + next to it. Erlang is also installed, and when I check with top, RabbitMQ is running. sudo service rabbitmq-server status shows RabbitMQ is active too.
Originally, I followed the directions on the Celery website, but they were very confusing and I couldn't get it to work after ~40 hours of testing/reading/watching other people's solutions. Feeling very aggravated and defeated, I chose the directions here to get the daemon set up in the hope of getting somewhere, and I have gotten further, but I get the error above.
I read through the supervisor documentation, checked the process states and program settings to try to debug the problem, and I'm lost, because my paths are correct as far as I can tell, according to the documentation.
Here's my file structure stripped down:
home/
    my_user/                      # is a superuser
        portfolio-project/
            project/
                __init__.py
                celery.py
                settings.py       # this file is in here too
            app_1/
            app_2/
            ...
        ...
        logs/
            celery.log
        myvenv/
            bin/
                celery            # executable file, is colored green
    celery_user_nobody/           # not a superuser, but created for celery tasks
etc/
    supervisor/
        conf.d/
            project_celery.conf
Here is my project_celery.conf:
[program:project_celery]
command=/home/my_user/myvenv/bin/celery worker -A project --loglevel=INFO
directory=/home/my_user/portfolio-project/project
user=celery_user_nobody
numprocs=1
stdout_logfile=/home/my_user/logs/celery.log
stderr_logfile=/home/my_user/logs/celery.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs = 600
stopasgroup=true
priority=1000
Here's my __init__.py:
from __future__ import absolute_import, unicode_literals
from .celery import app as celery_app
__all__ = ['celery_app']
And here's my celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')
app = Celery('project')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
UPDATE: Here is my settings.py:
This is the only Celery setting I have, because the Django example on the Celery website shows nothing more unless you use something like Redis. I put this in my settings.py file because the Django instructions say you can:
CELERY_BROKER_URL = 'amqp://localhost'
UPDATE: I created the rabbitmq user:
$ sudo rabbitmqctl add_user rabbit_user1 mypassword
$ sudo rabbitmqctl add_vhost myvhost
$ sudo rabbitmqctl set_user_tags rabbit_user1 mytag
$ sudo rabbitmqctl set_permissions -p myvhost rabbit_user1 ".*" ".*" ".*"
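(For what it's worth: if the worker is meant to connect as this new user rather than as guest, the broker URL has to carry the credentials and vhost. A sketch of that setting, built only from the values in the commands above:)
# settings.py -- sketch only, using the user and vhost created above
CELERY_BROKER_URL = 'amqp://rabbit_user1:mypassword@localhost:5672/myvhost'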
And when I do sudo rabbitmqctl status, I get Status of node 'rabbit@django2-portfolio', but oddly, I don't see any running nodes like the following, even though the directions here show that I should:
{nodes,[rabbit@myhost]},
{running_nodes,[rabbit@myhost]}]
Steps I followed:
1. I created the .conf and .log files in the places I said.
2. sudo systemctl enable supervisor
3. sudo systemctl start supervisor
4. sudo supervisorctl reread
5. sudo supervisorctl update # no errors up to this point
6. sudo supervisorctl status
And after 6 I get this error:
project_celery FATAL can't find command '/home/my_user/myvenv/bin/celery'
UPDATE: I checked the error logs, and I have multiple instances of the following in /var/log/rabbitmq/rabbit@django2-portfolio.log:
=INFO REPORT==== 9-Aug-2018::18:26:58 ===
connection <0.690.0> (127.0.0.1:42452 -> 127.0.0.1:5672): user 'guest' authenticated and granted access to vhost '/'
=ERROR REPORT==== 9-Aug-2018::18:29:58 ===
closing AMQP connection <0.687.0> (127.0.0.1:42450 -> 127.0.0.1:5672):
missed heartbeats from client, timeout: 60s
Closing statement:
Anyone have any idea what's going on? When I look at the absolute paths in my project_celery.conf file, everything looks set correctly, but something's obviously wrong. Looking things over more, rabbitmq says no nodes are running when I do sudo rabbitmqctl status, but celery does when I do celery status (it shows 1 node online: OK).
Any help would be greatly appreciated. I even made this account specifically because I had this problem. It's driving me mad. And if anyone needs any more info, please ask. This is my first time deploying anything, so I'm not a pro.
Can you try either of the following in your project_celery.conf?
command=/home/my_user/myvenv/bin/celery worker -A celery --loglevel=INFO
directory=/home/my_user/portfolio-project/project
or
command=/home/my_user/myvenv/bin/celery worker -A project.celery --loglevel=INFO
directory=/home/my_user/portfolio-project/
Additionally, in celery.py can you add the parent folder of the project module to sys.path (or make sure that you've packaged your deploy properly and have installed it via pip or otherwise)?
I suspect (from your comments with @Jack Shedd) that you're referring to a non-existent project, due to where directory is set relative to the magic celery.py file.
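A sketch of the sys.path suggestion, assuming the directory layout shown in the question (celery.py at /home/my_user/portfolio-project/project/celery.py):
# At the top of project/celery.py -- sketch only.
import os
import sys

# Prepend /home/my_user/portfolio-project (the parent of the `project` package)
# so `project.settings` is importable regardless of supervisor's working directory.
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))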

Run celery with Django start

I am using Django 1.11 and Celery 4.0.2.
We are using a PaaS (OpenShift 3) which runs over kubernetes - Dockers.
I am using a Python image; it only knows how to run one command on start (and it watches the exit code, restarting the process if it fails).
How can I run a Celery worker at the same time as Django, so that a failure of either one kills both processes (worker and Django)?
I am using WSGI and gevent to start Django.
Thank you!
You could use Circus (supervisord is an alternative, but it doesn't support Python 3 currently).
In circus you create a circus.ini in your project directory.
Something like:
[watcher:celery]
working_dir = /var/www/your_app
virtualenv = virtualenv
cmd = celery
args = worker --app=your_app --loglevel=DEBUG -E
[watcher:django]
working_dir = /var/www/your_app
virtualenv = virtualenv
cmd = python
args = manage.py runserver
Then you start both with:
virtualenv/bin/circusd circus.ini
It should start both processes. I think this is a good way to create a "start" plan for your project. Maybe you want to add celerybeat or use Channels (WebSockets in Django); then you can just add a new watcher to your circus.ini. It's pretty dynamic.

How to properly configure djcelery results backend to database

I'm trying to set up django-celery to store task results in the database.
I set:
CELERY_RESULT_BACKEND = 'djcelery.backends.database.DatabaseBackend'
then I synced and migrated the db (no errors).
Celery is working and tasks get processed (I can get the results), but the admin shows no tasks. In the database there are two tables, celery_taskmeta and djcelery_taskmeta. The first one holds the results and the second one is displayed in the admin. Does anyone have insight into how to configure this properly?
Check the docs: when you use djcelery, set CELERY_RESULT_BACKEND = "database", or don't even bother writing this line, because djcelery sets it by default.
The result is stored in the celery_taskmeta table; you have to register djcelery.models.TaskMeta with the admin yourself:
# in some admin.py contained by an app listed after `djcelery` in `INSTALLED_APPS`,
# or directly in djcelery/admin.py
from django.contrib import admin

from djcelery.models import TaskMeta


class TaskMetaAdmin(admin.ModelAdmin):
    readonly_fields = ('result',)

admin.site.register(TaskMeta, TaskMetaAdmin)
Related question with right answer is here.
You should actually run
python manage.py celery worker -E
and
python manage.py celerycam
After that, task results will be displayed in the admin (Djcelery › Tasks).
Moving the config update, e.g.
app.conf.update(CELERY_RESULT_BACKEND='djcelery.backends.database.DatabaseBackend')
to the end of the celery.py file did the trick for me.
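In context, that ordering looks roughly like this (a sketch of celery.py with the override applied after the other config sources, so nothing overwrites it):
# celery.py -- sketch; the explicit backend override goes last
from celery import Celery

app = Celery('proj')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks()

# Applied after config_from_object so it takes precedence.
app.conf.update(CELERY_RESULT_BACKEND='djcelery.backends.database.DatabaseBackend')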