I am trying to use Django's post_save signal in combination with Celery tasks. After a new Message object is saved to the database, I want to check whether the instance has one of two attribute values and, if it does, call send_sms_message, which is a registered Celery task.
tasks.py
from my_project.celery import app
@app.task
def send_sms_message(message):
    # Do something
    pass
signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver
import rollbar
rollbar.init('234...0932', 'production')
from dispatch.models import Message
from comm.tasks import send_sms_message
@receiver(post_save, sender=Message)
def send_outgoing_messages(sender, instance, **kwargs):
    if instance.some_attribute == 'A' or instance.some_attribute == 'B':
        try:
            send_sms_message.delay(instance)
        except Exception:
            rollbar.report_exc_info()
    else:
        pass
I'm testing this locally by running a Celery worker. When I am in the Django shell and call the Celery function, it works as expected. However when I save a Message instance to the database, the function does not work as expected: There is nothing posted to the task queue and I do not see any error messages.
What am I doing wrong?
This looks like a problem with serializing and/or your settings. When celery passes the message to your broker, it needs to have some representation of the data. Celery serializes the arguments you give a task but if you don't have it configured consistently with what you're passing (i.e. you have a mismatch where your broker is expecting JSON but you send it a pickled python object), tasks can fail simply because the worker can't easily decode what you're sending it. If you run the function in your shell (without the call to delay) it is called synchronously so there is no serialization or message passing.
In your settings you should be using the JSON serialization (unless you have a really good reason) but if not, then there could be something wrong with your pickling. You can always increase the log level to debug when you run celery to see more about serialization related errors with:
celery -A yourapp worker -l debug
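To make the serialization point concrete, here is a small sketch in plain Python. The Message class below is a hypothetical stand-in for your Django model, and json_safe is a helper I made up for the demo; the only real API used is the standard json module, which is roughly what a JSON task serializer relies on.

```python
import json


class Message:
    """Hypothetical stand-in for the Django model instance."""

    def __init__(self, pk, some_attribute):
        self.pk = pk
        self.some_attribute = some_attribute


def json_safe(value):
    """Return True if `value` can be encoded by the standard JSON encoder."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


msg = Message(pk=7, some_attribute="A")

# An arbitrary Python object has no JSON representation...
instance_ok = json_safe(msg)
# ...but its primary key (a plain int) does.
id_ok = json_safe(msg.pk)
```

This is why passing instance.id to delay() and re-fetching the row inside the task tends to be the safer choice.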
When in doubt, use a print statement/function to make sure your signal receiver is actually running. If it is not, you can create an AppConfig class that imports your receivers in its ready method, or use some other reasonable technique to make sure your receivers are registered.
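If it helps to picture why registration matters, here is a toy dispatcher in plain Python. ToySignal and register_receivers are made-up stand-ins, not Django's implementation; register_receivers plays the role of importing signals.py from AppConfig.ready().

```python
class ToySignal:
    """Minimal stand-in for a signal: keeps a list of receivers and calls them."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Like Django, return a list of (receiver, response) pairs.
        return [(r, r(sender, **kwargs)) for r in self._receivers]


post_save = ToySignal()


def register_receivers():
    """Plays the role of importing signals.py at startup."""

    def send_outgoing_messages(sender, **kwargs):
        return "queued"

    post_save.connect(send_outgoing_messages)


# Before the registering code has run, sending reaches nobody and raises no error.
before = post_save.send(sender="Message")

register_receivers()

# After registration, the receiver is called.
after = post_save.send(sender="Message")
```

The silent empty result in the first call matches the symptom in the question: nothing is queued and no error is shown.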
[opinion]
I would suggest doing something like this:
@receiver(post_save, sender=Message)
def send_outgoing_messages(sender, instance, **kwargs):
    enqueue_message.delay(instance.id)
in yourmodule/tasks.py
@app.task
def enqueue_message(message_id):
    msg = Message.objects.get(id=message_id)
    if msg.some_attribute in ('A', 'B'):  # slick or
        send_sms_message.delay(message_id)
You can always use Celery's composition techniques but here you have something that doesn't add more complexity to your request/response cycle.
[/opinion]
The expression if instance.some_attribute == 'A' or 'B' is probably your problem.
What you probably mean is:
if instance.some_attribute == 'A' or instance.some_attribute == 'B'
Or, how I would write it:
if instance.some_attribute in ('A', 'B')
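A quick check in plain Python shows the difference. The value "C" is just an example I picked:

```python
some_attribute = "C"

# Parsed as (some_attribute == "A") or "B". The non-empty string "B" is
# truthy, so this condition passes for every value.
buggy = bool(some_attribute == "A" or "B")

# Both spellings below test the value correctly.
explicit = some_attribute == "A" or some_attribute == "B"
membership = some_attribute in ("A", "B")
```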
You are calling the function synchronously instead of queuing it:
send_sms_message.delay(instance)
should queue the message
http://celery.readthedocs.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.delay
http://celery.readthedocs.org/en/latest/userguide/calling.html#basics
@dgel also points out a logic error.
Following the docs found here but I'm not receiving the signal. Is there more to add?
community/signals.py
from wagtail.core.signals import page_published
from wagtailPages.models import CommunityArticle
from notification.models import Notification
def notify_article_author(sender, **kwargs):
    print("Processing page_published signal")
    ...
page_published.connect(notify_article_author, sender=CommunityArticle)
You need to make sure that the code is actually being loaded and run - usually this is done by registering the signal within the ready method of your AppConfig. (If you haven't already, you'll need to define default_app_config in __init__.py, as detailed at https://docs.djangoproject.com/en/stable/ref/applications/#for-application-authors .)
In my django application I am using celery. In a post_save signal, I am updating the index in elastic search. But for some reason the task gets hung and never actually executes the code:
What I use to run celery:
celery -A collegeapp worker -l info
The Signal:
@receiver(post_save, sender=University)
def university_saved(sender, instance, created, **kwargs):
    """
    University save signal
    """
    print('calling celery task')
    update_university_index.delay(instance.id)
    print('finished')
The task:
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(instance_id):
    print('updating university index')
The only output I get is calling celery task. After waiting over 30 minutes, it never reaches any of the other print statements and the view keeps waiting. Nothing ever shows up in the Celery terminal.
Versions:
Django 3.0,
Celery 4.3,
Redis 5.0.9,
Ubuntu 18
UPDATE:
After doing some testing, I found that using the debug_task defined in celery.py in place of update_university_index does not lead to hanging. It behaves as expected. I thought it might have been the app.task vs. task decorator, but it seems that's not it.
@app.task(bind=True)
def debug_task(text, second_value):
    print('printing debug_task {} {}'.format(text, second_value))
This happened to me once. I had made the dumbest error: Django expects Celery tasks to be defined in a tasks.py file and uses that for task discovery. Once I followed that convention, it worked. Could you provide more insight into the directory structure using the tree command?
This tutorial is for flask, but the same can be achieved in django. Where this particular tutorial shines is that after you tell celery to execute a task, it also provides you with a uuid and you can ping that url and monitor the progress of the task you triggered.
Verify that the tasks have been registered by celery using (Do make sure that celery is running):
from celery.task.control import inspect
i = inspect()
i.registered_tasks()
Or bash
$ celery inspect registered
$ celery -A collegeapp inspect registered
From https://docs.celeryproject.org/en/latest/faq.html#the-worker-isn-t-doing-anything-just-hanging
Why is Task.delay/apply*/the worker just hanging?
Answer: There’s a bug in some AMQP clients that’ll make it hang if it’s not able to authenticate the current user, the password doesn’t match or the user doesn’t have access to the virtual host specified. Be sure to check your broker logs (for RabbitMQ that’s /var/log/rabbitmq/rabbit.log on most systems), it usually contains a message describing the reason.
Change this:
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(instance_id):
    print('updating university index')
To
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(self, instance_id):
    print('updating university index')
In other words, add self to the task definition: with bind=True, the task instance is passed as the first argument.
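Here is a plain-Python sketch of what goes wrong when that first parameter is missing. FakeTask is a made-up stand-in for the bound task object, not Celery's class, and the two functions are undecorated so the snippet runs without Celery installed.

```python
class FakeTask:
    """Stand-in for the task instance that a bound task receives."""

    name = "update_university_index"


def without_self(instance_id):
    return instance_id


def with_self(self, instance_id):
    return (self.name, instance_id)


bound = FakeTask()

# A bound call puts the task object first, so the one-parameter
# signature receives one argument too many.
try:
    without_self(bound, 42)
    error = None
except TypeError as exc:
    error = exc

# With `self` in the signature, the same call works.
result = with_self(bound, 42)
```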
I'm still not sure why it didn't work, but I found a solution: replacing task with app.task.
Importing app from my celery.py seems to have resolved the issue.
from collegeapp.celery import app
@app.task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(self, instance_id):
    print('updating university index')
I need to import data from several public APIs for a user after they sign up. django-allauth is included, and I have registered a signal handler to call the right methods after allauth emits user_signed_up.
Because the data import takes too much time and the request is blocked by the signal handler, I want to use Celery to do the work.
My test task:
@app.task()
def test_task(username):
    print('##########################Foo#################')
    sleep(40)
    print('##########################' + username + '#################')
    sleep(20)
    print('##########################Bar#################')
    return 3
I'm calling the task like this:
from game_studies_platform.taskapp.celery import test_task
@receiver(user_signed_up)
def on_user_signed_in(sender, request, *args, **kwargs):
    test_task.apply_async('John Doe')
The task should be put into the queue and the request should return immediately. Instead it is blocked and I have to wait a minute.
The project is setup with https://github.com/pydanny/cookiecutter-django and I'm running it in a docker container.
Celery is configured to use the django database in development but will be redis in production
The solution was to switch CELERY_ALWAYS_EAGER = True to False in the local.py. I was pointed to that solution in the Gitter channel of cookiecutter-django.
The calls mentioned above were already correct.
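To picture what that setting changes, here is a toy simulation in plain Python. It does not use Celery at all: simulate_signup, the thread pool, and the always_eager flag are stand-ins I made up, with the flag playing the role of CELERY_ALWAYS_EAGER.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def simulate_signup(always_eager):
    """Return the order of events for one simulated sign-up request."""
    events = []
    work_may_finish = threading.Event()

    def test_task(username):
        work_may_finish.wait()  # stands in for the slow import work
        events.append("task finished for " + username)

    with ThreadPoolExecutor(max_workers=1) as worker:
        if always_eager:
            # Eager mode: the task body runs inside the caller, so the
            # request cannot return until the task is done.
            work_may_finish.set()
            test_task("John Doe")
        else:
            # Queued mode: a worker picks the task up and the caller moves on.
            worker.submit(test_task, "John Doe")
        events.append("request returned")
        work_may_finish.set()
    # Leaving the `with` block waits for the worker thread to finish.
    return events


eager_order = simulate_signup(always_eager=True)
queued_order = simulate_signup(always_eager=False)
```

In the eager run the task finishes before the request returns, which matches the one-minute wait described in the question. In the queued run the request returns first.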
I've been trying to learn Celery over the past week and adding it to my project that uses Django and Docker-Compose. I am having a hard time understanding how to get it to work; my issue is that I can't seem to get uploading to my database to work when using tasks. The upload function, insertIntoDatabase, was working fine before without any involvement with Celery but now uploading doesn't work. Indeed, when I try to upload, my website tells me too quickly that the upload was successful, but then nothing actually gets uploaded.
The server is started up with docker-compose up, which will make migrations, perform a migrate, collect static files, update requirements, and then start the server. This is all done using pavement.py; the command in the Dockerfile is CMD paver docker_run. At no point is a Celery worker explicitly started; should I be doing that? If so, how?
This is the way I'm calling the upload function in views.py:
insertIntoDatabase.delay(datapoints, user, description)
The upload function is defined in a file named databaseinserter.py. The following decorator was used for insertIntoDatabase:
@shared_task(bind=True, name="database_insert", base=DBTask)
Here is the definition of the DBTask class in celery.py:
class DBTask(Task):
    abstract = True
    def on_failure(self, exc, *args, **kwargs):
        raise exc
I am not really sure what to write for tasks.py. Here is what I was left with by a former co-worker just before I picked up from where he left off:
from celery.decorators import task
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
@task(name="database_insert")
def database_insert(data):
And here are the settings I used to configure Celery (settings.py):
BROKER_TRANSPORT = 'redis'
_REDIS_LOCATION = 'redis://{}:{}'.format(os.environ.get("REDIS_PORT_6379_TCP_ADDR"), os.environ.get("REDIS_PORT_6379_TCP_PORT"))
BROKER_URL = _REDIS_LOCATION + '/0'
CELERY_RESULT_BACKEND = _REDIS_LOCATION + '/1'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ENABLE_UTC = True
CELERY_TIMEZONE = "UTC"
Now, I'm guessing that database_insert in tasks.py shouldn't be empty, but what should go there instead? Also, it doesn't seem like anything in tasks.py happens anyway--when I added some logging statements to see if tasks.py was at least being run, nothing actually ended up getting logged, making me think that tasks.py isn't even being run. How do I properly make my upload function into a task?
You're not too far off from getting this working, I think.
First, I'd recommend that you try to keep your Celery tasks and your business logic separate. For example, it probably makes sense to keep the business logic for inserting your data into the DB in the insertIntoDatabase function, and then separately create a Celery task, perhaps named insert_into_db_task, that takes your args as plain Python objects (important) and calls insertIntoDatabase with those args to actually complete the DB insertion.
Code for that example might look like this:
my_app/tasks/insert_into_db.py
from celery.decorators import task
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
@task()
def insert_into_db_task(datapoints, user, description):
    from my_app.services import insertIntoDatabase
    insertIntoDatabase(datapoints, user, description)
my_app/services/insertIntoDatabase.py
def insertIntoDatabase(datapoints, user, description):
    """Note that this function is not a task, by design"""
    # do db insertion stuff
my_app/views/insert_view.py
from my_app.tasks import insert_into_db_task
def simple_insert_view_func(request, args, kwargs):
    # start handling request, define datapoints, user, description
    # next line creates the **task** which will later do the db insertion
    insert_into_db_task.delay(datapoints, user, description)
    return Response(201)
The app structure I'm implying is just how I would do it and isn't required. Note also that you can probably use @task() straight up and not define any args for it. That might simplify things for you.
Does that help? I like to keep my tasks light and fluffy. They mostly just do jerk proofing (make sure the involved objs exist in DB, for instance), tweak what happens if the task fails (retry later? abort task? etc.), logging, and otherwise they execute business logic that lives elsewhere.
Also, in case it's not obvious, you do need to be running Celery somewhere so that there are workers to actually process the tasks that your view code is creating. If you don't run a worker, your tasks will just stack up in the queue and never get processed (and so your DB insertions will never happen).
I realize there are many other questions related to custom django signals that don't work, and believe me, I have read all of them several times with no luck for getting my personal situation to work.
Here's the deal: I'm using django-rq to manage a lengthy background process that is set off by a particular http request. When that background process is done, I want it to fire off a custom Django signal so that the django-rq can be checked for any job failure/exceptions.
Two applications, both on the INSTALLED_APPS list, are at the same level. Inside of app1 there is a file:
signals.py
import django.dispatch
file_added = django.dispatch.Signal(providing_args=["issueKey", "file"])
fm_job_done = django.dispatch.Signal(providing_args=["jobId"])
and also a file jobs.py
from app1 import signals
from django.conf import settings
jobId = 23
issueKey = "fake"
fileObj = "alsoFake"
try:
    pass
finally:
    signals.file_added.send(sender=settings.SIGNAL_SENDER,issueKey=issueKey,fileName=fileObj)
    signals.fm_job_done.send(sender=settings.SIGNAL_SENDER,jobId=jobId)
then inside of app2, in views.py
import logging
from app1.signals import file_added, fm_job_done
from django.conf import settings
# Set up signal handlers
def fm_job_done_callback(sender, **kwargs):
    print "hellooooooooooooooooooooooooooooooooooo"
    logging.info("file manager job done signal fired")
def file_added_callback(sender, **kwargs):
    print "hellooooooooooooooooooooooooooooooooooo"
    logging.info("file added signal fired")
file_added.connect(file_added_callback,sender=settings.SIGNAL_SENDER,weak=False)
fm_job_done.connect(fm_job_done_callback,sender=settings.SIGNAL_SENDER,weak=False)
I don't get any feedback whatsoever though and am at a total loss. I know for fact that jobs.py is executing, and therefore also that the block of code that should be firing the signals is executing as well since it is in a finally block (no the try is not actually empty - I just put pass there for simplicity) Please feel free to ask for more information - I'll respond asap.
Here is the solution for Django > 2.0.
settings.py:
In INSTALLED_APPS, change 'app2' to
'app2.apps.App2Config'
app2 -> apps.py:
from django.apps import AppConfig
from app1.signals import file_added, fm_job_done
class App2Config(AppConfig):
    name = 'app2'
    def ready(self):
        # Import here so the callbacks are loaded only once the app registry is ready
        from .views import fm_job_done_callback, file_added_callback
        file_added.connect(file_added_callback)
        fm_job_done.connect(fm_job_done_callback)
Use Django's receiver decorator:
from django.dispatch import receiver
from app1.signals import file_added, fm_job_done
@receiver(fm_job_done)
def fm_job_done_callback(sender, **kwargs):
    print("helloooooooooooooo")
@receiver(file_added)
def file_added_callback(sender, **kwargs):
    print("helloooooooooooooo")
Also, I prefer to handle signals in models.py.