I need to import data from several public APIs for a user after he signs up. django-allauth is included, and I have registered a signal handler to call the right methods after allauth emits user_signed_up.
Because the data import takes too much time and the request is blocked by the signal, I want to use Celery to do the work.
My test task:
from time import sleep

@app.task()
def test_task(username):
    print('##########################Foo#################')
    sleep(40)
    print('##########################' + username + '#################')
    sleep(20)
    print('##########################Bar#################')
    return 3
I'm calling the task like this:
from game_studies_platform.taskapp.celery import test_task

@receiver(user_signed_up)
def on_user_signed_in(sender, request, *args, **kwargs):
    test_task.apply_async(args=['John Doe'])
The task should be put into the queue and the request should return immediately. But it blocks, and I have to wait a minute.
The project is set up with https://github.com/pydanny/cookiecutter-django and I'm running it in a Docker container.
Celery is configured to use the Django database as the broker in development, but will use Redis in production.
The solution was to switch CELERY_ALWAYS_EAGER from True to False in local.py. I was pointed to that solution in the Gitter channel of cookiecutter-django.
The calls mentioned above were already correct.
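For reference, the setting lives in the local settings module (in cookiecutter-django that is typically config/settings/local.py; the exact path is an assumption about your layout):
# config/settings/local.py (path assumed)
# With CELERY_ALWAYS_EAGER = True, .delay()/.apply_async() run the task inline
# in the request process and therefore block until the task finishes.
CELERY_ALWAYS_EAGER = False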
Related
In my Django application I am using Celery. In a post_save signal, I update the index in Elasticsearch. But for some reason the task hangs and never actually executes the code:
What I use to run celery:
celery -A collegeapp worker -l info
The Signal:
@receiver(post_save, sender=University)
def university_saved(sender, instance, created, **kwargs):
    """
    University save signal
    """
    print('calling celery task')
    update_university_index.delay(instance.id)
    print('finished')
The task:
#task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(instance_id):
print('updating university index')
The only output I get is "calling celery task". After waiting over 30 minutes, it never gets to any other print statement and the view continues to wait. Nothing ever shows in the Celery terminal.
Versions:
Django 3.0,
Celery 4.3,
Redis 5.0.9,
Ubuntu 18
UPDATE:
After doing some testing: using the debug_task defined inside the celery.py file in place of update_university_index does not lead to hanging. It behaves as expected. I thought it might be app.task vs the task decorator, but it seems that's not it.
@app.task(bind=True)
def debug_task(text, second_value):
    print('printing debug_task {} {}'.format(text, second_value))
This happened to me once; I had made the dumbest error: Django/Celery expects tasks to be defined in a tasks.py file and uses that for task autodiscovery. After moving the task there, it worked. Could you provide more insight into the directory structure, using the tree command?
This tutorial is for Flask, but the same can be achieved in Django. Where this particular tutorial shines is that after you tell Celery to execute a task, it also gives you a UUID, and you can ping that URL to monitor the progress of the task you triggered.
Verify that the tasks have been registered by Celery using the following (do make sure that Celery is running):
from celery.task.control import inspect
i = inspect()
i.registered_tasks()
Or from bash:
$ celery inspect registered
$ celery -A collegeapp inspect registered
From https://docs.celeryproject.org/en/latest/faq.html#the-worker-isn-t-doing-anything-just-hanging
Why is Task.delay/apply*/the worker just hanging?
Answer: There’s a bug in some AMQP clients that’ll make it hang if it’s not able to authenticate the current user, the password doesn’t match or the user doesn’t have access to the virtual host specified. Be sure to check your broker logs (for RabbitMQ that’s /var/log/rabbitmq/rabbit.log on most systems), it usually contains a message describing the reason.
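As a quick connectivity check (assuming the same worker and app name as above), you can also ping the workers; if this hangs or returns nothing, the broker connection is the first thing to investigate:
$ celery -A collegeapp inspect ping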
Change this:
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(instance_id):
    print('updating university index')
to:
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(self, instance_id):
    print('updating university index')
That is, add self as the first argument of the task definition (bind=True makes Celery pass the task instance in).
I'm still not sure why it didn't work, but I found a solution by replacing task with app.task. Importing app from my celery.py seems to have resolved the issue.
from collegeapp.celery import app

@app.task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(self, instance_id):
    print('updating university index')
I use multiple post_save functions to trigger different Celery (4.4.0, 4.8.3) tasks and have tried Django 2 and 3. For some strange reason Celery stopped executing all tasks in parallel; instead, only one task gets received each time the model is saved. The other tasks are not even received.
To run all the tasks, I have to save the model multiple times. It was working before, and I have no idea why the behavior changed all of a sudden.
I am starting the queue with:
celery -A appname worker -l info -E
My post save functions:
@receiver(models.signals.post_save, sender=RawFile)
def execute_rawtools_qc(sender, instance, created, *args, **kwargs):
    rawtools_qc.delay(instance.path, instance.path)

@receiver(models.signals.post_save, sender=RawFile)
def execute_rawtools_metrics(sender, instance, created, *args, **kwargs):
    rawtools_metrics.delay(instance.abs_path, instance.path)
And my tasks:
import os

from celery import shared_task

@shared_task
def rawtools_metrics(raw, output_dir):
    cmd = rawtools_metrics_cmd(raw=raw, output_dir=output_dir)
    os.system(cmd)

@shared_task
def rawtools_qc(input_dir, output_dir):
    cmd = rawtools_qc_cmd(input_dir=input_dir, output_dir=output_dir)
    os.system(cmd)
Before, those tasks were executed in parallel as soon as the model was saved. Now the first task gets executed when the model instance is saved, the second task is executed the second time the model is saved, and the functions then alternate on each save. Any idea what may cause this strange behavior?
UPDATE: I think both tasks are executed randomly, but only one per save.
Also, there are no other Celery workers running.
If you are running both functions for the same model, run them in the same post_save handler:
@receiver(models.signals.post_save, sender=RawFile)
def execute_rawtools_qc(sender, instance, created, *args, **kwargs):
    rawtools_qc.delay(instance.path, instance.path)
    rawtools_metrics.delay(instance.abs_path, instance.path)
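If you specifically want the two tasks fanned out together, a rough alternative sketch (same tasks and receiver arguments as above) is to enqueue them as a Celery group; this only changes how the messages are sent, the worker's concurrency still decides how they actually run:
from celery import group

@receiver(models.signals.post_save, sender=RawFile)
def execute_rawtools(sender, instance, created, *args, **kwargs):
    # Send both task messages at once; they run concurrently
    # if the worker has free processes.
    group(
        rawtools_qc.s(instance.path, instance.path),
        rawtools_metrics.s(instance.abs_path, instance.path),
    ).apply_async()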
I have a model, and I send an email and an SMS to the user in a post_save signal. I am creating the model multiple times, so it is sending the email and SMS multiple times.
I am planning to write a new test for the SMS and email sending.
def send_activation_mail_sms(sender, instance, created, **kwargs):
    if created:
        mobile_activation = UserMobileActivation.objects.create(user=instance, randomword=randomword(50), ref=ref)
        email_activation = UserEmailActivation.objects.create(user=instance, randomword=randomword(50), ref=ref)
        url_email = "{0}view/v1/email/activation/{1}/".format(HOSTNAME, email_activation.randomword)
        short_url_email = url_shortener(url_email)
        url_sms = "{0}view/v1/mobile/activation/{1}".format(HOSTNAME, mobile_activation.randomword)
        app.send_task("apps.tasks.send_sms",
                      args=[TEXTLOCAL_APIKEY, mobile_activation.stockuser.user.username, 'TXTLCL',
                            'Activate your mobile here {0}'.format(url_sms)])
        app.send_task("apps.tasks.send_email",
                      args=[email_activation.user.user.email, EMAIL_VERIFICATION_SUBJECT,
                            EMAIL_VERIFICATION_TEMPLATE, {"host": HOSTNAME, "verify_email_url": url_email}])
I am passing the created arg in the post_save signal. Is there any way I can pass an extra arg here so that while running python manage.py test it will skip sending the SMS and email? I used versioning, and one idea was to have a different API version for testing, but since no request reaches this signal, I cannot check request.version here. Please suggest.
First, set a variable in your settings.py to identify the environment you are currently working in:
# settings.py
MY_ENV = "DEVELOPMENT"
Then run the Celery tasks/additional scripts based on MY_ENV:
from django.conf import settings

def send_activation_mail_sms(sender, instance, created, **kwargs):
    if created and settings.MY_ENV == "DEVELOPMENT":
        # do your stuff
Django allows us to override the settings configuration during testing; see the Override Settings doc. So you can override the MY_ENV value in the test itself.
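For example, a minimal sketch of such a test (the test class and method names are made up), assuming the MY_ENV flag above:
from django.test import TestCase, override_settings

@override_settings(MY_ENV="TEST")
class ActivationSignalTests(TestCase):
    def test_user_creation_does_not_send_sms_or_email(self):
        # With MY_ENV overridden, the DEVELOPMENT branch in the signal handler is skipped.
        ...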
I have several APIs as sources of data, for example blog posts. What I'm trying to achieve is to send requests to these APIs in parallel from a Django view and get the results. There is no need to store the results in the DB; I need to pass them to my view's response. My project is written in Python 2.7, so I can't use asyncio. I'm looking for advice on the best practice for solving this (Celery, Tornado, something else?), with examples of how to achieve it, because I'm only starting my way in async. Thanks.
A solution is to use Celery and pass your request args to it, then use AJAX on the front end.
Example:
def my_def(request):
    do_something_in_celery.delay()
    return Response(something)
To check whether a task has finished in Celery, you can keep the return value of the Celery call in a variable:
task_run = do_something_in_celery.delay()
task_run has an .id property.
You return this .id to your front end and use it to monitor the status of the task.
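A minimal sketch of that flow (the view names and JSON shape are made up; do_something_in_celery is the task defined below):
from celery.result import AsyncResult
from django.http import JsonResponse

def start_task(request):
    task_run = do_something_in_celery.delay()
    # Hand the task id to the front end so AJAX can poll for completion.
    return JsonResponse({"task_id": task_run.id})

def task_status(request, task_id):
    result = AsyncResult(task_id)
    return JsonResponse({"state": result.state, "ready": result.ready()})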
And the function executed by Celery must have the @task decorator:
@task
def do_something_in_celery(*args, **kwargs):
    ...
You will also need a broker to manage the tasks, like Redis or RabbitMQ.
Look at these URLs:
http://masnun.com/2014/08/02/django-celery-easy-async-task-processing.html
https://buildwithdjango.com/blog/post/celery-progress-bars/
http://docs.celeryproject.org/en/latest/index.html
I found a solution using ThreadPoolExecutor from concurrent.futures (the futures backport library on Python 2).
import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
You can also check out the rest of the concurrent.futures doc.
Important!
The ProcessPoolExecutor class has known (unfixable) problems on Python 2 and should not be relied on for mission critical work.
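A rough sketch of the same idea applied inside a Django view (API_URLS, the requests dependency, and the JSON response shape are all assumptions, not part of the original answer):
import concurrent.futures

import requests
from django.http import JsonResponse

API_URLS = ['https://api.example.com/posts', 'https://api.example.org/posts']

def aggregated_posts(request):
    def fetch(url):
        # Each call runs in its own thread, so the network I/O overlaps.
        return requests.get(url, timeout=10).json()

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        results = list(executor.map(fetch, API_URLS))
    return JsonResponse({'results': results})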
I am trying to leverage the post_save signal from Django Signals in combination with Celery tasks. After a new Message object is saved to the database, I want to evaluate whether the instance has one of two attributes and, if it does, call 'send_sms_message', which is a registered Celery task.
tasks.py
from my_project.celery import app

@app.task
def send_sms_message(message):
    # Do something
signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver
import rollbar

rollbar.init('234...0932', 'production')

from dispatch.models import Message
from comm.tasks import send_sms_message

@receiver(post_save, sender=Message)
def send_outgoing_messages(sender, instance, **kwargs):
    if instance.some_attribute == 'A' or instance.some_attribute == 'B':
        try:
            send_sms_message.delay(instance)
        except:
            rollbar.report_exc_info()
    else:
        pass
I'm testing this locally by running a Celery worker. When I am in the Django shell and call the Celery function, it works as expected. However, when I save a Message instance to the database, it does not work as expected: nothing is posted to the task queue and I do not see any error messages.
What am I doing wrong?
This looks like a problem with serialization and/or your settings. When Celery passes the message to your broker, it needs some representation of the data. Celery serializes the arguments you give a task, but if that isn't configured consistently with what you're passing (i.e. you have a mismatch where your broker is expecting JSON but you send it a pickled Python object), tasks can fail simply because the worker can't decode what you're sending it. If you run the function in your shell (without the call to delay), it is called synchronously, so there is no serialization or message passing.
In your settings you should be using JSON serialization (unless you have a really good reason not to); if not, there could be something wrong with your pickling. You can always increase the log level to debug when you run Celery to see more about serialization-related errors:
celery -A yourapp worker -l debug
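For reference, a sketch of the relevant Django settings using the older uppercase CELERY_* names; adjust the names/prefix to however your project configures Celery:
# settings.py (names assume the legacy uppercase CELERY_* style)
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']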
When in doubt, use a print statement/function to make sure your signal receiver is running. If it isn't, you can create an AppConfig class that imports your receivers in its ready method, or use some other reasonable technique for making sure your receivers are registered.
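A minimal sketch of that AppConfig approach (app and module names are illustrative, not from the question):
# comm/apps.py
from django.apps import AppConfig

class CommConfig(AppConfig):
    name = 'comm'

    def ready(self):
        # Importing the module registers the @receiver handlers as a side effect.
        from . import signals  # noqa: F401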
[opinion]
I would suggest doing something like this:
@receiver(post_save, sender=Message)
def send_outgoing_messages(sender, instance, **kwargs):
    enqueue_message.delay(instance.id)
in yourmodule/tasks.py
@app.task
def enqueue_message(message_id):
    msg = Message.objects.get(id=message_id)
    if msg.some_attribute in ('A', 'B'):  # slick or
        send_sms_message.delay(message_id)
You can always use Celery's composition techniques but here you have something that doesn't add more complexity to your request/response cycle.
[/opinion]
The expression if instance.some_attribute == 'A' or 'B' is probably your problem.
What you probably mean is:
if instance.some_attribute == 'A' or instance.some_attribute == 'B'
Or, how I would write it:
if instance.some_attribute in ('A', 'B')
You are calling the function synchronously instead of queuing it:
send_sms_message.delay(instance)
should queue the message.
http://celery.readthedocs.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.delay
http://celery.readthedocs.org/en/latest/userguide/calling.html#basics
@dgel also points out a logic error.