Django Celery: run multiple workers with different queues

I'm trying to configure three queues/workers for Celery in Django.
settings.py
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
CELERY_TIMEZONE = 'Europe/Berlin'
CELERY_QUEUES = (
    Queue('manually_task', Exchange('manually_task'), routing_key='manually_task'),
    Queue('periodically_task', Exchange('periodically_task'), routing_key='periodically_task'),
    Queue('firsttime_task', Exchange('firsttime_task'), routing_key='firsttime_task'),
)
CELERY_ROUTES = {
    'api.tasks.manually_task': {
        'queue': 'manually_task',
        'routing_key': 'manually_task',
    },
    'api.tasks.periodically_task': {
        'queue': 'periodically_task',
        'routing_key': 'periodically_task',
    },
    'api.tasks.firsttime_task': {
        'queue': 'firsttime_task',
        'routing_key': 'firsttime_task',
    },
}
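(The Queue and Exchange classes used above presumably come from kombu; the import isn't shown in the question, but it would look like this:)
from kombu import Exchange, Queue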
I have three tasks, and each task should have its own queue/worker.
My tasks look like this:
@shared_task
def manually_task(website_id):
    print("manually_task")
    website = Website.objects.get(pk=website_id)
    x = Proxy(website, "49152")
    x.startproxy()
    x = None

@periodic_task(run_every=(crontab(hour=19, minute=15)), ignore_result=True)
def periodically_task():
    websites = Website.objects.all()
    for website in websites:
        x = Proxy(website, "49153")
        x.startproxy()
        x = None

@shared_task
def firsttime_task(website_id):
    website = Website.objects.get(pk=website_id)
    x = Proxy(website, "49154")
    x.startproxy()
    x = None
For the first trial I start only one worker:
celery -A django-proj worker -Q manually_task -n manually_task
My problem is that the task apparently does not execute: "manually_task" is never printed.
Why is it not working?

Based on the comments, I suggest you either provide the queue name when calling the task from a view, e.g. manually_task.apply_async((webseite.pk,), queue='manually_task'), or add the default queue named celery when you start the worker, as in celery -A django-proj worker -Q manually_task,celery.
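A minimal sketch of the first option, assuming a view that already has the Website primary key (the view itself isn't shown in the question, so the names here are hypothetical):
# views.py -- hypothetical view; only the apply_async call comes from the answer above
from django.http import HttpResponse
from api.tasks import manually_task

def start_manual_run(request, website_id):
    # Send the task explicitly to the queue the dedicated worker consumes.
    manually_task.apply_async((website_id,), queue='manually_task')
    return HttpResponse("queued")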

Related

How to configure DB for Django, Celery, and SQS

I'm trying to offload a Monte Carlo task to Celery and save the results to a PostgreSQL AWS DB (RDS).
from views.py:
newResults = MonteCarloResultTrue.objects.create(htmlCompanyArray = "[]")
viewRun = runMonteCarloAsync.delay(checkedCompaniesIDs,newResults.id)
The object is created, but in tasks.py, the DB object is not being edited:
@app.task(bind=True)
def runMonteCarloAsync(self, checkedCompaniesIDs, calc_id):
    newResults = MonteCarloResultTrue.objects.get(id=calc_id)
    newResults.htmlCompanyArray = "[asdf]"
    newResults.save()
How can I update the DB from the Celery task? Do I need to explicitly tell Celery where to look for the DB? (settings.py):
CELERY_accept_content = ['application/json']
CELERY_task_serializer = 'json'
CELERY_TASK_DEFAULT_QUEUE = 'django-queue-dev'
CELERY_BROKER_URL = 'sqs://{0}:{1}@'.format(
    urllib.parse.quote(AWS_ACCESS_KEY_ID, safe=''),
    urllib.parse.quote(AWS_SECRET_ACCESS_KEY, safe='')
)
CELERY_BROKER_TRANSPORT_OPTIONS = {
    "region": "us-east-1",
    'polling_interval': 20
}
CELERY_RESULT_BACKEND = 'django-db'
CELERY_CACHE_BACKEND = 'django-cache'
Procfile:
web: python manage.py runserver
celery_worker: celery -A rvm.settings.celery.app worker --loglevel=INFO
celery_beat: celery
What am I missing?
The problem was with my Procfile. I changed it to:
celery_worker: celery -A rvm worker --loglevel=INFO
and I can now save from within the task (using Celery 5.x).
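For context, pointing -A at the project package like this relies on the usual Django/Celery wiring; a rough sketch of what that assumes (these modules aren't shown in the question):
# rvm/celery.py -- assumed layout
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'rvm.settings')

app = Celery('rvm')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

# rvm/__init__.py -- re-export the app so "celery -A rvm" can find it
from .celery import app as celery_app
__all__ = ('celery_app',)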

How to save Celery data into Django DB

I'm running a Django app on AWS with a PostgreSQL DB, using SQS. I'm trying to offload the Monte Carlo simulation onto Celery, but I am unable to save the results to my DB.
My view looks like:
runMonteCarloAsync.delay(checkedCompaniesIDs)
The task looks like:
@app.task(bind=True)
def runMonteCarloAsync(self, checkedCompaniesIDs):
    # Do some montecarlo stuff
    data = []
    newResults = MonteCarloResultTrue(data=data)
    newResults.save()
Here is my settings.py:
CELERY_accept_content = ['application/json']
CELERY_task_serializer = 'json'
CELERY_TASK_DEFAULT_QUEUE = 'django-queue-dev'
CELERY_BROKER_URL = 'sqs://{0}:{1}@'.format(
    urllib.parse.quote(AWS_ACCESS_KEY_ID, safe=''),
    urllib.parse.quote(AWS_SECRET_ACCESS_KEY, safe='')
)
CELERY_BROKER_TRANSPORT_OPTIONS = {
    "region": "us-east-1",
    'polling_interval': 20
}
CELERY_RESULT_BACKEND = 'django-db'
CELERY_CACHE_BACKEND = 'django-cache'
Procfile:
web: python manage.py runserver
celery_worker: celery worker -A rvm.settings.celery.app --concurrency=1 --loglevel=INFO -n worker.%%h
celery_beat: celery
I can see the messages hitting SQS, but there are no new DB entries. I feel like I'm missing something, but I can't figure it out.

Celery task is pending in the browser but succeeded in the python shell of Django

I'm using Django, Celery, and RabbitMQ for simple tasks on Ubuntu, but Celery gives no response.
I can't figure out why the task is pending in the browser, while it succeeds when I run it from the Django shell (python3 manage.py shell).
Here is my tasks.py file:
from celery import shared_task, task

@shared_task
def createContainer(container_data):
    print(container_data, "create")
    return "created"

@shared_task
def destroyContainer(container_data):
    print(container_data, "destroy")
    return "destroyed"
Here is my views.py file:
def post(self, request):
    if str(request.data["process"]) == "create":
        postdata = {
            "image_name": request.data["image_name"],
            "image_tag": request.data["image_tag"],
            "owner": request.user.id
        }
        # I tried to print the postdata variable before the task and it is working
        createContainer.delay(postdata)
    elif str(request.data["process"]) == "destroy":
        postdata = {
            "cont_id": request.data["cont_id"]
        }
        # I tried to print the postdata variable before the task and it is working
        destroyContainer.delay(postdata)
        # I tried to print anything here, but it was not reachable and never executed
Here is the code I tried in the shell:
>>> from dockerapp.tasks import create_container
>>> create_container.delay("fake data")
<AsyncResult: c37c47f3-6965-4f2e-afcd-01de60f82565>
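As an aside, the state reported outside the worker comes from the result backend; a minimal sketch of inspecting it for the task id above (assuming a result backend is configured):
from celery.result import AsyncResult

res = AsyncResult("c37c47f3-6965-4f2e-afcd-01de60f82565")
print(res.state)  # stays PENDING if no result backend ever records the outcome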
Also, I can see the Celery logs in another terminal by running celery -A dockerproj worker -l info.
When I use the shell, it logs these lines:
Received task: dockerapp.tasks.create_container[c37c47f3-6965-4f2e-afcd-01de60f82565]
fake data #print
create #print
Task dockerapp.tasks.create_container[c37c47f3-6965-4f2e-afcd-01de60f82565] succeeded in 0.003456833990640007s
but it shows no result when I trigger it from the browser with a POST request.
I saw many solutions that add some Celery configuration lines to the settings.py file, and I tried all of these:
CELERY_BROKER_URL = 'amqp://127.0.0.1'
CELERY_TIMEZONE = 'UTC'
CELERY_TRACK_STARTED = True
CELERY_TASK_TRACK_STARTED = True
CELERY_CACHE_BACKEND = 'amqp'
CELERY_TASK_TIME_LIMIT = 30 * 60
CELERY_IGNORE_RESULT = False
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']
I even tried these worker commands:
celery -A dockerproj worker -l info -P threads
celery -A dockerproj worker -l info --pool=solo (people said this fixed the issue on Windows, but I tried it anyway)

Celery doesn't return or fail when calling apply_async. Works with celery_beat

I have a problem with calling celery tasks with apply_async.
I have in settings.py:
CELERY_BROKER_TRANSPORT_OPTIONS = {'confirm_publish': True}
CELERY_BROKER_URL = env('RABBITMQ_URL')
print(CELERY_BROKER_URL)  # pyamqp://un:pw@app-rabbitmq:5672
CELERY_TASK_QUEUES = (
    Queue('default', Exchange('default', type='direct'), routing_key='default'),
    Queue('email', Exchange('email', type='direct'), routing_key='email'),
)
CELERY_TASK_ROUTES = {
    'core.services.CoreSendEmailTaskService.*': {
        'exchange': 'email',
        'routing_key': 'email'
    },
}
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_DEFAULT_QUEUE = 'default'
CELERY_TASK_ACKS_LATE = True
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_EXPIRES = 600
CELERY_RESULT_SERIALIZER = 'json'
CELERY_RESULT_BACKEND = env('CELERY_RESULT_BACKEND_URL')  # redis://app-redis:6379/3
CELERY_RESULT_PERSISTENT = False
CELERY_WORKER_TASK_TIME_LIMIT = 65
CELERY_WORKER_TASK_SOFT_TIME_LIMIT = 60
CELERY_WORKER_HIJACK_ROOT_LOGGER = False
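For reference, with the routing above any task whose dotted name falls under core.services.CoreSendEmailTaskService is published to the email exchange/queue, while everything else lands on default. A call site would look roughly like this (the task name and arguments are hypothetical, since the question doesn't show them):
# hypothetical call site -- module path taken from the CELERY_TASK_ROUTES pattern above
from core.services.CoreSendEmailTaskService import send_confirmation_email  # assumed task name

# Routed to the 'email' queue by the wildcard rule; no explicit queue= needed.
send_confirmation_email.apply_async(kwargs={'user_id': 1})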
In project_config/celery_config/tasks/__init__.py:
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app
__all__ = ('app',)
and in project_config/celery_config/tasks/celery.py:
# imports inferred from the code below (not shown in the question)
import os

import celery
import raven
from django.apps import AppConfig
from raven.contrib.celery import register_logger_signal, register_signal

class CeleryApp(celery.Celery):
    def on_configure(self):
        sentry_dns = os.environ.get('DJANGO_SENTRY_DNS', None)
        if sentry_dns and os.environ.get('ENVIRONMENT', 'local') == 'production':
            client = raven.Client(
                sentry_dns
            )
            register_logger_signal(client)
            register_signal(client)

app = CeleryApp('tasks')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

class CeleryTasksConfig(AppConfig):
    name = 'project_config.celery_config.tasks'
    verbose_name = 'Celery Tasks'

    def ready(self):
        app.config_from_object('django.conf:settings', namespace='CELERY')
        app.autodiscover_tasks()
The weird thing is that the task executes on staging and production, but not locally. Also, when scheduled with beat, the task runs normally locally.
The worker is locally started with command:
celery -A project_config.celery_config.tasks worker -O fair --loglevel=DEBUG --maxtasksperchild=1000 --queues=default,email -P prefork

django celery SQS "No result backend is configured."

This is not a duplicate of "Celery: No Result Backend Configured?" because SQS is used.
I keep getting the following error:
No result backend is configured. Please see the documentation for more
information.
My production settings are the following:
CELERY_BROKER_URL = 'sqs://%s:%s@' % (
    urllib.parse.quote(env.str('TASK_QUEUE_USER_ID'), safe=''),
    urllib.parse.quote(env.str('TASK_QUEUE_USER_SECRET'), safe=''))
BROKER_URL = CELERY_BROKER_URL
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_RESULT_BACKEND = None # Disabling the results backend
RESULT_BACKEND = None # Disabling the results backend
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_DEFAULT_QUEUE = 'async_tasks'
SQS_QUEUE_NAME = 'async_tasks'
CELERY_ENABLE_REMOTE_CONTROL = False
CELERY_SEND_EVENTS = False
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'eu-west-2',
    'polling_interval': 3,
    'visibility_timeout': 3600,
}
CELERY_SEND_TASK_ERROR_EMAILS = True
#
# https://stackoverflow.com/questions/8048556/celery-with-amazon-sqs#8567665
#
CELERY_BROKER_TRANSPORT = 'sqs'
BROKER_TRANSPORT = 'sqs'
Running celery from the command line:
DJANGO_ENV=production celery -A async_tasks worker -l info
connects to SQS and polls, but when I try a demo call from the command line with DJANGO_ENV=production python manage.py check_async:
from django.core.management.base import BaseCommand, CommandError

import async_tasks.tasks as tasks

class Command(BaseCommand):
    help = 'Check if infrastructure for async tasks has been set up correctly.'

    def handle(self, *args, **options):
        try:
            print('Sending async request.')
            t = tasks.add.apply_async((2, 4))
            out = t.get(timeout=1)
            print(out)
            print(t.status)
        except Exception as e:
            print(e)
            raise CommandError('Error occurred')
I get the error above. I have tried on a development machine with Redis and everything works well.
Any ideas?
You need a Celery result backend configured to be able to store and collect task results. Using Celery with an SQS broker without a result backend is fine for "fire and forget" patterns, but it is not enough if you want to access the results of your tasks through methods like get().
Maybe this will help someone. The answer above is correct, but if you still want to use Django, SQS and Celery and still want to see the results, you can use Django's ORM or cache framework as a result backend via the django-celery-results library.
Django-celery-results
Celery Documentation - ORM Cache Framework
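A minimal sketch of that setup, assuming django-celery-results has been installed with pip:
# settings.py
INSTALLED_APPS = [
    # ...
    'django_celery_results',  # enables the 'django-db' and 'django-cache' result backends
]

CELERY_RESULT_BACKEND = 'django-db'      # store task results in the Django database
# CELERY_RESULT_BACKEND = 'django-cache' # or use the configured Django cache instead
After adding the app, run python manage.py migrate django_celery_results to create the results table.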