Celery + Django not working at the same time - django

I have a Django 2.0 project that works fine and is integrated with Celery 4.1.0. I am using jQuery to send an AJAX request to the backend, but I just realized it loads endlessly due to some issue with Celery.
Celery Settings (celery.py)
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'converter.settings')

app = Celery('converter', backend='amqp', broker='amqp://guest@localhost//')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
Celery Tasks (tasks.py)
from __future__ import absolute_import, unicode_literals
from celery import shared_task


@shared_task(time_limit=300)
def add(number1, number2):
    return number1 + number2
Django View (views.py)
class AddAjaxView(JSONResponseMixin, AjaxResponseMixin, View):
    def post_ajax(self, request, *args, **kwargs):
        url = request.POST.get('number', '')
        task = tasks.convert.delay(url, client_ip)
        result = AsyncResult(task.id)
        data = {
            'result': result.get(),
            'is_ready': True,
        }
        if result.successful():
            return self.render_json_response(data, status=200)
When I send an AJAX request to the Django app it loads endlessly, but when I terminate the Django server and run celery -A demoproject worker --loglevel=info, that is when my tasks run.
Question
How do I automate this so that my Celery tasks run automatically when I send an AJAX request while the Django project is running?

If you are in a development environment, you have to run the Celery worker manually, as it does not run in the background automatically, in order to process the jobs in the queue. So if you want a smooth workflow, you need both the default Django server and a Celery worker running. As stated in the documentation:
In a production environment you’ll want to run the worker in the background as a daemon - see Daemonization - but for testing and development it is useful to be able to start a worker instance by using the celery worker manage command, much as you’d use Django’s manage.py runserver:
celery -A proj worker -l info
You can read their documentation for daemonization.
http://docs.celeryproject.org/en/latest/userguide/daemonizing.html
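Note also that the view above blocks on result.get(), so as long as no worker is consuming the queue the AJAX request can never finish. A minimal non-blocking sketch (hypothetical view names, not the poster's code) would return the task id immediately and let the client poll for the result:
# Hypothetical sketch: respond with the task id instead of blocking on result.get().
from celery.result import AsyncResult
from django.http import JsonResponse
from django.views import View

from . import tasks  # assumes the add task shown above


class StartAddView(View):
    def post(self, request, *args, **kwargs):
        task = tasks.add.delay(2, 3)
        # Respond right away; a running worker finishes the task in the background.
        return JsonResponse({'task_id': task.id}, status=202)


class AddStatusView(View):
    def get(self, request, task_id, *args, **kwargs):
        result = AsyncResult(task_id)
        data = {'is_ready': result.ready()}
        if result.successful():
            data['result'] = result.get()
        return JsonResponse(data)
Either way, a worker process has to be running for any task to complete.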

Related

Reuse of Celery configuration values for Heroku and local Flask

I'm running a Flask app that runs several Celery tasks (with Redis as the backend) and sometimes caches API calls with Flask-Caching. It will run on Heroku, although at the moment I'm running it locally. I'm trying to figure out if there's a way to reuse my various config variables for Redis access. Mainly in case Heroku changes the credentials, moves Redis to another server, etc. Currently I'm reusing the same Redis credentials in several ways.
From my .env file:
CACHE_REDIS_URL = "redis://127.0.0.1:6379/1"
REDBEAT_REDIS_URL = "redis://127.0.0.1:6379/1"
CELERY_BROKER_URL = "redis://127.0.0.1:6379/1"
RESULT_BACKEND = "redis://127.0.0.1:6379/1"
From my config.py file:
import os
from pathlib import Path

basedir = os.path.abspath(os.path.dirname(__file__))


class Config(object):
    # non-redis values are above and below these items
    CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", "redis://127.0.0.1:6379/0")
    RESULT_BACKEND = os.environ.get("RESULT_BACKEND", "redis://127.0.0.1:6379/0")
    CELERY_RESULT_BACKEND = RESULT_BACKEND  # because of the deprecated value
    CACHE_REDIS_URL = os.environ.get("CACHE_REDIS_URL", "redis://127.0.0.1:6379/0")
    REDBEAT_REDIS_URL = os.environ.get("REDBEAT_REDIS_URL", "redis://127.0.0.1:6379/0")
In extensions.py:
from celery import Celery
from src.cache import cache

celery = Celery()


def register_extensions(app, worker=False):
    cache.init_app(app)

    # load celery config
    celery.config_from_object(app.config)

    if not worker:
        # register extensions not needed by the celery worker
        pass
In my __init__.py:
import os

from flask import Flask, jsonify, request, current_app

from src.extensions import register_extensions
from config import Config


def create_worker_app(config_class=Config):
    """Minimal app without routes for the celery worker."""
    app = Flask(__name__)
    app.config.from_object(config_class)
    register_extensions(app, worker=True)
    return app
From my worker.py file:
from celery import Celery
from celery.schedules import schedule
from redbeat import RedBeatSchedulerEntry as Entry

from . import create_worker_app

# load several tasks from other files here


def create_celery(app):
    celery = Celery(
        app.import_name,
        backend=app.config["RESULT_BACKEND"],
        broker=app.config["CELERY_BROKER_URL"],
        redbeat_redis_url=app.config["REDBEAT_REDIS_URL"],
    )
    celery.conf.update(app.config)
    TaskBase = celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    celery.Task = ContextTask
    return celery


flask_app = create_worker_app()
celery = create_celery(flask_app)

# call the tasks, passing app=celery as a parameter
This all works fine, locally (I've tried to remove code that isn't relevant to the Celery configuration). I haven't finished deploying to Heroku yet because I remembered that when I install Heroku Data for Redis, it creates a REDIS_URL setting that I'd like to use.
I've been trying to change my config.py values to use REDIS_URL instead of the separate variables they currently use, but every time I try to run my Celery tasks the connection fails unless I have distinct env values as shown in my config.py above.
What I'd like to have in config.py would be this:
import os
from pathlib import Path

basedir = os.path.abspath(os.path.dirname(__file__))


class Config(object):
    REDIS_URL = os.environ.get("REDIS_URL", "redis://127.0.0.1:6379/0")
    CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", REDIS_URL)
    RESULT_BACKEND = os.environ.get("RESULT_BACKEND", REDIS_URL)
    CELERY_RESULT_BACKEND = RESULT_BACKEND
    CACHE_REDIS_URL = os.environ.get("CACHE_REDIS_URL", REDIS_URL)
    REDBEAT_REDIS_URL = os.environ.get("REDBEAT_REDIS_URL", REDIS_URL)
When I try this, and when I remove all of the values from .env except for REDIS_URL and then try to run one of my Celery tasks, the task never runs. The Celery worker appears to run correctly, and the Flask-Caching requests run correctly (these run directly within the application rather than using the worker). It never appears as a received task in the worker's debug logs, and eventually the server request times out.
Is there anything I can do to reuse REDIS_URL with Celery in this way? If I can't, is there anything Heroku expects me to do to maintain the credentials/server path/etc. for where it serves Redis for Celery, when I'm using the same Redis instance for several purposes like this?
By running my Celery worker with the -E flag, as in celery -A src.worker:celery worker -S redbeat.RedBeatScheduler --loglevel=INFO -E, I was able to figure out that my error was happening because Flask's instance of Celery, in gunicorn, was not able to access the config values for Celery that the worker was using.
What I've done to try to resolve this appears to have worked.
In extensions.py, instead of configuring Celery, I've done this, removing all other mentions of Celery:
from celery import Celery

celery = Celery('scraper')  # a temporary name
Then, on the same level, I created a celery.py:
from celery import Celery
from flask import Flask

from src import extensions


def configure_celery(app):
    TaskBase = extensions.celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    extensions.celery.conf.update(
        broker_url=app.config['CELERY_BROKER_URL'],
        result_backend=app.config['RESULT_BACKEND'],
        redbeat_redis_url=app.config["REDBEAT_REDIS_URL"],
    )
    extensions.celery.Task = ContextTask
    return extensions.celery
In worker.py, I'm doing:
from celery import Celery
from celery.schedules import schedule

from . import create_worker_app  # as in the original worker.py above
from src.celery import configure_celery

flask_app = create_worker_app()
celery = configure_celery(flask_app)
I'm doing a similar thing in app.py:
from src.celery import configure_celery
app = create_app()
configure_celery(app)
As far as I can tell, this doesn't change how the worker behaves at all, but it allows me to access the tasks, via blueprint endpoints, in the browser.
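For illustration, a task and a blueprint endpoint wired to the shared celery object might look roughly like this (the scrape_site task and /scrape route are hypothetical names, not from the post):
# Hypothetical example of using the shared celery instance from extensions.py
# both to define a task and to enqueue it from a blueprint endpoint.
from flask import Blueprint, jsonify

from src.extensions import celery

bp = Blueprint('scrape', __name__)


@celery.task
def scrape_site(url):
    # placeholder body; the real tasks live in their own modules
    return url


@bp.route('/scrape', methods=['POST'])
def enqueue_scrape():
    result = scrape_site.delay('https://example.com')
    return jsonify({'task_id': result.id}), 202
The worker still picks tasks up from Redis exactly as before; the only difference is that the web process and the worker now configure the same Celery object.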
I found this technique in this article and its accompanying GitHub repo

Stop Celery caching Django template

I have Django 3.24, Celery 4.4.7 and celerybeat 2.2.0 set up with RabbitMQ as the broker.
I have a Celery task that renders a Django template and then sends it to a number of email recipients.
The Django template is dynamic, in as much as changes can be made to its content at any time, which in turn rewrites the template. The trouble is that on occasion I have to restart Celery to get it to re-read the template.
My question is: is there any way of forcing Celery to re-read the template file without requiring a full Celery restart?
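A minimal sketch of the kind of task involved (with placeholder template, task, and address names, not the actual code) would be:
# Minimal sketch of a task that renders a template and emails it (placeholder names).
from celery import shared_task
from django.core.mail import send_mail
from django.template.loader import render_to_string


@shared_task
def send_report(recipients):
    # The template file on disk may have been rewritten since the worker started.
    body = render_to_string('emails/report.html', {'title': 'Daily report'})
    send_mail('Daily report', body, 'noreply@example.com', recipients)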
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'backoffice.settings')

app = Celery('backoffice')

default_config = 'backoffice.celery_config'
app.config_from_object(default_config)

app.autodiscover_tasks()


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
celery_config.py
from django.conf import settings
broker_url = "amqp://someusername:somepassword@webserver:5672/backoffice"
worker_send_task_event = False
task_ignore_result = True
task_time_limit = 60
task_soft_time_limit = 50
task_acks_late = True
worker_prefetch_multiplier = 10
worker_cancel_long_running_tasks_on_connection_loss = True
celery command
celery -A backoffice worker -B -l info --without-heartbeat --without-gossip --without-mingle

Prevent Redis/Celery scheduled tasks from being processed multiple times - Django on Heroku

I am looking for some advice. I use Celery/Redis scheduled tasks to check for changes via an API request every 10 seconds. If there is a change, a database object with the request feedback is created, and once the calculations are done a boolean named is_called is set to True to prevent duplicates.
This worked fine locally, and for a time also on Heroku, but since an update (nothing in the task code changed) the worker seems busy and uses up to 7 ForkPoolWorkers.
For example, ForkPoolWorker-1 will work on the same task as ForkPoolWorker-7, which ignores the is_called = True flag and gives me duplicate calculations for a single database object.
What's the best way to serve a task to only one ForkPoolWorker at a time? I read the docs and did some Google research, but it's not completely clear what to do or how to do it.
I built this in Django on Heroku.
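A minimal sketch of this pattern (with placeholder model, field, and endpoint names) looks roughly like this:
# Rough sketch of the scheduled check (placeholder names).
import requests
from celery import shared_task

from .models import ApiFeedback  # placeholder model with an is_called flag


@shared_task(name='get_api_task')
def get_api_task():
    data = requests.get('https://api.example.com/changes').json()
    if not data.get('changed'):
        return
    feedback = ApiFeedback.objects.create(payload=data)
    # ... calculations based on the feedback ...
    feedback.is_called = True
    feedback.save()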
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "stockpilot.settings")

app = Celery('proj')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

app.conf.beat_schedule = {
    'every-ten-seconds': {
        'task': 'get_api_task',
        'schedule': 10.0,
    },
}


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))

celery not working in django and just waiting (pending)

I'm trying to find out how Celery works. I have a project that has about 10 apps, and now I want to use Celery.
setting.py:
CELERY_BROKER_URL = 'amqp://rabbitmq:rabbitmq@localhost:5672/rabbitmq_vhost'
CELERY_RESULT_BACKEND = 'redis://localhost'
I created a user in RabbitMQ with this info: username rabbitmq and password rabbitmq. Then I created a vhost named rabbitmq_vhost and gave the rabbitmq user permission to it. All of this is fine, I think, because all of the errors about RabbitMQ disappeared.
Here is my test.py:
from .task import when_task_expiration


def test_celery():
    result = when_task_expiration.apply_async((2, 2), countdown=3)
    print(result.get())
task.py:
from __future__ import absolute_import, unicode_literals
import logging

from celery import shared_task
from proj.celery import app


@app.task
def when_task_expiration(task, x):
    print(task.id, 'task done')
    return True
celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
app = Celery('proj')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
Now when I call test_celery() in the Python shell it just stays pending. I tried replacing the decorator with @shared_task and @app.task(bind=True), but nothing changed. I even tried using .delay() instead of apply_async((2, 2), countdown=3), and again nothing happened.
I'm trying to use Celery to call a function at a specific time, as in this question I asked in the past. Thank you.
You most likely forgot to run at least one Celery worker process. To do so, execute the following in the shell: celery worker -A proj.celery -c 4 -l DEBUG (here I assume your Celery application is defined in proj/celery.py, since you have Celery('proj') in there).
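Once a worker is consuming the queue, the result should leave the PENDING state and get() should return; a quick check along these lines (the myapp import path is a placeholder for wherever the task module lives), run inside python manage.py shell so Django settings are configured:
# Quick check once a worker is running (placeholder import path).
from myapp.task import when_task_expiration

result = when_task_expiration.apply_async((2, 2), countdown=3)
print(result.state)            # PENDING until a worker picks it up
print(result.get(timeout=10))  # True once the task has actually run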

Start Celery worker on production. Using Django/Python on Azure/linux app service

I have a website with an API that customers send POST calls to. These API calls have attachments in the form of PDFs or similar that get stored in a folder /MEDIA/Storage/. The app is written in Django.
The API call gets stored in a model through DRF and serializers. After the data is stored, some logic runs: emails are sent, lookups are made, results are stored in data tables, and so on. Since this takes so much time, I implemented Celery (with Azure Cache for Redis as the broker) in my app, so that only the initial save to the model is done as usual. The rest is queued up through Celery.
This works well on my local machine (mac os). But not on production (Azure/Linux).
I have tried git hooks, but I cannot get them working.
I have tried a terminal over SSH on the Azure VM, but no luck.
I have looked into daemonization, but it was complicated.
settings.py
CELERY_BROKER_URL = 'redis://:<password>=@<appname>.redis.cache.windows.net:6379/0'
CELERY_RESULT_BACKEND = 'django-db'
CELERY_CACHE_BACKEND = 'django-cache'
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hapionline.settings')

app = Celery('hapionline')
app.config_from_object('django.conf:settings', namespace="CELERY")
app.autodiscover_tasks()


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
views.py
class ProcSimpleList(generics.CreateAPIView):  # Create-only, for creating a proc
    serializer_class = ProcSimpleSerializer
    permission_classes = (IsAdminOrReadOnly,)
    lookup_url_kwarg = 'proc_id'

    def perform_create(self, serializer):
        q = serializer.save()
        # Queue the follow-up work; the queue is created when starting celery.
        transaction.apply_async(queue='high_priority', args=(q.proc_id, self.request.user.pk))
Local machine: All works well with the command: celery -A hapionline worker -l info -Q high_priority
Production: I do not know where to run this command on the production server.
If the worker is started on the local machine, it connects to the Azure cache, and calling the production-environment API works. But since the worker is started locally, the paths to the attached files in the API are local and incorrect: /User/../Media/.. instead of /wwwroot/../media/..
Any ideas? How do I start a worker on the production VM? Is there a way to run the start-worker "script" after git push azure master?
I skipped Azure and moved the app to Heroku. This worked like a charm.
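On Heroku, the worker typically runs as its own dyno declared in a Procfile; a minimal sketch (reusing the worker command from above, and assuming gunicorn serves the Django app) might be:
web: gunicorn hapionline.wsgi
worker: celery -A hapionline worker -l info -Q high_priority
The worker dyno then runs from the same slug and config vars as the web process, so no manual start script is needed after a deploy.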