celery: error: unrecognized arguments: -A, Flask, argparse - python-2.7

In a Flask-based web application, I take two command-line arguments (an ini filename and a port number) using argparse. The Celery app is defined in the same file. When I run the Celery application, I get the error above.
import argparse
from flask import Flask
from celery import Celery
from kombu import Exchange, Queue

app = Flask(__name__)
parser = argparse.ArgumentParser(prog="testpgm")
parser.add_argument('-c','--cfgfile', default='domain.ini', help="provide ini file path")
parser.add_argument('-p','--port', default=5000, help="-p port number eg - 'python run.py -p <port>, default to 5000")
args = parser.parse_args()
ini_path = args.cfgfile
port = args.port

# ------- CELERY CONFIGS -------
app.config["CELERY_QUEUES"] = (
    Queue('queue1', Exchange('queue1'), routing_key='queue1'),
)

def make_celery(flaskapp):
    # getting celery broker uri
    celery_broker_uri = CeleryBrokerWrapper().get_broker_uri(broker, username, password, host, port, vhost)
    celeryinit = Celery(flaskapp.import_name, broker=celery_broker_uri)
    celeryinit.conf.update(flaskapp.config)
    taskbase = celeryinit.Task

    class ContextTask(taskbase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return taskbase.__call__(self, *args, **kwargs)

    celeryinit.Task = ContextTask
    return celeryinit

celery = make_celery(app)
But when I run Celery using
celery -A testpgm.celery worker --loglevel=info --concurrency=5 -Q queue1
I get this error:
testpgm: error: unrecognized arguments: -A testpgm.celery worker --loglevel=info --concurrency=5 -Q queue1
It looks like an argparse error. How can I customise argparse for my application without conflicting with Celery's command-line arguments?

I had a similar issue; argparse also complained for me.
Quick fix: use parse_known_args instead of parse_args:
args, unknown = parser.parse_known_args()
Source:
Python argparse ignore unrecognised arguments
Ugly fix:
Define the Celery worker arguments as part of the argparse parser your main app has.
"Do it right" fix:
Consider calling argparse only inside your main function, so that the parsing does not run when Celery imports the module.
Handling argparse conflicts
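The quick fix and the "do it right" fix can be combined. Below is a minimal sketch, not the asker's actual module: the helper names build_parser, parse_app_args and main are hypothetical, and type=int on --port is an added assumption. Parsing happens only inside main(), and parse_known_args returns unrecognised arguments instead of exiting.

```python
import argparse


def build_parser():
    """Build the application's own parser (options taken from the question)."""
    parser = argparse.ArgumentParser(prog="testpgm")
    parser.add_argument("-c", "--cfgfile", default="domain.ini",
                        help="path to the ini file")
    # type=int is an assumption; the question leaves the port untyped.
    parser.add_argument("-p", "--port", type=int, default=5000,
                        help="port number, defaults to 5000")
    return parser


def parse_app_args(argv=None):
    """Parse the known options; return unrecognised arguments instead of failing."""
    args, unknown = build_parser().parse_known_args(argv)
    return args, unknown


def main(argv=None):
    args, unknown = parse_app_args(argv)
    print(args.cfgfile, args.port, unknown)


if __name__ == "__main__":
    # Runs for `python testpgm.py ...` but not when the module is imported,
    # for example by the celery command line.
    main()
```

With the parsing kept out of module scope, importing the module no longer touches sys.argv, so the Celery options never reach the application's parser.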

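You need to re-order the args (Celery versions before 5.0 accept -A after the worker subcommand; from 5.0 on it must come before it):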
celery worker -A testpgm.celery --loglevel=info --concurrency=5 -Q queue1

Related

Reuse of Celery configuration values for Heroku and local Flask

I'm running a Flask app that runs several Celery tasks (with Redis as the backend) and sometimes caches API calls with Flask-Caching. It will run on Heroku, although at the moment I'm running it locally. I'm trying to figure out if there's a way to reuse my various config variables for Redis access. Mainly in case Heroku changes the credentials, moves Redis to another server, etc. Currently I'm reusing the same Redis credentials in several ways.
From my .env file:
CACHE_REDIS_URL = "redis://127.0.0.1:6379/1"
REDBEAT_REDIS_URL = "redis://127.0.0.1:6379/1"
CELERY_BROKER_URL = "redis://127.0.0.1:6379/1"
RESULT_BACKEND = "redis://127.0.0.1:6379/1"
From my config.py file:
import os
from pathlib import Path

basedir = os.path.abspath(os.path.dirname(__file__))

class Config(object):
    # non redis values are above and below these items
    CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", "redis://127.0.0.1:6379/0")
    RESULT_BACKEND = os.environ.get("RESULT_BACKEND", "redis://127.0.0.1:6379/0")
    CELERY_RESULT_BACKEND = RESULT_BACKEND  # because of the deprecated value
    CACHE_REDIS_URL = os.environ.get("CACHE_REDIS_URL", "redis://127.0.0.1:6379/0")
    REDBEAT_REDIS_URL = os.environ.get("REDBEAT_REDIS_URL", "redis://127.0.0.1:6379/0")
In extensions.py:
from celery import Celery
from src.cache import cache

celery = Celery()

def register_extensions(app, worker=False):
    cache.init_app(app)
    # load celery config
    celery.config_from_object(app.config)
    if not worker:
        # register celery irrelevant extensions
        pass
In my __init__.py:
import os
from flask import Flask, jsonify, request, current_app
from src.extensions import register_extensions
from config import Config

def create_worker_app(config_class=Config):
    """Minimal App without routes for celery worker."""
    app = Flask(__name__)
    app.config.from_object(config_class)
    register_extensions(app, worker=True)
    return app
From my worker.py file:
from celery import Celery
from celery.schedules import schedule
from redbeat import RedBeatSchedulerEntry as Entry
from . import create_worker_app
# load several tasks from other files here

def create_celery(app):
    celery = Celery(
        app.import_name,
        backend=app.config["RESULT_BACKEND"],
        broker=app.config["CELERY_BROKER_URL"],
        redbeat_redis_url=app.config["REDBEAT_REDIS_URL"],
    )
    celery.conf.update(app.config)
    TaskBase = celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    celery.Task = ContextTask
    return celery

flask_app = create_worker_app()
celery = create_celery(flask_app)
# call the tasks, passing app=celery as a parameter
This all works fine, locally (I've tried to remove code that isn't relevant to the Celery configuration). I haven't finished deploying to Heroku yet because I remembered that when I install Heroku Data for Redis, it creates a REDIS_URL setting that I'd like to use.
I've been trying to change my config.py values to use REDIS_URL instead of the other things they use, but every time I try to run my celery tasks the connection fails unless I have distinct env values as shown in my config.py above.
What I'd like to have in config.py would be this:
import os
from pathlib import Path

basedir = os.path.abspath(os.path.dirname(__file__))

class Config(object):
    REDIS_URL = os.environ.get("REDIS_URL", "redis://127.0.0.1:6379/0")
    CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", REDIS_URL)
    RESULT_BACKEND = os.environ.get("RESULT_BACKEND", REDIS_URL)
    CELERY_RESULT_BACKEND = RESULT_BACKEND
    CACHE_REDIS_URL = os.environ.get("CACHE_REDIS_URL", REDIS_URL)
    REDBEAT_REDIS_URL = os.environ.get("REDBEAT_REDIS_URL", REDIS_URL)
When I try this, and when I remove all of the values from .env except for REDIS_URL and then try to run one of my Celery tasks, the task never runs. The Celery worker appears to run correctly, and the Flask-Caching requests run correctly (these run directly within the application rather than using the worker). It never appears as a received task in the worker's debug logs, and eventually the server request times out.
Is there anything I can do to reuse REDIS_URL with Celery in this way? If I can't, is there anything Heroku expects me to do to maintain the credentials, server path, etc. for where it is serving Redis for Celery, when I'm using the same Redis instance for several purposes like this?
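The fallback chain in the desired config.py can be checked in isolation from Flask and Celery. This is a minimal sketch, assuming a hypothetical helper named redis_settings that takes the environment as a plain mapping. It only shows how the values resolve, and says nothing about why the task is never received.

```python
import os

DEFAULT_REDIS_URL = "redis://127.0.0.1:6379/0"

# Settings that should follow REDIS_URL unless explicitly overridden.
REDIS_BACKED_NAMES = (
    "CELERY_BROKER_URL",
    "RESULT_BACKEND",
    "CACHE_REDIS_URL",
    "REDBEAT_REDIS_URL",
)


def redis_settings(environ=None):
    """Resolve every Redis-backed setting, falling back to REDIS_URL."""
    env = os.environ if environ is None else environ
    redis_url = env.get("REDIS_URL", DEFAULT_REDIS_URL)
    settings = {name: env.get(name, redis_url) for name in REDIS_BACKED_NAMES}
    settings["REDIS_URL"] = redis_url
    # Mirror the alias kept in the question's config.py.
    settings["CELERY_RESULT_BACKEND"] = settings["RESULT_BACKEND"]
    return settings
```

Calling redis_settings({"REDIS_URL": "redis://example.invalid:6379/2"}) gives every key that one URL, which makes it easy to confirm that the web process and the worker would resolve the same values from the same environment.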
By running my Celery worker with the -E flag, as in celery -A src.worker:celery worker -S redbeat.RedBeatScheduler --loglevel=INFO -E, I was able to figure out that my error was happening because Flask's instance of Celery, in gunicorn, was not able to access the config values for Celery that the worker was using.
What I've done to try to resolve this appears to have worked.
In extensions.py, instead of configuring Celery, I've done this, removing all other mentions of Celery:
from celery import Celery
celery = Celery('scraper') # a temporary name
Then, on the same level, I created a celery.py:
from celery import Celery
from flask import Flask
from src import extensions

def configure_celery(app):
    TaskBase = extensions.celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    extensions.celery.conf.update(
        broker_url=app.config['CELERY_BROKER_URL'],
        result_backend=app.config['RESULT_BACKEND'],
        redbeat_redis_url=app.config["REDBEAT_REDIS_URL"],
    )
    extensions.celery.Task = ContextTask
    return extensions.celery
In worker.py, I'm doing:
from celery import Celery
from celery.schedules import schedule
from src import create_worker_app
from src.celery import configure_celery

flask_app = create_worker_app()
celery = configure_celery(flask_app)
I'm doing a similar thing in app.py:
from src.celery import configure_celery
app = create_app()
configure_celery(app)
As far as I can tell, this doesn't change how the worker behaves at all, but it allows me to access the tasks, via blueprint endpoints, in the browser.
I found this technique in this article and its accompanying GitHub repo

Can't start the worker for Running celery with Flask

I am following the example given in the following url to run celery with Flask:
http://flask.pocoo.org/docs/0.12/patterns/celery/
I followed everything word by word. The only difference being, my make_celery function is created under the following hierarchy:
package1|
|------CeleryObjCreator.py
|
CeleryObjCreator.py has the make_celery function under the CeleryObjectHelper class as follows:
from celery import Celery

class CeleryObjectHelper:
    def make_celery(self, app):
        celery = Celery(app.import_name, backend=app.config['CELERY_RESULT_BACKEND'],
                        broker=app.config['CELERY_BROKER_URL'])
        celery.conf.update(app.config)
        TaskBase = celery.Task

        class ContextTask(TaskBase):
            abstract = True

            def __call__(self, *args, **kwargs):
                with app.app_context():
                    return TaskBase.__call__(self, *args, **kwargs)

        celery.Task = ContextTask
        return celery
Now I am having problems starting the Celery worker.
At the end of the article, it suggests starting the Celery worker as follows:
$ celery -A your_application.celery worker
In my case, I am using <> for the your_application string, which doesn't work and gives the following error:
ImportError: No module named 'package1.celery'
So I am not sure what the value of the your_application string should be to start the Celery worker.
EDIT
As suggested by Nour Chawich, I tried running the Flask app from the command line, and my server comes up successfully.
Also, since app is a directory in my project structure (where app.py lives), in app.py I replaced app = Flask(__name__) with flask_app = Flask(__name__) to keep the variable names distinct.
But when I try to start the Celery worker using the command
celery -A app.celery -loglevel=info
it is not able to resolve the following import that I have in my code:
import app.myPackage as myPackage
It throws the following error:
ImportError: No module named 'app'
So I am really not sure what is going on here. Any ideas?

Django and Celery - re-loading code into Celery after a change

If I make a change to tasks.py while Celery is running, is there a mechanism by which it can re-load the updated code, or do I have to shut Celery down and re-load?
I read celery had an --autoreload argument in older versions, but I can't find it in the current version:
celery: error: unrecognized arguments: --autoreload
Unfortunately, --autoreload doesn't work and is deprecated.
You can use Watchdog, which provides watchmedo, a shell utility to perform actions based on file events.
pip install watchdog
You can start worker with
watchmedo auto-restart -- celery worker -l info -A foo
By default it will watch for all files in current directory. These can be changed by passing corresponding parameters.
watchmedo auto-restart -d . -p '*.py' -- celery worker -l info -A foo
Add -R option to recursively watch the files.
If you are using Django and don't want to depend on watchdog, there is a simple trick to achieve this. Django has an autoreload utility, which runserver uses to restart the WSGI server when code changes.
The same functionality can be used to reload Celery workers. Create a separate management command called celery. Write a function that kills the existing worker and starts a new one, then hook this function into autoreload as follows. For Django >= 2.2:
import sys
import shlex
import subprocess
from django.core.management.base import BaseCommand
from django.utils import autoreload

class Command(BaseCommand):
    def handle(self, *args, **options):
        autoreload.run_with_reloader(self._restart_celery)

    @classmethod
    def _restart_celery(cls):
        if sys.platform == "win32":
            cls.run('taskkill /f /t /im celery.exe')
            cls.run('celery -A phoenix worker --loglevel=info --pool=solo')
        else:  # probably ok for linux2, cygwin and darwin. Not sure about os2, os2emx, riscos and atheos
            cls.run('pkill celery')
            cls.run('celery worker -l info -A foo')

    @staticmethod
    def run(cmd):
        subprocess.call(shlex.split(cmd))
For Django < 2.2:
import sys
import shlex
import subprocess
from django.core.management.base import BaseCommand
from django.utils import autoreload

class Command(BaseCommand):
    def handle(self, *args, **options):
        autoreload.main(self._restart_celery)

    @classmethod
    def _restart_celery(cls):
        if sys.platform == "win32":
            cls.run('taskkill /f /t /im celery.exe')
            cls.run('celery -A phoenix worker --loglevel=info --pool=solo')
        else:  # probably ok for linux2, cygwin and darwin. Not sure about os2, os2emx, riscos and atheos
            cls.run('pkill celery')
            cls.run('celery worker -l info -A foo')

    @staticmethod
    def run(cmd):
        subprocess.call(shlex.split(cmd))
Now you can run the Celery worker with python manage.py celery, which will autoreload when the codebase changes.
This is only for development purposes; do not use it in production.
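The platform branch inside _restart_celery can be looked at separately from Django. The following is only a sketch: restart_commands is a hypothetical helper, the project name foo is a placeholder, and the worker command uses the -A-before-worker ordering (an assumption that differs from the non-Windows command above). It returns the argument lists that subprocess.call would receive, without running anything.

```python
import shlex


def restart_commands(platform, app_name="foo"):
    """Return [kill_argv, start_argv] for the given sys.platform value."""
    quoted = shlex.quote(app_name)
    if platform == "win32":
        kill = "taskkill /f /t /im celery.exe"
        # The solo pool is what the management command above uses on Windows.
        start = f"celery -A {quoted} worker --loglevel=info --pool=solo"
    else:
        kill = "pkill celery"
        start = f"celery -A {quoted} worker --loglevel=info"
    # shlex.split turns each command string into an argv list, as in run().
    return [shlex.split(kill), shlex.split(start)]
```

Returning the argument lists rather than executing them makes it possible to inspect what would be run on each platform before wiring the function into the reloader.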
You could try SIGHUP on the parent worker process. It restarts the worker, but I'm not sure if it picks up new tasks. Worth a shot, though :)
FYI, for anyone using Docker, I couldn't find an easy way to make the above options work, but I found (along with others) another little script here which does use watchdog and works perfectly.
Save it as a some_name.py file in your main directory, add psutil and watchdog to requirements.txt, update the path/cmdline variables at the top, then in the worker container of your docker-compose.yml insert:
command: python ./some_name.py
watchmedo doesn't work for me inside a Docker container.
This is the way I made it work with Django:
# worker_dev.py (put it next to manage.py)
from django.utils import autoreload

def run_celery():
    from projectname import celery_app
    celery_app.worker_main(["-Aprojectname", "-linfo", "-Psolo"])

print("Starting celery worker with autoreload...")
autoreload.run_with_reloader(run_celery)
Then run python worker_dev.py or set it as your Dockerfile CMD or docker-compose command.

How to execute multiple django celery tasks with a single worker?

Here is my celery file:
from __future__ import absolute_import
import os
from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'ets.settings')
from django.conf import settings  # noqa

app = Celery('proj',
             broker='redis://myredishost:6379/0',
             backend='redis://myredishost:6379/0',
             include=['tracking.tasks'])

# Optional configuration, see the application user guide.
app.conf.update(
    CELERY_TASK_RESULT_EXPIRES=3600,
)

if __name__ == '__main__':
    app.start()
Here is my task file:
@app.task
def escalate_to_sup(id, group):
    escalation_email, created = EscalationEmail.objects.get_or_create()
    escalation_email.send()
    return 'sup email sent to: '+str(group)

@app.task
def escalate_to_fm(id, group):
    escalation_email, created = EscalationEmail.objects.get_or_create()
    escalation_email.send()
    return 'fm email sent to: '+str(group)
I start the worker like this:
celery -A ets worker -l info
I have also tried to add concurrency like this:
celery -A ets worker -l info --concurrency=10
I attempt to call the tasks above with the following:
from tracking.tasks import escalate_to_fm, escalate_to_sup

def status_change(equipment):
    r1 = escalate_to_sup.apply_async((equipment.id, [1,2]), countdown=10)
    r2 = escalate_to_fm.apply_async((equipment.id, [3,4]), countdown=20)
    print r1.id
    print r2.id
This prints:
c2098768-61fb-41a7-80a2-f79a73570966
23959fa3-7f80-4e20-a42f-eef75e9bedeb
The escalate_to_sup and escalate_to_fm functions log to the worker intermittently. At least one executes, but never both.
I have tried spinning up more workers, and then both tasks execute. I do this like:
celery -A ets worker -l info --concurrency=10 -n worker1.%h
celery -A ets worker -l info --concurrency=10 -n worker2.%h
The problem is that I don't know how many of the tasks might execute concurrently, so spinning up a worker for every possible task is not feasible.
Does Celery expect a worker for every active task?
How do I execute multiple tasks with a single worker?

Django. Simple Celery task not working

I'm new to Celery. I have a task that is not working and I don't know why. I'm using RabbitMQ. Here is my code:
In settings.py:
BROKER_URL = "amqp://guest@localhost//"
tasks.py:
from celery.decorators import task
from celery.utils.log import get_task_logger
from hisoka.models import FeralSpirit, Fireball

logger = get_task_logger(__name__)

@task
def test_task():
    fireball = Fireball.objects.last()
    feral_spirit = FeralSpirit.objects.filter(fireball=fireball).last()
    counters = feral_spirit.increase_counter()
    logger.info(feral_spirit + "counters: " + counters)
The task is just a test, it is designed to increase a counter that is a field of the FeralSpirit model. It works correctly if I don't call the function with delay()
views.py
class FireballDetail(ListView):
    def get_queryset(self, *args, **kwargs):
        test_task.delay()
        ...
I have a rabbitmq server running correctly (or at least it looks like that) on one terminal and the django localhost server on another terminal. Am I missing something obvious? I have a celery.py and a modified __init__ file, exactly following the documentation.
Most probably your Celery worker is not running. Try:
celery -A {project_name} worker --loglevel=info -Q {queue_name}
Substitute the values of project_name and queue_name. Unless configured otherwise, the default queue name is celery.