Celery starts the scheduler more often than specified in the settings - django

Can you tell me what the problem with my Celery worker might be? When I run it, it executes the task more often than once a second, even though the schedule specifies an interval of several minutes.
 
Running beat: "celery -A market_capitalizations beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler"
Running the worker: "celery -A market_capitalizations worker -l info -S django"
 
Maybe I'm not starting the service correctly?
Settings:
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'exchange_rates',
    'django_celery_beat',
    'django_celery_results',
]

TIME_ZONE = 'Europe/Saratov'
USE_I18N = True
USE_L10N = True
USE_TZ = True

CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = TIME_ZONE
CELERY_ENABLE_UTC = False
CELERYBEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
[screenshot: running services]
When the task is started, no request is sent.
[screenshot: admin panel]
Please tell me how to make Celery pick up the task schedule from the web page and run the task with it.
I also tried setting the schedule in code, but the task still runs more often than once a second:
 
    
from celery.schedules import crontab

app.conf.beat_schedule = {
    'add-every-5-seconds': {
        'task': 'save_exchange_rates_task',
        'schedule': 600.0,
        # 'args': (16, 16)
    },
}
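For reference, with django_celery_beat installed the schedule can live in the database and be edited from the admin page. A minimal sketch, assuming the task is registered under the name save_exchange_rates_task used above:

# Minimal sketch: store the interval in the database so it can be
# changed from the admin page without touching code.
from django_celery_beat.models import IntervalSchedule, PeriodicTask

schedule, _ = IntervalSchedule.objects.get_or_create(
    every=10,                         # run every 10 ...
    period=IntervalSchedule.MINUTES,  # ... minutes
)
PeriodicTask.objects.get_or_create(
    interval=schedule,
    name='Save exchange rates',       # unique, human-readable label
    task='save_exchange_rates_task',  # registered task name
)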
 

I ran into a similar issue when using django-celery-beat, and when I turned USE_TZ off (USE_TZ = False) the issue went away.
But setting USE_TZ to False leaves my app unaware of time zones, which I would rather avoid.
If you have found a solution, can you share it? Thanks.
My dev environment:
Python 3.7 + Django 2.0 + Celery 4.2 + django-celery-beat 1.4
BTW, for now I am configuring the schedule in settings and it is working well,
but I am still looking for a way to let django-celery-beat manage the tasks from the database.
CELERY_BEAT_SCHEDULE = {
    'audit-db-every-10-minutes': {
        'task': 'myapp.tasks.db_audit',
        'schedule': 600.0,  # 10 minutes
        'args': ()
    },
}
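For completeness, a sketch of the task module that schedule entry assumes (myapp/tasks.py and db_audit are the names used above; the body is just a placeholder):

# myapp/tasks.py
from celery import shared_task

@shared_task  # registered as 'myapp.tasks.db_audit', matching the schedule
def db_audit():
    pass  # placeholder -- the real audit logic goes here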

Related

How to route tasks to different queues with Celery and Django

I am using the following stack:
Python 3.6
Celery v4.2.1 (Broker: RabbitMQ v3.6.0)
Django v2.0.4.
According to Celery's documentation, running scheduled tasks on different queues should be as easy as defining the corresponding queues for the tasks in CELERY_ROUTES; nonetheless, all tasks seem to be executed on Celery's default queue.
This is the configuration on my_app/settings.py:
CELERY_BROKER_URL = "amqp://guest:guest#localhost:5672//"
CELERY_ROUTES = {
'app1.tasks.*': {'queue': 'queue1'},
'app2.tasks.*': {'queue': 'queue2'},
}
CELERY_BEAT_SCHEDULE = {
    'app1_test': {
        'task': 'app1.tasks.app1_test',
        'schedule': 15,
    },
    'app2_test': {
        'task': 'app2.tasks.app2_test',
        'schedule': 15,
    },
}
The tasks are just simple scripts for testing routing:
File app1/tasks.py:
from my_app.celery import app
import time

@app.task()
def app1_test():
    print('I am app1_test task!')
    time.sleep(10)

File app2/tasks.py:
from my_app.celery import app
import time

@app.task()
def app2_test():
    print('I am app2_test task!')
    time.sleep(10)
When I run Celery with all the required queues:
celery -A my_app worker -B -l info -Q celery,queue1,queue2
RabbitMQ will show that only the default queue "celery" is running the tasks:
sudo rabbitmqctl list_queues
# Tasks executed by each queue:
# - celery 2
# - queue1 0
# - queue2 0
Does somebody know how to fix this unexpected behavior?
Regards,
I have got it working; there are a few things to note here:
According to Celery 4.2.0's documentation, CELERY_ROUTES should be the variable that defines queue routing, but it only works for me as CELERY_TASK_ROUTES instead. Task routing also seems to be independent from Celery Beat, so on its own it only applies to tasks triggered manually:
app1_test.delay()
app2_test.delay()
or
app1_test.apply_async()
app2_test.apply_async()
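For a one-off call you can also name the queue explicitly; queue= is a standard apply_async option and takes precedence over the routing table:

app1_test.apply_async(queue='queue1')  # explicit queue for this call only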
To make it work with Celery Beat, we just need to define the queues explicitly in the CELERY_BEAT_SCHEDULE variable. The final setup of the file my_app/settings.py would be as follows:
CELERY_BROKER_URL = "amqp://guest:guest#localhost:5672//"
CELERY_TASK_ROUTES = {
'app1.tasks.*': {'queue': 'queue1'},
'app2.tasks.*': {'queue': 'queue2'},
}
CELERY_BEAT_SCHEDULE = {
'app1_test': {
'task': 'app1.tasks.app1_test',
'schedule': 15,
'options': {'queue': 'queue1'}
},
'app2_test': {
'task': 'app2.tasks.app2_test',
'schedule': 15,
'options': {'queue': 'queue2'}
},
}
And to run Celery listening on those two queues:
celery -A my_app worker -B -l INFO -Q queue1,queue2
Where
-A: name of the project or app.
-B: Initiates the task scheduler Celery beat.
-l: Defines the logging level.
-Q: Defines the queues handled by this worker.
I hope this saves some time to other developers.
Adding the queue parameter to the task decorator may help:
@app.task(queue='queue1')
def app1_test():
    print('I am app1_test task!')
    time.sleep(10)
I tried the same command you used to run the worker, and found that you just have to remove "celery" after the -Q parameter; that will work too.
So the old command is
celery -A my_app worker -B -l info -Q celery,queue1,queue2
and the new command is
celery -A my_app worker -B -l info -Q queue1,queue2

Celery error: kombu.exceptions.NotBoundError: Can't call method on Exchange not bound to a channel

I'm using celery 4.0.2 with rabbitmq 3.6.6 and Django 1.10, here is my configuration:
import os

from celery import Celery
from django.conf import settings
from kombu import Queue

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')
app = Celery('my_app')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
app.conf.BROKER_URL = 'amqp://{}:{}@{}'.format(settings.AMQP_USER, settings.AMQP_PASSWORD, settings.AMQP_HOST)
app.conf.CELERY_DEFAULT_EXCHANGE = 'my_app.celery'
app.conf.CELERY_DEFAULT_QUEUE = 'my_app.celery_default'
app.conf.CELERY_TASK_SERIALIZER = 'json'
app.conf.CELERY_ACCEPT_CONTENT = ['json']
app.conf.CELERY_IGNORE_RESULT = True
app.conf.CELERY_DISABLE_RATE_LIMITS = True
app.conf.BROKER_POOL_LIMIT = 2
app.conf.CELERY_QUEUES = (
    Queue(settings.QUEUE_1),
    Queue(settings.QUEUE_2),
    Queue(settings.QUEUE_3),
)
It works fine, but when I try to add a new queue, i.e.
app.conf.CELERY_QUEUES = (
    Queue(settings.QUEUE_1),
    Queue(settings.QUEUE_2),
    Queue(settings.QUEUE_3),
    Queue(settings.QUEUE_4),
)
I get this error:
kombu.exceptions.NotBoundError: Can't call method on Exchange not bound to a channel
If I remove one of these queues, it works again, so it seems to be limited to 3 queues. I don't understand why. Celery is launched like this:
celery worker -A my_app.celery_app
Any idea? Thanks in advance!
OK, this is probably because I'm using Python 3.6; see: https://github.com/celery/kombu/issues/675

Python Django Celery is taking too much memory

I am running a Celery server which has 5-6 tasks to run periodically. Celery takes up too much memory after 5-6 days of continuous execution.
The Celery documentation is very confusing. I am using the following settings:
# celeryconfig.py
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'xxx.settings'

from celery.schedules import crontab

# default RabbitMQ broker
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# default RabbitMQ backend
CELERY_RESULT_BACKEND = None

# 4 concurrent processes are running
CELERYD_CONCURRENCY = 4

# specify location of log files
CELERYD_LOG_FILE = "/var/log/celery/celery.log"

CELERY_ALWAYS_EAGER = True
CELERY_IMPORTS = (
    'xxx.celerydir.cron_tasks.deprov_cron_script',
)
CELERYBEAT_SCHEDULE = {
    'deprov_cron_script': {
        'task': 'xxx.celerydir.cron_tasks.deprov_cron_script.check_deprovision_vms',
        'schedule': crontab(minute=0, hour=17),
        'args': ''
    }
}
I am running the celery service using the nohup command (this runs it in the background):
nohup celery beat -A xxx.celerydir &
After going through the documentation, I found that DEBUG was True in settings.
Just change the value of DEBUG in settings.
REF: https://github.com/celery/celery/issues/2927
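The mechanism, for anyone curious (standard Django behavior, not specific to Celery): with DEBUG enabled, Django appends every executed SQL query to an in-memory list, which a long-running worker never clears:

# settings.py
# With DEBUG = True, every SQL query is recorded in
# django.db.connection.queries, so a long-lived Celery process grows
# without bound. Disabling debug stops the accumulation:
DEBUG = False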

Running celery task when celery beat starts

How do I schedule a task to run when I start celery beat, and then again every hour after that?
Currently I have this schedule in settings.py:
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'update_database': {
        'task': 'myapp.tasks.update_database',
        'schedule': timedelta(seconds=60),
    },
}
I saw a post from a year ago here on Stack Overflow asking the same question:
How to run celery schedule instantly?
However, this does not work for me, because my celery worker gets 3-4 requests for the same task when I run the Django server.
I'm starting my worker and beat like this:
celery -A dashboard_web worker -B --loglevel=INFO --concurrency=10
Crontab schedule
You could try to use a crontab schedule instead, which will run every hour and start 1 minute after the scheduler initializes. Warning: you might want to set it a couple of minutes later in case startup takes longer, otherwise you will need to wait the full hour.
from celery.schedules import crontab
from datetime import datetime

CELERYBEAT_SCHEDULE = {
    'update_database': {
        'task': 'myapp.tasks.update_database',
        'schedule': crontab(minute=(datetime.now().minute + 1) % 60),
    },
}
Reference: http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#crontab-schedules
Ready method of MyAppConfig
In order to ensure that your task is run right away, you could use the same method as before to create the periodic task without adding 1 to the minute. Then, you call your task in the ready method of MyAppConfig which is called whenever your app is ready.
# myapp/apps.py
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = "myapp"

    def ready(self):
        from .tasks import update_database
        update_database.delay()
Please note that you could also create the periodic task directly in the ready method if you were to use django_celery_beat.
Edit: Didn't see that the second method was already covered in the link you mentioned. I'll leave it here in case it is useful for someone else arriving here.
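A hedged sketch of that django_celery_beat variant, reusing this question's names. Note that ready() runs in every process that loads the app (e.g. twice under runserver's autoreloader), which may be where the duplicate task requests above come from; get_or_create keeps it idempotent:

# myapp/apps.py
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = "myapp"

    def ready(self):
        # Imported here so Django is fully initialized before ORM use.
        from django_celery_beat.models import IntervalSchedule, PeriodicTask
        schedule, _ = IntervalSchedule.objects.get_or_create(
            every=1, period=IntervalSchedule.HOURS,
        )
        # get_or_create makes repeated ready() calls harmless.
        PeriodicTask.objects.get_or_create(
            name='update_database hourly',  # unique label
            defaults={
                'interval': schedule,
                'task': 'myapp.tasks.update_database',
            },
        )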
Try setting the configuration parameter CELERY_ALWAYS_EAGER = True
Something like this
app.conf.CELERY_ALWAYS_EAGER = True
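One caveat: ALWAYS_EAGER makes task calls run synchronously in the calling process, bypassing the broker and workers entirely, so it is mostly useful in tests:

app.conf.CELERY_ALWAYS_EAGER = True
result = update_database.delay()  # runs inline, not on a worker
result.get()                      # an EagerResult, already finished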

RabbitMQ/Celery/Django Memory Leak?

I recently took over another part of the project that my company is working on and have discovered what seems to be a memory leak in our RabbitMQ/Celery setup.
Our system has 2 GB of memory, with roughly 1.8 GB free at any given time. We have multiple tasks that crunch large amounts of data and add them to our database.
When these tasks run, they consume a rather large amount of memory, quickly dropping our available memory to anywhere between 16 MB and 300 MB. The problem is, after these tasks finish, the memory does not come back.
We're using:
RabbitMQ v2.7.1
AMQP 0-9-1 / 0-9 / 0-8 (got this line from the
RabbitMQ startup_log)
Celery 2.4.6
Django 1.3.1
amqplib 1.0.2
django-celery 2.4.2
kombu 2.1.0
Python 2.6.6
erlang 5.8
Our server is running Debian 6.0.4.
I am new to this setup, so if there is any other information you need that could help me determine where this problem is coming from, please let me know.
All tasks have return values, all tasks have ignore_result=True, CELERY_IGNORE_RESULT is set to True.
Thank you very much for your time.
My current config file is:
CELERY_TASK_RESULT_EXPIRES = 30
CELERY_MAX_CACHED_RESULTS = 1
CELERY_RESULT_BACKEND = False
CELERY_IGNORE_RESULT = True
BROKER_HOST = 'localhost'
BROKER_PORT = 5672
BROKER_USER = c.celery.u
BROKER_PASSWORD = c.celery.p
BROKER_VHOST = c.celery.vhost
I am almost certain you are running this setup with DEBUG=True, which leads to a memory leak.
Check this post: Disable Django Debugging for Celery.
I'll post my configuration in case it helps.
settings.py
import djcelery
djcelery.setup_loader()

BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_VHOST = "rabbit"
BROKER_USER = "YYYYYY"
BROKER_PASSWORD = "XXXXXXX"

CELERY_IGNORE_RESULT = True
CELERY_DISABLE_RATE_LIMITS = True
CELERY_ACKS_LATE = True
CELERYD_PREFETCH_MULTIPLIER = 1

CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
CELERY_ROUTES = ('FILE_WITH_ROUTES',)
You might be hitting this issue in librabbitmq. Please check whether Celery is using librabbitmq >= 1.0.1.
A simple fix to try: pip install "librabbitmq>=1.0.1" (quoted so the shell doesn't treat >= as a redirect).