I have a Django app, and I'm using Celery Beat to run a task periodically. If I call the task when running Celery, it runs without errors:
app/tasks.py
...
@task(name='task1')
def func():
    # Some code

func.run()
...
If I then start Celery with celery -A project worker -l info, the task runs without errors.
The issue comes when I try to run that same task with Celery Beat. Imagine I have this schedule:
app.conf.beat_schedule = {
    'some_task': {
        'task': 'task1',
        'schedule': crontab(minute=30, hour='22'),
    },
}
This task should run every day at 22:30, and it does start: the task begins but then hangs without logging anything. I cannot figure out the root of the issue. It is not a memory error (I have already checked that), and the task runs fine on my local machine using Celery Beat.
I have also tried running Celery Beat as a daemon, but the task still hangs whenever it starts. I can't figure out what is happening. Any suggestions?
Use the app.task or shared_task decorator for your task. Without an app instance, Celery Beat will not be calling the task with a signature the Celery app recognizes. You can find the documentation on how to write a basic task here.
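For example, a minimal sketch of such a task module, assuming the Django app package is called app and that the Celery app instance picks it up via autodiscover_tasks() (both of these are assumptions, not details from the question):

# app/tasks.py
from celery import shared_task

@shared_task(name='task1')
def func():
    # Some code; the explicit name must match the 'task' value in beat_schedule.
    ...

With this in place, the beat entry 'task': 'task1' resolves to a task the worker has actually registered.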
I am trying to extend my Django app with Celery crontab functionality. For this purpose I created a celery.py file where I put the code as shown in the official documentation.
Here is the code from project/project/celery.py:
import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

app = Celery('project')
app.config_from_object('django.conf:settings', namespace='CELERY')
Then, inside my project/settings.py file, I specify the Celery-related settings as follows:
from celery.schedules import crontab

CELERY_TIMEZONE = "Europe/Moscow"
CELERYBEAT_SHEDULE = {
    'test_beat_tasks': {
        'task': 'webhooks.tasks.adding',
        'schedule': crontab(minute='*/1'),
    },
}
Then I run the worker and Celery Beat in the same terminal with
celery -A project worker -B
But nothing happened: I didn't see my beat task print any output, while I expected my task webhooks.tasks.adding to execute.
Then I decided to check that the Celery configs were applied. To do this, I ran python manage.py shell and inspected the celery.app.conf object:
# I imported the app from the project.celery module
from project import celery
# then examined the app config
celery.app.conf
And inside the huge output of Celery config values I saw that the timezone is set to None.
As I understand it, the app initiated in project/celery.py is ignoring my project/settings.py CELERY_TIMEZONE and CELERY_BEAT_SCHEDULE settings, but why? What am I doing wrong? Please guide me.
After spending so much time researching this problem, I found that my mistake was in how I ran the worker and Celery Beat. Running the worker the way I did, it wouldn't show the task executing in the terminal. To see whether the task is executing, I should run it as follows: celery -A project worker -B -l INFO (or, if you want more detailed output, DEBUG can be used instead of INFO). Hope it will help someone.
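For comparison, the Django integration shown in the Celery documentation looks roughly like this; this is only a sketch, assuming the project package is literally named project as in the question:

# project/celery.py
import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

app = Celery('project')

# With namespace='CELERY', only settings prefixed with CELERY_ are read,
# e.g. CELERY_TIMEZONE and CELERY_BEAT_SCHEDULE.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Discover tasks.py modules in all installed Django apps (e.g. webhooks.tasks).
app.autodiscover_tasks()

Running celery -A project worker -B -l INFO against a setup like this logs each task as beat sends it and as the worker executes it.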
We are using Celery 3.1.17 with a Redis backend in our Flask 0.10.1 application. On our server, every Celery task created from another Celery task is getting created twice. For example:
@celery.task(name='send_some_xyz_users_alerts')
def send_some_xyz_users_alerts():
    list_of_users = find_some_list_of_users()
    for user in list_of_users:
        send_user_alert.delay(user)

@celery.task(name='send_user_alert')
def send_user_alert(user):
    data = get_data_for_user(user)
    send_mail_to_user(data)
If we start send_some_xyz_users_alerts from our application it runs once. I then see two send_user_alert tasks running in Celery for each user. Both of these tasks have different task_ids. We have two workers running on the server. Sometimes these duplicate tasks run on the same worker, sometimes on different workers. I have tried a lot to find the problem, without any luck. I would really appreciate it if someone knows why this could happen. Things ran fine for months on these versions of Celery and Flask, and suddenly we are seeing this problem on our servers. Tasks run fine in the local environment.
So I've been trying to figure out how to make scheduled tasks. I've found Celery and have been able to make simple scheduled tasks. To do this I need to open up a command line and run celery -A proj beat for the tasks to happen. This works fine in a development environment, but when putting this into production it will be an issue.
So how can I get Celery to work without the command-line use? When my production server is online, how can I make sure my scheduler goes up with it? Can Celery do this, or do I need to go down another route?
We use Celery in our production environment, which happens to be on Heroku. We are in the process of moving to AWS. In both environments, Celery hums along nicely.
It would be helpful to understand what your production environment will look like. I'm slightly confused as to why you would be worried about turning off your computer, as using Django implies that you are serving up a website... Are you serving your website from your laptop?
Anyway, assuming that you are going to run your production server from a cloud platform, all you have to do is send whatever command lines you need to run Django AND the command lines for Celery (as you have already noted in your question).
In terms of configuration, you say that you have 'scheduled' tasks, so that implies you have set up a beat schedule in your config.py file. If not, it should look something like this (this assumes you have a module called tasks.py which holds your Celery task definitions):
from celery.schedules import crontab

beat_schedule = {
    'task1': {
        'task': 'tasks.task_one',
        'schedule': 3600
    },
    'task2': {
        'task': 'tibController.tasks.update_old_retail',
        'schedule': crontab(hour=12, minute=0, day_of_week='mon-fri')
    }
}
Then in your tasks.py, load that config file like this:
from celery import Celery
import config
app = Celery('tasks')
app.config_from_object(config)
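The schedule above refers to tasks by name, so tasks.py also has to define them. A minimal sketch with a placeholder body (the real work is whatever your task does):

# Still in tasks.py; the registered name must match the 'task' value
# in beat_schedule ('tasks.task_one' here).
@app.task
def task_one():
    # Placeholder body.
    print('task_one ran')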
You can find more on crontab in the docs. You can also check out this repo for a simple Celery example.
In summary:
Create a config file that identifies which tasks to run and when.
Load the config file into your Celery app.
Get a cloud platform to run your code on.
Run Celery exactly as you have already identified.
Hope that helps.
I'm running a site using Django 1.10, RabbitMQ, and Celery 4 on CentOS 7.
My Celery Beat and Celery worker instances are controlled by Supervisor, and I'm using the Django Celery database scheduler.
I've scheduled a cron-style task using the cron scheduler in Django admin.
When I start the Celery Beat and worker instances, the job fires as expected.
But if I change the scheduled time in Django admin, the change is not picked up unless I restart the Celery Beat instance.
Is there something I am missing or do I need to write my own scheduler?
Celery Beat, with 'django_celery_beat.schedulers.DatabaseScheduler', loads the schedule from the database. According to the following doc, https://media.readthedocs.org/pdf/django-celery-beat/latest/django-celery-beat.pdf, this should force Celery Beat to reload:
django_celery_beat.models.IntervalSchedule
A schedule that runs at a specific interval (e.g. every 5 seconds).
django_celery_beat.models.CrontabSchedule
A schedule with fields like entries in cron: minute hour day-of-week day_of_month month_of_year.
django_celery_beat.models.PeriodicTasks
This model is only used as an index to keep track of when the schedule has changed. Whenever you update a PeriodicTask a counter in this table is also incremented, which tells the celery beat service to reload the schedule from the database.
If you update periodic tasks in bulk, you will need to update the counter manually:
from django_celery_beat.models import PeriodicTasks
PeriodicTasks.changed()
From the above I would expect the Celery Beat process to check the table regularly for any changes.
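For illustration, a sketch of the bulk-update case mentioned in that excerpt, assuming a PeriodicTask named 'my-nightly-task' exists (the name is hypothetical):

from django_celery_beat.models import PeriodicTask, PeriodicTasks

# QuerySet.update() bypasses the model's save() signals, so beat is not
# notified automatically; bump the change counter manually afterwards
# (shown here as in the docs excerpt above; some django-celery-beat
# versions expose this as PeriodicTasks.update_changed() instead).
PeriodicTask.objects.filter(name='my-nightly-task').update(enabled=False)
PeriodicTasks.changed()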
I changed Celery from 4.0 to 3.1.25, Django to 1.9.11, and installed djcelery 3.1.17. Then I tested again and it's OK. So maybe it's a bug.
I have a solution:
Creating a separate worker process that consumes a RabbitMQ queue.
When Django updates the database it posts a message to the queue containing the name of the Celery Beat process (name defined by Supervisor configuration).
The worker process then restarts the named Celery Beat process.
A bit long-winded, but it does the job. It also makes it easier to manage multiple Django apps on the same server that require the same functionality.
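For illustration, a rough sketch of such a restart worker, assuming pika as the RabbitMQ client, a queue named 'beat-restart', and that each message body is the Supervisor program name to restart (the queue and program names are assumptions):

import subprocess

import pika

def on_message(channel, method, properties, body):
    # The message body is expected to be a Supervisor program name,
    # e.g. b'celery-beat-app1'.
    subprocess.run(['supervisorctl', 'restart', body.decode()])

connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()
channel.queue_declare(queue='beat-restart')
channel.basic_consume(queue='beat-restart', on_message_callback=on_message, auto_ack=True)
channel.start_consuming()

Django then only needs to publish the relevant program name to that queue whenever it changes the schedule.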
I have a Celery instance running on Heroku on a worker dyno. I use it with Django and djcelery, and I use the following command to start it:
python manage.py celery worker -B -E --loglevel=info --autoscale=50,8 --without-gossip
I added a task to the queue but it never got executed. The logs show this:
Scaling down -7 processes. (This appeared a while before the following lines.)
Received task: myapp.tasks.my_task[75f095cb-9652-4cdb-87d4-dc31cecaadff]
Scaling up 1 processes.
But that's it; after this, the job didn't get done at all. I fired other tasks afterwards as well, and all of them were executed successfully. The same issue repeated a couple of hours later.
I have a few questions:
1. The "Received task ..." log line means that the celery daemon picked the job off of the queue, right? Why did the worker process become inactive?
2. Since my autoscale is set to 50,8, shouldn't there always be 8 workers? Why were all workers scaled down?
3. Is there any way to know that a task is stuck? What is the recommended failure recovery for such a case if the task being fired is mission critical.
My celery version is 3.1.9, django-celery version 3.1.9, redis version 2.6.16
Settings in django settings:
CELERY_SEND_TASK_ERROR_EMAILS = True
import djcelery
djcelery.setup_loader()
BROKER_URL = 'redis://:password@host:port/0'
BROKER_BACKEND = 'django'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'