How do I restart celery workers gracefully? - django

While issuing a new build to update code in workers how do I restart celery workers gracefully?
Edit:
What I intend to do is something like this:
Worker is running, probably uploading a 100 MB file to S3
A new build comes
Worker code has changes
Build script fires signal to the Worker(s)
Starts new workers with the new code
Worker(s) who got the signal after finishing the existing job exit.

According to https://docs.celeryq.dev/en/stable/userguide/workers.html#restarting-the-worker you can restart a worker by sending a HUP signal (the same page notes that this is prone to problems and isn't recommended in production):
ps auxww | grep celeryd | grep -v "grep" | awk '{print $2}' | xargs kill -HUP

celery multi start 1 -A proj -l info -c4 --pidfile=/var/run/celery/%n.pid
celery multi restart 1 --pidfile=/var/run/celery/%n.pid
http://docs.celeryproject.org/en/latest/userguide/workers.html#restarting-the-worker

If you're going the kill route, pgrep to the rescue:
kill -9 `pgrep -f celeryd`
Mind you, this is not a long-running task and I don't care if it terminates brutally. Just reloading new code during dev. I'd go the restart service route if it was more sensitive.
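When the running task does matter, a gentler variant of the kill route is to send TERM first (Celery treats TERM as a warm shutdown, so the worker finishes its current tasks before exiting) and to fall back to KILL only after a timeout. A minimal Python sketch, assuming you already know the worker's main PID (for example from a pidfile); the function names and the 600-second default are illustrative and not part of Celery:

```python
import os
import signal
import time


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_gracefully(pid, timeout=600.0, poll=0.5):
    """Send TERM, wait up to `timeout` seconds, then fall back to KILL.

    Returns True if the process exited after TERM, False if KILL was needed.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return True  # already gone
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_alive(pid):
            return True
        time.sleep(poll)
    try:
        os.kill(pid, signal.SIGKILL)  # last resort, interrupts running tasks
    except ProcessLookupError:
        return True  # exited between the last poll and the KILL
    return False
```

A deploy script could call stop_gracefully(worker_pid) and start the new worker once it returns. The timeout should be longer than the longest task you expect to be running.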

You can do:
celery multi restart w1 -A your_project -l info # restart workers

You should look at Celery's autoreloading

What should happen to long running tasks? I like it this way: long running tasks should do their job. Don't interrupt them, only new tasks should get the new code.
But this is not possible at the moment: https://groups.google.com/d/msg/celery-users/uTalKMszT2Q/-MHleIY7WaIJ

I have repeatedly tested the -HUP solution using an automated script, but find that about 5% of the time, the worker stops picking up new jobs after being restarted.
A more reliable solution is:
stop <celery_service>
start <celery_service>
which I have used hundreds of times now without any issues.
From within Python, you can run:
import subprocess
service_name = 'celery_service'
# Stop the service, then start it again; check_call raises if either command fails.
for command in ['stop', 'start']:
    subprocess.check_call(command + ' ' + service_name, shell=True)

If you're using docker/docker-compose and putting celery into a separate container from the Django container, you can use
docker-compose kill -s HUP celery
Here celery is the name of the celery container (the service name in your compose file). The worker is restarted gracefully and the ongoing task is not stopped abruptly.
Tried pkill, kill, celery multi stop, celery multi restart, docker-compose restart. All not working. Either the container is stopped abruptly or the code is not reloaded.
I just want to reload my code in the prod server manually with a 1-liner. Don't want to play with daemonization.

Might be late to the party. I use:
sudo systemctl stop celery
sudo systemctl start celery
sudo systemctl status celery

Related

Exit server terminal while after celery execution

I have successfully created a periodic task which updates each minute, in a django app. Everything is running as expected, using celery -A proj worker -B.
I am aware that using celery -A proj worker -B to execute the task is not advised, however, it seems to be the only way for the task to be run periodically.
I log on to the server using GitBash. After starting the worker, I would like to exit GitBash with the celery tasks still being executed periodically.
When I press ctrl+fn+shift it is a cold worker exit, which stops execution completely (which is not desirable).
Any help?
If you are on a Linux server, you might want to use a process manager like supervisord or even systemd to keep your process running.
On windows, one might look at running celery as a service or running as part of rabbitMQ.
In WSL, it seems like a bat file will get wsl commands to run as a service.

Celery executes a task after calling task.delay() for 3-5 times

I am using celery in a django project. I have tried both the rabbitmq and redis backends, but neither works. The celery version is 3.1.26.post2. I have to call task.delay() 2, 3, sometimes 5 times to see the task run. Sometimes, usually after calling the same task frequently, its "execution rate" increases and the task executes 70-80% of the time. For example, it drops 1 or 2 of 5 task.delay() calls but executes 3-4 of them. Have you experienced something like this? What could be the reason?
OK, based on your description there are a few bits I don't know (and they would help):
how do you start your workers (i.e. celery worker -A your_package_name)
are you sure you subscribe to the same broker you later check with rabbitmqctl
Based on your feedback I guess your tasks either take very long to complete or in some weird way hang and never finish. They definitely land within default queue created by celery worker upon start (called celery).
Posting code of sample task you try to insert into the queue and also code sample of how you try to insert it into the queue would help too.
I would normally define my task like this (in my package that defines what tasks are supposed to do, this code will be executed by celery worker):
from your_package_name.celery import app

@app.task
def my_task_name(my_param):
    # do something here!
    return True
I would insert my task into the queue like this (i.e. from python shell or from my other package that is supposed to insert tasks into the queue):
my_task_name.apply_async(
args=(my_param,),
queue='my_queue_name',)
Somewhere in your_package_name there is a bit of code where you define your broker (in my case I keep it in celeryconfig.py but it's up to you)
BROKER_URL = 'amqp://your_user_name:very_secret_pwd@localhost:5672/your_vhost'
Do not confuse vhost with your host name.
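To make the host/vhost distinction concrete, here is a small sketch that splits a broker URL into its parts with the standard library. It assumes the usual amqp://user:password@host:port/vhost layout; describe_broker_url is a hypothetical helper for illustration, not a Celery function:

```python
from urllib.parse import urlsplit


def describe_broker_url(url):
    """Split a broker URL into the pieces RabbitMQ cares about."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,           # the machine the broker runs on
        "port": parts.port,
        "vhost": parts.path.lstrip("/"),  # the RabbitMQ virtual host, not a machine
    }


info = describe_broker_url(
    "amqp://your_user_name:very_secret_pwd@localhost:5672/your_vhost"
)
print(info["host"])   # localhost
print(info["vhost"])  # your_vhost
```

If the vhost printed here differs from the one you pass to rabbitmqctl with -p, you are inspecting a different set of queues than the one your tasks land in.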
If like me you use rabbitmq then you need to create vhost, user and password before attempting to use the broker (run below in bash as root)
sudo -u rabbitmq -n rabbitmqctl add_user your_user_name very_secret_pwd
sudo -u rabbitmq -n rabbitmqctl add_vhost your_vhost
sudo -u rabbitmq -n rabbitmqctl set_user_tags your_user_name your_example_tag
sudo -u rabbitmq -n rabbitmqctl set_permissions -p your_vhost your_user_name ".*" ".*" ".*"
I would start my worker like this:
python -m celery worker -A your_package_name -Q my_queue_name -c 1 -f /tmp/celery.log --loglevel="INFO"
And then I would look at celery logs within /tmp/celery.log and also list my queues like this (in bash as root):
rabbitmqctl list_queues -p your_vhost
Hope this will help you get on the right tracks.

crontab not working with celery multi start

I have been trying to get Celery to work for a while now. All my crontab schedules work just fine when I test them synchronously
sudo celery -A testdjango worker --loglevel=DEBUG --beat
but when I do
celery multi start -A testdjango w1 -l info
none of my crontabs work. I am not sure why
Note: I tried other schedule intervals as well, such as timedelta. The same thing happens with those.
So I am fairly certain this is not a crontab thing but somehow related to the way I am starting celery multi.
Also, the worker turns on just fine since I can see it in Celery Flower but no tasks get executed.
So, the answer is pretty straightforward.
Since periodic tasks need Beat, just add --beat to the command.
Something like this:
celery multi start -A testdjango w1 --beat -l info
Alternatively instead of running Beat inside your worker process (which the docs for 3.1.18 say is not recommended) you can run it dedicated in the background with
celery beat -A testdjango --pidfile=/blah/beat.pid --detach
Be sure to save the pidfile somewhere so you can also kill the process later.
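For context, a periodic task is only an entry in the beat schedule, and it is the beat process that reads the schedule and sends the task to the queue. A worker started without beat just waits for messages that never arrive. A minimal settings sketch, assuming the Celery 3.1-style setting name CELERYBEAT_SCHEDULE; the entry name and the task path testdjango.tasks.refresh are hypothetical:

```python
from datetime import timedelta

# Beat reads this mapping and enqueues each task on its schedule.
# Workers only execute what beat (or your code) puts on the queue.
CELERYBEAT_SCHEDULE = {
    "refresh-every-minute": {
        "task": "testdjango.tasks.refresh",  # hypothetical task name
        "schedule": timedelta(minutes=1),
        "args": (),
    },
}
```

With this in place, the task fires only while a beat process (embedded or dedicated) is running alongside at least one worker.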

scheduling start up doesn't work

When setting-up Celery and I want to use scheduling do I add both init scripts below or just the celerybeat one?
https://github.com/ask/celery/blob/master/contrib/generic-init.d/celeryd
https://raw.github.com/ask/celery/master/contrib/generic-init.d/celerybeat
The issue is that I have both scripts, but Celery does not run in beat mode and scheduling does not work (normal tasks do).
Run:
sh -x /etc/init.d/celeryd start
This should print any startup errors to the screen, and from those you should see what's going wrong.

Issues with celery daemon

We're having issues with our celery daemon being very flaky. We use a fabric deployment script to restart the daemon whenever we push changes, but for some reason this is causing massive issues.
Whenever the deployment script is run, the celery processes are left in some pseudo-dead state. They will (unfortunately) still consume tasks from rabbitmq, but they won't actually do anything. Confusingly, a brief inspection would indicate that everything is "fine" in this state: celeryctl status shows one node online and ps aux | grep celery shows 2 running processes.
However, attempting to run /etc/init.d/celeryd stop manually results in the following error:
start-stop-daemon: warning: failed to kill 30360: No such process
While in this state attempting to run celeryd start appears to work correctly, but in fact does nothing. The only way to fix the issue is to manually kill the running celery processes and then start them again.
Any ideas what's going on here? We don't have complete confirmation, but we think the problem also develops on its own after a few days with no deployment (this is currently a test server with no activity).
I can't say that I know what's ailing your setup, but I've always used supervisord to run celery -- maybe the issue has to do with upstart? Regardless, I've never experienced this with celery running on top of supervisord.
For good measure, here's a sample supervisor config for celery:
[program:celeryd]
directory=/path/to/project/
command=/path/to/project/venv/bin/python manage.py celeryd -l INFO
user=nobody
autostart=true
autorestart=true
startsecs=10
numprocs=1
stdout_logfile=/var/log/sites/foo/celeryd_stdout.log
stderr_logfile=/var/log/sites/foo/celeryd_stderr.log
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
Restarting celeryd in my fab script is then as simple as issuing a sudo supervisorctl restart celeryd.
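One possible reading of the "failed to kill 30360: No such process" warning in the question is that the init script's pidfile points at a PID that no longer exists while the real workers run under other PIDs. This is a guess, not a confirmed diagnosis. A small sketch for checking that before a restart; pidfile_status is a hypothetical helper, and the pidfile path depends on your init script:

```python
import os
from pathlib import Path


def pidfile_status(pidfile):
    """Classify a pidfile as 'missing', 'invalid', 'stale' or 'running'."""
    path = Path(pidfile)
    if not path.exists():
        return "missing"
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        return "invalid"
    if pid <= 0:
        return "invalid"  # 0 and negatives address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return "stale"   # the file names a process that is gone
    except PermissionError:
        pass             # exists, but owned by another user
    return "running"
```

A fab task could remove the pidfile when the status is "stale" and refuse to start new workers while ps still shows old ones, which avoids ending up with orphaned processes that consume tasks.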