Celery tasks are executed by different program binaries - django

I have a Django web application that executes tasks via celery. It is run with a combination of Apache, uWSGI, and Redis. For some reason, one of the tasks is being executed by the uWSGI server and the other is executed by the Python interpreter. This is causing permissions issues as uWSGI doesn't run as the same user as Python does.
What could cause the tasks to be run by different programs? Is there a setting somewhere?

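Turns out, I needed to call the task with .delay() to get the Celery daemon to execute the task instead of uWSGI. Calling the task function directly runs it synchronously in the calling process, which here was uWSGI.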
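To make the difference concrete, here is a toy model built only on the standard library. It is not Celery's implementation: `ToyTask`, `worker_loop`, and `who_runs_me` are hypothetical names, the in-memory queue stands in for the Redis broker, and a thread stands in for the Celery worker daemon. It only illustrates the idea that a direct call runs in the caller while `.delay()` hands the work to whoever consumes the queue.

```python
import queue
import threading

# Stand-in for the broker (Redis in the question); purely illustrative.
broker = queue.Queue()


class ToyTask:
    """Hypothetical miniature of a task object; not Celery's real class."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args):
        # Direct call: runs right here, in the caller (uWSGI in the question).
        return self.func(*args)

    def delay(self, *args):
        # Queued call: only enqueue a message; a worker runs it later.
        broker.put((self.func, args))


def worker_loop(results):
    """Stand-in for the Celery worker daemon: drain the queue and run tasks."""
    while True:
        item = broker.get()
        if item is None:  # sentinel used to stop the toy worker
            break
        func, args = item
        results.append(func(*args))


@ToyTask
def who_runs_me(label):
    return f"{label} ran in {threading.current_thread().name}"


results = []
worker = threading.Thread(target=worker_loop, args=(results,), name="worker")
worker.start()

direct = who_runs_me("direct call")   # executed by the calling thread
who_runs_me.delay("delayed call")     # executed by the worker thread
broker.put(None)
worker.join()

print(direct)
print(results[0])
```

In a real project the same split applies to processes and users rather than threads: the direct call inherits the permissions of the web server process, while the queued call runs with the permissions of the worker.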

Related

Where to run Celery on AWS

In my django web app I am running a calculation which takes 2-10 minutes, and the AWS server times out with a 504 server error.
It seems that the best way to fix this problem is to implement a worker, and offload the calculations. Through some research, it seems that Celery with a Redis server (maybe AWS SQS instead?) are best for Django.
However, I only see tutorials for setups that are run locally. In those, the Redis server is hosted by Railway and Celery is run in a separate terminal from Django (on the same local machine). I'm curious if/where Celery and Redis should be running on AWS.
This answer says you should try to run Celery as a daemon in the background. AWS Elastic Beanstalk already uses supervisord to run some daemon processes, so you can leverage that to run celeryd and avoid creating a custom AMI for this. It works nicely for me. But don't the Celery and Redis servers still need to run somehow?
Where does the Celery server run?
How do I start it?
How can I leverage supervisord to run daemon processes? The documentation isn't helping me very much with AWS integration
You can configure a Procfile to run multiple processes, such as the main Django app, Celery, and Celery beat, in parallel, as documented here:
web: <command to start your django app>
celery: celery -A <path_to_celery_app> worker
celery_beat: celery -A <path_to_celery_app> beat

How to deploy a django project to google cloud with celery workers?

So I have a Django project for which I installed Celery and Heroku Redis, and I use Google Cloud for deployment. Everything works fine locally, but I need my Celery workers to run on the website 24/7. I looked into supervisor and installed it too. I start supervisor from my command line, and the Celery workers run for as long as supervisor is running. But there is a problem: I cannot keep my PC on all the time, and when I shut it down, supervisor stops too. I have not figured out Cloud Tasks either. Lastly, I read some information about Kubernetes and Celery. Is it possible to use Celery with Kubernetes, and how can I set up Kubernetes, Celery, and Django together?
You need to be running your Django server with Gunicorn, your Redis service as a separate service, and your celery worker as a third service.
Alternatively, if you want one single container instance (a Pod in k8s), you can set up supervisor to run Gunicorn and your Celery worker inside the same pod.

apache + mod_wsgi restart keeping active tasks

I'm running my Django project using Apache + mod_wsgi in daemon mode. When I have to make the server notice changes in the source code, I touch the wsgi.py file, but I have an issue with this approach.
Some tasks that are triggered from the front end take 10 minutes to complete. If I touch the wsgi file while one of these long tasks is running, it gets killed by the restart.
Is there any way to make the server refresh the code while keeping the previous unfinished tasks running until they are done?
Thanks!
Don't run long-running tasks in web processes. Use an offline task manager like Celery.
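A minimal sketch of the request shape this advice leads to: one request starts the job and returns an id immediately, and a later request asks for the status. Everything here is an assumption made for illustration. `start_job`, `job_status`, `slow_calculation`, and the `jobs` dictionary are hypothetical names, and the thread pool is only a stand-in for a Celery worker. Threads inside the web process would still die on a restart, so a real setup would run the worker as a separate process.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a separate worker; a real deployment would use a Celery
# worker process so that restarting the web server does not kill the job.
executor = ThreadPoolExecutor(max_workers=1)
jobs = {}  # hypothetical in-memory job registry: job id -> Future


def slow_calculation(n):
    time.sleep(0.1)  # pretend this is the multi-minute task
    return n * n


def start_job(n):
    """What the 'start' request handler would do: enqueue and return at once."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(slow_calculation, n)
    return job_id


def job_status(job_id):
    """What a 'status' request handler would do: report without blocking."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "PENDING"}
    return {"state": "DONE", "result": future.result()}


job_id = start_job(7)
first = job_status(job_id)
jobs[job_id].result()  # wait here only so the demo can show the final state
final = job_status(job_id)
executor.shutdown()
print(first, final)
```

Because the web request only enqueues work, it returns well within any gateway timeout, and the web server can be reloaded without touching a job that a separate worker is running.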

running nginx/postgres with supervisord - required?

In all the standard Django production setup templates I've seen, gunicorn is run with supervisor, whereas nginx/postgres are not configured under supervisor.
Any reason? Is this required for a production system? If not, why not?
In this architecture, Gunicorn works as the application server which runs our Django code. Supervisor is just a process management utility which restarts the Gunicorn server if it crashes. The Gunicorn server may crash due to our bad code, but nginx and postgres remain intact. So in the basic config we only look after the gunicorn process through supervisor. Though we could do the same for nginx and postgres too.
You need supervisor for gunicorn because it's a simple server without any tools to restart it, run it at system startup, stop it at system shutdown, or restart it when it crashes.
PostgreSQL and nginx can take care of themselves in that respect, so there is no need for them to be running under supervisor.
Actually, you can just use init.d, upstart, or systemd to start, stop, and restart gunicorn; supervisor is just an easier way to handle small servers like gunicorn.
Consider also that it is common to run multiple Django apps on one system, and that requires multiple separate instances of gunicorn. Supervisor will handle them better than init, upstart, or systemd.
There is also the uWSGI server, which won't need supervisor because it has built-in features to handle multiple instances, starting, stopping, and also auto-reloading on code change. Look at the uWSGI Emperor system.
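The "restart it when it crashes" part of what supervisor does can be sketched in a few lines. This is only an illustration of the idea, not how supervisord is implemented: `supervise` and its `max_restarts` parameter are hypothetical, and the child commands are tiny Python one-liners standing in for a gunicorn process.

```python
import subprocess
import sys


def supervise(cmd, max_restarts=3):
    """Run cmd and restart it when it exits with an error.

    Returns the number of times the command was started.
    """
    starts = 0
    while True:
        starts += 1
        exit_code = subprocess.run(cmd).returncode
        if exit_code == 0:
            break  # clean exit: nothing to restart
        if starts > max_restarts:
            break  # give up instead of restarting forever
    return starts


# A child that always "crashes" stands in for a failing gunicorn process.
crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
healthy = [sys.executable, "-c", "pass"]

print(supervise(crashing, max_restarts=2))
print(supervise(healthy))
```

A real process manager adds much more on top of this loop, such as starting at boot, log handling, and managing several programs at once, which is why the answers above recommend supervisor, systemd, or similar tools rather than a hand-rolled script.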

Heroku, Django and celery on RabbitMQ

I'm building a Django project on Heroku.
I understand that gunicorn is recommended as a webserver so I need an event loop type of worker and I use gevent for that.
It seems that monkey patching gevent does most of the work for me so I can have concurrency, but how am I supposed to connect to the RabbitMQ without real threads or jamming the whole loop?
I am baffled by this since Heroku themselves recommend gunicorn, celery and RabbitMQ but I don't see how all of these work together.
Do you understand that celery and gunicorn are used for different purposes?
Gunicorn is the webserver responding to requests made by users, serving them web pages or JSON data.
Celery is an asynchronous task manager, i.e. it lets you run arbitrary python code irrespective of web requests to your server.
Do you understand this distinction?