Django + Celery + Redis on a K8s cluster

I have a Django application deployed on a K8s cluster. I need to send some emails (some are scheduled, others should be sent asynchronously), and the idea was to delegate those emails to Celery.
So I set up a Redis server (with Sentinel) on the cluster, and deployed an instance for a Celery worker and another for Celery beat.
The k8s object used to deploy the Celery worker is pretty similar to the one used for the Django application. The main difference is the command used for the Celery worker: ['celery', '-A', 'saleor', 'worker', '-l', 'INFO']
Scheduled emails are sent with no problem (celery worker and celery beat don't have any problems connecting to the Redis server). However, the asynchronous emails - "delegated" by the Django application - are not sent because it is not possible to connect to the Redis server (ERROR celery.backends.redis Connection to Redis lost: Retry (1/20) in 1.00 second. [PID:7:uWSGIWorker1Core0])
Error 1:
socket.gaierror: [Errno -5] No address associated with hostname
Error 2:
redis.exceptions.ConnectionError: Error -5 connecting to redis:6379. No address associated with hostname.
The Redis server, Celery worker, and Celery beat are in a "redis" namespace, while the other things, including the Django app, are in the "development" namespace.
Here are the variables that I define:
- name: CELERY_PASSWORD
  valueFrom:
    secretKeyRef:
      name: redis-password
      key: redis_password
- name: CELERY_BROKER_URL
  value: redis://:$(CELERY_PASSWORD)@redis:6379/1
- name: CELERY_RESULT_BACKEND
  value: redis://:$(CELERY_PASSWORD)@redis:6379/1
I also tried to define CELERY_BACKEND_URL (with the same value as CELERY_RESULT_BACKEND), but it made no difference.
What could be the cause for not connecting to the Redis server? Am I missing any variables? Could it be because pods are in a different namespace?
Thanks!

Solution from @sofia that helped to fix this issue:
You need to use the same namespace for the Redis server and for the Django application. In this particular case, change the namespace "redis" to "development" where the application is deployed.
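For completeness, here is a minimal sketch (not from the original post) of how the Django settings might consume those variables once the hostname resolves. The fallback values are assumptions; note that with standard Kubernetes DNS the short name redis only resolves from within the same namespace, while redis.redis.svc.cluster.local would resolve from any namespace.
# settings.py - read the broker/backend URLs injected by the deployment manifest above
import os

CELERY_BROKER_URL = os.environ.get('CELERY_BROKER_URL', 'redis://redis:6379/1')
CELERY_RESULT_BACKEND = os.environ.get('CELERY_RESULT_BACKEND', CELERY_BROKER_URL)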

Related

OperationalError: Error 111 connecting to 127.0.0.1:6379. Connection refused. After deploying on Heroku

I am getting the below error after deploying my website on Heroku.
Error 111 connecting to 127.0.0.1:6379. Connection refused.
Request Method: POST
Request URL: https://website.herokuapp.com/account/register
Django Version: 3.2.8
Exception Type: OperationalError
Exception Value:
Error 111 connecting to 127.0.0.1:6379. Connection refused.
Exception Location: /app/.heroku/python/lib/python3.8/site-packages/kombu/connection.py, line 451, in _reraise_as_library_errors
Python Executable: /app/.heroku/python/bin/python
Python Version: 3.8.12
Python Path:
['/app',
'/app/.heroku/python/bin',
'/app',
'/app/.heroku/python/lib/python38.zip',
'/app/.heroku/python/lib/python3.8',
'/app/.heroku/python/lib/python3.8/lib-dynload',
'/app/.heroku/python/lib/python3.8/site-packages']
Server time: Sat, 11 Dec 2021 21:17:12 +0530
So basically my website has to send emails with OTPs after registration, and also some contract-related emails. These emails are necessary and can't be avoided. I posted a question earlier about how to minimize the time that sending emails takes so that the user doesn't have to wait the whole time, and I was advised to use asynchronous code for this. So I decided to use Celery, following a YouTube video that showed how to set it up.
Now, after I pushed the code to the website, I am getting this error. I am a beginner and have no idea how to rectify it. Please suggest what I should do. Below are the details and configurations.
settings.py
CELERY_BROKER_URL = 'redis://127.0.0.1:6379'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
requirements.txt
amqp==5.0.6
asgiref==3.4.1
billiard==3.6.4.0
celery==5.2.1
click==8.0.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
colorama==0.4.4
Deprecated==1.2.13
dj-database-url==0.5.0
Django==3.2.8
django-ckeditor==6.1.0
django-filter==21.1
django-js-asset==1.2.2
django-multiselectfield==0.1.12
dnspython==2.1.0
As I mentioned, I am a beginner, so please give me a detailed answer on how I can rectify this error.
Here's the problem:
CELERY_BROKER_URL = 'redis://127.0.0.1:6379'
Redis won't be running on your local dyno. You'll have to run it somewhere else and configure your code to connect to it. A common choice is to run Redis via an addon:
Once you’ve chosen a broker, create your Heroku app and attach the add-on to it. In the examples we’ll use Heroku Redis as the Redis provider but there are plenty of other Redis providers in the Heroku Elements Marketplace.
If you choose to use Heroku Redis, you'll be able to get the connection string to your instance via the REDIS_URL environment variable:
Heroku add-ons provide your application with environment variables which can be passed to your Celery app. For example:
import os
app.conf.update(BROKER_URL=os.environ['REDIS_URL'],
                CELERY_RESULT_BACKEND=os.environ['REDIS_URL'])
Your Celery app now knows to use your chosen broker and result store for all of the tasks you define in it.
Other addons will provide similar configuration mechanisms.
All quoted documentation here, and most links, come from Heroku's Using Celery on Heroku article. I suggest you read the entire document for more information.
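Putting the pieces together, a minimal sketch of what the project's celery.py might look like when reading REDIS_URL from the add-on (the module name website and the local fallback URL are assumptions, not taken from the post):
import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'website.settings')

app = Celery('website')
# On Heroku, REDIS_URL is provided by the Heroku Redis add-on; the fallback is for local development only.
app.conf.update(
    broker_url=os.environ.get('REDIS_URL', 'redis://127.0.0.1:6379'),
    result_backend=os.environ.get('REDIS_URL', 'redis://127.0.0.1:6379'),
)
app.autodiscover_tasks()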

Prometheus + Django + gunicorn with --preload and multiple processes publishing just one port

I'm running django + prometheus + gunicorn, exporting /metrics on a separate port per process, as described in the django-prometheus documentation. When I run gunicorn with --reload and two workers, I can see ports 8001 and 8002 being opened to serve Prometheus metrics, one per process. But when I run gunicorn with --preload, only port 8001 is opened.
What do I need to do to get one prometheus endpoint per process while using --preload?
django-prometheus settings:
PROMETHEUS_METRICS_EXPORT_PORT_RANGE = range(8001, 8020)
PROMETHEUS_METRICS_EXPORT_ADDRESS = ''
PROMETHEUS_EXPORT_MIGRATIONS = False
Versions:
django==3.1.0
prometheus_client==0.8.0
django-prometheus==2.1.0
gunicorn==20.0.4

Django celery tasks in separate server

We have two servers, Server A and Server B. Server A is dedicated to running the Django web app. Due to the large amount of data, we decided to run the Celery tasks on Server B. Servers A and B use a common database. Tasks are initiated from a post_save signal on the models in the web app on Server A. How can I implement this using RabbitMQ in my Django project?
You have two servers, one project, and two settings files (one per server):
Server A (web server + RabbitMQ)
Server B (Celery workers only)
Then set the broker URL in both settings files, something like this:
BROKER_URL = 'amqp://user:password@IP_SERVER_A:5672//'
(replace IP_SERVER_A with Server A's IP, so that Server B's settings point at Server A).
From then on, every task is sent to RabbitMQ on Server A, on the virtual host /.
On Server B, you just start a Celery worker, something like this:
python manage.py celery worker -Q queue_name -l info
and that's it.
Explanation: Django sends messages to RabbitMQ to queue a task, and the Celery workers then pull those messages and execute the tasks.
Note: RabbitMQ does not have to be installed on Server A; you can install it on a Server C and reference it in the BROKER_URL of both settings files (A and B), like this: BROKER_URL = 'amqp://user:password@IP_SERVER_C:5672//'.
Sorry for my English.
greetings.
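A minimal sketch of what that might look like in the shared project code, assuming the queue name from the worker command above; the model and task names are hypothetical, not from the question:
# tasks.py (shared by both servers; only Server B runs a worker for 'queue_name')
from celery import shared_task

@shared_task
def process_record(record_id):
    # the heavy processing runs on Server B's worker
    ...

# signals.py on Server A: queue the task after a model save
from django.db.models.signals import post_save
from django.dispatch import receiver
from .models import Record

@receiver(post_save, sender=Record)
def queue_processing(sender, instance, **kwargs):
    process_record.apply_async(args=[instance.pk], queue='queue_name')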

Django Celery cannot connect to remote RabbitMQ on EC2

I created a rabbitmq cluster on two instances on EC2. My django app uses celery for async tasks which in turn uses RabbitMQ for message queue.
Whenever I start celery with the command:
python manage.py celery worker --loglevel=INFO
OR
python manage.py celeryd --loglevel=INFO
I keep getting following error message related to remote RabbitMQ:
[2015-05-19 08:58:47,307: ERROR/MainProcess] consumer: Cannot connect to amqp://myuser:**@<ip-address>:25672/myvhost/: Socket closed.
Trying again in 2.00 seconds...
I set permissions using:
sudo rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"
and then restarted rabbitmq-server on both the cluster nodes. However, it didn't help.
In the log file, I see a few entries like the ones below:
=INFO REPORT==== 19-May-2015::08:14:41 ===
accepting AMQP connection <0.1981.0> (<ip-address>:38471 -> <ip-address>:5672)
=ERROR REPORT==== 19-May-2015::08:14:44 ===
closing AMQP connection <0.1981.0> (<ip-address>:38471 -> <ip-address>:5672):
{handshake_error,opening,0,
{amqp_error,access_refused,
"access to vhost 'myvhost' refused for user 'myuser'",
'connection.open'}}
The file /usr/local/etc/rabbitmq/rabbitmq-env.conf contains an entry for NODE_IP_ADDRESS to bind it only to localhost. Removing the NODE_IP_ADDRESS entry from the config binds the port to all network interfaces.
Source: https://superuser.com/questions/464311/open-port-5672-tcp-for-access-to-rabbitmq-on-mac
Turns out I had not created appropriate configuration files. In my case (Ubuntu 14.04), I had to create below two configuration files:
$ cat /etc/rabbitmq/rabbitmq-env.conf
RABBITMQ_NODE_IP_ADDRESS=<ip_of_ec2_instance>
<ip_of_ec2_instance> has to be the internal IP that EC2 uses. Not the public IP that one uses to ssh into the instance. It can be obtained using ip a command.
$ cat /etc/rabbitmq/rabbitmq.config
[
{mnesia, [{dump_log_write_threshold, 1000}]},
{rabbit, [{tcp_listeners, [25672]}]},
{rabbit, [{loopback_users, []}]}
].
I think the line {rabbit, [{tcp_listeners, [25672]}]}, was one of the most important pieces of configuration that I was missing.
Thanks @dgil for the initial troubleshooting help.
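For reference, a minimal sketch of the matching broker setting on the Django side, assuming the myuser/myvhost names from the question and the non-default port 25672 configured above; the password and internal IP are placeholders, not values from the post:
# settings.py - placeholders, not real credentials
BROKER_URL = 'amqp://myuser:<password>@<internal-ec2-ip>:25672/myvhost'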
The question has been answered, but I'm just leaving notes on a similar issue I faced in case anybody else finds them useful.
I have a Flask app running on EC2 with AMQP (RabbitMQ) as the broker on port 5672 and EC2 ElastiCache memcached as the result backend. The broker had trouble picking up the tasks that were being fired, and I resolved it as follows.
Assuming you have rabbitmq-server installed (sudo apt-get install rabbitmq-server), add the user and set its permissions:
sudo rabbitmqctl add_user username password
sudo rabbitmqctl set_permissions username ".*" ".*" ".*"
Restart the server: sudo service rabbitmq-server restart
In your Flask app, for the Celery configuration:
broker_url = 'amqp://username:password@localhost:5672//' (the user set up above)
backend = 'cache+memcached://(ec2 cache url):11211/'
(The cache+memcached:// prefix tripped me up; without it I kept getting an import error (cannot import module).)
Open up port 5672 in your EC2 instance's security group.
Now if you fire up your Celery worker, it should pick up the tasks that get fired and store the results on your memcached server.
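A rough sketch of that configuration in code, assuming a hypothetical app name and a placeholder ElastiCache endpoint (the cache+memcached backend also requires a memcached client library such as pylibmc or python-memcached):
from celery import Celery

# Placeholder credentials and endpoint - substitute your own values.
celery = Celery(
    'app',
    broker='amqp://username:password@localhost:5672//',
    backend='cache+memcached://<ec2-cache-endpoint>:11211/',
)

@celery.task
def send_email(to_address):
    # the task body that the worker executes; its result is stored in memcached
    ...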

Jobs firing to wrong celery

I am using django-celery with RabbitMQ as my broker (the guest RabbitMQ user has full access on the local machine). I have a bunch of projects, each in its own virtualenv, but recently needed Celery on two of them. I have one instance of RabbitMQ running.
(project1_env)python manage.py celery worker
normal celery stuff....
[Configuration]
broker: amqp://guest@localhost:5672//
app: default:0x101bd2250 (djcelery.loaders.DjangoLoader)
[Queues]
push_queue: exchange:push_queue(direct) binding:push_queue
In my other project
(project2_env)python manage.py celery worker
normal celery stuff....
[Configuration]
broker: amqp://guest@localhost:5672//
app: default:0x101dbf450 (djcelery.loaders.DjangoLoader)
[Queues]
job_queue: exchange:job_queue(direct) binding:job_queue
When I run a task in project1 code, it fires to the project1 Celery just fine in the push_queue. The problem is that when I am working in project2, any task tries to fire in the project1 Celery, even if Celery isn't running for project1.
If I fire back up project1_env and start celery I get
Received unregistered task of type 'update-jobs'.
If I run list_queues in rabbit, it shows all the queues
...
push_queue 0
job_queue 0
...
My CELERYD_CHDIR and CELERY_CONFIG_MODULE environment settings are both blank.
Some things I have tried:
purging celery
force_reset on rabbitmq
rabbitmq virtual hosts as outlined in this answer: Multi Celery projects with same RabbitMQ broker backend process
moving django celery setting out and setting CELERY_CONFIG_MODULE to the proper settings
setting the CELERYD_CHDIR in both projects to the proper directory
None of these things have stopped project2 tasks from trying to run in the project1 Celery.
I am on Mac, if that makes a difference or helps.
UPDATE
Setting up different virtual hosts made it all work. I just had it configured wrong.
If you're going to be using the same RabbitMQ instance for both Celery instances, you'll want to use virtual hosts. This is what we use and it works. You mention that you've tried it, but your broker URLs are both amqp://guest@localhost:5672//, with no virtual host specified. If both Celery instances are connected to the same host and virtual host, they will produce to and consume from the same set of queues.
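A minimal sketch of what that separation might look like, with hypothetical vhost names (created beforehand with rabbitmqctl add_vhost and granted to the user with rabbitmqctl set_permissions):
# Project 1 settings
BROKER_URL = 'amqp://guest@localhost:5672/project1_vhost'

# Project 2 settings
BROKER_URL = 'amqp://guest@localhost:5672/project2_vhost'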