Django Channels Worker not responding to websocket.connect - django

I'm having a problem with Django Channels. Daphne accepts WebSocket CONNECT requests properly, but then the worker doesn't respond to the request with the method supplied in consumers.py. The thing is, this happens most of the time: sometimes the worker responds with the method in consumers.py, but most of the time it doesn't respond at all. I have duplicate code working fine in a vagrant (trusty64) environment, but the code behaves like this on an actual trusty64 machine. It should be noted that the trusty64 machine that hosts the app also has other applications running (about 4 apps at the same time).
I have a routing.py set up like this
from channels import route
from app.consumers import connect_tracking, disconnect_tracking
channel_routing = [
    route("websocket.connect", connect_tracking, path=r'^/websocket/tms/tracking/stream/$'),
    route("websocket.disconnect", disconnect_tracking, path=r'^/websocket/tms/tracking/stream/$'),
]
with the corresponding consumers.py that looks like this
import json
from channels import Group
from channels.sessions import channel_session
from channels.auth import http_session_user, channel_session_user, channel_session_user_from_http
from django.conf import settings

@channel_session_user_from_http
def connect_tracking(message):
    group_name = settings.TRACKING_GROUP_NAME
    print "%s is joining %s" % (message.user, group_name)
    Group(group_name).add(message.reply_channel)

@channel_session_user
def disconnect_tracking(message):
    group_name = settings.TRACKING_GROUP_NAME
    print "%s is leaving %s" % (message.user, group_name)
    Group(group_name).discard(message.reply_channel)
and some channels related lines in settings.py like this
redis_host = os.environ.get('REDIS_HOST', 'localhost')

CHANNEL_LAYERS = {
    "default": {
        # This example app uses the Redis channel layer implementation asgi_redis
        "BACKEND": "asgi_redis.RedisChannelLayer",
        "CONFIG": {
            "hosts": [(redis_host, 6379)],
        },
        "ROUTING": "tms_app.routing.channel_routing",
    },
}
Referencing another question, I've tried running Daphne and the worker like this:
daphne tms_app.asgi:channel_layer --port 9015 --bind 0.0.0.0 -v2
python manage.py runworker -v3
I've captured Daphne's and the worker's logs; they look like this:
Daphne log:
2016-12-30 17:00:18,870 INFO Starting server at 0.0.0.0:9015, channel layer tms_app.asgi:channel_layer
2016-12-30 17:00:26,788 DEBUG WebSocket open for websocket.send!APpWONQKKDXR
192.168.31.197:48933 - - [30/Dec/2016:17:00:26] "WSCONNECT /websocket/tms/tracking/stream/" - -
2016-12-30 17:00:26,790 DEBUG Upgraded connection http.response!sqlMPEEtolDP to WebSocket websocket.send!APpWONQKKDXR
Corresponding worker log:
2016-12-30 17:00:22,265 - INFO - runworker - Running worker against channel layer default (asgi_redis.core.RedisChannelLayer)
2016-12-30 17:00:22,265 - INFO - worker - Listening on channels http.request, websocket.connect, websocket.disconnect, websocket.receive
As you can see, when there's a WSCONNECT event, the worker doesn't respond to it.
There's another question close to this issue that was solved by downgrading Twisted to 16.2, but that doesn't work for me.
UPDATE January 3, 2017
I cannot replicate the issue on a local vagrant machine despite using the same code and the same settings for nginx, supervisor, gunicorn and daphne. I tried changing the channel layer settings to use IPC instead of Redis, and it works. Here are the settings:
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "asgi_ipc.IPCChannelLayer",
        "ROUTING": "tms_app.routing.channel_routing",
        "CONFIG": {
            "prefix": "tms",
        },
    },
}
However, this does not solve the underlying problem, as I intend to use the Redis channel layer because it is easier to scale than IPC. Does this mean there's something wrong with my Redis server?
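As a first sanity check (not part of the original post), it may be worth confirming from the app host that the Redis server the channel layer points at is reachable at all; redis-py is installed as a dependency of asgi_redis, so a quick hypothetical check could look like this:
import os
import redis
# Uses the same REDIS_HOST environment variable as the settings above.
r = redis.StrictRedis(host=os.environ.get('REDIS_HOST', 'localhost'), port=6379)
print(r.ping())  # True if Redis answers; raises a connection error otherwise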

I think the reason your connection doesn't complete is that you are not sending the accept message, like this:
message.reply_channel.send({'accept': True})
This is what works for my version of Channels, but you should check the docs for your version to see what works for you.
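Applied to the consumer from the question, that would look roughly like this (a sketch only; whether the explicit accept is required depends on the Channels version in use):
from channels import Group
from channels.auth import channel_session_user_from_http
from django.conf import settings

@channel_session_user_from_http
def connect_tracking(message):
    # Explicitly accept the WebSocket handshake before doing anything else.
    message.reply_channel.send({'accept': True})
    Group(settings.TRACKING_GROUP_NAME).add(message.reply_channel)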

Related

Django; ThreadPoolExecutor causes FATAL; sorry, too many clients

I'm aware this error has already been posted, but the context of my issue is different. I have tried the proposed solutions without success.
I'm using django-apscheduler to schedule some management commands, one of these uses ThreadPoolExecutor to speed up an otherwise lengthy process.
Unfortunately, it does not close connections, which eventually leads to:
exception:connection to server at "localhost" (::1), port 5432 failed: FATAL: sorry, too many clients already
I'm using PostgreSQL backend, and when I run the management command directly I can see in pgAdmin that the number of database sessions reduces after execution.
However, when run by django-apscheduler, they accumulate and eventually cause FATAL: sorry, too many clients already.
I've tried calling close_old_connections() (from django.db import close_old_connections), but that doesn't seem to do anything.
Can someone point me in the right direction?
This is what my management command looks like:
import concurrent.futures

from django.core.management.base import BaseCommand
from data_model.models import DataModel

class Command(BaseCommand):
    help = "Do something interesting"

    def handle(self, *args, **kwargs):
        with concurrent.futures.ThreadPoolExecutor(10) as executor:
            list(executor.map(DataModel.objects.do_something_interesting))
I also tried adding the connection expiration time CONN_MAX_AGE to the DB config, but that didn't help:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": env.str("POSTGRES_DB"),
        "USER": env.str("POSTGRES_USER"),
        "PASSWORD": env.str("POSTGRES_PASSWORD"),
        "HOST": env.str("POSTGRES_HOST"),
        "PORT": 5432,
        "CONN_MAX_AGE": 60,
    },
}
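For reference, the usual pattern in this situation is to close the thread-local database connection inside each worker thread once its work is done, rather than once in the main thread (close_old_connections() only affects the connection of the thread that calls it). A hypothetical sketch of the command above, where the _process wrapper and the DataModel.objects.all() iterable are illustrative assumptions rather than code from the original post:
import concurrent.futures

from django.core.management.base import BaseCommand
from django.db import connection

from data_model.models import DataModel

def _process(obj):
    try:
        return DataModel.objects.do_something_interesting(obj)
    finally:
        # Django gives every thread its own connection; close this thread's
        # connection when its work is done so sessions don't pile up in Postgres.
        connection.close()

class Command(BaseCommand):
    help = "Do something interesting without leaking DB connections"

    def handle(self, *args, **kwargs):
        with concurrent.futures.ThreadPoolExecutor(10) as executor:
            list(executor.map(_process, DataModel.objects.all()))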

Socket handshake error when using gunicorn

I have a Flask app that processes a WebSocket stream of audio from Twilio.
The app works fine without gunicorn, but when I start it with gunicorn I get only the first message of the socket (connect) and an unsuccessful handshake. Here is how the app looks:
from flask import Flask
from flask_sockets import Sockets
from geventwebsocket.handler import WebSocketHandler
from gevent import pywsgi
...

app = Flask(__name__)
sockets = Sockets(app)
...

@sockets.route('/media')
def media(ws):
    ...

if __name__ == '__main__':
    server = pywsgi.WSGIServer(('', HTTP_SERVER_PORT), app, handler_class=WebSocketHandler)
    server.serve_forever()
When I start the app directly using python flaskapp.py it works ok.
When I start it using gunicorn by writing:
gunicorn -k flask_sockets.worker --bind 0.0.0.0:5055 --log-level=debug flaskapp:app
this is where the connection "hangs" and goes no further than the initial connect, apparently due to the handshake failing.
It's important to note that I haven't "gevent monkey patched" the code, but I'm not sure if it has anything to do with the problem.
Any ideas will be much appreciated!
Don't have the ability to test this right now, but perhaps try with:
from flask import Flask
from flask_sockets import Sockets
from geventwebsocket.handler import WebSocketHandler
from gevent import pywsgi
...

app = Flask(__name__)
sockets = Sockets(app)
...

@sockets.route('/media')
def media(ws):
    ...

server = pywsgi.WSGIServer(('', HTTP_SERVER_PORT), app, handler_class=WebSocketHandler)

if __name__ == '__main__':
    server.serve_forever()
Then change the launch command to:
gunicorn -k flask_sockets.worker --bind 0.0.0.0:5055 --log-level=debug flaskapp:server
(Gunicorn should be importing the server object, which can't live within that final if statement, as that code only runs when launched with python directly).
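As a side note on the monkey-patching question raised above: if it does turn out to be needed, the usual pattern (shown here only as a sketch, not verified against this app) is to patch before anything else imports the standard-library modules gevent replaces:
# Hypothetical: must run at the very top of flaskapp.py, before other imports.
from gevent import monkey
monkey.patch_all()

from flask import Flask
from flask_sockets import Sockets

app = Flask(__name__)
sockets = Sockets(app)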

celery: could not connect to rabbitmq

I'm using RabbitMQ as the broker for Celery. The issue occurs while running the command:
celery -A proj worker --loglevel=info
The Celery console shows this:
[2017-06-23 07:57:09,261: ERROR/MainProcess] consumer: Cannot connect to amqp://bruce:**@127.0.0.1:5672//: timed out.
Trying again in 2.00 seconds...
[2017-06-23 07:57:15,285: ERROR/MainProcess] consumer: Cannot connect to amqp://bruce:**@127.0.0.1:5672//: timed out.
Trying again in 4.00 seconds...
The following are the logs from RabbitMQ:
=ERROR REPORT==== 23-Jun-2017::13:28:58 ===
closing AMQP connection <0.18756.0> (127.0.0.1:58424 -> 127.0.0.1:5672):
{handshake_timeout,frame_header}
=INFO REPORT==== 23-Jun-2017::13:29:04 ===
accepting AMQP connection <0.18897.0> (127.0.0.1:58425 -> 127.0.0.1:5672)
=ERROR REPORT==== 23-Jun-2017::13:29:14 ===
closing AMQP connection <0.18897.0> (127.0.0.1:58425 -> 127.0.0.1:5672):
{handshake_timeout,frame_header}
=INFO REPORT==== 23-Jun-2017::13:29:22 ===
accepting AMQP connection <0.19054.0> (127.0.0.1:58426 -> 127.0.0.1:5672)
Any input would be appreciated.
I know it's late, but I came across the same issue today and spent almost an hour finding the exact fix. Thought it might help someone else.
I was using celery version 4.1.0
Hopefully you have configured RabbitMQ properly; if not, please configure it as described at http://docs.celeryproject.org/en/latest/getting-started/brokers/rabbitmq.html#setting-up-rabbitmq
Also cross-check that the broker URL is correct. Here is the broker URL syntax:
amqp://user_name:password@localhost/host_name
You might not need to specify the port number, since it will automatically select the default one.
If you use the same values from the setup tutorial linked above, your broker URL will look like:
amqp://myuser:mypassword@localhost/myvhost
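A hypothetical way to sanity-check that URL from Python before involving Celery at all (kombu is installed as a dependency of Celery):
from kombu import Connection

# Raises an exception if the broker is unreachable or the credentials are wrong.
with Connection('amqp://myuser:mypassword@localhost/myvhost', connect_timeout=5) as conn:
    conn.connect()
    print('Broker reachable')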
Follow this project structure
Project
../app
../Project
../settings.py
../celery.py
../tasks.py
../celery_config.py
celery_config.py
# - - - - - - - - - -
# BROKER SETTINGS
# - - - - - - - - - -
# BROKER_URL = os.environ['APP_BROKER_URL']
BROKER_HEARTBEAT = 10
BROKER_HEARTBEAT_CHECKRATE = 2.0
# Setting BROKER_POOL_LIMIT to None disables pooling
# Disabling pooling causes open/close connections for every task.
# However, the rabbitMQ cluster being behind an Elastic Load Balancer,
# the pooling is not working correctly,
# and the connection is lost at some point.
# There seems no other way around it for the time being.
BROKER_POOL_LIMIT = None
BROKER_TRANSPORT_OPTIONS = {'confirm_publish': True}
BROKER_CONNECTION_TIMEOUT = 20
BROKER_CONNECTION_RETRY = True
BROKER_CONNECTION_MAX_RETRIES = 100
celery.py
from __future__ import absolute_import, unicode_literals

from celery import Celery

from Project import celery_config

app = Celery('Project',
             broker='amqp://myuser:mypassword@localhost/myvhost',
             backend='amqp://',
             include=['Project'])

# Optional configuration, see the application user guide.
# app.conf.update(
#     result_expires=3600,
#     CELERY_BROKER_POOL_LIMIT=None,
# )

app.config_from_object(celery_config)

if __name__ == '__main__':
    app.start()
tasks.py
from __future__ import absolute_import, unicode_literals

from .celery import app

@app.task
def add(x, y):
    return x + y
Then start Celery with "celery -A Project worker -l info" from the project directory.
Everything will be fine.
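To verify that a worker actually picks tasks up, one could then enqueue the example task from a Python shell in the project directory (a quick hypothetical check, assuming the layout above):
from Project.tasks import add

result = add.delay(4, 4)       # publishes the task to RabbitMQ
print(result.get(timeout=10))  # prints 8 once a worker has executed it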
Set CELERY_BROKER_POOL_LIMIT = None in settings.py.
This solution is for GCP users.
I've been working on GCP and faced the same issue.
The error message was:
[2022-03-15 16:56:00,318: ERROR/MainProcess] consumer: Cannot connect to amqp://root:**@34.125.161.132:5672/vhost: timed out.
I spent almost an hour on this issue and finally found the solution: you have to add port 5672 to the firewall rules.
Steps:
Go to Firewall
Select the default-allow-http rule
Press Edit
Search for "Specified protocols and ports"
Add 5672 in the tcp box (for example, to add more ports: 80,5672,8000)
Save the changes and there you go!

Django channels times out with daphne and worker

I have a problem with django channels.
My Django app was running perfectly with WSGI for HTTP requests.
I tried to migrate to Channels in order to allow WebSocket requests, and it turns out that after installing Channels and running ASGI (daphne) and a worker, the server answers with error 503 and the browser displays error 504 (timeout) for the HTTP requests that were previously working (the admin page, for example).
I read all the tutorials I could find and I do not see what the problem could be. Moreover, if I run with "runserver", it works fine.
I have an Nginx in front of the app (on a separate server), working as proxy and loadbalancer.
I use Django 1.9.5 with asgi-redis>=0.10.0, channels>=0.17.0 and daphne>=0.15.0. The wsgi.py and asgi.py files are in the same folder. Redis is working.
The command I was previously using with WSGI (and which still works if I switch back to it) is:
uwsgi --http :8000 --master --enable-threads --module Cats.wsgi
The command that works using runserver is:
python manage.py runserver 0.0.0.0:8000
The commands that fail for the requests that work with the 2 other commands are:
daphne -b 0.0.0.0 -p 8000 Cats.asgi:channel_layer
python manage.py runworker
Other info:
I added 'channels' in the installed apps (in settings.py)
Other relevant settings.py info:
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "asgi_redis.RedisChannelLayer",
        "ROUTING": "Cats.routing.app_routing",
        "CONFIG": {
            "hosts": [(os.environ['REDIS_HOST'], 6379)],
        },
    },
}
Cats/routing.py
from channels.routing import route, include
from main.routing import routing as main_routing

app_routing = [
    include(main_routing, path=r"^/ws/main"),
]
main/routing.py
from channels.routing import route, include

http_routing = [
]

stream_routing = [
    route('websocket.receive', 'main.consumers.ws_echo'),  # just for test once it works
]

routing = [
    include(stream_routing),
    include(http_routing),
]
main/consumers.py
def ws_echo(message):
    message.reply_channel.send({
        'text': message.content['text'],
    })
# this consumer is just for test once it works
Any idea what could be wrong? All help much appreciated! Ty
EDIT:
I tried a new thing:
python manage.py runserver 0.0.0.0:8000 --noworker
python manage.py runworker
And this does not work, while python manage.py runserver 0.0.0.0:8000 was working...
Any idea that could help?
Channels will use the default views for un-routed requests.
Assuming your JavaScript is correct, I suggest you use only your default Cats/routing.py file, as follows:
from channels.routing import route
from main.consumers import *

app_routing = [
    route('websocket.connect', ws_echo, path="/ws/main")
]
Or, with reverse to help with your path:
from django.urls import reverse
from channels.routing import route
from main.consumers import *

app_routing = [
    route('websocket.connect', ws_echo, path=reverse('main view name'))
]
I also think your consumer should be changed. When the browser connects using WebSockets, the server should first handle adding the message reply channel to a group. Something like:
import json
from channels import Group

def ws_echo(message):
    Group("notifications").add(message.reply_channel)
    Group("notifications").send({
        "text": json.dumps({'testkey': 'testvalue'})
    })
The send call should probably happen on a different event, and the "notifications" Group should probably be changed to a group dedicated to the user. Something like:
import json
from channels import Group
from channels.auth import channel_session_user_from_http

@channel_session_user_from_http
def ws_echo(message):
    Group("notify-private-%s" % message.user.id).add(message.reply_channel)
    Group("notify-private-%s" % message.user.id).send({
        "text": json.dumps({'testkey': 'testvalue'})
    })
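Building on that, the same per-user group can later be targeted from ordinary Django code, for example a view or signal handler (a hypothetical helper reusing the group naming above):
import json
from channels import Group

def notify_user(user_id, payload):
    # Push a payload to one user's group, using the same naming as the consumer above.
    Group("notify-private-%s" % user_id).send({
        "text": json.dumps(payload),
    })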
If you're using heroku or dokku make sure you've properly set the "scale" to include the worker process. By default they will only run the web instance and not the worker!
For heroku
heroku ps:scale web=1:free worker=1:free
For dokku create a file named DOKKU_SCALE and add in:
web=1
worker=1
See:
http://blog.codelv.com/2017/10/timouts-django-channels-on-dokku.html
https://blog.heroku.com/in_deep_with_django_channels_the_future_of_real_time_apps_in_django

Running periodic tasks with django and celery

I'm trying to create a simple background periodic task using the Django-Celery-RabbitMQ combination. I installed Django 1.3.1 and downloaded and set up djcelery. Here is what my settings.py file looks like:
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "/"
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
....
import djcelery
djcelery.setup_loader()
...
INSTALLED_APPS = (
    'djcelery',
)
And I put a 'tasks.py' file in my application folder with the following contents:
from celery.task import PeriodicTask
from celery.registry import tasks
from datetime import timedelta
from datetime import datetime

class MyTask(PeriodicTask):
    run_every = timedelta(minutes=1)

    def run(self, **kwargs):
        self.get_logger().info("Time now: " + str(datetime.now()))
        print("Time now: " + str(datetime.now()))

tasks.register(MyTask)
And then I start up my django server (local development instance):
python manage.py runserver
Then I start up the celerybeat process:
python manage.py celerybeat --logfile=<path_to_log_file> -l DEBUG
I can see entries like this in the log:
[2012-04-29 07:50:54,671: DEBUG/MainProcess] tasks.MyTask sent. id->72a5963c-6e15-4fc5-a078-dd26da663323
And I can also see the corresponding entries getting created in the database, but I can't find where it logs the text I specified in the run function of the MyTask class.
I tried fiddling with the logging settings and tried using the Django logger instead of the Celery logger, but to no avail. I'm not even sure my task is getting executed. If I print any debug information in the task, where does it go?
Also, this is the first time I'm working with any type of message queuing system. It looks like the task will get executed as part of the celerybeat process, outside the Django web framework. Will I still be able to access all the Django models I created?
Thanks,
Venkat.
Celerybeat is the scheduler: it pushes tasks when they are due, but it does not execute them. Your task instances are stored on the RabbitMQ server. You need to run the celeryd daemon to actually execute your tasks:
python manage.py celeryd --logfile=<path_to_log_file> -l DEBUG
Also, if you are using RabbitMQ, I recommend enabling the RabbitMQ management plugin:
rabbitmq-plugins list
rabbitmq-plugins enable rabbitmq_management
service rabbitmq-server restart
It will be available at http://:55672/ (login: guest, pass: guest). There you can check how many tasks are in your RabbitMQ instance.
You should check the RabbitMQ logs, since celery sends the tasks to RabbitMQ and it should execute them. So all the prints of the tasks should be in RabbitMQ logs.
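Regarding the question above about accessing Django models from tasks: yes, tasks run inside the Celery worker process with the Django settings loaded via djcelery, so the ORM works as usual. A hypothetical sketch (myapp.MyModel is an assumed app and model, not from the original post):
from datetime import timedelta
from celery.task import PeriodicTask
from myapp.models import MyModel  # assumed app and model, for illustration only

class CountRowsTask(PeriodicTask):
    run_every = timedelta(minutes=5)

    def run(self, **kwargs):
        # The Django ORM is fully available inside the worker (celeryd) process.
        self.get_logger().info("MyModel currently has %d rows" % MyModel.objects.count())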