Django channels redis channel layer opens a lot of connections - django

We recently ported part of our application to Django Channels, using the Redis channel layer backend. Part of our setup still runs on Python 2 in a Docker container, which is why we use Redis pub/sub to send messages back to the client. A global listener (inspired by this thread) catches all messages and distributes them into the Django Channels system. It all works fine so far, but I see a lot of "Creating tcp connection..." debug messages passing by. The output posted below corresponds to one event. Both the listener and the consumer seem to be creating two Redis connections. I don't know enough about the underlying mechanism to tell whether this is the expected behavior, hence my question: is this to be expected?
The Listener uses a global channel layer instance:
# maps the publish type to a method name of the django channel consumer
PUBLISH_TYPE_EVENT_MAP = {
    'state_change': 'update_client_state',
    'message': 'notify_client',
}
channel_layer = layers.get_channel_layer()
class Command(BaseCommand):
    help = u'Opens a connection to Redis and listens for messages, ' \
           u'and then whenever it gets one, sends the message onto a channel ' \
           u'in the Django channel system'
    ...
    def broadcast_message(self, msg_body):
        group_name = msg_body['subgrid_id'].replace(':', '_')
        try:
            event_name = PUBLISH_TYPE_EVENT_MAP[msg_body['publish_type']]
        # backwards compatibility
        except KeyError:
            event_name = PUBLISH_TYPE_EVENT_MAP[msg_body['type']]
        async_to_sync(channel_layer.group_send)(
            group_name, {
                "type": event_name,
                "kwargs": msg_body.get('kwargs'),
            })
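The lookup part of broadcast_message can be exercised without Redis or Channels. This is a minimal sketch, assuming the message shape seen in the log output further down; `resolve_event` is a hypothetical helper name that does not appear in the original command.

```python
# Maps the publish type to a method name of the Django Channels consumer.
PUBLISH_TYPE_EVENT_MAP = {
    'state_change': 'update_client_state',
    'message': 'notify_client',
}


def resolve_event(msg_body):
    """Return (group_name, event_name) for a decoded pub/sub message.

    Hypothetical helper that mirrors the lookup in broadcast_message.
    """
    # The listener replaces ':' with '_' to build the group name.
    group_name = msg_body['subgrid_id'].replace(':', '_')
    try:
        event_name = PUBLISH_TYPE_EVENT_MAP[msg_body['publish_type']]
    except KeyError:
        # Backwards compatibility: older publishers send 'type' instead.
        event_name = PUBLISH_TYPE_EVENT_MAP[msg_body['type']]
    return group_name, event_name


group, event = resolve_event(
    {'publish_type': 'state_change', 'subgrid_id': 'subgrid:6d16'}
)
```

Keeping the lookup separate from the `group_send` call makes it easy to check the mapping on its own, independent of how many connections the channel layer opens.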
The consumer is a JsonWebsocketConsumer that is initialized like this
class SimulationConsumer(JsonWebsocketConsumer):
    def connect(self):
        """
        Establishes the connection with the websocket.
        """
        logger.debug('Incoming connection...')
        # subgrid_id can be set dynamically, see property subgrid_id
        self._subgrid_id = self.scope['url_route']['kwargs']['subgrid_id']
        async_to_sync(self.channel_layer.group_add)(
            self.group_name,
            self.channel_name
        )
        self.accept()
And the method that is called from the listener:
def update_client_state(self, event):
    """
    Public facing method that pushes the state of a simulation back
    to the client(s). Has to be called through django channels
    ```async_to_sync(channel_layer.group_send)...``` method
    """
    logger.debug('update_client_state event %s', event)
    current_state = self.redis_controller.fetch_state()
    data = {'sim_state': {
        'sender_sessid': self.session_id,
        'state_data': current_state}
    }
    self.send_json(content=data)
A single event gives me this output
listener_1 | DEBUG !! data {'publish_type': 'state_change', 'subgrid_id': 'subgrid:6d1624b07e1346d5907bbd72869c00e8'}
listener_1 | DEBUG !! event_name update_client_state
listener_1 | DEBUG !! kwargs None
listener_1 | DEBUG !! group_name subgrid_6d1624b07e1346d5907bbd72869c00e8
listener_1 | DEBUG Using selector: EpollSelector
listener_1 | DEBUG Parsing Redis URI 'redis://:@redis-threedi-server:6379/13'
listener_1 | DEBUG Creating tcp connection to ('redis-threedi-server', 6379)
listener_1 | DEBUG Parsing Redis URI 'redis://:@redis-threedi-server:6379/13'
listener_1 | DEBUG Creating tcp connection to ('redis-threedi-server', 6379)
threedi-server_1 | DEBUG Parsing Redis URI 'redis://:@redis-threedi-server:6379/13'
threedi-server_1 | DEBUG Creating tcp connection to ('redis-threedi-server', 6379)
threedi-server_1 | DEBUG update_client_state event {'type': 'update_client_state', 'kwargs': None}
threedi-server_1 | DEBUG Parsing Redis URI 'redis://:@redis-threedi-server:6379/13'
threedi-server_1 | DEBUG Creating tcp connection to ('redis-threedi-server', 6379)

Related

ValueError('Cannot invoke RPC: Channel closed!') when multiple process are launched together

I get this error when I launch, from zero, more than 4 processes at the same time:
{
  "insertId": "61a4a4920009771002b74809",
  "jsonPayload": {
    "asctime": "2021-11-29 09:59:46,620",
    "message": "Exception in callback <bound method ResumableBidiRpc._on_call_done of <google.api_core.bidi.ResumableBidiRpc object at 0x3eb1636b2cd0>>: ValueError('Cannot invoke RPC: Channel closed!')",
    "funcName": "handle_event",
    "lineno": 183,
    "filename": "_channel.py"
  }
}
This is the pub-sub schema:
[Image: pub-sub schema]
The error seems to happen at step 9 or 10.
The actual code is:
future = publisher.publish(
    topic_path,
    encoded_message,
    msg_attribute=message_key
)
future.add_done_callback(
    callback=lambda f:
        logging.info(...)
)
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(
    PROJECT_ID,
    "..."
)
streaming_pull_future = subscriber.subscribe(
    subscription_path,
    callback=aggregator_callback_handler.handle_message
)
aggregator_callback_handler.callback = streaming_pull_future
wait_result(
    timeout=300,
    pratica=...,
    f_check_res_condition=lambda: aggregator_callback_handler.response is not None
)
streaming_pull_future.cancel()
subscriber.close()
The module aggregator_callback_handler handles .nack and .ack.
The error is returned for a few seconds, then the VMs hosting the services scale up and the error stops. The same is true if, instead of launching the processes all together, I scale manually by launching them one by one with some sleep in between.
I've already checked the timeouts and moved the subscriber outside of the context manager, but neither change helped.
Any idea on how to handle this?

Access Kafka Cluster Outside GCP

I'm currently trying to access the Kafka cluster (Bitnami) from my local machine. However, even after exposing the required host and ports in server.properties and adding firewall rules to allow port 9092, it just doesn't connect.
I'm running a configuration with 2 brokers and 1 ZooKeeper node.
Expected Output: Producer.bootstrap_connected() should return True.
Actual Output: False
server.properties
listeners=SASL_PLAINTEXT://:9092
advertised.listeners=SASL_PLAINTEXT://gcp-cluster-name:9092
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
security.inter.broker.protocol=SASL_PLAINTEXT
Consumer.py
from kafka import KafkaConsumer
import json
import ssl
sasl_mechanism = 'PLAIN'
security_protocol = 'SASL_PLAINTEXT'
# Create a new context using system defaults, disable all but TLS1.2
context = ssl.create_default_context()
# OR the flags in; "&=" would clear the other option bits instead of setting these
context.options |= ssl.OP_NO_TLSv1
context.options |= ssl.OP_NO_TLSv1_1
consumer = KafkaConsumer('organic-sense',
                         bootstrap_servers='<server-ip>:9092',
                         value_deserializer=lambda x: json.loads(x.decode('utf-8')),
                         ssl_context=context,
                         sasl_plain_username='user',
                         sasl_plain_password='<password>',
                         sasl_mechanism=sasl_mechanism,
                         security_protocol=security_protocol,
                         )
print(consumer.bootstrap_connected())
for data in consumer:
    print(data)
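As a side note on the context setup in Consumer.py: flags such as ssl.OP_NO_TLSv1 are meant to be OR-ed into context.options, and on Python 3.7+ the same intent can be expressed with minimum_version. A minimal sketch using only the standard library; `make_tls12_context` is a hypothetical helper name, and this says nothing about whether the context is the cause of the connection failure.

```python
import ssl


def make_tls12_context():
    """Build a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    # Rejects SSLv3, TLS 1.0 and TLS 1.1; TLS 1.2 and newer stay allowed.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls12_context()
```

Note that this permits TLS 1.3 as well, which is slightly broader than "all but TLS1.2" in the original comment.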

Channels websocket AsyncJsonWebsocketConsumer disconnect not reached

I have the following consumer:
class ChatConsumer(AsyncJsonWebsocketConsumer):
    pusher = None
    async def connect(self):
        print(self.scope)
        ip = self.scope['client'][0]
        print(ip)
        self.pusher = await self.get_pusher(ip)
        print(self.pusher)
        await self.accept()
    async def disconnect(self, event):
        print("closed connection")
        print("Close code = ", event)
        await self.close()
        raise StopConsumer
    async def receive_json(self, content):
        #print(content)
        if 'categoryfunctionname' in content:
            await cellvoltage(self.pusher, content)
        else:
            print("ERROR: Wrong data packets send")
            print(content)
    @database_sync_to_async
    def get_pusher(self, ip):
        p = Pusher.objects.get(auto_id=1)
        try:
            p = Pusher.objects.get(ip=ip)
        except ObjectDoesNotExist:
            print("no pusher found")
        finally:
            return p
Connecting, receiving and even getting stuff async from the database works perfectly. Only disconnecting does not work as expected. The following Terminal Log explains what's going on:
2018-09-19 07:09:56,653 - INFO - server - HTTP/2 support not enabled (install the http2 and tls Twisted extras)
2018-09-19 07:09:56,653 - INFO - server - Configuring endpoint tcp:port=1111:interface=192.168.1.111
2018-09-19 07:09:56,653 - INFO - server - Listening on TCP address 192.168.1.111:1111
[2018/09/19 07:11:25] HTTP GET / 302 [0.02, 10.171.253.112:35236]
[2018/09/19 07:11:25] HTTP GET /login/?next=/ 200 [0.05, 10.111.253.112:35236]
{'type': 'websocket', 'path': '/ws/chat/RP1/', 'headers': [(b'upgrade', b'websocket'), (b'connection', b'Upgrade'), (b'host', b'10.111.111.112:1111'), (b'origin', b'http://10.111.253.112:1111'), (b'sec-websocket-key', b'vKFAnqaRMm84AGUCxbAm3g=='), (b'sec-websocket-version', b'13')], 'query_string': b'', 'client': ['10.111.253.112', 35238], 'server': ['192.168.1.111', 1111], 'subprotocols': [], 'cookies': {}, 'session': <django.utils.functional.LazyObject object at 0x7fe4a8d1ba20>, 'user': <django.utils.functional.LazyObject object at 0x7fe4a8d1b9e8>, 'path_remaining': '', 'url_route': {'args': (), 'kwargs': {'room_name': 'RP1'}}}
10.111.253.112
[2018/09/19 07:11:25] WebSocket HANDSHAKING /ws/chat/RP1/ [10.111.253.112:35238]
[2018/09/19 07:11:25] WebSocket CONNECT /ws/chat/RP1/ [10.111.111.112:35238]
no pusher found
1 - DEFAULT - 0.0.0.0
ERROR: Wrong data packets send
{'hello': 'Did I achieve my objective?'}
[2018/09/19 07:11:46] WebSocket DISCONNECT /ws/chat/RP1/ [10.111.253.112:35238]
2018-09-19 07:11:56,792 - WARNING - server - Application instance <Task pending coro=<SessionMiddlewareInstance.__call__() running at /home/pi/PycharmProjects/LOGGER/venv/lib/python3.6/site-packages/channels/sessions.py:175> wait_for=<Future pending cb=[_chain_future.<locals>._call_check_cancel() at /usr/lib/python3.6/asyncio/futures.py:403, <TaskWakeupMethWrapper object at 0x7fe4a82e6fd8>()]>> for connection <WebSocketProtocol client=['10.171.253.112', 35238] path=b'/ws/chat/RP1/'> took too long to shut down and was killed.
After the 10 seconds timeout it gives a warning that the connection was killed:
WARNING - server - Application instance taskname running at location
at linenumber for connection cxn-name took too long to shut down and
was killed.
The disconnect method was thus also not reached.
What could this be?
Am I using the correct method?
Could I expand the timeout period?
If you intend to run some custom logic when the connection closes, you should override websocket_disconnect and then call super (rather than raising the exception yourself).
https://github.com/django/channels/blob/507cb54fcb36df63282dd19653ea743036e7d63c/channels/generic/websocket.py#L228-L241
The code linked in the other answer was very helpful. Here it is for reference (as of May 2022):
async def websocket_disconnect(self, message):
    """
    Called when a WebSocket connection is closed. Base level so you don't
    need to call super() all the time.
    """
    try:
        for group in self.groups:
            await self.channel_layer.group_discard(group, self.channel_name)
    except AttributeError:
        raise InvalidChannelLayerError(
            "BACKEND is unconfigured or doesn't support groups"
        )
    await self.disconnect(message["code"])
    raise StopConsumer()
async def disconnect(self, code):
    """
    Called when a WebSocket connection is closed.
    """
    pass
I'd note that there is no need to use super() on websocket_disconnect. There is a disconnect hook, perhaps added since the original answer, that can be used for any teardown code. You can simply override the disconnect method in your own class and it will be called.
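The hook pattern described above can be tried in isolation. This is a minimal sketch with stand-in classes: `BaseConsumer` and `StopConsumer` below only imitate the quoted Channels code and are not the real library classes, and `simulate_close` is a hypothetical driver.

```python
import asyncio


class StopConsumer(Exception):
    """Stand-in for channels.exceptions.StopConsumer."""


class BaseConsumer:
    """Stand-in for the generic consumer; NOT the real Channels class."""

    async def websocket_disconnect(self, message):
        # Framework-level handler: call the hook, then stop the consumer.
        await self.disconnect(message["code"])
        raise StopConsumer()

    async def disconnect(self, code):
        # Default hook does nothing.
        pass


class ChatConsumer(BaseConsumer):
    def __init__(self):
        self.closed_with = None

    async def disconnect(self, code):
        # Teardown only: no self.close() and no manual StopConsumer here.
        self.closed_with = code


async def simulate_close(consumer, code):
    """Deliver a disconnect message and report whether the consumer stopped."""
    try:
        await consumer.websocket_disconnect(
            {"type": "websocket.disconnect", "code": code}
        )
    except StopConsumer:
        return True
    return False


consumer = ChatConsumer()
stopped = asyncio.run(simulate_close(consumer, 1001))
```

In this sketch the subclass never raises StopConsumer itself; the base handler does that after the hook returns.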

Rabbitmq message queues pile up until system crash (all queues are "ready")

I have a simple Raspberry pi + Django + Celery + Rabbitmq setup that I use to send and receive data from Xbee radios while users interact with the web app.
For the life of me I can't get Rabbitmq (or celery?) under control: after only a single day (sometimes a little longer) the whole system crashes due to some kind of memory leak.
What I am suspecting is that the queues are piling up and never being removed.
Here's a picture of what I see after only a few minutes of run time:
Seems that all of the queues are in the "ready" state.
What's strange is that it would appear that the workers do in fact receive the message and run the task.
The task is very small and shouldn't take longer than 1 second.
I have verified the tasks do execute to the last line and should be returning ok.
I'm no expert and have no clue what I'm actually looking at so I'm unsure if that is normal behavior and my issue lies elsewhere?
I have everything set to run as daemonized, however even when running in development modes I get same results.
I have spent the last four hours debugging with Google search and found it was taking me in circles and I was not finding clarity.
[CONFIGS AND CODE]
In /etc/default/celeryd I have set the following:
CELERY_APP="MyApp"
CELERYD_NODES="w1"
# Python interpreter from environment.
ENV_PYTHON="/home/pi/.virtualenvs/myappenv/bin/python"
# Where to chdir at start.
CELERYD_CHDIR="/home/pi/django_projects/MyApp"
# Virtual Environment Setup
ENV_MY="/home/pi/.virtualenvs/myappenv"
CELERYD="$ENV_MY/bin/celeryd"
CELERYD_MULTI="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryd_multi"
CELERYCTL="$ENV_MY/bin/celeryctl"
CELERYD_OPTS="--app=MyApp --concurrency=1 --loglevel=FATAL"
CELERYD_LOG_FILE="/var/log/celery/%n.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
CELERYD_USER="celery"
CELERYD_GROUP="celery"
tasks.py
@celery.task
def sendStatus(modelContext, ignore_result=True, *args, **kwargs):
    node = modelContext#EndNodes.objects.get(node_addr_lg=node_addr_lg)
    #check age of message and proceed to send status update if it is fresh, otherwise we'll skip it
    if not current_task.request.eta == None:
        now_date = datetime.now().replace(tzinfo=None) #the time now
        eta_date = dateutil.parser.parse(current_task.request.eta).replace(tzinfo=None)#the time this was supposed to run, remove timezone from message eta datetime
        delta_seconds = (now_date - eta_date).total_seconds()#seconds from when this task was supposed to run
        if delta_seconds >= node.status_timeout:#if the message was queued more than delta_seconds ago this message is too old to process
            return
    #now that we know the message is fresh we can proceed to process the contents and send status to xbee
    hostname = current_task.request.hostname #the name/key in the schedule that might have related xbee sessions
    app = Celery('app')#create a new instance of app (because documented methods didnt work)
    i = app.control.inspect()
    scheduled_tasks = i.scheduled()#the schedule of tasks in the queue
    for task in scheduled_tasks[hostname]:#iterate through each task
        xbee_session = ast.literal_eval(task['request']['kwargs'])#the request data in the message (converts unicode to dict)
        if xbee_session['xbee_addr'] == node.node_addr_lg:#get any session data for this device that we may have set from model's save override
            if xbee_session['type'] == 'STAT':#because we are responding with status update we look for status sessions
                app.control.revoke(task['request']['id'], terminate=True)#revoke this task because it is redundant and we are sending update now
    page_mode = chr(node.page_mode)#the paging mode to set on the remote device
    xbee_global.tx(dest_addr_long=bytearray.fromhex(node.node_addr_lg),
                   frame_id='A',
                   dest_addr='\xFF\xFE',
                   data=page_mode)
celery splash:
-------------- celery@raspberrypi v3.1.23 (Cipater)
---- **** -----
--- * *** * -- Linux-4.4.11-v7+-armv7l-with-debian-8.0
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: MyApp:0x762efe10
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: amqp://
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. MyApp.celery.debug_task
. clone_app.tasks.nodeInterval
. clone_app.tasks.nodePoll
. clone_app.tasks.nodeState
. clone_app.tasks.resetNetwork
. clone_app.tasks.sendData
. clone_app.tasks.sendStatus
[2016-10-11 03:41:12,863: WARNING/Worker-1] Got signal worker_process_init for task id None
[2016-10-11 03:41:12,913: WARNING/Worker-1] JUST OPENED
[2016-10-11 03:41:12,915: WARNING/Worker-1] /dev/ttyUSB0
[2016-10-11 03:41:12,948: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2016-10-11 03:41:13,101: INFO/MainProcess] mingle: searching for neighbors
[2016-10-11 03:41:14,206: INFO/MainProcess] mingle: all alone
[2016-10-11 03:41:14,341: WARNING/MainProcess] celery@raspberrypi ready.
[2016-10-11 03:41:16,223: WARNING/Worker-1] RAW DATA
[2016-10-11 03:41:16,225: WARNING/Worker-1] {'source_addr_long': '\x00\x13\xa2\x00#\x89\xe9\xd7', 'rf_data': '...^%:STAT:`', 'source_addr': '[*', 'id': 'rx', 'options': '\x01'}
[2016-10-11 03:41:16,458: INFO/MainProcess] Received task: clone_app.tasks.sendStatus[6e1a74ec-dca5-495f-a4fa-906a5c657b26] eta:[2016-10-11 03:41:17.307421+00:00]
I can provide additional details if required!!
And thank you for any help resolving this.
Wow, almost immediately after posting my question I found this post and it has completely resolved my issue.
As I expected ignore_result=True was required, I just was not sure where it belonged.
Now I see no queues except maybe for the instant a worker is running a task. :)
Here's the change in tasks.py:
#From
@celery.task
def sendStatus(modelContext, ignore_result=True, *args, **kwargs):
    #Some code here
#To
@celery.task(ignore_result=True)
def sendStatus(modelContext, *args, **kwargs):
    #Some code here
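Why moving the keyword matters can be shown without Celery: a parameter in the function signature is only a default argument of the function, whereas keyword arguments passed to the decorator are options the decorator itself can see. A minimal sketch follows; the `task` decorator and the `task_options` attribute are hypothetical stand-ins, not Celery's implementation.

```python
def task(func=None, **options):
    """Hypothetical stand-in for a task decorator; not Celery's code."""
    def wrap(f):
        # Record whatever options the decorator was called with.
        f.task_options = dict(options)
        return f
    if func is not None:
        # Used bare, as in "@task": no options were given.
        return wrap(func)
    # Used with arguments, as in "@task(ignore_result=True)".
    return wrap


@task
def send_status_before(model_context, ignore_result=True, *args, **kwargs):
    # ignore_result here is just a function parameter the decorator never sees.
    return model_context


@task(ignore_result=True)
def send_status_after(model_context, *args, **kwargs):
    return model_context
```

In the first form the decorator receives no options at all, which matches the observation that results kept piling up until the keyword was moved.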

APNS issue with django

I'm using the following project for enabling APNS in my project:
https://github.com/stephenmuss/django-ios-notifications
I'm able to send and receive push notifications in my production app without problems, but the sandbox APNs has strange issues that I'm not able to solve: it constantly fails to connect to the push service. When I manually call _connect() on the APNService or FeedbackService classes, I get the following error:
File "/Users/MyUser/git/prod/django/ios_notifications/models.py", line 56, in _connect
self.connection.do_handshake()
Error: [('SSL routines', 'SSL3_READ_BYTES', 'sslv3 alert handshake failure')]
I tried recreating the APN certificate a number of times and constantly get the same error. Is there anything else I'm missing?
I'm using the endpoints gateway.push.apple.com and gateway.sandbox.push.apple.com for connecting to the service. Is there anything else I should look into for this? I have read the following:
Apns php error "Failed to connect to APNS: 110 Connection timed out."
Converting PKCS#12 certificate into PEM using OpenSSL
Error Using PHP for iPhone APNS
Turns out Apple changed ssl context from SSL3 to TLSv1 in development. They will do this in Production eventually (not sure when). The following link shows my pull request which was accepted into the above project:
https://github.com/stephenmuss/django-ios-notifications/commit/879d589c032b935ab2921b099fd3286440bc174e
Basically, use OpenSSL.SSL.TLSv1_METHOD if you're using Python, or the equivalent in other languages.
Although OpenSSL.SSL.SSLv3_METHOD works in production, it may not work in the near future. OpenSSL.SSL.TLSv1_METHOD works in production and development.
UPDATE
Apple will remove SSL 3.0 support in production on October 29th, 2014 due to the POODLE flaw.
https://developer.apple.com/news/?id=10222014a
I have worked with APNs using Python and Django. For this you need three things: the URL, the port, and the certificate provided by Apple for authentication.
views.py
import socket, ssl, json, struct
theCertfile = '/tmp/abc.cert' ## absolute path where certificate file is placed.
ios_url = 'gateway.push.apple.com'
ios_port = 2195
deviceToken = '3234t54tgwg34g' ## ios device token to which you want to send notification
def ios_push(msg, theCertfile, ios_url, ios_port, deviceToken):
    thePayLoad = {
        'aps': {
            'alert':msg,
            'sound':'default',
            'badge':0,
        },
    }
    theHost = ( ios_url, ios_port )
    data = json.dumps( thePayLoad )
    deviceToken = deviceToken.replace(' ','')
    byteToken = deviceToken.decode('hex') # Python 2
    theFormat = '!BH32sH%ds' % len(data)
    theNotification = struct.pack( theFormat, 0, 32, byteToken, len(data), data )
    # Create our connection using the certfile saved locally
    ssl_sock = ssl.wrap_socket( socket.socket( socket.AF_INET, socket.SOCK_STREAM ), certfile = theCertfile )
    ssl_sock.connect( theHost )
    # Write out our data
    ssl_sock.write( theNotification )
    # Close the connection -- apple would prefer that we keep
    # a connection open and push data as needed.
    ssl_sock.close()
Hopefully this would work for you.
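For anyone on Python 3, the packing step from the snippet above can be sketched on its own, without opening a socket. This is only an illustration of the same frame layout ('!BH32sH%ds'); `build_notification` is a hypothetical helper name and the device token used here is a made-up placeholder.

```python
import json
import struct


def build_notification(device_token_hex, message):
    """Pack one notification frame using the layout from the answer above."""
    payload = json.dumps({
        'aps': {'alert': message, 'sound': 'default', 'badge': 0},
    }).encode('utf-8')
    # Python 3 replacement for the Python 2 call deviceToken.decode('hex').
    token = bytes.fromhex(device_token_hex.replace(' ', ''))
    if len(token) != 32:
        raise ValueError('expected a 32-byte device token')
    # command (1 byte), token length (2), token (32), payload length (2), payload
    fmt = '!BH32sH%ds' % len(payload)
    return struct.pack(fmt, 0, 32, token, len(payload), payload)


# Placeholder token: 64 hex characters, i.e. 32 bytes.
packet = build_notification('ab' * 32, 'hello')
```

The payload is encoded to bytes before its length is taken, so the length field counts bytes rather than characters.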