After moving to Channels 2 I'm still struggling with Python's "new" async/await and asyncio.
First I tried to reproduce the Worker and Background Tasks setup from the docs, but then I realised that my task should just run as a simple async function.
So, my test function is:
async def replay_run(self, event):
    print_info("replay_run", event, self.channel_name)
    import asyncio
    for i in range(10):
        await asyncio.sleep(1)
        print_info("replay_run-", i, event, self.channel_name)
and both of the following calls inside async def receive_json(self, event)
seem to prevent a subsequent incoming message from being handled right away.
Version 1:
await self.channel_layer.send(
    self.channel_name, {
        "type": "replay_run", "sessionID": msg["sessionID"]
    })
Version 2:
await self.replay_run(msg)
I first tried version 1 because I thought I needed to register a "new event consumer", perhaps with something like await asyncio.gather(...).
Any hint on how to do this right is appreciated ...
Found the solution here: django.channels async consumer does not appear to execute asynchronously
Apparently the way would be
Version 3:
asyncio.ensure_future(self.replay_run(msg))
... in order to schedule the "long running coroutine" without blocking the channel.
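For context, here is a minimal sketch of how that scheduling call fits into a consumer (the class name and message handling are illustrative, not from the original post):

import asyncio

from channels.generic.websocket import AsyncJsonWebsocketConsumer

class ReplayConsumer(AsyncJsonWebsocketConsumer):
    async def receive_json(self, msg):
        # Schedule the long-running coroutine and return immediately,
        # leaving the consumer free to handle the next incoming message.
        asyncio.ensure_future(self.replay_run(msg))

    async def replay_run(self, event):
        for i in range(10):
            await asyncio.sleep(1)
            print("replay_run-", i, event)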
In my Django quiz project I work with Channels. My consumer QuizConsumer is synchronous, but now I need it to start a timer that runs in parallel to its other functions and sends a message after the timer is over.
class QuizConsumer(WebsocketConsumer):
    def configure_round(self, lobby_id):
        # Send a message to all players to begin the round
        async_to_sync(self.channel_layer.group_send)(
            self.lobby_group_name,
            {
                'type': 'start_round',
                'lobby_id': lobby_id,
            }
        )
        self.show_results(lobby_id, numberVideos, playlength)

    def start_round(self, event):
        self.send(text_data=json.dumps({
            'type': 'start_round',
            'lobby_id': event['lobby_id']
        }))

    def show_results(self, lobby_id, numberVideos, playlength):
        async_to_sync(self.channel_layer.group_send)(
            self.lobby_group_name,
            {
                'type': 'show_results_delay',
                'lobby_id': lobby_id,
                'numberVideos': numberVideos,
                'playlength': playlength,
            }
        )

    def show_results_delay(self, event):
        time.sleep(event['numberVideos'] * event['playlength'] + 5)
        self.send(text_data=json.dumps({
            'type': 'show_solution',
            'lobby_id': event['lobby_id']
        }))
I tried to work with async functions and await, but I did not get it to work as expected: either the consumer disconnected from the websocket, or the 'start_round' message was not sent until the timer had finished. One possible approach is sketched below.
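A sketch of one way to do this (not from the original thread): run the delay on a separate thread with threading.Timer, so the synchronous consumer stays responsive while the timer counts down. The names match the code above; show_results and show_results_delay are replaced by a non-blocking pair.

import json
import threading

from asgiref.sync import async_to_sync
from channels.generic.websocket import WebsocketConsumer

class QuizConsumer(WebsocketConsumer):
    # ... connect/configure_round/start_round as above ...

    def show_results(self, lobby_id, numberVideos, playlength):
        delay = numberVideos * playlength + 5

        def send_solution():
            async_to_sync(self.channel_layer.group_send)(
                self.lobby_group_name,
                {'type': 'show_solution', 'lobby_id': lobby_id},
            )

        # Returns immediately; send_solution fires after `delay` seconds on a
        # separate thread, so the consumer keeps handling incoming events.
        threading.Timer(delay, send_solution).start()

    def show_solution(self, event):
        self.send(text_data=json.dumps({
            'type': 'show_solution',
            'lobby_id': event['lobby_id'],
        }))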
I have the following async consumer:
class MyAsyncConsumer(AsyncWebsocketConsumer):
    async def send_http_request(self):
        async with aiohttp.ClientSession(
            timeout=aiohttp.ClientTimeout(total=60)  # 60 seconds total timeout
        ) as session:
            await session.post('my_url', json={
                'key': 'value'
            })

    async def connect(self):
        await self.accept()
        await self.send_http_request()

    async def receive(self, text_data=None, bytes_data=None):
        print(text_data)
Here, in the connect method, I first accept the connection, then call a method that issues an HTTP request with aiohttp, with a 60 second timeout. Let's assume that the URL we're sending the request to is inaccessible. My initial understanding was that, since all these methods are coroutines, if we receive a message while waiting for the response, the receive method would be called and we could process the message before the request finishes. In reality, however, I only start receiving messages after the request times out, so it seems the consumer waits for send_http_request to finish before it can receive messages.
If I replace
await self.send_http_request()
with
asyncio.create_task(self.send_http_request())
I can receive messages while the request is being made, as I do not await for it to finish in the connect method.
My understanding was that in the first case too, while awaiting the request, I would be able to receive messages since these are different coroutines, but that is not the case. Could it be that the whole consumer instance works as a single coroutine? Can someone clarify what's happening here?
Consumers in Django Channels each run their own runloop (a single asyncio Task). But this is per consumer, not per message, so if you are handling a message and you await something, the entire runloop for that websocket connection is awaiting.
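To illustrate the difference, a minimal sketch of the non-blocking variant (based on the consumer from the question):

import asyncio

from channels.generic.websocket import AsyncWebsocketConsumer

class MyAsyncConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        await self.accept()
        # create_task schedules send_http_request on the event loop and
        # returns at once, so receive() can run while the request is in
        # flight. Keep a reference so the task isn't garbage-collected.
        self._request_task = asyncio.create_task(self.send_http_request())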
I need to continuously get data from a MySQL database, which receives new data with an update frequency of around 200 ms, and continuously update the value in a text field on my dashboard. My dashboard is built on Django.
I have read a lot about Channels, but all the tutorials are about chat applications. I know that I need to implement WebSockets, which will basically keep an open connection and get the data. With a chat application that makes sense, but I haven't come across anything that talks about a MySQL database.
I also read about mysql-events. Since the data in the table comes from an external sensor, I don't understand how I can monitor a table inside Django, i.e. whenever a new row is added to the table, I need to fetch that newly inserted row based on a column value.
Any ideas on how to go about it? I have gone through a lot of articles and I couldn't find anything specific to this requirement.
Thanks to Timothee Legros' answer, which helped me move along in the right direction.
Everywhere on the internet it says that Django Channels is/can be used for real-time applications, but nowhere does it talk about the exact implementation (other than chat applications).
I used Celery, Django Channels and Celery Beat to accomplish the task, and it works as expected.
There are three parts to it: setting up Channels, creating a Celery task and calling it periodically (with the help of Celery Beat), and then sending that task's output to Channels so that it can forward the data to the websocket.
Channels
I followed the original tutorial on the Channels website and built on that.
routing.py
from django.urls import re_path

from . import consumers

websocket_urlpatterns = [
    re_path(r'ws/chat/(?P<room_name>\w+)/$', consumers.ChatConsumer),
    re_path(r'ws/realtimeupdate/$', consumers.RealTimeConsumer),
]
consumers.py
class RealTimeConsumer(AsyncWebsocketConsumer):

    async def connect(self):
        self.channel_group_name = 'core-realtime-data'
        # Join the group
        await self.channel_layer.group_add(
            self.channel_group_name,
            self.channel_name
        )
        await self.accept()

    async def disconnect(self, close_code):
        # Leave the group
        await self.channel_layer.group_discard(
            self.channel_group_name,
            self.channel_name
        )

    # Receive message from WebSocket
    async def receive(self, text_data):
        print(text_data)

    async def loc_message(self, event):
        message_trans = event['message_trans']
        message_tag = event['message_tag']
        # Relay the data to the WebSocket
        await self.send(text_data=json.dumps({
            'message_trans': message_trans,
            'message_tag': message_tag
        }))
This class will basically send data to the websocket once it receives it. The two files above are specific to the app.
Now we will setup Celery.
In the project's base directory, where the settings file resides, we need to make three files.
celery.py This will init Celery.
routing.py This will be used to route the Channels websocket addresses.
tasks.py This is where we will set up the task.
celery.py
import os

from celery import Celery

# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj_name.settings')

app = Celery('proj_name', backend='redis://localhost', broker='redis://localhost/')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print(f'Request: {self.request!r}')
routing.py
from channels.auth import AuthMiddlewareStack
from channels.routing import ProtocolTypeRouter, URLRouter

from app_name import routing

application = ProtocolTypeRouter({
    # (http->django views is added by default)
    'websocket': AuthMiddlewareStack(
        URLRouter(
            routing.websocket_urlpatterns
        )
    ),
})
tasks.py
@shared_task(name='realtime_task')
def RealTimeTask():
    time_s = time.time()
    result_trans = CustomModel_1.objects.all()
    result_tag = CustomModel_2.objects.all()
    result_trans_json = serializers.serialize('json', result_trans)
    result_tag_json = serializers.serialize('json', result_tag)
    channel_layer = get_channel_layer()
    # Keys must match what loc_message reads in the consumer
    message = {
        'type': 'loc_message',
        'message_trans': result_trans_json,
        'message_tag': result_tag_json,
    }
    async_to_sync(channel_layer.group_send)('core-realtime-data', message)
    print(time.time() - time_s)
After finishing its work, the task sends the result back to Channels, which in turn relays it to the websocket.
settings.py
# Channels
CHANNEL_LAYERS = {
    'default': {
        'BACKEND': 'channels_redis.core.RedisChannelLayer',
        'CONFIG': {
            "hosts": [('127.0.0.1', 6379)],
        },
    },
}

# Celery Beat
CELERY_BEAT_SCHEDULE = {
    'task-real': {
        'task': 'realtime_task',
        'schedule': 1,  # the task will run every second
    },
}
Now the only thing left is to create a websocket in the javascript file and start listening to it.
// Create a WebSocket to receive data
const chatSocket = new WebSocket(
    'ws://'
    + window.location.host
    + '/ws/realtimeupdate'
    + '/'
);

chatSocket.onmessage = function(e) {
    const data = JSON.parse(e.data);
    console.log(e.data + '\n');
    // For trans
    var arrayOfObjects = JSON.parse(data.message_trans);
    // Do your thing
    // For tags
    var arrayOfObjects_tag = JSON.parse(data.message_tag);
    // Do your thing
};

chatSocket.onclose = function(e) {
    console.error('Chat socket closed unexpectedly');
};
To answer the MySQL usage: I am inserting data into the MySQL database from an external sensor, and in tasks.py I query the table using the Django ORM.
Overall, it does the intended work: populate a real-time dashboard with real-time data from MySQL. I am sure there are different and better approaches; please let me know about them.
Your best bet, if you need to constantly query your SQL database, would be to use Celery, or dramatiq (simpler and easier, but less battle-tested), in combination with Django Channels.
Celery allows you to create workers (kind of like background processes) that you can send tasks (functions) to. When a worker receives a task, it executes it. All this is done in the background. From the task that the worker is executing, you can actually send data back through a websocket directly from the worker. This only works if you have Django Channels + channel layers enabled, because with channel layers each consumer instance created when you open a channel/websocket has a name that you can pass to the worker, so that it knows which websocket to send the query data back to.
Here is what the flow of this process would look like:
Client requests to connect to your websocket
Consumer instance is created and with it a specific name for it
Consumer instance accepts connection
Consumer triggers celery task and passes the name
Worker begins polling your SQL databases every X seconds
When the worker finds a new entry, it uses the name it was given to send the new entry back through the websocket.
I suggest reading the Django Channels documentation on consumers and channel layers, as well as Celery or dramatiq tutorials, to understand how those work. For all this to work you will also have to learn about Redis and a message queue service such as RabbitMQ. There is just too much to put in a single answer, but I can provide more information if you have specific questions.
Edit:
Get a Redis server set up on your machine. If you are on Windows like me, you have to install WSL 2 and Ubuntu from the Windows Store (free). This link can walk you through it.
Get a RabbitMQ server set up. Follow their tutorial.
Enable Django Channels and Django channel layers, and then set up Redis as your default Channels backend.
Set up Dramatiq or Celery. I prefer Dramatiq, as it is basically a new and improved version of Celery, albeit less popular, and it is much easier to set up and use. This is the GitHub repo for django-dramatiq and it will walk you through setup. Note that just as you launch your Django server with python manage.py runserver, you have to launch Dramatiq workers with python manage.py rundramatiq before testing your website.
Create a tasks.py file in your Django app and inside it implement the code that checks the MySQL database for new entries. If you haven't figured that out already, here is the link to get started with that. In your tasks file you should have a function with the dramatiq.actor decorator on top, so that Dramatiq knows the function is a task.
Build a django-channels consumer to handle WebSocket connections as well as allow you to send data through the WebSocket connection. This is what the standard consumer would look like:
class AsyncDashboardConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await self.accept()

    async def disconnect(self, code):
        await self.close()

    async def receive_json(self, text_data=None, bytes_data=None, **kwargs):
        someData = text_data['someData']
        someOtherData = text_data['someOtherData']
        if 'execute_getMySQLdata' in text_data['function']:
            await self.getData(someData, someOtherData)

    async def sendDataToClient(self, event):
        await self.send(text_data=event['text'])

    async def getData(self, someData, someOtherData):
        # SQLData is the dramatiq task; .send() queues it for a worker
        await sync_to_async(SQLData.send)(self.channel_name, someData, someOtherData)
The connect function is called when the client attempts to connect to the WebSocket URL that your routing file points to this consumer.
The receive_json function is called whenever the client sends data to your Django server.
The getData function is called from receive_json and sends a message to start the Dramatiq task that you created earlier to check the SQL db. Note that when you send the message you must pass in self.channel_name, as that channel name is used to send data back through the WebSocket directly from the Dramatiq worker/task.
The sendDataToClient function is used when you send data back to the client. When you send data from your task, this is the handler you must reference.
To send data from the task you created earlier, use: async_to_sync(channel_layer.send)(channelName, {'type': 'sendDataToClient', 'text': jsonPayload}). Notice how you pass the channelName as well as the sendDataToClient handler from your consumer.
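For illustration, a hypothetical Dramatiq actor matching this flow (the actor name SQLData, the payload shape, and the query are assumptions, not code from the original answer):

import json

import dramatiq
from asgiref.sync import async_to_sync
from channels.layers import get_channel_layer

@dramatiq.actor
def SQLData(channel_name, someData, someOtherData):
    # Query MySQL for new entries here (elided), then push the result
    # straight back to this client's consumer through the channel layer.
    payload = {'rows': []}  # placeholder for the query result
    async_to_sync(get_channel_layer().send)(channel_name, {
        'type': 'sendDataToClient',  # maps to the consumer method above
        'text': json.dumps(payload),
    })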
Finally, this is what the javascript on the client side would look like:
let socket = new WebSocket("wss://javascript.info/article/websocket/demo/hello");

socket.onopen = function(e) {
    alert("[open] Connection established");
    alert("Sending to server");
    socket.send("My name is John");
};

socket.onmessage = function(event) {
    alert(`[message] Data received from server: ${event.data}`);
};

socket.onclose = function(event) {
    if (event.wasClean) {
        alert(`[close] Connection closed cleanly, code=${event.code} reason=${event.reason}`);
    } else {
        // e.g. server process killed or network down
        // event.code is usually 1006 in this case
        alert('[close] Connection died');
    }
};

socket.onerror = function(error) {
    alert(`[error] ${error.message}`);
};
This code came directly from this JavaScript WebSocket walkthrough.
This is how a basic web application with background workers can continually update information in real time. There are probably other ways of doing this without background workers, but since you want information as soon as it arrives, it is better to have a background process continually checking for updates. Note that with the code above, a separate database connection is opened for each client that connects, but you can take advantage of django-channels groups and have one database connection that sends to all clients in certain groups.
Build a microservice for WebSocket connections
Another way to implement such a feature is to build a standalone WebSocket microservice.
A monolith architecture isn't what you need here. Every WebSocket will open a connection to Django (which will sit behind a reverse proxy and application server, e.g. NGINX and Gunicorn). If your client opens two tabs in the browser, you get two connections, and so on.
My recommendation is to adjust the tech stack (yes, I'm a huge fan of Django, but there are many cool solutions for building WS):
Use Starlette, a production-ready framework with built-in WebSockets: https://www.starlette.io/websockets/
Use uvicorn.workers.UvicornWorker for Gunicorn to manage your ASGI application: this is only one line, like gunicorn -w 4 -k uvicorn.workers.UvicornWorker --log-level warning example:app
Handle your WebSocket connections and use the examples to request updates from the database: https://www.starlette.io/database/
Use super simple JavaScript code to open the connection on the client side and listen for updates.
So your models, templates and views will be managed by Django, while your WebSocket connections are managed by Starlette in a native async way.
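As a rough illustration of how small such a service can be, here is a minimal Starlette WebSocket endpoint (a sketch; the route and the update-fetching logic are placeholders):

from starlette.applications import Starlette
from starlette.routing import WebSocketRoute

async def ws_updates(websocket):
    await websocket.accept()
    # Fetch updates from the database and push them to the client here.
    await websocket.send_json({'status': 'connected'})
    await websocket.close()

app = Starlette(routes=[WebSocketRoute('/ws', ws_updates)])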
If you're interested in such an option I can make detailed instructions.
I have added Django Channels to a Django project in order to support long-running processes that notify users of progress via websockets.
Everything appears to work fine except that the implementation of the long-running process doesn't seem to respond asynchronously.
For testing I have created an AsyncConsumer that recognizes two types of messages, 'run' and 'isBusy'.
The 'run' message handler sets a 'busy' flag, sends back a 'process is running' message, waits asynchronously for 20 seconds, resets the 'busy' flag, and then sends back a 'process complete' message.
The 'isBusy' message handler returns a message with the status of the busy flag.
My expectation is that if I send a 'run' message I will immediately receive a 'process is running' message back, and after 20 seconds a 'process complete' message.
This works as expected.
I also expect that if I send an 'isBusy' message I will immediately receive a response with the state of the flag.
The observed behaviour is as follows:
a message 'run' is sent (from the client)
a message 'running please wait' is immediately received
a message 'isBusy' is sent (from the client)
the message reaches the web socket listener on the server side
nothing happens until the run handler finishes
a 'finished running' message is received on the client
followed immediately by a 'process isBusy:False' message
Here is the implementation of the Channel listener:
class BackgroundConsoleConsumer(AsyncConsumer):
    def __init__(self, scope):
        super().__init__(scope)
        self.busy = False

    async def run(self, message):
        print("run got message", message)
        self.busy = True
        await self.channel_layer.group_send('consoleChannel', {
            "type": "consoleResponse",
            "text": "running please wait"
        })
        await asyncio.sleep(20)
        self.busy = False
        await self.channel_layer.group_send('consoleChannel', {
            "type": "consoleResponse",
            "text": "finished running"
        })

    async def isBusy(self, message):
        print('isBusy got message', message)
        await self.channel_layer.group_send('consoleChannel', {
            "type": "consoleResponse",
            "text": "process isBusy:{0}".format(self.busy)
        })
The channel is set up in the routing file as follows:
application = ProtocolTypeRouter({
    "websocket": AuthMiddlewareStack(
        URLRouter([
            url("^console/$", ConsoleConsumer),
        ])
    ),
    "channel": ChannelNameRouter({
        "background-console": BackgroundConsoleConsumer,
    }),
})
I run the channel with one worker (via ./manage.py runworker).
The experiment was done with the Django test server (via runserver).
Any ideas as to why the channel consumer does not appear to work asynchronously would be appreciated.
After a bit of digging around, here is the problem and one solution to it.
A channel adds messages sent to it to an asyncio.Queue and processes them sequentially. It is not enough to release control of the coroutine (via asyncio.sleep() or something similar); the message handler must finish before the consumer picks up a new message.
Here is the fix to the previous example that behaves as expected (i.e. it responds to isBusy messages while the long-running run task is processing).
Thank you @user4815162342 for your suggestions.
class BackgroundConsoleConsumer(AsyncConsumer):
    def __init__(self, scope):
        super().__init__(scope)
        self.busy = False

    async def run(self, message):
        loop = asyncio.get_event_loop()
        loop.create_task(self.longRunning())

    async def longRunning(self):
        self.busy = True
        await self.channel_layer.group_send('consoleChannel', {
            "type": "the.type",
            "text": json.dumps({'message': "running please wait",
                                'author': 'background console process'})
        })
        print('before sleeping')
        await asyncio.sleep(20)
        print('after sleeping')
        self.busy = False
        await self.channel_layer.group_send('consoleChannel', {
            "type": "the.type",
            "text": json.dumps({'message': "finished running",
                                'author': 'background console process'})
        })

    async def isBusy(self, message):
        print('isBusy got message', message)
        await self.channel_layer.group_send('consoleChannel', {
            "type": "the.type",
            "text": json.dumps({'message': "process isBusy:{0}".format(self.busy),
                                'author': 'background console process'})
        })
I have a long running celery task which iterates over an array of items and performs some actions.
The task should somehow report back which item it is currently processing, so the end-user is aware of the task's progress.
At the moment my Django app and Celery sit together on one server, so I am able to use Django's models to report the status, but I am planning to add more workers that are away from Django, so they can't reach the DB.
Right now I see a few solutions:
Store intermediate results manually using some storage, like Redis or MongoDB, making them available over the network. This worries me a little because if, for example, I use Redis, then I have to keep the code on the Django side (reading the status) and the Celery task (writing the status) in sync so they use the same keys.
Report status back to Django from Celery using REST calls, like PUT http://django.com/api/task/123/items_processed
Maybe use the Celery event system and create events like Item processed, on which Django updates the counter.
Create a separate worker which runs on a server with Django and holds a task which only increases the items-processed count, so when the main task is done with an item it issues increase_messages_proceeded_count.delay(task_id).
Are there any other solutions, or hidden problems with the ones I mentioned?
There are probably many ways to achieve your goal, but here is how I would do it.
Inside your long-running Celery task, set the progress using Django's caching framework:
from django.core.cache import cache

@app.task(bind=True)
def long_running_task(self, *args, **kwargs):
    key = "my_task: %s" % self.request.id
    ...
    # do whatever you need to do and set the progress
    # using cache:
    cache.set(key, progress, timeout="whatever works for you")
    ...
Then all you have to do is make a recurring AJAX GET request with that key and retrieve the progress from cache. Something along those lines:
def task_progress_view(request, *args, **kwargs):
    key = request.GET.get('task_key')
    progress = cache.get(key)
    return HttpResponse(content=json.dumps({'progress': progress}),
                        content_type="application/json; charset=utf-8")
One caveat though: if you are running your server as multiple processes, make sure you are using something like Memcached, because Django's local-memory cache is per-process and will be inconsistent among them. Also, I probably wouldn't use Celery's task_id as a key, but it is sufficient for demonstration purposes.
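For example, a shared cache backend in settings.py might look like this (a sketch; the backend and location are assumptions about your setup):

CACHES = {
    'default': {
        # Any cross-process backend works (Memcached, Redis, database cache);
        # the default local-memory cache is per-process and won't be shared.
        'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
        'LOCATION': '127.0.0.1:11211',
    }
}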
Take a look at Flower, a real-time monitor and web admin for the Celery distributed task queue:
https://github.com/mher/flower#api
http://flower.readthedocs.org/en/latest/api.html#get--api-tasks
You need it for presentation, right? Flower works with websockets.
For instance - receive task completion events in real-time (taken from official docs):
var ws = new WebSocket('ws://localhost:5555/api/task/events/task-succeeded/');
ws.onmessage = function (event) {
    console.log(event.data);
}
You would likely need to work with tasks ('ws://localhost:5555/api/tasks/').
I hope this helps.
Simplest:
Your tasks and Django app already share access to one or two data stores: the broker and the results backend (if you're using one that is different from the broker).
You can simply put some data into one or the other of these data stores to indicate which item the task is currently processing.
E.g., if using Redis, simply have a key 'task-currently-processing' and store the data relevant to the item currently being processed in there.
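A minimal sketch of that idea with redis-py (the key name and payload are illustrative):

import json

import redis

r = redis.Redis()  # the same Redis instance the broker/backend already uses

# In the Celery task, record the item currently being processed:
def report_progress(task_id, item):
    r.set(f"task-currently-processing:{task_id}", json.dumps(item))

# In a Django view, read it back:
def get_progress(task_id):
    raw = r.get(f"task-currently-processing:{task_id}")
    return json.loads(raw) if raw else None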
You can use something like SwampDragon to reach the user from the Celery instance (you have to be able to reach it from the client though; take care not to run afoul of CORS). It can be latched onto the counter, not the model itself.
lehins' solution looks good if you don't mind your clients repeatedly polling your backend. That may be fine but it gets expensive as the number of clients grows.
Artur Barseghyan's solution is suitable if you only need the task lifecycle events generated by Celery's internal machinery.
Alternatively, you can use Django Channels and WebSockets to push updates to clients in real-time. Setup is pretty straightforward.
Add channels to your INSTALLED_APPS and set up a channel layer. E.g., using a Redis backend:
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {
            "hosts": [("redis", 6379)]
        }
    }
}
Create an event consumer. This will receive events from Channels and push them via Websockets to the client. For instance:
import json

from asgiref.sync import async_to_sync
from channels.generic.websocket import WebsocketConsumer

class TaskConsumer(WebsocketConsumer):
    def connect(self):
        # your task's identifier
        self.task_id = self.scope['url_route']['kwargs']['task_id']
        async_to_sync(self.channel_layer.group_add)(f"tasks-{self.task_id}", self.channel_name)
        self.accept()

    def disconnect(self, code):
        async_to_sync(self.channel_layer.group_discard)(f"tasks-{self.task_id}", self.channel_name)

    def item_processed(self, event):
        item = event['item']
        self.send(text_data=json.dumps(item))
Push events from your Celery tasks like this:
from asgiref.sync import async_to_sync
from channels.layers import get_channel_layer

...

async_to_sync(get_channel_layer().group_send)(f"tasks-{task.task_id}", {
    'type': 'item_processed',
    'item': item,
})
You can also write an async consumer and/or invoke group_send asynchronously. In either case you no longer need the async_to_sync wrapper.
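For instance, the fully asynchronous version of the same push might look like this (a sketch, assuming it is awaited from async code such as an async consumer or task):

from channels.layers import get_channel_layer

async def push_item(task_id, item):
    channel_layer = get_channel_layer()
    # No async_to_sync wrapper needed: we are already on the event loop.
    await channel_layer.group_send(f"tasks-{task_id}", {
        'type': 'item_processed',
        'item': item,
    })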
Add websocket_urlpatterns to your urls.py:
websocket_urlpatterns = [
    path(r'ws/tasks/<task_id>/', TaskConsumer.as_asgi()),
]
Finally, to consume events from JavaScript in your client, you can do something like this:
let task_id = 123;
let protocol = location.protocol === 'https:' ? 'wss://' : 'ws://';
let socket = new WebSocket(`${protocol}${window.location.host}/ws/tasks/${task_id}/`);

socket.onmessage = function(event) {
    let data = JSON.parse(event.data);
    let item = data.item;
    // do something with the item (e.g., push it into your state container)
}