Django + Coffescript: Real time video application with io sockets - django

I have been trying to solve this for 2 weeks and I have not been able to reach a solution.
Here is what I am trying to do:
I need a web application in which users can upload a video; the video is going to be transformed using opencv's python API. Since I have Python's API for opencv I decided to create the webapp using Django. Everything is fine to that point.
The problem is that the video transformation is a very long process so I was trying to implement some real time capabilities in order to show the user the video as it is transformed, in other words, I transform a frame and show it to the user inmediatly. I am trying to do this with CoffeScript and io sockets following some examples; however I havent been successful.
My question is; what would be the right approach to add real time capabilities to a Django application ?

I'd recommend using a non-django service to handle the websockets. Setting up websockets properly is tricky on both the client and server side. Look at pusher.com for a free/cheap solution that will just work and save you a whole lot of hassle.
The initial request to start rendering should kick off the long-lived process, and return with an ID which is used to listen to the websocket for updates.
Once you have your websockets set up, you can send messages to the client about each finished frame. Personally I wouldn't try to push the whole frame down the websocket, but rather just send a message saying the frame is done with a URL to get the frame. Then normal HTTP with its caching and browser niceties moves the big data.
You're definitely not choosing the easy path. The easy path is to have your long-lived render task update the render state in the database, and have the client poll that. Extra server load, but a lot simpler.

Django itself really is focused on doing one kind of web interface, which is following the HTTP Request/Response pattern. To maintain a persistent connection with clients, which socket.io really makes dead simple, you need to diverge a bit from a normal Django installation.
This article discusses the issue of doing real-time with Django, with the help of Orbited and Twisted. It's rather old, and it relies on Comet, which is not the preferred way of doing real-time these days.
You might benefit a lot by going for Socket.io on the client, and something like Tornado (wiki) + Tornado client for Socket.io. But, if you really want to stick with Django for the web development (which Tornado also provide), you would need to make the two work together internally, each handling their particular use case.
Finally, this other article discusses how to make Django work with gevent (an coroutine-based networking library for Python) and Socket.io, which might well be your best option otherwise.
Don't hesitate to post questions/comments as they pop up!

Related

Python Flask App - tool to send/push real-time sensor data to clients

I know the question I am going to ask is a bit duplicate. But, I am still asking as I want to know the latest technologies and I am a bit lost after researching for a few hours.
I have a Raspberry Pi logging real-time temperature and humidity. Now, I am writing a flask app to push these data to clients who (subject to rights) will be able to observe continuously without refreshing the dashboard/page.
What will be the best option to make an efficient system, keeping in mind that there will be multiple sensors in the future? The options I find:
Ajax
WebSocket
Framework e.g. bokeh or dash
MQTT
Please give me your opinions.
Good choice if you want to write your backend using Python is:
Server: Flask with Sokcet.IO + InfluxDB for real-time data storing
Frontend: Some JS framework or pure Js + websocket
UPD (this message is too long to post it to comments):
https://www.smashingmagazine.com/2018/02/sse-websockets-data-flow-http2/
The thing is that I'm not talking that websocket is the right solution for all possible cases/problems and should be used everywhere. Obviously it's depends on your needs and your project architecture. I think that article can help you to make a choice: if your app architecture requires full-duplex browser-server connection - you can use websocket for this and that will work for you, but if your frontend requires only one-way data send direction - from server to browser - you can use SSE, as the article says about SSE: "our main flow of data is from the server to the client and in much fewer occasions from the client to the server". To sum it up, you need to think about your application architecture and about how data needs to be sent between browser and server to choose right technology. Also, if you don't want to use neither websocket nor SSE - you can use ajax to pull data from server and that will also work for you.

Different approaches for achieving asynchronism on the web

What is the conceptual differences in approaches for achieving asynchronism with the following technologies. My primary search was for Django, but I'm looking for a conceptual answer which describes the ideas behind the technology.
Socket.IO and gevent
WebSockets
RabbitMQ & Celery
I've found tutorials on the web regarding most of these approaches, however they don't explain the concepts behind, just the technical implementation.
Asynchronous programming can be quite hard to get your head around. The basic idea is a way to get around blocking IO. Blocking IO is anything from writing to a file querying a database, querying a REST API, anything that would interrupt your applications processing flow while it waits for something else to happen.
Say for example your building a Gallery application, users can upload large HD resolution images to show off. But your going to need to make various copies of this large image the user uploads, thumbnails, smaller resolution versions etc To do this requires a bit of blocking IO. You need to compress the images and thats quite intensive and once compressed you need to write those to disk, after that you might need to store all this information in your DB. To do this in one single request would result in a very slow clunky performance for your user and honestly your backend process would probably time out before it could complete all the tasks needed. It also doesn't scale very well.
One way to get around this would be to use Asynchronous programming. Once the users upload has finished you could fire various signals to other applications sat waiting just for their chance to compress an image or write data to a DB. Once that signals fired those background processes get to work, and your user doesn't have to sit and wait for a long request to complete, instead they can carry on browsing the site, have a coffee and be notified when their thumbnails have been made etc.
In the example above I would implement this using Celery, RabbitMQ and SocketIO (or maybe TornadIO). Once the users upload has completed I would fire a celery task, celery uses RabbitMQ (I prefer Redis) to manage tasks, you could have 10, 20, 30 celery workers crunching away on these image uploads in the background. Once celery has finished it's job it would fire a message to the Socket server, for handling web sockets, which the users browser is connected too. This would send the user a notification in real time that their new image they uploaded is now ready to be shared with the world
This is the really basic example around Asynchronous Event driven programming. As best I understand it anyway. Anyone else please correct me.
I hope this helps. :)

How to communicate between C++ server app and django web app

I have some framework doing specific task in C++ and a django-based web app. The idea is to launch this framework, receive some data from it, send some data or request and check it's status in some period.
I'm looking for the best way of communication. Both apps run on the same server. I was wondering if a json server in C++ is a good idea. Django would send a request to this server, and server would parse it, and delegate a worker thread to complete the task. Almost all data that need to be send is string-like. Other data will be stored in database so there is no problem with that.
Is JSON a good idea? Maybe you know some better mechanism for local communication between C++ and django?
If your requirement is guaranteed to always have the C++ application on the same machine as the Django web application, include the C++ code by converting it into a shared library and wrapping python around it. Just like this Calling C/C++ from python?
JSON and other serializations make sense if you are going to do remote calls and the code needs to communicate across machines.
JSON seems like a fair enough choice for data serialization - it's good at handling strings and there's existing libraries for encoding/decoding JSON in both python & C++.
However, I think your bigger problem is likely to be the transport protocol that you use for transferring JSON between your client and server. Here's some options:
You could build an HTTP server into your C++ application (which I think might be what you mean by "JSON server" in your question), which would work fine, though might be a bit of a pain to implement unless you get a hold of a library to handle the hard work for you.
Another option might be to use the 0MQ library to send JSON (or otherwise) messages between your client and server. I think this would probably be a lot easier than implementing a full HTTP server, and 0MQ has some interprocess communication code that would likely be a lot faster than sending things over the network.
A third option would just be to run your C++ as a standalone application and pass the data in to it via stdin or command line parameters. This is probably the simplest way to do things, though may not be the most flexble. If you were to go this way, you might be better off just building a Python/C++ binding as suggested by ablm.
Alternatively you could attempt to build some sort of job queue based on redis or something other database system. The idea being that your django application puts some JSON describing the job into the job queue, and then the C++ application could periodically poll the queue, and use a seperate redis entry to pass the results back to the client. This could have the advantage that you could reasonably easily have several "workers" reading from the job queue with minimal effort.
There's almost definitely some other ways to go about it, but those are the ones that immediately spring to mind.

Django / Comet (Push): Least of all evils?

I have read all the questions and answers I can find regarding Django and HTTP Push. Yet, none offer a clear, concise, beginning-to-end solution about how to accomplish a basic "hello world" of so-called "comet" functionality.
First question (1): To what extent is the problem that HTTP simply isn't (at least so far) made for this? Are all the potential solutions essentially hacks?
2) What's the best currently available solution?
Orbited?
Some other Twisted-based solution?
Tornado?
node.JS?
XMPP w/ BOSH?
Some other solution?
3) How does nginx push module play into this discussion?
4) Which of these solutions require replacement of the typical mod_wsgi / nginx (or apache) deployment model? Why do they require this? Is this a favorable transition in any case?
5) How significant are the advantages of using a solution that is already in Python?
Alex Gaynor's presentation from PyCon 2010, which I just watched on blip.tv, is amazing and informative, but not terrifically specific on the current state of HTTP Push in Django. One thing that he said that gave me some confidence was this: Orbited does a good job of abstracting and simulating the concept of network sockets. Thus, when WebSockets actually land, we'll be in a good place for a transition.
6) How does HTML5 Websockets differ from current solutions? Is Gaynor's assessment of the ease of transition from Orbited accurate?
I'd take a look at evserver (http://code.google.com/p/evserver/) if all you need is comet.
It "supports [the] little known Asynchronous WSGI extension" and is build around libevent. Works like a charm and supports django. The actual handler code is a bit ugly, but it scales well as it really is async io.
I have used evserver and I'm currently moving to cyclone (tornado on twisted) because I need a little more than evserver offsers. I need true bidirectional io (think socket.io (http://socket.io/)) and while evserver could support it I thought it was easier to reimplement tornado's socket.io in cyclone (I opted for cyclone instead of tornado as cyclone is build on twisted, thus allowing for more transports that aren't implemented in twisted (i.c. zeromq)) Socket.io supports websockets, comet style polling, and, much more interseting, flash based websockets. I think that in most practical situations websockets + flash based websockets are enough to support 99% (according to adobe flash penetration is about 99% (http://www.adobe.com/products/player_census/flashplayer/version_penetration.html)) of a websites visitors (only people not using flash need to fallback to one of socket.io its (less perfomant and resource hogging) backup transports)
Be aware though websockets are not an http transport thus putting them behind http based proxies (e.g haproxy in http mode) breaks the connection. Better serve them on an alternate ip or port so you can proxy in tcp mode (e.g haproxy in tcp mode).
To answer your questions:
(1) If you don't need a bidirectional transport longpolling based solutions are good enough (all they do is keep a connection open). Things do get iffy when you need your connection to be statefull or you need to be able to both send and receive data. In the latter case socket.io helps. However websockets are made for this scenario and with the support of flash its available to most of a websites vistors (via socket.io or standalone, however socket.io has the added benefit of backup transports for those people not wanting to install flash)
(2) if all you need is push, evserver is your best bet. It uses the the same javascripts on the client side as orbited. Else look at socket.io (this also needs a supporting server, the only python one available is tornado.)
(3) It's just one other server implementation. If i read it correctly it's push only. pushing data to a client is done by making http equest from your app to the nginx server. (nginx then takes care they reach the client). If you're inteersted in this, look at mongrel2 (http://mongrel2.org/home) it not only has handlers for longpolling but also for websockets.(instead of making http request to mongrel, this time you use zeromq handlers to get data to your mongrel server) (Do take note of the developer's lack of enthusiasm for websockets and flash based websockets. Especially taking into account that the websocket protocol tends to evolve you might, at some point, need to recode mongrel2's websocket support yourself keep having support for websockets)
(4) All solutions except evserver replace wsgi with something else. Though most servers also have some wsgi support ontop of this "something else". No matter what solution you choose be careful that one cpu intensive or otherwise io blocking request doesn't block the server. (you either need multiple instances or threads).
(5) Not very significant. All solutions depend on some custom handlers to push (and, if applicable, receive) data to the client. All solutions i mentioned allow these handlers to be written in python. If you want to use a completely different framework (node.js) then you have to weigh the ease of node.js (it's assumed to be easy, but it's also rather experimental, and i found very few libraries to be actually stable) against the convenience of using your existing code base and the available libraries (e.g. if your app needs a blog ther are plenty django blogs you could plug in, but none for node.js) Also don't stare yourself blind on performance stats. unless you plan to push dumb predefined data (what all benchmarks do) to the client you'll find that the actual processing of data adds much more overhead than even the worst async io implementation. (But you still want to use an async io based server if you plan to have many simultaneous clients, threading simply isn't meant to keep thousands of connections alive)
(6) websockets offer bidirectional communication, long polling/comet only pushes data but does not accept writes. (Socket.io simulates this bidirectional support by using two http requests, one to longpoll, one to send data. It tracks their interdependance by a (session) id that's part of both requests query string). flash based websockets are similar to real websockets (the difference is that their implementation is in the swf, not your browser). Also the websockets protocol does not follow the http protocol; longpolling/comet stuff does (technically the websocket client sends an upgrade request to websocket server, the upgraded protocol isn't http anymore)
There is support for WebSockets with django-websocket, but unfortunately there are major issues with it for getting it working; here's a quote from that page:
Disclaimer (what you should know when using django-websocket)
BIG FAT DISCLAIMER - right at the moment its technically NOT possible in any way to use a websocket with WSGI. This is a known issue but cannot be worked around in a clean way due to some design decision that were made while the WSGI stadard was written. At this time things like Websockets etc. didn't exist and were not predictable.
...
But not only WSGI is the limiting factor. Django itself was designed around a simple request to response scenario without Websockets in mind. This also means that providing a standard conform websocket implemention is not possible right now for django. However it works somehow in a not-so pretty way. So be aware that tcp sockets might get tortured while using django-websocket.
So at the moment, WSGI: no go; Django: hardly any go, even with django-websockets; see also a comment in the author's original announcement:
I can't say this looks like a good idea. You're doing long-lived connections in a way that is going to require threading. django-websocket requires threading setup, and won't work if you've got processes (because you'd just have too many processes) but threads won't scale for a lot of connections at the same time, either, so its just a false safety. You need an asynchronous platform for long-lived things, and I do this by doing my app in Django and my comet and websocket in Node.js
Personally if trying to use WebSockets (which I expect to be next year), I would try the combination of Twisted and Cyclone first. They're designed to cope with WebSockets, and scale well. If you write your code properly to remove unnecessary dependencies on Django, you should be able to use much of your code in a Twisted-based system. This is a very distinct advantage over using Node.js or Comet or any system in another language. You could also make a simple push
Finally, you could also just decide it's too hard and use an external service to provide the push support. That then becomes a matter of sending a simple JSON request to their servers instead of worrying about how to make the connection and how concurrency will work and things like that. Of course, you'll need to pay for it (though currently it may be free while in Beta), but you don't need to worry about implementation details; you won't have the full power of WebSockets that way though - just push support.
I can't believe it's been over six years since I asked this question.
Async with Django (and the associated network traffic, eg websockets) has been an itch for many of us in the community. I have taken these past few years, to among other things, scratch this itch.
hendrix
hendrix is a WSGI/ASGI conatiner that runs on Twisted. It has been a project mainly driven by 5 enthusiasts, with help and funding from some visionary organizations. It is in production today at dozens, but not hundreds, of companies.
I'll leave it to you to read the documentation to see why it's the best solution to this problem, but a few quick highlights:
it's based on Twisted, requires no knowledge or use of Twisted internals, but leaves them all available
It "just works" in the sense that you don't need any special server or process configuration to do async and socket traffic from within your Django (or Pyramid, or Flask) app
It is very likely to be forward-compatible with ASGI, the Django Channels standard, and is in some meaningful ways the first ASGI container
It ships with simple APIs that maintain the flow of your view logic and are easy to unit test.
Please see this talk that I gave at Django-NYC (at the Buzzfeed offices) for more information about why I think this is the best answer to this question.
Re question #2, I recently was given a tour of the internals of a Django app that uses Comet heavily, and Orbited was the solution they chose.

How to keep a C++ realtime server application with a modern web client interface?

I develop industrial client/server application (C++) with strong real time requirements.
I feel it is time to change the look of the client interface - which is developed in MFC - but I am wondering which would be the right choice.
If I go for a web client is there any way to exchange data between C++ and javascript other than AJAX <-> Web service <-> COM ?
Requirements for the web client are: Quick statuses refresh, user commands, tables
My team had to make that same decision a few months ago...
The cool thing about making it a web application would be that it would be very easy to modify later on. Even the user of the interface (with a little know-how) could modify it to suit his/her needs. Custom software becomes just that much easier.
We went with a web interface and ajax seems the way to go, it was quite responsive.
On the other hand, depending on how strong your real time requirements are, it might prove difficult. We had the challenge of plotting real time data through a browser, we ended up going with a firefox plugin to draw the plot. If you're simply trying to display real time text data, it shouldn't be as big an issue.
Run some tests for your specific application and see what it looks like.
Something else to consider, if you are having a web page be an interface to your server, keep in mind you will need to figure a way to update one client when another changes the state of the server if you plan on allowing multiple interfaces to your server.
I usually build my applications 2-folded :
Have the real heavy-duty application CLI-only. The protocol used is usually text-only based, composed of requests and answers.
Wrap a GUI around as another process that talks to the CLI back-end.
The web interface is then just another GUI to wrap around. It is also much easier to wrap a REST/JSON based API on the CLI interface (just automatically translate the messages).
The debugging is also quite easy to do, since you can just dump the requests between the 2 elements and reproduce the bugs much more easily.
Write an HTTP server in your server to handle the AJAX feedback. If you don't want to serve files, create your server on a non-standard port (eg. 8081) and use a regular web server for the actual web page delivery. Now have your AJAX engine communicate with the server on the Bizarro port instead of port 80.
But it's not that hard to write the file server part, also. If you do that, you also get to generate web pages on-the-fly with your data pre-filled, if you want.
Google Desktop Search does this now. When I search my desktop for 'foobar', the URL that opens is this:
http://127.0.0.1:4664/search?q=foobar&flags=68&num=10
In this case, the 4664 is the Bizarro port. (GoogleDesktop serves all the data here; it only uses the Bizarro port to avoid conflicts with any web server I might be running.)
You may want to consider where your data lives. If your application feeds a back-end database, you could write a web app leaving your c++ code in tact -- the web application would be independent and offer up pages to web users and talk directly to the database -- In this case you have as many options, and more, as you have indicated.