I am working on a face recognition project using Flask as my web server running on a Ubuntu 14.04 Machine. I am using OpenCV 2.4.9 as my image processing software which is written using Python2.7. I would like to be able to access a clients webcam through their browser to capture a image or frame from the webcam stream and send it back to the server to be processed. Is there an easy way using python to obtain access to the clients webcam or is it possible to use JavaScript in conjunction with my current code.
I'll assume that you are more interested in architectural decisions for you application that specific implementation details. You will need to use client side and server side for this application.
Client side is html page with javascript that will capture images from web cam. There are many resources on internet about this topic. This article explains how it works with some examples. I would recommend to use some javascript library like this one
The next thing is to decide how client application and server side transfers image data. In case you would like to stream webcam video to server, do some computation and stream data back to client application, WebSockets are your friend. This tutorial describes how to set up flask application for websockets.
Much easier approach is to POST image data to the server, do some computation and respond to client. Downside of this approach is that it's not suitable for continuous video processing. But you can use it for single video frame processing. Otherwise you would flooded your server with requests.
The last thing to decide is how much processing is done to images on server side. If you would do some extensive computation that takes long time, I would recommend celery for background tasks. HOWEVER this would change architecture considerably.
For a proof of concept, I would recommend following. Take single image with webcam, post it to server, do quick computation on image and respond with what you've had computed.
Good luck.
Related
I want to build a live streaming app.
My thought process:
Get the Video/Audio data from the
navigator.mediaDevices.getUserMedia(constraints); [client-streamer]
create rooms using sockets(Socket.IO or WebSockets from flask) [backend]
Send the data in 1 to the room members using sockets.
display the media on the client-side.
Is that correct? How should I do it?
how do I broadcast data to specific room members and not to everyone? (flask)
How to consistently send data from the streamer -> server -> room members. the stream is given from 1 is an object, where is the data?
any other better ideas will be great! thanks.
I need to implement the server-side by myself without help from libraries that will do the work for me.
Implementing a streaming platform is not trivial. Unfortunately, it is not as simple as emitting chunks received from the MediaRecorder with onndatavailable and forwarding them to users using a WebSocket server - this is not scalable nor efficient nor reliable.
Below are some strategies you can try for different types of scenarios:
P2P: If you want to have simple peer-to-peer streaming, you can use WebRTC to achieve that with a simple socket.io server for signaling purposes.
Conference: Here things start to get more complicated. You will need a media server if you want to be somewhat scalable. One approach is to route your stream to the users using an SFU or MCU. This will take care of forwarding/processing media to different peers efficiently.
Broadcast: Here things are also non-trivial. Common WebRTC-based architectures include ingesting the WebRTC stream and forward that to an HLS server which will let your stream chunks available for clients through a CDN, or perform RTP forwarding of the WebRTC stream, convert it to RTMP using something like FFmpeg and deliver it through Youtube Live or Twitch to leverage from their infrastructure.
Be aware that the last 2 items are resource-intensive and will certainly not be cheap to maintain.
Below are some open source projects that could help you along the way:
Janus
MediaSoup
AntMedia
Jitsi
Good luck!
Explaining all this is far beyond the scope of a Stack Overflow answer.
Here are a few hints:
You need to use the MediaRecorder API to capture compressed data from your gUM (getUserMedia) stream. MediaRecorder support is inconsistent between makes and models of browser. though.
It kicks a Blob into its onndatavailable handler every so often.
They're compressed as a webm data stream.
You can push those Blobs to a server with socket.io, and the server can turn around and push them to whatever clients you want to.
Playing the webm on the clients is tricky. You may, on some makes and models of browsers, be able to feed the webm stream to the Media Source API using appendBuffer(). But some browsers cannot consume the webm streams.
These webm streams are useless to a player without all their Blob data in order. You can't just start sending a new client the Blobs of the stream when they sign in; you have to restart the MediaRecorder.
(You may be able to make it work without a MediaRecorder restart if you send the first few k bytes of the stream to each new client before sending the current Blob. Extracting those bytes is an intricate programming job involving the ebml package to parse the webm stream and extract the prologue. I have not proven this concept.)
Because getting all this to work -- originator -- server -- viewer is such a pain in the xxx neck, you may want to investigate using something like mediasoup instead. It uses WebRTC transport rather than socket.io, and works cross-platform.
I am working on a live-streaming prototype, I have been reading a lot about how live-streaming works and many different approaches but I still can't find a live-streaming stack that suits my needs...
These are the requirements for my prototype:
1)The video/audio recording must come from a web browser using the webcam, the idea is that the client preferably shouldn't need to install plugins or do anything complicated(maybe installing Flash player plugin is acceptable, only for recording the video, the viewers should be able to view the stream without plugins).
2)It can't be peer to peer since I also need to store the entire video in my server (or in Amazon s3 servers for example) for viewing later.
3)The viewers should also be able to watch the stream without the need of installing anything, from their web browsers, say Chrome and Firefox for example. We want to use the HTML5 video tag if possible.
4)The prototype should be constructed without expending money preferably. I have seen that AWS-Cloudfront and Wowza offer free trials so we are thinking about using these 2 services.
5)The prototype should be able to maintain 1 live stream at a time and 2 viewers, just that, so there are no restrictions regarding this.
Any suggestions?
I am specially stuck/confused with the uploading/encoding video part of the architecture(I am new to streaming and all the formats/codecs/protocols/technologies are making it really hard to digest).
As of right now, I came across WebRTC that apparently allows me to do what I want, record and encode video from the browser using the webcam, but this API only works with HTTPS sites. Are there any alternatives that work with HTTP sites?
The other part that I am not completely sure about is the need for an encoding server, for example Wowza Streaming Engine, why do I need it? Isn't it enough if I use for example WebRTC for encoding the video and then I just send it to the distribution service (AWS-Cloudfront for example)? I do understand that the encoding server would allow me to support many different devices since it will create lots of different encodings and serve many different HTTP protocols, but do I need it for this prototype? I just want to make a 1 format (MP4 for example) live-stream that can be viewed in 2 web browsers, that's all, I don't need variety of formats nor support for different bandwidths or devices.
Base on your requirement, WebRTC is good way.
API only works with HTTPS sites. Are there any alternatives that work
with HTTP sites?
No. Currently Firefox is only browser is allow WebRTC on HTTP, but finally it need HTTPS
For doing this prototype you need to go with the Wowza WebRTC.
While going with wowza all the streams are delivered from the wowza only.So it become a routed WebRTC.
Install Wowza - https://www.wowza.com/docs/how-to-install-and-configure-wowza-streaming-engine
Enable the WebRTC - https://www.wowza.com/docs/how-to-use-webrtc-with-wowza-streaming-engine
Downaload and configure the Streamlock. or Selfsigned JKS file - https://www.wowza.com/docs/how-to-request-an-ssl-certificate-from-a-certificate-authority
Download the sample WebRTC - https://www.wowza.com/_private/webrtc/
Publish stream using the Publish HTML and Play through the Play HTML ( Supported Chrome,Firefox & Opera Browsers)
For MP4 files in WebRTC : you need to enable the transcoder with h264 & aac. Also you need to enable the option Record all the incoming Streams in the properties of application which you are creating for the WebRTC ( Not the DVR ).Using the File writer module save all the recorded files in a custom location.By using a custom script(Bash,Python) Move all the Transcoded files to the s3 bucket, Deliver through cloudfront.
I have been trying to solve this for 2 weeks and I have not been able to reach a solution.
Here is what I am trying to do:
I need a web application in which users can upload a video; the video is going to be transformed using opencv's python API. Since I have Python's API for opencv I decided to create the webapp using Django. Everything is fine to that point.
The problem is that the video transformation is a very long process so I was trying to implement some real time capabilities in order to show the user the video as it is transformed, in other words, I transform a frame and show it to the user inmediatly. I am trying to do this with CoffeScript and io sockets following some examples; however I havent been successful.
My question is; what would be the right approach to add real time capabilities to a Django application ?
I'd recommend using a non-django service to handle the websockets. Setting up websockets properly is tricky on both the client and server side. Look at pusher.com for a free/cheap solution that will just work and save you a whole lot of hassle.
The initial request to start rendering should kick off the long-lived process, and return with an ID which is used to listen to the websocket for updates.
Once you have your websockets set up, you can send messages to the client about each finished frame. Personally I wouldn't try to push the whole frame down the websocket, but rather just send a message saying the frame is done with a URL to get the frame. Then normal HTTP with its caching and browser niceties moves the big data.
You're definitely not choosing the easy path. The easy path is to have your long-lived render task update the render state in the database, and have the client poll that. Extra server load, but a lot simpler.
Django itself really is focused on doing one kind of web interface, which is following the HTTP Request/Response pattern. To maintain a persistent connection with clients, which socket.io really makes dead simple, you need to diverge a bit from a normal Django installation.
This article discusses the issue of doing real-time with Django, with the help of Orbited and Twisted. It's rather old, and it relies on Comet, which is not the preferred way of doing real-time these days.
You might benefit a lot by going for Socket.io on the client, and something like Tornado (wiki) + Tornado client for Socket.io. But, if you really want to stick with Django for the web development (which Tornado also provide), you would need to make the two work together internally, each handling their particular use case.
Finally, this other article discusses how to make Django work with gevent (an coroutine-based networking library for Python) and Socket.io, which might well be your best option otherwise.
Don't hesitate to post questions/comments as they pop up!
--EDIT--
I wasn't very well understood with the initial question, so allow me to rephrase.
I am working in an image processing application for Android.
Let's admit I will send an image from android to some server.
What I want to know is how to process this image with opencv (c/c++) on the server and return the results to mobile.
Look into setting up a web service if you're just trying to offload the processing to a server and send back some processed data. There's a ton of examples and sample setups based on the server environment (OS, speed, bandwidth needs, etc) out there that should help you get started. You would then setup the OpenCV environment on the server, and perform all of your processing through those libraries. We would need more information on what type of image processing you hope to accomplish to help you more, but again there are lots of examples for OpenCV and great documentation as well. The Android side will depend on how you setup the web service, so based on that choice there are different solutions available for easily interfacing with your server.
So we have some server with some address port and ip. we are developing that server so we can implement on it what ever we need for help. What are standard/best practices for data transfer speed management between C++ windows client app and server (C++)?
My main point is in how to get how much data can be uploaded/downloaded from/to client via his low speed network to my relatively super fast server. (I need it for set up of his live stream Audio/Video bit rate)
My try on explaining number 3.
We do not care how fast is our server. It is always faster than needed. We care about client tyring to stream out to our server his media. he streams encoded (via ffmpeg) live video data to our server. But he has say ADSL with 500kb/s of outgoing traffic. Also he uses some ICQ or what so ever so he has less than 500 kb/s per second. And he wants to stream live video! So we need to set up our ffmpeg to encode video with respect to the bit rate user can provide. We develop server side and client side. We need a way of finding out how much user can upload per second currently (so value can change dynamically over time)
Check this CodeProject Article
it's dot-net but you can try figure out the technique from there.
I found what I wanted. "thrulay, network capacity tester" A C++ code library for Available bandwidth tracking in real time on clients. And there is "Spruce" and it is also oss. It is made using some of linux code but I use Boost library so it will be easy to rewrite.
Offtop: I want to report that there is some group of people on SO down voting on all questions on this topic - I do not know why they are so angry but they deffenetly exist.