Record audio from mic and save as .wav django web app

Record audio from mic and save as .wav django web app - django

I have created a web application using django , html and jquery( and js ).
I need to record audio from a mic and store it as a .wav file. What is the best way to go about doing this ? (Better if it's supported on most browsers like chrome, firefox, safari)
I don't mind using a flash plugin if it's easy to understand and use.
Please suggest good ideas and links.
Thanks in advance.

Flash highly compresses the audio data before sending it, if you use the conventional ways of acquiring data from microphone. That is, if you use NetStream.publish() with a microphone attached to it. I'm actually not sure about the format, but would imagine that it is something proprietary... could be MP3. But it could be also Speex... at least I know that Flash supports this format.
Now, Microphone class is capable of exposing the raw sound data within the application. You need to listen to sampleData event dispatched from its instance. However, the documentation, for some reason, doesn't cover that... This is relatively new feature, so, perhaps they just forgot to add it in the docs. Here however, they posted an example of how to do that (scroll to the "Capturing microphone sound data" paragraph). You will need to write the "encoder" for WAV data yourself, but the format it outputs the audio is already some sort of PCM, so you will only need to write the proper headers (or so I think).

Related

What is the path from BITMAP[+WAVE(s)] to RTSP (Twitch) via C/C++ in Windows?

So I'm trying to get a basic tool to output video/audio(s) to Twitch. I'm new to this side (AV) of programming so I'm not even sure what to look for. I'm trying to use mainly Windows infrastructure and third party where not available.
What are the steps of getting raw bitmap and wave data into a codec and then into a rtsp client and finally showing up on Twitch? I'm not looking for code. I'm looking for concepts so I can search for as I'm not absolutely sure what to search for. I'd rather not go through OBS source code to figure it out and use that as last resort.
So I capture the monitor via Output Duplication and also the Sound on the system as a wave and the microphone as another wave. I'm trying to push this to Twitch. I know that there's Media Foundation on Windows but I don't know how far to streaming it can get as I assume there no netcode integrated in it? And also the libav* collection in FFMPEG.
What are the basic steps of sending bitmap/wave to Twitch via any of thee above libraries or even others as long as they work on Windows. Please don't add code, I just need a not very long conceptual explanation and I'll take it from there. Try to cover also how bitrate and framerate gets regulated (do I have do it or the codec does it)?
Assume absolute noob level in this area (concept-wise not code-wise).

adding "read aloud" feature to book app written in Cocos2D

I created a book app and used Cocos2D and physics engine (Chipmunk) to create it. I would like to add "read aloud" feature to it.
So far I found instructions/books and tutorials how to add read aloud feature when book is created with iBook Author (but I couldn't use iBook Author due to some limitations) using Epub3 and SMIL.
I also found a good tutorial from J. Shapiro how to make narrated book using AVSpeechSynthesizer. This helps, only that I would like to use recorded voice, rather than synthesized sound. I don't know if this approach can be modified to do so?
I also know how it can be done in Sprite Kit framework.
The only info that I couldn't find is how to add "read aloud" feature to the app written using Cocos2D. Could it be done within SimpleAudioEngine, or it can be combined with some other engine (possibly from Sprite Kit framework)?
I would appreciate very much if somebody can give me some references/pointers or tutorial links where to look for some answers how to add this feature.
Thanking you in advance.

I would like to use recorded voice, rather than synthesized sound
Good. Add your voice recording audio files (caf, wav or mp3 format) to the project. Play it back at the appropriate time using:
[[SimpleAudioEngine sharedEngine] playEffect:#"someVoiceRecordingFile.wav"];

Define what read aloud means to you because I find that a lot of terms, especially semi-vague ones like this, are used differently depending on who is using it.
When you say read aloud book do you essentially mean a digital storybook that reads the story to you by simply playing narration audio? I've created dozens of these and what you are asking has multiple steps depending on what features you are going for in your book. If you mean simply playing audio and that is it, then yes you could do that in cocos2d using SimpleAudioEngine (as one option) but I assume you already knew that which is why this question has a tab bit of vagueness to it. Either way you probably wouldn't want to play narration as an effect but rather stream it. To do that along with background music you'd stream background music via the left channel and narration via the right. You can easily add a method to SimpleAudioEngine to make this nice and neat. To get you started something similar to this can be used to access the right channel:
CDLongAudioSource* sound = [[CDAudioManager sharedManager] audioSourceForChannel:kASC_Right];
if ([sound isPlaying])
{
[sound stop];
}
[sound load:fileName];
Also use the proper settings and recommended formats for streaming audio such as aifc (or really all audio in general). Although I believe you can stream mp3 without it being decompressed first, the problem is with timing. If you are using highlighted text or looping audio then aifc is the better option. Personally I've never had a reason to use mp3. Wav with narration is something I'd avoid even if just for the file size increase. If the mp3 is decompressed even for streaming (which I'm not sure if it is off the top of my head) then you'd have a huge spike in memory that will be both highly unwanted and at times down right bad.
There are many other things that can go into it but those are the basic first steps. If you want to do things like highlighted text, per-word animations, etc then that will take more work of course and you'd need to be comfortable with cocos2d, SpriteKit, or whatever you decide to use. I'll be doing a tutorial series on it one day soon so I'll cover all of that stuff.
On the other hand, if you are talking about recording someone's voice and having it playback i.e. a mother recording herself reading the story so her child can hear her voice whenever they are using your app, then you'd simply record the audio like you would any other piece of audio, save it to the device, and play it back when the page is displayed in the proper reading mode (or whatever you personally call it). One place to look is the AVAudioRecorder that is part of the AVFoundation framework. Simply Google "iOS audio recording" for examples if you need them.

From C++ image frames to html5 <video tag in client browser

In my C++ application I have video image frames coming from a web camera.
I wish to send those image frames down to a HTML5 video tag element for live video playing from the camera. How can I do this?

For a starting point you are going to want to look into WebM and H.264/MPEG-4 AVC. Both of these technologies are used as HTML5 media streams. It use to be that FireFox only supported WebM while Safari and Chrome both supported H.264. I am not sure about their current states, but you will probably have to implement both.
Your C++ will then have to implement a web server that can stream these formats on the fly. Which may require significant work. If you choose this route this Microsoft document may be of some use. Also, the WebM page has developer documentation. It is possible that H.264 must be licensed for a cost. WebM allows royalty free usage.
If I am not mistaken neither of these formats has to be completely downloaded in order to work. So you would just have to encode and flush the current frame you have over and over again.
Then as far as the video tag in HTML5 you just have to provide it the URLS your C++ server will respond to. Here is some documentation on that. Though, you may want to see if there is some service to mirror these streams as not to overload your application.
An easier way to stream your webcam could be simply to use FFMPEG.
Another usefull document can be found at:
http://www.cecs.uci.edu/~papers/aspdac06/pdf/p736_7D-1.pdf
I am no expert, but I hope that at least helps you get your start.

how to use DirectShow to render audio in C++

I am just starting to learn DirectShow with C++. I need to use DirectShow to record the audio and write it to a WAV file on the disk. I heard from other people that Win 7 does not allow for rendering audio using DirectShow.
In addition, I would like to know how should I start with recoding audio using DirectShow with C++? If there is sample source, it would be great.
Thanks in advance.

I think you may have misunderstood these other people. Windows Media Foundation is aimed to be the successor of DirectShow, but DirectShow is still a very valid technology on Windows 7.
The easiest thing to accomplish what you want to do, is to get it right using the GraphEdit tool first ( I assume you want to do this programmatically).
Create a graph that contains your audio device, a WavDestFilter, and a file writer.
Source -> WavDest -> File Writer
Play the graph. Stop the graph and you should have created a .wav file with the recorded audio. If you can get this right, then you need to do the whole thing programmatically.
There are a couple of samples in the SDK that show you how to programmatically add filters to a graph and connect them, that should enable you to get started.
WRT the WavDestFilter, IIRC it might not be in all versions of the SDK, you'll have to find an appropriate one. You also need to build it, and regsvr32 it, so that it will show up in your list of available filters in GraphEdit.
If this all seems a bit much, I would read through the DirectShow documentation on MSDN to at least get an overview of DirectShow.

c++ convert/play videos and images

I'm looking for build in library for converting videos/images. i heard something about DirectShow. Do you know any library you have used to convert videos/images?

For transcoding (converting one video format to another) using Directshow is bit tricky, you want to use Media Foundation for this job.
There is Transcode API available in Media Foundation to achieve this task. This link has more details on Transcode API, tutorials and samples to get you started.

You can use DirectShow for grabbing images from video stream. For it you must create your own filter node. It is complex task because of filter is COM object that will work within chain (DirectShow filter graph) of other filter nodes - codecs. So after creating you need register your filter in system. As for me i think you can try it because you can use all registered codecs in system and as result get decompressed/final image into your filter. As other solution i think that you can try to use modules from some open source media player. For example try VideoLAN but as i know it is big thing and not easy to use.
Good luck!

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js