How to write an AMR-WB file in multichannel mode with Python under Linux from an RTP payload

I've successfully written the RTP payload into an AMR file, and it works fine, following the answer to this question:
https://stackoverflow.com/questions/61961965/convert-rtp-payload-payload-type-107-amr-wb-16khz-1channel-to-wav
Now I have tried to write a multichannel file according to section 5.2 of RFC 4867, but I failed: no decoder accepts the file. I've got 2 channels (stereo).
I've already checked the voice data in the file, and it seems correct.
First I write the header according to section 5.2:
https://www.rfc-editor.org/rfc/rfc4867#section-5.2
I checked the magic number and it's correct. Then I append the channel description field, which, as far as I understood, should be:
b'\x00\x00\x00\x02'
In the audio file I read back the same values. Is this correct?
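For clarity, this is roughly how I build the header (the file name and constant name are just illustrative):

    import struct

    AWB_MC_MAGIC = b"#!AMR-WB_MC1.0\n"   # multi-channel AMR-WB magic number (RFC 4867, section 5.2)

    with open("stereo.awb", "wb") as f:
        f.write(AWB_MC_MAGIC)
        # 32-bit channel description field in network byte order:
        # 28 reserved bits (zero) followed by the 4-bit CHAN field (here: 2 channels)
        f.write(struct.pack(">I", 2))    # -> b'\x00\x00\x00\x02'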
The audio data is stored according to section 5.3 (first paragraph), i.e. frame-blocks interleaved as:
1.pack_chan1, 1.pack_chan2, 2.pack_chan1, 2.pack_chan2, etc.
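In code that is roughly this (assuming each stored frame is already octet-aligned and prefixed with its one-octet frame header, as section 5.3 requires; the names are just illustrative):

    def write_frame_blocks(out_file, chan1_frames, chan2_frames):
        # Interleave the per-channel frames into frame-blocks (RFC 4867, section 5.3)
        for f1, f2 in zip(chan1_frames, chan2_frames):
            out_file.write(f1)   # frame n of channel 1
            out_file.write(f2)   # frame n of channel 2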
Now, when I open the .amr (or .awb) file, the decoder says that an error has occurred.
In contrast, when I write just one channel into a file, according to sections 5.1 and 5.3, everything works fine: it can be played with VLC media player on Windows and with Videos on Ubuntu Linux.
Where is my mistake?
Thanks and regards
Update:
Section 3.5 of RFC 4867, second paragraph, says that, although the codecs usually do not support encoding multi-channel audio content into a single bitstream, they can be used to separately encode and decode each of the individual channels.
So what can I do to produce stereo sound from the 2 channels?
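One idea I can think of (my own assumption, not something the RFC prescribes): write each channel into its own single-channel .awb file using the section 5.1/5.3 layout that already plays fine, decode the two mono files, and merge them into stereo afterwards, e.g. with ffmpeg's amerge filter:

    import subprocess

    def merge_mono_awb_to_stereo(left_awb, right_awb, out_wav):
        # Decode two mono AMR-WB files and merge them into one stereo WAV with ffmpeg
        subprocess.run([
            "ffmpeg", "-y",
            "-i", left_awb,
            "-i", right_awb,
            "-filter_complex", "[0:a][1:a]amerge=inputs=2[a]",
            "-map", "[a]",
            out_wav,
        ], check=True)

    # e.g. merge_mono_awb_to_stereo("chan1.awb", "chan2.awb", "stereo.wav")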

Related

FFmpeg av_read_frame returns a size but no data?

I have written some C code to access FFmpeg and wrapped it in a C++/CLI (.NET managed) class. The program fetches a live video stream, extracts frames, and converts them to PNG files.
Unfortunately the images that are saved to disk are always black (opening them in Notepad++ shows that they are full of nulls).
I am using the avformat/avcodec-55 DLLs and the development headers and libs from ffmpeg-20131120-git-e502783-win64-dev for compilation. The whole project is compiled as Managed C++ (C++/CLI) .NET 4.0 for 64-bit.
After some investigation, the problem appears to be that av_read_frame fills the AVPacket->size value correctly, but AVPacket->data is always null. When the frame is finished (got == 1), the data in the AVFrame is just a matrix of nulls. :(
Here is the code:
Example code (sorry, but it didn't paste well into SO)
I think the problem is at line 34 of that code, where the packet is returned.
Please, how can I get this to work? What have I done wrong?
The decoding part seems fine to me. I am not so sure about the encoding and saving to PNG. Why don't you try to just dump (frame->linesize * frame->height) bytes from frame->data to disk with fwrite and have a look at it with IrfanView (for instance)?

Record directshow audio device to file

I've stumbled through some code to enumerate my microphone devices (with some help), and am able to grab the "friendly name" and "clsid" information from each device.
I've done some tinkering with GraphEdit.exe to try and figure out how I can take audio from DirectShow and write it to a file (I'm not currently concerned about the format; WAV should be fine), and can't seem to find the right combination.
One of the articles I've read linked to this Windows SDK sample, but when I examined the code, I ended up getting pretty confused about how to use it, i.e. setting the output file or specifying which audio capture device to use.
I also came across a codeguru article that has a nicely featured audio recorder, but it does not have an interface for selecting the audio device, and I can't seem to find where it statically picks which recording device to use.
I think I'd be most interested in figuring out how to use the Windows SDK sample, but any explanation on either of the two approaches would be fantastic.
Edit: I should mention my knowledge and ability as a win32 COM programmer is very low on the scale, so if this is easy, just explain it to me like I'm five, please.
Recording audio into a file with DirectShow requires you to build the right filter graph, as you have probably figured out already. The parts include:
The device itself, which you instantiate via moniker (not CLSID!); it typically delivers PCM format
A multiplexer component that converts streams into a container format
The File Writer Filter, which takes a file-compatible stream and writes it into a file
The tricky part is #2, since there is no standard component available. The Windows SDK samples, however, contain the missing part: the WavDest Filter Sample. Once you build it and make it ready for use, you can build a graph that records from the device into a .WAV file.
Your graph will look like this: device -> WavDest -> File Writer. It is easily built programmatically as well.
I noticed that I have a variation of WavDest installed with Google Earth, in case you have trouble building it yourself and are looking for a prebuilt binary.
You can instruct ffmpeg to record from a DirectShow device and output to a file.
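For example (the device name below is only a placeholder; list the real names first with "ffmpeg -list_devices true -f dshow -i dummy"):

    import subprocess

    # The device name is a placeholder; substitute the one reported by ffmpeg
    subprocess.run([
        "ffmpeg",
        "-f", "dshow",
        "-i", "audio=Microphone (Realtek High Definition Audio)",
        "capture.wav",
    ], check=True)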

Recording application output to video using FFmpeg (or similar)

We have a requirement to let users record a video of our 3D application. I can already grab the individual rendered frames, so this question is specifically about how to write frames into a video file.
I don't think writing each frame as a separate file and post-processing is a workable option.
I can look at options to record to a simple video file for later optimising/encoding, or writing directly to a sensibly encoded format.
FFmpeg was suggested in another post, but it looks a bit daunting to me. Is it the best option? If not, what can be suggested? We can work with LGPL but not full GPL.
We're working on Windows (Win32, not MFC) in C++. Sample/pseudo code with your recommended library is very much appreciated... basically we're after how to do 3 functions:
startRecording() does whatever initialization is needed
recordFrame() takes a pointer to frame data and encodes it, ideally with timing data
endRecording() finalizes the video file, shuts down the video system, etc.
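For what it's worth, those three calls map fairly naturally onto piping raw frames into an ffmpeg process. A rough sketch (in Python only to keep it short; every name and parameter below is made up, and the native mpeg4 encoder is chosen because it keeps things LGPL-only):

    import subprocess

    class FfmpegRecorder:
        # Sketch of the three calls in terms of an ffmpeg child process

        def start_recording(self, path, width, height, fps):
            # ffmpeg reads raw BGRA frames from stdin and encodes them to the output file
            self.proc = subprocess.Popen([
                "ffmpeg", "-y",
                "-f", "rawvideo", "-pix_fmt", "bgra",
                "-s", f"{width}x{height}", "-r", str(fps),
                "-i", "-",
                "-c:v", "mpeg4", path,
            ], stdin=subprocess.PIPE)

        def record_frame(self, frame_bytes):
            self.proc.stdin.write(frame_bytes)   # one width * height * 4 byte frame

        def end_recording(self):
            self.proc.stdin.close()              # lets ffmpeg finalize the file
            self.proc.wait()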
Check out the sources to Taksi on sourceforge. http://taksi.sourceforge.net/
You need two things:
1. A codec to compress the frames.
2. A container file format, like AVI or MPG.
Taksi uses the old Video for Windows API and AVI, not the newer COM APIs, but it still might work for you.

How to encode pixels from buffer to h.264 or VP8

I have an application (Qt C++) that reads data from a USB device and decodes that data into 24-bit RGB pixels, which are stored in a uchar array.
The frame rate is ~10 FPS. The frame size is 128x4096.
Question is: How to encode these frames into VP8 or h.264 video in real time?
No external processes are allowed, everything needs to run inside my application.
ffmpeg is an option, but how do I include it in my project and use it? The documentation is rather bad, to say the least. x264 could also be an option, but the same question applies as for ffmpeg. And it's also quite expensive: $1 per unit with a minimum of 10,000.
A simple guide would be helpful, but I doubt one exists.
The application should run on Windows and Linux.
The problem with the VP8 SDK is that the examples only encode to IVF. That codec appears to have been shut down by Microsoft due to a security flaw (a buffer overflow). It's pretty hard to even get the VP8 project set up when you can't check the results. At least it uses a BSD-style license and is supposedly unencumbered by patents.
The VP8 SDK has some routines for converting formats, but they are buried in the source tree.
An option not mentioned is the Intel Media SDK, but that locks you to Windows.
There are also Theora and Dirac.
x264 has an encoder, but it would be expensive to get a commercial license.
GPLv2 source code is not "free". I don't care what they try to get you to believe.
There is also a project called "Revel - the Really Easy Video Encoding Library". That is a path to getting MPEG-4 Part 2 files encoded. H.264 is MPEG-4 Part 10; H.264 is also called AVC. Revel is also GPL'd.
FFmpeg is a catch-all utility that tries to create a wrapper around the various encoders/decoders. If you use the x264 encoder with it, it becomes GPLv2.
The VP8 SDK has documentation and even some sample code.

Saving as Flash in C++

How can I save an IplImage from OpenCV as a Flash file? Maybe there is a library that does that?
If you mean storing your output as Flash video (.flv), just use ffmpeg (libavcodec/libavformat). It is cross-platform, supports the .flv format (besides a massive amount of others), and should be quite easy to use. You can embed audio too.
As a note: ffmpeg is partially included in OpenCV (depending on your build) as a video coder/decoder. I don't know, though, whether you can force it to write .flv (by choosing the right codec string) from within OpenCV. Anyway, it's not too hard to convert an IplImage to an ffmpeg buffer and store from there.
A problem you might have is that the latest OpenCV (2.1) has trouble building with ffmpeg support, or is built against some ffmpeg version you don't want. But as mentioned above, you don't need to use ffmpeg via the OpenCV 2.1 API, since you can use the ffmpeg API directly.
Look for the examples in libavcodec on how to write a video, and check the OpenCV source on how to convert from an IplImage to an AVPacket/AVFrame. I've done this before and it was quite easy to do.
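To illustrate the "right codec string" idea from the note above (a sketch only, using the newer Python binding to keep it short; whether a valid .flv actually comes out depends on OpenCV being built with FFmpeg support):

    import cv2
    import numpy as np

    # 'FLV1' is the codec string / fourcc to try for Flash Video output
    writer = cv2.VideoWriter("out.flv",
                             cv2.VideoWriter_fourcc(*"FLV1"),
                             25.0, (640, 480))

    for _ in range(100):                          # dummy frames for illustration
        writer.write(np.zeros((480, 640, 3), dtype=np.uint8))

    writer.release()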
I don't know Flash much, but you can manipulate the data pointer of an IplImage (named char *imageData). The data is accessible as between 1 and 4 planes, in a format you surely know. Try writing your Flash file from this data pointer.
lital, well, to my knowledge OpenCV doesn't support creating Flash.
My solution for such a problem is the Red5 server,
and as their page says:
Red5 is an Open Source Flash Server
written in Java that supports:
Streaming Video (FLV, F4V, MP4)
....
You could dump your images into a sequence of files, say img00000.ppm, img00001.ppm, ..., and then delegate the video encoding to MEncoder, which, according to the docs, supports FLV.
That's what we usually do in order to prepare videos such as this one.
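For example (a rough sketch: the frame rate is a placeholder, and PNG frames are used here because I'm not certain MEncoder's mf:// input accepts PPM):

    import subprocess

    # Assumed invocation: mf:// image-sequence input, FLV video via lavc, lavf container
    subprocess.run([
        "mencoder", "mf://img*.png",
        "-mf", "fps=25:type=png",
        "-ovc", "lavc", "-lavcopts", "vcodec=flv",
        "-of", "lavf",
        "-o", "out.flv",
    ], check=True)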