Media Foundation AMR decode - c++

I have a file with the .amr extension, and I want to get its sample rate and number of channels using Microsoft Media Foundation. I also want to decode it and get the uncompressed data.
I can successfully get those from .aac, .mp4 and other file types, but not from a .amr file (or a .3gp file that contains an AMR track).
So, for other types I do:
IMFSourceReader *m_pReader;
IMFMediaType *m_pAudioType;
MFCreateSourceReaderFromURL(filePath, NULL, &m_pReader);
m_pReader->SetStreamSelection(MF_SOURCE_READER_ALL_STREAMS, false);
m_pReader->SetStreamSelection(MF_SOURCE_READER_FIRST_AUDIO_STREAM, true);
m_pReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, &m_pAudioType);
UINT32 numChannels,sampleRate;
m_pAudioType->GetUINT32(MF_MT_AUDIO_NUM_CHANNELS, &numChannels);
m_pAudioType->GetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, &sampleRate);
Assume that none of these calls returns an error.
For .amr files, some garbage is being written in the numChannels and sampleRate.
Does anyone have experience with this and knows how to recognize and/or get proper channels and sample rate for further decoding?
BTW, Windows Media Player plays this file with no problems.
Thanks in advance.

So I found out that Media Foundation supports decoding .amr files, but not encoding them.
Just before we get these properties:
UINT32 numChannels,sampleRate;
m_pAudioType->GetUINT32(MF_MT_AUDIO_NUM_CHANNELS, &numChannels);
m_pAudioType->GetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, &sampleRate);
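We have to set a new, uncompressed media type on our Source Reader so that it loads the decoder:
m_pAudioType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
m_pAudioType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_Float);
m_pReader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, NULL, m_pAudioType);
After this call, fetch the media type again with GetCurrentMediaType so that the attributes describe the decoded output.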

Related

Determining the format of audio file (MP3) using SAPI

Hi, I'm trying to create a "Speech to text" app that can transcribe any audio/video file. I've created an app based on this post and it works great for WAV files. But if I use an MP3 file, the line hr = cpInputStream->BindToFile(wInputFileName.c_str(), SPFM_OPEN_READONLY, &sInputFormat.FormatId(), sInputFormat.WaveFormatExPtr(), SPFEI_ALL_EVENTS); returns
The Parameter is incorrect
The question is: can I use MP3 files as input for SAPI? If yes, how do I determine the correct format for the call to hr = sInputFormat.AssignFormat(SPSF_16kHz16BitStereo)? SPSF_16kHz16BitStereo will certainly not be correct, and I don't think we should hardcode it.

How do I record an audio in JUCE that have headers without the 'JUNK' subchunk?

I am trying to develop an application using the JUCE library that can record an audio or open an audio file. The audio file is to be passed into the openSMILE program to have its feature values extracted. All audio files are in wave format and the application is to be finally built for the iPhone platform.
I have developed the part of the application that records audio and opens an audio file from the file directory. I am able to pass some audio files into openSMILE to have their feature values extracted, but not others. None of the files recorded by the JUCE application itself can be passed in.
The error produced when passing those audio files that cannot be passed is as follows:
smilePcm: Riff: 46464952
Format: 45564157
Subchunk1ID: 4b4e554a
Subchunk2ID: 0
AudioFormat: 0
Subchunk1Size: 34
smilePcm: bogus wave/riff header or file in wrong format ('Audio/Audio Recording.wav')! (maybe you are trying to read a 32-bit wave file which is not yet supported (new header type...)?)
(ERROR) [1] in cWaveSource: failed reading wave header from file 'Audio/Audio Recording.wav'! Maybe this is not a WAVE file?
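The hex values in this log look like four-character chunk IDs printed as little-endian 32-bit integers. A minimal sketch for decoding them, assuming that interpretation (the function name fourccToString is made up for illustration):

```cpp
#include <cstdint>
#include <string>

// Interprets a 32-bit value as four ASCII bytes stored in little-endian order,
// which is how a RIFF chunk ID looks when it is read into an integer on x86/ARM.
std::string fourccToString(std::uint32_t value) {
    std::string text(4, ' ');
    for (int i = 0; i < 4; ++i) {
        // Byte 0 of the file is the least significant byte of the integer.
        text[i] = static_cast<char>((value >> (8 * i)) & 0xFF);
    }
    return text;
}
```

Under that reading, 46464952 decodes to "RIFF", 45564157 to "WAVE" and 4b4e554a to "JUNK", which suggests the reader found a JUNK chunk in the position where it expected the "fmt " chunk.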
To try to find the cause of the error, I then extracted information about the wave headers of the passable and non-passable audio files using Riffpad.
For the audio files that could be passed into the openSMILE program, the wave file header information is as follows:
Audio 1
RIFF-WAVE - (len= 180260, off= 12)
fmt - (len=16, off=20)
data - (len=180224, off=44)
Audio 2
RIFF-WAVE - (len= 19236, off= 12)
fmt - (len=16, off=20)
data - (len=19200, off=44)
And the non-passable ones are as follows:
Audio 3 <---Recorded from my JUCE application
RIFF-WAVE - (len= 128096, off= 12)
JUNK - (len=52, off=20)
fmt - (len=16, off=80)
data - (len=128000, off=104)
Audio 4 <---A random audio file that also can't be passed into openSMILE
RIFF-WAVE - (len= 21289308, off= 12)
fmt - (len=40, off=20)
fact - (len = 4, off=68)
data - (len=21289248, off=80)
I am guessing (correct me if I am wrong) that the error would be removed if I can remove the JUNK subchunk from the wave file recorded, i.e. Audio 3, so that the headers will be similar to that in the passable audio files.
I thought of two possibilities that might resolve this issue:
1. Record audio in JUCE with a header format similar to the passable audio file headers (the most straightforward and preferred method, if workable).
2. Convert the audio file after recording so that the headers are similar (I have read that libsndfile or the Audio Compression Manager (ACM) might work, but I am not sure whether they work on all the platforms JUCE can build for, e.g. iPhone).
For the first way, is there any way I can record Audio in the 'right' format as with the passable audio files?
For the second way, could I use a library that can be built cross-platform, or somehow take out the data chunk of the recorded audio and add a header with the 'right' format to it? (What I gathered from my reading is that the JUNK chunk allows extra information to be included and can be skipped if it is not required. I presume that removing it would not be a problem, as long as I edit the total length in the RIFF-WAVE header.)
Are any of the methods above possible, and if so, how should I carry them out?
Thanks!
Solved: Apparently there was a comment in wavAudioFormat.cpp on enabling JUCE_WAV_DO_NOT_PAD_HEADER_SIZE to remove JUNK padding.
Leaving the steps here for anybody who wants to record audio with wavAudioFormat but has "crappy wav players" that cannot read the padded recording:
Go to Debug > Project Properties > Configuration Properties > C/C++ > Preprocessor > Preprocessor Definitions.
Click Edit.
Add JUCE_WAV_DO_NOT_PAD_HEADER_SIZE to the list.
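If rebuilding with that define is not an option, the second idea from the question (dropping the JUNK chunk and fixing the RIFF length) can be sketched in portable C++. This is only an illustration that works on an in-memory copy of the file; stripJunkChunks and its helpers are hypothetical names, not part of JUCE or openSMILE.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static std::uint32_t readLE32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

static void writeLE32(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) p[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Hypothetical helper: returns a copy of a RIFF/WAVE image without JUNK chunks.
// Returns an empty vector if the input is not a well-formed RIFF/WAVE image.
std::vector<std::uint8_t> stripJunkChunks(const std::vector<std::uint8_t>& in) {
    if (in.size() < 12 || std::memcmp(in.data(), "RIFF", 4) != 0 ||
        std::memcmp(in.data() + 8, "WAVE", 4) != 0) {
        return {};
    }
    // Keep "RIFF", the (soon to be rewritten) length, and "WAVE".
    std::vector<std::uint8_t> out(in.begin(), in.begin() + 12);
    std::size_t pos = 12;
    while (pos + 8 <= in.size()) {
        const std::uint32_t size = readLE32(&in[pos + 4]);
        std::size_t total = 8 + static_cast<std::size_t>(size);
        if (total > in.size() - pos) return {};               // truncated chunk
        if ((size & 1u) && total < in.size() - pos) ++total;  // odd chunks carry a pad byte
        if (std::memcmp(&in[pos], "JUNK", 4) != 0) {
            const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
            out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(total));
        }
        pos += total;
    }
    // The RIFF length field counts everything after the first 8 bytes.
    writeLE32(&out[4], static_cast<std::uint32_t>(out.size() - 8));
    return out;
}
```

For a file laid out like Audio 3 above, this would remove the 60 bytes of the JUNK chunk (8-byte chunk header plus 52 bytes of payload) and leave the fmt and data chunks in the same order as in the passable files.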

WIC WINCODEC_ERR_BADHEADER only for JPEG images

I have a simple encoding/decoding application using the Windows Imaging Component API. When I use either the JPEG XR or BMP format, everything works fine. However, when I use the JPEG codec, the encoder works fine and I can visually verify the generated JPEG image, but when I try to decode that stream, I get WINCODEC_ERR_BADHEADER (0x88982f61).
Here's the line that fails:
hr = m_pFactory->CreateDecoderFromStream(
    pInputStream,
    NULL,
    WICDecodeMetadataCacheOnDemand,
    &pDecoder);
Here pInputStream is an IStream created from a byte array (output of the encoder - a black box which outputs a byte vector).
Please help! This is driving me nuts!
When passing a stream as an argument, make sure to seek it to the proper initial position first (in particular, seek it back to the beginning if you just wrote data into it and expect to read that data back). APIs typically do not seek the stream themselves, because this lets you provide data that starts in the middle of a bigger stream.

Raw Audio File to AAC using Windows Media Foundation on Windows 7

Thanks for taking some time to read my question.
I'm developing a C++ application using Qt and the Windows API.
I'm recording the microphone output in small 10-second audio files in raw format, and I want to convert them to AAC format.
I have tried to read as many things as I could, and thought it would be a great idea to start from the Windows Media Foundation transcode API.
Problem is, I can't seem to use a .raw or .pcm file in the "CreateObjectFromUrl" function, and so I'm pretty much stuck here for the moment. It keeps on failing. The hr return code equals 3222091460. I have tried to pass an .mp3 file to the function and of course it works, so no url-human-failure involved.
MF_OBJECT_TYPE ObjectType = MF_OBJECT_INVALID;
IMFSourceResolver* pSourceResolver = NULL;
IUnknown* pUnkSource = NULL;
// Create the source resolver.
hr = MFCreateSourceResolver(&pSourceResolver);
if (FAILED(hr))
{
    qDebug() << "Failed !";
}
// Use the source resolver to create the media source.
hr = pSourceResolver->CreateObjectFromURL(
    sURL,                       // URL of the source.
    MF_RESOLUTION_MEDIASOURCE,  // Create a source object.
    NULL,                       // Optional property store.
    &ObjectType,                // Receives the created object type.
    &pUnkSource                 // Receives a pointer to the media source.
    );
The MFCreateSourceResolver works fine, but CreateObjectFromURL does not succeed :(
So I have two questions for you folks:
Is it possible to encode raw audio files to AAC files using Windows Media Foundation?
If yes, what should I read to accomplish what I want?
I want to point out that I can't just use ffmpeg or libav because I can't afford any license for my software, and don't want it to be under the GPL license. But if there are alternatives to Windows Media Foundation for encoding raw audio files to AAC, I would be glad to hear about them.
And finally, sorry for my bad english, this is obviously not my native language and I'm sorry if I made your eyes bleed. (and happy if I made you laugh)
Have a nice day
The hr return code equals 3222091460
Those are HRESULT codes. Use this "ShowHresult" tool to have them conveniently decoded for you. The code means 0xC00D36C4 MF_E_UNSUPPORTED_BYTESTREAM_TYPE "The byte stream type of the given URL is unsupported."
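The decimal value is easier to recognize once it is split into the standard HRESULT bit fields. A minimal sketch in portable C++ follows; HresultParts and decodeHresult are names made up for illustration, and on Windows the HRESULT_FACILITY and HRESULT_CODE macros extract the same fields.

```cpp
#include <cstdint>

// Hypothetical helper type: the main fields of a 32-bit HRESULT.
struct HresultParts {
    bool failed;            // bit 31: severity (1 means failure)
    std::uint32_t facility; // bits 16..28: which subsystem raised the error
    std::uint32_t code;     // bits 0..15: the error code within that facility
};

// Splits an HRESULT, passed as an unsigned 32-bit value, into its fields.
HresultParts decodeHresult(std::uint32_t hr) {
    HresultParts parts;
    parts.failed = (hr >> 31) != 0;
    parts.facility = (hr >> 16) & 0x1FFFu;
    parts.code = hr & 0xFFFFu;
    return parts;
}
```

For 3222091460 (0xC00D36C4) this gives a failure with facility 13 and code 0x36C4. The same split works for a value logged as a negative number once it is cast to an unsigned 32-bit integer.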
The problem is basically that Media Foundation has no support for these raw files. .WAV is a good container for raw audio: the file holds both the format descriptor and the payload.
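One way to act on this is to prepend a canonical 44-byte WAV header to the raw samples before handing the file to the source resolver. The sketch below assumes the recording is integer PCM and that the recorder already knows the sample rate, channel count and bit depth; wrapPcmAsWav and its helpers are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Appends the low `bytes` bytes of `value` in little-endian order.
static void putLE(std::vector<std::uint8_t>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
}

// Appends a four-character chunk ID such as "RIFF" or "fmt ".
static void putTag(std::vector<std::uint8_t>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}

// Hypothetical helper: wraps raw integer PCM in a RIFF/WAVE container
// consisting of a 16-byte "fmt " chunk followed by a "data" chunk.
std::vector<std::uint8_t> wrapPcmAsWav(const std::vector<std::uint8_t>& pcm,
                                       std::uint32_t sampleRate,
                                       std::uint16_t channels,
                                       std::uint16_t bitsPerSample) {
    const std::uint32_t blockAlign = static_cast<std::uint32_t>(channels) * bitsPerSample / 8u;
    const std::uint32_t dataSize = static_cast<std::uint32_t>(pcm.size());
    const std::uint32_t pad = dataSize & 1u;  // chunks are padded to an even length

    std::vector<std::uint8_t> wav;
    wav.reserve(44 + pcm.size() + pad);
    putTag(wav, "RIFF"); putLE(wav, 36 + dataSize + pad, 4);
    putTag(wav, "WAVE");
    putTag(wav, "fmt "); putLE(wav, 16, 4);
    putLE(wav, 1, 2);                        // format tag 1 = integer PCM
    putLE(wav, channels, 2);
    putLE(wav, sampleRate, 4);
    putLE(wav, sampleRate * blockAlign, 4);  // average bytes per second
    putLE(wav, blockAlign, 2);
    putLE(wav, bitsPerSample, 2);
    putTag(wav, "data"); putLE(wav, dataSize, 4);
    wav.insert(wav.end(), pcm.begin(), pcm.end());
    if (pad) wav.push_back(0);
    return wav;
}
```

Writing the returned bytes to a file with the .wav extension should give the resolver a byte stream type it recognizes; the format values passed in must match how the microphone data was actually captured.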
You can obviously read data from the raw audio file yourself and compress into AAC using Media Foundation's AAC Encoder via its IMFTransform interface. This is reasonably easy and you have AAC data on the output to e.g. write into raw .AAC.
Alternatives to Media Foundation are DirectShow (there are suitable codecs, though it might not be so easy to get started with), libfaac, and FFmpeg's libavcodec (available under LGPL, not GPL).

Edit the frame rate of an avi file

Is it possible to change the frame rate of an AVI file using the Video for Windows library? I tried the following steps but did not succeed.
AVIFileInit
AVIFileOpen(OF_READWRITE)
pavi1 = AVIFileGetStream
avi_info = AVIStreamInfo
avi_info.dwRate = 15
EditStreamSetInfo(dwRate) returns -2147467262.
I'm pretty sure the AVIFile* APIs don't support this. (Disclaimer: I was the one who defined those APIs, but it was over 15 years ago...)
You can't just call EditStreamSetInfo on a plain AVIStream, only on one returned from CreateEditableStream.
You could use AVISave, then, but that would obviously re-copy the whole file.
So, yes, you would probably want to do this by parsing the AVI file header enough to find the one DWORD you want to change. There are lots of documents on the RIFF and AVI file formats out there, such as http://www.opennet.ru/docs/formats/avi.txt.
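A rough sketch of that header-patching approach, working on an in-memory copy of the start of the file. It assumes the standard AVISTREAMHEADER layout, in which dwScale and dwRate sit 20 and 24 bytes into the 'strh' chunk data and the frame rate is dwRate / dwScale. It also assumes the bytes "strh" do not occur by coincidence before the real stream header. setAviFrameRate is a hypothetical name, and the microseconds-per-frame value in the main 'avih' header is left untouched.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static void writeLE32(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) p[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Hypothetical helper: patches dwScale/dwRate of the first video stream header
// ('strh' chunk whose fccType is 'vids'). Returns false if none is found.
bool setAviFrameRate(std::vector<std::uint8_t>& avi, std::uint32_t rate, std::uint32_t scale) {
    static const char tag[4] = {'s', 't', 'r', 'h'};
    auto it = avi.begin();
    while ((it = std::search(it, avi.end(), tag, tag + 4)) != avi.end()) {
        // Chunk data starts after the 4-byte tag and the 4-byte size field.
        const std::size_t data = static_cast<std::size_t>(it - avi.begin()) + 8;
        if (data + 28 <= avi.size() && std::memcmp(&avi[data], "vids", 4) == 0) {
            writeLE32(&avi[data + 20], scale);  // AVISTREAMHEADER::dwScale
            writeLE32(&avi[data + 24], rate);   // AVISTREAMHEADER::dwRate
            return true;
        }
        ++it;  // not a video stream header, keep searching
    }
    return false;
}
```

Calling setAviFrameRate(buffer, 15, 1) would correspond to the 15 frames per second attempted in the question; the patched bytes then need to be written back to the same offsets in the file.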
I don't know anything about VfW, but you could always try hex-editing the file. The framerate is probably a field somewhere in the header of the AVI file.
Otherwise, you can script some tool like mencoder[1] to copy the stream to a new file under a different framerate.
[1] http://www.mplayerhq.hu/
HRESULT: 0x80004002 (2147500034)
Name: E_NOINTERFACE
Description: The requested COM interface is not available
Severity code: Failed
Facility Code: FACILITY_NULL (0)
Error Code: 0x4002 (16386)
Does it work if you DON'T call EditStreamSetInfo?
Can you post up the code you use to set the stream info?