Determining the format of audio file (MP3) using SAPI - c++

Hi, I'm trying to create a "Speech to text" app that can transcribe any audio/video file. I've created an app based on this post and it works great for WAV files. But if I use an MP3 file, the line
hr = cpInputStream->BindToFile(wInputFileName.c_str(), SPFM_OPEN_READONLY, &sInputFormat.FormatId(), sInputFormat.WaveFormatExPtr(), SPFEI_ALL_EVENTS);
returns
The parameter is incorrect
The question is: can I use MP3 files as input for SAPI? And if yes, how do I determine the correct format for the call to hr = sInputFormat.AssignFormat(SPSF_16kHz16BitStereo)? SPSF_16kHz16BitStereo will certainly not be correct, and I don't think we should hardcode it.

Related

Convert Raw to Wav Streams in NodeJS

I am using a NodeJS library, naudio, to record sound from 2 microphones (4 channels of audio in total, with each microphone being stereo). This library spits out a .raw file with the following specs: 16 bit, 48000 Hz sample rate, channel count 4
// var portAudio = require('../index.js');
var portAudio = require('naudiodon');
var fs = require('fs');
//Create a new instance of Audio Input, which is a ReadableStream
var ai = new portAudio.AudioInput({
  channelCount: 4,
  sampleFormat: portAudio.SampleFormat16Bit,
  sampleRate: 48000,
  deviceId: 13
});
ai.on('error', console.error);
//Create a write stream to write out to a raw audio file
var ws = fs.createWriteStream('rawAudio_final.raw');
//Start streaming
ai.pipe(ws);
ai.start();
process.once('SIGINT', ai.quit);
Instead of the .raw file, I am trying to convert this to two individual .wav files. With the above encoding and information, what would be the best way to do so? I tried to dig around for easy ways to deinterleave the audio and get .wav files, but I seem to be hitting a wall.
The addon is a wrapper around a C++ library called portaudio which, according to its documentation, supports writing to a WAV file.
What you could do is extend the addon and bind a NodeJS function to the underlying C++ function that writes to WAV.
This will give you good performance, if that is an issue.
If you want something easier, you could look up utilities that do the conversion and call them from within your script, for example like this.
This looks similar to this question.
You may also take a look here to see how to create a WAV file from JavaScript.
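If the C++ route is taken, the deinterleaving step itself is small. The following is only a sketch under assumptions: the function name splitQuadToStereoPairs is hypothetical, and it assumes channels 0-1 belong to the first microphone and channels 2-3 to the second, which should be checked against the actual device layout. It takes the 16-bit samples already loaded into memory and returns two interleaved stereo buffers.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Split interleaved 4-channel 16-bit frames [c0 c1 c2 c3, c0 c1 c2 c3, ...]
// into two interleaved stereo streams.
// ASSUMPTION: channels 0-1 are microphone A and channels 2-3 are microphone B.
std::pair<std::vector<int16_t>, std::vector<int16_t>>
splitQuadToStereoPairs(const std::vector<int16_t>& interleaved) {
    constexpr std::size_t kChannels = 4;
    // A trailing partial frame (fewer than 4 samples) is ignored.
    const std::size_t frames = interleaved.size() / kChannels;

    std::vector<int16_t> micA;
    std::vector<int16_t> micB;
    micA.reserve(frames * 2);
    micB.reserve(frames * 2);

    for (std::size_t f = 0; f < frames; ++f) {
        const int16_t* s = interleaved.data() + f * kChannels;
        micA.push_back(s[0]);  // mic A, left
        micA.push_back(s[1]);  // mic A, right
        micB.push_back(s[2]);  // mic B, left
        micB.push_back(s[3]);  // mic B, right
    }
    return {std::move(micA), std::move(micB)};
}
```

Each returned buffer is still headerless PCM (stereo, 16 bit, 48000 Hz in this case), so it needs a RIFF/WAVE header in front of it before it is a valid .wav file.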

Media Foundation AMR decode

I have a file with .amr extension, and I want to get its sample rate and number of channels using Microsoft Media Foundation. Further, I want to decode it and get the uncompressed data.
I can successfully get those from .aac, .mp4 and other file types, but not from a .amr file (or a .3gp file which contains an .amr track).
So, for other types I do:
IMFSourceReader *m_pReader;
IMFMediaType *m_pAudioType;
MFCreateSourceReaderFromURL(filePath, NULL, &m_pReader);
m_pReader->SetStreamSelection(MF_SOURCE_READER_ALL_STREAMS, false);
m_pReader->SetStreamSelection(MF_SOURCE_READER_FIRST_AUDIO_STREAM, true);
m_pReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, &m_pAudioType);
UINT32 numChannels,sampleRate;
m_pAudioType->GetUINT32(MF_MT_AUDIO_NUM_CHANNELS, &numChannels);
m_pAudioType->GetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, &sampleRate);
Assume that no errors occur in this code.
For .amr files, some garbage is being written in the numChannels and sampleRate.
Does anyone have experience with this and knows how to recognize and/or get proper channels and sample rate for further decoding?
BTW, Windows Media Player plays this file with no problems.
Thanks in advance.
So I found out that Media Foundation supports decoding .amr files, but not encoding.
Just before we get these properties:
UINT32 numChannels,sampleRate;
m_pAudioType->GetUINT32(MF_MT_AUDIO_NUM_CHANNELS, &numChannels);
m_pAudioType->GetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, &sampleRate);
We have to set a new media type on our Source Reader:
m_pAudioType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
m_pAudioType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_Float);
m_pReader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, NULL, m_pAudioType);

Raw Audio File to AAC using Windows Media Foundation on Windows 7

Thanks for taking some time to read my question.
I'm developing a C++ application using Qt and the Windows API.
I'm recording the microphone output in small 10s audio files in raw format, and I want to convert them to AAC format.
I have tried to read as many things as I could, and thought it would be a great idea to start from the Windows Media Foundation transcode API.
Problem is, I can't seem to use a .raw or .pcm file in the "CreateObjectFromUrl" function, and so I'm pretty much stuck here for the moment. It keeps on failing. The hr return code equals 3222091460. I have tried to pass an .mp3 file to the function and of course it works, so no url-human-failure involved.
MF_OBJECT_TYPE ObjectType = MF_OBJECT_INVALID;
IMFSourceResolver* pSourceResolver = NULL;
IUnknown* pUnkSource = NULL;
// Create the source resolver.
hr = MFCreateSourceResolver(&pSourceResolver);
if (FAILED(hr))
{
    qDebug() << "Failed !";
}
// Use the source resolver to create the media source.
hr = pSourceResolver->CreateObjectFromURL(
    sURL,                       // URL of the source.
    MF_RESOLUTION_MEDIASOURCE,  // Create a source object.
    NULL,                       // Optional property store.
    &ObjectType,                // Receives the created object type.
    &pUnkSource                 // Receives a pointer to the media source.
);
The MFCreateSourceResolver works fine, but CreateObjectFromURL does not succeed :(
So I have two questions for you folks:
Is it possible to encode raw audio files to AAC files using Windows Media Foundation?
If yes, what should I read to accomplish what I want?
I want to point out that I can't just use ffmpeg or libav because I can't afford any license for my software, and don't want it to be under the GPL license. But if there are alternatives to Windows Media Foundation for encoding raw audio files to AAC, I would be glad to hear about them.
And finally, sorry for my bad English, this is obviously not my native language and I'm sorry if I made your eyes bleed. (and happy if I made you laugh)
Have a nice day
The hr return code equals 3222091460
Those are HRESULT codes. Use this "ShowHresult" tool to have them conveniently decoded for you. The code means 0xC00D36C4 MF_E_UNSUPPORTED_BYTESTREAM_TYPE "The byte stream type of the given URL is unsupported."
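What such a tool does with the number can be sketched in a few lines of portable C++. This is an illustrative sketch, not the tool's actual code: the names HresultParts, splitHresult and toHex are hypothetical, and the bit masks mirror the HRESULT_SEVERITY, HRESULT_FACILITY and HRESULT_CODE macros from winerror.h. Looking up the symbolic name (here MF_E_UNSUPPORTED_BYTESTREAM_TYPE) still needs a table such as the one in mferror.h.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// The fields packed into a 32-bit HRESULT.
struct HresultParts {
    bool failed;        // bit 31: 1 means failure
    uint32_t facility;  // bits 16..28, as HRESULT_FACILITY extracts them
    uint32_t code;      // bits 0..15
};

// Split an HRESULT, given as an unsigned decimal value such as the one
// printed by qDebug(), into its fields.
HresultParts splitHresult(uint32_t hr) {
    return {(hr >> 31) != 0, (hr >> 16) & 0x1FFFu, hr & 0xFFFFu};
}

// Format the value the way it usually appears in documentation.
std::string toHex(uint32_t hr) {
    char buf[11];
    std::snprintf(buf, sizeof buf, "0x%08X", static_cast<unsigned>(hr));
    return buf;
}
```

For the value in the question, toHex(3222091460u) gives "0xC00D36C4", and splitHresult reports a failure with facility 13 and code 0x36C4. A value printed as a negative decimal can be cast to uint32_t first.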
The problem is basically that there is no support for these raw files. .WAV is a good source for raw audio: the file holds both the format descriptor and the payload.
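To illustrate what "format descriptor plus payload" means, here is a minimal sketch that prepends a canonical 44-byte PCM WAV header to raw samples held in memory. The function name wrapPcmAsWav and its helpers are hypothetical, and the channel count, sample rate and bit depth must come from whatever recorded the raw data, since a .raw file does not carry them. It assumes uncompressed integer PCM and a payload smaller than 4 GB.

```cpp
#include <cstdint>
#include <vector>

// Append little-endian integers and 4-character tags to a byte buffer.
static void putLE16(std::vector<uint8_t>& out, uint16_t v) {
    out.push_back(static_cast<uint8_t>(v & 0xFF));
    out.push_back(static_cast<uint8_t>(v >> 8));
}
static void putLE32(std::vector<uint8_t>& out, uint32_t v) {
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<uint8_t>((v >> (8 * i)) & 0xFF));
}
static void putTag(std::vector<uint8_t>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}

// Wrap raw integer PCM in a RIFF/WAVE container (header + payload).
// ASSUMPTION: the caller knows the real format of the raw data.
std::vector<uint8_t> wrapPcmAsWav(const std::vector<uint8_t>& pcm,
                                  uint16_t channels,
                                  uint32_t sampleRate,
                                  uint16_t bitsPerSample) {
    const uint16_t blockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);
    const uint32_t byteRate = sampleRate * blockAlign;
    const uint32_t dataSize = static_cast<uint32_t>(pcm.size());

    std::vector<uint8_t> wav;
    wav.reserve(44 + pcm.size());
    putTag(wav, "RIFF"); putLE32(wav, 36 + dataSize);  // size of everything after this field
    putTag(wav, "WAVE");
    putTag(wav, "fmt "); putLE32(wav, 16);             // size of the PCM format block
    putLE16(wav, 1);                                   // format tag 1 = integer PCM
    putLE16(wav, channels);
    putLE32(wav, sampleRate);
    putLE32(wav, byteRate);
    putLE16(wav, blockAlign);
    putLE16(wav, bitsPerSample);
    putTag(wav, "data"); putLE32(wav, dataSize);
    wav.insert(wav.end(), pcm.begin(), pcm.end());
    return wav;
}
```

Writing the returned bytes to a file with a .wav extension should give the source resolver something it can recognize, provided the parameters match the recording.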
You can obviously read data from the raw audio file yourself and compress into AAC using Media Foundation's AAC Encoder via its IMFTransform interface. This is reasonably easy and you have AAC data on the output to e.g. write into raw .AAC.
Alternatives to Media Foundation are DirectShow (there are suitable codecs, though I think it might not be so easy to start with), libfaac, and FFmpeg's libavcodec (available under LGPL, not GPL).

Edit the frame rate of an avi file

Is it possible to change the frame rate of an AVI file using the Video for Windows library? I tried the following steps but did not succeed.
AVIFileInit
AVIFileOpen(OF_READWRITE)
pavi1 = AVIFileGetStream
avi_info = AVIStreamInfo
avi_info.dwRate = 15
EditStreamSetInfo(dwRate) returns -2147467262.
I'm pretty sure the AVIFile* APIs don't support this. (Disclaimer: I was the one who defined those APIs, but it was over 15 years ago...)
You can't just call EditStreamSetInfo on a plain AVIStream, only on one returned from CreateEditableStream.
You could use AVISave, then, but that would obviously re-copy the whole file.
So, yes, you would probably want to do this by parsing the AVI file header enough to find the one DWORD you want to change. There are lots of documents on the RIFF and AVI file formats out there, such as http://www.opennet.ru/docs/formats/avi.txt.
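As a rough sketch of that header-patching approach: the code below walks the RIFF chunks of an AVI held in memory, finds the first 'strh' stream header whose type is 'vids', and overwrites its dwScale and dwRate fields (frame rate = dwRate / dwScale). The function names are hypothetical, and the field offsets (20 and 24 bytes into the 'strh' data) are taken from the standard AVI stream header layout; verify them against the format documentation before using this on real files. It does not update dwMicroSecPerFrame in the 'avih' main header, which some players may also consult.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

static uint32_t readLE32(const std::vector<uint8_t>& b, std::size_t off) {
    return static_cast<uint32_t>(b[off]) | (static_cast<uint32_t>(b[off + 1]) << 8) |
           (static_cast<uint32_t>(b[off + 2]) << 16) | (static_cast<uint32_t>(b[off + 3]) << 24);
}
static void writeLE32(std::vector<uint8_t>& b, std::size_t off, uint32_t v) {
    for (int i = 0; i < 4; ++i) b[off + i] = static_cast<uint8_t>((v >> (8 * i)) & 0xFF);
}

// Walk the chunks in [pos, end), descending into LIST chunks, and return the
// data offset of the first 'strh' chunk whose fccType is 'vids'.
static std::size_t findVideoStrh(const std::vector<uint8_t>& b, std::size_t pos, std::size_t end) {
    while (pos + 8 <= end) {
        const uint32_t size = readLE32(b, pos + 4);
        const std::size_t data = pos + 8;
        if (size > end - data) return kNotFound;  // truncated or corrupt chunk
        if (std::memcmp(b.data() + pos, "LIST", 4) == 0 && size >= 4) {
            const std::size_t r = findVideoStrh(b, data + 4, data + size);
            if (r != kNotFound) return r;
        } else if (std::memcmp(b.data() + pos, "strh", 4) == 0 && size >= 28 &&
                   std::memcmp(b.data() + data, "vids", 4) == 0) {
            return data;
        }
        pos = data + size + (size & 1);  // chunks are padded to an even length
    }
    return kNotFound;
}

// Patch the video stream's frame rate in place; returns false if no video
// stream header was found.
bool setAviFrameRate(std::vector<uint8_t>& avi, uint32_t rate, uint32_t scale) {
    if (avi.size() < 12 || std::memcmp(avi.data(), "RIFF", 4) != 0 ||
        std::memcmp(avi.data() + 8, "AVI ", 4) != 0) {
        return false;
    }
    const std::size_t strh = findVideoStrh(avi, 12, avi.size());
    if (strh == kNotFound) return false;
    writeLE32(avi, strh + 20, scale);  // dwScale
    writeLE32(avi, strh + 24, rate);   // dwRate
    return true;
}
```

Since the header list sits at the start of the file, it should be enough to load only the first part of the file, patch it, and write those bytes back, rather than re-copying the whole file.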
I don't know anything about VfW, but you could always try hex-editing the file. The framerate is probably a field somewhere in the header of the AVI file.
Otherwise, you can script some tool like mencoder[1] to copy the stream to a new file under a different framerate.
[1] http://www.mplayerhq.hu/
HRESULT: 0x80004002 (2147500034)
Name: E_NOINTERFACE
Description: The requested COM interface is not available
Severity code: Failed
Facility Code: FACILITY_NULL (0)
Error Code: 0x4002 (16386)
Does it work if you DON'T call EditStreamSetInfo?
Can you post up the code you use to set the stream info?

Saving image to file with IImageEncoder

Do you have any working code to share?
I'm trying to figure out how to save an IBitmapImage to a file.
I need to resize an existing .jpg file, and this seems to be the only API for that on Windows Mobile. I managed to load it, convert it (IImage -> IBitmapImage -> IBasicBitmapOps) and finally resize it, but I have no clue how to save it properly to a new file.
Use IBitmapImage::LockBits to get access to the image data via its BitmapData* lockedBitmapData parameter. Use the BitmapData to prepare a bitmap file info header, then write that one and the image data in BitmapData::Scan0 to a file using regular file writing with ::WriteFile or higher level ones if you use such.
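The header preparation described above can be sketched in portable C++ as follows. This is a sketch under assumptions: the function name encodeBmp24 is hypothetical, the locked pixels are assumed to be 24 bits per pixel in BGR order, and the rows are assumed to run top to bottom starting at scan0. On the device, the arguments would come from the Scan0, Width, Height and Stride members of the BitmapData filled in by LockBits, and the returned bytes would be written out with ::WriteFile. Note that this produces a .bmp file, not a .jpg.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

static void putLE16(std::vector<uint8_t>& out, uint16_t v) {
    out.push_back(static_cast<uint8_t>(v & 0xFF));
    out.push_back(static_cast<uint8_t>(v >> 8));
}
static void putLE32(std::vector<uint8_t>& out, uint32_t v) {
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<uint8_t>((v >> (8 * i)) & 0xFF));
}

// Build a complete 24-bit .bmp file image in memory.
// ASSUMPTION: scan0 points at top-down rows of 24bpp BGR pixels, `stride`
// bytes apart. Returns an empty vector for invalid dimensions.
std::vector<uint8_t> encodeBmp24(const uint8_t* scan0, int32_t width, int32_t height, int32_t stride) {
    if (scan0 == nullptr || width <= 0 || height <= 0) return {};

    const uint32_t usedBytes = static_cast<uint32_t>(width) * 3;
    const uint32_t rowBytes = (usedBytes + 3) & ~3u;  // BMP rows are padded to 4 bytes
    const uint32_t imageSize = rowBytes * static_cast<uint32_t>(height);
    const uint32_t offBits = 14 + 40;                 // file header + info header

    std::vector<uint8_t> out;
    out.reserve(offBits + imageSize);

    // Bitmap file header (14 bytes).
    out.push_back('B'); out.push_back('M');
    putLE32(out, offBits + imageSize);  // total file size
    putLE32(out, 0);                    // two reserved 16-bit fields
    putLE32(out, offBits);              // offset of the pixel data

    // Bitmap info header (40 bytes).
    putLE32(out, 40);
    putLE32(out, static_cast<uint32_t>(width));
    putLE32(out, static_cast<uint32_t>(-height));  // negative height = top-down rows
    putLE16(out, 1);                               // planes
    putLE16(out, 24);                              // bits per pixel
    putLE32(out, 0);                               // compression: none
    putLE32(out, imageSize);
    putLE32(out, 0); putLE32(out, 0);              // pixels per meter (unspecified)
    putLE32(out, 0); putLE32(out, 0);              // palette fields (unused)

    // Pixel rows, each padded with zeros to a multiple of 4 bytes.
    for (int32_t y = 0; y < height; ++y) {
        const uint8_t* row = scan0 + static_cast<std::ptrdiff_t>(y) * stride;
        out.insert(out.end(), row, row + usedBytes);
        out.resize(out.size() + (rowBytes - usedBytes), 0);
    }
    return out;
}
```

Other pixel formats need a different bit count and possibly a palette, so the PixelFormat member of BitmapData should be checked before relying on the 24bpp assumption.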