I'm trying to use Google Oboe for a 3D audio processing app due to its low latency. The app will have a C++ backend, which does the processing, and the frontend is done with Flutter. I'm running a couple of tests to see if it'll work, but I'm having issues loading assets from Flutter into Oboe. I checked the RhythmGame example in Oboe's repo, done in Java, but couldn't quite find a way of doing that straight from Dart to C++.
The connection between the frontend and backend is through dart:ffi.
Here's what I've tried so far. Based on the example published by Richard Heap here, I changed the noise variable from just a sine wave to a short fragment of a song in a WAV file:
class _MyAppState extends State<MyApp> {
final stream = OboeStream();
var noise = Float32List(512);
Timer t;
@override
void initState() {
super.initState();
// for (var i = 0; i < noise.length; i++) {
// noise[i] = sin(8 * pi * i / noise.length);
// }
_loadSound();
}
void _loadSound() async {
final ByteData data = await rootBundle.load('assets/song_cut.wav');
noise = data.buffer.asFloat32List();
}
(...)
Then this function in Dart calls the Dart wrapper of the native library:
void start() {
stream.start();
var interval = (512000 / stream.getSampleRate()).floor() + 1;
t = Timer.periodic(Duration(milliseconds: interval), (_) {
stream.write(noise);
});
}
The wrapper in Dart is:
void write(Float32List original) {
var length = original.length;
var copy = allocate<Float>(count: length)
..asTypedList(length).setAll(0, original);
FfiGoogleOboe()._streamWrite(_nativeInstance, copy, length);
free(copy);
}
_streamWrite is the native function in C++:
EXTERNC void stream_write(void* ptr, void* data, int32_t size) {
auto stream = static_cast<OboeFfiStream*>(ptr);
auto dataToWrite = static_cast<float*>(data);
stream->write(dataToWrite, size);
}
void OboeFfiStream::write(float *data, int32_t size) {
managedStream->write(data, size, 1000000);
}
Now I can hear the song, but it comes out with too much distortion. When trying with the sine wave I could hear it too, but it also had some distortion. I'm not yet using Oboe's callback mode, since I wanted to see if this worked first.
1 - what format is your WAV file in? Is it 32 bit floats? Don't forget that WAV files have a header, so you should discard the first few tens of bytes (up to the data segment). Be sure that you start reading the audio data on a float boundary (which may not be a multiple of 4 if the header isn't). If necessary, just use a hex editor to ascertain the offset of the float data and start reading there. Or, truncate the header and rename your asset to song_cut.raw. Audacity should be able to produce a header-less raw audio file.
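As an illustration of skipping the header, here is a minimal C++ sketch (my own, not from the Oboe repo; findWavData is a hypothetical name) that walks the RIFF chunks of a WAV file already loaded into memory and returns a pointer to the raw samples:
#include <cstdint>
#include <cstring>

// Returns a pointer to the first byte of the 'data' chunk, or nullptr.
// Assumes a well-formed little-endian RIFF/WAVE file held in memory.
const uint8_t* findWavData(const uint8_t* bytes, size_t size, uint32_t* outLen) {
    if (size < 12 || std::memcmp(bytes, "RIFF", 4) != 0 ||
        std::memcmp(bytes + 8, "WAVE", 4) != 0)
        return nullptr;
    size_t pos = 12;
    while (pos + 8 <= size) {
        uint32_t chunkLen;
        std::memcpy(&chunkLen, bytes + pos + 4, 4);
        if (std::memcmp(bytes + pos, "data", 4) == 0) {
            *outLen = chunkLen;
            return bytes + pos + 8;           // first byte of audio samples
        }
        pos += 8 + chunkLen + (chunkLen & 1); // chunks are word-aligned
    }
    return nullptr;
}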
2 - What sample rate is your audio clip recorded at? Does that match the sample rate of the device? (Note that iOS devices are normally 44.1k, but Android devices are frequently 48k. When using an Android emulator on macOS, who knows what the reported sample rate will be! Expect pitch distortion if your rates don't match - or use a resampler. I think Oboe has one. Alternatively, the sample repo associated with the talk contains one you can use.)
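If the rates don't match and you just want a quick stopgap before wiring up a proper resampler, a naive linear-interpolation sketch (mine, assuming mono float samples; a real resampler would band-limit to avoid aliasing) looks like this:
#include <cstddef>
#include <vector>

std::vector<float> resampleLinear(const std::vector<float>& in,
                                  double srcRate, double dstRate) {
    const double step = srcRate / dstRate; // input samples per output sample
    std::vector<float> out;
    for (double pos = 0.0; pos + 1.0 < in.size(); pos += step) {
        const size_t i = static_cast<size_t>(pos);
        const double frac = pos - i;
        out.push_back(static_cast<float>(in[i] * (1.0 - frac) + in[i + 1] * frac));
    }
    return out;
}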
3 - note that the timer interval is finely tuned (for demo purposes) to the approximate time taken to deliver 512 samples at the sound card rate. This might be ok for demos, but isn't for real life. Also, your wav file probably doesn't have exactly 512 samples in it. Either adjust your audio loop to 512 samples, or adjust the 512000 constant to match the number of samples in your loop.
4a - You aren't using the callback method yet, but you probably should as soon as possible. One method I've had success with is to use a lock-free circular buffer. The Oboe callback tries to empty the buffer, while the Dart timer routine tries to fill it. The bigger the buffer the less chance there is of an underflow, but the worse the latency.
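To make 4a concrete, here is a minimal single-producer/single-consumer lock-free ring sketch in C++ (names and sizes are mine; the capacity must be a power of two for the cheap index masking used here):
#include <atomic>
#include <cstddef>
#include <vector>

class SpscRing {
public:
    explicit SpscRing(size_t capacityPow2)
        : buf_(capacityPow2), mask_(capacityPow2 - 1) {}

    // Called from the writer (Dart-driven) thread only.
    size_t write(const float* src, size_t n) {
        size_t w = write_.load(std::memory_order_relaxed);
        size_t r = read_.load(std::memory_order_acquire);
        size_t free = buf_.size() - (w - r);
        if (n > free) n = free;               // drop what doesn't fit
        for (size_t i = 0; i < n; ++i) buf_[(w + i) & mask_] = src[i];
        write_.store(w + n, std::memory_order_release);
        return n;
    }

    // Called from the audio callback only; zero-fills on underflow.
    void read(float* dst, size_t n) {
        size_t r = read_.load(std::memory_order_relaxed);
        size_t w = write_.load(std::memory_order_acquire);
        size_t avail = w - r;
        size_t m = n < avail ? n : avail;
        for (size_t i = 0; i < m; ++i) dst[i] = buf_[(r + i) & mask_];
        for (size_t i = m; i < n; ++i) dst[i] = 0.0f; // underflow -> silence
        read_.store(r + m, std::memory_order_release);
    }

private:
    std::vector<float> buf_;
    const size_t mask_;
    std::atomic<size_t> write_{0};
    std::atomic<size_t> read_{0};
};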
4b - The ideal solution would be to have the Oboe callback call up into Dart, but I haven't found a way to do that as C->Dart calls must be on the main Dart thread, but the Oboe callbacks are surely on a high-priority IO thread.
Related
I am working from a demo in the book "Learning Core Audio: A Hands-On Guide to Audio Programming for Mac and iOS." Chapter 8 shows how to set up a simple AudioUnit graph to play through from the AUHAL input unit to an output unit. This setup doesn't actually connect the audio units; instead, both units use a callback and pass audio data through an instance of CARingBuffer. I'm coding for macOS 10.15.6, and using code directly from the publisher here. (The book's diagram shows the flow: AUHAL input unit -> CARingBuffer -> default output unit.)
The code builds and runs, but I get no audio. Note that later, after introducing a speech synthesis unit, I do get playback, so I know the basics are working.
InputRenderProc asks the AUHAL unit for input and stores it in the ring buffer.
MyAUGraphPlayer *player = (MyAUGraphPlayer*) inRefCon;
// have we ever logged input timing? (for offset calculation)
if (player->firstInputSampleTime < 0.0) {
player->firstInputSampleTime = inTimeStamp->mSampleTime;
if ((player->firstOutputSampleTime > -1.0) &&
(player->inToOutSampleTimeOffset < 0.0)) {
player->inToOutSampleTimeOffset = player->firstInputSampleTime - player->firstOutputSampleTime;
}
}
// render into our buffer
OSStatus inputProcErr = noErr;
inputProcErr = AudioUnitRender(player->inputUnit,
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
player->inputBuffer);
if (! inputProcErr) {
inputProcErr = player->ringBuffer->Store(player->inputBuffer,
inNumberFrames,
inTimeStamp->mSampleTime);
UInt32 sz = sizeof(player->inputBuffer); // note: size of the pointer, not the buffer
printf ("stored %d frames at time %f (%d bytes)\n", inNumberFrames, inTimeStamp->mSampleTime, sz);
for (int i = 0; i < player->inputBuffer->mNumberBuffers; i++ ){
//printf("stored audio string[%d]: %s\n", i, player->inputBuffer->mBuffers[i].mData);
}
}
If I uncomment the printf statement, I see what looks like audio data being stored.
stored audio string[1]: #P'\274a\353\273\336^\274x\205 \2741\330B\2747'\274\371\361U\274\346\274\274}\212C\274\334\365%\274\261\367\273\340\307/\274E
stored 512 frames at time 134610.000000 (8 bytes)
However, when I fetch from the ring buffer in the GraphRenderCallback like this...
MyAUGraphPlayer *player = (MyAUGraphPlayer*) inRefCon;
// have we ever logged output timing? (for offset calculation)
if (player->firstOutputSampleTime < 0.0) {
player->firstOutputSampleTime = inTimeStamp->mSampleTime;
if ((player->firstInputSampleTime > -1.0) &&
(player->inToOutSampleTimeOffset < 0.0)) {
player->inToOutSampleTimeOffset = player->firstInputSampleTime - player->firstOutputSampleTime;
}
}
// copy samples out of ring buffer
OSStatus outputProcErr = noErr;
// new CARingBuffer doesn't take bool 4th arg
outputProcErr = player->ringBuffer->Fetch(ioData,
inNumberFrames,
inTimeStamp->mSampleTime + player->inToOutSampleTimeOffset);
I get nothing (I know I can't expect proper null-terminated string output, but I thought I'd see something).
fetched 512 frames at time 160776.000000
fetched audio string[0, size 2048]: xx
fetched audio string[1, size 2048]: xx
fetched 512 frames at time 161288.000000
fetched audio string[0, size 2048]: xx
fetched audio string[1, size 2048]: xx
This is not a permission problem; I have other non-AudioUnit code that can get mic input. In addition, I created a plist that makes this app prompt for mic access every time, so I know that is working. I cannot understand why data goes into this ring buffer, but never comes out.
These days you need to declare that you want to use the microphone, providing an explanation string. This wasn't the case in 2012 when Learning Core Audio was published.
In short, you now need to:
add an NSMicrophoneUsageDescription string to your Info.plist (see the snippet after this list)
add sandboxing capability and enable Audio Input
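For reference, the Info.plist entry for the first step looks like this (the description string can be anything; it's what the permission prompt shows):
<key>NSMicrophoneUsageDescription</key>
<string>This tool records audio from the microphone for playthrough.</string>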
The sample code you're using is a command-line tool, so adding an Info.plist to it in Xcode isn't as simple as with a .app package. Also, the code does not seem to work if you run it from Xcode; in my case it has to be run from Terminal.app. This may be because my Terminal has microphone permissions (viewable in System Preferences > Security & Privacy > Microphone). You can, and probably should, explicitly request microphone access from the user (yourself, in this case!) by using requestAccessForMediaType on AVCaptureDevice. That's right, AVFoundation code in a Core Audio tutorial; what's the world coming to.
There are more details on the above steps in this answer.
p.s. I think the person who thought capturing zeroes instead of returning an error was a good idea is probably good friends with whoever invented returning HTTP 200 with an error code in the body.
I have DSP software which captures the audio playing using the WASAPI API in shared loopback mode.
hr = _pAudioClient->Initialize(AUDCLNT_SHAREMODE_SHARED, AUDCLNT_STREAMFLAGS_LOOPBACK, 0, 0, _pFormat, 0);
This part works fine, but now I want to be able to detect the number of channels actually playing. In other words, how would I be able to detect whether the audio playing is stereo, 5.1, or 7.1?
The problem is:
* Since loopback has to use shared mode, there could be multiple sources playing.
* The analysis has to be done in real time; it can't wait until playback is done.
* It must detect the difference between a channel not used at all by any playback source and a channel that is temporarily silent.
The best solution in my mind would be if I could retrieve a list of all playback sources/sub-mixes and query each for its number of channels. That way I don't have to analyse the audio data stream itself.
Loopback recording takes place in the mix format defined on the endpoint, so regardless of what the original audio format was, you get the data in the mix format, mixed from possibly multiple playing sources and converted to that shared format.
Device Formats
Loopback Recording
WASAPI loopback contains the mix of all audio being played...
The GetMixFormat method retrieves the stream format that the audio engine uses for its internal processing of shared-mode streams...
After an application has used GetMixFormat or IsFormatSupported to find an appropriate format for a shared-mode or exclusive-mode stream, the application can call the Initialize method to initialize a stream with that format. An application that attempts to initialize a shared-mode stream with a format that is not identical to the mix format obtained from the GetMixFormat method, but that has the same number of channels and the same sample rate as the mix format, is likely to succeed. Before calling Initialize, the application can call IsFormatSupported to verify that Initialize will accept the format.
That is, even though WASAPI offers some flexibility in audio format, channel configuration and sample rate are defined by shared format when it comes to loopback capture.
As you are getting the mix, you cannot really identify "non-active" channels: this information is lost during mixing down to the shared format.
Also, the actual shared format can be configured interactively via Control Panel (in the playback device's advanced properties).
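To see what that shared format is from code, a minimal sketch (assuming _pAudioClient is the activated IAudioClient from the question) is:
#include <audioclient.h>
#include <cstdio>

WAVEFORMATEX* pMixFormat = nullptr;
HRESULT hr = _pAudioClient->GetMixFormat(&pMixFormat);
if (SUCCEEDED(hr)) {
    // The channel count/sample rate here are the endpoint's, not any source's.
    printf("mix format: %u channels, %lu Hz\n",
           pMixFormat->nChannels, pMixFormat->nSamplesPerSec);
    CoTaskMemFree(pMixFormat); // GetMixFormat allocates with CoTaskMemAlloc
}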
OK, I now have a solution to my problem. As far as I know you cannot detect sub-mixes in the shared mix, so the only option was to analyze the audio stream/capture buffer.
First, during my main capture loop, I set the current timestamp for all channels playing.
const time_t now = Date::getCurrentTimeMillis();
//Iterate all capture frames
for (i = 0; i < numFramesAvailable; ++i) {
    for (j = 0; j < _nChannelsIn; ++j) {
        //Identify which channels are playing (samples are interleaved per frame).
        if (pCaptureBuffer[i * _nChannelsIn + j] != 0) {
            _pUsedChannels[j] = now;
        }
    }
}
Then, every second, I call this function, which evaluates whether a channel has played during the last second. Based upon which channels are playing, I can do conditional routing.
void checkUsedChannels() {
const time_t now = Date::getCurrentTimeMillis();
//Compare now against last used timestamp and determine active channels
for (size_t i = 0; i < _nChannelsIn; ++i) {
if (now - _pUsedChannels[i] > 1000) {
_pUsedChannels[i] = 0;
}
}
//Update conditional routing
for (const Input *pInput : _inputs) {
    pInput->evalConditions();
}
}
Very simple solution but it appears to be working.
I am trying to plot some serial data in my Qt GUI program using the QCustomPlot class. I had no trouble when I tried to plot data at low sampling frequencies, like 100 samples/second: the graph was really nice and plotted the data smoothly. But at high sampling rates, such as 1000 samples/second, the plotter became a bottleneck for the serial read function. It slowed the serial handling down so much that there was a huge delay, about 4-5 seconds behind the device. Put simply, the plotter could not keep up with the data stream. So, is there any common issue I don't know about, or any recommendation?
I have thought of these scenarios:
1 - divide the whole program into 2 or 3 threads. For example, the serial part runs in one thread and the plotting part runs in another, and the two threads communicate with a QSemaphore.
2 - the fps of QCustomPlot is limited, but there should be a solution, because NI LabVIEW plots up to 2k samples without any delay.
3 - design a new virtual serial device in the USB protocol. Right now I am using an FT232RL serial-to-USB converter.
4 - change the programming language. What is the situation and class support in C# or Java for real-time plotting? (I know it sounds like a kid talking, but this is a pretext to gain experience in other languages.)
My serial device's send function (it is a toy device for the experiment; there is no serious coding) is briefly this:
void progTask()
{
    DelayMsec(1); //my delay function, millisecond
    //read value from adc13
    Adc13Read(adcValue.ui32Part);
    sendData[0] = (char)'a';
    sendData[1] = (char)'k';
    sendData[2] = adcValue.bytes[0];
    sendData[3] = (adcValue.bytes[1] & 15);
    //send test data
    UARTSend(UART6_BASE, &sendData[0], 4);
}
The Qt program's read function is this:
union {
    unsigned char bytes[2];
    unsigned int intPart;
    unsigned char *ptr;
} serData;
void MedicalSoftware::serialReadData()
{
if(serial->bytesAvailable()<4)
{
//if the frame size is less than 4 bytes return and
//wait to full serial receive buffer
//note: serial->setReadBufferSize(4)!!!!
return;
}
QByteArray serialInData = serial->readAll();
//my algorithm
if(serialInData[0] == 'a' && serialInData[1] == 'k')
{
serData.bytes[0] = serialInData[2];
serData.bytes[1] = serialInData[3];
}else if(serialInData[2] == 'a' && serialInData[3] == 'k')
{
serData.bytes[0] = serialInData[0];
serData.bytes[1] = serialInData[1];
}
else if(serialInData[1] == 'a' && serialInData[2] == 'k')
{
serial->read(1);
return;
}else if(serialInData[0] == 'k' && serialInData[3] == 'a')
{
serData.bytes[0] = serialInData[1];
serData.bytes[1] = serialInData[2];
}
plotMainGraph(serData.intPart);
serData.intPart = 0;
}
And the QCustomPlot setup function is:
void MedicalSoftware::setGraphsProperties()
{
//MAIN PLOTTER
ui->mainPlotter->addGraph();
ui->mainPlotter->xAxis->setRange(0,2000);
ui->mainPlotter->yAxis->setRange(-0.1,3.5);
ui->mainPlotter->xAxis->setLabel("Time(s)");
ui->mainPlotter->yAxis->setLabel("Magnitude(mV)");
QSharedPointer<QCPAxisTickerTime> timeTicker(new QCPAxisTickerTime());
timeTicker->setTimeFormat("%h:%m:%s");
ui->mainPlotter->xAxis->setTicker(timeTicker);
ui->mainPlotter->axisRect()->setupFullAxesBox();
QPen pen;
pen.setColor(QColor("blue"));
ui->mainPlotter->graph(0)->setPen(pen);
dataTimer = new QTimer;
}
And last is the plot function:
void MedicalSoftware::plotMainGraph(const quint16 serData)
{
static QTime time(QTime::currentTime());
double key = time.elapsed()/1000.0;
static double lastPointKey = 0;
if(key-lastPointKey>0.005)
{
double value0 = serData*(3.3/4096);
ui->mainPlotter->graph(0)->addData(key,value0);
lastPointKey = key;
}
ui->mainPlotter->xAxis->setRange(key+0.25, 2, Qt::AlignRight);
counter++;
ui->mainPlotter->replot();
counter = 0;
}
Quick answer:
Have you tried:
ui->mainPlotter->replot(QCustomPlot::rpQueuedReplot);
According to the documentation, it can improve performance when doing a lot of replots.
Longer answer:
My feeling on your code is that you are trying to replot as often as you can to get a "real time" plot. But if you are on a PC with a desktop OS there is no such thing as real time.
What you should care about is:
Ensure that the code that reads/writes the serial port is not delayed too much. "Too much" is to be interpreted with respect to the connected hardware. If it gets really time-critical (which seems to be your case), you have to optimize your read/write functions and, if needed, put them alone in a thread. This can go as far as reserving a full hardware CPU core for that thread.
Ensure that the graph is refreshed fast enough for the human eye. You do not need to do a full repaint each time you receive a single data point.
In your case you receive 1000 samples/s, which makes one sample every ms. That is quite fast, because it is beyond the default timer resolution of most desktop OSes. It means you will likely have more than a single data point available when serialReadData() is called, so you could optimize by calling it less often (e.g. call it every 10 ms and read 10 data points each time). Then you could call replot() every 30 ms, which would add 30 new data points each time, skip about 29 replot() calls every 30 ms compared to your code, and give you ~30 fps. A sketch of this batching follows.
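A hedged sketch of that batching, assuming the question's MedicalSoftware class and a hypothetical replotTimer member (QTimer*):
// serialReadData() should only parse bytes and call addData(); no replot() there.
connect(serial, &QSerialPort::readyRead,
        this, &MedicalSoftware::serialReadData);
replotTimer = new QTimer(this);
connect(replotTimer, &QTimer::timeout, this, [this]() {
    ui->mainPlotter->replot(QCustomPlot::rpQueuedReplot); // repaint at ~33 fps
});
replotTimer->start(30);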
1 - divide the whole program into 2 or 3 threads. For example, the serial part runs in one thread and the plotting part runs in another, and the two threads communicate with a QSemaphore
Dividing the GUI from the serial part into 2 threads is good, because you will prevent a bottleneck in the GUI from blocking the serial communication. Also, you could skip using a semaphore and simply rely on Qt signal/slot connections (connected in Qt::QueuedConnection mode).
4 - change the programming language. What is the situation and class support in C# or Java for real-time plotting? (I know it sounds like a kid talking, but this is a pretext to gain experience in other languages.)
Changing the programming language will, in the best case, change nothing, and could hurt your performance, especially if you move toward languages that are not compiled to native CPU instructions.
Changing the plotting library on the other hand could change the performances. You can look at Qt Charts and Qwt. I do not know how they compare to QCustomPlot though.
I'm new to Qt's multimedia library, and in my application I want to mix audio from multiple input devices (e.g. microphones) in order to stream it via TCP.
As far as I know, I have to obtain the specific QAudioDeviceInfo for all needed devices first - together with a corresponding QAudioFormat object - and use this with QAudioInput. Then I simply call start() for every created QAudioInput object and read out the pending bytes with readLine().
But how can I mix audio data of multiple devices to one buffer?
I am not sure if there is any Qt-specific method or class to do this. However, it's pretty simple to do it yourself.
The most basic way (assuming you are using PCM) is to simply add the two streams/buffers together word by word (if I recall correctly, they are 16-bit PCM words).
So if you have two input buffers:
#include <cstdint> // int16_t

int16_t buff1[10];
int16_t buff2[10];
int16_t mixBuff[10];
// Fill them...
// ... code goes here to read from the buffers ...
// Add them (effectively mix them)
for (int i = 0; i < 10; i++)
{
    mixBuff[i] = buff1[i] + buff2[i];
}
Now, this is very crude and does not take any scaling into consideration. So imagine buff1 and buff2 both use 80% of the dynamic range (call this full volume, beyond which you get distortion); when you add them together you will overrun the sample type (16-bit signed PCM maxes out at 32767, so 30000 + 30000 will overrun).
Each time you mix, you effectively need to halve the two inputs (so 32767/2 + 32767/2 = 32767... i.e. when you add them up you can't overrun). So your mix code is like this:
for (int i = 0; i < 10; i++)
{
mixBuff[i] = (buff1[i] >> 1) + (buff2[i] >> 1);
}
There is much more you can do (noise removal etc.), but then the maths starts getting a bit hairy. This is very simple. You can use the shift afterwards to increase/decrease the volume as a simple volume control, if you want. An alternative clamping approach is sketched below.
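For completeness, a variant I sometimes use (a sketch, not a Qt API) mixes at full gain in a wider type and clamps, so quiet passages keep their volume and only loud peaks distort:
#include <cstdint>

void mixSaturate(const int16_t* a, const int16_t* b, int16_t* out, int n)
{
    for (int i = 0; i < n; i++) {
        int32_t s = int32_t(a[i]) + int32_t(b[i]); // widen: the sum cannot overflow here
        if (s > 32767) s = 32767;                  // clamp to the 16-bit range
        if (s < -32768) s = -32768;
        out[i] = int16_t(s);
    }
}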
EDIT
One thing to note: you are using readLine() (which the docs say reads the data out as ASCII). I always use read(), for which the docs do not state the "format" it is read out in, but I am assuming binary. So this code may not work if you use readLine(), but I have never tried it. It works well with read(); you don't really want to be working in ASCII if you want to manipulate the data.
Can someone explain how snd_pcm_writei
snd_pcm_sframes_t snd_pcm_writei(snd_pcm_t *pcm, const void *buffer,
snd_pcm_uframes_t size)
works?
I have used it like so:
for (int i = 0; i < 1; i++) {
f = snd_pcm_writei(handle, buffer, frames);
...
}
Full source code at http://pastebin.com/m2f28b578
Does this mean that I shouldn't give snd_pcm_writei() the number of all the frames in the buffer, but only
sample_rate * latency = frames
?
So if I e.g. have:
sample_rate = 44100
latency = 0.5 [s]
all_frames = 100000
The number of frames that I should give to snd_pcm_writei() would be
sample_rate * latency = frames
44100*0.5 = 22050
and the number of iterations the for-loop should do would be:
(int) 100000/22050 = 4; with frames=22050
and one extra iteration, but with only
100000 mod 22050 = 11800
frames?
Is that how it works?
Louise
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html#gf13067c0ebde29118ca05af76e5b17a9
frames should be the number of frames (samples) you want to write from the buffer. Your system's sound driver will start transferring those samples to the sound card right away, and they will be played at a constant rate.
The latency is introduced in several places. There's latency from the data buffered by the driver while waiting to be transferred to the card. There's at least one buffer full of data that's being transferred to the card at any given moment, and there's buffering on the application side, which is what you seem to be concerned about.
To reduce latency on the application side you need to write the smallest buffer that will work for you. If your application performs a DSP task, that's typically one window's worth of data.
There's no advantage in writing small buffers in a loop - just go ahead and write everything in one go - but there's an important point to understand: to minimize latency, your application should write to the driver no faster than the driver is writing data to the sound card, or you'll end up piling up more data and accumulating more and more latency.
For a design that makes producing data in lockstep with the sound driver relatively easy, look at JACK (http://jackaudio.org/), which is based on registering a callback function with the sound playback engine. In fact, you're probably better off just using JACK instead of trying to do it yourself if you're really concerned about latency. A sketch of the callback model follows.
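A bare-bones sketch of that callback model (the client name and the empty callback body are mine):
#include <jack/jack.h>

int process(jack_nframes_t nframes, void* arg) {
    // Produce exactly nframes samples here, in lockstep with the engine.
    return 0;
}

int main() {
    jack_client_t* client = jack_client_open("demo", JackNullOption, nullptr);
    if (!client) return 1;
    jack_set_process_callback(client, process, nullptr);
    jack_activate(client);
    // ... run until done, then jack_client_close(client);
}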
I think the reason for the "premature" device closure is that you need to call snd_pcm_drain(handle); prior to snd_pcm_close(handle); to ensure that all data is played before the device is closed.
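In sketch form (assuming the question's handle and buffer; error handling trimmed to the essentials):
#include <alsa/asoundlib.h>

snd_pcm_sframes_t written = snd_pcm_writei(handle, buffer, all_frames);
if (written < 0)
    snd_pcm_recover(handle, (int)written, 0); // recover from underruns etc.
snd_pcm_drain(handle); // block until everything queued has actually played
snd_pcm_close(handle);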
I did some testing to determine why snd_pcm_writei() didn't seem to work for me, using several examples I found in the ALSA tutorials, and what I concluded was that the simple examples were calling snd_pcm_close() before the sound device could play the complete stream sent to it.
I set the rate to 11025, used a 128-byte random buffer, and looped snd_pcm_writei() 11025/128 ≈ 86 times for each second of sound; two seconds of sound required 86*2 calls to snd_pcm_writei().
In order to give the device sufficient time to convert the data to audio, I used a for loop after the snd_pcm_writei() loop to delay execution of the snd_pcm_close() function.
After testing, I had to conclude that the sample code didn't supply enough samples to overcome the device latency before the snd_pcm_close() function was called, which implies that the close function has less latency than snd_pcm_writei().
If the ALSA driver's start threshold is not set properly (in your case it seems to be about 2 s), then you will need to call snd_pcm_start() to start the data rendering immediately after snd_pcm_writei().
Or you may set an appropriate threshold in the SW params of the ALSA device, as sketched below.
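A sketch of setting the start threshold via the SW params (threshold_frames is a placeholder; a common choice is the buffer size):
snd_pcm_sw_params_t *sw;
snd_pcm_sw_params_alloca(&sw);
snd_pcm_sw_params_current(handle, sw);
snd_pcm_sw_params_set_start_threshold(handle, sw, threshold_frames);
snd_pcm_sw_params(handle, sw); // apply the new SW params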
ref:
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m___s_w___params.html