Relation between QAudioInput bufferSize() and bytesReady() in Qt - C++

I am trying to understand the relation between bufferSize() and bytesReady() for the QAudioInput class in Qt.
Assume that I have:
m_audioInput = new QAudioInput(m_Inputdevice, m_format, this);
bs = m_audioInput->bufferSize();
br = m_audioInput->bytesReady();
When I look at the values of bs and br (these are default values and I did not change the buffer size), I see that bs is 5 times larger than br. So it looks like there is a buffer that holds 5 blocks of audio input data. My question:
Is this a circular buffer? If I have these:
m_input = m_audioInput->start();
connect(m_input, SIGNAL(readyRead()), SLOT(myFunc()));
Then when I perform a read by:
void MainClass::myFunc()
{
    qint64 l = m_input->read(m_buffer.data(), br);
    .
    .
}
Does it read from the buffer in a circular manner? i.e. if I perform the read 2 times consecutively after a readyRead() is emitted, does the buffer pointer move from the first block to the second block (if it has 5 blocks in total)?
Is there any documentation on the buffer pointer, and if it is a circular buffer, etc.?
Are there automatic read and write pointers to the buffer? Do I need to manage those, or is that handled automatically?
Any help and pointers related to this are very much appreciated.

I don't really understand your use case. First thing, I suppose when you call
br = m_audioInput->bytesReady();
you are either in QAudio::ActiveState or QAudio::IdleState. Otherwise br is just junk.
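For example, a minimal guard (Qt 5 API) might look like this:

if (m_audioInput->state() == QAudio::ActiveState
        || m_audioInput->state() == QAudio::IdleState) {
    int br = m_audioInput->bytesReady();   // meaningful only in these states
    // ... use br ...
}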
So it looks like there is a buffer that holds 5 blocks of audio input data.
A sample is the basic unit of audio data. If by that you mean 5 samples, then that is not correct. There is also no such thing as a block of audio when it comes to non-encoded data.
You can compute how many seconds (or milliseconds) of audio are in your buffer:
buffer size / sample size gives the number of samples
1 / sampling frequency gives the duration of one sample in seconds
sample duration x number of samples gives the size of the buffer in seconds.
That's in mono mode (one channel). For more channels you also need to divide by the number of channels.
In Qt (note that QAudioFormat::sampleSize() returns the sample size in bits, so divide by 8 to get bytes):
double bufferSizeSeconds = (1.0 / m_format->sampleRate())
                         * (m_audioInput->bufferSize() / (m_format->sampleSize() / 8))
                         * (1.0 / m_format->channelCount());
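For instance, a worked example with assumed values, say a 32000-byte buffer of 16-bit stereo audio at 44100 Hz:

// Assumed values: bufferSize() = 32000 bytes, sampleSize() = 16 bits (2 bytes),
// channelCount() = 2, sampleRate() = 44100 Hz.
double seconds = (1.0 / 44100.0)     // duration of one sample
               * (32000.0 / 2.0)     // bytes -> samples: 16000 samples
               * (1.0 / 2.0);        // two channels -> 8000 sample frames
// seconds = 8000 / 44100, roughly 0.18 s of audio in the buffer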


Raw files aren't playing, or are playing incorrectly - Oboe (Android-ndk)

I'm attempting to play a raw (int16 PCM) audio file in my Android application. I've been following and reading through the Oboe documentation/samples to try to get one of my own audio files to play.
The audio file I need to play is roughly 6 KB, or 1592 frames (stereo).
Either no sound plays, or sound/jitter plays on startup (with varying output - see below)
Troubleshooting
Update
I have switched to floats for buffer queuing, instead of keeping everything to int16_t (and converting back to int16_t when done), although now I'm back to no sound.
The audio seems to be either not playing, or playing on startup (which is wrong). The sound should play after I press 'start'.
When the app was implemented with int16_t only, the premature sound was related to how big the buffer size was. If the buffer size is smaller than the audio file, the sound is very fast and clipped (more drone-like at lower buffer sizes). If it is bigger than the raw audio size, the sound seems to play on a loop and gets quieter at higher buffer sizes. The sound would also get "softer" when the start button was pressed. I'm not even entirely sure this means the raw audio was playing; it could just be random nonsense jitter from Android.
When filling the buffers with floats, and converting to int16_t afterwards, no audio is played.
(I have tried running systrace, but I honestly don't know what I'm looking for)
The stream opens fine.
The buffer size fails to be adjusted in createPlaybackStream() (although somehow it still sets it to twice the burst size)
The stream starts fine.
The Raw resources are being loaded fine.
Implementation
What I am currently trying in the builder:
Setting the callback to this, or onAudioReady()
Setting the performance mode to LowLatency
Setting the sharing mode to Exclusive
Setting the buffer capacity to (anything bigger than my audio file frame count)
Setting the burst size (frames per call back) to (anything equal to or lower than the buffer capacity / 2)
I am using the Player class and the AAssetManager class from the Rhythm Game sample here: https://github.com/google/oboe/blob/master/samples/RhythmGame. I am using these classes to load my resources and play the sound. Player.renderAudio writes the audio data to the output buffer.
Here are the relevant methods from my audio engine:
void AudioEngine::createPlaybackStream() {
    // Load the RAW PCM data files into memory
    std::shared_ptr<AAssetDataSource> soundSource(AAssetDataSource::newFromAssetManager(assetManager, "sound.raw", ChannelCount::Mono));
    if (soundSource == nullptr) {
        LOGE("Could not load source data for sound");
        return;
    }
    sound = std::make_shared<Player>(soundSource);

    AudioStreamBuilder builder;
    builder.setCallback(this);
    builder.setPerformanceMode(PerformanceMode::LowLatency);
    builder.setSharingMode(SharingMode::Exclusive);
    builder.setChannelCount(mChannelCount);

    Result result = builder.openStream(&stream);
    if (result == Result::OK && stream != nullptr) {
        mSampleRate = stream->getSampleRate();
        mFramesPerBurst = stream->getFramesPerBurst();

        int channelCount = stream->getChannelCount();
        if (channelCount != mChannelCount) {
            LOGW("Requested %d channels but received %d", mChannelCount, channelCount);
        }

        // Set the buffer size to (burst size * 2) - this will give us the
        // minimum possible latency while minimizing underruns.
        // Capture the result so it can actually be checked below.
        auto setBufferSizeResult = stream->setBufferSizeInFrames(mFramesPerBurst * 2);
        if (setBufferSizeResult != Result::OK) {
            LOGW("Failed to set buffer size. Error: %s", convertToText(setBufferSizeResult.error()));
        }

        // Start the stream - the dataCallback function will start being called
        result = stream->requestStart();
        if (result != Result::OK) {
            LOGE("Error starting stream. %s", convertToText(result));
        }
    } else {
        LOGE("Failed to create stream. Error: %s", convertToText(result));
    }
}
DataCallbackResult AudioEngine::onAudioReady(AudioStream *audioStream, void *audioData, int32_t numFrames) {
    int16_t *outputBuffer = static_cast<int16_t *>(audioData);
    sound->renderAudio(outputBuffer, numFrames);
    return DataCallbackResult::Continue;
}

// When the 'start' button is pressed, it calls this method with true.
// There should be no sound on app start-up until this button is pressed.
// Sound stops when 'stop' is pressed.
void setPlaying(bool isPlaying) {
    sound->setPlaying(isPlaying);
}
Setting the buffer capacity to (anything bigger than my audio file frame count)
You don't need to set the buffer capacity. This will be set automatically at a reasonable level for you. Typically ~3000 frames. Note that buffer capacity is different from buffer size which defaults to 2*framesPerBurst.
Setting the burst size (frames per call back) to (anything equal to or lower than the buffer capacity / 2)
Again, don't do this. onAudioReady will be called every time the stream requires more audio data and numFrames indicates how many frames you should supply. If you override this value with a value which isn't an exact ratio of the audio device's native burst size (typical values are 128, 192 and 240 frames depending on underlying hardware) then you may get audio glitches.
I have switched to floats for buffer queuing
The format which you need to supply data in is determined by the audio stream and it is only known after the stream has been opened. You can get it by calling stream->getFormat().
In the RhythmGame sample (at least the version you're referring to) here's how the formats work:
Source file is converted from 16-bit to float inside AAssetDataSource::newFromAssetManager (floats are the preferred format for any kind of signal processing)
If the stream format is 16-bit then convert it back inside onAudioReady
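For illustration, here is a minimal sketch of handling either format inside the callback. It assumes, as in the RhythmGame sample, that Player::renderAudio can render into a float buffer, and it uses Oboe's convertFloatToPcm16 helper:

DataCallbackResult AudioEngine::onAudioReady(AudioStream *audioStream, void *audioData, int32_t numFrames) {
    int32_t numSamples = numFrames * audioStream->getChannelCount();
    if (audioStream->getFormat() == AudioFormat::I16) {
        // Device wants 16-bit: render to a float scratch buffer, then convert.
        static std::vector<float> scratch;   // needs <vector>; a sketch, not allocation-safe for a real callback
        scratch.resize(numSamples);
        sound->renderAudio(scratch.data(), numFrames);
        oboe::convertFloatToPcm16(scratch.data(), static_cast<int16_t *>(audioData), numSamples);
    } else {
        // Device accepts float: render directly into the output buffer.
        sound->renderAudio(static_cast<float *>(audioData), numFrames);
    }
    return DataCallbackResult::Continue;
}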
1592 frames (stereo).
You said that your source was stereo but you're specifying it as mono here:
std::shared_ptr<AAssetDataSource> soundSource(AAssetDataSource::newFromAssetManager(assetManager, "sound.raw", ChannelCount::Mono));
Without doubt that will cause audio problems, because the AAssetDataSource will compute a numFrames value which is double the correct value. This will cause audio glitches, because half the time you'll be playing random parts of system memory.
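Assuming the asset really is stereo, the load call should declare it as such, e.g.:

std::shared_ptr<AAssetDataSource> soundSource(AAssetDataSource::newFromAssetManager(assetManager, "sound.raw", ChannelCount::Stereo));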

Correct use of memcpy

I have some problems with a project I'm doing. Basically I'm just using memcpy the wrong way. I know the theory of pointers/arrays/references and should know how to do this; nevertheless I've spent two days now without any progress. I'll try to give a short code overview and maybe someone sees a fault! I would be very thankful.
The setup: I'm using an ATSAM3x microcontroller together with a uC for signal acquisition. I receive the data over SPI.
I have an interrupt receiving the data whenever the uC has data available. The data is then stored in a buffer (int32_t buffer[1024 or 2048]). There is a counter that counts from 0 to the buffer size minus 1 and determines the place where the data point is stored. Currently I receive a test signal that is internally generated by the uC.
// ch1: receive 24-bit data in 8-bit chunks -> store in an int32_t
ch1 = ch1 | (SPI.transfer(PIN_CS, 0x00, SPI_CONTINUE) << 24) >> 8;
ch1 = ch1 | (SPI.transfer(PIN_CS, 0x00, SPI_CONTINUE) << 16) >> 8;
ch1 = ch1 | (SPI.transfer(PIN_CS, 0x00, SPI_CONTINUE) << 8) >> 8;

if (Not Important) {
    _ch1Buffer[_ch1SampleCount] = ch1;
    _ch1SampleCount++;
    if (_ch1SampleCount > SAMPLE_BUFFER_SIZE-1) _ch1SampleCount = 0;
}
This ISR is active all the time. Since I need raw data for signal processing and the buffer is changed by the ISR whenever a new data point is available, I want to copy parts of the buffer into a temporary "storage".
To do so, I have another, global counter which is incremented within the ISR. In the main loop, whenever the counter reaches a certain size, I call a method to get some of the buffer data (about 30 samples).
The method acquires the current position in the buffer:
int ch1Pos = _ch1SampleCount;
and then, depending on that position, I try to use memcpy to get my samples. Depending on the position in the buffer, there has to be a "wrap-around" to get the full set of samples:
if (ch1Pos >= (RAW_BLOCK_SIZE-1)) {
    memcpy(&ch1[0], &_ch1Buffer[ch1Pos-(RAW_BLOCK_SIZE-1)], RAW_BLOCK_SIZE*sizeof(int32_t));
} else {
    memcpy(&ch1[RAW_BLOCK_SIZE-1-ch1Pos], &_ch1Buffer[0], (ch1Pos)*sizeof(int32_t));
    memcpy(&ch1[0], &_ch1Buffer[SAMPLE_BUFFER_SIZE-1-(RAW_BLOCK_SIZE-ch1Pos)], (RAW_BLOCK_SIZE-ch1Pos)*sizeof(int32_t));
}
_ch1Buffer is the buffer containing the raw data
SAMPLE_BUFFER_SIZE is the size of that buffer
ch1 is the array which is supposed to hold the set of samples
RAW_BLOCK_SIZE is the size of that array
ch1Pos is the position of the last data point written to the buffer from the ISR at the time where this method is called
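For reference, here is a minimal sketch of the wrap-around copy described above; this is hypothetical code, assuming _ch1SampleCount (passed as head) is the index of the next slot the ISR will write, i.e. one past the newest sample:

#include <stdint.h>
#include <string.h>

// Hypothetical helper: copy the most recent RAW_BLOCK_SIZE samples,
// oldest first, out of the circular buffer.
void copyLatestSamples(const int32_t *ring, int head, int32_t *out)
{
    int start = head - RAW_BLOCK_SIZE;   // index of the oldest wanted sample
    if (start >= 0) {
        // No wrap: one contiguous copy.
        memcpy(out, &ring[start], RAW_BLOCK_SIZE * sizeof(int32_t));
    } else {
        // Wrap: 'tail' samples sit at the end of the ring, 'head' samples at the front.
        int tail = -start;
        memcpy(out, &ring[SAMPLE_BUFFER_SIZE - tail], tail * sizeof(int32_t));
        memcpy(out + tail, &ring[0], head * sizeof(int32_t));
    }
}

Note that the ISR can still overwrite ring entries while this runs, so in practice the copy should be guarded, for example by briefly disabling the interrupt.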
Technically I'm aware of the requirements, but apparently that's not enough ;-).
I know that the data received by the SPI interface is "correct". The problem is that this is not the case for the extracted samples. There are a lot of spikes in the data that indicate I've been reading something I wasn't supposed to read. I've changed the memcpy commands so often that I've completely lost track. The code sample above is one version of many, and while you're reading this I'm sure I've changed everything again.
I would appreciate every hint!
Thanks & Greetings!
EDIT
I've written down everything (again) on a sheet of paper and tested some configurations. This is the updated code for the memcpy part:
if (ch1Pos >= (RAW_BLOCK_SIZE-1)) {
    memcpy(&ch1[0], &_ch1Buffer[ch1Pos-(RAW_BLOCK_SIZE-1)], RAW_BLOCK_SIZE*sizeof(int32_t));
} else {
    memcpy(&ch1[RAW_BLOCK_SIZE-1-ch1Pos], &_ch1Buffer[0], (ch1Pos+1)*sizeof(int32_t));
    memcpy(&ch1[0], &_ch1Buffer[SAMPLE_BUFFER_SIZE-(RAW_BLOCK_SIZE-1-ch1Pos)], (RAW_BLOCK_SIZE-1-ch1Pos)*sizeof(int32_t));
}
This already made it a lot better; with all the changes everything had become rather messed up. Now there is just one error left: a periodic spike. I'll try to get more information, but I think it is a wrong access while wrapping around.
I've changed the if(_ch1SampleCount>SAMPLE_BUFFER_SIZE-1) _ch1SampleCount=0; to if(_ch1SampleCount>=SAMPLE_BUFFER_SIZE) _ch1SampleCount=0;.
EDIT II
To answer the questions of @David Schwartz:
SPI.transfer returns a single byte
The buffer is initialised once at startup: memset(_ch1Buffer,0,sizeof(int32_t)*SAMPLE_BUFFER_SIZE);
EDIT III
Sorry for the frequent updates, the comment section is getting too big.
I managed to get rid of a bunch of zero values at the beginning of the stream by decreasing ch1Pos: int ch1Pos = _ch1SampleCount; Now there is just one periodic "spike" (wrong value). It must be something with the split memcpy commands. I'll continue looking. If anyone has an idea ... :-)

How to mix audio input devices in Qt

I'm new to Qt's multimedia library and in my application I want to mix audio from multiple input devices (e.g. microphone), in order to stream it via TCP.
As far as I know I have to obtain the specific QAudioDeviceInfo for all needed devices first - together with a corresponding QAudioFormat object - and use this with QAudioInput. Then I simply call start() for every created QAudioInput object and read out pending bytes with readLine().
But how can I mix audio data of multiple devices to one buffer?
I am not sure if there is any Qt specific method / class to do this. However it's pretty simple to do it yourself.
In the most basic way (assuming you are using PCM), you can simply add the two streams/buffers together word by word (if I recall correctly, they are 16-bit PCM words).
So if you have two input buffers:
int16_t buff1[10];
int16_t buff2[10];
int16_t mixBuff[10];

// Fill them...
// ... code goes here to read from the buffers ...

// Add them (effectively mix them)
for (int i = 0; i < 10; i++)
{
    mixBuff[i] = buff1[i] + buff2[i];
}
Now, this is very crude and does not take any scaling into consideration. So imagine buff1 and buff2 both use 80% of the dynamic range (call this full volume, beyond which you get distortion); when you add them together you will get an overflow (a signed 16-bit sample maxes out at 32767, so 30000 + 30000 will overflow).
Each time you mix, you effectively need to halve the two inputs (so 32767/2 + 32767/2 = 32767, i.e. when you add them up you can't overflow). So your mix code is like this:
for (int i = 0; i < 10; i++)
{
    mixBuff[i] = (buff1[i] >> 1) + (buff2[i] >> 1);
}
There is much more you can do (noise removal etc.), but then the maths starts getting a bit hairy. This is very simple. You can also use the shift afterwards to increase or decrease volume as a simple volume control if you want.
EDIT
One thing to note... you are using readLine() (which the docs say reads the data out as ASCII). I always use read(), for which the docs do not state the "format" it is read out in, but I am assuming binary. So this code may not work if you use readLine(), though I have never tried it. It works well with read(); you don't really want to be working in ASCII if you want to manipulate the data.
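To make that concrete, here is a minimal sketch (Qt 5 API; the 4096-byte read size and the m_input1/m_input2 names are assumptions) that reads raw 16-bit PCM from two inputs with read() and mixes them with the halving approach above:

// m_input1 / m_input2 are the QIODevice* returned by QAudioInput::start()
QByteArray a = m_input1->read(4096);
QByteArray b = m_input2->read(4096);
int n = qMin(a.size(), b.size()) / int(sizeof(int16_t));
const int16_t *s1 = reinterpret_cast<const int16_t *>(a.constData());
const int16_t *s2 = reinterpret_cast<const int16_t *>(b.constData());
QByteArray mixed(n * int(sizeof(int16_t)), 0);
int16_t *out = reinterpret_cast<int16_t *>(mixed.data());
for (int i = 0; i < n; ++i)
    out[i] = (s1[i] >> 1) + (s2[i] >> 1);   // halve each input to avoid overflow
// 'mixed' can now be written to the TCP socket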

Read data from wav file before applying FFT

This is the first time I'm working with WAV files.
The problem is that I don't exactly understand how to properly read the stored data. My code for reading:
uint8_t* buffer = new uint8_t[BUFFER_SIZE];
std::cout << "Buffering data... " << std::endl;
while ((bytesRead = fread(buffer, sizeof buffer[0], BUFFER_SIZE / (sizeof buffer[0]), wavFile)) > 0)
{
    // do sth with buffer data
}
The sample file header tells me the data is PCM (1 channel) with 8 bits per sample and a sampling rate of 11025 Hz.
The output data gives me (after updates) values from 0 to 255, so these are proper PCM values for 8-bit samples. But any idea what BUFFER_SIZE would be preferable to correctly read those values?
WAV file I'm using: http://www.wavsource.com/movies/2001.htm (daisy.wav)
TXT output: https://paste.ee/p/pXGvm
You've got two common situations. The first is where the WAV file represents a short audio sample and you want to read the whole thing into memory and manipulate it. In that case BUFFER_SIZE is a variable: basically you seek to the end of the file to get its size, then load it.
The second common situation is that the WAV file represents a fairly long audio recording, and you want to process it piecewise, often by writing to an output device in real time. Then BUFFER_SIZE needs to be large enough to hold a bite-sized chunk, but not so large that you require excessive memory. Often the size of a "frame" of audio is given by the output device itself; it expects, say, 25 frames per second to synchronise with video or something similar. You generally need a double buffer to ensure that you can always meet the demand for more samples when the DAC (digital-to-analogue converter) runs out; then, on handing out a chunk, you load the next chunk of data from disk. Sometimes there isn't a "right" value for the chunk size; you've just got to go with something fairly sensible that balances memory footprint against the number of calls.
If you need to do an FFT, it's normal to use a buffer size that is a power of two, to make the fast transform simpler. The size you need depends on the lowest frequency you are interested in.
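As an illustration of the first situation, a minimal sketch; it assumes a canonical 44-byte PCM header, and real WAV files can carry extra chunks, so parsing the header properly is safer:

// Whole-file load: learn the size, skip the header, read everything at once.
fseek(wavFile, 0, SEEK_END);
long fileSize = ftell(wavFile);
fseek(wavFile, 44, SEEK_SET);                 // assumed 44-byte canonical PCM header
long dataSize = fileSize - 44;
std::vector<uint8_t> samples(dataSize);       // needs <vector>
fread(samples.data(), 1, dataSize, wavFile);  // error checks omitted

For the FFT case, a power-of-two BUFFER_SIZE such as 1024, 2048 or 4096 is typical; at 11025 Hz, a 4096-point FFT gives a frequency resolution of about 2.7 Hz per bin.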

Concatenate data in an array in C++

I'm working on software for processing audio in real time in C++ with Qt. I need to keep the processing requirements to a minimum.
I define a temporary buffer of 40 ms; with the device running at a sampling frequency Fs = 8000 Hz, every 320 samples a function called DataProcessing() is called.
The idea is to have a global buffer that stores the last 10 s recorded, i.e. 80000 samples.
On each iteration this buffer drops the initial 320 samples and appends 320 new samples at the end. Thus the buffer stays up to date and the user can observe a real-time graphical representation of the recorded signal.
At first I thought of using QVector (equivalent to std::vector but for Qt) for this, which reduces the process to a few lines of code:
int NUM_POINTS = 320;
DatosTemporales.erase(DatosTemporales.begin(), DatosTemporales.begin()+NUM_POINTS);
DatosTemporales += DatosNuevos; // DatosNuevos has a size of NUM_POINTS
On each iteration this recreates a vector of 80000 samples and additionally frees some positions, so it requires some processing time. An alternative I opted for was the use of a raw double* and a loop over the iterations:
for (int i = 0; i < 80000; i++) {
    if (i < 80000-NUM_POINTS) {
        aux = DatosTemporales[i];
        DatosTemporales[i+NUM_POINTS] = aux;
    } else {
        DatosTemporales[i] = DatosNuevos[i-NUM_POINTS];
    }
}
This fails. I think the best way is to use dynamic memory, implementing this process with pointers. Could anyone give me an idea of how to implement it?
It sounds like what you are looking for is a circular buffer.
https://www.google.com/search?q=qcircularbuffer
https://qt.gitorious.org/qt/qtbase/merge_requests/60
And it looks like you only need the header file and you should be good to go.
A similar tool that is already part of Qt is found here:
http://doc.qt.io/qt-5/qcontiguouscache.html#details
The advantage of using a system like the ones presented is that they don't need to reallocate dynamic memory; they just move the head and tail pointers.
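For example, a minimal sketch with QContiguousCache; the sizes come from the question and the function name is hypothetical:

#include <QContiguousCache>

QContiguousCache<double> window(80000);      // capacity: 10 s at Fs = 8000 Hz

void onNewBlock(const double *datosNuevos, int numPoints)   // numPoints = 320
{
    for (int i = 0; i < numPoints; ++i)
        window.append(datosNuevos[i]);       // overwrites the oldest when full
    // window.at(window.firstIndex()) ... window.at(window.lastIndex())
    // now span the most recent 80000 samples for plotting
}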
Hope that helps.