SDL - How to play audio asynchronously in C++ without stopping code execution? - c++

I am developing a clone of Asteroids in pure C++, and for that I need to add sounds for different events, such as when a bullet is fired or an explosion occurs. The issue, however, is that I don't have any experience with audio libraries.
I am using Simple DirectMedia Layer (SDL) and wrote a function named playsound() to play a sound when a certain event occurs. The problem is that when an event occurs and playsound() is called, code execution stops until the sound has played out in full, or until I return from the function (I delay the return with a delay function).
What I want is for the sound to play in the background without causing any lag for the rest of the game. I am developing on Ubuntu 16.04, so I can't fall back on the Windows PlaySound() call with its ASYNC flag either.
Here is the function:
#include <SDL2/SDL.h>
#include <string>
using std::string;

void playsound(string path) {
    // Initialize SDL.
    if (SDL_Init(SDL_INIT_AUDIO) < 0)
        return;
    // Local variables.
    Uint32 wav_length;      // length of our sample
    Uint8 *wav_buffer;      // buffer containing our audio file
    SDL_AudioSpec wav_spec; // the audio format of the file
    if (SDL_LoadWAV(path.c_str(), &wav_spec, &wav_buffer, &wav_length) == NULL) {
        return;
    }
    SDL_AudioDeviceID deviceId = SDL_OpenAudioDevice(NULL, 0, &wav_spec, NULL, 0);
    SDL_QueueAudio(deviceId, wav_buffer, wav_length);
    SDL_PauseAudioDevice(deviceId, 0); // unpause, i.e. start playback
    SDL_Delay(50);                     // this is the blocking delay
    SDL_CloseAudioDevice(deviceId);
    SDL_FreeWAV(wav_buffer);
    SDL_Quit();
}

Your delay is what stops your code from executing: 50 ms is almost two frames at 33 ms per frame, or three frames at 16 ms per frame. A frame drop here and there might not be a problem, but you can see how triggering several sounds in quick succession will slow your program down.
This is how I play sounds in my engine, using SDL2_mixer (for short sounds; for music there is another function, Mix_PlayMusic). It might be helpful to you. I have no lag, and I don't use any sleeps or delays in my code. Once you call play(), the sound plays in full, unless something else is pausing your code.
#pragma once

#include <string>
#include <memory>

#include <SDL2/SDL_mixer.h>

class sample {
public:
    sample(const std::string &path, int volume);
    void play();
    void play(int times);
    void set_volume(int volume);

private:
    std::unique_ptr<Mix_Chunk, void (*)(Mix_Chunk *)> chunk;
};
And the .cpp file:
#include "sample.h"

sample::sample(const std::string &path, int volume)
    : chunk(Mix_LoadWAV(path.c_str()), Mix_FreeChunk) {
    if (!chunk.get()) {
        // LOG("Couldn't load audio sample: ", path);
    }
    Mix_VolumeChunk(chunk.get(), volume);
}

// -1 here means we let SDL_mixer pick the first channel that is free.
// If no channel is free, it'll return an error code.
void sample::play() {
    Mix_PlayChannel(-1, chunk.get(), 0);
}

void sample::play(int times) {
    Mix_PlayChannel(-1, chunk.get(), times - 1);
}

void sample::set_volume(int volume) {
    Mix_VolumeChunk(chunk.get(), volume);
}
Notice that I don't need to thread my model; every time something triggers a sound, the program keeps executing. (I believe SDL_mixer does its mixing in SDL's audio callback thread, so playback doesn't block the caller.)
For this to work, wherever you init SDL you'll also have to init the mixer:
if (Mix_OpenAudio(44100, MIX_DEFAULT_FORMAT, 2, 1024) < 0) {
    // Error message if the mixer can't initialize.
}
// Number of channels (the maximum number of sounds playing at the same time).
Mix_AllocateChannels(32);
And an example of how to play a sound would be:
// at some point, load a sample s with sample("path/to/wav/mp3/or/whatever", volume)
s.play();
A few remarks: you can use the code as it is, but you don't need to; it is more of a simple example of using SDL2_mixer.
This means some functionality is lacking; you might want tighter control over sounds. For example, to stop a sound mid-play (for whatever reason), you can do this with the Mix_HaltChannel function if you play your sounds on known channels, and the play() function could receive the channel where you want the sample played.
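For instance, a minimal sketch of that channel-based variant (play_on() and halt() are hypothetical additions, not members of the class above):

// Hypothetical extension of the sample class: play on an explicit channel
// so the sound can be halted mid-play later.
int sample::play_on(int channel) {
    // Returns the channel actually used, or -1 on error.
    return Mix_PlayChannel(channel, chunk.get(), 0);
}

void sample::halt(int channel) {
    Mix_HaltChannel(channel); // stops whatever is playing on that channel
}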
All these functions return error values: for example, Mix_PlayChannel returns -1 if no unreserved channel is available.
Another thing to keep in mind is that if you play the same sound several times at once, it will start to sound muddy, and you might not even notice that the same sound is playing again. So you could add an integer to sample to limit how many instances of it can play at the same time.
If you REALLY want to run your mixer/audio on a thread separate from the main SDL thread (and still use only SDL), you can spawn a new SDL context in a thread and send it signals, in some form, to play audio.

You want to load all necessary assets when initializing the game. Then, when you want to play them, they are already in memory and there will be no lag. You could also play the sounds in a separate thread, so they won't block your main thread.
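A minimal sketch of that preloading idea, reusing the sample class from the SDL2_mixer answer above (the sound names, paths, and volumes are made up):

#include <map>
#include <string>

// Load every sound once, at startup; playing later is just a lookup.
std::map<std::string, sample> sounds;

void load_sounds() {
    sounds.emplace("shoot",     sample("assets/shoot.wav", 96));
    sounds.emplace("explosion", sample("assets/explosion.wav", 128));
}

// Later, inside the game loop:
// sounds.at("shoot").play();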

There are several tools in C++ for asynchronous operations. You can try the simplest one, std::async:
auto handle = std::async(std::launch::async,
                         playsound, std::string{"/path/to/cute/sound"});
// Some other stuff. Your game logic isn't blocked here.
handle.get(); // This can actually block.
You should specify the std::launch::async flag, which means a new thread will be used, followed by the name of the callable to execute and its arguments. Don't forget to include the <future> header.
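One caveat worth knowing: a std::future returned by std::async blocks in its destructor until the task finishes, so simply discarding the handle won't make it fire-and-forget. A detached thread is one alternative (a sketch; it trades away any way to wait on or observe the sound):

#include <thread>

// Fire-and-forget: playsound runs on its own thread and the handle is dropped.
std::thread{playsound, std::string{"/path/to/cute/sound"}}.detach();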

Related

Why is there latency in this C++ ALSA (Linux audio) program?

I am exploring sound generation using C++ in Ubuntu Linux. Here is my code:
#include <iostream>
#include <cmath>
#include <stdint.h>
#include <ncurses.h>

// to compile: make [file_name] && ./[file_name] | aplay

int main()
{
    initscr();
    cbreak();
    noecho();
    nodelay(stdscr, TRUE);
    scrollok(stdscr, TRUE);
    timeout(0);
    for ( int t=0;; t++ )
    {
        int ch = getch();
        if (ch == 'q')
        {
            break;
        }
        uint8_t temp = t;  // low byte of the counter becomes the next audio sample
        std::cout<<temp;   // raw samples go to stdout, piped into aplay
    }
    endwin(); // restore the terminal state
}
When this code is run, I want it to generate sound until I press "q" on my keyboard, after which the program should quit. This works fine; however, there is a noticeable delay between pressing the key and the program quitting. This is not due to a delay in ncurses: when I run the program without std::cout<<temp; (i.e. no sound generated), there is no latency.
Is there a way to amend this? If not, how are real-time responsive audio programs written?
Edits and suggestions to the question are welcome. I am a novice to ALSA, so I am not sure if any additional details are required to replicate the bug.
The latency in the above loop is most likely due to delays introduced by the ncurses getch function.
Typically, for realtime audio you want a realtime audio thread and a non-realtime user control thread. The control thread alters the memory space of the realtime audio thread, which forces the realtime audio loop to adjust synthesis as required.
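A minimal sketch of that split in plain C++ (the ALSA output itself is elided; std::atomic variables stand in for the shared memory the control thread writes):

#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<float> frequency{440.0f}; // written by the control thread, read by the audio thread
std::atomic<bool>  running{true};

void audioLoop() { // realtime thread: never blocks on user input
    while (running.load()) {
        float f = frequency.load(); // current target pitch for the next buffer
        // ... synthesise the next buffer at frequency f and hand it to ALSA ...
    }
}

int main() {
    std::thread audio(audioLoop);
    // Control thread: free to block on getchar().
    int ch;
    while ((ch = std::getchar()) != 'q')
        frequency.store(frequency.load() * 1.05946f); // e.g. raise by a semitone
    running.store(false);
    audio.join();
}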
In this gtkIOStream example, a full duplex audio class is created. The process method in the class can have your synthesis computation compiled in. This will handle the playback of your sound using ALSA.
To get user input, one possibility is to add a threaded method to the class by inheriting the FullDuplexTest class, like so:
class UIALSA : public FullDuplexTest, public ThreadedMethod {
    void *threadMain(void) {
        while (1) {
            // use getchar here to block and wait for user input
            // change the memory in FullDuplexTest to indicate a change in variables
        }
        return NULL;
    }

public:
    UIALSA(const char *devName, int latency) : FullDuplexTest(devName, latency), ThreadedMethod() {};
};
Then change all references to FullDuplexTest to UIALSA in the original test file (you will probably have to fix some compile-time errors):
UIALSA fullDuplex(deviceName, latency);
You will also need to call UIALSA::run() to make sure the UI thread is running and listening for user input. You can add the call before you call "go":
fullDuplex.run(); // start the UI thread
res=fullDuplex.go(); // start the full duplex read/write/process going.

Unity C++ DLL performance drop in Standalone Build vs. Editor mode

Abstract
I am building an unmanaged C++ DLL plugin for a Unity project. The plugin interfaces with 2 sensor APIs, runs a repeating sensor fusion algorithm, and returns the final results via a callback function. The project runs on Windows 10 64-bit. Everything ran smoothly inside the Unity Editor until I tried to build the Unity project into a standalone version. In the build, the sensor fusion algorithm loop has constant "hiccups", where the execution time repeatedly spikes to 10x for one or two iterations.
I do not expect answers that will directly solve my problem, as the issue is probably case-specific, but I hope someone with experience can share some insight into what could be wrong. I have experimented with the things mentioned in a later section.
Relevant pseudo-code
DLL functions:
extern "C" {
void start(FilterWrapper *& pWrapper, Callback cb) {
pWrapper = new FilterWrapper();
// code registers callback
}
void stop(FilterWrapper *& pWrapper) {
pWrapper->stopFilter();
delete pWrapper;
pWrapper = NULL;
}
}
FilterWrapper Class
class FilterWrapper
{
public:
    FilterWrapper();
    ~FilterWrapper();
    void stopFilter();

private:
    void sampleSensor1();
    void sampleSensor2();
    void processData();
    void runAlgorithm();

    bool stop_condition = false; // shared across threads; an std::atomic<bool> would be safer
    std::thread thread1, thread2, thread3, thread4;
    std::deque<float> bufferA, bufferB, bufferC;
    std::mutex mtxA, mtxB, mtxC;
};
FilterWrapper::FilterWrapper() {
    thread1 = std::thread(&FilterWrapper::sampleSensor1, this);
    thread2 = std::thread(&FilterWrapper::sampleSensor2, this);
    thread3 = std::thread(&FilterWrapper::processData, this);
    thread4 = std::thread(&FilterWrapper::runAlgorithm, this);
}

void FilterWrapper::stopFilter() {
    stop_condition = true;
    if (thread1.joinable()) thread1.join();
    // same for the other threads ...
}
void FilterWrapper::sampleSensor1() {
    while (!stop_condition) {
        // code samples data
        std::lock_guard<std::mutex> lck(mtxA);
        bufferA.push_back(data);
    }
}

void FilterWrapper::sampleSensor2() {
    while (!stop_condition) {
        // code samples data
        std::lock_guard<std::mutex> lck(mtxB);
        bufferB.push_back(data);
    }
}

void FilterWrapper::processData() {
    while (!stop_condition) {
        float data;
        {
            std::lock_guard<std::mutex> lck(mtxA);
            if (bufferA.empty()) continue;
            data = bufferA.front();
            bufferA.pop_front();
        }
        // code processes data...
        std::lock_guard<std::mutex> lck(mtxC);
        bufferC.push_back(data);
    }
}
void FilterWrapper::runAlgorithm() {
    while (!stop_condition) {
        float data1, data2;
        {
            std::lock_guard<std::mutex> lck(mtxB);
            if (bufferB.empty()) continue;
            data1 = bufferB.front();
            bufferB.pop_front();
        }
        {
            std::lock_guard<std::mutex> lck(mtxC);
            if (bufferC.empty()) continue;
            data2 = bufferC.front();
            bufferC.pop_front();
        }
        std::chrono::steady_clock::time_point t_start = std::chrono::steady_clock::now();
        // run the algorithm with data1 and data2 ...
        std::chrono::steady_clock::time_point t_end = std::chrono::steady_clock::now();
        std::chrono::duration<float, std::milli> dur = t_end - t_start;
        std::cout << "algorithm time: " << dur.count() << "\n";
    }
}
Project Structure
Inside the DLL
A FilterWrapper Class whose instance will initialize and manage:
Sensor1 sampling thread - a producer thread, store data in FIFO buffer A
Sensor2 sampling thread - a producer thread, store data in FIFO buffer B
Sensor data processing thread - process the raw data from the buffer A and queue the results in a FIFO buffer C
Algorithm thread - a consumer thread, take data out of FIFO buffer B&C and run the algorithm
All threads will run while loops with a stop condition
Functions to export
an initialize(FilterWrapper *& pWrapper, Callback cb) function - to create a FilterWrapper object via new and pass the pointer out, and pass in the callback function.
a stop(FilterWrapper *& pWrapper) function - to set stop conditions for all threads in the FilterWrapper object and free the pointer using delete.
On Unity side
Import the initialize() and stop() functions using [DllImport("MyDLL")]
Call initialize() in Awake() and pass in the callback function
Call stop() in OnApplicationQuit()
Use a private IntPtr pWrapper to hold the reference to the FilterWrapper object passed out by initialize().
Problem
I first developed and verified the algorithm and multithreading in a C++ console application project, then copied the classes and functions over to a DLL project and wrote the FilterWrapper class. In the C++ console application, as well as in Unity Editor mode, the execution time of each iteration of the algorithm loop is consistently around 9 ms, and that of each iteration of the sensor data processing loop is 12 ms. However, when the Unity project is built, the execution times frequently spike to 30 ms and 90 ms respectively.
Things I Have Done So Far
allocate a console window in the DLL such that I can monitor the debug information.
use std::chrono::steady_clock to time the execution; data is retrieved at the beginning of loops so that the time of waiting to acquire locks is NOT counted.
use std::lock_guard and std::mutex to ensure safe access to buffers.
start a clean Unity project with default scene, and the only addition is attaching the C# script that imports and calls the DLL; leave all Build setting as default; ensure the latest DLL build is copied into the Plugin folder.
experiment with process and thread priorities: set process priority to HIGH_PRIORITY_CLASS (one level below real-time priority to avoid affecting system stability) and set thread priority to THREAD_PRIORITY_HIGHEST (despite the name, also one level below time-critical priority).
experiment with manually setting thread affinity (I am desperate); distribute each thread to one logical processor (I do have enough logical processors).
run both editor and standalone build long enough to ensure consistent observation.
in the DLL, comment out the algorithm code and replace it with code that just dynamically allocates a large byte array (uint8_t *pByteArr = new uint8_t[1200000]), memcpy()s some garbage into it, and deallocates it (delete[] pByteArr), as sketched below. In Editor mode this takes roughly 0.18 ms on my machine; in the standalone build it frequently spikes to 5 ms.
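A sketch of that substitute workload (the timing scaffold mirrors the algorithm loop above; the static source buffer is a made-up stand-in for the garbage data):

#include <chrono>
#include <cstdint>
#include <cstring>
#include <iostream>

void allocationTest() {
    static uint8_t garbage[1200000]; // made-up source data to copy from
    auto t_start = std::chrono::steady_clock::now();
    uint8_t *pByteArr = new uint8_t[1200000];
    std::memcpy(pByteArr, garbage, 1200000);
    delete[] pByteArr;
    auto t_end = std::chrono::steady_clock::now();
    std::chrono::duration<float, std::milli> dur = t_end - t_start;
    std::cout << "alloc test: " << dur.count() << " ms\n";
}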
Summary
The C++ portion of the code runs fine as a console application, and also in Unity Editor mode when imported as a DLL, but it is very unstable and slow when the Unity project is built into a standalone application. Since people often say Editor mode carries a lot of overhead, one would expect performance to be better once the project is built; in my case it seems to be quite the opposite. I have ruled out graphics issues, since there is really nothing in the Unity scene, and I am thinking some environment factors change when the project is built, but I am not sure where to look.

QCustomPlot Huge Amount of Data Plotting

I am trying to plot some serial data in my Qt GUI program using the QCustomPlot class. I had no trouble when plotting data at a low sampling frequency, like 100 samples/second; the graph was really nice and plotted the data smoothly. But at high sampling rates, such as 1000 samples/second, the plotter becomes a bottleneck for the serial read function. It slows down the serial reads so much that there is a huge delay, around 4-5 seconds behind the device. Put simply, the plotter could not keep up with the speed of the data stream. So, is there a common issue here that I don't know about, or any recommendation?
I have thought of these scenarios:
1- to divide the whole program into 2 or 3 threads. For example, the serial part runs in one thread and the plotting part runs in another thread, and the two threads communicate with a QSemaphore
2- the fps of QCustomPlot is limited, but there should be a solution, because NI LabVIEW plots up to 2k of data without any delay
3- to design a new virtual serial device in the USB protocol. Right now I am using an FT232RL serial-to-USB converter.
4- to change the programming language. What is the situation and class support in C# or Java for realtime plotting? (I know it sounds like a kid talking, but this is a pretext to get experience in other languages)
My serial device's send-data function (it is a dummy device for experiments; there is no serious coding) is briefly this:
void progTask()
{
    DelayMsec(1); //my delay function, millisecond
    //read value from adc13
    Adc13Read(adcValue.ui32Part);
    sendData[0] = (char)'a';
    sendData[1] = (char)'k';
    sendData[2] = adcValue.bytes[0];
    sendData[3] = (adcValue.bytes[1] & 15);
    //send test data
    UARTSend(UART6_BASE, &sendData[0], 4);
}

The Qt program's read function is this:
union {
    unsigned char bytes[2];
    unsigned int intPart;
    unsigned char *ptr;
} serData;

void MedicalSoftware::serialReadData()
{
    if (serial->bytesAvailable() < 4)
    {
        // if the frame size is less than 4 bytes, return and
        // wait for the serial receive buffer to fill
        // note: serial->setReadBufferSize(4)!!!!
        return;
    }
    QByteArray serialInData = serial->readAll();
    // my algorithm: locate the "ak" frame header within the 4-byte window
    if (serialInData[0] == 'a' && serialInData[1] == 'k')
    {
        serData.bytes[0] = serialInData[2];
        serData.bytes[1] = serialInData[3];
    }
    else if (serialInData[2] == 'a' && serialInData[3] == 'k')
    {
        serData.bytes[0] = serialInData[0];
        serData.bytes[1] = serialInData[1];
    }
    else if (serialInData[1] == 'a' && serialInData[2] == 'k')
    {
        serial->read(1);
        return;
    }
    else if (serialInData[0] == 'k' && serialInData[3] == 'a')
    {
        serData.bytes[0] = serialInData[1];
        serData.bytes[1] = serialInData[2];
    }
    plotMainGraph(serData.intPart);
    serData.intPart = 0;
}
And the QCustomPlot settings function is:
void MedicalSoftware::setGraphsProperties()
{
    //MAIN PLOTTER
    ui->mainPlotter->addGraph();
    ui->mainPlotter->xAxis->setRange(0, 2000);
    ui->mainPlotter->yAxis->setRange(-0.1, 3.5);
    ui->mainPlotter->xAxis->setLabel("Time(s)");
    ui->mainPlotter->yAxis->setLabel("Magnitude(mV)");
    QSharedPointer<QCPAxisTickerTime> timeTicker(new QCPAxisTickerTime());
    timeTicker->setTimeFormat("%h:%m:%s");
    ui->mainPlotter->xAxis->setTicker(timeTicker);
    ui->mainPlotter->axisRect()->setupFullAxesBox();
    QPen pen;
    pen.setColor(QColor("blue"));
    ui->mainPlotter->graph(0)->setPen(pen);
    dataTimer = new QTimer;
}
And the last one is the plot function:
void MedicalSoftware::plotMainGraph(const quint16 serData)
{
    static QTime time(QTime::currentTime());
    double key = time.elapsed() / 1000.0;
    static double lastPointKey = 0;
    if (key - lastPointKey > 0.005)
    {
        double value0 = serData * (3.3 / 4096);
        ui->mainPlotter->graph(0)->addData(key, value0);
        lastPointKey = key;
    }
    ui->mainPlotter->xAxis->setRange(key + 0.25, 2, Qt::AlignRight);
    counter++;
    ui->mainPlotter->replot();
    counter = 0;
}
Quick answer:
Have you tried:
ui->mainPlotter->replot(QCustomPlot::rpQueuedReplot);
According to the documentation, it can improve performance when doing a lot of replots.
Longer answer:
My feeling about your code is that you are trying to replot as often as you can to get a "real time" plot. But if you are on a PC with a desktop OS, there is no such thing as real time.
What you should care about is:
Ensure that the code that reads/writes the serial port is not delayed too much. "Too much" is to be interpreted with respect to the connected hardware. If it gets really time-critical (which seems to be your case), you have to optimize your read/write functions and possibly put them alone in a thread. This can go as far as reserving a full hardware CPU core for that thread.
Ensure that the plot is refreshed fast enough for human eyes. You do not need to do a full repaint every time you receive a single data point.
In your case you receive 1000 samples/s, i.e. 1 sample every ms. That is quite fast, because it is beyond the default timer resolution of most desktop OSes. That means you are likely to have more than a single data point available when "serialReadData()" is called, and that you could optimize by calling it less often (e.g. call it every 10 ms and read 10 data points each time). You could then call "replot()" every 30 ms, which would add 30 new data points each time, skip about 29 replot() calls per 30 ms compared to your code, and give you ~30 fps.
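A sketch of that batching idea (replotTimer is a hypothetical QTimer member; the interval is the one suggested above):

// In the constructor: decouple repainting from data intake.
replotTimer = new QTimer(this);
connect(replotTimer, &QTimer::timeout, this, [this]() {
    ui->mainPlotter->replot(QCustomPlot::rpQueuedReplot);
});
replotTimer->start(30); // ~30 fps, regardless of the serial data rate

// serialReadData() then only appends points:
// ui->mainPlotter->graph(0)->addData(key, value0); // no replot() here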
1- to divide the whole program into 2 or 3 threads. For example, the serial part runs in one thread and the plotting part runs in another thread, and the two threads communicate with a QSemaphore
Dividing the serial part from the GUI across 2 threads is good, because it will prevent a bottleneck in the GUI from blocking the serial communication. You could also skip the semaphore and simply rely on Qt signal/slot connections (connected in Qt::QueuedConnection mode).
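A minimal sketch of that threading split (SerialWorker and plotSamples are illustrative names, not from the code above):

// The serial worker lives in its own QThread; data crosses to the GUI
// thread through a queued signal/slot connection instead of a semaphore.
class SerialWorker : public QObject {
    Q_OBJECT
public slots:
    void poll() {
        QByteArray frame; // ... read and de-frame the port here ...
        emit samplesReady(frame);
    }
signals:
    void samplesReady(QByteArray samples);
};

// In the GUI class constructor:
auto *thread = new QThread(this);
auto *worker = new SerialWorker; // no parent, so it can move to the thread
worker->moveToThread(thread);
// Across threads this is automatically a queued connection; shown explicitly:
connect(worker, &SerialWorker::samplesReady,
        this, &MedicalSoftware::plotSamples, Qt::QueuedConnection);
thread->start();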
4- to change the programming language. What is the situation and class support in C# or Java for realtime plotting? (I know it sounds like a kid talking, but this is a pretext to get experience in other languages)
Changing the programming language will, in the best case, change nothing, and it could hurt your performance, especially if you move toward languages that are not compiled to native CPU instructions.
Changing the plotting library, on the other hand, could change the performance. You can look at Qt Charts and Qwt; I do not know how they compare to QCustomPlot, though.

C++ Arduino, running two loops at once?

Okay, so I have just recently dived into programming an Arduino. Currently I have the basic blink sketch, along with an RGB LED program that fades an LED through blue, green, and red. I have 2 LEDs: a basic yellow LED that's supposed to function as a "working status" indicator, and an RGB LED. I want the RGB one to transition through its colors normally, while keeping the yellow LED constantly flashing.
How should I write my code so that these two processes can run at the same time?
Something like:
int timekeeper = 0;
while (1)
{
    do_fade(timekeeper);
    if (timekeeper % 100 == 0) {
        do_blink_off();
    }
    if (timekeeper % 100 == 50) {
        do_blink_on();
    }
    delay(10);
    timekeeper++;
}
This is done from memory, so your mileage may vary.
I've passed timekeeper to do_fade() so you can figure out how far along the fade you are. do_fade() would update the fade, then immediately return; do_blink_on() and do_blink_off() would be similar: change what you need to change, then return. In this example, do_fade() is called every 10 milliseconds and do_blink_off() once per second, with do_blink_on() half a second after (so: on, half a second, off, half a second, on, half a second...).
AMADANON's answer will work; however, keep in mind that the preferred way to run multiple tasks like this is with timer interrupts. For example, if you wanted your code to do something else after it fades, the timing of those other functions would interfere with your LED blinking. To solve this, you use the timers that are built into the Arduino.
In the background, a timer counts up, and when it hits a certain value, it resets its counter and triggers the interrupt service routine (ISR), which is where you would turn the LED on/off.
Here's a tutorial on blinking an LED with timer interrupts:
http://www.engblaze.com/microcontroller-tutorial-avr-and-arduino-timer-interrupts/
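A minimal sketch of that approach (assuming an ATmega328-based board such as the Uno; the pin and the half-second period are arbitrary choices):

const int YELLOW_PIN = 13;

void setup() {
    pinMode(YELLOW_PIN, OUTPUT);
    noInterrupts();
    TCCR1A = 0;                           // normal port operation
    TCCR1B = 0;
    TCNT1  = 0;
    OCR1A  = 7812;                        // compare match every ~0.5 s (16 MHz / 1024 / 7813)
    TCCR1B |= (1 << WGM12);               // CTC mode
    TCCR1B |= (1 << CS12) | (1 << CS10);  // prescaler 1024
    TIMSK1 |= (1 << OCIE1A);              // enable the compare-match interrupt
    interrupts();
}

ISR(TIMER1_COMPA_vect) {                  // runs every ~0.5 s, independent of loop()
    digitalWrite(YELLOW_PIN, !digitalRead(YELLOW_PIN));
}

void loop() {
    // do_fade() can run here; the blink timing is unaffected by it
}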
Try an RTOS for Arduino.
You create tasks, which are separate loops. I use it and it works fine.
https://create.arduino.cc/projecthub/feilipu/using-freertos-multi-tasking-in-arduino-ebc3cc
Also, I recommend using PlatformIO with the Arduino environment. Then you can also import the RTOS as a library.
https://platformio.org/
Example code snippets:
In the setup:
void TaskMotion( void *pvParameters ); // Senses input from the motion sensor
and
xTaskCreate( // Create task
    TaskMotion
    , "Motion"  // A name just for humans
    , 12800     // Stack size
    , NULL
    , 1         // Priority
    , NULL );
... and below the Arduino loop() (which contains nothing but a delay(1000);):
// ╔╦╗╔═╗╔╦╗╦╔═╗╔╗╔ ╔═╗╔═╗╔╗╔╔═╗╔═╗╦═╗
// ║║║║ ║ ║ ║║ ║║║║ ╚═╗║╣ ║║║╚═╗║ ║╠╦╝
// ╩ ╩╚═╝ ╩ ╩╚═╝╝╚╝ ╚═╝╚═╝╝╚╝╚═╝╚═╝╩╚═
void TaskMotion(void *pvParameters) // This is a task.
{
    (void) pvParameters;
    // initialize stuff.
    for (;;) // A task shall never return or exit.
    {
        Serial.println("TEST MOTION");
        delay(10000);
    }
}
Copy-paste this and change "TaskMotion" to "LED something". You can create as many tasks as you want; the RTOS manages each of them. For example, if one task hits a delay(10), the next 10 ms are used for another task.

Syncing audio and video playback with OpenAL & C++

I am trying to create a webcam chat program in C++. While I have been able to capture, send, and play the images, I am having trouble doing the same with the audio: the audio lags and very quickly goes out of sync with the video, even when I just play it back to myself.
I found this answer and sample code to be really useful.
Are there any modifications I can make to this code to get it nearly lag-free, or is OpenAL not right for this? I am using Windows, but I plan on making a Linux version later.
From the code linked:
ALCdevice* inputDevice = alcCaptureOpenDevice(NULL,FREQ,AL_FORMAT_MONO16,FREQ/2);
Try using a larger buffer:
ALCdevice* inputDevice = alcCaptureOpenDevice(NULL,FREQ,AL_FORMAT_MONO16,FREQ*4);
The polling is very aggressive. Try sleeping in the loop:
while (!done) {
...
}
To:
int sleepMilliseconds = 100;
while (!done) {
    ...
    Sleep(sleepMilliseconds);           // Windows, milliseconds
    //usleep(sleepMilliseconds * 1000); // Linux, microseconds
}
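For context, a sketch of what the polled capture loop typically looks like with that sleep in place (CAPTURE_CHUNK and buffer are illustrative; only alcGetIntegerv and alcCaptureSamples come from OpenAL):

ALCint samplesAvailable = 0;
ALshort buffer[FREQ];                   // illustrative holding buffer
const ALCint CAPTURE_CHUNK = FREQ / 10; // read ~100 ms of audio at a time

while (!done) {
    alcGetIntegerv(inputDevice, ALC_CAPTURE_SAMPLES, 1, &samplesAvailable);
    if (samplesAvailable >= CAPTURE_CHUNK) {
        alcCaptureSamples(inputDevice, buffer, CAPTURE_CHUNK);
        // ... hand the samples to the network / playback side ...
    }
    Sleep(sleepMilliseconds); // yield instead of busy-polling
}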