Silence between played buffers in OpenAL? - c++

I use alSourceQueueBuffers to stream buffers into a AL sound source. I have buffers of different size that need to be played one after another. So far so good, however, between some buffer I need a variable amount of silence, how can I add it programmatic?

Perhaps the easiest way would be to generate buffers that hold silence of the length needed and queue them appropriately. You just need to make an array full of zeros based on the sample rate and the desired length of silence and pass it into the buffer.
If you want things to be more complicated, then you can't queue all of the buffers. You queue the one that needs to play right now and set a timer for when it will be done (and the amount of silent time has also passed). Then you can queue the next buffer. Or you can poll the source to see if it has stopped and when it does, start counting down the silent time. You could also use the streaming functionality...
Edit:
This worked for me. Sample rate needs to be the same as other buffers queued on your source. You could also have a 'greatest common denominator' length buffer and just queue it up multiple times.
int sampleRate=22050;
double sTime=2.5; // How long to maintain silence.
int sampleCount= int(sTime*sampleRate);
int byteCount = sampleCount*sizeof(short);
short* silence = (short*)malloc(byteCount);
memset(silence,0,byteCount);
alBufferData(silenceBuffer,AL_FORMAT_MONO16,silence,byteCount,sampleRate);
alSourceQueueBuffers(mySource,1,&silenceBuffer);
free(silence);

Related

AT command response parser

I am working on my own implementation to read AT commands from a Modem using a microcontroller and c/c++
but!! always a BUT!! after I have two "threads" on my program, the first one were I am comparing the possible reply from the Moden using strcmp which I believe is terrible slow
comparing function
if (strcmp(reply, m_buffer) == 0)
{
memset(buffer, 0, buffer_size);
buffer_size = 0;
memset(m_buffer, 0, m_buffer_size);
m_buffer_size = 0;
return 0;
}
else
return 1;
this one works fine for me with AT commands like AT or AT+CPIN? where the last response from the Modem is "OK" and nothing in the middle, but it is not working with commands like AT+CREG?, wheres it responses:
+REG: n,n
OK
and I am specting for "+REG: n,n" but I believe strncpy is very slow and my buffer data is replaced for "OK"
2nd "thread" where it enables a UART RX interruption and replaces my buffer data every time it receives new data
Interruption handle:
m_buffer_size = buffer_size;
strncpy(m_buffer, buffer, buffer_size + m_buffer_size);
Do you know any out there faster than strcmp? or something to improve the AT command responses reading?
This has the scent of an XY Problem
If you have seen the buffer contents being over written, you might want to look into a thread safe queue to deliver messages from the RX thread to the parsing thread. That way even if a second message arrives while you're processing the first, you won't run into "buffer overwrite" problems.
Move the data out of the receive buffer and place it in another buffer. Two buffers is rarely enough, so create a pool of buffers. In the past I have used linked lists of pre-allocated buffers to keep fragmentation down, but depending on the memory management and caching smarts in your microcontroller, and the language you elect to use, something along the lines of std::deque may be a better choice.
So
Make a list of free buffers.
When a the UART handling thread loop looks something like,
Get a buffer from the free list
Read into the buffer until full or timeout
Pass buffer to parser.
Parser puts buffer in its own receive list
Parsing sends a signal to wake up its thread.
Repeat until terminated. If the free list is emptied, your program is probably still too slow to keep up. Perhaps adding more buffers will allow the program to get through a busy period, but if the data flow is relatively constant and the free list empties out... Well, you have a problem.
Parser loop also repeats until terminated looks like:
If receive list not empty,
Get buffer from receive list
Process buffer
Return buffer to free list
Otherwise
Sleep
Remember to protect the lists from concurrent access by the different threads. C11 and C++11 have a number of useful tools to assist you here.

Correct use of memcpy

I have some problems with a project I'm doing. Basically I'm just using memcpy the wrong way. I know the theroy of pointer/arrays/references and should know how to do that, nevertheless I've spend two days now without any progress. I'll try to give a short code overview and maybe someone sees a fault! I would be very thankful.
The Setup: I'm using an ATSAM3x Microcontroller together with a uC for signal aquisition. I receive the data over SPI.
I have an Interrupt receiving the data whenever the uC has data available. The data is then stored in a buffer (int32_t buffer[1024 or 2048]). There is a counter that counts from 0 to the buffer size-1 and determines the place where the data point is stored. Currently I receive a test signal that is internally generated by the uC
//ch1: receive 24 bit data in 8 bit chunks -> store in an int32_t
ch1=ch1|(SPI.transfer(PIN_CS, 0x00, SPI_CONTINUE)<<24)>>8;
ch1=ch1|(SPI.transfer(PIN_CS, 0x00, SPI_CONTINUE)<<16)>>8;
ch1=ch1|(SPI.transfer(PIN_CS, 0x00, SPI_CONTINUE)<<8)>>8;
if(Not Important){
_ch1Buffer[_ch1SampleCount] = ch1;
_ch1SampleCount++;
if(_ch1SampleCount>SAMPLE_BUFFER_SIZE-1) _ch1SampleCount=0;
}
This ISR is active all the time. Since I need raw data for signal processing and the buffer is changed by the ISR whenever a new data point is available, i want to copy parts of the buffer into a temporary "storage".
To do so, I have another, global counter wich is incremented within the ISR. In the mainloop, whenever the counter reaches a certain size, i call a method get some of the buffer data (about 30 samples).
The method aquires the current position in the buffer:
'int ch1Pos = _ch1SampleCount;'
and then, depending on that position I try to use memcpy to get my samples. Depending on the position in the buffer, there has to be a "wrap-around" to get the full set of samples:
if(ch1Pos>=(RAW_BLOCK_SIZE-1)){
memcpy(&ch1[0],&_ch1Buffer[ch1Pos-(RAW_BLOCK_SIZE-1)] , RAW_BLOCK_SIZE*sizeof(int32_t));
}else{
memcpy(&ch1[RAW_BLOCK_SIZE-1 - ch1Pos],&_ch1Buffer[0],(ch1Pos)*sizeof(int32_t));
memcpy(&ch1[0],&_ch1Buffer[SAMPLE_BUFFER_SIZE-1-(RAW_BLOCK_SIZE- ch1Pos)],(RAW_BLOCK_SIZE-ch1Pos)*sizeof(int32_t));
}
_ch1Buffer is the buffer containing the raw data
SAMPLE_BUFFER_SIZE is the size of that buffer
ch1 is the array wich is supposed to hold the set of samples
RAW_BLOCK_SIZE is the size of that array
ch1Pos is the position of the last data point written to the buffer from the ISR at the time where this method is called
Technically I'm aware of the requirements, but apparently thats not enough ;-).
I know, that the data received by the SPI interface is "correct". The problem is, that this is not the case for the extracted samples. There are a lot of spikes in the data that indicate that I've been reading something I wasn't supposed to read. I've changed the memcpy commands that often, that I completly lost the overview. The code sample above is one version of many's, and while you're reading this I'm sure I've changed everything again.
I would appreciate every hint!
Thanks & Greetings!
EDIT
I've written down everything (again) on a sheet of paper and tested some constellations. This is the updated Code for the memcpy part:
if(ch1Pos>=(RAW_BLOCK_SIZE-1)){
memcpy(&ch1[0],&_ch1Buffer[ch1Pos-(RAW_BLOCK_SIZE-1)] , RAW_BLOCK_SIZE*sizeof(int32_t));
}else{
memcpy(&ch1[RAW_BLOCK_SIZE-1-ch1Pos],&_ch1Buffer[0],(ch1Pos+1)*sizeof(int32_t));
memcpy(&ch1[0],&_ch1Buffer[SAMPLE_BUFFER_SIZE-(RAW_BLOCK_SIZE-1-ch1Pos)],(RAW_BLOCK_SIZE-1-ch1Pos)*sizeof(int32_t));
}
}
This already made it a lot better. From all the changes, everything kinda got messed up. Now there is just one Error there. There is a periodical spike. I'll try to get more information, but I think it is a wrong access while wrapping around.
I've changed the if(_ch1SampleCount>SAMPLE_BUFFER_SIZE-1) _ch1SampleCount=0; to if(_ch1SampleCount>=SAMPLE_BUFFER_SIZE) _ch1SampleCount=0;.
EDIT II
To answer the Questions of #David Schwartz :
SPI.transfer returns a single byte
The buffer is initialised once at startup: memset(_ch1Buffer,0,sizeof(int32_t)*SAMPLE_BUFFER_SIZE);
EDIT III
Sorry for the frequent updates, the comment section is getting too big.
I managed to get rid of a bunch of zero values at the beginning of the stream by decreasing ch1Pos: 'int ch1Pos = _ch1SampleCount;' Now there is just one periodic "spike" (wrong value). It must be something with the splitted memcpy command. I'll continue looking. If anyone has an idea ... :-)

Concatenate data in an array in C ++

I'm working on software for processing audio in real time in C++ with Qt. I need that requirements are minimized.
Defining a temporary buffer 40ms, launching our device with a sampling frequency Fs = 8000Hz, every 320 samples entered a feature called Data Processing ().
The idea is to have a global buffer that stores the 10s last recorded, 80000 samples.
This Buffer in each iteration eliminates the initial 320 samples and looped at the end, 320 new samples. Thus the buffer is updated and the user can observe the real-time graphical representation of the recorded signal.
At first I thought of using QVector (equivalent to std::vector but for Qt) for this deployment, thus we reduce the process a few lines of code
int NUM_POINTS=320;
DatosTemporales.erase(DatosTemporales.begin(),DatosTemporales.begin()+NUM_POINTS);
DatosTemporales+= (DatosNuevos); // Datos Nuevos con un tamaño de NUM_POINTS
In each iteration we create a vector of 80000 samples in addition to free some positions so requires some processing time. An alternative for opting was the use of * double, and iterations a loop:
for(int i=0;i<80000;i++){
if(i<80000-NUM_POINTS){
aux=DatosTemporales[i];
DatosTemporales[i+NUM_POINTS]=aux;
}else{
DatosTemporales[i]=DatosNuevos[i-NUN_POINTS];
}
}
Does fails. I think the best way is to use dynamic memory. Implementing this process by pointers. Could anyone give me some idea how to implement it?
It sounds like what you are looking for is a circular buffer.
https://www.google.com/search?q=qcircularbuffer
https://qt.gitorious.org/qt/qtbase/merge_requests/60
And it looks like you only need the header file and you should be good to go.
A similar tool that is already in the Qt data set is found here:
http://doc.qt.io/qt-5/qcontiguouscache.html#details
The advantage of using a system like these presented, is that they don't need to have dynamic memory, it just needs to move the head and the tail pointers.
Hope that helps.

Socket Commuication with High frequency

I need to send data to another process every 0.02s.
The Server code:
//set socket, bind, listen
while(1){
sleep(0.02);
echo(newsockfd);
}
void echo (int sock)
{
int n;
char buffer[256]="abc";
n=send(sock,buffer,strlen(buffer),0);
if (n < 0) error("ERROR Sending");
}
The Client code:
//connect
while(1)
{
bzero(buffer,256);
n = read(sock,buffer,255);
printf("Recieved data:%s\n",buffer);
if (n < 0)
error("ERROR reading from socket");
}
The problem is that:
The client shows something like this:
Recieved data:abc
Recieved data:abcabcabc
Recieved data:abcabc
....
How does it happen? When I set sleep time:
...
sleep(2)
...
It would be ok:
Recieved data:abc
Recieved data:abc
Recieved data:abc
...
TCP sockets do not guarantee framing. When you send bytes over a TCP socket, those bytes will be received on the other end in the same order, but they will not necessarily be grouped the same way — they may be split up, or grouped together, or regrouped, in any way the operating system sees fit.
If you need framing, you will need to send some sort of packet header to indicate where each chunk of data starts and ends. This may take the form of either a delimiter (e.g, a \n or \0 to indicate where each chunk ends), or a length value (e.g, a number at the head of each chunk to denote how long it is).
Also, as other respondents have noted, sleep() takes an integer, so you're effectively not sleeping at all here.
sleep takes unsigned int as argument, so sleep(0.02) is actually sleep(0).
unsigned int sleep(unsigned int seconds);
Use usleep(20) instead. It will sleep in microseconds:
int usleep(useconds_t usec);
The OS is at liberty to buffer data (i.e. why not just send a full packet instead of multiple packets)
Besides sleep takes a unsigned integer.
The reason is that the OS is buffering data to be sent. It will buffer based on either size or time. In this case, you're not sending enough data, but you're sending it fast enough the OS is choosing to bulk it up before putting it on the wire.
When you add the sleep(2), that is long enough that the OS chooses to send a single "abc" before the next one comes in.
You need to understand that TCP is simply a byte stream. It has no concept of messages or sizes. You simply put bytes on the wire on one end and take them off on the other. If you want to do specific things, then you need to interpret the data special ways when you read it. Because of this, the correct solution is to create an actual protocol for this. That protocol could be as simple as "each 3 bytes is one message", or more complicated where you send a size prefix.
UDP may also be a good solution for you, depending on your other requirements.
sleep(0.02)
is effectively
sleep(0)
because argument is unsigned int, so implicit conversion does it for you. So you have no sleep at all here. You can use sleep(2) to sleep for 2 microseconds.Next, even if you had, there is no guarantee that your messages will be sent in a different frames. If you need this, you should apply some sort of delimiter, I have seen
'\0'
character in some implementation.
TCPIP stacks buffer up data until there's a decent amount of data, or until they decide that there's no more coming from the application and send what they've got anyway.
There are two things you will need to do. First, turn off Nagle's algorithm. Second, sort out some sort of framing mechanism.
Turning off Nagle's algorithm will cause the stack to "send data immediately", rather than waiting on the off chance that you'll be wanting to send more. It actually leads to less network efficiency because you're not filling up Ethernet frames, something to bare in mind on Gigabit where jumbo frames are required to get best throughput. But in your case timeliness is more important than throughput.
You can do your own framing by very simple means, eg by send an integer first that says how long the rest if the message will be. At the reader end you would read the integer, and then read that number of bytes. For the next message you'd send another integer saying how long that message is, etc.
That sort of thing is ok but not hugely robust. You could look at something like ASN.1 or Google Protocol buffers.
I've used Objective System's ASN.1 libraries and tools (they're not free) and they do a good job of looking after message integrity, framing, etc. They're good because they don't read data from a network connection one byte at a time so the efficiency and speed isn't too bad. Any extra data read is retained and included in the next message decode.
I've not used Google Protocol Buffers myself but it's possible that they have similar characteristics, and there maybe other similar serialisation mechanisms out there. I'd recommend avoiding XML serialisation for speed/efficiency reasons.

How to use ALSA's snd_pcm_writei()?

Can someone explain how snd_pcm_writei
snd_pcm_sframes_t snd_pcm_writei(snd_pcm_t *pcm, const void *buffer,
snd_pcm_uframes_t size)
works?
I have used it like so:
for (int i = 0; i < 1; i++) {
f = snd_pcm_writei(handle, buffer, frames);
...
}
Full source code at http://pastebin.com/m2f28b578
Does this mean, that I shouldn't give snd_pcm_writei() the number of
all the frames in buffer, but only
sample_rate * latency = frames
?
So if I e.g. have:
sample_rate = 44100
latency = 0.5 [s]
all_frames = 100000
The number of frames that I should give to snd_pcm_writei() would be
sample_rate * latency = frames
44100*0.5 = 22050
and the number of iterations the for-loop should be?:
(int) 100000/22050 = 4; with frames=22050
and one extra, but only with
100000 mod 22050 = 11800
frames?
Is that how it works?
Louise
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html#gf13067c0ebde29118ca05af76e5b17a9
frames should be the number of frames (samples) you want to write from the buffer. Your system's sound driver will start transferring those samples to the sound card right away, and they will be played at a constant rate.
The latency is introduced in several places. There's latency from the data buffered by the driver while waiting to be transferred to the card. There's at least one buffer full of data that's being transferred to the card at any given moment, and there's buffering on the application side, which is what you seem to be concerned about.
To reduce latency on the application side you need to write the smallest buffer that will work for you. If your application performs a DSP task, that's typically one window's worth of data.
There's no advantage in writing small buffers in a loop - just go ahead and write everything in one go - but there's an important point to understand: to minimize latency, your application should write to the driver no faster than the driver is writing data to the sound card, or you'll end up piling up more data and accumulating more and more latency.
For a design that makes producing data in lockstep with the sound driver relatively easy, look at jack (http://jackaudio.org/) which is based on registering a callback function with the sound playback engine. In fact, you're probably just better off using jack instead of trying to do it yourself if you're really concerned about latency.
I think the reason for the "premature" device closure is that you need to call snd_pcm_drain(handle); prior to snd_pcm_close(handle); to ensure that all data is played before the device is closed.
I did some testing to determine why snd_pcm_writei() didn't seem to work for me using several examples I found in the ALSA tutorials and what I concluded was that the simple examples were doing a snd_pcm_close () before the sound device could play the complete stream sent it to it.
I set the rate to 11025, used a 128 byte random buffer, and for looped snd_pcm_writei() for 11025/128 for each second of sound. Two seconds required 86*2 calls snd_pcm_write() to get two seconds of sound.
In order to give the device sufficient time to convert the data to audio, I put used a for loop after the snd_pcm_writei() loop to delay execution of the snd_pcm_close() function.
After testing, I had to conclude that the sample code didn't supply enough samples to overcome the device latency before the snd_pcm_close function was called which implies that the close function has less latency than the snd_pcm_write() function.
If the ALSA driver's start threshold is not set properly (if in your case it is about 2s), then you will need to call snd_pcm_start() to start the data rendering immediately after snd_pcm_writei().
Or you may set appropriate threshold in the SW params of ALSA device.
ref:
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m___s_w___params.html