waveOutWrite buffers are never returned to application - c++

I have a problem with Microsoft's WaveOut API:
edit1: Added a link to a sample project.
edit2: Removed the link; it's not representative of the issue.
After playing some audio, when I want to terminate a given playback stream, I call the function:
waveOutClose(hWaveOut_);
However, even after waveOutClose() is called, sometimes the library will still access memory previously passed to it by waveOutWrite(), causing an invalid memory access.
I then tried to ensure all the buffers are marked as done before freeing the buffer:
PcmPlayback::~PcmPlayback()
{
    if(hWaveOut_ == nullptr)
        return;
    waveOutReset(hWaveOut_); // infinite-loops, never returns
    for(auto it = buffers_.begin(); it != buffers_.end(); ++it)
        waveOutUnprepareHeader(hWaveOut_, &it->wavehdr_, sizeof(WAVEHDR));
    while( buffers_.empty() == false ) // infinite loops
        removeCompletedBuffers();
    waveOutClose(hWaveOut_);
    //Unhandled exception at 0x75629E80 (msvcrt.dll) in app.exe:
    // 0xC0000005: Access violation reading location 0xFEEEFEEE.
}
void PcmPlayback::removeCompletedBuffers()
{
    for(auto it = buffers_.begin(); it != buffers_.end();)
    {
        if( it->wavehdr_.dwFlags & WHDR_DONE )
        {
            waveOutUnprepareHeader(hWaveOut_, &it->wavehdr_, sizeof(WAVEHDR));
            it = buffers_.erase(it);
        }
        else
            ++it;
    }
}
However, this never happens: the buffer list never becomes empty. There will be 4-5 blocks remaining with wavehdr_.dwFlags == 18 (I believe this means the blocks are still marked as in playback).
How can I resolve this issue?
In reply to Martin Schlott's comment ("Can you provide the loop where you write the buffer to waveOutWrite?"):
It's not quite a loop; instead, I have a function that is called whenever I receive an audio packet over the network:
void PcmPlayback::addData(const std::vector<short> &rhs)
{
    removeCompletedBuffers();
    if(rhs.empty())
        return;
    // add new data
    buffers_.push_back(Buffer());
    Buffer & buffer = buffers_.back();
    buffer.data_ = rhs;
    ZeroMemory(&buffers_.back().wavehdr_, sizeof(WAVEHDR));
    buffer.wavehdr_.dwBufferLength = buffer.data_.size() * sizeof(short);
    buffer.wavehdr_.lpData = (char *)(buffer.data_.data());
    waveOutPrepareHeader(hWaveOut_, &buffer.wavehdr_, sizeof(WAVEHDR)); // prepare block for playback
    waveOutWrite(hWaveOut_, &buffer.wavehdr_, sizeof(WAVEHDR));
}

The described behavior can happen if you do not call waveOutUnprepareHeader on every buffer you used before you call waveOutClose.
The flag field dwFlags seems to indicate that the buffers are still enqueued (WHDR_INQUEUE | WHDR_PREPARED). Try calling waveOutReset before unpreparing the buffers.
After analyzing your code, I found two problems/bugs which are not related to waveOut (funny, you use C++11 but the oldest media interface). You use a vector as a buffer. During some calls, the vector is copied! One bug I found is:
typedef std::function<void(std::vector<short>)> CALLBACK_FN;
instead of:
typedef std::function<void(std::vector<short>&)> CALLBACK_FN;
which forces a copy of the vector.
Try to avoid using vectors if you expect to use them mostly as raw buffers. It is better to use std::unique_ptr as the buffer pointer.
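To illustrate the copy problem described above, here is a minimal sketch. The names RawBuffer, sharesStorageByValue and sharesStorageByRef are hypothetical and not taken from the original project. It shows that a vector passed by value gets its own storage, while a buffer owned through std::unique_ptr cannot be copied by accident and keeps its address when moved, which matters when an API such as waveOut holds on to a raw pointer.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical buffer type: owns its samples through a unique_ptr, so the
// object can be moved but never copied by accident.
struct RawBuffer {
    std::unique_ptr<short[]> samples;
    std::size_t count;

    explicit RawBuffer(std::size_t n) : samples(new short[n]()), count(n) {}
    RawBuffer(RawBuffer&&) = default;
    RawBuffer& operator=(RawBuffer&&) = default;
};

// Declaring the move operations suppresses the implicit copy operations.
static_assert(!std::is_copy_constructible<RawBuffer>::value,
              "RawBuffer must not be copyable");

// Pass by value: the parameter is a copy with its own heap storage.
inline bool sharesStorageByValue(std::vector<short> v, const short* callerData) {
    return v.data() == callerData;
}

// Pass by const reference: no copy, same storage as the caller's vector.
inline bool sharesStorageByRef(const std::vector<short>& v, const short* callerData) {
    return v.data() == callerData;
}
```

For a non-empty vector v, sharesStorageByValue(v, v.data()) is false and sharesStorageByRef(v, v.data()) is true. Moving a RawBuffer transfers the pointer, so the address of the samples stays the same.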
Your callback in the recorder is not protected by a mutex, nor does it check whether a destructor was already called. The destruction happens during the callback (mostly), which leads to an exception.
For your test program, go back and use raw pointers and static callbacks before blaming waveOut. Your code is not bad, but the first bug already shows that a small bug will lead to unpredictable errors. As you also organize your buffers in a std::array, I would search for bugs there. I guess you make an unintentional copy of your whole buffer array, unpreparing the wrong buffers.
I did not have the time to dig deeper, but I guess those are the problems.

I managed to find my problem in the end. It was caused by multiple bugs and a deadlock. I will document what happened here so people can learn from this in the future.
I was clued in to what was happening when I fixed the bugs in the sample:
call waveInStop() before waveInClose() in ~Recorder.cpp
wait for all buffers to have the WHDR_DONE flag before calling waveOutClose() in ~PcmPlayback.
After doing this, the sample worked fine and did not display the behavior of the WHDR_DONE flag never being marked.
In my main program, that behavior was caused by a deadlock that occurs in the following situation:
I have a vector of objects representing each peer I am streaming audio with.
Each object owns a Playback class.
This vector is protected by a mutex.
Recorder callback:
    mutex.lock()
    send audio packet to each peer.
Remove Peer:
    mutex.lock()
    ~PcmPlayback
        wait for WHDR_DONE flags to be marked
A deadlock occurs when I remove a peer, which locks the mutex, while the recorder callback tries to acquire the same lock.
Note that this will happen often, because the playback buffer usually holds about 4 * 20 ms of audio while the recorder has a cadence of 20 ms.
In ~PcmPlayback, the buffers will never be marked as WHDR_DONE and any calls to the WaveOut API will never return because the WaveOut API is waiting for the Recorder callback to complete, which is in turn waiting on mutex.lock(), causing a deadlock.
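One common way to avoid this kind of deadlock is to detach the object from the shared container while holding the lock, and to run its (possibly blocking) destructor only after the lock has been released. The sketch below assumes this approach; it is not the fix the original code used, and Peer, PeerList and onDestroy are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a peer whose destructor may block
// (for example, while waiting for playback buffers to finish).
struct Peer {
    int id;
    std::function<void()> onDestroy;   // illustration hook, runs in the destructor
    Peer(int id_, std::function<void()> cb) : id(id_), onDestroy(std::move(cb)) {}
    ~Peer() { if (onDestroy) onDestroy(); }
};

class PeerList {
public:
    void add(std::unique_ptr<Peer> peer) {
        std::lock_guard<std::mutex> lock(mutex_);
        peers_.push_back(std::move(peer));
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return peers_.size();
    }

    // Detach the peer under the lock, then destroy it after the lock is
    // released, so a slow destructor cannot stall threads waiting on mutex_.
    bool remove(int id) {
        std::unique_ptr<Peer> doomed;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = std::find_if(peers_.begin(), peers_.end(),
                [id](const std::unique_ptr<Peer>& p) { return p->id == id; });
            if (it == peers_.end())
                return false;
            doomed = std::move(*it);
            peers_.erase(it);
        }                 // mutex_ is released here
        doomed.reset();   // destructor runs without holding mutex_
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<std::unique_ptr<Peer>> peers_;
};
```

With this layout, a callback that needs the same mutex (like the recorder callback described above) can still acquire it while a peer is being torn down.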

AT command response parser

I am working on my own implementation to read AT command responses from a modem using a microcontroller and C/C++.
There is a catch, though. My program has two "threads". In the first one I compare the possible reply from the modem using strcmp, which I believe is terribly slow.
Comparing function:
if (strcmp(reply, m_buffer) == 0)
{
memset(buffer, 0, buffer_size);
buffer_size = 0;
memset(m_buffer, 0, m_buffer_size);
m_buffer_size = 0;
return 0;
}
else
return 1;
This works fine for AT commands like AT or AT+CPIN?, where the last response from the modem is "OK" with nothing in the middle, but it does not work with commands like AT+CREG?, where it responds with:
+REG: n,n
OK
I am expecting "+REG: n,n", but I believe strncpy is very slow and my buffer data is replaced by "OK".
The second "thread" enables a UART RX interrupt and replaces my buffer data every time it receives new data.
Interrupt handler:
m_buffer_size = buffer_size;
strncpy(m_buffer, buffer, buffer_size + m_buffer_size);
Do you know of anything faster than strcmp, or some way to improve reading the AT command responses?
This has the scent of an XY Problem.
If you have seen the buffer contents being over written, you might want to look into a thread safe queue to deliver messages from the RX thread to the parsing thread. That way even if a second message arrives while you're processing the first, you won't run into "buffer overwrite" problems.
Move the data out of the receive buffer and place it in another buffer. Two buffers is rarely enough, so create a pool of buffers. In the past I have used linked lists of pre-allocated buffers to keep fragmentation down, but depending on the memory management and caching smarts in your microcontroller, and the language you elect to use, something along the lines of std::deque may be a better choice.
So:
Make a list of free buffers.
The UART handling thread loop looks something like:
    Get a buffer from the free list
    Read into the buffer until full or timeout
    Pass buffer to parser
        Parser puts buffer in its own receive list
        Parser sends a signal to wake up its thread
Repeat until terminated. If the free list is emptied, your program is probably still too slow to keep up. Perhaps adding more buffers will allow the program to get through a busy period, but if the data flow is relatively constant and the free list empties out... Well, you have a problem.
The parser loop, which also repeats until terminated, looks like:
    If receive list not empty,
        Get buffer from receive list
        Process buffer
        Return buffer to free list
    Otherwise
        Sleep
Remember to protect the lists from concurrent access by the different threads. C11 and C++11 have a number of useful tools to assist you here.
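The two loops above can be sketched with C++11 primitives as follows. This is a minimal sketch under assumptions: it presumes an environment where std::mutex and std::condition_variable are available (an RTOS port, for example; a bare interrupt handler should not take a mutex), and MsgBuffer, BufferQueue, BufferPool, receiveLine and parseNext are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <condition_variable>
#include <cstddef>
#include <cstring>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical fixed-size message buffer, pre-allocated by the pool.
struct MsgBuffer {
    std::array<char, 64> data{};
    std::size_t used = 0;
};

// Mutex-protected queue of buffer pointers; used for both lists.
class BufferQueue {
public:
    void push(MsgBuffer* b) {
        { std::lock_guard<std::mutex> lock(m_); q_.push_back(b); }
        cv_.notify_one();                       // wake a sleeping consumer
    }
    MsgBuffer* tryPop() {                       // non-blocking; nullptr if empty
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return nullptr;
        MsgBuffer* b = q_.front(); q_.pop_front();
        return b;
    }
    MsgBuffer* waitPop() {                      // sleeps until a buffer arrives
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        MsgBuffer* b = q_.front(); q_.pop_front();
        return b;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<MsgBuffer*> q_;
};

struct BufferPool {
    explicit BufferPool(std::size_t n) : storage(n) {
        for (auto& b : storage) freeList.push(&b);
    }
    BufferQueue freeList;    // buffers ready to be filled
    BufferQueue received;    // filled buffers waiting for the parser
    std::vector<MsgBuffer> storage;
};

// Receiver side: take a free buffer, fill it, hand it to the parser.
// Returns false when the free list is empty (the parser is too slow).
inline bool receiveLine(BufferPool& pool, const std::string& line) {
    MsgBuffer* b = pool.freeList.tryPop();
    if (!b) return false;
    b->used = std::min(line.size(), b->data.size());
    std::memcpy(b->data.data(), line.data(), b->used);
    pool.received.push(b);
    return true;
}

// Parser side: wait for a buffer, process it, return it to the free list.
inline std::string parseNext(BufferPool& pool) {
    MsgBuffer* b = pool.received.waitPop();
    std::string text(b->data.data(), b->used);  // "processing" is just a copy here
    pool.freeList.push(b);
    return text;
}
```

Because each response line lands in its own buffer, a later "OK" no longer overwrites an earlier line that the parser has not looked at yet.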

How should QLocalSocket/QDataStream be read to avoid deadlocks?

How should QLocalSocket/QDataStream be read?
I have a program that communicates with another via named pipes using QLocalSocket and QDataStream. The receiveMessage() slot below is connected to the QLocalSocket's readyRead() signal.
void MySceneClient::receiveMessage()
{
    qint32 msglength;
    (*m_stream) >> msglength;
    char* msgdata = new char[msglength];
    int read = 0;
    while (read < msglength) {
        read += m_stream->readRawData(&msgdata[read], msglength - read);
    }
    ...
}
I find that the application sometimes hangs on readRawData(). That is, it successfully reads the 4-byte header, but then never returns from readRawData().
If I add...
if (m_socket->bytesAvailable() < 5)
    return;
...to the start of this function, the application works fine (with the short test message).
I am guessing then (the documentation is very sparse) that there is some sort of deadlock occurring, and that I must use bytesAvailable() to gradually build up the buffer rather than blocking.
Why is this? And what is the correct approach to reading from QLocalSocket?
Your loop blocks the event loop, so if all the data did not arrive with the first read, you will never get the rest. I think that is what causes your problem.
The correct approach is to use signals and slots (the readyRead signal here) and just read the available data in your slot. If there is not enough, buffer it and return, and read more when you get the next signal.
Be careful with this alternative approach: if you are absolutely sure all the data you expect is going to arrive promptly (perhaps not unreasonable with a local socket where you control both client and server), or if the whole thing is in a thread which does nothing else, then it may be OK to use the waitForReadyRead method. But the event loop will remain blocked until data arrives, freezing the GUI for example (if in the GUI thread), and it is generally troublesome.
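The "buffer it and return" approach can be sketched independently of Qt. The class below is a hypothetical helper (MessageFramer is not a Qt class) and assumes the framing used in the question: a 4-byte length header followed by the payload. It assumes the header is big-endian, which is QDataStream's default byte order, and treats the length as unsigned. A readyRead slot would append whatever bytes are available and then drain all complete messages, never blocking.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: accumulates incoming bytes and extracts complete
// messages framed as a 4-byte big-endian length followed by the payload.
class MessageFramer {
public:
    // Call with whatever bytes are currently available.
    void append(const char* data, std::size_t size) {
        buffer_.append(data, size);
    }

    // Extracts one complete message into out. Returns false, without
    // consuming anything, if the header or payload is still incomplete.
    bool next(std::string& out) {
        if (buffer_.size() < 4)
            return false;
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(buffer_.data());
        const std::uint32_t len = (std::uint32_t(p[0]) << 24) |
                                  (std::uint32_t(p[1]) << 16) |
                                  (std::uint32_t(p[2]) << 8)  |
                                   std::uint32_t(p[3]);
        if (buffer_.size() - 4 < len)
            return false;                      // wait for the next readyRead
        out.assign(buffer_, 4, len);
        buffer_.erase(0, static_cast<std::size_t>(len) + 4);
        return true;
    }

private:
    std::string buffer_;   // bytes received but not yet consumed
};
```

In a slot, the usage would be along the lines of: append the bytes just read from the socket, then loop with while (framer.next(msg)) to handle every message that is already complete.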

pselect blocks even though data is available for read on socket

I'm experiencing an intermittent delay when reading from a POSIX socket (RHEL6 x86_64 C++ icpc). My code is designed such that a user can provide an absolute timespec deadline (vs. a relative timeout) to be used across multiple calls to recv. I call pselect to make sure that data is available for reading before attempting to call recv.
This typically works as expected (will wait for data but not exceed deadline, introducing no noticeable delay if data is available to recv). However, I have a user that can periodically (~50% of the time) get his application into a state where the select blocks for ~400-500 ms even though data is available on the socket. If I watch /proc/net/tcp, I can see that data is available in the RX queue and I can see the application slowly reading the data off the queue. If I skip the call to pselect and just call recv, the behavior is similar (but less delay overall indicating recv is also blocking unnecessarily). When the application gets into this state it stays this way (experiences consistent delay with each pselect/recv).
I spent several hours poking around here and on other sites. This is the closest similar issue I could find, but there was no resolution...
http://developerweb.net/viewtopic.php?id=7458
Has anyone run into this sort of behavior before? I'm at a loss for what to do. I've instrumented the code to validate that this is where the delay is happening. (Edit: We actually just validated that the entire method below was slow, not any particular system call.) It seems like a kernel/OS issue but I'm not sure where to look. Here's the code...
// protected
bool
Message::wait(int socket, const timespec & deadline) {
    // Bail if deadline not provided
    if (deadline.tv_sec == 0 && deadline.tv_nsec == 0) {
        return true;
    }
    // Make sure we haven't already exceeded deadline
    timespec currentTime;
    clock_gettime(CLOCK_REALTIME, &currentTime);
    if (VirtualClock::cmptime(currentTime, deadline) >= 0) {
        LOG_WARNING("Timed out waiting to receive data");
        m_timedOut = true;
        return false;
    }
    // Calculate receive timeout
    timespec timeout;
    memset(&timeout, 0, sizeof(timeout));
    timeout.tv_nsec = VirtualClock::nsecs(currentTime, deadline);
    VirtualClock::fixtime(timeout);
    // Wait for data
    fd_set descSet;
    FD_ZERO(&descSet);
    FD_SET(socket, &descSet);
    int result = pselect(socket + 1, &descSet, NULL, NULL, &timeout, NULL);
    if (result == -1) {
        m_error = errno;
        LOG_ERROR("Failed to wait for data: %d, %s",
                  m_error, strerror(m_error));
        return false;
    } else if (result == 0 || !FD_ISSET(socket, &descSet)) {
        LOG_WARNING("Timed out waiting to receive data");
        m_timedOut = true;
        return false;
    }
    return true;
}
VirtualClock is a time-related utility class just used here to compare/fix-up timespecs (i.e. not introducing any delays). I'd appreciate any insight on this behavior.
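For readers without access to VirtualClock, the conversion from an absolute deadline to the relative timeout that pselect expects can be sketched as below. The helper remainingUntil is hypothetical and is not the author's VirtualClock implementation; it simply subtracts two timespecs, normalizes the nanosecond field, and clamps to zero once the deadline has passed.

```cpp
#include <time.h>

// Hypothetical helper: returns (deadline - now) as a normalized timespec,
// clamped to zero if the deadline has already passed.
inline timespec remainingUntil(const timespec& now, const timespec& deadline) {
    timespec out;
    out.tv_sec = deadline.tv_sec - now.tv_sec;
    out.tv_nsec = deadline.tv_nsec - now.tv_nsec;
    if (out.tv_nsec < 0) {          // borrow one second
        out.tv_sec -= 1;
        out.tv_nsec += 1000000000L;
    }
    if (out.tv_sec < 0) {           // deadline already exceeded
        out.tv_sec = 0;
        out.tv_nsec = 0;
    }
    return out;
}
```

The result can be passed directly as the timeout argument of pselect, provided now and deadline were taken from the same clock.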
This was in fact not a problem with any system call. We used strace to diagnose and were seeing tons of calls to clock_gettime. Another (third) review of the calling code revealed a programming error resulting in the called code having a reference to corrupt stack data. This was facilitated by a flawed API design on my part resulting in corruption of the deadline.
I was allowing the user to pass in a reference to a ServerConfig class containing configuration (including data related to the deadline). My Server class was saving the reference instead of copying the object. The user created an instance of my Server class on the heap and passed in a reference to a ServerConfig on the stack (in a method), resulting in non-deterministic garbage in the configuration when the method exited and the ServerConfig went out of scope. This is older code, and I've since prevented this sort of thing from happening in other places after being burned, but this one slipped through.
So lessons learned for me are: be careful with writing APIs that hang on to user-provided references, rethink premature optimization (the whole reason I was hanging onto a reference instead of just doing a copy), and look for stack corruption when you see non-deterministic behavior like this (something that I check for when I suspect builds are jacked up but didn't suspect this time). Also, strace is a great tool...I've seen others use it but now I'm comfortable using it myself.
Thanks for the comments and sorry for the false alarm.
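The lifetime bug described above can be avoided by having the long-lived object keep its own copy of the configuration. The sketch below reuses the class names Server and ServerConfig from the description, but the members (deadlineSec, name) and the makeServer function are hypothetical and only serve the illustration.

```cpp
#include <memory>
#include <string>

// Hypothetical configuration; the real class has different members.
struct ServerConfig {
    long deadlineSec;
    std::string name;
};

class Server {
public:
    // Takes a reference for convenience but stores a copy, so the
    // caller's object may go out of scope afterwards.
    explicit Server(const ServerConfig& config) : config_(config) {}

    long deadlineSec() const { return config_.deadlineSec; }
    const std::string& name() const { return config_.name; }

private:
    ServerConfig config_;   // owned copy; a "const ServerConfig&" member
                            // here would dangle once the caller returns
};

// Mirrors the scenario from the description: the config lives on the
// stack of this function, the Server outlives it on the heap.
inline std::unique_ptr<Server> makeServer() {
    ServerConfig local{5, "demo"};
    return std::unique_ptr<Server>(new Server(local));
}
```

The copy costs a little at construction time, which is usually negligible compared with the cost of tracking down a dangling reference.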

OpenAL unqueueing error code, incomplete documentation

I am trying to implement streaming audio and I've run into a problem where OpenAL is giving me an error code that seems impossible given the information in the documentation.
int buffersProcessed = 0;
alGetSourcei(m_Source, AL_BUFFERS_PROCESSED, &buffersProcessed);
PrintALError();
int toAddBufferIndex;
// Remove the first buffer from the queue and move it to
// the end after buffering new data.
if (buffersProcessed > 0)
{
    ALuint unqueued;
    alSourceUnqueueBuffers(m_Source, 1, &unqueued);
    /////////////////////////////////
    PrintALError(); // Prints AL_INVALID_OPERATION //
    /////////////////////////////////
    toAddBufferIndex = firstBufferIndex;
}
According to the documentation [PDF], AL_INVALID_OPERATION means: "There is no current context." This seems like it can't be true because OpenAL has been, and continues to play other audio just fine!
Just to be sure, I called ALCcontext* temp = alcGetCurrentContext( ); here and it returned a valid context.
Is there some other error condition that's possible here that's not mentioned in the docs?
More details: The sound source is playing when this code is being called, but the impression I got from reading the spec is you can safely unqueue processed buffers while the source is playing. PrintALError is just a wrapper for alGetError that prints if there is any error.
I am on a Mac (OS 10.8.3), in case it matters.
So far what I've gathered is that this OpenAL implementation seems to incorrectly report an error if you unqueue a buffer while the source is playing. The spec says that you should be able to unqueue a buffer that has been marked as processed while the source is playing:
Removal of a given queue entry is not possible unless either the source is stopped (in which case then entire queue is considered processed), or if the queue entry has already been processed (AL_PLAYING or AL_PAUSED source).
On that basis I'm gonna say this is probably a bug in my OpenAL implementation. I'm gonna leave the question open in case someone can give a more concrete answer though.
To handle the case where multiple buffers have been processed, use a loop.
The following works on iOS and Linux:
// Unqueue used buffers
ALint buffers_processed = 0;
alGetSourcei(streaming_source, AL_BUFFERS_PROCESSED, & buffers_processed); // get source parameter num used buffs
while (buffers_processed > 0) { // we have a consumed buffer so we need to replenish
    ALuint unqueued_buffer;
    alSourceUnqueueBuffers(streaming_source, 1, & unqueued_buffer);
    available_AL_buffer_array_curr_index--;
    available_AL_buffer_array[available_AL_buffer_array_curr_index] = unqueued_buffer;
    buffers_processed--;
}

WaitForRequest with Timeout crashes

EDIT: I have now edited my code a bit to give a rough idea of "all" the code. Maybe this helps to identify the problem ;)
I have integrated the following simple code fragment, which either cancels the timer if data is read from the TCP socket, or otherwise cancels the read from the socket:
// file tcp.cpp
void CheckTCPSocket()
{
    TRequestStatus iStatus;
    TSockXfrLength len;
    int timeout = 1000;
    RTimer timer;
    TRequestStatus timerstatus;
    TPtr8 buff;
    iSocket.RecvOneOrMore( buff, 0, iStatus, len );
    timer.CreateLocal();
    timer.After(timerstatus, timeout);
    // Wait for two requests - if timer completes first, we have a
    // timeout.
    User::WaitForRequest(iStatus, timerstatus);
    if(timerstatus.Int() != KRequestPending)
    {
        iSocket.CancelRead();
    }
    else
    {
        timer.Cancel();
    }
    timer.Close();
}
// file main.cpp
void TestActiveObject::RunL()
{
    TUint Data;
    MQueue.ReceiveBlocking(Data);
    CheckTCPSocket();
    SetActive();
}
This part is executed within an active object, and since integrating the code piece above I always get the kernel panic:
E32User-CBase 46: This panic is raised by an active scheduler, a CActiveScheduler. It is caused by a stray signal.
I never had any problem with my code until this piece of code was added. The code executes fine when data is read from the socket, and then the timer is canceled and closed. I do not understand how the timer object has any influence on the AO here.
It would be great if someone could point me in the right direction.
Thanks
This could be a problem with another active object completing (not one of these two), or SetActive() not being called. See Forum Nokia. Hard to say without seeing all your code!
BTW, User::WaitForRequest() is nearly always a bad idea.
Never mix active objects and User::WaitForRequest().
(Well, almost never. When you know exactly what you are doing it can be ok, but the code you posted suggests you still have some learning to do.)
You get the stray signal panic when the thread request semaphore is signalled with RThread::RequestComplete() by the asynchronous service provider and the active scheduler that was waiting on the semaphore with User::WaitForAnyRequest() tries to look for an active object that was completed so that its RunL() could be called, but cannot find any in its list of active objects.
In this case you have two ongoing requests, neither of which is controlled by the active scheduler (for example, not using CActive::iStatus as the TRequestStatus; issuing SetActive() on an object where CActive::iStatus is not involved in an async request is another error in your code but not the reason for stray signal). You wait for either one of them to complete with WaitForRequest() but don't wait for the other to complete at all. The other request's completion signal will go to the active scheduler's WaitForAnyRequest(), resulting in stray signal. If you cancel a request, you will still need to wait on the thread request semaphore.
The best solution is to make the timeout timer an active object as well. Have a look at the CTimer class.
Another solution is just to add another WaitForRequest on the request not yet completed.
You are calling TestActiveObject::SetActive() but there is no call to any method that sets TestActiveObject::iStatus to KRequestPending. This will create the stray signal panic.
The only iStatus variable in your code is local to the CheckTCPSocket() method.