WASAPI, Delays on m_AudioClient->Start() - C++

The app captures sound from a microphone using WASAPI.
This code initializes m_AudioClient, which is of type IAudioClient*:
const LONG CAPTURE_CLIENT_LATENCY = 50 * 10000; // 50 ms expressed in 100-ns REFERENCE_TIME units
DWORD loopFlag = m_IsLoopback ? AUDCLNT_STREAMFLAGS_LOOPBACK : 0;
hr = m_AudioClient->Initialize(AUDCLNT_SHAREMODE_SHARED
, AUDCLNT_STREAMFLAGS_EVENTCALLBACK | AUDCLNT_STREAMFLAGS_NOPERSIST | loopFlag
, CAPTURE_CLIENT_LATENCY, 0, m_WaveFormat->GetRawFormat(), NULL);
Then I use m_AudioClient->Start() and m_AudioClient->Stop() to pause or resume capturing.
Usually m_AudioClient->Start() takes 5-6 ms, but sometimes it takes about 150 ms, which is too much for the application.
If I call m_AudioClient->Start() once, subsequent calls of m_AudioClient->Start() during the next 5 seconds will be fast, but after about 10-15 seconds the next call of m_AudioClient->Start() will take longer (150 ms). So it looks like it keeps some state for several seconds, and after that it needs to get back to that state, which takes some time.
On another machine these delays never happen; every call of m_AudioClient->Start() takes about 30 ms.
On a third machine the average duration of m_AudioClient->Start() is 140 ms, but peak values are about 1 s.
I run the same code on all 3 machines. The microphone is not exactly the same; in most cases it is Microphone Array Realtek High Definition Audio.
Can somebody explain why these peak values for the duration of m_AudioClient->Start() happen and how I can fix it?
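For reference, here is a minimal sketch of how such Start() latencies can be timed with QueryPerformanceCounter (the measurement approach and helper name are illustrative, not part of the original code):
#include <windows.h>
#include <audioclient.h>

// Hypothetical helper: returns how long audioClient->Start() took, in milliseconds.
double TimeStartCall(IAudioClient* audioClient)
{
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);
    HRESULT hr = audioClient->Start();          // the call under measurement
    QueryPerformanceCounter(&t1);
    if (FAILED(hr))
        return -1.0;                            // report failures separately in real code
    return (t1.QuadPart - t0.QuadPart) * 1000.0 / freq.QuadPart;
}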

Related

C++ call to API function ::GetTickCount() jumps ~18 days

On a few Windows computers I have seen that two consecutive calls to ::GetTickCount() return a difference of 1610619236 ms (around 18 days). This is not due to wrap-around or an int/unsigned int mismatch. I use Visual C++ 2015/2017.
Has anybody else seen this behaviour? Does anybody have any idea about what could cause behaviour like this?
Best regards
John
Code sample that shows the bug:
class CLTemp
{
    DWORD nLastCheck;

public:
    CLTemp()
    {
        nLastCheck = ::GetTickCount();
    }

    // Service is called every 200 ms by a timer
    void Service()
    {
        if( ::GetTickCount() - nLastCheck > 20000 ) // check every 20 sec
        {
            // On some Windows machines, after an uptime of 776 days, the
            // ::GetTickCount() - nLastCheck gives a value of 1610619236
            // (corresponding to around 18 days)
            nLastCheck = ::GetTickCount();
        }
    }
};
Update - problem description, a way of reproducing it, and the solution:
The Windows API function GetTickCount() unexpectedly jumps 18 days forward in time when passing 776 days after Windows Restart.
We have experienced several times that some of our long-running Windows PC applications coded in Microsoft Visual C++ suddenly reported a time-out error. In many of our applications we call GetTickCount() to perform tasks at certain intervals or to watch for a time-out condition. The example code could go like this:
DWORD dwTimeNow, dwPrevTime = ::GetTickCount();
bool bExit = false;
while (!bExit)
{
    dwTimeNow = ::GetTickCount();
    if (dwTimeNow - dwPrevTime >= 5000)
    {
        dwPrevTime = dwTimeNow;
        // Perform my task
    }
    else
    {
        ::Sleep(10);
    }
}
GetTickCount() returns a DWORD, which is an unsigned 32-bit int. GetTickCount() wraps around from its maximum value of 0xFFFFFFFF to zero after approx. 49 days. The wrap-around is easily handled by using unsigned arithmetic and always subtracting the previous value from the new value to calculate the distance. Never compare two values from GetTickCount() directly against each other.
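A minimal sketch of that rule (the helper name is illustrative):
#include <windows.h>

// Wrap-safe elapsed time: unsigned subtraction handles the 0xFFFFFFFF -> 0 rollover.
DWORD ElapsedMs(DWORD earlierTick, DWORD laterTick)
{
    return laterTick - earlierTick;   // correct even if laterTick has wrapped past zero
}

// Usage: if (ElapsedMs(dwPrevTime, ::GetTickCount()) >= 5000) { ... }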
So, the wrap-around at its maximum value every 49 days is expected and handled. But we have experienced an unexpected wrap-around to zero of GetTickCount() 776 days after the latest Windows Restart. In this case GetTickCount() wraps from 0x9FFFFFFF to zero, which is 1610612736 milliseconds too early, corresponding to around 18.6 days. When GetTickCount() is used to check for a time-out condition and it suddenly reports that 18 days have elapsed since the last check, the software reports a false time-out condition. Note that it is 776 days after a Windows Restart. A Windows Restart resets the GetTickCount() value to zero. A PC reboot does not; instead, the time elapsed while switched off is added to the initial GetTickCount() value.
We have made a test program that provides evidence of this issue. The test program reads the values of GetTickCount(), GetTickCount64(), InterruptTime(), and UnbiasedInterruptTime() every 5000 milliseconds, scheduled by a Windows timer. Each time, the test program calculates the distance in time for each of the four time functions. If the distance in time is 10000 milliseconds or more, it is marked as a time-jump event and logged. It also keeps track of the minimum and maximum distance in time for each time function.
Before starting the test program, a Windows Restart is carried out. Ensure no automatic time synchronization is enabled. Then do a Windows shutdown. Start the PC again and make it enter its BIOS setup when it boots. In the BIOS, advance the real-time clock by 776 days. Let the PC boot up and start the test program. Then, after 17 hours, the unexpected wrap-around of GetTickCount() occurs (776 days, 17 hours, and 21 minutes). Only GetTickCount() shows this behavior; the other time functions do not.
The following excerpt from the logfile of the test program shows the start values reported by the four time-functions. In this example the time has only been advanced to 775 days after Windows Restart. The format of the log entry is the time-function value converted into: days hh:mm:ss.msec. TickCount32 is the plain GetTickCount(). Because it is a 32-bit value it has wrapped around and shows a different value. At GetTickCount64() we can see the 775 days.
2024-05-14 09:13:27.262 Start times
TickCount32 : 029 08:30:11.591
TickCount64 : 775 00:12:01.031
InterruptTime : 775 00:12:01.036
UnbiasedInterruptTime: 000 00:05:48.411
The next excerpt from the logfile shows the unexpected wrap-around of GetTickCount() (TickCount32). The format is: distance between the previous value and the new value (should always be around 5000 msec), then the new value converted into days and time, and finally the previous value converted into days and time. We can see that GetTickCount() jumps 1610617752 milliseconds (approx. 18.6 days) while the other three time functions only advance approx. 5000 msec as expected. From TickCount64 one can see that it occurs at 776 days, 17 hours, and 21 minutes.
2024-05-16 02:22:30.394 Time jump *****
TickCount32 : 1610617752 - 000 00:00:00.156 - 031 01:39:09.700
TickCount64 : 5016 - 776 17:21:04.156 - 776 17:20:59.140
InterruptTime : 5015 - 776 17:21:04.165 - 776 17:20:59.150
UnbiasedInterruptTime: 5015 - 001 17:14:51.540 - 001 17:14:46.525
If you increase the time that the real time clock is advanced to two times 776 days and 17 hours – for example 1551 days – the phenomenon shows up once more. It has a cyclic nature.
2026-06-30 06:34:26.663 Start times
TickCount32 : 029 12:41:57.888
TickCount64 : 1551 21:44:51.328
InterruptTime : 1551 21:44:51.334
UnbiasedInterruptTime: 004 21:24:24.593
2026-07-01 19:31:47.641 Time jump *****
TickCount32 : 1610617736 - 000 00:00:04.296 - 031 01:39:13.856
TickCount64 : 5000 - 1553 10:42:12.296 - 1553 10:42:07.296
InterruptTime : 5007 - 1553 10:42:12.310 - 1553 10:42:07.303
UnbiasedInterruptTime: 5007 - 006 10:21:45.569 - 006 10:21:40.562
The only viable solution to this issue seems to be to use GetTickCount64() and abandon GetTickCount() entirely.
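A minimal sketch of the same polling loop ported to GetTickCount64() (variable names are illustrative):
#include <windows.h>

ULONGLONG ullPrevTime = ::GetTickCount64();   // 64-bit tick count: no 49-day wrap, no 776-day jump
bool bExit = false;
while (!bExit)
{
    ULONGLONG ullTimeNow = ::GetTickCount64();
    if (ullTimeNow - ullPrevTime >= 5000)
    {
        ullPrevTime = ullTimeNow;
        // Perform my task
    }
    else
    {
        ::Sleep(10);
    }
}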

How to send a code to the parallel port in exact sync with a visual stimulus in Psychopy

I am new to Python and PsychoPy; however, I have vast experience in programming and in designing experiments (using Matlab and EPrime). I am running an RSVP (rapid serial visual presentation) experiment which displays a different visual stimulus every X ms (X is an experimental variable and can be from 100 ms to 1000 ms). As this is a physiological experiment, I need to send triggers over the parallel port exactly on stimulus onset. I test the sync between triggers and visual onset using an oscilloscope and a photosensor. However, when I send my trigger before or after the win.flip(), even with the window's waitBlanking=False parameter, I still get a difference between the onset of the stimulus and the onset of the code.
Attached is my code:
im = []
for pic in picnames:
    im.append(visual.ImageStim(myWin, image=pic, pos=[0, 0], autoLog=True))

myWin.flip()  # to get to the next vertical blank
while tm < len(im) and t < len(codes):
    im[tm].draw()
    parallel.setData(codes[t])  # before
    myWin.flip()
    # parallel.setData(codes[t])  # after
    ttime.append(myClock.getTime())
    core.wait(0.01)
    parallel.setData(0)
    dur = (myClock.getTime() - ttime[t]) * 1000
    while dur < stimDur - frameDurAvg + 1:
        dur = (myClock.getTime() - ttime[t]) * 1000
    t = t + 1
    tm = tm + 1
myWin.flip()
How can I sync my stimulus onset to the trigger? I'm not sure if this is a graphics card issue (I'm using an LCD Acer screen with the onboard Intel graphics card). Many thanks,
Shani
win.flip() waits for the next monitor update. This means that the next line after win.flip() is executed almost exactly when the monitor begins drawing the frame. That's where you want to send your trigger. The line just before win.flip() is potentially almost one frame earlier, e.g. 16.7 ms on a 60 Hz monitor, so your trigger would arrive too early.
There are two almost identical ways to do it. Let's start with the most explicit:
for i in range(10):
    win.flip()
    # On the first flip
    if i == 0:
        parallel.setData(255)
        core.wait(0.01)
        parallel.setData(0)
... so the signal is sent just after the image has been pushed to the monitor.
The slightly more timing-accurate way to do it will save you something like 0.01 ms (plus or minus an order of magnitude). Somewhere early in the script, define
def sendTrigger(code):
    parallel.setData(code)
    core.wait(0.01)
    parallel.setData(0)
Then do
win.callOnFlip(sendTrigger, code=255)
for i in range(10):
    win.flip()
This will call the function just after the first flip, before PsychoPy does a bit of housecleaning. So the function could have been called win.callOnNextFlip, since it's only executed on the first following flip.
Again, this difference in timing is so minuscule compared to other factors that this is not really a question of performance but rather of style preferences.
There is a hidden timing variable that is usually ignored: the monitor input lag, and I think this is the reason for the delay. Put simply, the monitor needs some time to display the image even after getting the input from the graphics card. This delay has nothing to do with the refresh rate (how many times the screen switches buffers) or with the response time of the monitor.
On my monitor, I find a delay of 23 ms when I send a trigger with callOnFlip(). How I correct for it: floor(23/16.667) = 1, and 23 % 16.667 = 6.333. So I call callOnFlip on the second frame, wait 6.3 ms, and then trigger the port. This works. I haven't tried with waitBlanking=True, which waits for the blanking start from the graphics card, as that gives me some more time to prepare the next buffer. However, I think that even with waitBlanking=True the effect will be there. (More after testing!)
Best,
Suddha
There is at least one routine that you can use to normalize the trigger delay to your screen refresh rate. I just tested it with a photosensor cell, and I went from a mean delay of 13 milliseconds (SD = 3.5 ms) between the trigger and the stimulus display to a mean delay of 4.8 milliseconds (SD = 3.1 ms).
The procedure is the following:
Compute the mean duration between two displays. Say your screen has a refresh rate of 85.05 Hz (this is my case). This means that there is a mean duration of 1000/85.05 = 11.76 milliseconds between two refreshes.
Just after you have called win.flip(), wait for this average delay before you send your trigger: core.wait(0.01176).
This will not ensure that all your delays now equal zero, since you cannot control the synchronization between the win.flip() command and the current state of your screen, but it will center the delay around zero. At least, it did for me.
So the code could be updated as follows:
refr_rate = 85.05
mean_delay_ms = 1000 / refr_rate
mean_delay_sec = mean_delay_ms / 1000  # PsychoPy needs timing values in seconds

def send_trigger(port, value):
    core.wait(mean_delay_sec)
    parallel.setData(value)
    core.wait(0.001)
    parallel.setData(0)

[...]
stimulus.draw()
win.flip()
send_trigger(port, value)
[...]
[...]

Can I retrieve microseconds or very accurate milliseconds in C++ on Windows?

So I made a game loop that uses the SDL_Delay function to cap the frames per second. It looks like this:
//While the user hasn't quit
while( stateID != STATE_EXIT )
{
    //Start the frame timer
    fps.start();

    //Do state event handling
    currentState->handle_events();

    //Do state logic
    currentState->logic();

    //Change state if needed
    change_state();

    //Do state rendering
    currentState->render();

    //Update the screen
    if( SDL_Flip( screen ) == -1 )
    {
        return 1;
    }

    //Cap the frame rate
    if( fps.get_ticks() < 1000 / FRAMES_PER_SECOND )
    {
        SDL_Delay( ( 1000 / FRAMES_PER_SECOND ) - fps.get_ticks() );
    }
}
So when I run my games at 60 frames per second (which is the "eye cap", I assume) I can still see a laggy type of motion, meaning I see the frames appearing independently, causing unsmooth motion.
This is because, apparently, the SDL_Delay function is not very accurate, causing a difference of +/- 15 milliseconds or so between frames, greater than whatever I want it to be.
(All these are just my assumptions.)
So I am just searching for a good and accurate timer that will help me with this problem.
Any suggestions?
I think there is a similar question in How to make thread sleep less than a millisecond on Windows.
But as a game programmer myself, I don't rely on sleep functions to manage the frame rate (the parameter they take is just a minimum). I just draw stuff on screen as fast as I can. I have a bunch of function calls in my game loop, and I keep track of how often I'm calling them. For instance, I check input quite often (1000x/second) to make the game more responsive, but I don't check the network inbox more than 100x/second.
For example:
#define NW_CHECK_INTERVAL 10
#define INPUT_CHECK_INTERVAL 1

uint32_t last_nw_check = 0, last_input_check = 0;

while (game_running) {
    uint32_t now = SDL_GetTicks();
    if (now - last_nw_check > NW_CHECK_INTERVAL) {
        check_network();
        last_nw_check = now;
    }
    if (now - last_input_check > INPUT_CHECK_INTERVAL) {
        check_input();
        last_input_check = now;
    }
    check_video();
    // and so on...
}
Use QueryPerformanceCounter / QueryPerformanceFrequency for that:
LARGE_INTEGER start, end, tps;   // tps = ticks per second
QueryPerformanceFrequency( &tps );
QueryPerformanceCounter( &start );
// ... the code to be timed goes here ...
QueryPerformanceCounter( &end );
int usPassed = (end.QuadPart - start.QuadPart) * 1000000 / tps.QuadPart;
Here's a small wait function I created for timing MIDI sequences using QueryPerformanceCounter:
void wait(int waitTime) {
    LARGE_INTEGER time1, time2, freq;
    if (waitTime == 0)
        return;
    QueryPerformanceCounter(&time1);
    QueryPerformanceFrequency(&freq);
    do {
        QueryPerformanceCounter(&time2);
    } while ((time2.QuadPart - time1.QuadPart) * 1000000ll / freq.QuadPart < waitTime);
}
To convert ticks to microseconds, calculate the difference in ticks, multiply by 1,000,000 (microseconds/second) and divide by the frequency of ticks per second.
Note that some things may throw this off, for instance the precision of the high-resolution counter is not likely to be down to a single microsecond. For example, if you want to wait 10 microseconds and the precision/frequency is one tick every 6 microseconds, your 10 microsecond wait will actually be no less than 12 microseconds. Again, this frequency is system dependent and will vary from system to system.
Also, Windows is not a real-time operating system. A process may be preempted at any time and it is up to Windows to decide when the process is rescheduled. The application may be preempted in the middle of this function and not restarted again until long after the expected wait time has elapsed. There really isn't much you can do about it but you'll probably never notice it if it happens.
60 frames per second is just the mains power frequency in the US (50 Hz in Europe; Africa and Asia are somewhat mixed) and was adopted as the video refresh frequency for hardware convenience (it can be an integer multiple on more sophisticated monitors). It was a mandatory constraint for CRT displays, and it is still a convenient reference for LCDs (that is how frequently the frame buffer is uploaded to the display).
The eye cap is no more than 20-25 fps (not to be confused with retinal persistence, which is about half of that), and that's why TV interlaces two fields on every refresh.
Independently of timing accuracy, no hardware device can be updated during its buffer scan (otherwise the image changes while it is shown, resulting in half-drawn, broken frames); hence, if you go faster than half of the device refresh rate you are queued behind it and forced to wait for it.
60 fps in a game loop serves only to help CPU manufacturers sell new, faster CPUs. Slow down to under 25 and everything will look more fluid.
SDL_Delay:
This function waits a specified number of milliseconds before returning. It waits at least the specified time, but possibly longer due to OS scheduling. The delay granularity is at least 10 ms. Some platforms have shorter clock ticks, but this is the most common.
The actual delays observed with this function depend on OS settings. I'd suggest looking into the
Multimedia Timer API, particularly the timeBeginPeriod function, to adapt the interrupt frequency to your requirements.
Obtaining and Setting Timer Resolution shows an example of how to change the interrupt period to about 1 ms. This way you don't have the 15 ms hiccup anymore. BTW: the eye-catch period is about 40 ms.
Obtaining fixed-period timing can also be addressed with Waitable Timer Objects. But the use of multimedia timers is mandatory to obtain decent resolution, no matter what.
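A minimal sketch of a periodic waitable timer (illustrative; hitting a real 1 ms period still depends on the interrupt period being lowered as described above):
#include <windows.h>

// Sketch: wake up roughly every millisecond using a periodic waitable timer.
void RunPeriodicLoop()
{
    HANDLE hTimer = CreateWaitableTimer(NULL, FALSE, NULL);   // auto-reset timer
    LARGE_INTEGER dueTime;
    dueTime.QuadPart = -10000LL;   // first expiry after 1 ms (100-ns units, negative = relative)
    SetWaitableTimer(hTimer, &dueTime, 1 /* period in ms */, NULL, NULL, FALSE);

    for (;;)
    {
        WaitForSingleObject(hTimer, INFINITE);   // returns on every timer tick
        // do the periodic work here
    }
}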
Using other tools to improve the timing capabilities is discussed here.

C++ waiting a frame before carrying out an action

How would you wait a frame in C++?
I don't want the program to sleep or anything.
It would go something like:
Do this in this frame (1)
Continue with rest of program
Do this in the next frame (2)
where action 1 happens only in the first frame and action 2 happens only in the next frame. It would continue like this. 1, 2, 1 again, 2
I have the time between frames. I use C++, and I'm compiling with Visual Studio 2008.
Edit:
I'm using OpenGL; my OS is Windows 7.
Frame - http://en.wikipedia.org/wiki/Frame_rate
like each image of the scene printed to the screen over a given time period
I'm making some assumptions here.
Suppose you have a model for which you wish to show the state. You might wish to maximise the CPU time spent evolving the model rather than rendering.
So you fix the target frame rate, at e.g. 25 fps.
Again, assume you have optimised rendering so that it can be done in much less than 0.04 seconds.
So you might want something like (pseudo-code):
Time lastRenderTime = now();
while (forever)
{
    Time current = now();
    if (current - lastRenderTime > 0.04)
    {
        renderEverything();
        lastRenderTime = current;
    }
    else
    {
        evolveModelABit();
    }
}
Of course, you probably have an input handler to break the loop. Note that this approach assumes that you do not want the model evolution affected by elapsed real time. If you do, and many games do, then pass the current time to evolveModelABit().
For time functions on Windows, you can use:
LARGE_INTEGER frequency; // ticks per second
LARGE_INTEGER t1; // ticks
QueryPerformanceFrequency(&frequency);
QueryPerformanceCounter(&t1);
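A minimal now()-style helper built on these calls, returning seconds as assumed by the pseudo-code above (the helper itself is illustrative):
#include <windows.h>

// Converts the high-resolution counter to seconds; only differences between
// two calls are meaningful.
double now()
{
    static LARGE_INTEGER frequency = { 0 };
    if (frequency.QuadPart == 0)
        QueryPerformanceFrequency(&frequency);

    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    return (double)t.QuadPart / (double)frequency.QuadPart;
}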
Note that this approach is suitable for a scientific type of simulation. The model evolution will not depend on the frame rate, rendering, etc., and gives the same result every time.
For a game, typically there is a push for maximising the fps. This means that the main loop is of the form:
Time lastRenderTime = now();
while (forever)
{
    Time current = now();
    evolveModelABit(current, lastRenderTime);
    renderEverything();
    lastRenderTime = current;
}
If V-Sync is enabled, SwapBuffers will block the current thread until the next frame has been shown. So if you create a worker thread and release a lock, or resume its execution, right before the call to SwapBuffers, your program receives the CPU time it would otherwise yield to the rest of the system during the wait-for-swap block. If the worker thread manipulates GPU resources, it is a good idea to use high-resolution performance counters to determine how much time is left until the swap, minus some margin, and use this timing in the worker thread so that it puts itself to sleep at about the time the swap happens; that way the GPU will not have to context switch between the worker and renderer threads.
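A minimal sketch of the signalling part of that pattern, assuming a Win32 event and a worker that only needs CPU time during the swap wait (names are illustrative):
#include <windows.h>

HANDLE g_frameEvent;            // auto-reset event: CreateEvent(NULL, FALSE, FALSE, NULL)
volatile bool g_running = true; // cleared at shutdown

// Worker thread entry point (started with CreateThread during startup).
DWORD WINAPI WorkerLoop(LPVOID)
{
    while (g_running)
    {
        // Wake up when the render thread is about to block in SwapBuffers.
        WaitForSingleObject(g_frameEvent, INFINITE);
        // ... background work that should run while the renderer waits for V-Sync ...
    }
    return 0;
}

void RenderFrame(HDC hdc)
{
    // ... issue the OpenGL draw calls for this frame ...
    SetEvent(g_frameEvent);   // hand the CPU over to the worker thread
    SwapBuffers(hdc);         // blocks until the next vertical blank when V-Sync is on
}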

How to get an accurate 1ms Timer Tick under WinXP

I am trying to call a function every 1 ms. The problem is, I'd like to do this on Windows. So I tried the multimedia timer API.
Multimedia Timer API
Source
idTimer = timeSetEvent(
    1,
    0,
    TimerProc,
    0,
    TIME_PERIODIC | TIME_CALLBACK_FUNCTION );
My result was that most of the time the 1 ms was OK, but sometimes I get double the period. See the little bump at around 1.95 ms.
multimediatimerHistogram http://www.freeimagehosting.net/uploads/8b78f2fa6d.png
My first thought was that maybe my method was running too long. But I measured this already and this was not the case.
Queued Timers API
My next try was using the queued timers API with
hTimerQueue = CreateTimerQueue();
if (hTimerQueue == NULL)
{
    printf("Error creating queue: 0x%x\n", GetLastError());
}

BOOL res = CreateTimerQueueTimer(
    &hTimer,
    hTimerQueue,
    TimerProc,
    NULL,
    0,
    1,   // 1 ms
    WT_EXECUTEDEFAULT);
But again the result was not as expected. Now I get a cycle time of 2 ms most of the time.
queuedTimer http://www.freeimagehosting.net/uploads/2a46259a15.png
Measurement
For measuring the times I used QueryPerformanceCounter and QueryPerformanceFrequency.
Question
So now my question is whether somebody has encountered similar problems under Windows and maybe even found a solution.
Thanks.
Without going to a real-time OS, you cannot expect to have your function called every 1 ms.
On Windows, which is NOT a real-time OS (Linux is similar), a program that repeatedly reads the current time with microsecond precision and stores the consecutive differences in a histogram will have non-empty bins above 10 ms! This means that sometimes you will see 2 ms between your calls, but you can also get much more.
You can try calling timeBeginPeriod(1) at program start and timeEndPeriod(1) before quitting. This can probably improve timer precision.
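A minimal sketch of that suggestion (winmm.lib is required; the loop is illustrative and just shows where the 1 ms work would go, and the OS may clamp the requested period to what the hardware supports):
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")

int main()
{
    timeBeginPeriod(1);       // ask for a 1 ms scheduler/interrupt period

    for (int i = 0; i < 1000; ++i)
    {
        Sleep(1);             // with the lowered period, this is typically 1-2 ms
        // call the 1 ms work function here
    }

    timeEndPeriod(1);         // always pair with timeBeginPeriod
    return 0;
}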
A call to NtQueryTimerResolution() will return a value for ActualResolution. In your case the actual resolution is almost certainly 0.9765625 ms. This is exactly what you show in the first plot.
The second occurrence, of about 1.95 ms, is more precisely Sleep(1) = 1.9531 ms = 2 x 0.9765625 ms.
I guess the interrupt period runs at something close to 1 ms (0.9765625 ms).
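A sketch of how ActualResolution can be inspected; NtQueryTimerResolution is an undocumented ntdll export, so the prototype below is the commonly used one and is resolved dynamically (values are in 100-ns units):
#include <windows.h>
#include <stdio.h>

typedef LONG (NTAPI *NtQueryTimerResolutionPtr)(PULONG MinimumResolution,
                                                PULONG MaximumResolution,
                                                PULONG ActualResolution);

int main()
{
    NtQueryTimerResolutionPtr pNtQueryTimerResolution =
        (NtQueryTimerResolutionPtr)GetProcAddress(GetModuleHandleW(L"ntdll.dll"),
                                                  "NtQueryTimerResolution");
    if (pNtQueryTimerResolution)
    {
        ULONG minRes, maxRes, actualRes;   // all in 100-ns units
        pNtQueryTimerResolution(&minRes, &maxRes, &actualRes);
        printf("ActualResolution: %.7f ms\n", actualRes / 10000.0);   // e.g. 0.9765625
    }
    return 0;
}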
And now the trouble begins: The timer signals when the desired delay expires.
Say ActualResolution is set to 0.9765625: the interrupt heartbeat of the system will run at 0.9765625 ms periods, or 1024 Hz, and a call to Sleep is made with a desired delay of 1 ms. Two scenarios are to be looked at:
The call was made < 1ms (ΔT) ahead of the next interrupt. The next interrupt will not confirm that the desired period of time has expired. Only the following interrupt will cause the call to return. The resulting sleep delay will be ΔT + 0.9765625 ms.
The call was made >= 1ms (ΔT) ahead of the next interrupt. The next interrupt will force the call to return. The resulting sleep delay will be ΔT.
So the result depends a lot on when the call was made and therefore you may observe 0.98ms events as well as 1.95ms events.
Edit: Using CreateTimerQueueTimer will push the observed delay to 1.95 ms because the timer tick (interrupt period) is 0.9765625 ms. On the first occurrence of the interrupt, the requested duration of 1 ms has not quite expired, thus the TimerProc will only be triggered after the second interrupt (2 x 0.9765625 ms = 1.953125 ms > 1 ms). Consequently, the queueTimer plot shows the peak at 1.953125 ms.
Note: This behavior strongly depends on the underlying hardware.
More details can be found at the Windows Timestamp Project