Sync playback and recording on different computers - C++

I would like to sync audio playback and recording on two different computers from a common trigger signal, using a C/C++ program. The delay between them should not exceed 1 ms.
Retrieving the signal and then starting the program is not really an issue; that delay is insignificant (a few microseconds).
For the moment, I'm stuck at an average delay (between the start of playback and the start of recording) of about 20 ms, and the deviation is quite large (5 to 10 ms).
The computers run Linux and I'm using aplay and arecord from alsa-utils (started directly from the code with the system() call).
Does anyone have a good idea, or experience with decreasing or controlling the latency between the two audio interfaces?
In my opinion, there should be a way to initialize both interfaces (rate, output format, ...) and, for the playback device, preload the data into the audio buffer and then start playing when the signal is received.
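To illustrate what I have in mind, a rough, untested ALSA sketch (the device name, format and the getchar() stand-in for the trigger are placeholders):

#include <alsa/asoundlib.h>
#include <stdio.h>

int main() {
    snd_pcm_t *pcm;
    snd_pcm_open(&pcm, "default", SND_PCM_STREAM_PLAYBACK, 0);
    // 2 channels, S16_LE, 48 kHz, no soft resampling, 0.5 s buffer
    snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED,
                       2, 48000, 0, 500000);

    // Raise the start threshold so writes only fill the buffer
    // instead of auto-starting the stream.
    snd_pcm_sw_params_t *sw;
    snd_pcm_sw_params_alloca(&sw);
    snd_pcm_sw_params_current(pcm, sw);
    snd_pcm_sw_params_set_start_threshold(pcm, sw, 0x7fffffff);
    snd_pcm_sw_params(pcm, sw);

    static short preload[24000 * 2] = {0};   // first 0.5 s of audio (placeholder data)
    snd_pcm_writei(pcm, preload, 24000);     // preload the buffer, nothing plays yet

    getchar();                               // stand-in for the external trigger
    snd_pcm_start(pcm);                      // playback starts here
    // ... keep feeding samples with snd_pcm_writei() ...

    snd_pcm_drain(pcm);
    snd_pcm_close(pcm);
    return 0;
}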
Thanks

This is a tough one, but also technically very interesting. The best approach I can think of at the moment would be a RTT (round-trip time) approach, given that you can control the delay of the audio devices to the extent required. You emit a signal from the first system to the second system, to which the second system replies. The second system starts recording after a predefined amount of time (maybe 100 ms, but that depends on the expected latency). When the first system has received the response it can determine the round-trip time. It then starts playback after the predefined delay minus half the round-trip time, assuming that the way there takes the same amount of time as the way back. The accuracy that can be achieved depends on the mechanisms you are using for signalling. In outline (a rough code sketch follows below):
EMIT SIGNAL ON SYSTEM 1
RECEIVE SIGNAL ON SYSTEM 2
EMIT SIGNAL ON SYSTEM 2
RECEIVE SIGNAL ON SYSTEM 1
DETERMINE ROUND-TRIP-TIME
START ON SYSTEM 2 AFTER X ms
START ON SYSTEM 1 AFTER (X-RTT/2) ms
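A rough sketch of the System 1 side over UDP (illustrative only: the peer address, port and the 100 ms pre-delay X are made up, error handling is omitted, and System 2 needs the matching receive/reply/record logic):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <chrono>
#include <cstdio>
#include <thread>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    sockaddr_in peer{};
    peer.sin_family = AF_INET;
    peer.sin_port = htons(5000);
    inet_pton(AF_INET, "192.168.0.2", &peer.sin_addr);           // System 2

    char ping = 'P', pong;
    auto t0 = std::chrono::steady_clock::now();
    sendto(sock, &ping, 1, 0, (sockaddr*)&peer, sizeof(peer));   // EMIT SIGNAL ON SYSTEM 1
    recv(sock, &pong, 1, 0);                                     // RECEIVE SIGNAL ON SYSTEM 1
    auto rtt = std::chrono::steady_clock::now() - t0;            // DETERMINE ROUND-TRIP-TIME

    // System 2 starts recording X = 100 ms after it received the ping;
    // System 1 starts playback after X - RTT/2.
    std::this_thread::sleep_for(std::chrono::milliseconds(100) - rtt / 2);
    printf("start playback now\n");
    close(sock);
    return 0;
}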

Related

libpcap: Delay between receiving frames and call of callback-function

I am experiencing the following situation:
I open one of my network interfaces with pcap_open_live(). Then I compile a pcap filter so that only a specific Ethernet type is captured (ether proto 0x1234). Now I start pcap_loop(). The only thing the callback function does is send a frame via pcap_inject(). (The frame is hard-coded as a global char array.)
When I compare the timestamps of the received and the sent frame (e.g. in Wireshark on a third, uninvolved computer), the delay is around 3 milliseconds (minimum 1 millisecond, but also up to 10 milliseconds). So on average pcap needs around 3 milliseconds to process the received frame and call the callback function that sends the new frame.
I want/have to decrease that delay.
I have already tried the following:
trying all the different variants of the read timeout (in ms) in pcap_open_live(): even a read timeout of -1, which to my knowledge should mean polling, produces a delay of around 3 milliseconds
setting no filter
setting a higher priority for the process
setting InterruptThrottleRate=0 and other parameters of the e1000/e1000e kernel module to force the hardware to raise an interrupt for every single frame
But I never got the delay below the 3 millisecond average.
For my planned application it is necessary to react to incoming packets in a time under 100 microseconds.
Is this even generally doable with libpcap?! Or are there any other suggestions for realizing such an application?
Thanks for all your replies;
I hope someone can help me!
Notes: I am deploying under Linux/Ubuntu in C/C++.
For my planned application it is necessary to react to incoming packets in a time under 100 microseconds.
Then the buffering that many of the capture mechanisms atop which libpcap runs (BPF except on AIX, TPACKET_V3 on Linux with newer libpcap and kernel, DLPI on Solaris 10 and earlier, etc.) provide in order to reduce per-packet overhead would get in your way.
If the libpcap on your system has the pcap_set_immediate_mode() function, then:
use pcap_create() and pcap_activate(), rather than pcap_open_live(), to open the capture device;
call pcap_set_immediate_mode() between the pcap_create() and pcap_activate() calls.
In "immediate mode", packets should be delivered to the application by the capture mechanism as soon as the capture mechanism receives them.

Using timers with performance-critical software (Qt)

I am developing an application that is responsible for moving and managing robots over a UDP connection.
The application needs to:
Read joystick/user input using SDL.
Generate and send a control packet to the robot every 20 milliseconds (UDP)
Receive and decode response packets from the robot (~20 msecs). This was implemented with the signal/slot mechanism and does not require a timer.
Receive and process robot messages for debugging reasons. This is not time-regulated.
Update the UI regularly to keep the user notified about the status of the robot (e.g. battery voltage). For most cases, I have also used Qt's signal/slot mechanism.
Use a watchdog that disables the robot if no response is received after 1 second. The watchdog is reset when the application receives a robot packet (~20 msecs)
For the moment, I have implemented all of the above. However, the application fails to send the packets regularly when the watchdog is activated or when two or more QTimer objects are used. The application generally works, but I would not consider it "production ready". I have tried using the timers' precision flags (Qt::PreciseTimer, Qt::CoarseTimer and Qt::VeryCoarseTimer), but I still experienced problems.
Notes:
The code is generally well organized, there are no "god objects" in the code base (most source files are less than 150 lines long and only create the necessary dependencies).
Most of the time, I use QTimer::singleShot() (e.g. I will only send the next packet once the current packet has been sent).
Where we use timers:
To read joystick input (~50 msecs, precise timer)
To send robot packets (~20 msecs, precise timer)
To update some aspects of the UI (~500 msecs, coarse timer)
To update the elapsed time since the robot was enabled (~100 msecs, precise timer)
To implement a watchdog (put the application and robot in safe state if 1000 msecs have passed without a robot response)
Note: the watchdog is fed when we receive a response packet from the robot (~20 msecs)
Do you have any recommendations for using QTimer objects with performance-critical code? (Any idea is welcome.) Note that I have also tried using different threads, but that caused me more problems, since the application would not be in "sync", thus failing to effectively control the robots we have tested.
Actually, I seem to have underestimated Qt's timer and event loop performance. On my system I get on average around 20k nanoseconds for an event loop cycle plus the overhead from scheduling a queued function call, and a timer with a 1 millisecond interval is rarely late; most of the timeouts are a few thousand nanoseconds short of a millisecond. But that is a high-end system; on embedded hardware it may be a lot worse.
You should take the time to profile your target system and Qt build to determine whether it can indeed run snappily enough, and based on those measurements, adjust your timings to compensate for the system delays and get your events scheduled more on time.
You should definitely keep the timer thread as free as possible, because if you block it by IO or extensive computation, your timer will not be accurate. Use a dedicated thread to schedule work and extra worker threads to do the actual work. You may also try playing with thread priorities a bit.
Worst-case scenario, look for third-party high-performance event loop implementations or create your own, and potentially a faster signalling mechanism as well. As I already mentioned in the comments, Qt's inter-thread queued signals are very slow, at least compared to something like indirect function calls.
Last but not least, if you want to do task X every N units of time, that is only possible if task X takes N units of time or less on your system. You need to make this consideration for each task, and for all tasks running concurrently. And in order to get accurate scheduling, you should measure how long task X took, and if that is less than its period, schedule the next execution in the time remaining; otherwise execute immediately.
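As a rough illustration of that last point (a sketch only: the 20 ms period is assumed from the question and the packet-sending body is omitted):

#include <QCoreApplication>
#include <QElapsedTimer>
#include <QTimer>

static void sendControlPacket() {
    const qint64 periodMs = 20;                 // assumed control period
    QElapsedTimer t;
    t.start();

    // ... build and send the UDP control packet here (task X) ...

    qint64 remaining = periodMs - t.elapsed();
    if (remaining < 0)
        remaining = 0;                          // task overran: run again immediately
    QTimer::singleShot(static_cast<int>(remaining), &sendControlPacket);
}

int main(int argc, char *argv[]) {
    QCoreApplication app(argc, argv);
    QTimer::singleShot(0, &sendControlPacket);
    return app.exec();
}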

Threads are slow when audio is off

I have two projects. One is built with C++ Builder (no MFC), and the other is VC++ MFC 11.
When I create a thread that runs a loop -- let's say the loop increments a progress bar position from 1 to 100, using Sleep(10) -- it works, of course, in both the C++ Builder and the MFC project.
Now, Sleep(10) is supposed to wait 10 milliseconds. OK. But the problem is that this only holds if I have Media Player, Winamp or something else open that produces sound. If I close all media players, Winamp and other sound programs, my threads get slower than 10 milliseconds.
Each iteration takes something like 50-100 ms. If I play any music, it works normally, as I expected.
I have no idea why this is happening. I first thought I had made a mistake in the MFC app, but then why does the C++ Builder project also slow down?
And yes, I am positive it is sound related, because I even reformatted my Windows installation and disabled everything; that is how I finally narrowed it down to the sound issue.
Does my code need something?
Update:
Now, I have gone through the code and found that I use Sleep(1) in areas like this to wait 1 millisecond. The reason is that I move an object from left to right; if I remove this sleep, the movement does not show up because it is too fast, so I have to use Sleep(1). With Sleep(1), if audio is on it works; if audio is off it is very slow.
for (int i = 0; i <= 500; i++) {
    theDialog->staticText->SetWindowsPosition(NULL, i, 20, 0, 0);
    Sleep(1);
}
So, suggestions regarding this are really appreciated. What should I do?
I know this is the wrong way to do it. I should use something else that is proper and valid. But what exactly? Which function or class would help me move static text from one position to another smoothly?
Also, changing the thread priority has not helped.
Update 2:
Update 1 is actually another question :)
Sleep(10) will (as we know) wait for approximately 10 milliseconds. If there is a higher-priority thread which needs to run at that moment, the thread wakeup may be delayed. Multimedia threads probably run at real-time or high priority, so when you play sound, your thread's wakeup gets delayed.
Refer to Jeffrey Richter's comment in Programming Applications for Microsoft Windows (4th ed.), the section "Sleeping" in Chapter 7:
The system makes the thread not schedulable for approximately the
number of milliseconds specified. That's right—if you tell the system
you want to sleep for 100 milliseconds, you will sleep approximately
that long but possibly several seconds or minutes more. Remember that
Windows is not a real-time operating system. Your thread will probably
wake up at the right time, but whether it does depends on what else is
going on in the system.
Also, as per MSDN's Multimedia Class Scheduler Service (Windows):
MMCSS ensures that time-sensitive processing receives prioritized access to CPU resources.
As per the above documentation, you can also control the percentage of CPU resources that will be guaranteed to low-priority tasks, through a registry key
Sleep(10) waits for at least 10 milliseconds. You have to write code to check how long you actually waited and if it's more than 10 milliseconds, handle that sanely in your code. Windows is not a real time operating system.
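A small sketch of that check using QueryPerformanceCounter (untested):

#include <windows.h>
#include <stdio.h>

int main() {
    LARGE_INTEGER freq, before, after;
    QueryPerformanceFrequency(&freq);

    QueryPerformanceCounter(&before);
    Sleep(10);
    QueryPerformanceCounter(&after);

    double sleptMs = (after.QuadPart - before.QuadPart) * 1000.0 / freq.QuadPart;
    if (sleptMs > 10.0)
        printf("overslept by %.3f ms - compensate for it in the next iteration\n", sleptMs - 10.0);
    return 0;
}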
The minimum resolution for Sleep() timing is set system-wide with timeBeginPeriod() and timeEndPeriod(). For example, calling timeBeginPeriod(1) sets the minimum resolution to 1 ms. It may be that the audio programs are setting the resolution to 1 ms and restoring it to something greater than 10 ms when they are done. I had a problem with a program that used Sleep(1): it only worked fine while the XE2 IDE was running and would otherwise sleep for 12 ms. I solved the problem by calling timeBeginPeriod(1) directly at the beginning of my program.
See: http://msdn.microsoft.com/en-us/library/windows/desktop/dd757624%28v=vs.85%29.aspx
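For illustration, the shape of that fix applied to a Sleep(1) loop like the one in the question (link against winmm.lib; error checking omitted):

#include <windows.h>
#include <mmsystem.h>                // timeBeginPeriod / timeEndPeriod
#pragma comment(lib, "winmm.lib")

int main() {
    timeBeginPeriod(1);              // request 1 ms timer resolution system-wide
    for (int i = 0; i <= 500; i++) {
        // move the control here (the SetWindowsPosition call from the question)
        Sleep(1);                    // now close to 1 ms instead of ~10-15 ms
    }
    timeEndPeriod(1);                // always restore the previous resolution
    return 0;
}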

Real time application on Microsoft Windows 7 Pro

I have opened this new thread after having tried lots of things.
My application (C++ on VS2010) has to grab an image, process it and send the result over UDP. The problem is the frequency: 200 times per second. So I have a camera that records images into a double buffer at 200 Hz, and I have to process each image in less than 5 milliseconds. The application works 99.999% of the time, but I think Win7 Pro takes away my real-time priority, so in 1 out of 100,000 cases something goes wrong.
Reading the MSDN forums and so on, I can only use:
SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS); to give the process real-time priority when it has been launched with administrator privileges
SetThreadPriority(HANDLE, THREAD_PRIORITY_ABOVE_NORMAL); or THREAD_PRIORITY_HIGHEST or THREAD_PRIORITY_TIME_CRITICAL.
Now, I have 5 threads started by me (_beginthreadex) and several threads started inside the camera's compiled DLL. I think that if I set time-critical priority on all 5 of my threads, none of them has a higher priority than the others.
So I have two questions:
Can I work at 200 Hz without Windows's lags?
Do you have any suggestions for my thread settings?
Thanks!!
Bye bye
Paolo
Oh I would use more than two buffers for this. A pool of 200 image objects seems like a better bet.
How much latency can you afford, overall? It's always the same story with video streaming - you can have consistent, pause-free operation, or low latency, but not both.
How big is the video image buffer queue on the client side?
Edit:
'I must always send a UDP datagram every 5 millisec' :((
OK, so you have an image output queue with a UDP send thread on a 5 ms loop, yes? The queue must never run empty. It does indeed sound like the image processing is the bottleneck.
Do you have a pool of [number of cores +] threads doing the processing?
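Roughly the sender side I have in mind (a C++11 sketch: the names, the payload type and the 5 ms cadence are illustrative, and the socket setup plus the processing threads that fill the queue are omitted):

#include <chrono>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

std::queue<std::vector<char>> outQueue;   // filled by the image-processing threads
std::mutex queueMutex;

void senderThread() {
    auto next = std::chrono::steady_clock::now();
    for (;;) {
        next += std::chrono::milliseconds(5);        // fixed 5 ms cadence
        std::vector<char> datagram;
        {
            std::lock_guard<std::mutex> lock(queueMutex);
            if (!outQueue.empty()) {                 // the queue must never run dry
                datagram = std::move(outQueue.front());
                outQueue.pop();
            }
        }
        // sendto(sock, datagram.data(), datagram.size(), ...) would go here
        std::this_thread::sleep_until(next);         // absolute deadline avoids drift
    }
}

int main() {
    std::thread sender(senderThread);
    sender.join();
    return 0;
}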

Sleep Function Error In C

I have a data dump file in which different timestamped data is available. I get the time from the timestamp and sleep my C thread for that time. But the problem is that the actual time difference is 10 seconds, while the data I receive at the receiving end is delayed by almost 14-15 seconds. I am using the Windows OS. Kindly guide me.
Sorry for my weak English.
The sleep function will sleep for at least as long as the time you specify, but there is no guarantee that it won't sleep for longer. If you need an accurate interval, you will need to use some other mechanism.
If I understand correctly:
you have a thread that sends data (over the network? what is the source of the data?)
you slow down the sending rhythm using sleep
the received data (at the other end of the network) can be delayed much more (15 s instead of 10 s)
If the above describes what you are doing, your design has several flaws:
sleep is very imprecise: it will wait at least n seconds, but it may be more (especially if your system is loaded with other running apps).
networks introduce a buffering delay; you have no guarantee that your data will be sent immediately on the wire (usually it is not).
the trip itself introduces some delay (latency); if your protocol waits for an ACK from the receiving end, you should take that into account.
you should also consider the time necessary to read/build/retrieve the data to send and actually send it over the wire. Depending on what you are doing, it can be negligible or take several seconds...
If you give some more details it will be easier to diagnose the source of the problem: sleep, as you believe (it is indeed a really poor timer), or some other part of your system.
If your dump is large, I would bet that the additional time comes from reading the data and sending it over the wire. You should measure the time consumed by the sending process (read the time before starting and after finishing sending).
If this is indeed the source of the additional time, you just have to subtract that time from the next wait.
Example: sending the previous block of data took 4 s; the next block is due 10 s later, but as you already consumed 4 s, you only wait for 6 s.
sleep is still quite an imprecise timer, and obviously the above mechanism won't work if the sending time is larger than the delay between sends, but you get the idea.
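In code, the idea might look roughly like this (Windows, GetTickCount for simplicity; the Sleep(4000) just stands in for the real read-and-send work):

#include <windows.h>
#include <stdio.h>

int main(void) {
    DWORD gap_ms = 10000;              // timestamp gap from the dump (example: 10 s)
    DWORD start = GetTickCount();

    // ... read the next block and send it over the wire here ...
    Sleep(4000);                       // stand-in for 4 s of real sending work

    DWORD consumed = GetTickCount() - start;
    if (consumed < gap_ms)
        Sleep(gap_ms - consumed);      // 10000 - 4000 = 6000 ms, as in the example
    else
        printf("already late, send the next block immediately\n");
    return 0;
}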
Correction: sleep is not as bad in a Windows environment as it is on Unixes. The accuracy of the Windows Sleep is a millisecond; the accuracy of the Unix sleep is a second. If you do not need high-precision timing (and if a network is involved, high-precision timing is out of reach anyway), sleep should be OK.
The scheduler of any modern multitasking OS will not guarantee exact timings to user applications.
You can try to assign 'realtime' priority to your app somehow, from the Windows Task Manager for instance, and see if it helps.
Another solution is to implement a 'controlled' sleep, i.e. sleep in a series of 500 ms slices, checking the current timestamp between them. That way, if your call sleeps for 1 s instead of 500 ms at some step, you will notice it and skip the extra sleep(500 ms).
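Something along these lines (a Windows sketch, assuming GetTickCount64 is available):

#include <windows.h>

// Sleep in 500 ms slices and stop as soon as the wall clock says the full
// delay has elapsed, even if one slice overslept.
void controlled_sleep(DWORD total_ms) {
    ULONGLONG deadline = GetTickCount64() + total_ms;
    while (GetTickCount64() < deadline) {
        ULONGLONG remaining = deadline - GetTickCount64();
        Sleep(remaining > 500 ? 500 : (DWORD)remaining);
    }
}

int main(void) {
    controlled_sleep(10000);   // e.g. the 10 s gap from the dump
    return 0;
}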
Try out a Multimedia Timer. It is about as accurate as you can get on a Windows system. There is a good article on CodeProject about them.
The Sleep function can take longer than requested, but never less. Use WinAPI timer functions to get a function called back after an interval from now.
You could also use the Windows Task Scheduler, but that goes beyond standalone programmatic options.