Serial communication protocol design issues - c++

This is an embedded solution using C++, im reading the changes of brightness from a cellphone screen, from very bright (white) to dark (black).
Using JavaScript and a very simple script im changing the background of a webpage from white to black on 100 milliseconds intervals and reading the result on my brightness sensor, as expected the browser is not very precise on timing, some times it does 100ms sometimes less and sometimes more with a huge deviation at times.
var syncinterval = setInterval(function(){
bytes = "010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101";
bit = bytes[i];
output_bit(bit);
i += 1;
if(i > bytes.length) {
clearInterval(syncinterval);
i = 0;
for (i=0; i < input.length; i++) {
tbits = input[i].charCodeAt(0).toString(2);
while (tbits.length < 8) tbits = '0' + tbits;
bytes += tbits;
}
console.log(bytes);
}
}, sync_speed);
My initial idea, before knowing how the timing was on the browser was to use asynchronous serial communication, with some know "word" to sync the stream of data as RS232 does with his start bit, but on RS232 the clocks are very precise.
I could use a second sensor to read a different part of the screen as a clock, in this case even if the monitor or the browser "decides" to go faster or slower my system will only read when there is a clock signal (this is a similar application were they swipe the sensors instead of making the screen flicks as i need), but this require a more complex hardware system, i would like not to complicate things before searching for a software solution.
I don't need high speeds, the data im trying to send is just about 8 Bytes as much.

With any kind of asynchronous communications, you rely on transmitter sending a new 'bit' of data at a fixed time interval, and the receiver sampling the data at the same (fixed) interval. If the browser isn't accurate on timings, you'll just need to slow the bitrate down until its good enough.
There are a few tricks you can use to help you improve the reliability:-
a : While sending, calculate the required 'start transmit time' of each 'bit' in advance, and modify the delay after each bit has been 'sent', based on current time vs. required time. This means you'll avoid cumulative errors (i.e. if Bit 1 is sent a little 'late', the delay to bit 2 will be reduced to compensate), rather than delaying a constant N microseconds per bit.
b: While receiving, you must sample the incoming data much faster than you expect changes. (UARTS normally use a 16x oversample) This means you can resynchronize with the 'start bit' (the initial change from 1 to 0 in your diagram) and you can then sample each bit at the expected 'centre' of its time period.
In other words, if you're sending data at 1000us intervals, you sample data at ~62us intervals, and when you detect a 'start bit, you wait 500us to put you in the centre of the time period, then take 8 single-bit samples at 1000us intervals to form an 8-bit byte.

You might consider not using a fixed-rate encoding, where each bit is represented as a sequence of the same length, and instead go for a variable-rate encoding:
Time: 0 1 2 3 4
0: _/▔\_
1: _/▔▔▔▔▔\_
This means that when decoding, all you need to do is to measure the time the screen is lit. Short pulses are 0s, long pulses are 1s. It's woefully inefficient, but doesn't require accurate clocking and should be relatively resistant to inaccurate timing. By using some synchronisation pulses (say, an 010 sequence) between bytes you can automatically detect the length of the pulses and so end up not needing a fixed clock at all.

Related

Loop every x seconds based on process speed

I am implementing a basic (just for kiddies) anti-cheat for my game. I've included a timestamp to each of my movement packets and do sanity checks on server side for the time difference between those packets.
I've also included a packet that sends a timestamp every 5 seconds based on process speed. But it seems like this is a problem when the PC lags.
So what should I use to check if the process time is faster due to "speed hack"?
My current loop speed check on client:
this_time = clock();
time_counter += (double)(this_time - last_time);
last_time = this_time;
if (time_counter > (double)(5 * CLOCKS_PER_SEC))
{
time_counter -= (double)(5 * CLOCKS_PER_SEC);
milliseconds ms = duration_cast<milliseconds>(system_clock::now().time_since_epoch());
uint64_t curtime = ms.count();
if (state == WALK) {
// send the CURTIME to server
}
}
// other game loop function
The code above works fine if the clients PC doesn't lag maybe because of RAM or CPU issues. They might be running too many applications.
Server side code for reference: (GoLang)
// pktData[3:] packet containing the CURTIME from client
var speed = pickUint64(pktData, 3)
var speedDiff = speed - lastSpeed
if lastSpeed == 0 {
speedDiff = 5000
}
lastSpeed = speed
if speedDiff < 5000 /* 5000 millisec or 5 sec */ {
c.hackDetect("speed hack") // hack detect when speed is faster than the 5 second send loop in client
}
Your system has a critical flaw which makes it easy to circumvent for cheaters: It relies on the timestamp provided by the client. Any data you receive from the client can be manipulated by a cheater, so it must not be trusted.
If you want to check for speed hacking on the server:
log the current position of the players avatar at irregular intervals. Store the timestamp of each log according to the server-time.
Measure the speed between two such logs-entries by calculating the distance and divide it by the timestamp-difference.
When the speed is larger than the speed limit of the player, then you might have a cheater. But keep in mind that lags can lead to sudden spikes, so it might be better to take the average speed measurement of multiple samples to detect if the player is speed-hacking. This might make the speedhack-detection less reliable, but that might actually be a good thing, because it makes it harder for hackers to know how reliable any evasion methods they use are working.
To avoid false-positives, remember to keep track of any artificial ways of moving players around which do not obey the speed limit (like teleporting to spawn after being killed). When such an event occurs, the current speed measurement is meaningless and should be discarded.

Speed up data logging code

I have a device that outputs 64 bits of binary data at a rate of 1KHz. I am reading the device over USB via a 3rd party DLL, converting the binary data into a float, timestamping it, and writing to file.
I have the following setup at the moment:
int main(int argc, char* argv[])
{
unsigned char Message_Rx[64];
USHORT Bytes_Read=0;
std::ofstream out(argv[1]);
do
{
Result = Comms.USBRead(&Message_Rx[0],&Bytes_Read);
unsigned long now = getTickCount(start);
if(Result != 0)
{
uint16_t msb (Message_Rx[11] & 0xff) \\leftshited 8;
uint16_t lsb (Message_Rx[12] & 0xff);
uint16_t rate = msb | lsb;
char outstring[1024];
sprintf(outstring, "%d\t%.7f", now, (float)rate*0.03125);
out << outstring << "\n";
}
}while(!kbhit());
out.close();
}
(Sorry, formatting gets messed up with >> or <<).
This produces perfectly good results on my desktop. There doesn't appear to be any data missing and the timestamps are continuous and 1ms apart.
143379582 -0.5937500
143379583 -1.5312500
143379584 -1.6250000
143379585 -1.4062500
143379586 -1.1875000
143379587 -1.3437500
143379588 -1.3125000
143379589 -1.3125000
143379590 -1.1562500
But when I run this on the old laptop that I need to use I get timestamps that appear in blocks and it looks like there must be some data missing:
143379582 -0.5937500
143379582 -1.5312500
143379582 -1.6250000
143379582 -1.4062500
143379582 -1.1875000
143379593 -1.3437500
143379593 -1.3125000
143379593 -1.3125000
143379593 -1.1562500
Is there a way to achieve a speedup of my code so that I won't lose data?
To say this loud and clear: for any PC that is not a Intel 486SX, 64kb/s is a utmost laughable rate. Getting a few Mb/s over USB is very doable with small, Dollar-a-piece microcontrollers without any optimization.
Whatever goes wrong needs investigation much more than your code does.
I don't know the Comms library, but that's where I'd look for the place where time is spent.
Other than that, your printing stuff to the screen should take orders of magnitude more time than your processing, but still shouldn't be a problem. As mentioned, 1kS/s * 64 b/S is nothing for modern (read: last twenty years) PC hardware.
I recommend storing the raw data until the key is hit. After the key is pressed, output the data.
You want to remove formatting and output from high performance code areas.
Paraphrasing a song, There will be time enough for printing when the data's done.
Edit 1:
An array-based circular queue is a good data structure to hold the incoming data. This gives you the last N data samples.
Whenever you have issues with performance, your first step should be to profile the code to see what parts of it are taking up time.
However, for your code, I would say that the printing and string handling are unnecessary for the main loop. I would have a separate array of timestamps and within my main loop only acquire data.
After a key is hit, you no longer have timing restrictions and can deal with the somewhat expensive operation of file I/O and building up of the strings.
A final note is that your OS might be stealing CPU cycles from you. You may want to try to run your code with higher priorities to rule out scheduling.
With all that said, as was mentioned above, your data rate should be sustainable unless you're running on some really vintage hardware.

How to send a code to the parallel port in exact sync with a visual stimulus in Psychopy

I am new to python and psychopy, however I have vast experience in programming and in designing experiments (using Matlab and EPrime). I am running an RSVP (rapid visual serial presentation) experiment with displays a different visual stimuli every X ms (X is an experimental variable, can be from 100 ms to 1000 ms). As this is a physiological experiment, I need to send triggers over the parallel port exactly on stimulus onset. I test the sync between triggers and visual onset using an oscilloscope and photosensor. However, when I send my trigger before or after the win.flip(), even with the window waitBlanking=False parameter then I still get a difference between the onset of the stimuli and the onset of the code.
Attached is my code:
im=[]
for pic in picnames:
im.append(visual.ImageStim(myWin,image=pic,pos=[0,0],autoLog=True))
myWin.flip() # to get to the next vertical blank
while tm < and t &lt len(codes):
im[tm].draw()
parallel.setData(codes[t]) # before
myWin.flip()
#parallel.setData(codes[t]) # after
ttime.append(myClock.getTime())
core.wait(0.01)
parallel.setData(0)
dur=(myClock.getTime()-ttime[t])*1000
while dur < stimDur-frameDurAvg+1:
dur=(myClock.getTime()-ttime[t])*1000
t=t+1
tm=tm+1
myWin.flip()
How can I sync my stimulus onset to the trigger? I'm not sure if this is a graphics card issue (I'm using a LCD ACER screen with the onboard Intel graphics card). Many thanks,
Shani
win.flip() waits for next monitor update. This means that the next line after win.flip() is executed almost exactly when the monitor begins drawing the frame. That's where you want to send your trigger. The line just before win.flip() is potentially almost one frame earlier, e.g. 16.7 ms on a 60Hz monitor so your trigger would arrive too early.
There are two almost identical ways to do it. Let's start with the most explicit:
for i in range(10):
win.flip()
# On the first flip
if i == 0:
parallel.setData(255)
core.wait(0.01)
parallel.setData(0)
... so the signal is sent just after the image has been pushed to the monitor.
The slightly more timing-accurate way to do it will save you like 0.01 ms (plus minus an order of magnitude). Somewhere early in the script define
def sendTrigger(code):
parallel.setData(code)
core.wait(0.01)
parallel.setData(0)
Then do
win.callOnFlip(sendTrigger, code=255)
for i in range(10):
win.flip()
This will call the function just after the first flip, before psychopy does a bit of housecleaning. So the function could have been called win.callOnNextFlip since it's only executed on the first following flip.
Again, this difference in timing is so miniscule compared to other factors that this is not really a question of a performance but rather of style preferences.
There is a hidden timing variable that is usually ignored - the monitor input lag, and I think this is the reason for the delay. Put simply, the monitor needs some time to display the image even after getting the input from the graphics card. This delay has nothing to do with the refresh rate (how many times the screen switches buffer), or the response time of the monitor.
In my monitor, I find a delay of 23ms when I send a trigger with callOnFlip(). How I correct it is: floor(23/16.667) = 1, and 23%16.667 = 6.333. So I call the callOnFlip on the second frame, wait 6.3 ms and trigger the port. This works. I haven't tried with WaitBlanking=True, which waits for the blanking start from the graphics card, as that gives me some more time to prepare the next buffer already. However, I think that even with WaitBlanking=True the effect will be there. (More after testing!)
Best,
Suddha
There is at least one routine that you can use to normalized the trigger delay to your screen refreshing rate. I just tested it with a photosensor cell and I went from a mean delay of 13 milliseconds (sd = 3.5 ms) between the trigger and the stimulus display, to a mean delay of 4.8 milliseconds (sd = 3.1 ms).
The procedure is the following :
Compute the mean duration between two displays. Say your screen has a refreshing rate of 85.05 (this is my case). This means that there is mean duration of 1000/85.05 = 11.76 milliseconds between two refreshes.
Just after you called win.flip(), wait for this averaged delay before you send your trigger : core.wait(0.01176).
This will not ensure that all your delays now equal zero, since you cannot master the synchronization between the win.flip() command and the current state of your screen, but it will center the delay around zero. At least, it did for me.
So the code could be updated as following :
refr_rate = 85.05
mean_delay_ms = (1000 / refr_rate)
mean_delay_sec = mean_delay_ms / 1000 # Psychopy needs timing values in seconds
def send_trigger(port, value):
core.wait(mean_delay_sec)
parallel.setData(value)
core.wait(0.001)
parallel.setData(0)
[...]
stimulus.draw()
win.flip()
send_trigger(port, value)
[...]

Limit iterations per time unit

Is there a way to limit iterations per time unit? For example, I have a loop like this:
for (int i = 0; i < 100000; i++)
{
// do stuff
}
I want to limit the loop above so there will be maximum of 30 iterations per second.
I would also like the iterations to be evenly positioned in the timeline so not something like 30 iterations in first 0.4s and then wait 0.6s.
Is that possible? It does not have to be completely precise (though the more precise it will be the better).
#FredOverflow My program is running
very fast. It is sending data over
wifi to another program which is not
fast enough to handle them at the
current rate. – Richard Knop
Then you should probably have the program you're sending data to send an acknowledgment when it's finished receiving the last chunk of data you sent then send the next chunk. Anything else will just cause you frustrations down the line as circumstances change.
Suppose you have a good Now() function (GetTickCount() is bad example, it's OS specific and has bad precision):
for (int i = 0; i < 1000; i++){
DWORD have_to_sleep_until = GetTickCount() + EXPECTED_ITERATION_TIME_MS;
// do stuff
Sleep(max(0, have_to_sleep_until - GetTickCount()));
};
You can check elapsed time inside the loop, but it may be not an usual solution. Because computation time is totally up to the performance of the machine and algorithm, people optimize it during their development time(ex. many game programmer requires at least 25-30 frames per second for properly smooth animation).
easiest way (for windows) is to use QueryPerformanceCounter(). Some pseudo-code below.
QueryPerformanceFrequency(&freq)
timeWanted = 1.0/30.0 //time per iteration if 30 iterations / sec
for i
QueryPerf(count1)
do stuff
queryPerf(count2)
timeElapsed = (double)(c2 - c1) * (double)(1e3) / double(freq) //time in milliseconds
timeDiff = timeWanted - timeElapsed
if (timeDiff > 0)
QueryPerf(c3)
QueryPerf(c4)
while ((double)(c4 - c3) * (double)(1e3) / double(freq) < timeDiff)
queryPerf(c4)
end for
EDIT: You must make sure that the 'do stuff' area takes less time than your framerate or else it doesn't matter. Also instead of 1e3 for milliseconds, you can go all the way to nanoseconds if you do 1e9 (if you want that much accuracy)
WARNING... this will eat your CPU but give you good 'software' timing... Do it in a separate thread (and only if you have more than 1 processor) so that any guis wont lock. You can put a conditional in there to stop the loop if this is a multi-threaded app too.
#FredOverflow My program is running very fast. It is sending data over wifi to another program which is not fast enough to handle them at the current rate. – Richard Knop
What you might need a buffer or queue at the receiver side. The thread that receives the messages from the client (like through a socket) get the message and put it in the queue. The actual consumer of the messages reads/pops from the queue. Of course you need concurrency control for your queue.
Besides the flow control methods mentioned, if you also have the need to maintain an accurate specific data sending rate in your sender part. Usually it can be done like this.
E.x. if you want to send at 10Mbps, create a timer of interval 1ms so it will call a predefined function every 1ms. Then in the timer handler function, by keep tracking of 2 static variables 1)Time elapsed since beginning of sending data 2)How much data in bytes have been sent up to last call, you can easily calculate how much data is needed to be sent in the current call (or just sleep and wait for next call).
By this way, you can do "streaming" of data in a very stable way with very little jitterness, and this is usually adopted in streaming of videos. Of course it also depends on how accurate the timer is.

How to use ALSA's snd_pcm_writei()?

Can someone explain how snd_pcm_writei
snd_pcm_sframes_t snd_pcm_writei(snd_pcm_t *pcm, const void *buffer,
snd_pcm_uframes_t size)
works?
I have used it like so:
for (int i = 0; i < 1; i++) {
f = snd_pcm_writei(handle, buffer, frames);
...
}
Full source code at http://pastebin.com/m2f28b578
Does this mean, that I shouldn't give snd_pcm_writei() the number of
all the frames in buffer, but only
sample_rate * latency = frames
?
So if I e.g. have:
sample_rate = 44100
latency = 0.5 [s]
all_frames = 100000
The number of frames that I should give to snd_pcm_writei() would be
sample_rate * latency = frames
44100*0.5 = 22050
and the number of iterations the for-loop should be?:
(int) 100000/22050 = 4; with frames=22050
and one extra, but only with
100000 mod 22050 = 11800
frames?
Is that how it works?
Louise
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html#gf13067c0ebde29118ca05af76e5b17a9
frames should be the number of frames (samples) you want to write from the buffer. Your system's sound driver will start transferring those samples to the sound card right away, and they will be played at a constant rate.
The latency is introduced in several places. There's latency from the data buffered by the driver while waiting to be transferred to the card. There's at least one buffer full of data that's being transferred to the card at any given moment, and there's buffering on the application side, which is what you seem to be concerned about.
To reduce latency on the application side you need to write the smallest buffer that will work for you. If your application performs a DSP task, that's typically one window's worth of data.
There's no advantage in writing small buffers in a loop - just go ahead and write everything in one go - but there's an important point to understand: to minimize latency, your application should write to the driver no faster than the driver is writing data to the sound card, or you'll end up piling up more data and accumulating more and more latency.
For a design that makes producing data in lockstep with the sound driver relatively easy, look at jack (http://jackaudio.org/) which is based on registering a callback function with the sound playback engine. In fact, you're probably just better off using jack instead of trying to do it yourself if you're really concerned about latency.
I think the reason for the "premature" device closure is that you need to call snd_pcm_drain(handle); prior to snd_pcm_close(handle); to ensure that all data is played before the device is closed.
I did some testing to determine why snd_pcm_writei() didn't seem to work for me using several examples I found in the ALSA tutorials and what I concluded was that the simple examples were doing a snd_pcm_close () before the sound device could play the complete stream sent it to it.
I set the rate to 11025, used a 128 byte random buffer, and for looped snd_pcm_writei() for 11025/128 for each second of sound. Two seconds required 86*2 calls snd_pcm_write() to get two seconds of sound.
In order to give the device sufficient time to convert the data to audio, I put used a for loop after the snd_pcm_writei() loop to delay execution of the snd_pcm_close() function.
After testing, I had to conclude that the sample code didn't supply enough samples to overcome the device latency before the snd_pcm_close function was called which implies that the close function has less latency than the snd_pcm_write() function.
If the ALSA driver's start threshold is not set properly (if in your case it is about 2s), then you will need to call snd_pcm_start() to start the data rendering immediately after snd_pcm_writei().
Or you may set appropriate threshold in the SW params of ALSA device.
ref:
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html
http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m___s_w___params.html