Receiving data from USRP - C++

I have written a small C++ program that receives data from the USRP. The program can receive the I/Q data and show it on a spectrum analyzer. The receiver LED is not always green, though; it sort of blinks and dims. I suspect there is a rate mismatch between the computer and the USRP. Could this be the case? How does one make sure that the computer consumes the samples at the same rate as the USRP is acquiring them? Below is the thread function I use for the I/Q signal acquisition.
void
USRPDriver::RxEventLoop()
{
    uhd::rx_metadata_t md;
    uhd::stream_cmd_t stream_cmd(uhd::stream_cmd_t::STREAM_MODE_NUM_SAMPS_AND_DONE);
    stream_cmd.stream_now = true;
    stream_cmd.num_samps = 1024;
    //std::cout << "Maximum num samps = " << rx_stream->get_max_num_samps() << std::endl;
    std::vector<std::complex<float> > fcpxIQ;
    fcpxIQ.resize(1024);
    usrp->issue_stream_cmd(stream_cmd);
    while(true)
    {
        usrp->issue_stream_cmd(stream_cmd);
        size_t num_rx_samps = rx_stream->recv(&fcpxIQ[0], 1024, md);
        emit ReceiveIQ(fcpxIQ);
        //std::cout << "Rx rate = " << usrp->get_rx_rate(0) << std::endl;
        //fcpxIQ.clear();
    }
}

You should not use NUM_SAMPS_AND_DONE if you want continuous streaming; that is exactly the use case it is not meant for: it tells the USRP to stop receiving once 1024 samples have been received.
Simply don't use that mode.
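For continuous streaming, the usual approach is to issue STREAM_MODE_START_CONTINUOUS once before the receive loop and let the blocking recv() call pace the host, handling the overflow flag when the computer falls behind. Below is a minimal sketch under that assumption; the usrp and rx_stream handles come from the question's context and keep_running is a hypothetical stop flag, so this is an illustration rather than the original poster's code.

uhd::rx_metadata_t md;
std::vector<std::complex<float> > buff(rx_stream->get_max_num_samps());

uhd::stream_cmd_t start_cmd(uhd::stream_cmd_t::STREAM_MODE_START_CONTINUOUS);
start_cmd.stream_now = true;
usrp->issue_stream_cmd(start_cmd);    // issue once, before the loop

while (keep_running) {                // keep_running: hypothetical stop flag
    // recv() blocks until samples arrive, so the loop is paced by the USRP
    size_t num_rx_samps = rx_stream->recv(&buff.front(), buff.size(), md, 3.0);
    if (md.error_code == uhd::rx_metadata_t::ERROR_CODE_OVERFLOW) {
        std::cerr << "O";             // overflow: the host is not consuming samples fast enough
        continue;
    }
    // hand num_rx_samps samples to the consumer here (e.g. emit ReceiveIQ(...))
}

uhd::stream_cmd_t stop_cmd(uhd::stream_cmd_t::STREAM_MODE_STOP_CONTINUOUS);
usrp->issue_stream_cmd(stop_cmd);     // stop streaming when done

If the spectrum display cannot keep up, the overflow branch will fire; decoupling the display from the receive loop (for example with a queue) keeps the consumption rate matched to the acquisition rate.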

Related

Best practice for performance when multithreading with OpenCV VideoWriter in C++

I'm relatively new to C++, especially multi-threading, and have been playing with different options. I've gotten some stuff to work, but ultimately I'm looking to maximize performance, so I think it'd be better to reach out to everyone else for what would be most effective and then go down that road.
I'm working on an application that will take a video stream and write an unmodified video file and a modified video file (there's some image processing that happens) to the local disk. There are also going to be some other threads to collect some other GPS data, etc., but nothing special.
The problem I'm running into is that the framerate is limited mainly by the VideoWriter function in OpenCV. I know this can be greatly alleviated if I use a thread to write the frame to the VideoWriter object, so that the two VideoWriters can run simultaneously with each other and with the image processing functions.
I've successfully created this function:
void frameWriter(Mat writeFrame, VideoWriter *orgVideo)
{
    orgVideo->write(writeFrame);
}
And it is called from within an infinite loop like so:
thread writeOrgThread(frameWriter, modFrame, &orgVideo, &orgVideoMutex);
writeOrgThread.join();
thread writeModThread(frameWriter, processMatThread(modFrame, scrnMsg1, scrnMsg2), &modVideo, &modVideoMutex);
writeModThread.join();
Now, having the .join() immediately underneath defeats the performance benefits, but without it I immediately get the error "terminate called without an active exception". I thought it would do what I needed if I put the join() calls at the top of the loop, so that on the next iteration it would make sure the previous frame was written before writing the next, but then it behaves as if the join is not there (perhaps by the time the main task has made the full loop and gotten to the join, the thread has already terminated?). Also, I think using detach creates the issue that the threads are unsynchronized, and then I run into these errors:
[mpeg4 # 0000000000581b40] Invalid pts (156) <= last (156)
[mpeg4 # 00000000038d5380] Invalid pts (158) <= last (158)
[mpeg4 # 0000000000581b40] Invalid pts (158) <= last (158)
[mpeg4 # 00000000038d5380] Invalid pts (160) <= last (160)
[mpeg4 # 0000000000581b40] [mpeg4 # 00000000038d5380] Invalid pts (160) <= last
(160)
Invalid pts (162) <= last (162)
I'm assuming this is because multiple threads are trying to access the same resource? Finally, I tried using a mutex with detach to avoid the above, and I got a curious behavior where my sleep thread wasn't behaving properly and the frame rate was inconsistent.
void frameWriter(Mat writeFrame, VideoWriter *orgVideo, mutex *mutexVid)
{
    mutexVid->lock();
    orgVideo->write(writeFrame);
    mutexVid->unlock();
}
Obviously I'm struggling with thread synchronization and management of shared resources. I realize this is probably a rookie struggle, so if somebody tossed a tutorial link at me and told me to go read a book I'd be OK with that. I guess what I'm looking for right now is some guidance as to which specific method is going to get me the best performance in this situation, and then I'll make that work.
Additionally, does anyone have a link to a very good tutorial that covers multithreading in C++ from a broader point of view (not limited to the Boost or C++11 implementations, and covering mutexes, etc.)? It would greatly help me out with this.
Here's the 'complete' code; I stripped out some functions to make it easier to read, so don't mind the extra variables here and there:
//Standard libraries
#include <iostream>
#include <ctime>
#include <sys/time.h>
#include <fstream>
#include <iomanip>
#include <thread>
#include <chrono>
#include <mutex>
//OpenCV libraries
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
//Other libraries
//Namespaces
using namespace cv;
using namespace std;
// Code for capture thread
void captureMatThread(Mat *orgFrame, VideoCapture *stream1){
    //loop infinitely
    for(;;){
        //capture from webcam to Mat orgFrame
        (*stream1) >> (*orgFrame);
    }
}
Mat processMatThread(Mat inFrame, string scrnMsg1, string scrnMsg2){
    //Fancify image
    putText(inFrame, scrnMsg1, cvPoint(545,450), CV_FONT_HERSHEY_COMPLEX,
            0.5, CvScalar(255,255,0,255), 1, LINE_8, false);
    putText(inFrame, scrnMsg2, cvPoint(395,470), CV_FONT_HERSHEY_COMPLEX,
            0.5, CvScalar(255,255,0,255), 1, LINE_8, false);
    return inFrame;
}
void frameWriter(Mat writeFrame, VideoWriter *orgVideo, mutex *mutexVid)
{
    //(mutexVid->lock());
    orgVideo->write(writeFrame);
    //(mutexVid->unlock());
}
long usecDiff(long usec1, long usec2){
    if (usec1 > usec2){
        return usec1 - usec2;
    }
    else {
        return (1000000 + usec1) - usec2;
    }
}
int main()
{
    //Start video capture
    cout << "Opening camera stream..." << endl;
    VideoCapture stream1(0);
    if (!stream1.isOpened()) {
        cout << "Camera failed to open!" << endl;
        return 1;
    }
    //Message incoming image size
    cout << "Camera stream opened. Incoming size: ";
    cout << stream1.get(CV_CAP_PROP_FRAME_WIDTH) << "x";
    cout << stream1.get(CV_CAP_PROP_FRAME_HEIGHT) << endl;
    //File locations
    const long fileSizeLimitBytes(10485760);
    const int fileNumLimit(5);
    const string outPath("C:\\users\\nag1\\Desktop\\");
    string outFilename("out.avi");
    string inFilename("in.avi");
    //Declare variables for screen messages
    timeval t1;
    timeval t2;
    timeval t3;
    time_t now(time(0));
    gettimeofday(&t1,0);
    gettimeofday(&t2,0);
    gettimeofday(&t3,0);
    float FPS(0.0f);
    const int targetFPS(60);
    const long targetUsec(1000000/targetFPS);
    long usec(0);
    long usecProcTime(0);
    long sleepUsec(0);
    int i(0);
    stringstream scrnMsgStream;
    string scrnMsg1;
    string scrnMsg2;
    string scrnMsg3;
    //Define images
    Mat orgFrame;
    Mat modFrame;
    //Start Video writers
    cout << "Creating initial video files..." << endl;
    //Identify outgoing size, comments use incoming size
    const int frame_width = 640; //stream1.get(CV_CAP_PROP_FRAME_WIDTH);
    const int frame_height = 480; //stream1.get(CV_CAP_PROP_FRAME_HEIGHT);
    //Message outgoing image size
    cout << "Outgoing size: ";
    cout << frame_width << "x" << frame_height << endl;
    VideoWriter orgVideo(outPath + inFilename, CV_FOURCC('D','I','V','X'), targetFPS,
                         Size(frame_width,frame_height), true);
    mutex orgVideoMutex;
    VideoWriter modVideo(outPath + outFilename, CV_FOURCC('D','I','V','X'), targetFPS,
                         Size(frame_width,frame_height), true);
    mutex modVideoMutex;
    //unconditional loop
    cout << "Starting recording..." << endl;
    //Get first image to prevent exception
    stream1.read(orgFrame);
    resize(orgFrame, modFrame, Size(frame_width,frame_height));
    // start thread to begin capture and populate Mat frame
    thread captureThread(captureMatThread, &orgFrame, &stream1);
    while (true) {
        //Time stuff
        i++;
        if (i%2==0){
            gettimeofday(&t1,0);
            usec = usecDiff(t1.tv_usec, t2.tv_usec);
        }
        else{
            gettimeofday(&t2,0);
            usec = usecDiff(t2.tv_usec, t1.tv_usec);
        }
        now = time(0);
        FPS = 1000000.0f/usec;
        scrnMsgStream.str(std::string());
        scrnMsgStream.precision(2);
        scrnMsgStream << std::setprecision(2) << std::fixed << FPS;
        scrnMsg1 = scrnMsgStream.str() + " FPS";
        scrnMsg2 = asctime(localtime(&now));
        //Get image
        //Handled by captureMatThread now!!!
        //stream1.read(orgFrame);
        //resize image
        resize(orgFrame, modFrame, Size(frame_width,frame_height));
        //write original image to video
        //writeOrgThread.join();
        thread writeOrgThread(frameWriter, modFrame, &orgVideo, &orgVideoMutex);
        //writeOrgThread.join();
        writeOrgThread.detach();
        //orgVideo.write(modFrame);
        //write modified image to video
        //writeModThread.join();
        thread writeModThread(frameWriter, processMatThread(modFrame, scrnMsg1, scrnMsg2), &modVideo, &modVideoMutex);
        //writeOrgThread.join();
        //writeModThread.join();
        writeModThread.detach();
        //modVideo.write(processMatThread(modFrame, scrnMsg1, scrnMsg2));
        //sleep
        gettimeofday(&t3,0);
        if (i%2==0){
            sleepUsec = targetUsec - usecDiff(t3.tv_usec, t1.tv_usec);
        }
        else{
            sleepUsec = targetUsec - usecDiff(t3.tv_usec, t2.tv_usec);
        }
        this_thread::sleep_for(chrono::microseconds(sleepUsec));
    }
    orgVideo.release();
    modVideo.release();
    return 0;
}
This is actually running on a Raspberry Pi (adapted to use the Raspberry Pi camera), so my resources are limited, and that's why I'm trying to minimize how many copies of the image there are and implement the parallel writing of the video files. You can see I've also experimented with placing both 'join()'s after the "writeModThread", so at least the writing of the two files is in parallel. Perhaps that's the best we can do, but I plan to add a thread with all the image processing that I'd like to run in parallel (for now you can see it called as a simple function that adds plain text).
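One common pattern for this situation, offered here only as a hedged sketch rather than the definitive answer, is a single long-lived writer thread per VideoWriter fed through a small mutex-protected queue with a condition variable. The main loop queues a clone of the frame and moves on; each writer thread drains its own queue in order, so nothing spawns a thread per frame and the two files are still written in parallel. The FrameQueue type and writerLoop function below are illustrative names, not part of the original code.

#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include "opencv2/highgui/highgui.hpp"

struct FrameQueue {
    std::queue<cv::Mat> frames;
    std::mutex mtx;
    std::condition_variable cond;
    bool done = false;

    void push(const cv::Mat& f) {
        { std::lock_guard<std::mutex> lock(mtx); frames.push(f.clone()); } // clone so the caller can reuse its Mat
        cond.notify_one();
    }
    void finish() {
        { std::lock_guard<std::mutex> lock(mtx); done = true; }
        cond.notify_one();
    }
};

void writerLoop(FrameQueue& q, cv::VideoWriter& out) {
    for (;;) {
        std::unique_lock<std::mutex> lock(q.mtx);
        q.cond.wait(lock, [&] { return !q.frames.empty() || q.done; });
        if (q.frames.empty()) return;       // done and fully drained
        cv::Mat f = q.frames.front();
        q.frames.pop();
        lock.unlock();                      // write without holding the lock
        out.write(f);
    }
}

// Possible usage inside main():
//   FrameQueue orgQueue;
//   std::thread orgWriter(writerLoop, std::ref(orgQueue), std::ref(orgVideo));
//   ... per frame: orgQueue.push(modFrame); ...
//   orgQueue.finish();
//   orgWriter.join();

Because each file's frames are written strictly in the order they were queued, this also avoids the "Invalid pts" messages that interleaved detached threads can produce.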

OpenCL - Draw To OpenGL Texture crashes

I am trying to create an OpenCL raycaster, so I am drawing to an OpenGL texture many times per second. However, queue.enqueueNDRangeKernel eventually returns -9999. If I remove write_imagef from my kernel code, it works, so I figured that call causes the problem.
OpenCL kernel (broken down)
__kernel void main(__write_only image2d_t screen)
{
    unsigned int x = get_global_id(0);
    unsigned int y = get_global_id(1);
    int2 coords = (int2)(x, y);
    write_imagef(screen, coords, (float4)(1, 0, 1, 1));
}
This is the code that runs once in C++:
cl::Program::Sources sources;
string code = ResourceLoader::loadFile(filename);
sources.push_back({ code.c_str(), code.length() });
program = cl::Program(OpenCL::context, sources);
if (program.build({ OpenCL::default_device }) != CL_SUCCESS)
{
    cout << "Could not build program \"" << filename << "\"! Error:" << endl;
    cout << "OpenCL: Error building: " << program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(OpenCL::default_device) << "\n";
    system("PAUSE");
    exit(1);
}
queue = CommandQueue(OpenCL::context, OpenCL::default_device);
kernel = Kernel(program, "main");
//OpenGL texture
ImageGL b(OpenCL::context, CL_MEM_READ_WRITE, GL_TEXTURE_2D, 0, argument, &error);
if (error != 0)
{
    cout << "CL Error: " << OpenCL::get_cl_error_string(error) << endl;
    system("PAUSE");
    exit(error);
}
kernel.setArg(0, b);
This code runs every frame:
glFinish();
queue.enqueueAcquireGLObjects(&this->buffersGL);
NDRange range;
if (lengthZ <= 0 && lengthY <= 0)
    range = NDRange(lengthX);
else if (lengthZ <= 0)
    range = NDRange(lengthX, lengthY);
else
    range = NDRange(lengthX, lengthY, lengthZ);
cl::Event wait;
cl_int run_err = queue.enqueueNDRangeKernel(kernel, NDRange(), range, NullRange, NULL, &wait);
if (run_err != 0)
{
    cout << OpenCL::get_cl_error_string(run_err) << " (" << run_err << ")" << endl;
    system("PAUSE");
}
queue.enqueueReleaseGLObjects(&this->buffersGL);
What could be causing the -9999 error and how can I fix it? Also, there are often big chunks of "dead pixels" that have not been drawn to in the texture...
You enqueue the release of GL buffers, but do not wait for it to complete.
queue.enqueueReleaseGLObjects(&this->buffersGL);
Either get the finish event out of this call (watch out for leaks!), or wait on the command queue to finish all tasks before proceeding past the release of the GL objects. When one thing in a queue depends on another, you are supposed to arrange their ordering yourself.
You also queue a bunch of tasks that depend on the GL objects. Either wait for them to complete (finish the queue), or take their events and feed them to the enqueueReleaseGLObjects call as prerequisites.
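As a rough sketch of that ordering, using the same cl.hpp wrapper objects as the question (this is an assumption illustrating the answer, not the asker's code), the per-frame path could look like:

glFinish();                                         // make sure GL is done with the texture
queue.enqueueAcquireGLObjects(&buffersGL);          // buffersGL as in the question

cl::Event kernelDone;
queue.enqueueNDRangeKernel(kernel, cl::NullRange, range, cl::NullRange,
                           NULL, &kernelDone);

std::vector<cl::Event> releaseAfter(1, kernelDone); // release only after the kernel has finished
queue.enqueueReleaseGLObjects(&buffersGL, &releaseAfter);

queue.finish();                                     // block until the release has completed,
                                                    // so GL can safely use the texture again

queue.finish() is the blunt instrument; the event wait list is the finer-grained alternative described above.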
As an aside:
Using fewer kernels might be a good idea, instead of one per pixel.
Thanks a lot, Yakk! I tried that by first simply using a smaller screen size, and it suddenly worked again! As it turns out, though, the texture I was drawing to was the problem: it was not 600x600 pixels in size, and that's what caused the crash. Apparently OpenCL can draw to pixels that "don't actually exist" a couple of times before crashing. It is still weird behaviour...

Brain Computer Interface P300 Machine Learning

I am currently working on a P300 detection system in C++ (basically, there is a detectable increase in a brain wave when a user sees something they are interested in) using the Emotiv EPOC. The system works, but to improve accuracy I'm attempting to use Wekinator for machine learning, with a support vector machine (SVM).
So for my P300 system I have three stimuli (left, right and forward arrows). My program keeps track of the stimulus index and performs some filtering on the incoming "brain wave", then calculates which index has the highest average area under the curve to determine which stimulus the user is looking at.
For my integration with Wekinator: I have set up Wekinator to receive a custom OSC message with 64 features (the length of the brain wave related to the P300) and set up three parameters with discrete values of 1 or 0. For training, I have been sending the "brain wave" for each stimulus index in a trial and setting the relevant parameters to 0 or 1, then training it and running it. The issue is that when the OSC message is received by the program from Wekinator, it is returning 4 messages rather than just the single most likely one.
Here is the code for the training (and input to Wekinator during run time):
for(int s = 0; s < stimCount; s++){
    for(int i = 0; i < stimIndexes[s].size(); i++) {
        int eegIdx = stimIndexes[s][i];
        ofxOscMessage wek;
        wek.setAddress("/oscCustomFeatures");
        if (eegIdx + winStart + winLen < sig.size()) {
            int winIdx = 0;
            for(int e = eegIdx + winStart; e < eegIdx + winStart + winLen; e++) {
                wek.addFloatArg(sig[e]);
                //stimAvgWins[s][winIdx++] += sig[e];
            }
            validWindowCount[s]++;
        }
        std::cout << "Num args: " << wek.getNumArgs() << std::endl;
        wekinator.sendMessage(wek);
    }
}
Here is the receipt of messages from Wekinator:
if(receiver.hasWaitingMessages()){
    ofxOscMessage msg;
    while(receiver.getNextMessage(&msg)) {
        std::cout << "Wek Args: " << msg.getNumArgs() << std::endl;
        if (msg.getAddress() == "/OSCSynth/params"){
            resultReceived = true;
            if(msg.getArgAsFloat(0) == 1){
                result = 0;
            } else if(msg.getArgAsFloat(1) == 1){
                result = 1;
            } else if(msg.getArgAsFloat(2) == 1){
                result = 2;
            }
            std::cout << "Wek Result: " << result << std::endl;
        }
    }
}
Full code for both is at the following Gist:
https://gist.github.com/cilliand/f716c92933a28b0bcfa4
My main query is basically whether something is wrong with the code: should I send the full "brain wave" for a trial to Wekinator, or should I train Wekinator on different features? Does the code look right, or should it be amended? Is there a way to receive only one OSC message back from Wekinator, based on smaller feature sizes, i.e. 64 rather than 4 x 64 per stimulus or 9 x 64 per stimulus index?

C++ win32 printing to console in fixed timesteps

I am trying to create a function that will allow me to enter the desired frames per second and the maximum frame count, and then have the function "cout" to the console on fixed time steps. I am using Sleep() to avoid busy waiting as well. I seem to make the program sleep longer than it needs to; I think it keeps stalling on the sleep command. Can you help me with this? I am having some trouble understanding time, especially on Windows.
Ultimately I will probably use this timing method to time and animate a simple game, maybe like Pong, or even a simple program with objects that can accelerate. I think I already understand GDI and WASAPI well enough to play sound and show color on the screen, so now I need to understand timing. I have looked for a long time before asking this question on the internet, and I am sure that I am missing something, but I can't quite put my finger on it :(
Here is the code:
#include <windows.h>
#include <iostream>
// In this program I am trying to make a simple function that prints "frame:" and the frame number at fixed time intervals.
// I am trying to make it so that it doesn't do busy waiting.
using namespace std;
void frame(LARGE_INTEGER& T, LARGE_INTEGER& T3, LARGE_INTEGER& DELT, LARGE_INTEGER& DESI, double& framepersec, unsigned long long& count, unsigned long long& maxcount, bool& on, LARGE_INTEGER& mili)
{
    QueryPerformanceCounter(&T3);              // second measurement
    DELT.QuadPart = T3.QuadPart - T.QuadPart;  // ticks elapsed between the two measurements
    if(DELT.QuadPart >= DESI.QuadPart) {       // advancing the count by just one frame (this may cause problems if more than one passes)
        count++;
        cout << "frame: " << count << " !" << endl;
        T.QuadPart = T3.QuadPart;
    }
    if(count > maxcount) { on = false; }       // turning off the loop
    else {
        DESI.QuadPart = T.QuadPart + DESI.QuadPart;  // setting the stop tick
        unsigned long long sleep = (DESI.QuadPart - DELT.QuadPart) / mili.QuadPart;
        cout << sleep << endl;
        Sleep(sleep);                          // sleeping to avoid busy waiting
    }
}
int main()
{
    LARGE_INTEGER T1, T2, Freq, Delta, desired, mil;
    bool loopon = true;                  // keeps the loop flowing until max frames has been reached
    QueryPerformanceFrequency(&Freq);    // getting the number of ticks per second
    mil.QuadPart = Freq.QuadPart / 1000; // getting the number of ticks that occur in a millisecond
    double framespersec;                 // the target number of frames per second
    unsigned long long framecount, maxcount; // to stop the program after a certain amount of frames
    framecount = 0;
    cout << "Hello world! enter the amount of frames per second : " << endl;
    cin >> framespersec;
    cout << "you entered: " << framespersec << " ! how many max frames?" << endl;
    cin >> maxcount;
    cout << "you entered: " << maxcount << " ! now doing the frames !!!" << endl;
    desired.QuadPart = (Freq.QuadPart / framespersec);
    while(loopon == true) {
        frame(T1, T2, Delta, desired, framespersec, framecount, maxcount, loopon, mil);
    }
    cout << "all frames are done!" << endl;
    return 0;
}
The time that you sleep is limited by the frequency of the system clock. The frequency defaults to 64 Hz, so you'll end up seeing sleeps in increments of 16ms. Any sleep that's less than 16ms will be at least 16ms long - it could be longer depending on CPU load. Likewise, a sleep of 20ms will likely be rounded up to 32ms.
You can change this period by calling timeBeginPeriod(...) and timeEndPeriod(...), which can increase sleep accuracy to 1ms. If you have a look at multimedia apps like VLC Player, you'll see that they use these functions to get reliable frame timing. Note that this changes the system wide scheduling rate, so it will affect battery life on laptops.
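A minimal sketch of that usage (an assumed example, not code taken from this answer; link against winmm.lib):

#include <windows.h>
#include <mmsystem.h>   // timeBeginPeriod / timeEndPeriod
#include <iostream>

int main()
{
    timeBeginPeriod(1);              // request 1 ms scheduler granularity

    for (int frame = 0; frame < 100; ++frame) {
        // ... update / render here ...
        Sleep(5);                    // now sleeps roughly 5-6 ms instead of being rounded up to ~16 ms
    }

    timeEndPeriod(1);                // always pair with the matching timeBeginPeriod
    std::cout << "done" << std::endl;
    return 0;
}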
More info:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd757624%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686298%28v=vs.85%29.aspx
Waitable timers are more accurate than Sleep, and also integrate with a GUI message loop better (replace GetMessage with MsgWaitForMultipleObjects). I've used them successfully for graphics timing before.
They won't get you high precision for e.g. controlling serial or network output at sub-millisecond timing, but UI updates are limited by VSYNC anyway.
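For reference, a sketch of a periodic waitable timer (my own illustration of the approach, not code from this answer): it fires every 16 ms, and the wait can be swapped for MsgWaitForMultipleObjects inside a GUI message loop.

#include <windows.h>
#include <iostream>

int main()
{
    // auto-reset timer: each wait consumes one tick
    HANDLE timer = CreateWaitableTimer(NULL, FALSE, NULL);
    if (!timer) return 1;

    LARGE_INTEGER dueTime;
    dueTime.QuadPart = -160000LL;    // first fire in 16 ms (negative = relative time, 100 ns units)
    SetWaitableTimer(timer, &dueTime, 16 /* period in ms */, NULL, NULL, FALSE);

    for (int frame = 0; frame < 100; ++frame) {
        WaitForSingleObject(timer, INFINITE);   // wakes once per period
        std::cout << "frame: " << frame << std::endl;
    }

    CancelWaitableTimer(timer);
    CloseHandle(timer);
    return 0;
}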

SDL Benchmark Sound

I am doing a benchmark project between two graphical libraries (SDL, SFML) for my final CS project. I have it almost finished; however, when I benchmark the speed of playing sounds, it always returns a time taken of 0, no matter how many loops it does. Do you know what's wrong with my code? The sound actually plays, but I should probably use some other approach.
void playSound()
{
    Mix_PlayChannel(-1, sound, 0);
}
void soundBenchmark(int numOfCycles)
{
    int time = SDL_GetTicks(), timeRequired;
    for(int i = 0; i < numOfCycles; i++) playSound();
    timeRequired = SDL_GetTicks() - time;
    cout << "Time required for " << numOfCycles << " cycles: " << timeRequired << " seconds.\n";
}
The function Mix_PlayChannel() does not block the execution of the code. It just sends the data to the sound card (or equivalent) and returns.
You are going to have to remember the channel you used with Mix_PlayChannel(), then periodically check with Mix_Playing() whether that channel is still playing, and look at the time.
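A sketch of what that could look like (an assumption building on the question's code; sound is the question's Mix_Chunk, and SDL_GetTicks() reports milliseconds):

void soundBenchmarkBlocking(int numOfCycles)
{
    Uint32 start = SDL_GetTicks();
    for(int i = 0; i < numOfCycles; i++) {
        int channel = Mix_PlayChannel(-1, sound, 0);   // returns the channel actually used
        if (channel == -1) continue;                   // no free channel, skip this cycle
        while (Mix_Playing(channel)) {                 // wait until playback on that channel ends
            SDL_Delay(1);
        }
    }
    Uint32 elapsed = SDL_GetTicks() - start;
    cout << "Time required for " << numOfCycles << " cycles: " << elapsed << " ms.\n";
}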