OpenCV and MATLAB speed comparison - c++

I've started to use OpenCV with Visual C++ 2010 Express, because it is supposed to be faster than MATLAB.
In order to do a fair comparison between the two, I'm running a program that converts an RGB image to its grayscale equivalent and measures the elapsed time of the color-space conversion.
Using the cvtColor command to do the task in a C++ Release build takes around 5 ms on average. Doing the same operation in MATLAB takes more or less the same average time (the codes are below).
I have already tested both programs, and they work fine.
Does anybody have any idea how I can improve the OpenCV speed?
C++ code.
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>
#include <windows.h>

using namespace cv;
using namespace std;

double PCFreq = 0.0;
__int64 CounterStart = 0;

void StartCounter()
{
    LARGE_INTEGER li;
    if (!QueryPerformanceFrequency(&li))
        cout << "QueryPerformanceFrequency failed!\n";
    PCFreq = double(li.QuadPart) / 1000.0; // counts per millisecond
    QueryPerformanceCounter(&li);
    CounterStart = li.QuadPart;
}

double GetCounter() // elapsed time in milliseconds
{
    LARGE_INTEGER li;
    QueryPerformanceCounter(&li);
    return double(li.QuadPart - CounterStart) / PCFreq;
}

int main()
{
    Mat im = imread("C:/Imagens_CV/circles_rgb.jpg");
    Mat result;

    StartCounter();
    cvtColor(im, result, CV_BGR2GRAY);
    double time = GetCounter();

    cout << "Process time: " << time << endl;
    return 0;
}
MATLAB code
tic
img_gray = rgb2gray(img_rgb);
toc

Color conversion in OpenCV will make extensive use of Intel IPP if it is available at compile time. See modules\imgproc\src\color.cpp. More info from Intel. Note that this code has no OpenMP pragmas or TBB code, so those won't help here.
The exciting bit is that Intel has granted OpenCV the right to use a subset of IPP for free, including these functions. See the third item in the release summary for more info. But you need at least OpenCV 3.0 to get this free functionality; otherwise you need to compile with your own copy of IPP.
In Intel's benchmark chart (not reproduced here), cvtColor, at the far left, clearly does not benefit much, but it gets a little boost. Other functions do much better.
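If you have an IPP-enabled build, you can verify and toggle the IPP dispatch at runtime and measure the difference yourself. Below is a minimal sketch, assuming OpenCV 3.0+ (the cv::ipp helpers do not exist in 2.x) and reusing the image path from the question:

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <iostream>

int main()
{
    cv::Mat im = cv::imread("C:/Imagens_CV/circles_rgb.jpg"), gray;
    std::cout << "IPP version: " << cv::ipp::getIppVersion() << std::endl;

    for (int useIpp = 1; useIpp >= 0; --useIpp)
    {
        cv::ipp::setUseIPP(useIpp != 0); // toggle the IPP code paths
        double t = (double)cv::getTickCount();
        for (int i = 0; i < 100; ++i)
            cv::cvtColor(im, gray, cv::COLOR_BGR2GRAY);
        t = ((double)cv::getTickCount() - t) / cv::getTickFrequency();
        std::cout << (useIpp ? "with" : "without") << " IPP: "
                  << t * 1000.0 / 100 << " ms per call" << std::endl;
    }
    return 0;
}

Averaging over many calls, as above, also smooths out the noise you get from timing a single 5 ms operation.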

If you follow the call to the rgb2gray function in MATLAB (edit rgb2gray.m), you'll find that it eventually calls a private MEX-function imapplymatrixc.mexw64 implemented in C++.
In fact, if you load this shared library into a tool like Dependency Walker, you'll see it depends on tbb.dll, which indicates the function is multi-threaded using the Intel TBB library.
While that doesn't seem to be the case for the color conversion functions, the Image Processing Toolbox does use the Intel IPP library for some of its image arithmetic functions (there is a preference you can set to enable/disable this "hardware optimization": iptsetpref('UseIPPL', true)).
In addition, there is a version of the function that runs on the GPU (CUDA) when using gpuArray input arrays (edit gpuArray>rgb2gray.m). This requires the Parallel Computing Toolbox.
So it is safe to say the function is well optimized!

Related

How to measure time cost of multithreading program on Mac OS using C++?

I'm writing a multithreaded program on Mac OS using C++, and I need to measure its time cost.
Here are some functions I found that may be useful for measuring time:
clock(): for a multithreaded program, it's not accurate.
time(): it can measure time, but only with second precision.
gettimeofday(): I wrote a sample to test it, but got a wrong time measurement.
#include <stdio.h>
#include <iostream>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

using namespace std;

int main()
{
    timeval start, finish;
    gettimeofday(&start, NULL);
    sleep(10); //sleep for 10 sec
    gettimeofday(&finish, NULL);
    cout << (finish.tv_usec - start.tv_usec) / 1000.0 << "ms" << endl;
    return 0;
}
The output is just a few ms, and I don't know why.
Does anyone know any other function to measure time on Mac OS?
Each of the fields wraps around: tv_usec counts microseconds, and when it reaches 1,000,000 it resets to 0 and tv_sec is incremented, just as seconds roll over into minutes.
If you are trying to measure time this way, you need to take all the fields larger than your required minimum resolution into account.
struct timeval has two fields, tv_sec and tv_usec. You're ignoring tv_sec, thereby throwing away the most significant part of the time.
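To illustrate the fix, here is the sample from the question with both fields folded into the elapsed time (a sketch; error handling omitted):

#include <iostream>
#include <sys/time.h>
#include <unistd.h>

int main()
{
    timeval start, finish;
    gettimeofday(&start, NULL);
    sleep(10); // sleep for 10 sec
    gettimeofday(&finish, NULL);

    // Combine seconds and microseconds so nothing is thrown away.
    double elapsed_ms = (finish.tv_sec - start.tv_sec) * 1000.0
                      + (finish.tv_usec - start.tv_usec) / 1000.0;
    std::cout << elapsed_ms << "ms" << std::endl;
    return 0;
}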
The best solution, however, is to use the <chrono> functionality (which of course implies that you're using a C++11-enabled compiler).
Then your code might look like:
#include <iostream>
#include <chrono>
#include <thread>

using namespace std;
using namespace std::chrono;

int main()
{
    auto start = high_resolution_clock::now();
    std::this_thread::sleep_for(10s); //sleep for 10 sec
    cout << duration_cast<milliseconds>(high_resolution_clock::now() - start).count() << "ms\n";
    return 0;
}
Note that I'm also using the neat chrono literal to specify 10s, which is a C++14 feature. If it's not available, just use seconds(10) instead.
Note, however, that the output is not guaranteed to be exactly "10000ms", as the OS is allowed to suspend the thread for longer - see here (the unistd sleep doesn't guarantee exact timing either).

Quick and Efficient Method to Pass Variables from C++ to Matlab

I've developed a C++ program which calculates a set of coordinates (x, y) within a loop. On every iteration I want to send the coordinate to Matlab for further processing, at a rate of about 25 times per second. I have a Matlab function that takes this coordinate and uses it in real time; however, I haven't found an effective way of sending variables quickly from C++ to Matlab.
I've tried using the Matlab engine as described here: Passing Variable from C++ to Matlab (Workspace), except I want this variable to be used in the existing Matlab session and not simply run Matlab commands through C++.
I've also tried writing the C++ coordinate to a binary file and then reading this file in Matlab. This method is very fast, but I'm having problems with the timing between the two languages: with the Matlab code in an infinite loop reading the binary file while the C++ program writes the coordinate to it, Matlab reads values in a very strange order (e.g. Matlab reads 15, 200, 70, 12 when I write the values of i to the file). I suspect this is due to poor timing between each program trying to open and either read or write the file.
C++:
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/opencv.hpp"
#include <iostream>
#include <math.h>
#include <fstream>
#include <stdio.h>
#include <Windows.h>
using namespace cv;
using namespace std;
int main()
{
    int a = 0; // note: the original post was missing this semicolon
    for (int i = 0; i < 100000; ++i)
    {
        a = i;
        std::ofstream ofile("foobar.bin", std::ios::binary);
        ofile.write((char*) &a, sizeof(int));
        ofile.close();
    }
    return 0;
}
Matlab:
A = fopen('foobar.bin');
fread(A)
fclose(A);
Is there a way to quickly and accurately send data between C++ and Matlab by writing to binary OR some other method which I can implement?
Thank you very much!
I cannot provide code samples because it has been a few years since I did this, but I know that you can create a COM object and interface it with Matlab. Here is the link describing how to interface a COM object with Matlab: http://www.mathworks.com/help/matlab/using-com-objects-in-matlab.html
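If you prefer to stay with the binary-file approach from the question, one common way to avoid Matlab seeing half-written or stale values is to write each coordinate to a temporary file and then swap it into place in a single rename step. The following is only a sketch of that idea, not a complete solution; it is Windows-specific (the question already includes Windows.h), and the file names and the publishValue helper are illustrative, not from the original code:

#include <fstream>
#include <Windows.h>

// Illustrative helper: write one value to a temp file, then swap it in,
// so "foobar.bin" always holds either the old or the new complete value.
bool publishValue(int value)
{
    {
        std::ofstream ofile("foobar.tmp", std::ios::binary);
        ofile.write((char*) &value, sizeof(int));
    } // destructor flushes and closes the file before the swap

    return MoveFileExA("foobar.tmp", "foobar.bin",
                       MOVEFILE_REPLACE_EXISTING) != 0;
}

int main()
{
    for (int i = 0; i < 100000; ++i)
        publishValue(i);
    return 0;
}

On the Matlab side you would still need to handle the occasional failed fopen (returning -1) if the swap happens at the exact moment Matlab tries to open the file, so a retry in the read loop is advisable.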

C++ math functions can be used without including the directive "math.h" in VS 2013

I am very curious why I can use the math functions in C++ without including "math.h". I can't find an answer with a Google search.
Here is the simple code I am executing. Everything compiles and runs.
#include <iostream>

using namespace std;

int main()
{
    const float PI = acosf(-1);
    cout << PI << endl;
    return 0;
}
Any standard header is allowed to include any other standard header. In your implementation, <iostream> evidently pulls in the math declarations indirectly.
If you compiled the same code with gcc 4.8, it would complain.
Keep in mind that this is not something to rely on if you want your code to be portable and compilable on different versions of the same compiler or on different compilers.
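For portability, include the header explicitly. A minimal fix of the snippet above, using <cmath> (the C++ spelling of math.h):

#include <cmath>    // declares acosf explicitly instead of relying on <iostream>
#include <iostream>

using namespace std;

int main()
{
    const float PI = acosf(-1); // acos(-1) = pi
    cout << PI << endl;
    return 0;
}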

Cross platform way to prevent high cpu usage in while loop (Without boost)?

I have a server that I want to run, and it uses a cross-platform library that only gives me a tick() function to call:
int main()
{
    Inst inst;
    while (true)
    {
        inst.tick();
    }
}
I need to lower the CPU usage so that it doesn't constantly take up a whole core.
Is there a simple way to do this without boost?
Thanks
You can sleep between iterations using std::this_thread::sleep_for from the standard <thread> and <chrono> headers:

#include <iostream>
#include <thread>
#include <chrono>

using namespace std;

int main()
{
    // 5 seconds
    auto duration = chrono::duration<float>(5);
    this_thread::sleep_for(duration);
    return 0;
}
However, even if this code is completely fine, I can't seem to compile it with the MinGW compiler provided with Code::Blocks.
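Applied to the loop from the question, a minimal sketch might look like the following; the 10 ms interval is an assumption you would tune against how responsive tick() has to be:

#include <thread>
#include <chrono>

struct Inst { void tick() {} }; // stand-in for the library class from the question

int main()
{
    Inst inst;
    while (true)
    {
        inst.tick();
        // Yield the core between ticks instead of busy-spinning.
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}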

Armadillo element wise multiplication speed

Does element-wise multiplication (%) speed in Armadillo depend on whether LAPACK/BLAS is installed? I'm currently running Armadillo without them installed and the speed is awful.
OK, here is the simplest code, which takes an eternity to calculate:
#include <iostream>
#include "conio.h"
#include "armadillo"

using namespace arma;
using namespace std;

int main(int argc, char** argv)
{
    int n = 250;
    mat X = ones(n, n);
    mat quan;
    for (int xi = 1; xi <= 256; xi++)
    {
        quan = exp(X) % exp(X);
    }
    getch();
    return 0;
}
Make sure you have optimisation flags enabled in your compiler settings (e.g. in GCC or Clang, use -O2 or -O3). Armadillo makes use of template metaprogramming, and like any C++ template library it absolutely requires optimisation to be enabled in the compiler to be effective. The same applies to other C++ template libraries such as Boost.
Why are you calculating exp(X) twice? You're not benchmarking element-wise multiplication; you're mostly benchmarking exp(). Also, why are you not using expmat() or expmat_sym()?
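To isolate the element-wise multiplication itself, hoist exp(X) out of the loop and time just the % operation. A minimal sketch using Armadillo's wall_clock timer (the dimensions and the 256-iteration count mirror the question):

#include <iostream>
#include <armadillo>

using namespace arma;

int main()
{
    const int n = 250;
    mat X = ones<mat>(n, n);
    mat E = exp(X); // compute the expensive part once, outside the loop
    mat quan;

    wall_clock timer;
    timer.tic();
    for (int xi = 1; xi <= 256; xi++)
    {
        quan = E % E; // element-wise multiplication only
    }
    std::cout << "elapsed: " << timer.toc() << " s" << std::endl;
    return 0;
}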