The most optimized way of calculating distance between data in C++

I have n points in a 2D plane and want to calculate the distance between every pair of points in C++. The position of the m-th point in the plane is (x(m), y(m)). The points move over time, and there are 10^5 time steps.
I wrote the code below, but since n is large (5000) and the distances must be recomputed 10^5 times, I am looking for the most optimized way to do this. Could anyone tell me the least time-consuming way?
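// Indices are 0-based here, as is usual in C++.
for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n; ++j) {
        if (i > j)
            r(i,j) = r(j,i);  // symmetric entry, already computed in row j
        else
            // note: '^' is XOR in C++, not a power, so square with pow()
            r(i,j) = sqrt(pow(x(i)-x(j), 2) + pow(y(i)-y(j), 2));
    }
}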
I know that in Matlab I can do this with the bsxfun function. I would also like to know which is faster at calculating the distances: Matlab or C++?

Regarding Matlab, you also have pdist, which does exactly that (but is not so fast), and you should also read this.
About comparing Matlab and C, first read this and this. Also keep in mind that Matlab, as a desktop program, requires you to know not only a generally efficient way to implement your code but also the right way to do it in Matlab. One example is the difference between functions: built-in functions are written in FORTRAN or C and run much faster than non-built-in functions. To check whether a function is built-in, type:
which function_name
and check if you see "built-in" at the start of the output:
built-in (C:\Program Files\MATLAB\...)
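For the C++ side of the question, here is a minimal sketch of the symmetric computation the asker describes. It assumes the coordinates live in two std::vector<double> of equal length and stores the result in a flat row-major buffer. The function name pairwiseDistances and this storage layout are illustrative choices, not something given in the question.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns a flat row-major n*n matrix of Euclidean distances.
// Assumes x and y have the same length (hypothetical input layout).
std::vector<double> pairwiseDistances(const std::vector<double>& x,
                                      const std::vector<double>& y) {
    const std::size_t n = x.size();
    std::vector<double> r(n * n, 0.0);  // diagonal stays 0
    for (std::size_t i = 0; i < n; ++i) {
        // Visit only the upper triangle and mirror each value.
        for (std::size_t j = i + 1; j < n; ++j) {
            const double d = std::hypot(x[i] - x[j], y[i] - y[j]);
            r[i * n + j] = d;
            r[j * n + i] = d;
        }
    }
    return r;
}
```

The distance between points i and j is then read as r[i * n + j]. If only comparisons between distances are needed, storing squared distances would avoid the square root entirely.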

Related

Eigen: Efficient equivalent to MATLAB's changem()?

I need to perform an operation on an Eigen VectorXi that is equivalent to MATLAB's changem():
http://www.mathworks.com/help/map/ref/changem.html
At the moment, the way I am doing this is looping over the values in the array and performing the remapping with a switch/case block. I am guessing this is not particularly efficient.
Is there a fast way to do this with Eigen? Speed is critical for my application.
Switch / case will be particularly slow and inflexible.
changem takes a matrix and two vectors of values, new and old. If an entry is found in the old list, it is replaced by the corresponding entry in the new list. So it's inherently going to be rather slow: you need to pass over the entire matrix, search the old list, and, if an entry is found, replace it with the corresponding value from the new list.
How can you speed it up? First, don't hardcode as a switch / case. A modern compiler will possibly optimise to a loop rather than lots of jumps, but I wouldn't guarantee it. And the approach is inflexible.
Secondly, you can sort the "old" vector and use a binary search rather than a linear one. That will only help significantly if the old vector is long.
Thirdly, you can take advantage of what you know about the matrix. Are the old values constrained to lie in certain regions? Is there one value which is overwhelmingly likely and can be tested for first? Can you quickly exclude some values as not allowed in the old list (too big, too small, not integral)?
Are the old values integers and can you use indexing? Or generalise that to hashing. That would be even faster than a binary search, though with more overhead for hashing.
Can you solve the problem another way and keep an index of matrix xy co-ordinates by value?
There are lots of approaches. But simply implement the Matlab function naively in C as the first step. It might well be fast enough.
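The hashing idea above can be sketched as follows. This is a minimal illustration on a plain std::vector<int> rather than an Eigen type, so that it stays self-contained. The function name remapValues is hypothetical, and the argument order (data, new values, old values) is only chosen to mirror the description of changem above.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Replaces every element of data that appears in oldVals with the
// entry at the same position in newVals. Other elements are untouched.
void remapValues(std::vector<int>& data,
                 const std::vector<int>& newVals,
                 const std::vector<int>& oldVals) {
    // Build the old -> new lookup once; average O(1) per query afterwards.
    std::unordered_map<int, int> lookup;
    lookup.reserve(oldVals.size());
    for (std::size_t k = 0; k < oldVals.size() && k < newVals.size(); ++k) {
        lookup.emplace(oldVals[k], newVals[k]);  // first occurrence wins
    }
    // Single pass over the data.
    for (int& v : data) {
        const auto it = lookup.find(v);
        if (it != lookup.end()) v = it->second;
    }
}
```

The same loop should carry over to an Eigen::VectorXi by iterating over its coefficients. If the old values are small non-negative integers, a plain array indexed by value would likely be faster than the hash map.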

How to handle very large matrices (e.g. 1000000 by 1000000)

My question is very general, and it is not a duplicate.
When we declare something like int mat[1000000][1000000];
it will certainly give an error saying the matrix size is too large.
I have seen many problems on competitive programming websites where we seemingly need to declare a 2D matrix with 10^6 rows and columns. I know there is always some trick to reduce the matrix size.
So I just want to ask: what are the possible ways or tricks to minimize the size in such cases? I mean, which types of algorithms are generally required to solve such problems, DP or something else?
In DP, if the current row depends only on the previous row, you can use int mat[2][1000000];. After calculating the current row, you can immediately discard the previous row and swap the roles of current and previous.
Sometimes it is possible to use std::map instead of a 2D array.
I have encountered many such questions in programming contests, and the solution differs from case to case, so if you mention a specific case, I can possibly give you a more targeted solution.
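As an illustration of the two-row trick, here is a sketch that computes the length of the longest common subsequence of two strings. The LCS problem is my choice of example and does not come from the question. It fits because each DP row depends only on the previous one, so two rows are enough instead of a full table.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Length of the longest common subsequence, using O(|b|) memory
// instead of a full (|a|+1) x (|b|+1) table.
std::size_t lcsLength(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1, 0);
    std::vector<std::size_t> curr(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1])
                curr[j] = prev[j - 1] + 1;
            else
                curr[j] = std::max(prev[j], curr[j - 1]);
        }
        // The finished row becomes "previous"; the old one is reused.
        std::swap(prev, curr);
    }
    return prev[b.size()];
}
```

Swapping the two vectors only exchanges their internal buffers, so no row is copied.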
That depends very much on the specific task. There is no universal "trick" that will always work. You'll have to look for something in the particular problem that allows you to solve it in a different way.
That said, if I could really see no other way, I'd start thinking about how many elements of that matrix will really be non-zero (perhaps I can use a sparse array or a map (dictionary) instead). Or maybe I don't need to store all the elements in memory, but can instead recalculate them every time I need them.
At any rate, a matrix that large (or any kind of fake representation of it) will NOT be useful. Not just because you don't have enough memory, but also because filling such an array with data will take anywhere from several hours to many months. That should be your first concern: figuring out how to solve the task with less data and fewer computations. Once you figure that out, you'll also see what data structure is appropriate.
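The sparse-storage idea mentioned in this answer could look roughly like the sketch below, which keeps only non-zero cells in a std::map keyed by (row, column). The class name SparseMatrix and its interface are hypothetical. Whether this helps depends entirely on how few cells a given problem actually touches.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// Stores only non-zero cells; every other cell reads as 0.
class SparseMatrix {
public:
    using Key = std::pair<std::int64_t, std::int64_t>;

    void set(std::int64_t row, std::int64_t col, int value) {
        const Key key(row, col);
        if (value == 0)
            cells_.erase(key);  // keep the map free of explicit zeros
        else
            cells_[key] = value;
    }

    int get(std::int64_t row, std::int64_t col) const {
        const auto it = cells_.find(Key(row, col));
        return it == cells_.end() ? 0 : it->second;
    }

    std::size_t nonZeroCount() const { return cells_.size(); }

private:
    std::map<Key, int> cells_;
};
```

Memory use grows with the number of stored cells rather than with the nominal 10^6 by 10^6 size, at the cost of a logarithmic lookup per access.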

Finding the number of local maxima/local minima (range) from a vector in c++

The task here is to find the number of local maxima (as ranges) or minima in a vector. I know how to find a local maximum or minimum that is a single point on a graph; however, the local maxima are now clustered together in the vector.
To give a clearer idea, plotting out the values from the vector will produce something similar to this:
In simpler terms, I want to find the number of peaks. In this case, 6. However, the peaks are not single points but ranges of values. How can I find the number of peaks (ranges of local maxima) in the vector?
Pseudocode or code examples would be greatly appreciated; if not, suggestions are welcome too. I am using Visual Studio C++, along with QWT, QT and OpenCV for this project.
I think it should not be that difficult. Just scan all the values in order: whenever a value reaches the threshold level, start a range, and when the values drop below the threshold again, end that range.
You will need to filter out the ranges that are too small.
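A minimal sketch of that scan is shown below. The threshold and the minimum range width are assumed to be supplied by the caller, since the answer does not say how to choose them, and the name countPeakRanges is illustrative.

```cpp
#include <cstddef>
#include <vector>

// Counts maximal runs of consecutive values that are >= threshold
// and at least minWidth samples long.
std::size_t countPeakRanges(const std::vector<double>& values,
                            double threshold,
                            std::size_t minWidth) {
    std::size_t peaks = 0;
    std::size_t runLength = 0;  // length of the range currently open
    for (double v : values) {
        if (v >= threshold) {
            ++runLength;  // start or extend a range
        } else {
            // The range (if any) just ended; keep it only if wide enough.
            if (runLength > 0 && runLength >= minWidth) ++peaks;
            runLength = 0;
        }
    }
    // A range may still be open at the end of the vector.
    if (runLength > 0 && runLength >= minWidth) ++peaks;
    return peaks;
}
```

With noisy data, a single threshold can split one peak into several short runs, so smoothing the vector first or using separate enter and exit thresholds may be worth trying.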

Efficient Longest arithmetic progression for a set of linear Points

The longest arithmetic progression of a set of numbers {a_1, a_2, a_3, ..., a_n} is defined as a subset {b_1, b_2, b_3, ..., b_n} such that b_{i+1} - b_i is constant.
I would like to extend this problem to a set of two-dimensional points lying on a straight line.
Let's define Dist(P1, P2), the distance between two points P1(X1, Y1) and P2(X2, Y2) on a line, as
Dist(P1, P2) = Dist((X1, Y1), (X2, Y2)) = (X2 - X1)^2 + (Y2 - Y1)^2
Now, for a given set of points, I need to find the longest arithmetic progression such that Dist(P_i, P_{i+1}) is constant, assuming they all lie on the same line (m and C are constant).
I researched a bit but could not find an algorithm that is better than O(n^2).
In fact, what I currently do is maintain a dictionary, say
DistDict = dict()
and say the points are defined in a list as
Points = [(X1,Y1),(X2,Y2),....]
then this is what I do:
for i, pi in enumerate(Points):
    for pj in Points[i+1:]:
        DistDict.setdefault(dist(pi, pj), set()).add((pi, pj))
So I end up with a dictionary of point pairs grouped by equal distance, and the only thing left to do is scan through it to find the longest set.
I feel there ought to be a better solution, but somehow I can't figure one out. I have also seen a couple of similar older SO posts, but none that gives something more efficient than O(n^2). Is this somehow an NP-hard problem for which we can never have something better, or if not, what approach could be taken?
Please note I came across a post which claims about an efficient divide and conquer algorithm but couldn't make any head or tail out of it.
Any help in this regard?
Note*** I am tagging this Python because I understand Python better than maybe Matlab or Ruby. C/C++/Java is also fine as I am somewhat proficient in these too :-)
To sum things up: as @TonyK pointed out, if you assume that the points lie on a straight line, you can reduce it to the one-dimensional case that was discussed extensively here already. The solution uses Fast Fourier Transforms, as mentioned by @YochaiTimmer.
Additional note: the problem is almost certainly not NP-hard. It has a polynomial-time solution (your O(n^2) approach already is one, and the FFT-based approach is claimed to run in O(n log n)), so NP-hardness would imply P = NP.
You could study the Fast Fourier Transform methods for multiplication, which run in O(N log N).
You might be able to do something similar with your problem.
Firstly, your definition of distance is wrong. You have to take the square root. Secondly, if you know that all the points lie on a straight line, you can simply ignore the y-coordinates (unless the line is vertical) or the x-coordinates (unless the line is horizontal). Then it reduces to the problem in your first paragraph.
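The reduction to one dimension can be sketched as below. Note that this is the straightforward O(n^2) dynamic programme on the projected coordinates, not the FFT-based method mentioned in the other answers. It assumes integer coordinates and that all points really are collinear, and the function names are illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

using Point = std::pair<long long, long long>;

// Longest arithmetic progression among distinct 1D values, O(n^2) time.
int longestProgression1D(std::vector<long long> v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
    const std::size_t n = v.size();
    if (n <= 2) return static_cast<int>(n);
    // dp[j][d] = length of the longest progression ending at v[j] with step d.
    std::vector<std::unordered_map<long long, int>> dp(n);
    int best = 2;
    for (std::size_t j = 1; j < n; ++j) {
        for (std::size_t i = 0; i < j; ++i) {
            const long long d = v[j] - v[i];
            const auto it = dp[i].find(d);
            const int len = (it == dp[i].end()) ? 2 : it->second + 1;
            dp[j][d] = len;
            best = std::max(best, len);
        }
    }
    return best;
}

// Projects collinear points onto x (or onto y if the line is vertical).
int longestProgressionOnLine(const std::vector<Point>& pts) {
    if (pts.empty()) return 0;
    const long long x0 = pts.front().first;
    const bool vertical = std::all_of(
        pts.begin(), pts.end(),
        [x0](const Point& p) { return p.first == x0; });
    std::vector<long long> coords;
    coords.reserve(pts.size());
    for (const Point& p : pts) coords.push_back(vertical ? p.second : p.first);
    return longestProgression1D(std::move(coords));
}
```

Equal spacing along the line corresponds to equal spacing in the projected coordinate, which is why dropping one coordinate is enough. The dp table can hold up to n^2 entries in total, so this sketch suits moderate n only.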

Image Arithmetic functions in C++

I'm trying to find or write a function that performs the same operation as imlincomb(). However, I am having trouble finding such a function in C++ that does not rely on Matlab API functions, other than in the Intel Performance Primitives library, and I don't really want to purchase a license for it unless my application really needs it. What would be an easy way to implement it, or are there any standard functions that make the job a lot easier?
Thanks in advance.
There's definitely nothing of the sort in any standard C++ package. You might be able to use something in LAPACK, but I think you'd be better off writing your own. It's a fairly simple function: each output pixel is independent and depends only on the input pixels at the same coordinates. In pseudocode:
for each row y in [0, height-1]
    for each column x in [0, width-1]
        for each color channel c in (R, G, B)
            output[y][x][c] = 0
            for each input i
                output[y][x][c] += weight[i] * input[i][y][x][c]
Of course, the exact formulation depends on how exactly your images are stored (3D array, 2D array, or 1D array, and be careful about the order of your dimensions!).
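As one possible concrete form of the pseudocode above, the sketch below assumes each image is stored as a flat 1D buffer of doubles with identical dimensions and channel order, so the three nested loops collapse into a single loop over samples. The function name linearCombination is hypothetical. The sketch does not round or clamp the result to an 8-bit range, which you would need to add for integer pixel types.

```cpp
#include <cstddef>
#include <vector>

// out[k] = sum over i of weights[i] * images[i][k], for every sample k.
// Each image is a flat buffer (rows, columns and channels interleaved).
std::vector<double> linearCombination(
        const std::vector<std::vector<double>>& images,
        const std::vector<double>& weights) {
    if (images.empty()) return {};
    const std::size_t samples = images.front().size();
    std::vector<double> out(samples, 0.0);
    for (std::size_t i = 0; i < images.size() && i < weights.size(); ++i) {
        // Guard against a shorter buffer instead of reading out of bounds.
        for (std::size_t k = 0; k < samples && k < images[i].size(); ++k) {
            out[k] += weights[i] * images[i][k];
        }
    }
    return out;
}
```

Iterating image by image in the outer loop keeps each inner loop a simple contiguous pass, which compilers can usually vectorize.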