What is the complexity of the Viola-Jones algorithm, expressed in a form like O(log N)?
Even though it's a pretty simple algorithm, there is no concrete information about its complexity.
When we talk about the complexity of the Viola-Jones algorithm, we need to remember the steps of the algorithm.
According to the original paper by Paul Viola and Michael Jones, the algorithm contains four main steps:
Haar Feature Selection
Creating an Integral Image
Adaboost Training
Cascading Classifiers
The complexity of the first step is O(1) because the decision of which Haar Feature to choose is not related to the input.
The complexity of the second step is O(N) because in this step we go over the image matrix. As you know, an Integral Image lets us compute the sum of the pixels inside any particular feature in O(1). However, creating the Integral Image costs O(N), because we visit each pixel in the original matrix once and write a new value into the new matrix. The value of each point in the new matrix is the sum of all pixels above and to the left of, and including, the target pixel in the old matrix.
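For illustration, here is a minimal C++ sketch of that construction (the function name and types are my own, not from the paper):

#include <vector>

// Builds the Integral Image in O(N), where N is the number of pixels.
// ii[y][x] = sum of all img[y'][x'] with y' <= y and x' <= x.
std::vector<std::vector<int>> integralImage(const std::vector<std::vector<int>>& img) {
    const std::size_t rows = img.size();
    const std::size_t cols = rows ? img[0].size() : 0;
    std::vector<std::vector<int>> ii(rows, std::vector<int>(cols, 0));
    for (std::size_t y = 0; y < rows; ++y)
        for (std::size_t x = 0; x < cols; ++x)
            ii[y][x] = img[y][x]
                     + (y > 0 ? ii[y - 1][x] : 0)               // sum above
                     + (x > 0 ? ii[y][x - 1] : 0)               // sum to the left
                     - (y > 0 && x > 0 ? ii[y - 1][x - 1] : 0); // subtract double count
    return ii;
}

// Once ii is built, the sum over any rectangle (y0..y1, x0..x1) takes four
// lookups: ii[y1][x1] - ii[y0-1][x1] - ii[y1][x0-1] + ii[y0-1][x0-1]
// (dropping terms with negative indices), which gives the O(1) feature
// evaluation mentioned above.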
The complexity of the third step is O(N D^2), where D is the number of features (look here for why).
The complexity of the fourth step is less than O(N) (look here for why).
To sum up, as we can calculate from each stage, the complexity of the Viola-Jones algorithm is O(N).
It's linear (O(N)) in the number (N) of pixels of the input image. All Haar image features are computed in constant time upon the integral image, and computing the latter requires one pass over the input image.
I am looking for an efficient way to perform nearest neighbor searches within a specified radius in a two-dimensional plane. According to Wikipedia, space-partitioning data structures such as:
k-d trees,
r-trees,
octrees,
quadtrees,
cover trees,
metric trees,
BBD trees,
locality-sensitive hashing,
and bins,
are often used for organizing points in a multi-dimensional space and can provide O(log n) performance for search and insert operations. However, in my case, the points in the two-dimensional plane are moving at each iteration, so I need to update the tree accordingly. Rebuilding the tree from scratch at each iteration seems easier, but I would like to avoid it if possible because the points only move slightly between iterations.
I have read that k-d trees are not naturally balanced, which could be an issue in my case. R-trees are better suited for storing rectangles. Bin algorithms, on the other hand, are easy to implement and provide near-linear search performance within local bins.
I am working on an autonomous agent simulation where 1,000,000 agents are rendered on the GPU, and the CPU is responsible for computing the next movement of each agent. Each agent is influenced by other agents within its line of sight, or in other words, other agents within a circular sector of angle θ and radius r. So here are the specific requirements for my use case:
Search space is a 2-d plane,
Each object is a point identified by its x, y coordinates.
All points move by a small amount at each iteration.
Cannot afford any O(n^2) algorithms.
Search within a radius (circular sector)
The search must return all candidates within the search surface.
Given these considerations, what would be the best algorithms for my use case?
I think you could potentially solve this with a sort of scheduling approach. If you know that no object will move more than distance d in each iteration, then the distance between any pair of objects changes by at most 2d per iteration. So if you want to know which objects are within X distance of each other on each iteration, then given the distances between all objects, the only potential pairs that could change their neighbor status on the next iteration are those with a distance between X-2d and X+2d. The iteration after that it would be X-4d and X+4d, and so on.
So I'm thinking that you could do an initial distance calculation between all pairs of objects, and then based on each difference you can create an NxN matrix where the value in each cell is the iteration at which you will need to re-check that pair's distance. Then when you re-check a pair during that iteration, you update its cell with the next iteration at which it needs to be checked.
The only problem is whether calculating an initial NxN distance matrix is feasible.
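To make the idea concrete, here is a hypothetical C++ sketch of the initial scheduling pass (all names are mine; it builds exactly the NxN matrix whose feasibility is in question):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { float x, y; };

// For each pair, compute the earliest iteration at which its neighbor status
// (distance <= X) could change, given that each object moves at most d per
// iteration, so any pairwise distance changes by at most 2*d per iteration.
std::vector<std::vector<std::size_t>> scheduleRechecks(
        const std::vector<Point>& pts, float X, float d, std::size_t now) {
    const std::size_t n = pts.size();
    std::vector<std::vector<std::size_t>> nextCheck(n, std::vector<std::size_t>(n, now));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const float dx = pts[i].x - pts[j].x;
            const float dy = pts[i].y - pts[j].y;
            const float dist = std::sqrt(dx * dx + dy * dy);
            // Iterations needed before |dist - X| could possibly shrink to zero.
            const auto k = static_cast<std::size_t>(
                std::ceil(std::fabs(dist - X) / (2.0f * d)));
            nextCheck[i][j] = nextCheck[j][i] = now + std::max<std::size_t>(k, 1);
        }
    }
    return nextCheck;
}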
I'm working on implementing a ModelClass for any 3D model in my DirectX 11/12 pipeline.
My specific problem lies within calculating the min and max for the BoundingBox structure I wish to use as a member of the ModelClass.
I have two approaches to calculating them.
Approach 1.
As each vertex is read from file, store the current min x, y, z and max x, y, z, and check each vertex as it is loaded against the current min/max x, y, z.
Approach 2.
After all the vertices have been loaded, sort them by x, then by y, then by z, taking the lowest and highest value from each pass.
Which Approach would you recommend and why?
Approach 1
Time complexity is O(n) and memory complexity is O(1).
It is simple to implement.
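A minimal C++ sketch of Approach 1 (the Vertex and BoundingBox types here are illustrative, not your actual ModelClass members):

#include <algorithm>
#include <limits>
#include <vector>

struct Vertex { float x, y, z; };
struct BoundingBox { float minX, minY, minZ, maxX, maxY, maxZ; };

// One pass over the vertices: O(n) time, O(1) extra memory. In a real
// loader you would fold each vertex in as it is read from the file.
BoundingBox computeBounds(const std::vector<Vertex>& vertices) {
    const float inf = std::numeric_limits<float>::infinity();
    BoundingBox b{ inf, inf, inf, -inf, -inf, -inf };
    for (const Vertex& v : vertices) {
        b.minX = std::min(b.minX, v.x);  b.maxX = std::max(b.maxX, v.x);
        b.minY = std::min(b.minY, v.y);  b.maxY = std::max(b.maxY, v.y);
        b.minZ = std::min(b.minZ, v.z);  b.maxZ = std::max(b.maxZ, v.z);
    }
    return b;
}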
Approach 2
Time complexity is O(n log n) and memory complexity is at least linear if you make a copy of the arrays or use merge sort, or close to O(1) if you use an in-place sorting algorithm like quicksort (which still uses O(log n) stack space for recursion).
This has to be done three times, once for each dimension.
All in all Approach 1 is best in all scenarios I can think of.
Sorting is generally not a cheap operation, especially as your models get larger. Therefore it seems to me that Approach 1 is more efficient, but if unsure I suggest measuring to see which one takes longer.
If you are using a library like Assimp, I believe the library takes care of bounding boxes, but this might not be an option if you are creating the pipeline as a learning opportunity.
I want to implement a system where, given an input image, it returns a reasonably similar one (approximation is acceptable) from a dataset of about 50K images. Time performance is crucial.
I'll use a parallel version of SIFT to obtain a matrix of descriptors D. I've read about the Fisher Vector (FV) (VLfeat and Yael implementations) as a learned and much more precise alternative to Bag of Features (BoF) for representing D as a single vector v.
My questions are:
What distance is used for FVs? Is it the Euclidean one? In that case I would use LSH with Euclidean distance to quickly find approximate nearest neighbors of FVs.
Is there any other FV implementation in C++ that is efficient in terms of time?
Another method you could take into consideration is VLAD encoding (basically a non-probabilistic version of FV, replacing GMMs with k-means clustering).
The implementation differs only slightly from standard vector quantisation, but in my experiments it showed much better performance with a significantly smaller codebook size.
It uses Euclidean distance to find the nearest codebook vector, but instead of just counting elements, it accumulates each element's residual.
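For illustration, here is a minimal C++ sketch of VLAD encoding along those lines (this is my own simplified code, not the VLfeat API):

#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// descriptors: n vectors of dimension d; codebook: K k-means centroids of
// dimension d. Returns the K*d-dimensional VLAD vector.
std::vector<float> vladEncode(const std::vector<std::vector<float>>& descriptors,
                              const std::vector<std::vector<float>>& codebook) {
    const std::size_t K = codebook.size(), d = codebook[0].size();
    std::vector<float> v(K * d, 0.0f);
    for (const auto& x : descriptors) {
        // Find the nearest codebook vector under Euclidean distance.
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t k = 0; k < K; ++k) {
            float dist = 0.0f;
            for (std::size_t j = 0; j < d; ++j) {
                const float diff = x[j] - codebook[k][j];
                dist += diff * diff;
            }
            if (dist < bestDist) { bestDist = dist; best = k; }
        }
        // Accumulate the residual instead of just counting the assignment.
        for (std::size_t j = 0; j < d; ++j)
            v[best * d + j] += x[j] - codebook[best][j];
    }
    // L2-normalize the concatenated vector.
    float norm = 0.0f;
    for (float t : v) norm += t * t;
    norm = std::sqrt(norm);
    if (norm > 0.0f)
        for (float& t : v) t /= norm;
    return v;
}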
An example for image search: Link
FV / VLAD paper: Paper
I have an input array A
A[0], A[1], ... , A[N-1]
I want a function Max(T, A) which returns B, representing the max value of A over the previous moving window of size T, where
B[i+T] = max(A[i], A[i+1], ..., A[i+T])
By using a max heap to keep track of the max value in the current moving window A[i] to A[i+T], this algorithm yields O(N log T) worst case.
I would like to know if there is any better algorithm, maybe an O(N) algorithm.
O(N) is possible using a deque data structure. It holds pairs (Value; Index).
At every step:
if (!Deque.Empty) and (Deque.Head.Index <= CurrentIndex - T) then
  Deque.ExtractHead;
//Head is too old, it is leaving the window
while (!Deque.Empty) and (Deque.Tail.Value <= CurrentValue) do
  Deque.ExtractTail;
//remove elements that have no chance to become the maximum in the window
Deque.AddTail(CurrentValue, CurrentIndex);
CurrentMax = Deque.Head.Value
//Head value is the maximum in the current window
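For reference, here is a runnable C++ version of the same idea (names are illustrative):

#include <cstddef>
#include <deque>
#include <vector>

// Returns out[i] = max(A[i], ..., A[i+T-1]) for every full window of size T.
// Each index enters and leaves the deque at most once, so the total cost is O(N).
std::vector<int> slidingWindowMax(const std::vector<int>& A, std::size_t T) {
    std::deque<std::size_t> dq; // indices; their values decrease from head to tail
    std::vector<int> out;
    for (std::size_t i = 0; i < A.size(); ++i) {
        if (!dq.empty() && dq.front() + T <= i)
            dq.pop_front();                 // head has left the window
        while (!dq.empty() && A[dq.back()] <= A[i])
            dq.pop_back();                  // dominated: can never be the window max
        dq.push_back(i);
        if (i + 1 >= T)
            out.push_back(A[dq.front()]);   // head is the max of the current window
    }
    return out;
}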
It's called RMQ (range minimum query); your case is the symmetric range maximum. Actually, I once wrote an article about that (with C++ code). See http://attiix.com/2011/08/22/4-ways-to-solve-%C2%B11-rmq/
Or you may prefer Wikipedia: Range Minimum Query.
After the preparation, you can get the max of any given range in O(1).
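As one standard example of such a preparation (a generic technique, not the code from the linked article), a sparse table gives O(N log N) preprocessing and O(1) range-max queries:

#include <algorithm>
#include <vector>

struct SparseTable {
    std::vector<std::vector<int>> table; // table[j][i] = max of A[i .. i + 2^j - 1]
    std::vector<int> log2floor;

    explicit SparseTable(const std::vector<int>& A) {
        const int n = static_cast<int>(A.size());
        log2floor.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i) log2floor[i] = log2floor[i / 2] + 1;
        table.assign(log2floor[n] + 1, std::vector<int>(n));
        table[0] = A;
        for (std::size_t j = 1; j < table.size(); ++j)
            for (int i = 0; i + (1 << j) <= n; ++i)
                table[j][i] = std::max(table[j - 1][i],
                                       table[j - 1][i + (1 << (j - 1))]);
    }

    // Max of A[l .. r], inclusive, in O(1): two overlapping power-of-two blocks.
    int query(int l, int r) const {
        const int j = log2floor[r - l + 1];
        return std::max(table[j][l], table[j][r - (1 << j) + 1]);
    }
};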
There is a sub-field in image processing called Mathematical Morphology. The operation you are implementing is a core concept in this field, called dilation. Obviously, this operation has been studied extensively and we know how to implement it very efficiently.
The most efficient algorithm for this problem was proposed in 1992 and 1993, independently by van Herk, and Gil and Werman. This algorithm needs exactly 3 comparisons per sample, independently of the size of T.
Some years later, Gil and Kimmel further refined the algorithm to need only 2.5 comparisons per sample. Though the increased complexity of the method might offset the fewer comparisons (I find that more complex code runs more slowly). I have never implemented this variant.
The HGW algorithm, as it's called, needs two intermediate buffers of the same size as the input. For ridiculously large inputs (billions of samples), you could split up the data into chunks and process it chunk-wise.
In short, you walk through the data forward, computing the cumulative max over chunks of size T. You do the same walking backward. Each of these requires one comparison per sample. Finally, the result is the maximum of one value from each of these two temporary arrays. For data locality, you can do the two passes over the input at the same time.
I guess you could even do a running version, where the temporary arrays are of length 2*T, but that would be more complex to implement.
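For illustration, here is a minimal C++ sketch of the two-pass scheme described above (my own naming, not the authors' reference code; assumes T >= 1):

#include <algorithm>
#include <cstddef>
#include <vector>

// HGW-style max filter: out[j] = max(in[j], ..., in[j+T-1]). Roughly three
// comparisons per sample, independent of T.
std::vector<int> hgwMaxFilter(const std::vector<int>& in, std::size_t T) {
    const std::size_t n = in.size();
    std::vector<int> prefix(n), suffix(n);
    std::vector<int> out(n >= T ? n - T + 1 : 0);

    // Forward pass: cumulative max within each chunk of size T.
    for (std::size_t i = 0; i < n; ++i)
        prefix[i] = (i % T == 0) ? in[i] : std::max(prefix[i - 1], in[i]);

    // Backward pass: cumulative max from i to the end of its chunk.
    for (std::size_t i = n; i-- > 0; ) {
        const bool chunkEnd = (i % T == T - 1) || (i == n - 1);
        suffix[i] = chunkEnd ? in[i] : std::max(suffix[i + 1], in[i]);
    }

    // Every window spans at most two chunks, so its max is the max of one
    // value from each temporary array.
    for (std::size_t j = 0; j + T <= n; ++j)
        out[j] = std::max(suffix[j], prefix[j + T - 1]);

    return out;
}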
van Herk, "A fast algorithm for local minimum and maximum filters on rectangular and octagonal kernels", Pattern Recognition Letters 13(7):517-521, 1992 (doi)
Gil, Werman, "Computing 2-D min, median, and max filters", IEEE Transactions on Pattern Analysis and Machine Intelligence 15(5):504-507, 1993 (doi)
Gil, Kimmel, "Efficient dilation, erosion, opening, and closing algorithms", IEEE Transactions on Pattern Analysis and Machine Intelligence 24(12):1606-1617, 2002 (doi)
(Note: cross-posted from this related question on Code Review.)
I am implementing the Good Features To Track/Shi-Tomasi corner detection algorithm on CUDA and need to find a way to parallelize the following part of the algorithm:
I start with an array of points obtained from an image sorted according to a certain intensity value (an eigenvalue of a previous calculation).
Starting with the first point of the array, I remove any point in the array that is within a certain physical distance of the first point. (This distance is calculated on the image plane, not on the array).
On the resulting array, I repeat step 2 for the remaining points.
Is this somehow parallelizable, specifically on CUDA? I'm suspecting not, since there will obviously be dependencies across the image.
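For reference, a sequential C++ version of steps 2 and 3 might look like this (types and names are illustrative, not from my actual code):

#include <vector>

struct Corner { float x, y, score; }; // assumed sorted by score, descending

// Greedy suppression: keep a point only if no stronger, already-kept point
// lies within minDist of it on the image plane.
std::vector<Corner> suppress(const std::vector<Corner>& sorted, float minDist) {
    std::vector<Corner> kept;
    const float minDist2 = minDist * minDist;
    for (const Corner& c : sorted) {
        bool tooClose = false;
        for (const Corner& k : kept) {
            const float dx = c.x - k.x, dy = c.y - k.y;
            if (dx * dx + dy * dy < minDist2) { tooClose = true; break; }
        }
        if (!tooClose) kept.push_back(c);
    }
    return kept;
}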
I think the article Accelerated Corner-Detector Algorithms describes the way to solve this problem.