Given a vector of (random) numbers, I'm looking for the magnitude difference between the smallest element and the largest element. The STL has the minmax_element function for this purpose. Depending on the result, I would like to perform some action in my code, i.e., if the difference is large enough, I would like to do stuff.
Right now, my code looks as follows:
auto res = std::minmax_element(vec.begin(), vec.end());
if (std::abs(*res.second - *res.first) > threshold)
// do stuff
This algorithm works in principle. However, in most of my cases the if condition will be fulfilled. I would even bet that in most of my cases I could simply compare the first and the second element of my vector to check the if condition and do stuff.
Having said this, it seems a bit odd that I run through the whole vector every time, although I mostly don't have to. Is there an STL algorithm that takes the early exit into account, or does someone have an appropriate STL-based solution in mind? I could easily write a hand-crafted for-loop with an early exit to do what I want, but maybe there are better options.
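For illustration, the kind of hand-crafted early-exit loop I have in mind would be something like this (a rough sketch, assuming a vector of doubles; difference_exceeds is just a name I made up):
#include <vector>

// Single pass with a running min/max; returns as soon as the spread
// exceeds the threshold instead of always scanning the whole vector.
bool difference_exceeds(const std::vector<double>& vec, double threshold)
{
    if (vec.empty()) return false;
    double lo = vec.front();
    double hi = vec.front();
    for (double x : vec) {
        if (x < lo) lo = x;
        if (x > hi) hi = x;
        if (hi - lo > threshold) return true; // early exit
    }
    return false;
}
If the first two elements already differ by more than the threshold, this stops after the second iteration instead of scanning the whole vector.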
In the critical path of my program, I need to sort an array (specifically, a C++ std::vector<int64_t>, using the GNU C++ standard library). I am using the standard library's sorting algorithm (std::sort), which in this case is introsort.
I was curious about how well this algorithm performs, and while researching the sorting algorithms that various standard and third-party libraries use, I found that almost all of them focus on cases where 'n' tends to be the dominant factor.
In my specific case though, 'n' is going to be on the order of 2-20 elements. So the constant factors could actually be dominant. And things like cache effects might be very different when the entire array we are sorting fits into a couple of cache lines.
What are the best sorting algorithms for cases like this where the constant factors likely overwhelm the asymptotic factors? And do there exist any vetted C++ implementations of these algorithms?
Introsort takes your concern into account, and switches to an insertion sort implementation for short sequences.
Since your STL already provides it, you should probably use that.
Insertion sort or selection sort are both typically faster for small arrays (i.e., fewer than 10-20 elements).
Watch https://www.youtube.com/watch?v=FJJTYQYB1JQ
A simple linear insertion sort is really fast. Making a heap first can improve it a bit.
Sadly the talk doesn't compare that against the hardcoded solutions for <= 15 elements.
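For reference, a plain insertion sort over an iterator range might look roughly like this (a sketch, not benchmarked against the talk's hardcoded variants; the name insertion_sort is mine):
#include <utility> // std::move

// Straightforward insertion sort; typically competitive with std::sort
// for very small ranges because of its tiny constant factor.
template<class RandomIt, class Compare>
void insertion_sort(RandomIt first, RandomIt last, Compare comp) {
    if (first == last) return;
    for (RandomIt it = first + 1; it != last; ++it) {
        auto value = std::move(*it);
        RandomIt pos = it;
        while (pos != first && comp(value, *(pos - 1))) {
            *pos = std::move(*(pos - 1));
            --pos;
        }
        *pos = std::move(value);
    }
}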
It's impossible to know the fastest way to do anything without knowing exactly what the "anything" is.
Here is one possible set of assumptions:
We don't have any knowledge of the element structure except that elements are comparable. We have no useful way to group them into bins (for radix sort), we must implement a comparison-based sort, and comparison takes place in an opaque manner.
We have no information about the initial state of the input; any input order is equally likely.
We don't have to care about whether the sort is stable.
The input sequence is a simple array. Accessing elements is constant-time, as is swapping them. Furthermore, we will benchmark the function purely according to the expected number of comparisons - not number of swaps, wall-clock time or anything else.
With that set of assumptions (and possibly some other sets), the best algorithms for small numbers of elements will be hand-crafted sorting networks, tailored to the exact length of the input array. (These always perform the same number of comparisons; it isn't feasible to "short-circuit" these algorithms conditionally because the "conditions" would depend on detecting data that is already partially sorted, which still requires comparisons.)
For a network sorting four elements (in the known-optimal five comparisons), this might look like (I did not test this):
#include <utility> // for std::swap

// Ensure first[x] <= first[y] (according to comp), swapping if needed.
template<class RandomIt, class Compare>
void _compare_and_swap(RandomIt first, Compare comp, int x, int y) {
    if (comp(first[y], first[x])) {
        std::swap(first[x], first[y]);
    }
}

// Assume there are exactly four elements available at the `first` iterator.
template<class RandomIt, class Compare>
void network_sort_4(RandomIt first, Compare comp) {
    _compare_and_swap(first, comp, 0, 1);
    _compare_and_swap(first, comp, 2, 3);
    _compare_and_swap(first, comp, 0, 2);
    _compare_and_swap(first, comp, 1, 3);
    _compare_and_swap(first, comp, 1, 2);
}
In real-world environments, of course, we will have different assumptions. For small numbers of elements, with real data (but still assuming we must do comparison-based sorts) it will be difficult to beat naive implementations of insertion sort (or bubble sort, which is effectively the same thing) that have been compiled with good optimizations. It's really not feasible to reason about these things by hand, considering both the complexity of the hardware level (e.g. the steps it takes to pipeline instructions and then compensate for branch mis-predictions) and the software level (e.g. the relative cost of performing the swap vs. performing the comparison, and the effect that has on the constant-factor analysis of performance).
I have a list of points with x,y coordinates:
List_coord=[(462, 435), (491, 953), (617, 285),(657, 378)]
This list length (4 elements here) can be very large, from a few hundred up to 35,000 elements.
I want to remove points from this list that are too close to each other, based on a threshold.
Note: points are never at exactly the same position.
My current code for that:
while iteration<5:
    for pt in List_coord:
        for PT in List_coord:
            if (abs(pt[0]-PT[0])+abs(pt[1]-PT[1]))!=0 and abs(pt[0]-PT[0])<threshold and abs(pt[1]-PT[1])<threshold:
                List_coord.remove(PT)
    iteration=iteration+1
Explanation of my terrible code :):
I check whether the distance is 0; if it is, it means I am comparing the same point.
Then I check the distance in x and in y.
Iterations:
I need a few iterations to avoid missing a removal, because the list changes inside the loop itself...
This code works, but it is a very slow process!
I am sure there is a much easier method, but I wasn't able to find one, even though some already-answered questions are close to mine.
Note: I would like to avoid using an extra library for this code, if possible.
Python will be a bit slow at this ;-)
The solution you will probably want is called quad-trees, but I'll mention a simpler approach first, in case it's preferable.
The usual approach is to group the points so that you can easily reject points that are clearly far away from each other.
One approach might be to sort the list twice, once by x and once by y. You can prove that if two points are too close, they must be close in one dimension or the other. Thus your inner loop can break out early: if it sees a point that is too far away from the outer point in the sorted direction, it knows for a fact that all later points in that list are also too far away, so it doesn't have to look any further. Do this in X and Y and you're set!
This approach is going to tend to be dominated by the O(n log n) sort times. However, if all of your points share a single x value, you'll end up doing the same slow O(n^2) iteration that you're doing right now because you never terminate the inner loop early.
The more robust solution is to use quadtrees. Quadtrees are designed to solve the kind of problem you are looking at. The idea is to build a tree such that you can rapidly exclude large numbers of points. I'd recommend this.
If your number of points gets too large, I'd recommend getting a clustering library. Efficient clustering is a very difficult task, and often done in C++ or another fast language.
I have a bit of an issue. I was recently told that for unordered input (a bunch of random values, let's say 1 million of them), using a set would be more efficient than using a vector and then sorting that vector with the standard sort algorithm. But when I tried both and measured them with the time command in the terminal and with valgrind, both the running time and the space usage were better for the vector, even with the additional call to sort. The person who gave me the advice to use the set is a lot more experienced than me with C++, but I always have to test things out myself before taking people's advice. The test code follows.
For Set
std::set<int> testSet;
for(int i(0); i<= 1000000; ++i)
testSet.insert(-i);
For Vector
std::vector<int> testVector;
for(int i(0); i<= 1000000; ++i)
testVector.push_back(i * -1);
std::sort(testVector.begin(), testVector.end());
I know that these are not random values; that wouldn't be fair, since set does not allow duplicates and vector does, so they would end up with different sizes for this basic test. Can anyone clarify why the set should be used, apart from the no-duplicates point?
I did not do any tests with unordered_set either; I'm not too sure of the differences between the two.
This is too vague and ignores/misses out several crucial factors. If your friend said precisely this, then your friend (regardless of his or her experience) was wrong. More likely you are somewhat misinterpreting their words and reading into them a simplified version of matters.
When you want a sorted final product, the sorting is "amortized" when you insert into a set, because you get little bits of sorting action each time. If you will be inserting periodically and many times, then that spreading-out of the workload may be what you want. The total, when added up, may still be more than for a vector (consider the occasional rebalancing and so forth; your vector just needs to be moved to a larger block of memory once in a while), but you've spread it out so as not to noticeably slow down some individual other part of your program.
But if you're just dumping all the elements into a vector and sorting straight away, not only is there less work for the container & algorithm to do but you probably don't mind it taking a noticeable amount of time.
You haven't really stated your use case in any detail so I won't pretend to give specifics here, but the only possible answer to your question as posed is both "it depends" and "the question is fundamentally somewhat meaningless"; you cannot just take two data structures and sorting methodologies, and ask "which is more efficient?" without a use case. You have, however, correctly measured the time and space requirements and if you've done that against your real-world use case then, well, you have your answer don't you?
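If you want to redo the measurement against something closer to your real use case, a minimal timing harness along these lines is a reasonable starting point (names and sizes are illustrative, and note that the set silently drops duplicates, as you point out):
#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <set>
#include <vector>

int main() {
    std::mt19937 gen(42);
    std::uniform_int_distribution<int> dist(0, 10000000);
    std::vector<int> input(1000000);
    for (int& x : input) x = dist(gen);

    auto t0 = std::chrono::steady_clock::now();
    std::set<int> s(input.begin(), input.end());   // sorted as it is built
    auto t1 = std::chrono::steady_clock::now();
    std::vector<int> v(input.begin(), input.end());
    std::sort(v.begin(), v.end());                 // sorted in one pass at the end
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::duration<double, std::milli>;
    std::cout << "set:           " << ms(t1 - t0).count() << " ms\n"
              << "vector + sort: " << ms(t2 - t1).count() << " ms\n";
}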
Suppose we have a function foo that does something to all the elements between *firsta and *lasta:
template<class RandomAccessIterator1>
void foo(RandomAccessIterator1 firsta, RandomAccessIterator1 lasta) {
    for (RandomAccessIterator1 it = firsta; it != lasta + 1; it++) {
        // here stuff happens...
    }
}
Question a): Is there a way to skip an index firsta < i < lasta by only modifying the inputs to foo, e.g. the random-access iterators; in other words, without changing foo itself, just its input? Unfortunately, the indices I want to skip are not at the edges (they are often deep between firsta and lasta), and foo is a complicated divide-and-conquer algorithm that is not amenable to being called on subsets of the original array the iterators point into.
Question b): If a) is possible, what is the cost of doing it? Is it constant, or does it depend on (lasta - firsta)?
The best way to do this would be to use an iterator that knows how to skip that element. A more generalized idea though, is an iterator that simply iterates over two separate ranges under the hood. I don't know of anything in boost that does this, so, here's one I just whipped up: http://coliru.stacked-crooked.com/a/588afa2a353942fc
Unfortunately, the code to detect which element to skip adds a teeny tiny amount of overhead to each and every iterator increment, so the overhead is technically proportional to lasta-firsta. Realistically, using this wrapper around a vector::iterator or a char* should bring it roughly to the same performance level as std::deque::iterator, so it's not like this should be a major slowdown.
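To give a flavour of the idea without following the Coliru link, here is a minimal sketch of such a skipping iterator (illustrative only; it supports just the increment/dereference/compare operations a plain forward loop needs, whereas the linked version is more general):
#include <iostream>
#include <iterator>
#include <vector>

// Wraps a base iterator and steps over one excluded position.
template<class It>
struct skip_iterator {
    It cur;
    It skip; // the single position to be skipped

    skip_iterator(It c, It s) : cur(c), skip(s) {
        if (cur == skip) ++cur; // never start on the skipped element
    }
    typename std::iterator_traits<It>::reference operator*() const { return *cur; }
    skip_iterator& operator++() {
        ++cur;
        if (cur == skip) ++cur;
        return *this;
    }
    bool operator!=(const skip_iterator& rhs) const { return cur != rhs.cur; }
};

int main() {
    std::vector<int> v{10, 20, 30, 40, 50};
    auto skipped = v.begin() + 2; // skip the element 30
    skip_iterator<std::vector<int>::iterator> it(v.begin(), skipped), end(v.end(), skipped);
    for (; it != end; ++it)
        std::cout << *it << ' '; // prints: 10 20 40 50
    std::cout << '\n';
}
The extra comparison on every increment is exactly the small constant per-step overhead mentioned above.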
The answer might be a bit picky, but you could call foo(firsta, i-1) and foo(i+1, lasta) or something similar to achieve the desired effect.
First, to give you some background: I have some research code which performs a Monte Carlo simulation. Essentially, I iterate through a collection of objects, compute a number of vectors from their surfaces, and then for each vector I iterate through the collection of objects again to see whether the vector hits another object (similar to ray tracing). The pseudocode looks something like this:
for each object {
    for a number of vectors {
        do some computations
        for each object {
            check if vector intersects
        }
    }
}
As the number of objects can be quite large and the number of rays is even larger, I thought it would be wise to optimise how I iterate through the collection of objects. I created some test code which tests arrays, lists and vectors, and for my first test cases I found that vector iterators were around twice as fast as arrays; however, when I used a vector in my actual code it was somewhat slower than the array I was using before.
So I went back to the test code and increased the complexity of the function each loop was calling (a dummy function equivalent to 'check if vector intersects'), and I found that as the complexity of the function increases, the execution-time gap between arrays and vectors shrinks, until eventually the array was quicker.
Does anyone know why this occurs? It seems strange that the execution time inside the loop should affect the outer loop's run time.
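For reference, a stripped-down sketch of the kind of test I ran (not my exact code; work() here is just a stand-in for the dummy 'check if vector intersects' function):
#include <chrono>
#include <iostream>
#include <vector>

// Stand-in for the per-element work done inside the loop.
double work(double x) { return x * 1.000001 + 0.5; }

int main() {
    const std::size_t n = 1000000;
    std::vector<double> vec(n, 1.0);
    double* arr = new double[n];
    for (std::size_t i = 0; i < n; ++i) arr[i] = 1.0;

    double sum = 0.0;
    auto t0 = std::chrono::steady_clock::now();
    for (std::vector<double>::const_iterator it = vec.begin(); it != vec.end(); ++it)
        sum += work(*it);                    // iterate via vector iterator
    auto t1 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)
        sum += work(arr[i]);                 // iterate via raw array indexing
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::duration<double, std::milli>;
    std::cout << "vector iterator: " << ms(t1 - t0).count() << " ms\n"
              << "raw array:       " << ms(t2 - t1).count() << " ms\n"
              << "(checksum " << sum << ")\n";
    delete[] arr;
}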
What you are measuring is the difference in overhead between accessing an element of an array and accessing an element of a vector (as well as their creation/modification, etc., depending on the operations you are doing).
EDIT: It will vary depending on the platform/OS/library you are using.
It probably depends on the implementation of vector iterators. Some implementations are better than others. (Visual C++ — at least older versions — I'm looking at you.)
I think the time difference I was witnessing was actually due to an error in the pointer-handling code. After making a few modifications to make the code more readable, the iterations were taking around the same time (give or take 1%) regardless of the container, which makes sense, as all of the containers have the same access mechanism.
However, I did notice that the vector runs a bit slower under OpenMP; this is probably due to the overhead of each thread maintaining its own copy of the iterator.