So I have been thinking long and hard about this question and I think this might good old fashioned interval scheduling by finding the earliest deadline first.
Here's my approach,
Calculate the deadlines for each item (time it falls off the branch + the time it takes for the item to fall from the branch - the time it requires for me to travel there and catch it)
If the deadline is earlier than the current time then, I cant catch it.
However, in the ones that are reachable, I catch the earliest one. Then I again calculate the earliest deadlines for every item this time feeding in my new position.
However, this approach seems a little inefficient. Can someone point me toward a better one?
After parsing and storing your input, you'll have some array of points in 2 dimensions (x and t) you should try to visit, no more than one x per t. You need to find the optimal path through that array.
One approach would be to brute-force this. You have (2*s)^t possible paths - less, in fact, since you shouldn't move beyond 0 - l. So for very small t, you could really find the optimal path - i.e. try every possible path, see how many things you can get to for each path, and return number of things from the best path.
However, if t gets big, then a brute-force approach would take some time (sub-exponential time).
You will need to find some approximation to this if you're going to code this up. There are loads of algorithms with different trade-offs, e.g. https://en.wikipedia.org/wiki/Best-first_search .
Priority queues, as found in schedulers, could indeed be useful to this as you rightly state.
Related
I have a list of points with x,y coordinates:
List_coord=[(462, 435), (491, 953), (617, 285),(657, 378)]
This list lenght (4 element here) can be very large from few hundred up to 35000 elements.
I want to remove too close points by threshold in this list.
note:Points are never at the exact same position.
My current code for that:
while iteration<5:
for pt in List_coord:
for PT in List_coord:
if (abs(pt[0]-PT[0])+abs(pt[1]-PT[1]))!=0 and abs(pt[0]-PT[0])<threshold and abs(pt[1]-PT[1])<threshold:
List_coord.remove(PT)
iteration=iteration+1
Explication of my terrible code :) :
I check if the very distance is 0 then it means that i am comparing
the same point
then i check the distance in x and in y..
Iteration:
I need few iterations to avoid missing one remove because the list change inside the loop itself...
This code is working but it is a very low process!
I am sure there is another method much easier but i wasn't able to find even if some allready answered questions are close to mine..
note:I would like to avoid using extra library for that code if it is possible
Python will be a bit slow at this ;-)
The solution you will probably want is called quad-trees, but I'll mention a simpler approach first, in case it's preferable.
The usual approach is to group the points so that you can easily reject points that are clearly far away from each other.
One approach might be to sort the list twice, once by x once by y. You can prove that if two points are too-close, they must be close in one dimension or the other. Thus your inner loop can break out early. If it sees a point that is too far away from the outer point in the sorted direction, it can know for a fact that all future points in that list are also too far away. Thus it doesn't have to look any further. Do this in X and Y and you're set!
This approach is going to tend to be dominated by the O(n log n) sort times. However, if all of your points share a single x value, you'll end up doing the same slow O(n^2) iteration that you're doing right now because you never terminate the inner loop early.
The more robust solution is to use quadtrees. Quadtrees are designed to solve the kind of problem you are looking at. The idea is to build a tree such that you can rapidly exclude large numbers of points. I'd recommend this.
If your number of points gets too large, I'd recommend getting a clustering library. Efficient clustering is a very difficult task, and often done in C++ or another fast language.
Background:
I need to process some hundred thousand events (producing results) given a hard time limit. The clock is literally ticking, and when the timer fires, whatever is done at that point must be flushed out.
What isn't ready by that time is either discarded (depending on an importance metric) or processed during the next time quantum (with an "importance boost", i.e. adding a constant to the importance metric).
Now ideally, the CPU is much faster than needed, and the whole set is ready a long time before the end of the time slice. Unluckily, the world is rarely ever ideal, and "hundred thousands" becomes "tens of millions" before you know.
Events are added to the back of a queue (which is really a vector) as they come in, and are processed from the front during the respective next quantum (so the program always processes the last quantum's input).
However, not all events are equally important. In case the available time is not sufficient, it would be preferrable to drop unimportant events rather than important ones (this is not a strict requirement, since important events will be copied to the next time quantum's queue, but doing so further adds to the load so it isn't a perfect solution).
The obvious thing to use would be, of course, a priority queue / heap. Unluckily, heapifying 100k elements isn't precisely a free operation either (or parallel), and then I end up with objects being in some non-obvious and not necessarily cache-friendly memory locations, and pulling elements from a priority queue doesn't parallelize nicely.
What I would really like is somewhat like a vector that is sorted or at least "somewhat approximately sorted", which one can traverse sequentially afterwards. This would trivially allow me to create e.g. 12 threads (or any other number, one per CPU) that process e.g. 1/64 of the range (or another size) each, slowly advancing from the front to the end, and eventually dropping/postponing what's left over -- which will be events of little importantance that can be discarded.
Simply sorting the complete range using std::sort would be the easiest, most straightforward solution. However, the time it takes to sort items reduces the time available to actually process elements within the fixed time budget, and sorting time is for the most part single-CPU time (and parallel sort isn't that great either).
Also, doing a perfect sort (which isn't really needed) may bring forth worst case complexity whereas an approximate sort should ideally perform at its optimum and have a very predictable cost.
tl;dr
So, what I'm looking for is a way to sort an array/vector only approximately, but fast, and with a predictable (or guaranteed) runtime.
The sort key would be a small integer typically between 10 and 1000. Being postponed to the next time quantum might increase ("priority boost") that value by a small amount, e.g. 100 or 200.
In a different question where humans are supposed to do an approximate sort using "subjective compare"(?) shell sort was proposed. On various sorting demo applets, it seems like at least for the "random shuffle" input that's typical in these, shell sort can indeed do an "approximate sort" that doesn't look too bad with 3-4 passes over the data (and at least the read-tap is strictly sequential). Unluckily it seems to be somewhat of a black art to choose gap values that work well, and runtime estimates seem to involve a lot of looking into the crystal ball as well.
Comb sort with a relatively large shrink factor (such as 2 or 3?) seems tempting as well, since it visits memory strictly sequentially (on both taps) and is able to move far out elements by a great distance quickly. Again, judging from sorting demo applets, it seems like 3-4 passes already give a rather reasonable "approximate sort".
MSD radix sort comes to mind, though I am not sure how it would perform given typical 16/32bit integers in which most of the most significant bits are all zero! One would probably have to do an initial pass to find the most significant bit in the whole set, followed by 2-3 actual sort passes?
Is there a better algorithm or a well-known working approach with one of the algorithms I mentioned?
What comes to mind is to iterate over the vector and if some event is less important, don't process it but put it aside. As soon as the entire vector has been read, have a look at the events put aside. Of course you can use several buckets with different priorities. And only store references there, you don't want to move megabytes of data. (posted as an answer now as requested by Damon)
Use a separate vector for each priority. Then you don't need to sort them.
Sounds like a nice example where near-sort algorithms can be useful.
Back a decade Chazelle has developed a nice data-structure that somewhat works like a heap. The key difference is the time complexity though. It has constant time for all important operations, e.g. insert, remove, find lowest element etc.
The trick of this data-structure is, that it breaks the O(n*log n) complexity barrier by allowing for some error in the sort order.
To me that sounds pretty much what you need. The data-structure is called soft heap and explained on wikipedia:
https://en.wikipedia.org/wiki/Soft_heap
There are other algorithms that allow for some error in favor to speed as well. You'll find them if you google for Near Sort Algorithms
If you try that algorithm please give some feedback how it works in practice. I'm really eager to hear from you how the algorithm performs in practice.
Sounds like you want to use std::partition: move the part that interests you to the front, and the others to the back. Its complexity is in the order of O(n), but it is cache-friendly, so it's probably a lot faster than sorting.
If you have limited "bandwidth" in processing events (say a 128K per time quantum), you could use std::nth_element to select the 128K (minus some percentage lost due to making that computation) most promising events (assuming you have an operator< that compares priorities) in O(N) time. Then you process those in parallel, and when you are done, you reprioritize the remainder (again in O(N) time).
std::vector<Event> events;
auto const guaranteed_bandwidth = 1<<17; // 128K is always possible to process
if (events.size() <= guaranteed_bandwidth) {
// let all N workers loose on [begin(events), end(events)) range
} else {
auto nth = guaranteed_bandwidth * loss_from_nth_element;
std::nth_element(begin(events), begin(events) + nth);
// let all N workers loose on [begin(events), nth) range
// reprioritize [nth, end(events)) range and append to events for next time quantum
}
This guarantees that in the case that your bandwith threshold is reached, you process the most valuable elements first. You could even speed up the nth_element by a poor man's parallelization (e.g. let each of N workers compute M*128K/N best elements for small M in parallel, and then do a final merge and another nth_element on the M*128K elements).
The only weakness is that in case your system is really overloaded (billions of events, maybe due to some DOS attack) it could take more than the entire quantum to run nth_element (even when quasi-parallized) and you actually process nothing. But if the processing time per event is much larger (say a few 1,000 cycles) than a priority comparison (say a dozen cycles), this should not happen under regular loads.
NOTE: for performance reasons, it's of course better to sort pointers/indices into the main event vector, this is not shown for brevity.
If you have N worker threads, give each worker thread 1/Nth of the original unsorted array. The first thing the worker will do is your approximate fast sorting algorithm of preference on it's individual piece of the array. Then, they can each process their array peice in order - roughly performing higher priority items first, and also being very cache friendly. This way, you don't take a hit for trying to sort the entire array, or even trying to approximately sort the entire array; and what little sorting there is, is entirely parallelized. Sorting 10 pieces individually is much cheaper than sorting the whole thing.
This would work best if the priorities of items to process are randomly distributed. If there is some ordering to them you'll wind up with a thread being flooded by or starved of high priority items to process.
I am programming a C++ simulation application in which several mass-spring structures will move and collide and I'm currently struggling with the collision detection and response part. These structures might or might not be closed (it might be a "ball" or just a chain of masses and springs) so it is (I think) impossible to use a "classic" approach where we test for 2 overlapping shapes.
Furthermore, the collisions are a really important part of that simulation and I need them to be as accurate as possible, both in terms of detection and response (real time is not a constraint here). I want to be able to know the forces applied to each Nodes (the masses), as far as possible.
At the moment I am detecting collisions between Nodes and springs at each time step, and the detection seems to work. I can compute the time of collision between one node and a spring, and thus find the exact position of the collision. However, I am not sure if this is the right approach for this problem and after lots of researches I can't quite find a way to make things work properly, mainly on the response aspect of the collisions.
Thus I would really like to hear from any technique, algorithm or library that seems well suited for this kind of collisions problems or any idea you might have to make this work. Really, any kind of help will be greatly appreciated.
If you can meet the following conditions:
0) All collisions are locally binary - that is to say
collisions only occur for pairs of particles, not triples etc,
1) you can predict the future time for a collision between
objects i and j from knowledge of their dynamics (assuming that no other
collision occurs first)
2) you know how to process the physics/dynamicseac of the collision
then you should be able to do the following:
Let Tpq be the predicted time for a collision between particles p and q, and Vp (Vq) be a structure holding the local dynamics of each particle p (q) (i.e its velocity, position, spring-constants, whatever)
For n particles...
Initialise by calculating all Tpq (p,q in 1..n)
Store the n^2 values of Tpq in a Priority Queue (PQ)
repeat
extract first Tpq from the PQ
Advance the time to Tpq
process the collision (i.e. update Vp and Vq according to your dynamics)
remove all Tpi, and Tiq (i in 1..n) from the PQ
// these will be invalid now as the changes in Vp, Vq means the
// previously calculated collision of p and q with any other particle
// i might occur sooner, later or not at all
recalculate new Tpi and Tiq (i in 1..n) and insert in the PQ
until done
There is an o(n^2) initial setup cost, but the repeat loop should be O(nlogn) - the cost of removing and replacing the 2n-1 invalidated collisions. This is fairly efficient for moderate numbers of particles (up to hundreds). It has the benefit that you only need to process things at collision time, rather than for equally spaced time steps. This makes things particularly efficient for a sparsely populated simulation.
I guess an octree approach would do best with your problem. An octree devides the virtual space into several recursive leaves of a tree and lets you compute possible collisions between the most probable nodes.
Here a short introduction: http://www.flipcode.com/archives/Introduction_To_Octrees.shtml :)
The problem:
N nodes are related to each other by a 'closeness' factor ranging from 0 to 1, where a factor of 1 means that the two nodes have nothing in common and 0 means the two nodes are exactly alike.
If two nodes are both close to another node (i.e. they have a factor close to 0) then this doesn't mean that they will be close together, although probabilistically they do have a much higher chance of being close together.
-
The question:
If another node is placed in the set, find the node that it is closest to in the shortest possible amount of time.
This isn't a homework question, this is a real world problem that I need to solve - but I've never taken any algorithm courses etc so I don't have a clue what sort of algorithm I should be researching.
I can index all of the nodes before another one is added and gather closeness data between each node, but short of comparing all nodes to the new node I haven't been able to come up with an efficient solution. Any ideas or help would be much appreciated :)
Because your 'closeness' metric obeys the triangle inequality, you should be able to use a variant of BK-Trees to organize your elements. Adapting them to real numbers should simply be a matter of choosing an interval to quantize your number on, and otherwise using the standard Bk-Tree procedure. Some experimentation may be required - you might want to increase the resolution of the quantization as you progress down the tree, for instance.
but short of comparing all nodes to
the new node I haven't been able to
come up with an efficient solution
Without any other information about the relationships between nodes, this is the only way you can do it since you have to figure out the closeness factor between the new node and each existing node. A O(n) algorithm can be a perfectly decent solution.
One addition you might consider - keep in mind we have no idea what data structure you are using for your objects - is to organize all present nodes into a graph, where nodes with factors below a certain threshold can be considered connected, so you can first check nodes that are more likely to be similar/related.
If you want the optimal algorithm in terms of speed, but O(n^2) space, then for each node create a sorted list of other nodes (ordered by closeness).
When you get a new node, you have to add it to the indexed list of all the other nodes, and all the other nodes need to be added to its list.
To find the closest node, just find the first node on any node's list.
Since you already need O(n^2) space (in order to store all the closeness information you need basically an NxN matrix where A[i,j] represents the closeness between i and j) you might as well sort it and get O(1) retrieval.
If this closeness forms a linear spectrum (such that closeness to something implies closeness to other things that are close to it, and not being close implies not being close to those close), then you can simply do a binary or interpolation sort on insertion for closeness, handling one extra complexity: at each point you have to see if closeness increases or decreases below or above.
For example, if we consider letters - A is close to B but far from Z - then the pre-existing elements can be kept sorted, say: A, B, E, G, K, M, Q, Z. To insert say 'F', you start by comparing with the middle element, [3] G, and the one following that: [4] K. You establish that F is closer to G than K, so the best match is either at G or to the left, and we move halfway into the unexplored region to the left... 3/2=[1] B, followed by E, and we find E's closer to F, so the match is either at E or to its right. Halving the space between our earlier checks at [3] and [1], we test at [2] and find it equally-distant, so insert it in between.
EDIT: it may work better in probabilistic situations, and require less comparisons, to start at the ends of the spectrum and work your way in (e.g. compare F to A and Z, decide it's closer to A, see if A's closer or the halfway point [3] G). Also, it might be good to finish with a comparison to the closest few points either side of where the binary/interpolation led you.
ACM Surveys September 2001 carried two papers that might be relevant, at least for background. "Searching in Metric Spaces", lead author Chavez, and "Searching in High Dimensional Spaces - Index Structures for Improving the Performance of Multimedia Databases", lead author Bohm. From memory, if all you have is the triangle inequality, you can use it to some effect, but if you can trim your data down to a sensible number of dimensions, you can do better by using a search structure that knows about this dimensional structure.
Facebook has this thing where it puts you and all of your friends in a graph, then slowly moves everyone around until people are grouped together based on mutual friends and so on.
It looked to me like they just made anything <0.5 an attractive force, anything >0.5 a repulsive force, and moved people with every iteration based on the net force. After a couple hundred iterations, it was looking pretty darn good.
Note: this is not an algorithm it is a heuristic. In the facebook implementation I saw, two people were not able to reach equilibrium and kept dancing around each other. It turns out they were actually the same person with two different accounts.
Also, it took about 15 minutes on a decent computer and ~100 nodes. YMMV.
It looks suspiciously like a Nearest Neighbor Search problem (also called a similarity search)
Is there a way (or is it even theoretically possible) to implement a binary search algorithm concurrently? I'm guessing the answer may well be no for two reasons:
Despite lots of Googling I haven't found a concurrent implementation anywhere
Each iterative cycle of the binary chop depends on the values from the previous one, so even if each iteration was a separate thread it would have to block until the previous one completed, making it sequential.
However, I'd like some clarification on this front (and if it is possible, any links or examples?)
At first, it looks like binary search is completely nonparallel. But notice that there are only three possible outcomes:
You hit the element
The element searched for is before the element you hit
The element is after
So we start three parallel processes:
Hit the element
Assume the element is before, search here
Assume the element is after, search there
As soon as we know the result from the first of these, we can kill the one which is not going to find the element. But at the same time, the process that searched in the right spot, has doubled the search rate, that is current speedup is 2 out of a possible 3.
Naturally, this approach can be generalized if you have more than 3 cores at your disposal. An important aside is that this way of thinking is what is done inside hardware. Look up carry-lookahead adders for instance.
I think you can figure the answer! To parallelize, there must be some work that can be divided. In case of the bin-search, there is nothing that could possibly be divided and parallelized. bin-search gets into the middle of an array of values. This work cannot be divided. Etc.. until it find the solution.
What in your opinion could be parallelized?
If you have n worker threads, you can split the array in n segments and run n binary searches concurrently, combining the results when they are ready. Apart from this cheap trick, I can see no obvious way to introduce parallelism.
You could always try a not-quite-binary search, essentially if you have n cores then you can split the array into n+1 pieces. From there you search each of the "cut-points" and see whether the value is larger or smaller than the cut point, this results in you having a fifth of the original search space as opposed to half, as you will be able to select a smaller section.