I am looking for C++ Kruskal implementations to benchmark against my own...
If you know a few good ones, please share!
There's boost::kruskal_minimum_spanning_tree. Prim's algorithm is there too if you want to compare against that.
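If you also want a dependency-free baseline to cross-check results, here is a minimal sketch of Kruskal with a union-find structure. This is not the Boost implementation; the names Edge, DisjointSets and kruskal_mst are made up for this example, and it assumes vertices are numbered 0..n-1.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical edge type for this sketch: endpoints u, v and a weight w.
struct Edge {
    int u, v;
    double w;
};

// Disjoint-set forest with path halving and union by size.
struct DisjointSets {
    std::vector<int> parent, size;
    explicit DisjointSets(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    // Merges the sets of a and b; returns false if they were already joined.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);
        parent[b] = a;
        size[a] += size[b];
        return true;
    }
};

// Returns the edges of a minimum spanning forest of an undirected graph
// with vertices 0..n-1. Edges are taken by value because they get sorted.
std::vector<Edge> kruskal_mst(int n, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.w < b.w; });
    DisjointSets sets(n);
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        assert(e.u >= 0 && e.u < n && e.v >= 0 && e.v < n);
        if (sets.unite(e.u, e.v)) tree.push_back(e);  // keep edges that join two components
    }
    return tree;
}
```

Comparing the total weight returned by this sketch, by Boost, and by your own code on the same input is a cheap correctness check before you start timing anything.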
I have been studying the Mahout 0.9 k-means algorithm that uses MapReduce, and I would like to know where I can find the code that runs inside the map function and the reducer.
I was debugging with NetBeans and was not able to find what exactly is implemented in the map and reduce functions...
The reason I am doing this is that I would like to know exactly what is implemented in Mahout 0.9, in order to see which parts of the k-means MapReduce algorithm were optimized.
If somebody knows which research paper the Mahout k-means was based on, that would also help me a lot.
Thank you so much!
Best regards!
Download the source code for mahout-core. Search for the Java file org.apache.mahout.clustering.kmeans.KMeansDriver.
In this file, search for the line ClusterIterator.iterateMR(conf, input, priorClustersPath, output, maxIterations);
The iterateMR function in class org.apache.mahout.clustering.iterator.ClusterIterator defines all the configuration required for MapReduce.
org.apache.mahout.clustering.iterator.CIMapper and org.apache.mahout.clustering.iterator.CIReducer are the map and reduce classes you are looking for.
Hope this helps!! :)
However, I do not know which research paper is implemented.
K-means (more precisely, Lloyd's algorithm) is embarrassingly parallel. I doubt there is a paper discussing the implementation used by Mahout, because it's the obvious way to do it. There is absolutely no trick involved:
Lloyd's algorithm consists mostly of a sum, and sums are trivial to parallelize.
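To make the "it's just a sum" point concrete, here is a minimal sketch of one Lloyd iteration split into a map-like step (per-chunk partial sums) and a reduce-like step (combine and divide). This is not Mahout's CIMapper/CIReducer code; it is plain single-process C++, and Point, PartialSums, map_chunk and reduce_sums are names invented for this illustration. It assumes 2D points.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::array<double, 2>;

// Per-cluster running totals: coordinate sums and point counts.
struct PartialSums {
    std::vector<Point> sum;
    std::vector<std::size_t> count;
    explicit PartialSums(std::size_t k) : sum(k, Point{0.0, 0.0}), count(k, 0) {}
};

// Index of the centroid closest to p (squared Euclidean distance).
std::size_t nearest(const Point& p, const std::vector<Point>& centroids) {
    std::size_t best = 0;
    double best_d = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        const double dx = p[0] - centroids[c][0];
        const double dy = p[1] - centroids[c][1];
        const double d = dx * dx + dy * dy;
        if (d < best_d) { best_d = d; best = c; }
    }
    return best;
}

// "Map" side: a worker sums the points of its own chunk [begin, end),
// grouped by nearest centroid. Chunks are independent of each other.
PartialSums map_chunk(const std::vector<Point>& pts, std::size_t begin,
                      std::size_t end, const std::vector<Point>& centroids) {
    assert(begin <= end && end <= pts.size());
    PartialSums part(centroids.size());
    for (std::size_t i = begin; i < end; ++i) {
        const std::size_t c = nearest(pts[i], centroids);
        part.sum[c][0] += pts[i][0];
        part.sum[c][1] += pts[i][1];
        ++part.count[c];
    }
    return part;
}

// "Reduce" side: add up the partial sums and divide to get new centroids.
// A cluster that received no points keeps its old centroid.
std::vector<Point> reduce_sums(const std::vector<PartialSums>& parts,
                               const std::vector<Point>& old_centroids) {
    std::vector<Point> next = old_centroids;
    for (std::size_t c = 0; c < old_centroids.size(); ++c) {
        Point s{0.0, 0.0};
        std::size_t n = 0;
        for (const PartialSums& part : parts) {
            s[0] += part.sum[c][0];
            s[1] += part.sum[c][1];
            n += part.count[c];
        }
        if (n > 0) {
            next[c] = Point{s[0] / static_cast<double>(n), s[1] / static_cast<double>(n)};
        }
    }
    return next;
}
```

Because addition is associative, how the points are split into chunks does not change the resulting centroids (up to floating-point rounding), which is why the distributed version needs no special trick.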
Unfortunately (like much of Hadoop), Mahout is ten layers of abstraction thick. This doesn't yield the best performance, and in particular it makes it really hard to dig through all the code and meta-code to the actual implementation. See the other answer here for pointers to the source code fragments scattered across a dozen classes.
When playing around with Mahout, make sure to also include non-Hadoop implementations of k-means in your experiments. You will be surprised how often they A) outperform Mahout, and B) provide better results.
I need to compare the Apriori and the A-close algorithms on a dataset, so I need implementations of both. I can find implementations of the Apriori algorithm, but I can't find any of the A-close algorithm. It would save me a lot of time if I could find one. Does anyone have an implementation of this algorithm and is willing to share it, or have any tips for finding one?
How can I implement a 3D kd-tree build and search algorithm in C or C++? I am wondering whether there is working code to follow.
I'd like to recommend two good presentations to start with:
"Introduction to k-d trees"
"Lecture 6: Kd-trees and range trees".
Both give you all you need (the basic ideas behind kd-trees, short visual examples, and code snippets) to start writing your own implementation.
Update-19-10-2021: Resources are no longer public. Thanks to #Hari for posting new links down below in the comments.
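Since the linked resources may be gone, here is a minimal sketch of the textbook approach: build by median split on a cycling axis, then nearest-neighbour search with pruning. It is not taken from those presentations. The names Point3, KdNode, build and nearest_point are made up for this example, and it assumes C++14 (for std::make_unique).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <vector>

using Point3 = std::array<double, 3>;

struct KdNode {
    Point3 point;
    std::unique_ptr<KdNode> left, right;
};

// Builds a balanced tree over pts[begin, end). At each level the points are
// split at the median along axis depth % 3. Reorders pts in place.
std::unique_ptr<KdNode> build(std::vector<Point3>& pts, std::size_t begin,
                              std::size_t end, int depth = 0) {
    if (begin >= end) return nullptr;
    const int axis = depth % 3;
    const std::size_t mid = begin + (end - begin) / 2;
    std::nth_element(pts.begin() + begin, pts.begin() + mid, pts.begin() + end,
                     [axis](const Point3& a, const Point3& b) { return a[axis] < b[axis]; });
    auto node = std::make_unique<KdNode>();
    node->point = pts[mid];
    node->left = build(pts, begin, mid, depth + 1);
    node->right = build(pts, mid + 1, end, depth + 1);
    return node;
}

double squared_distance(const Point3& a, const Point3& b) {
    double d2 = 0.0;
    for (int i = 0; i < 3; ++i) d2 += (a[i] - b[i]) * (a[i] - b[i]);
    return d2;
}

// Recursive search: descend into the side containing the query first, then
// visit the other side only if the splitting plane is closer than the best hit.
void nearest(const KdNode* node, const Point3& query, int depth,
             const KdNode*& best, double& best_d2) {
    if (!node) return;
    const double d2 = squared_distance(node->point, query);
    if (d2 < best_d2) { best_d2 = d2; best = node; }
    const int axis = depth % 3;
    const double diff = query[axis] - node->point[axis];
    const KdNode* near_side = diff < 0 ? node->left.get() : node->right.get();
    const KdNode* far_side = diff < 0 ? node->right.get() : node->left.get();
    nearest(near_side, query, depth + 1, best, best_d2);
    if (diff * diff < best_d2) nearest(far_side, query, depth + 1, best, best_d2);
}

// Returns the stored point closest to query. The tree must be non-empty.
Point3 nearest_point(const KdNode* root, const Point3& query) {
    assert(root != nullptr);
    const KdNode* best = nullptr;
    double best_d2 = std::numeric_limits<double>::infinity();
    nearest(root, query, 0, best, best_d2);
    return best->point;
}
```

Typical use is build(points, 0, points.size()) once, followed by many nearest_point queries. Range search and k-nearest queries follow the same descend-then-prune pattern.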
I found the publications of Vlastimil Havran very useful. His Ph.D. thesis gives a nice introduction to kd-trees and traversal algorithms. Further articles cover several improvements, e.g. how to construct a kd-tree in O(n log n). There are also a lot of implementations in various graphics libraries; a web search will turn them up.
For an example of a 3D kd-tree implementation in C, take a look at kd3. It is not a general-purpose library and requires the input data to be in a specific form, but the ideas and approach should be transferable.
Disclosure: I am the author of kd3.
Disclaimer: It was written as proof-of-concept code for an existing application and is therefore not as generic nor as well-tested as it should be. Bug reports/fixes are welcome.
Does someone know if there is any production-ready K-shortest-paths algorithm for C++?
The only available implementation (k-shortest-paths) unfortunately leaks memory, has counter-intuitive interfaces, and reinvents the wheel with its own Graph class.
I'm looking for something better, preferably boost::graph-based.
There are two possible algorithms: the simple Yen's algorithm and the optimized Yen's algorithm. Either would suit me.
Thanks in advance.
There is another one, but you'll have to check if this also leaks memory.
http://sourceforge.net/projects/ksp/files/ksp/ksp-1.0/
Could anyone point me to an example implementation of a HOT Queue or give some pointers on how to approach implementing one?
Here is a page I found that provides at least a clue toward what data structures you might use to implement this. Scroll down to the section called "Making A* Scalable." It's unfortunate that the academic papers on the subject mention having written C++ code but don't provide any.
Here is the link to the paper describing HOT queues. It's very abstract, which is why I wanted to see a coded example (I'm still trying to find my way around it).
http://www.star-lab.com/goldberg/pub/neci-tr-97-104.ps
The "cheapest" variant of this, so to speak, is a two-level heap queue (maybe this sounds more familiar). What I wanted to do is improve the running time of Dijkstra's shortest path algorithm.
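As a starting point for anyone reading along: the sketch below is not a HOT queue. It is the much simpler one-level bucket queue (Dial's algorithm), shown only to illustrate how buckets can replace the heap in Dijkstra. It assumes integer arc weights in the range 0..max_weight; Arc and dial_shortest_paths are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical arc type for this sketch: target vertex and integer weight.
struct Arc {
    int to;
    int weight;
};

// Dijkstra with a circular array of max_weight + 1 buckets (Dial's algorithm).
// All tentative distances in the queue lie in [d, d + max_weight], so a
// bucket index of distance % (max_weight + 1) never collides.
// Unreachable vertices keep the maximum long long value.
std::vector<long long> dial_shortest_paths(const std::vector<std::vector<Arc>>& graph,
                                           int source, int max_weight) {
    assert(max_weight >= 0);
    assert(source >= 0 && static_cast<std::size_t>(source) < graph.size());
    const long long INF = std::numeric_limits<long long>::max();
    const long long nb = max_weight + 1LL;
    std::vector<long long> dist(graph.size(), INF);
    std::vector<std::vector<int>> buckets(static_cast<std::size_t>(nb));
    dist[source] = 0;
    buckets[0].push_back(source);
    std::size_t pending = 1;  // entries still stored in buckets, stale ones included
    for (long long d = 0; pending > 0; ++d) {
        std::vector<int>& bucket = buckets[static_cast<std::size_t>(d % nb)];
        while (!bucket.empty()) {
            const int u = bucket.back();
            bucket.pop_back();
            --pending;
            if (dist[u] != d) continue;  // stale entry: u was reached more cheaply
            for (const Arc& a : graph[u]) {
                assert(a.weight >= 0 && a.weight <= max_weight);
                const long long nd = d + a.weight;
                if (nd < dist[a.to]) {
                    dist[a.to] = nd;
                    buckets[static_cast<std::size_t>(nd % nb)].push_back(a.to);
                    ++pending;
                }
            }
        }
    }
    return dist;
}
```

The weakness is visible in the outer loop: it steps through every distance value, so large weights mean many empty buckets. As far as I understand the paper, multi-level buckets and the heap-on-top idea are aimed at exactly that cost.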
"What I wanted to do is to improve the running time of Dijkstra's shortest path algorithm."
Have you considered using the Boost Graph Library?
If you are using your own implementation of the algorithm you might already get better results using the one the BGL provides.
However, it might be nontrivial to modify your code so that it works with the BGL.
Of course, a speed-up could also be gained by using another algorithm instead of Dijkstra.