A* and N-Puzzle optimization - C++

I am writing a solver for the N-Puzzle (see http://en.wikipedia.org/wiki/Fifteen_puzzle).
Right now I am using an unordered_map to store hash values of the puzzle board,
and Manhattan distance as the heuristic for the algorithm, which is a plain DFS.
So I have:
auto pred = [](Node * lhs, Node * rhs){ return lhs->manhattanCost_ < rhs->manhattanCost_; };
std::multiset<Node *, decltype(pred)> frontier(pred);
std::vector<Node *> explored; // holds nodes we have already explored
std::tr1::unordered_set<unsigned> frontierHashTable;
std::tr1::unordered_set<unsigned> exploredHashTable;
This works great for n = 2 and 3.
However, it's really hit or miss for n = 4 and above (the STL is unable to allocate memory for a new node).
I also suspect that I am getting hash collisions in the unordered_set:
unsigned makeHash(const Node & pNode)
{
    unsigned int b = 378551;
    unsigned int a = 63689;
    unsigned int hash = 0;

    for(std::size_t i = 0; i < pNode.data_.size(); i++)
    {
        hash = hash * a + pNode.data_[i];
        a = a * b;
    }
    return hash;
}
16! ≈ 2 × 10^13 (possible arrangements)
2^32 ≈ 4 × 10^9 (possible hash values in a 32-bit hash)
My question is how can I optimize my code to solve for n=4 and n=5?
I know from here
http://kociemba.org/fifteen/fifteensolver.html
http://www.ic-net.or.jp/home/takaken/e/15pz/index.html
that n=4 is possible in less than a second on average.
Edit:
The algorithm itself is here:
bool NPuzzle::aStarSearch()
{
    auto pred = [](Node * lhs, Node * rhs){ return lhs->manhattanCost_ < rhs->manhattanCost_; };
    std::multiset<Node *, decltype(pred)> frontier(pred);
    std::vector<Node *> explored; // holds nodes we have already explored
    std::tr1::unordered_set<unsigned> frontierHashTable;
    std::tr1::unordered_set<unsigned> exploredHashTable;

    // if we are in the solved position in the first place, return true
    if(initial_ == target_)
    {
        current_ = initial_;
        return true;
    }

    frontier.insert(new Node(initial_)); // we are going to delete everything from the frontier later..

    for(;;)
    {
        if(frontier.empty())
        {
            std::cout << "depth first search " << "cant solve!" << std::endl;
            return false;
        }

        // remove a node from the frontier, and place it into the explored set
        Node * pLeaf = *frontier.begin();
        frontier.erase(frontier.begin());
        explored.push_back(pLeaf);

        // do the same for the hash table
        unsigned hashValue = makeHash(*pLeaf);
        frontierHashTable.erase(hashValue);
        exploredHashTable.insert(hashValue);

        std::vector<Node *> children = pLeaf->genChildren();
        for( auto it = children.begin(); it != children.end(); ++it)
        {
            unsigned childHash = makeHash(**it);
            if(inFrontierOrExplored(frontierHashTable, exploredHashTable, childHash))
            {
                delete *it;
            }
            else
            {
                if(**it == target_)
                {
                    explored.push_back(*it);
                    current_ = **it;

                    // delete everything else in children
                    for( auto it2 = ++it; it2 != children.end(); ++it2)
                        delete *it2;

                    // delete everything in the frontier
                    for( auto it = frontier.begin(); it != frontier.end(); ++it)
                        delete *it;

                    // delete everything in explored
                    explored_.swap(explored);
                    for( auto it = explored.begin(); it != explored.end(); ++it)
                        delete *it;

                    return true;
                }
                else
                {
                    frontier.insert(*it);
                    frontierHashTable.insert(childHash);
                }
            }
        }
    }
}

Since this is homework I will suggest some strategies you might try.
First, try using valgrind or a similar tool to check for memory leaks. You may have some memory leaks if you don't delete everything you new.
Second, calculate a bound on the number of nodes that should be explored. Keep track of the number of nodes you do explore. If you pass the bound, you might not be detecting cycles properly.
Third, try the algorithm with depth first search instead of A*. Its memory requirements should be linear in the depth of the tree and it should just be a matter of changing the sort ordering (pred). If DFS works, your A* search may be exploring too many nodes or your memory structures might be too inefficient. If DFS doesn't work, again it might be a problem with cycles.
Fourth, try more compact memory structures. For example, std::multiset does what you want but std::priority_queue with a std::deque may take up less memory. There are other changes you could try and see if they improve things.
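For the fourth point, a minimal sketch of what that could look like, reusing the Node type and manhattanCost_ member from the question (both assumptions here, since only fragments were posted):

#include <deque>
#include <queue>

// ">" in the comparator makes this a min-heap: the cheapest node is on top.
auto cmp = [](Node* lhs, Node* rhs)
{ return lhs->manhattanCost_ > rhs->manhattanCost_; };
std::priority_queue<Node*, std::deque<Node*>, decltype(cmp)> frontier(cmp);

frontier.push(new Node(initial_));
Node* pLeaf = frontier.top(); // cheapest node
frontier.pop();

Unlike std::multiset, a priority_queue over a deque stores the pointers contiguously in chunks instead of one tree node per element, which is usually noticeably more compact.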

First, I would recommend Cantor expansion, which you can use as the hashing method. It's 1-to-1, i.e. the 16! possible arrangements are hashed to 0 ~ 16! - 1.
Then I would implement the map myself; as you may know, std is not always efficient enough for this kind of computation. std::map is actually a binary search tree, and I would recommend a size-balanced tree, or you can use an AVL tree.
And just for the record, directly using a bool hash[] array together with a big prime may also give good results.
Then the most important thing: the A* function itself. As on the first of the pages you linked, you may try a variety of A* functions and find the best one.
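For what it's worth, a minimal sketch of the Cantor expansion (permutation rank) idea, assuming the board is stored as a flat permutation of 0..15 as the question's data_ member appears to be:

#include <cstdint>
#include <vector>

// Maps each permutation of 0..15 to a unique value in [0, 16!-1], so there
// are no collisions. 16! is about 2.1e13, which fits in 64 bits.
std::uint64_t cantorRank(const std::vector<unsigned>& perm)
{
    std::uint64_t rank = 0;
    const std::size_t n = perm.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        unsigned smaller = 0; // how many later tiles are smaller than perm[i]
        for (std::size_t j = i + 1; j < n; ++j)
            if (perm[j] < perm[i])
                ++smaller;
        rank = rank * (n - i) + smaller; // accumulate factorial-base digits
    }
    return rank;
}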

You are only using the heuristic function to order the multiset. You should order your frontier by f(n) = g(n) + h(n), i.e. the path length so far plus the heuristic.
The problem here is that you are picking the node with the smallest heuristic value, which may not be the correct "next child" to pick.
I believe this is what is causing your calculation to explode.
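To illustrate, a sketch of the suggested ordering, where pathCost_ is a hypothetical member holding g(n), the cost of the path from the start to the node (manhattanCost_ is the h(n) from the question):

// Order the frontier by f(n) = g(n) + h(n) instead of h(n) alone.
auto pred = [](Node * lhs, Node * rhs)
{
    return (lhs->pathCost_ + lhs->manhattanCost_)
         < (rhs->pathCost_ + rhs->manhattanCost_);
};
std::multiset<Node *, decltype(pred)> frontier(pred);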


C++ LRU cache - need suggestions on how to improve speed

The task is to implement an O(1) Least Recently Used Cache.
Here is the question on leetcode:
https://leetcode.com/problems/lru-cache/
Here is my solution; while it is O(1), it is not the fastest implementation. Could you give some feedback and maybe some ideas on how I can optimize this? Thank you!
#include <unordered_map>
#include <list>
using namespace std; // implicit in the LeetCode environment

class LRUCache {
    // umap<key,<value,listiterator>>
    // store the key, value, position in list (iterator) where push_back occurred
private:
    unordered_map<int, pair<int, list<int>::iterator>> umap;
    list<int> klist;
    int cap = -1;

public:
    LRUCache(int capacity) : cap(capacity) {
    }

    int get(int key) {
        // if the key exists in the unordered map
        if (umap.count(key)) {
            // remove it from the old position
            klist.erase(umap[key].second);
            klist.push_back(key);
            list<int>::iterator key_loc = klist.end();
            umap[key].second = --key_loc;
            return umap[key].first;
        }
        return -1;
    }

    void put(int key, int value) {
        // if key already exists delete it from the umap and klist
        if (umap.count(key)) {
            klist.erase(umap[key].second);
            umap.erase(key);
        }
        // if the unordered map is at max capacity
        if (umap.size() == cap) {
            umap.erase(klist.front());
            klist.pop_front();
        }
        // finally update klist and umap
        klist.push_back(key);
        list<int>::iterator key_loc = klist.end();
        umap[key].first = value;
        umap[key].second = --key_loc;
        return;
    }
};
/**
* Your LRUCache object will be instantiated and called as such:
* LRUCache* obj = new LRUCache(capacity);
* int param_1 = obj->get(key);
* obj->put(key,value);
*/
Here are some optimizations that might help:
Take this segment of code from the get function:
if(umap.count(key)){
// remove it from the old position
klist.erase(umap[key].second);
The above will look up key in the map twice: once for the count method to see if it exists, and again when invoking the [] operator to fetch its value. Save a few cycles by doing this:
auto itor = umap.find(key);
if (itor != umap.end()) {
// remove it from the old position
klist.erase(itor->second);
In the put function, you do this:
if(umap.count(key)){
klist.erase(umap[key].second);
umap.erase(key);
}
Same as with get, you can avoid the redundant search through umap. Additionally, there's no reason to invoke umap.erase only to add that same key back into the map a few lines later.
Further, this is also inefficient:
umap[key].first = value;
umap[key].second = --key_loc;
Similar to the above, this redundantly looks up key twice in the map. In the first assignment statement, the key is not in the map (it was just erased or never existed), so operator[] default-constructs a new value pair. The second assignment then does another lookup in the map.
Let's restructure your put function as follows:
void put(int key, int value) {
    auto itor = umap.find(key);
    bool reinsert = (itor != umap.end());

    // if key already exists delete it from the klist only
    if (reinsert) {
        klist.erase(itor->second.second);
    }
    else {
        // if the unordered map is at max capacity
        if (umap.size() == cap) {
            umap.erase(klist.front());
            klist.pop_front();
        }
    }

    // finally update klist and umap
    klist.push_back(key);
    list<int>::iterator key_loc = klist.end();
    auto endOfList = --key_loc;

    if (reinsert) {
        itor->second.first = value;
        itor->second.second = endOfList;
    }
    else {
        const pair<int, list<int>::iterator> itempair = { value, endOfList };
        umap.emplace(key, itempair);
    }
}
That's about as far as you can go with the erase-then-push_back pattern, which destroys and re-creates a list node on every access. One further improvement: std::list::splice can move an existing node from the middle to the back of the same list in O(1), without any deallocation or reallocation, and the iterator stored in the map remains valid. Alternatively, you can use your own doubly-linked list type and fix up the prev/next pointers yourself.
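For illustration, a rough sketch of the splice-based update inside get, using the umap and klist members from the posted class:

auto itor = umap.find(key);
if (itor != umap.end()) {
    // Move the accessed node to the back of klist in O(1), with no
    // deallocation/reallocation; the iterator stored in umap stays valid.
    klist.splice(klist.end(), klist, itor->second.second);
    return itor->second.first;
}
return -1;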
Here is my solution, while it is O(1) it is not the fastest implementation
could you give some feedback and maybe ideas on how can I optimize this ? Thank you !
Gonna take on selbie's point here:
Every instance of if(umap.count(key)) searches for the key, and using umap[key] afterwards performs the equivalent search again. You can avoid the double search by obtaining an iterator to the entry with a single std::unordered_map::find() operation.
selbie already gave the code for int get()'s search; here's the one for void put():
auto it = umap.find(key);
if (it != umap.end())
{
    klist.erase(it->second.second); // the mapped value is a pair; .second is the list iterator
    umap.erase(it);
}
Side case:
Not applicable to your code as of now due to the lack of input/output work, but in case you use std::cin and std::cout, you can disable the synchronization between C and C++ streams, and untie cin from cout, as an optimization (they are tied together by default):
// If you're using cin/cout I/O
ios::sync_with_stdio(false);
cin.tie(nullptr);
cout.tie(nullptr);

Union-Find leetcode question exceeding time limit

I am solving this problem on leetcode: https://leetcode.com/problems/sentence-similarity-ii/description/. It involves implementing the union-find algorithm to find out whether two sentences are similar, given a list of pairs representing similar words. I implemented ranked union-find, where I keep track of the size of each subset and join the smaller subtree to the bigger one, but for some reason the code is still exceeding the time limit. Can someone point me to what I am doing wrong? How can it be optimized further? I saw other accepted solutions using the same ranked union-find algorithm.
Here is the code:
string root(map<string, string> dict, string element) {
    if(dict[element] == element)
        return element;
    return root(dict, dict[element]);
}

bool areSentencesSimilarTwo(vector<string>& words1, vector<string>& words2, vector<pair<string, string>> pairs) {
    if(words1.size() != words2.size()) return false;
    std::map<string, string> dict;
    std::map<string, int> sizes;
    for(auto pair: pairs) {
        if(dict.find(pair.first) == dict.end()) {
            dict[pair.first] = pair.first;
            sizes[pair.first] = 1;
        }
        if(dict.find(pair.second) == dict.end()) {
            dict[pair.second] = pair.second;
            sizes[pair.second] = 1;
        }
        auto firstRoot = root(dict, pair.first);
        auto secondRoot = root(dict, pair.second);
        if(sizes[firstRoot] < sizes[secondRoot]) {
            dict[firstRoot] = secondRoot;
            sizes[firstRoot] += sizes[secondRoot];
        }
        else {
            dict[secondRoot] = firstRoot;
            sizes[secondRoot] += sizes[firstRoot];
        }
    }
    for(int i = 0; i < words1.size(); i++) {
        if(words1[i] == words2[i]) {
            continue;
        }
        else if(root(dict, words1[i]) != root(dict, words2[i])) {
            return false;
        }
    }
    return true;
}
Thanks!
Your union-find is broken with respect to complexity. Please read Wikipedia: Disjoint-set data structure.
For union-find to have its near-O(1) complexity, it has to employ path compression. For that, your root method has to:
Get dict by reference, so that it can modify it.
Apply path compression to all elements on the path, so that they point directly to the root.
Without compression you will have O(log N) complexity for root(), which could be OK. But even then, you'd have to fix root() so that it gets dict by reference and not by value. Passing dict by value costs O(N) per call.
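A minimal sketch of that fix (pass by reference plus path compression), keeping the question's map<string, string> representation:

string root(map<string, string>& dict, const string& element) {
    if (dict[element] == element)
        return element;
    string r = root(dict, dict[element]);
    dict[element] = r; // path compression: point element directly at the root
    return r;
}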
The fact that dict is an std::map makes any query cost O(log N), instead of O(1). std::unordered_map costs O(1), but in practice for N < 1000, std::map is faster. Also, even if std::unordered_map is used, hashing a string costs O(len(str)).
If the data is big and performance is still slow, you may gain from working with integer indexes instead of strings, and running union-find over a vector<int>. This is error-prone, since you have to correctly deal with duplicate strings.
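If you do go the index route, a rough sketch of what it could look like (the DSU and id names are hypothetical, not from the original code):

#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct DSU {
    std::vector<int> parent, size;
    int add() { parent.push_back((int)parent.size()); size.push_back(1); return (int)parent.size() - 1; }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); } // path compression
    void unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return;
        if (size[a] < size[b]) std::swap(a, b); // union by size
        parent[b] = a;
        size[a] += size[b];
    }
};

// Assigns a stable integer index to each distinct string the first time it is seen.
int id(std::unordered_map<std::string, int>& ids, DSU& dsu, const std::string& s) {
    auto it = ids.find(s);
    if (it != ids.end()) return it->second;
    int idx = dsu.add();
    ids.emplace(s, idx);
    return idx;
}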

How to merge sorted vectors into a single vector in C++

I have 10,000 vector<pair<unsigned,unsigned>>s and I want to merge them into a single vector that is lexicographically sorted and contains no duplicates. In order to do so I wrote the following code. However, to my surprise, the code below is taking a lot of time. Can someone please suggest how I can reduce its running time?
using obj = pair<unsigned, unsigned>;

vector< vector<obj> > vecOfVec; // 10,000 vector<obj>, each sorted with size()=10M
vector<obj> result;

for(auto it=vecOfVec.begin(), l=vecOfVec.end(); it!=l; ++it)
{
    // append vectors
    result.insert(result.end(),it->begin(),it->end());
    // sort result
    std::sort(result.begin(), result.end());
    // remove duplicates from result
    result.erase(std::unique(result.begin(), result.end()), result.end());
}
I think you should use the fact that the vectors in vecOfVec are sorted.
So: detect the minimal front value across the individual vectors, push_back() it into the result, and then remove from the front of every vector all the values equal to that minimum (this avoids duplicates in result).
If you can destroy the vecOfVec variable, something like the following (caution: code not tested, just to give an idea):
while ( vecOfVec.size() )
{
    // detect the minimal front value
    auto itc = vecOfVec.cbegin();
    auto lc = vecOfVec.cend();

    auto valMin = itc->front();
    while ( ++itc != lc )
        valMin = std::min(valMin, itc->front());

    // push_back() the minimal front value in result
    result.push_back(valMin);

    for ( auto it = vecOfVec.begin() ; it != vecOfVec.end() ; )
    {
        // remove all the front values equal to valMin (this removes the
        // duplicates from result)
        while ( (false == it->empty()) && (valMin == it->front()) )
            it->erase(it->begin());

        // when a vector becomes empty, it is removed
        it = ( it->empty() ? vecOfVec.erase(it) : ++it );
    }
}
If you can, I suggest switching vecOfVec from a vector< vector<obj> > to something that permits efficient removal from the front of the individual containers (stacks?) and efficient removal of whole containers (a list?).
If there are a lot of duplicates, you should use a set rather than a vector for your result, as a set is the most natural container for storing data without duplicates:
set< pair<unsigned,unsigned> > resultSet;
for (auto it=vecOfVec.begin(); it!=vecOfVec.end(); ++it)
resultSet.insert(it->begin(), it->end());
If you need to turn it into a vector, you can write
vector< pair<unsigned,unsigned> > resultVec(resultSet.begin(), resultSet.end());
Note that since your code runs over 800 billion elements, it would still take a lot of time, no matter what. At least hours, if not days.
Other ideas are:
recursively merge vectors (10000 -> 5000 -> 2500 -> ... -> 1); see the sketch below for one pairwise pass
to merge 10000 vectors, store 10000 iterators in a heap structure
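As a rough sketch of the first idea, one pairwise pass could look like this (obj is the pair alias from the question):

#include <algorithm>
#include <iterator>
#include <vector>

std::vector<obj> merge_two(const std::vector<obj>& a, const std::vector<obj>& b)
{
    std::vector<obj> out;
    out.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    out.erase(std::unique(out.begin(), out.end()), out.end()); // drop duplicates
    return out;
}

// Each pass halves the number of vectors; repeat until only one remains.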
One problem with your code is the excessive use of std::sort. Unfortunately, the quicksort algorithm (which is usually the workhorse behind std::sort) is not particularly faster on an already sorted array.
Moreover, you're not exploiting the fact that your initial vectors are already sorted. This can be exploited by keeping a heap of their next values, in which case you never need to call sort again. This may be coded as follows (code tested using obj=int), but perhaps it can be made more concise.
// represents the next unused entry in one vector<obj>
template<typename obj>
struct feed
{
    typename std::vector<obj>::const_iterator current, end;

    feed(std::vector<obj> const&v)
        : current(v.begin()), end(v.end()) {}

    friend bool operator> (feed const&l, feed const&r)
    { return *(l.current) > *(r.current); }
};

// - returns the smallest element
// - set corresponding feeder to next and re-establish the heap
template<typename obj>
obj get_next(std::vector<feed<obj>>&heap)
{
    auto&f = heap[0];
    auto x = *(f.current++);
    if(f.current == f.end) {
        std::pop_heap(heap.begin(),heap.end(),std::greater<feed<obj>>{});
        heap.pop_back();
    } else
        std::make_heap(heap.begin(),heap.end(),std::greater<feed<obj>>{});
    return x;
}

template<typename obj>
std::vector<obj> merge(std::vector<std::vector<obj>>const&vecOfvec)
{
    // create min heap of feed<obj> and count total number of objects
    std::vector<feed<obj>> input;
    input.reserve(vecOfvec.size());
    size_t num_total = 0;
    for(auto const&v:vecOfvec)
        if(v.size()) {
            num_total += v.size();
            input.emplace_back(v);
        }
    std::make_heap(input.begin(),input.end(),std::greater<feed<obj>>{});

    // append values in ascending order, avoiding duplicates
    std::vector<obj> result;
    result.reserve(num_total);
    while(!input.empty()) {
        auto x = get_next(input);
        result.push_back(x);
        while(!input.empty() &&
              !(*(input[0].current) > x)) // remove duplicates
            get_next(input);
    }
    return result;
}

Combining arrays/lists in a specific fashion

I'm trying to find a sensible algorithm to combine multiple lists/vectors/arrays as defined below.
Each element contains a float declaring the start of its range of validity and a constant that is used over this range. Where ranges from different lists overlap, their constants need to be added to produce one global list.
I've made an attempt at an illustration below to try and give a good idea of what I mean:
First List:
  a1 on [0.5, 2), a2 on [2, 3.2), a3 on [3.2, 4)
Second List:
  b1 on [1, 2), b2 on [2, 3), b3 on [3, 4.5)
Desired Output:
  a1 on [0.5, 1), a1+b1 on [1, 2), a2+b2 on [2, 3), a2+b3 on [3, 3.2), a3+b3 on [3.2, 4), b3 on [4, 4.5)
I can't think of a sensible way of going about this in the case of n lists; just two is quite easy to brute force.
Any hints or ideas would be welcome. Each list is represented as a C++ std::vector (so feel free to use standard algorithms) and is sorted by start-of-range value.
Cheers!
Edit: Thanks for the advice; I've come up with a naive implementation, not sure why I couldn't get here on my own first. To my mind the obvious improvement would be to store an iterator for each vector, since they're already sorted, and not have to re-traverse each vector for each point. Given that most vectors will contain fewer than 100 elements, but there may be many vectors, this may or may not be worthwhile. I'd have to profile to see.
Any thoughts on this?
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
struct DataType
{
double intervalStart;
int data;
// More data here, the data is not just a single int, but that
// works for our demonstration
};
int main(void)
{
// The final "data" of each vector is meaningless as it refers to
// the coming range which won't be used as this is only for
// bounded ranges
std::vector<std::vector<DataType> > input = {{{0.5, 1}, {2.0, 3}, {3.2, 3}, {4.0, 4}},
{{1.0, 5}, {2.0, 6}, {3.0, 7}, {4.5, 8}},
{{-34.7895, 15}, {-6.0, -2}, {1.867, 5}, {340, 7}}};
// Setup output vector
std::vector<DataType> output;
std::size_t inputSize = 0;
for (const auto& internalVec : input)
inputSize += internalVec.size();
output.reserve(inputSize);
// Fill output vector
for (const auto& internalVec : input)
std::copy(internalVec.begin(), internalVec.end(), std::back_inserter(output));
// Sort output vector by intervalStartPoints
std::sort(output.begin(), output.end(),
[](const DataType& data1, const DataType& data2)
{
return data1.intervalStart < data2.intervalStart;
});
// Remove DataTypes with same intervalStart - each interval can only start once
output.erase(std::unique(output.begin(), output.end(),
[](const DataType& dt1, const DataType& dt2)
{
return dt1.intervalStart == dt2.intervalStart;
}), output.end());
// Output now contains all the right intersections, just not with the right data
// Lambda to find the associated data value associated with an
// intervsalStart value in a vector
auto FindDataValue = [&](const std::vector<DataType> v, double startValue)
{
auto iter = std::find_if(v.begin(), v.end(), [startValue](const DataType& data)
{
return data.intervalStart > startValue;
});
if (iter == v.begin() || iter == v.end())
{
return 0;
}
return (iter-1)->data;
};
// For each interval in the output traverse the input and sum the
// data constants
for (auto& val : output)
{
int sectionData = 0;
for (const auto& iv : input)
sectionData += FindDataValue(iv, val.intervalStart);
val.data = sectionData;
}
for (const auto& i : output)
std::cout << "loc: " << i.intervalStart << " data: " << i.data << std::endl;
return 0;
}
Edit 2: @Stas's code is a very good way to approach this problem. I've just tested it on all the edge cases I could think of.
Here's my merge_intervals implementation in case anyone is interested. The only slight change I've had to make to the snippets Stas provided is to run
for (auto& v : input)
    v.back().data = 0;
before combining the vectors as suggested. Thanks!
template<class It1, class It2, class OutputIt>
OutputIt merge_intervals(It1 first1, It1 last1,
It2 first2, It2 last2,
OutputIt destBegin)
{
const auto begin1 = first1;
const auto begin2 = first2;
auto CombineData = [](const DataType& d1, const DataType& d2)
{
return DataType{d1.intervalStart, (d1.data+d2.data)};
};
for (; first1 != last1; ++destBegin)
{
if (first2 == last2)
{
return std::copy(first1, last1, destBegin);
}
if (first1->intervalStart == first2->intervalStart)
{
*destBegin = CombineData(*first1, *first2);
++first1; ++first2;
}
else if (first1->intervalStart < first2->intervalStart)
{
if (first2 > begin2)
*destBegin = CombineData(*first1, *(first2-1));
else
*destBegin = *first1;
++first1;
}
else
{
if (first1 > begin1)
*destBegin = CombineData(*first2, *(first1-1));
else
*destBegin = *first2;
++first2;
}
}
return std::copy(first2, last2, destBegin);
}
Unfortunately, your algorithm is inherently slow. It doesn't make sense to profile or apply C++-specific tweaks; they won't help. The algorithm will effectively never finish even on fairly small inputs, such as merging 1000 lists of 10000 elements each.
Let's try to evaluate time complexity of your algo. For the sake of simplicity, let's merge only lists of the same length.
L - length of a list
N - number of lists to be merged
T = L * N - length of a whole concatenated list
Complexity of your algorithm steps:
create output vector - O(T)
sort output vector - O(T*log(T))
filter output vector - O(T)
fix data in output vector - O(T*T)
See, the last step defines the whole algorithm's complexity: O(T*T) = O(L^2*N^2). That is not acceptable for a practical application: to merge 1000 lists of 10000 elements each, the algorithm would have to run about 10^14 cycles.
Actually, the task is pretty complex, so do not try to solve it in one step. Divide and conquer!
Write an algorithm that merges two lists into one
Use it to merge a list of lists
Merging two lists into one
This is relatively easy to implement (but be careful with corner cases). The algorithm should have linear time complexity: O(2*L). Take a look at how std::merge is implemented. You just need to write your custom variant of std::merge, let's call it merge_intervals.
Applying a merge algorithm to a list of lists
This is a little bit tricky, but again, divide and conquer! The idea is to do a recursive merge: split the list of lists into two halves and merge them.
template<class It, class Combine>
auto merge_n(It first, It last, Combine comb)
    -> typename std::remove_reference<decltype(*first)>::type
{
    if (first == last)
        throw std::invalid_argument("Empty range");
    auto count = std::distance(first, last);
    if (count == 1)
        return *first;
    auto it = first;
    std::advance(it, count / 2);
    auto left = merge_n(first, it, comb);
    auto right = merge_n(it, last, comb);
    return comb(left, right);
}
Usage:
auto combine = [](const std::vector<DataType>& a, const std::vector<DataType>& b)
{
std::vector<DataType> result;
merge_intervals(a.begin(), a.end(), b.begin(), b.end(),
std::back_inserter(result));
return result;
};
auto output = merge_n(input.begin(), input.end(), combine);
The nice property of such a recursive approach is its time complexity: it is O(L*N*log(N)) for the whole algorithm. So, to merge 1000 lists of 10000 elements each, the algorithm has to run about 10000 * 1000 * 9.966 = 99,660,000 cycles. That is 1,000,000 times faster than the original algorithm.
Moreover, such an algorithm is inherently parallelizable. It is not a big deal to write a parallel version of merge_n and run it on a thread pool.
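For illustration only, a rough sketch of such a parallel variant using std::async rather than a thread pool (merge_n and the combine lambda are the ones above; the depth cutoff is an arbitrary assumption):

#include <future>
#include <iterator>
#include <type_traits>

template<class It, class Combine>
auto merge_n_par(It first, It last, Combine comb, int depth = 3)
    -> typename std::remove_reference<decltype(*first)>::type
{
    auto count = std::distance(first, last);
    if (count <= 1 || depth == 0)
        return merge_n(first, last, comb); // fall back to the sequential version
    auto mid = first;
    std::advance(mid, count / 2);
    // merge the right half on another thread while this thread handles the left half
    auto right = std::async(std::launch::async,
                            [=] { return merge_n_par(mid, last, comb, depth - 1); });
    auto left = merge_n_par(first, mid, comb, depth - 1);
    return comb(left, right.get());
}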
I know I'm a bit late to the party, but when I started writing this you didn't have a suitable answer yet, and my solution should have a relatively good time complexity, so here you go:
I think the most straightforward way to approach this is to see each of your sorted lists as a stream of events: at a given time, the value (of that stream) changes to a new value:
template<typename T>
struct Point {
using value_type = T;
float time;
T value;
};
You want to superimpose those streams into a single stream (i.e. having their values summed up at any given point). For that you take the earliest event from all streams, and apply its effect on the result stream. Therefore, you need to first "undo" the effect that the previous value from that stream made on the result stream, and then add the new value to the current value of the result stream.
To be able to do that, you need to remember for each stream the last value, the next value (and when the stream is empty):
std::vector<std::tuple<Value, StreamIterator, StreamIterator>> streams;
The first element of the tuple is the last effect of that stream onto the result stream, the second is an iterator pointing to the streams next event, and the last is the end iterator of that stream:
transform(from, to, inserter(streams, begin(streams)),
[] (auto & stream) {
return make_tuple(static_cast<Value>(0), begin(stream), end(stream));
});
To be able to always get the earliest event of all the streams, it helps to keep the (information about the) streams in a (min) heap, where the top element is the stream with the next (earliest) event. That's the purpose of the following comparator:
auto heap_compare = [] (auto const & lhs, auto const & rhs) {
bool less = (*get<1>(lhs)).time < (*get<1>(rhs)).time;
return (not less);
};
Then, as long as there are still some events (i.e. some stream that is not empty), first (re)build the heap, take the top element and apply its next event to the result stream, and then remove that element from the stream. Finally, if the stream is now empty, remove it.
// The current value of the result stream.
Value current = 0;
while (streams.size() > 0) {
// Reorder the stream information to get the one with the earliest next
// value into top ...
make_heap(begin(streams), end(streams), heap_compare);
// .. and select it.
auto & earliest = streams[0];
// New value is the current one, minus the previous effect of the selected
// stream plus the new value from the selected stream
current = current - get<0>(earliest) + (*get<1>(earliest)).value;
// Store the new time point with the new value and the time of the used
// time point from the selected stream
*out++ = Point<Value>{(*get<1>(earliest)).time, current};
// Update the effect of the selected stream
get<0>(earliest) = (*get<1>(earliest)).value;
// Advance selected stream to its next time point
++(get<1>(earliest));
// Remove stream if empty
if (get<1>(earliest) == get<2>(earliest)) {
swap(streams[0], streams[streams.size() - 1u]);
streams.pop_back();
}
}
This will return a stream where there might be multiple points with the same time, but a different value. This occurs when there are multiple "events" at the same time. If you only want the last value, i.e. the value after all these events happened, then one needs to combine them:
merge_point_lists(begin(input), end(input), inserter(merged, begin(merged)));
// returns points with the same time, but with different values. remove these
// duplicates, by first making them REALLY equal, i.e. setting their values
// to the last value ...
for (auto write = begin(merged), read = begin(merged), stop = end(merged);
write != stop;) {
for (++read; (read != stop) and (read->time == write->time); ++read) {
write->value = read->value;
}
for (auto const cached = (write++)->value; write != read; ++write) {
write->value = cached;
}
}
// ... and then removing them.
merged.erase(
unique(begin(merged), end(merged),
[](auto const & lhs, auto const & rhs) {
return (lhs.time == rhs.time);}),
end(merged));
Concerning the time complexity: this iterates over all "events", so it depends on the number of events e. The very first make_heap call has to build a complete new heap; this has worst-case complexity 3 * s, where s is the number of streams the function has to merge. On subsequent calls, make_heap only has to correct the very first element; this has worst-case complexity log(s'). I write s' because the number of streams (that still need to be considered) will decrease towards zero. This
gives
3s + (e - 1) * log(s')
as the complexity. In the worst case, s' decreases slowly (this happens when the events are evenly distributed across the streams, i.e. all streams have the same number of events), which gives roughly:
3s + (e - 1 - s) * log(s) + (sum (log(i)) for i = 1 to s)
Do you really need a data structure as the result? I don't think so. Actually you're defining several functions that can be added. The examples you give are encoded using a 'start, value(, implicit end)' tuple. The basic building block is a function that looks up its value at a certain point:
// assuming a type `edge` with members x (start of range) and value
double valueAt(const vector<edge> &starts, float point) {
    auto it = std::adjacent_find(begin(starts), end(starts),
        [&](edge e1, edge e2) {
            return e1.x <= point && point < e2.x;
        });
    return it->value;
}
The function value at a point is then the sum of the function values of all the source series.
If you really need a list in the end, you can join and sort all edge.x values for all series, and create the list from that.
Unless performance is an issue :)
If you can combine two of these structures, you can combine many.
First, encapsulate your std::vector into a class. Implement what you know as operator+= (and define operator+ in terms of this if you want). With that in place, you can combine as many as you like, just by repeated addition. You could even use std::accumulate to combine a collection of them.
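A minimal sketch of that idea, assuming the DataType and merge_intervals pieces shown earlier in the thread:

#include <iterator>
#include <numeric>
#include <utility>
#include <vector>

class StepFunction {
public:
    explicit StepFunction(std::vector<DataType> points = {}) : points_(std::move(points)) {}

    StepFunction& operator+=(const StepFunction& rhs) {
        std::vector<DataType> merged;
        merge_intervals(points_.begin(), points_.end(),
                        rhs.points_.begin(), rhs.points_.end(),
                        std::back_inserter(merged));
        points_.swap(merged);
        return *this;
    }

    friend StepFunction operator+(StepFunction lhs, const StepFunction& rhs) {
        lhs += rhs;
        return lhs;
    }

private:
    std::vector<DataType> points_;
};

// Combining a whole (hypothetical) collection of them:
// auto total = std::accumulate(funcs.begin(), funcs.end(), StepFunction{});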

Get top 5 algorithm from a container?

I have a class (object), User. This user has 2 private attributes, "name" and "popularity". I store the objects in a vector (container).
From the container, I need to find the top 5 most popular users. How do I do that? (I have some ugly code, which I will post here; if you have a better approach, please let me know. Feel free to use another container if you think vector is not a good choice, but please use only map or multimap, list, vector or array, because those are the only ones I know how to use.) My current code is:
int top5 = 0, top4 = 0, top3 = 0, top2 = 0, top1 = 0;
vector<User>::iterator it;
for (it = user.begin(); it != user.end(); ++it)
{
    if( it->getPopularity() > top5){
        if(it->getPopularity() > top4){
            if(it->getPopularity() > top3){
                if(it->getPopularity() > top2){
                    if(it->getPopularity() > top1){
                        top1 = it->getPopularity();
                        continue;
                    } else {
                        top2 = it->getPopularity();
                        continue;
                    }
                } else {
                    top3 = it->getPopularity();
                    continue;
                }
            }
        } else {
            top4 = it->getPopularity();
            continue;
        }
    } else {
        top5 = it->getPopularity();
        continue;
    }
}
I know the code is ugly and might be prone to errors, so if you have better code, please do share it with us (us == C++ newbies). Thanks
You can use the std::partial_sort algorithm to sort your vector so that the first five elements are sorted and the rest remain unsorted. Something like this (untested code):
bool compareByPopularity( User a, User b ) {
    return a.getPopularity() > b.getPopularity();
}

vector<User> getMostPopularUsers( vector<User> users, int num ) { // take a copy so it can be sorted
    if ( users.size() <= static_cast<size_t>(num) ) {
        sort( users.begin(), users.end(), compareByPopularity );
        return users;
    }
    partial_sort( users.begin(), users.begin() + num, users.end(),
                  compareByPopularity );
    return vector<User>( users.begin(), users.begin() + num );
}
Why don't you sort the vector (std::sort or your own implementation of quicksort) based on popularity and take the first 5 values?
Example:
bool UserCompare(User a, User b) { return a.getPopularity() > b.getPopularity(); }
...
std::sort(user.begin(), user.end(), UserCompare);
// Print first 5 users
If you just want the top 5 popular users, then use std::partial_sort().
class User
{
private:
    string name_m;
    int popularity_m;
public:
    User(const string& name, int popularity) : name_m(name), popularity_m(popularity) { }

    friend ostream& operator<<(ostream& os, const User& user)
    {
        return os << "name:" << user.name_m << "|popularity:" << user.popularity_m << "\n";
    }

    int Popularity() const
    {
        return popularity_m;
    }
};

bool Compare(const User& lhs, const User& rhs)
{
    return lhs.Popularity() > rhs.Popularity();
}

int main()
{
    vector<User> users; // assume this has been filled with at least 5 users

    // c++0x. ignore if you don't want it.
    auto compare = [](const User& lhs, const User& rhs) -> bool
                   { return lhs.Popularity() > rhs.Popularity(); };

    partial_sort(users.begin(), users.begin() + 5, users.end(), Compare);
    copy(users.begin(), users.begin() + 5, ostream_iterator<User>(std::cout, "\n"));
}
First off, cache that it->getPopularity() so you don't have to keep repeating it.
Secondly (and this is much more important): Your algorithm is flawed. When you find a new top1 you have to push the old top1 down to the #2 slot before you save the new top1, but before you do that you have to push the old top2 down to the #3 slot, etc. And that is just for a new top1. You are going to have to do something similar for a new top2, a new top3, etc. The only one you can paste in without worrying about pushing things down the list is when you get a new top5. The correct algorithm is hairy. That said, the correct algorithm is much easier to implement when your topN is an array rather than a bunch of separate values.
Thirdly (and this is even more important than the second point): You shouldn't care about performance, at least not initially. The easy way to do this is to sort the entire list and pluck off the first five off the top. If this suboptimal but simple algorithm doesn't affect your performance, done. Don't bother with the ugly but fast first N algorithm unless performance mandates that you toss the simple solution out the window.
Finally (and this is the most important point of all): That fast first N algorithm is only fast when the number of elements in the list is much, much larger than five. The default sort algorithm is pretty dang fast. It has to be wasting a lot of time sorting the dozens / hundreds of items you don't care about before a pushdown first N algorithm becomes advantageous. In other words, that pushdown insertion sort algorithm may well be a case of premature disoptimization.
Sort your objects, maybe with the standard library if that is allowed, and then simply select the first 5 elements. If your container gets too big you could probably use a std::list for the job.
Edit: @itsik you beat me to it by a sec :)
Do this pseudo code.
Declare top5 as an array of int[5] // or use a min-heap
Initialize top5 to 5 copies of -INF
For each element A
    if A > top5[4] // or A > root-of-top5 (the smallest of the current top 5)
        Remove top5[4] from top5 // or pop the min element from the heap
        Insert A into top5, keeping it sorted // or push A onto the heap
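A small sketch of that min-heap idea in C++, assuming the question's User type with a const getPopularity():

#include <queue>
#include <vector>

std::vector<User> topFive(const std::vector<User>& users)
{
    // min-heap ordered by popularity: the least popular of the kept users is on top
    auto byPopularity = [](const User& a, const User& b)
    { return a.getPopularity() > b.getPopularity(); };
    std::priority_queue<User, std::vector<User>, decltype(byPopularity)> heap(byPopularity);

    for (const User& u : users)
    {
        heap.push(u);
        if (heap.size() > 5)
            heap.pop(); // drop the least popular of the six
    }

    std::vector<User> result;
    while (!heap.empty()) { result.push_back(heap.top()); heap.pop(); }
    return result; // least popular first
}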
Well, I advise you to improve your code by using an array, list, or vector to store the top five, like this:
struct TopRecord
{
    int index;
    int pop;
} Top5[5];

for(int i = 0; i < 5; i++)
{
    Top5[i].index = -1;
    // Set pop to a value low enough
    Top5[i].pop = -1;
}

for(int i = 0; i < users.size(); i++)
{
    int currentpop = users[i].getPopularity();
    int currentindex = i;
    int j = 0;
    int temp;
    while(j < 5 && Top5[j].pop < currentpop)
    {
        temp = Top5[j].pop;
        Top5[j].pop = currentpop;
        currentpop = temp;

        temp = Top5[j].index;
        Top5[j].index = currentindex;
        currentindex = temp;
        j++;
    }
}
You may also consider using randomized select if your aim is performance, since randomized select handles order statistics in expected linear time; you would just need to run it 5 times. Or use the partial_sort solution provided above; either way works, depending on your aim.
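If modifying the container is acceptable, a brief sketch of the selection approach with the standard library (std::nth_element is typically implemented as introselect, a quickselect variant), again assuming a const getPopularity():

#include <algorithm>
#include <vector>

void topFiveInPlace(std::vector<User>& users)
{
    if (users.size() > 5)
        std::nth_element(users.begin(), users.begin() + 4, users.end(),
                         [](const User& a, const User& b)
                         { return a.getPopularity() > b.getPopularity(); });
    // users[0..4] now hold the five most popular users, in no particular order
}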