Is there any way of optimising this function? - c++

This piece of code seems to be the worst offender in terms of time in my program. What my program is trying to do is find the minimum number of individual "nodes" required to satisfy a network with two constraints:
Each node must connect to x number of other nodes.
Each node must have y degrees of separation between it and each of the nodes it's connected to.
However, for values of x greater than 600 this task takes a very long time. The task is on the order of exponential anyway, so I expect it to take forever at some point, but that also means that any small change made here would speed up the entire program by a lot.
uniint = unsigned long long int (64-bit)
network is a vector of the form vector<vector<uniint>>
The piece of code:
/* Checks if id2 is in id1's list of connections */
inline bool CheckIfInList (uniint id1, uniint id2)
{
    uniint id1size = network[id1].size();
    for (uniint itr = 0; itr < id1size; ++itr)
    {
        if (network[id1][itr] == id2)
        {
            return true;
        }
    }
    return false;
}

The only way is to sort the network[id1] array when you build it.
If you arrive here with a sorted array you can easily find, if it exists, what you are looking for using a binary (dichotomic) search.
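As a rough sketch of that idea (assuming each network[id1] is built and kept sorted; the AddConnection helper name is made up for illustration), std::binary_search does the membership test in O(log n):

#include <algorithm>
#include <vector>

using uniint = unsigned long long int;
std::vector<std::vector<uniint>> network; // adjacency lists, each kept sorted

/* Checks if id2 is in id1's (sorted) list of connections */
inline bool CheckIfInList(uniint id1, uniint id2)
{
    const std::vector<uniint>& connections = network[id1];
    return std::binary_search(connections.begin(), connections.end(), id2);
}

/* Inserts id2 into id1's list while keeping the list sorted */
inline void AddConnection(uniint id1, uniint id2)
{
    std::vector<uniint>& connections = network[id1];
    connections.insert(std::lower_bound(connections.begin(), connections.end(), id2), id2);
}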

Use std::map or std::unordered_map for fast search. I guess it's impossible to MICRO optimize this code; std::vector is cool, but not for a linear search over 600 elements.

I'm guessing CheckIfInList() is called in a loop? Perhaps a vector is not the best choice; you could try vector<set<uniint>>. This will give you O(log n) for a lookup in the inner collection instead of O(n); see the sketch below.
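A minimal sketch of that change, assuming the rest of the program is adapted to build a vector<set<uniint>> instead of vector<vector<uniint>>:

#include <set>
#include <vector>

using uniint = unsigned long long int;
std::vector<std::set<uniint>> network; // each node's connections as an ordered set

/* Checks if id2 is in id1's set of connections, O(log n) */
inline bool CheckIfInList(uniint id1, uniint id2)
{
    return network[id1].count(id2) > 0;
}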

For a quick micro-optimization, check whether your compiler optimizes the repeated calls to network[id1] away. If not, that is where you lose a lot of time, so remember the address:
vector<uniint>& connectedNodes = network[id1];
uniint id1size = connectedNodes.size();
for (uniint itr = 0; itr < id1size; ++itr)
{
    if (connectedNodes[itr] == id2)
    {
        return true;
    }
}
return false;
If your compiler already took care of that, I'm afraid there's not much you can micro-optimize about this method. The only real optimization can be achieved at the algorithmic level, starting with sorting the neighbour lists, moving on to using unordered_map<> instead of vector<>, and ending with asking yourself whether you can somehow reduce the number of calls to CheckIfInList().

This is not as effective as HAL9000's suggestion, and is good for cases when you have an unsorted list/array. What you can do is ask fewer questions in each iteration by putting the value you are looking for at the end of the vector as a sentinel:
uniint id1size = network[id1].size();
network[id1].push_back(id2);               // sentinel
uniint itr = 0;
while (network[id1][itr] != id2) { ++itr; }
network[id1].pop_back();                   // remove the sentinel again
return itr != id1size;                     // true only if id2 was found before the sentinel
This way you don't need to test on every iteration whether you have reached the end of the list.

Related

Most efficient way to search for a value and return its index in a vector?

I am trying to iterate through a vector (k) and check if it contains a value (key); whenever it does, I want to take the value found at the same index of a different vector (val) and add it to a third vector (temp).
for (int i = 0; i < k.size(); ++i)
{
    if (k.at(i) == key)
    {
        temp.push_back(val.at(i));
    }
}
I've learned a lot lately but I'm still not super advanced in C++. This code does work for my purposes, but it is extremely slow. It can handle small vectors of sizes like 10 or 100, but takes much too long for bigger sizes like 1000, 10000 or even 1000000.
My question is: is there a faster and more efficient way to do this?
I've tried this:
std::vector<int>::iterator it = k.begin();
while ((iter = std::find(it, k.end(), key)) != k.end())
{
    int index = std::distance(k.begin(), it);
    temp.push_back(val.at(index));
}
I thought maybe using a vector iterator would speed things up, but I can't get the code to work due to bad_alloc errors that I'm not sure how to fix.
Does anyone know what I can do to make this little bit of code much faster?
Here are a few things you could do:
Pre-allocate the data for temp, so that push_back doesn't cause repeated allocations:
temp.reserve(k.size());
If k is sorted, you can use that fact to speed things up a bit:
auto lowerIt = std::lower_bound(k.begin(), k.end(), key);
auto upperIt = std::upper_bound(k.begin(), k.end(), key);
for (auto it = lowerIt; it != upperIt; ++it)
temp.push_back(val[it - k.begin()]);
at does bounds checking, so it is a tad slower than []. You obviously have to guarantee that you are never accessing an out-of-bounds index.
Besides Rakete's suggestions:
If your keys vector is sorted - use std::binary_search instead of std::find and then just iterate until the next value/end of vector.
If you're free to change your data structures, keep your data in std::unordered_multimap and use equal_range to access elements with your desired keys.
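A minimal sketch of that unordered_multimap idea; the key/value pairs here stand in for the parallel k/val vectors, and the example values are made up:

#include <unordered_map>
#include <vector>

int main()
{
    // key -> value pairs, replacing the parallel vectors k and val
    std::unordered_multimap<int, int> kv = {{1, 10}, {2, 20}, {1, 30}};

    int key = 1;
    std::vector<int> temp;

    // equal_range hands back iterators bounding every entry with this key
    auto range = kv.equal_range(key);
    for (auto it = range.first; it != range.second; ++it)
        temp.push_back(it->second); // average O(1) per hit instead of scanning the whole vector
}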

QHashIterator in c++

I developed a game in C++ and want to make sure everything is properly done.
Is it a good solution to use a QHashIterator to check which item in the list has the lowest value (F-cost for pathfinding)?
Snippet from my code:
while (!pathFound) { // loop until a path is found
    QHashIterator<int, PathFinding*> iterator(openList);
    PathFinding* parent;
    iterator.next();
    parent = iterator.value();
    while (iterator.hasNext()) { // we take the next tile, keeping the one with the lowest value
        iterator.next();
        // checking lowest f value
        if ((iterator.value()->getGcost() + iterator.value()->getHcost()) < (parent->getGcost() + parent->getHcost())) {
            parent = iterator.value();
        }
    }
    if (!atDestionation(parent, endPoint)) { // here we check whether we are at the destination; if we are, we return our path cost
        clearLists(parent);
        filllists(parent, endPoint);
    } else {
        pathFound = true;
        while (parent->hasParent()) {
            mylist.append(parent);
            parent = parent->getParent();
        }
        pathcost = calculatePathCost(mylist); // we calculate what the path cost is and return it
    }
}
If not, are there better improvements?
I also found something about std::priority_queue. Is this much better than a QHashIterator?
It may not be a problem with game worlds that are not big, but I'm looking for a suitable solution for when the game worlds are big (like 10000+ calculations). Any remarks?
Here you basically scan the whole map to find the element that is the minimum one according to some values:
while (iterator.hasNext()) { // we take the next tile, and we take the one with the lowest value
    iterator.next();
    // checking lowest f value
    if ((iterator.value()->getGcost() + iterator.value()->getHcost()) < (parent->getGcost() + parent->getHcost())) {
        parent = iterator.value();
    }
}
All this code, if you had an STL container, for instance a std::map, could be reduced to something like:
auto lowest = std::min_element(openList.begin(), openList.end(),
    [](const auto& lhs, const auto& rhs) {
        return lhs.second->getGcost() + lhs.second->getHcost()
             < rhs.second->getGcost() + rhs.second->getHcost();
    });
parent = lowest->second;
Once you have something easier to understand you can play around with different containers; for instance, it might be faster to hold a sorted vector in this case.
Your code does not present any obvious problems per se; often performance gains are not conquered by optimizing little loops, it's more about how your code is organized. For instance, I see that you have a lot of indirections; those cost a lot in cache misses. Or, since you always have to find the minimum element, you could cache it in another structure and have it available in constant time, all the time.
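Along those lines, a minimal sketch of the std::priority_queue idea the question mentions; the PathFinding type here is a stand-in with just the two cost getters from the question, everything else is illustrative:

#include <queue>
#include <vector>

// Minimal stand-in for the question's PathFinding class (only the costs matter here)
struct PathFinding {
    int gCost;
    int hCost;
    int getGcost() const { return gCost; }
    int getHcost() const { return hCost; }
};

// Order nodes so the one with the lowest F = G + H is always on top.
// std::priority_queue is a max-heap, so the comparison is inverted.
struct CompareF {
    bool operator()(const PathFinding* lhs, const PathFinding* rhs) const {
        return lhs->getGcost() + lhs->getHcost() > rhs->getGcost() + rhs->getHcost();
    }
};

int main() {
    std::priority_queue<PathFinding*, std::vector<PathFinding*>, CompareF> openList;

    PathFinding a{3, 4}, b{1, 2};
    openList.push(&a);
    openList.push(&b);

    PathFinding* parent = openList.top(); // cheapest node, no full scan needed
    openList.pop();
}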

A* pathfinding slow

I am currently working on an A* search algorithm. The algorithm would just be solving text file mazes. I know that the A* algorithm is supposed to be very quick at finding the finish, yet mine takes 6 seconds to find the path in a 20x20 maze with no walls. It does find the finish with the correct path, it just takes forever to do so.
If I knew which part of the code was the problem I would just post that, but I really have no idea what is going wrong. So here is the algorithm that I use...
while (!openList.empty()) {
    visitedList.push_back(openList[index]);
    openList.erase(openList.begin() + index);
    if (currentCell->x_coor == goalCell->x_coor && currentCell->y_coor == goalCell->y_coor)
    {
        FindBestPath(currentCell);
        break;
    }
    if (map[currentCell->x_coor+1][currentCell->y_coor] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor+1, currentCell->y_coor, currentCell));
    }
    if (map[currentCell->x_coor-1][currentCell->y_coor] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor-1, currentCell->y_coor, currentCell));
    }
    if (map[currentCell->x_coor][currentCell->y_coor+1] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor, currentCell->y_coor+1, currentCell));
    }
    if (map[currentCell->x_coor][currentCell->y_coor-1] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor, currentCell->y_coor-1, currentCell));
    }
    for (int i = 0; i < openList.size(); i++) {
        openList[i]->G = openList[i]->parent->G + 1;
        openList[i]->H = openList[i]->ManHattenDistance(goalCell);
    }
    float bestF = 999999;
    index = -1;
    for (int i = 0; i < openList.size(); i++) {
        if (openList[i]->GetF() < bestF) {
            for (int n = 0; n < visitedList.size(); n++) {
                if (CheckVisited(openList[i])) {
                    bestF = openList[i]->GetF();
                    index = i;
                }
            }
        }
    }
    if (index >= 0) {
        currentCell = openList[index];
    }
}
I know this code is messy and not the most efficient way to do things, but I think it should still be faster than it is. Any help would be greatly appreciated.
Thanks.
Your 20x20 maze has no walls, and therefore many, many routes which are all the same length. I'd estimate trillions of equivalent routes, in fact. It doesn't seem so bad when you take that into account.
Of course, since your heuristic looks perfect, you should get a big benefit from excluding routes that are heuristically predicted to be precisely as long as the best route known so far. (This is safe if your heuristic is correct, i.e. never overestimates the remaining distance).
Here is a big hint.
If ever you find two paths to the same cell, you can always throw away the longer one. If there is a tie, you can throw away the second one to get there.
If you implement that, with no other optimizations, the search would become more than acceptably fast.
Secondly, the A* algorithm should only bother backtracking if the length to the current cell plus its heuristic exceeds the length plus heuristic of some other open node. If you implement that, then it should directly find a path and stop. To facilitate that you need to store paths in a priority queue (typically implemented with a heap), not a vector.
openList.erase is O(n), and the for-loop beginning with for(int i=0;i<openList.size();i++) is O(n^2) due to the call to CheckVisited - these are called every iteration, making your overall algorithm O(n^3). A* should be O(n log n).
Try changing openList to a priority-queue like it's supposed to be, and visitedList to a hash table. The entire for loop can then be replaced by a dequeue - make sure you check if visitedList.Contains(node) before enqueuing!
Also, there is no need to recalculate the ManHattenDistance for every node every iteration, since it never changes.
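A minimal sketch of that structure; the SearchCell fields mirror the question, while the coordinate-packing key and everything else here is illustrative:

#include <cstdint>
#include <queue>
#include <unordered_set>
#include <vector>

struct SearchCell {
    int x_coor, y_coor;
    int G, H;
    int GetF() const { return G + H; }
};

// Order the open list so the cell with the smallest F is always on top
struct CompareF {
    bool operator()(const SearchCell* a, const SearchCell* b) const {
        return a->GetF() > b->GetF();
    }
};

// Pack (x, y) into one key so visited cells can live in a hash set
inline std::uint64_t cellKey(int x, int y) {
    return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(x)) << 32)
         | static_cast<std::uint32_t>(y);
}

int main() {
    std::priority_queue<SearchCell*, std::vector<SearchCell*>, CompareF> openList;
    std::unordered_set<std::uint64_t> visited;

    SearchCell start{0, 0, 0, 0};
    openList.push(&start);

    while (!openList.empty()) {
        SearchCell* current = openList.top(); // cheapest cell, O(log n) to maintain
        openList.pop();
        if (!visited.insert(cellKey(current->x_coor, current->y_coor)).second)
            continue; // already expanded this cell
        // ... expand walkable neighbours and push them here ...
    }
}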
Aren't you constantly backtracking?
The A* algorithm backtracks when the current best solution becomes worse than another previously visited route. In your case, since there are no walls, all routes are good and never die (and as MSalters correctly pointed out, there are a great many of them). When you take a step, your route becomes worse than all the others that are one step shorter.
If that is true, this may account for the time taken by your algorithm.

no duplicate function for a lottery program

Right now I'm trying to make a function that checks whether the user's selection is already in the array, and if it is, it'll tell them to choose a different number. How can I do this?
Do you mean something like this?
bool CheckNumberIsValid()
{
    for (int i = 0; i < array_length; ++i)
    {
        if (array[i] == user_selection)
            return false;
    }
    return true;
}
That should give you a clue, at least.
What's wrong with std::find? If you get the end iterator back, the value isn't in the array; otherwise, it is. Or if this is homework, and you're not allowed to use the standard library, a simple while loop should do the trick: this is a standard linear search, algorithms for which can be found anywhere. (On the other hand, some of the articles which pop up when searching with Google are pretty bad. You really should use the standard implementation:
template <typename Iterator, typename ValueType>
Iterator
find(Iterator begin, Iterator end, ValueType target)
{
    while (begin != end && *begin != target)
        ++begin;
    return begin;
}
Simple, effective, and proven to work.)
[added post factum] Oh, homework tag. Ah well, it won't really benefit you that much then; still, I'll leave my answer since it can be of some use to others browsing through SO.
If you need to have lots of unique random numbers in a range - say 45000 random numbers from 0..45100 - then you should see how this is going to get problematic using the approach of:
while (size_of_range > v.size()) {
    int n = // get random
    if ( /* n is not already in v */ ) {
        v.push_back(n);
    }
}
If the size of the pool and the range you want to get are close, and the pool size is not a very small integer, it'll get harder and harder to get a random number that wasn't already put in the vector/array.
In that case, you'll be much better off using std::vector (in <vector>) and std::random_shuffle (in <algorithm>; note that in C++17 and later it is removed in favour of std::shuffle with a seeded std::mt19937):
unsigned short start = 10; // the minimum value of a pool
unsigned short step = 1; // for 10,11,12,13,14... values in the vector
// initialize the pool of 45100 numbers
std::vector<unsigned long> pool(45100);
for (unsigned long i = 0, j = start; i < pool.size(); ++i, j += step) {
pool[i] = j;
}
// get 45000 numbers from the pool without repetitions
std::random_shuffle(pool.begin(), pool.end());
return std::vector<unsigned long>(pool.begin(), pool.begin() + 45000);
You can obviously use any type, but you'll need to initialize the vector accordingly, so it'd contain all possible values you want.
Note that the memory overhead probably won't really matter if you need almost all of the numbers in the pool, and you'll get good performance. Using rand() and checking will take a lot of time, and if your RAND_MAX equals 32767 then it'd be an infinite loop.
The memory overhead is, however, noticeable if you only need a few of those values. The first approach would usually be faster then.
If it really needs to be an array, you need to iterate over it or use the find function from the <algorithm> header. Otherwise, I would suggest putting the numbers in a set, as lookup in sets is fast and convenient with the set::find function.
ref: STL set
These are some of the steps (in pseudo-code since this is a homework question) on how you may get around to doing this:
Get user to enter a new number.
If the number entered is the first, push it to the vector anyway.
Sort the contents of the vector in case size is > 1.
Ask user to enter the number.
Perform a binary search on the contents to see if the number was entered.
If number is unique, push it into vector. If not unique, ask again.
Go to step 3.
HTH,
Sriram.

How to keep only the last duplicate when iterating through rows

The following code iterates through many data rows, calculates some score per row and then sorts the rows according to that score:
unsigned count = 0;
score_pair* scores = new score_pair[num_rows];
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    scores[count].score = score;
    scores[count].doc_id = row->docid;
    count++;
}
assert(count <= num_rows);
qsort(scores, count, sizeof(score_pair), score_cmp);
Unfortunately, there are many duplicate rows with the same docid but a different score. Now I'd like to keep only the last score for any docid. The docids are unsigned int, but usually big (=> no lookup array) - using a HashMap to look up the last count for a docid would probably be too slow (many millions of rows; it should only take seconds, not minutes...).
Ok, I modified my code to use a std::map:
map<int, int> docid_lookup;
unsigned count = 0;
score_pair* scores = new score_pair[num_rows];
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    map<int, int>::iterator iter = docid_lookup.find(row->docid);
    if (iter != docid_lookup.end()) {
        scores[iter->second].score = score;
        scores[iter->second].doc_id = row->docid;
    } else {
        scores[count].score = score;
        scores[count].doc_id = row->docid;
        docid_lookup[row->docid] = count;
        count++;
    }
}
It works, and the performance hit is not as bad as I expected - now it runs in a minute instead of 16 seconds, so it's about a factor of 3. Memory usage has also gone up from about 1 GB to 4 GB.
The first thing I'd try would be a map or unordered_map: I'd be surprised if performance is a factor of 60 slower than what you did without any unique-ification. If the performance there isn't acceptable, another option is something like this:
// get the computed data into a vector
std::vector<score_pair>::size_type count = 0;
std::vector<score_pair> scores;
scores.reserve(num_rows);
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    scores.push_back(score_pair(score, row->docid));
}
assert(scores.size() <= num_rows);
// remove duplicate doc_ids
std::reverse(scores.begin(), scores.end());
std::stable_sort(scores.begin(), scores.end(), docid_cmp);
scores.erase(
    std::unique(scores.begin(), scores.end(), docid_eq),
    scores.end()
);
// order by score
std::sort(scores.begin(), scores.end(), score_cmp);
Note that the use of reverse and stable_sort is because you want the last score for each doc_id, but std::unique keeps the first. If you wanted the first score you could just use stable_sort, and if you didn't care what score, you could just use sort.
The best way of handling this is probably to pass reverse iterators into std::unique, rather than a separate reverse operation. But I'm not confident I can write that correctly without testing, and errors might be really confusing, so you get the unoptimised code...
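For the curious, here is one way that reverse-iterator variant might look. This is only a sketch against a minimal score_pair, with inline lambdas standing in for docid_cmp, docid_eq and score_cmp, not the poster's real types:

#include <algorithm>
#include <vector>

struct score_pair {
    float score;
    unsigned doc_id;
};

int main() {
    // last entry per doc_id should win: doc 1 -> 0.9f, doc 2 -> 0.4f
    std::vector<score_pair> scores = {{0.5f, 1}, {0.4f, 2}, {0.9f, 1}};

    // group by doc_id, keeping the original (read) order inside each group
    std::stable_sort(scores.begin(), scores.end(),
                     [](const score_pair& a, const score_pair& b) { return a.doc_id < b.doc_id; });

    // walking the range backwards, the "first" element unique keeps per group is the last row read
    auto keptREnd = std::unique(scores.rbegin(), scores.rend(),
                                [](const score_pair& a, const score_pair& b) { return a.doc_id == b.doc_id; });

    // the surviving elements sit at the back of the vector; drop the leftover slots at the front
    scores.erase(scores.begin(), keptREnd.base());

    // finally order by score, as before
    std::sort(scores.begin(), scores.end(),
              [](const score_pair& a, const score_pair& b) { return a.score < b.score; });
}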
Edit: just for comparison with your code, here's how I'd use the map:
std::map<int, float> scoremap;
while ((row = data.next_row())) {
    scoremap[row->docid] = calc_score(data.next_feature());
}
std::vector<score_pair> scores(scoremap.begin(), scoremap.end());
std::sort(scores.begin(), scores.end(), score_cmp);
Note that score_pair would need a constructor taking a std::pair<int,float>, which makes it non-POD. If that's not acceptable, use std::transform, with a function to do the conversion.
Finally, if there is much duplication (say, on average 2 or more entries per doc_id), and if calc_score is non-trivial, then I would be looking to see whether it's possible to iterate the rows of data in reverse order. If it is, then it will speed up the map/unordered_map approach, because when you get a hit for the doc_id you don't need to calculate the score for that row, just drop it and move on.
I'd go for a std::map of docids. If you could create an appropriate hashing function, a hash map would be preferable. But I guess it's too difficult. And no - the std::map is not too slow. Access is O(log n), which is nearly as good as O(1). O(1) is array access time (and hash map access, by the way).
By the way, if std::map is too slow, qsort's O(n log n) is too slow as well. And, using a std::map and iterating over its contents, you can perhaps save your qsort.
Some additions responding to the comment (by onebyone):
I did not go into implementation details, since there wasn't enough information on that.
qsort may behave badly with sorted data (depending on the implementation); std::map may not. This is a real advantage, especially if you read the values from a database that might output them ordered by key.
There was no word on the memory allocation strategy. Changing to a memory allocator with fast allocation of small objects may improve performance.
Still, the fastest would be a hash map with an appropriate hash function. Since there's not enough information about the distribution of the keys, presenting one in this answer is not possible.
In short - if you ask general questions, you get general answers. This means, at least for me, looking at the time complexity in O-notation. Still, you were right: depending on different factors, std::map may be too slow while qsort is still fast enough - it may also be the other way round in the worst case of qsort, where it has n^2 complexity.
Unless I've misunderstood the question, the solution can be simplified considerably. At least as I understand it, you have a few million docid's (which are of type unsigned int) and for each unique docid, you want to store one 'score' (which is a float). If the same docid occurs more than once in the input, you want to keep the score from the last one. If that's correct, the code can be reduced to this:
std::map<unsigned, float> scores;
while ((row = data.next_row()))
scores[row->docid] = calc_score(data.next_feature());
This will probably be somewhat slower than your original version since it allocates a lot of individual blocks rather than one big block of memory. Given your statement that there are a lot of duplicates in the docid's, I'd expect this to save quite a bit of memory, since it only stores data for each unique docid rather than for every row in the original data.
If you wanted to optimize this, you could almost certainly do so -- since it uses a lot of small blocks, a custom allocator designed for that purpose would probably help quite a bit. One possibility would be to take a look at the small-block allocator in Andrei Alexandrescu's Loki library. He's done more work on the problem since, but the one in Loki is probably sufficient for the task at hand -- it'll almost certainly save a fair amount of memory and run faster as well.
If your C++ implementation has it, and most do, try hash_map instead of std::map (it's sometimes available under std::hash_map).
If the lookups themselves are your computational bottleneck, this could be a significant speedup over std::map's binary tree.
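In modern C++ the standard equivalent is std::unordered_map; a minimal sketch of the same last-score-wins idea with it (the docid/score types and example values are made up):

#include <unordered_map>

int main() {
    // docid -> score; operator[] overwrites, so the last score seen for a docid wins
    std::unordered_map<unsigned, float> scores;

    scores[42] = 0.5f;
    scores[42] = 0.9f; // replaces the earlier value for docid 42

    // scores[42] is now 0.9f; average O(1) lookup instead of std::map's O(log n)
}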
Why not sort by doc id first, calculate scores, then for any subset of duplicates use the max score?
On re-reading the question, I'd suggest a modification to how scores are read in. Keep in mind C++ isn't my native language, so this won't quite be compilable.
unsigned count = 0;
pair<int, score_pair>* scores = new pair<int, score_pair>[num_rows];
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    scores[count].second.score = score;
    scores[count].second.doc_id = row->docid;
    scores[count].first = count;
    count++;
}
assert(count <= num_rows);
qsort(scores, count, sizeof(pair<int, score_pair>), pair_docid_cmp);
// getting the number of unique doc ids
unsigned scoreCount = (count > 0) ? 1 : 0;
for (unsigned i = 1; i < count; i++)
    if (scores[i-1].second.doc_id != scores[i].second.doc_id) scoreCount++;
score_pair* actualScores = new score_pair[scoreCount];
int at = -1;
int lastId = -1;
for (unsigned i = 0; i < count; i++)
{
    // first entry of a new doc id; pair_docid_cmp put the last-read row first within each group
    if (lastId != (int)scores[i].second.doc_id)
    {
        actualScores[++at] = scores[i].second;
        lastId = scores[i].second.doc_id;
    }
}
qsort(actualScores, scoreCount, sizeof(score_pair), score_cmp);
Here pair_docid_cmp would compare first on docid, grouping rows for the same doc together, and then by reverse read order (using the stored index), so that the last item read comes first within the sublist of items sharing a docid. This should cost only about 2.5x the memory, at roughly double the execution speed.