QHashIterator in C++

I developed a game in C++, and want to make sure everything is properly done.
Is it a good solution to use a QHashIterator to check which item in the list has the lowest value (the F-cost for pathfinding)?
Snippet from my code:
while(!pathFound){ //loop until a path is found
    QHashIterator<int, PathFinding*> iterator(openList);
    PathFinding* parent;
    iterator.next();
    parent = iterator.value();
    while(iterator.hasNext()){ //we take the next tile, and pick the one with the lowest value
        iterator.next();
        //checking lowest f value
        if((iterator.value()->getGcost() + iterator.value()->getHcost()) < (parent->getGcost() + parent->getHcost())){
            parent = iterator.value();
        }
    }
    if(!atDestionation(parent,endPoint)){ //here we check whether we are at the destination; if we are, we return our path cost
        clearLists(parent);
        filllists(parent,endPoint);
    }else{
        pathFound = true;
        while(parent->hasParent()){
            mylist.append(parent);
            parent = parent->getParent();
        }
        pathcost = calculatePathCost(mylist); //we calculate what the path cost is and return it
    }
}
If not, are there better ways to improve it?
I also found something about std::priority_queue. Is this much better than a QHashIterator?
It's maybe not a problem for game worlds that are not big, but I'm looking for a suitable solution for when the game worlds are large (say 10000+ calculations). Any remarks?

Here you basically scan the whole map to find the element that is the minimum one according to some values:
while(iterator.hasNext()){ //we take the next tile, and pick the one with the lowest value
    iterator.next();
    //checking lowest f value
    if((iterator.value()->getGcost() + iterator.value()->getHcost()) < (parent->getGcost() + parent->getHcost())){
        parent = iterator.value();
    }
}
All this code, if you had an STL container, for instance a std::map<int, PathFinding*>, could be reduced to:
auto parent = std::min_element(openList.begin(), openList.end(),
    [](const auto& lhs, const auto& rhs) {
        return lhs.second->getGcost() + lhs.second->getHcost()
             < rhs.second->getGcost() + rhs.second->getHcost();
    })->second;
Once you have something easier to understand, you can play around with different containers; for instance, it might be faster to hold a sorted vector in this case.
Your code does not present any obvious problems per se; performance gains are often won not by optimizing little loops but by how the code is organized. For instance, I see that you have a lot of indirections, and those cost a lot in cache misses. And if you always have to find the minimum element, you could cache it in another structure and have it available in constant time, all the time.
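For reference, here is a minimal sketch of the std::priority_queue idea mentioned in the question. The Node type and its g/h fields are placeholders standing in for your PathFinding*, getGcost() and getHcost():
#include <queue>
#include <vector>

struct Node { int g = 0, h = 0; /* parent pointer, coordinates, ... */ };

// Orders the heap so that the node with the smallest F = G + H is on top.
struct CompareF {
    bool operator()(const Node* a, const Node* b) const {
        return a->g + a->h > b->g + b->h;
    }
};

std::priority_queue<Node*, std::vector<Node*>, CompareF> openList;

// openList.push(node);          // insert: O(log n)
// Node* best = openList.top();  // lowest-F node: O(1)
// openList.pop();               // remove it: O(log n)
This replaces the full scan per iteration with logarithmic insertions and constant-time access to the best node, which starts to matter once the open list holds thousands of entries.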


Why does sorting call the comparison function less often than a linear minimum search algorithm?

I'll start by giving some context. I'm learning to write a raytracer, a very simple one. I don't have any acceleration structures yet, so the code in question is intended to find the closest object that the ray hits. Since I'm still learning, I'd greatly appreciate it if the answers concentrated on the seemingly strange problem that I'm observing - I know the RT logic is very wrong as it is right now. It produces correct results, anyway.
1. The first approach: for every hit, add a hit-result structure object to the list, then apply std::sort with a predicate that compares the distance from the hit point to the ray origin. This should be O(N log N) according to the textbook, and I think it is suboptimal, since I only need the first result, not the whole sorted list.
2. The second approach: whenever there is a hit, take the distance and compare it to the minimum, which is first initialized to std::numeric_limits<float>::max(). Your standard "find the minimum in an array" algorithm. This should be O(N) and thus faster.
These pieces of code reside in a recursive function. Tested on the very same scene of 10 spheres, approach 1 is faster by an order of magnitude, and the number of calls to the distance function is a few times smaller than in approach 2. What am I missing?
I'm not sure if the context is required; in case there are "branches to be cut" off this question, tell me.
Code piece 1:
result rt_function(...) {
    static int count{};
    std::vector<result> hitList;
    for(const auto& obj : objList) {
        const result res = obj->testOuter(ray);
        if ( res.hit ) {
            hitList.push_back(res);
        }
    }
    if (!hitList.empty()) {
        sort(hitList.begin(), hitList.end(), [=](result& hit1, result& hit2) -> bool {
            std::cerr << ++count << '\n';
            return cv::norm(hit1.point - ray.origin) <
                   cv::norm(hit2.point - ray.origin);
        });
        const result res = hitList.front();
        const SceneObject* near = res.obj;
        // the raytracing continues...
count == 180771
Code piece 2:
result rt_function(...) {
    static int count{};
    float min_distance = std::numeric_limits<float>::max(), distance{};
    result closest_res{}; bool have_hit{};
    for(const auto& obj : objList) {
        const result res = obj->testOuter(ray);
        if ( res.hit ) {
            have_hit = true;
            std::cerr << ++count << '\n';
            distance = cv::norm(res.point - ray.origin);
            if (distance < min_distance) {
                min_distance = distance; closest_res = res;
            }
        }
    }
    if (have_hit) {
        const result res = closest_res;
        const SceneObject* near = res.obj;
        // the raytracing continues...
count == 349633
I want to (a) understand why there are fewer comparisons and (b) find where the bottleneck is, since the run time is significantly higher, as I've noted above.
Statements like O(N²) describe how the time scales: double the number of points and the time taken quadruples. An O(log N) algorithm can still be slow for small N; the point is that if N doubles, or is increased by a factor of 10, the running time barely grows.
Compare finding a specific word in a 1000-page dictionary with finding one in a 20-word sentence. Sorting a 20-word sentence before looking for a specific word takes longer than reading it straight through once.
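As an aside, since only the closest hit is needed, the single pass can also be written with std::min_element instead of a hand-rolled loop. This is only a sketch; it assumes the asker's result struct and cv::norm, and Ray is a stand-in for whatever type ray has:
#include <algorithm>
#include <vector>

// Returns a pointer to the closest hit, or nullptr if there were no hits.
const result* closest_hit(const std::vector<result>& hitList, const Ray& ray) {
    if (hitList.empty()) return nullptr;
    auto it = std::min_element(hitList.begin(), hitList.end(),
        [&](const result& a, const result& b) {
            return cv::norm(a.point - ray.origin) < cv::norm(b.point - ray.origin);
        });
    return &*it;
}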

Is there any way of optimising this function?

This piece of code seems to be the worst offender in terms of time in my program. What my program is trying to do is find the minimum number of individual "nodes" required to satisfy a network with two constraints:
Each node must connect to x number of other nodes.
Each node must have y degrees of separation between it and each of the nodes it's connected to.
However, for values of x greater than 600 this task takes a very long time. The task is on the order of exponential anyway, so I expect it to take forever at some point, but it also means that any small change made here would speed up the entire program by a lot.
uniint = unsigned long long int (64-bit)
network is a vector of the form vector<vector<uniint>>
The piece of code:
/* Checks if id2 is in id1's list of connections */
inline bool CheckIfInList (uniint id1, uniint id2)
{
    uniint id1size = network[id1].size();
    for (uniint itr = 0; itr < id1size; ++itr)
    {
        if (network[id1][itr] == id2)
        {
            return true;
        }
    }
    return false;
}
The only way is to sort the network[id1] array when you build it.
If you arrive here with a sorted array, you can easily find what you are looking for, if it exists, using a binary (dichotomic) search.
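A sketch of what that could look like, assuming network[id1] is kept sorted while it is built (the types are the ones from the question):
#include <algorithm>
#include <vector>

/* Checks if id2 is in id1's (sorted) list of connections in O(log n) */
inline bool CheckIfInList (uniint id1, uniint id2)
{
    const std::vector<uniint>& connections = network[id1];
    return std::binary_search(connections.begin(), connections.end(), id2);
}

// when building the network, insert each id at its sorted position, e.g.:
// network[id].insert(std::lower_bound(network[id].begin(), network[id].end(), newId), newId);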
Use std::map or std::unordered_map for fast search. I guess it's impossible to MICRO optimize this code; std::vector is cool, but not for linearly searching 600 elements.
I'm guessing CheckIfInList() is called in a loop? Perhaps a vector is not the best choice; you could try vector<set<uniint>>. This will give you O(log n) for a lookup in the inner collection instead of O(n).
For quick micro-optimization, check whether your compiler optimizes the repeated calls to network[id1] away. If not, that is where you lose a lot of time, so remember the address:
vector<uniint>& connectedNodes = network[id1];
uniint id1size = connectedNodes.size();
for (uniint itr = 0; itr < id1size; ++itr)
{
    if (connectedNodes[itr] == id2)
    {
        return true;
    }
}
return false;
If your compiler already took care of that, I'm afraid there's not much you can micro-optimize about this method. The only real optimization can be achieved on the algorithmic level, starting with sorting the neighbour lists, moving on to using unordered_map<> instead of vector<>, and ending with asking yourself whether you can't somehow reduce the number of calls to CheckIfInList().
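As a rough illustration of the hash-based alternative, the adjacency data could be stored as sets of neighbour ids (network_sets is a made-up name for this sketch, not something in the original code):
#include <unordered_set>
#include <vector>

std::vector<std::unordered_set<uniint>> network_sets;   // built instead of (or alongside) `network`

inline bool CheckIfInList (uniint id1, uniint id2)
{
    // average O(1) membership test instead of an O(n) scan
    return network_sets[id1].count(id2) != 0;
}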
This is not as effective as HAL9000's suggestion, but it is useful when you have an unsorted list/array. You can save one comparison per iteration if you put the value you are looking for at the end of the vector as a sentinel:
uniint id1size = network[id1].size();
network[id1].push_back(id2);             // sentinel: the scan is now guaranteed to stop
uniint itr = 0;
while (network[id1][itr] != id2) ++itr;
network[id1].pop_back();                 // remove the sentinel again
return itr != id1size;                   // a hit before the sentinel means id2 really is in the list
The final check just tells you whether the match happened before the sentinel position. This way you don't need to test for the end of the list on every iteration.

Logic Help: comparing values and taking the smallest distance, while removing it from the list of "available to compare"

Okay, I have been set with the task of comparing this list of Photons using one method (IU) and comparing it with another (TSP). I need to take the first IU photon and compare distances with all of the TSP photons, find the smallest distance, and "pair" them (i.e. set them both in arrays with the same index). Then, I need to take the next photon in the IU list, and compare it to all of the TSP photons, minus the one that was chosen already.
I know I need to use a Boolean array of sorts, while keeping a counter, but I can't seem to work out the logic entirely.
The code below is NOT standard C++ syntax, as it is written to interact with ROOT (CERN data analysis software).
If you have any questions with the syntax to better understand the code, please ask. I'll happily answer.
I have the arrays and variables declared already. The types that you see are called EEmcParticleCandidate and that's a type that reads from a tree of information, and I have a whole set of classes and headers that tell that how to behave.
Thanks.
Bool_t used[2];
if (num[0]==2 && num[1]==2) {
    TIter photonIterIU(mPhotonArray[0]);
    while(IU_photon=(EEmcParticleCandidate_t*)photonIterIU.Next()){
        if (IU_photon->E > thresh2) {
            distMin=1000.0;
            index = 0;
            IU_PhotonArray[index] = IU_photon;
            TIter photonIterTSP(mPhotonArray[1]);
            while(TSP_photon=(EEmcParticleCandidate_t*)photonIterTSP.Next()) {
                if (TSP_photon->E > thresh2) {
                    Float_t Xpos_IU = IU_photon->position.fX;
                    Float_t Ypos_IU = IU_photon->position.fY;
                    Float_t Xpos_TSP = TSP_photon->position.fX;
                    Float_t Ypos_TSP = TSP_photon->position.fY;
                    distance_1 = find distance //formula didn't fit here //
                    if (distance_1 < distMin){
                        distMin = distance_1;
                        for (Int_t i=0;i<2;i++){
                            used[i] = false;
                        } //for
                        used[index] = true;
                        TSP_PhotonArray[index] = TSP_photon;
                        index++;
                    } //if
                } //if thresh
            } // while TSP
        } //if thresh
    } // while IU
That's all I have at the moment... work in progress; I realize all of the braces aren't closed. This is just a simple logic question.
This may take a few iterations.
As a particle physicist, you should understand the importance of breaking things down into their component parts. Let's start with iterating over all TSP photons. It looks as if the relevant code is here:
TIter photonIterTSP(mPhotonArray[1]);
while(TSP_photon=(EEmcParticleCandidate_t*)photonIterTSP.Next()) {
    ...
    if(a certain condition is met)
        TSP_PhotonArray[index] = TSP_photon;
}
So TSP_photon is a pointer; you copy it into the array TSP_PhotonArray (if the energy of the photon exceeds a fixed threshold), and you go to a lot of trouble keeping track of which pointers have already been copied. There is a better way, but for now let's just consider the problem of finding the best match:
distMin=1000.0;
while(TSP_photon= ... ) {
    distance_1 = compute_distance_somehow();
    if (distance_1 < distMin) {
        distMin = distance_1;
        TSP_PhotonArray[index] = TSP_photon; // <-- BAD
        index++;                             // <-- VERY BAD
    }
}
This is wrong. Suppose you find a TSP_photon with the smallest distance yet seen. You haven't yet checked all TSP photons, so this might not be the best, but you store the pointer anyway, and increment the index. Then if you find another match that's even better, you'll store that one too. Conceptually, it should be something like this:
distMin=1000.0;
best_photon_yet = NULL;
while(TSP_photon= ... ) {
    distance_1 = compute_distance_somehow();
    if (distance_1 < distMin) {
        distMin = distance_1;
        best_photon_yet = TSP_photon;
    }
}
// We've now finished searching the whole list of TSP photons.
TSP_PhotonArray[index] = best_photon_yet;
index++;
Post a comment to this answer, telling me if this makes sense; if so, we can proceed, if not, I'll try to clarify.
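To give an idea of where this is heading, here is a hedged sketch of the whole pairing step in plain C++ rather than ROOT. Photon, its x/y members and pairedIndex are placeholders for the sketch, not the EEmcParticleCandidate interface, and the 1000.0 starting distance mirrors the question's distMin:
#include <cmath>
#include <vector>

struct Photon { float x, y; };

// For each IU photon, find the closest TSP photon that has not been paired yet,
// then mark that TSP photon as used so it cannot be chosen again.
void pairPhotons(const std::vector<Photon>& iu, const std::vector<Photon>& tsp,
                 std::vector<int>& pairedIndex)   // iu[i] pairs with tsp[pairedIndex[i]]
{
    std::vector<bool> used(tsp.size(), false);
    pairedIndex.assign(iu.size(), -1);

    for (std::size_t i = 0; i < iu.size(); ++i) {
        float distMin = 1000.0f;
        int best = -1;
        for (std::size_t j = 0; j < tsp.size(); ++j) {
            if (used[j]) continue;                // skip photons already paired
            float dx = iu[i].x - tsp[j].x;
            float dy = iu[i].y - tsp[j].y;
            float d = std::sqrt(dx * dx + dy * dy);
            if (d < distMin) { distMin = d; best = static_cast<int>(j); }
        }
        if (best >= 0) { pairedIndex[i] = best; used[best] = true; }
    }
}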

A* pathfinding slow

I am currently working on an A* search algorithm. The algorithm would just be solving text-file mazes. I know that the A* algorithm is supposed to be very quick at finding the finish, yet mine seems to take 6 seconds to find the path in a 20x20 maze with no walls. It does find the finish with the correct path; it just takes forever to do so.
If I knew which part of the code was the problem I would just post that, but I really have no idea what is going wrong. So here is the algorithm that I use...
while(!openList.empty()) {
    visitedList.push_back(openList[index]);
    openList.erase(openList.begin() + index);

    if(currentCell->x_coor == goalCell->x_coor && currentCell->y_coor == goalCell->y_coor)
    {
        FindBestPath(currentCell);
        break;
    }

    if(map[currentCell->x_coor+1][currentCell->y_coor] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor+1,currentCell->y_coor,currentCell));
    }
    if(map[currentCell->x_coor-1][currentCell->y_coor] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor-1,currentCell->y_coor,currentCell));
    }
    if(map[currentCell->x_coor][currentCell->y_coor+1] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor,currentCell->y_coor+1,currentCell));
    }
    if(map[currentCell->x_coor][currentCell->y_coor-1] != wall)
    {
        openList.push_back(new SearchCell(currentCell->x_coor,currentCell->y_coor-1,currentCell));
    }

    for(int i=0;i<openList.size();i++) {
        openList[i]->G = openList[i]->parent->G + 1;
        openList[i]->H = openList[i]->ManHattenDistance(goalCell);
    }

    float bestF = 999999;
    index = -1;
    for(int i=0;i<openList.size();i++) {
        if(openList[i]->GetF() < bestF) {
            for(int n=0;n<visitedList.size();n++) {
                if(CheckVisited(openList[i])) {
                    bestF = openList[i]->GetF();
                    index = i;
                }
            }
        }
    }

    if(index >= 0) {
        currentCell = openList[index];
    }
}
I know this code is messy and not the most efficient way to do things, but I think it should still be faster than it is. Any help would be greatly appreciated.
Thanks.
Your 20x20 maze has no walls, and therefore many, many routes which are all the same length. I'd estimate trillions of equivalent routes, in fact. It doesn't seem so bad when you take that into account.
Of course, since your heuristic looks perfect, you should get a big benefit from excluding routes that are heuristically predicted to be precisely as long as the best route known so far. (This is safe if your heuristic is correct, i.e. never overestimates the remaining distance).
Here is a big hint.
If ever you find two paths to the same cell, you can always throw away the longer one. If there is a tie, you can throw away the second one to get there.
If you implement that, with no other optimizations, the search would become more than acceptably fast.
Secondly, the A* algorithm should only bother backtracking if the length to the current cell plus its heuristic exceeds the length to some other open cell plus that cell's heuristic. If you implement that, then it should directly find a path and stop. To facilitate that you need to store paths in a priority queue (typically implemented with a heap), not a vector.
openList.erase is O(n), and the for-loop beginning with for(int i=0;i<openList.size();i++) is O(n^2) due to the call to CheckVisited - these are called every iteration, making your overall algorithm O(n^3). A* should be O(n log n).
Try changing openList to a priority-queue like it's supposed to be, and visitedList to a hash table. The entire for loop can then be replaced by a dequeue - make sure you check if visitedList.Contains(node) before enqueuing!
Also, there is no need to recalculate the ManHattenDistance for every node every iteration, since it never changes.
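A rough sketch of those two structures, with a placeholder Cell type instead of the asker's SearchCell (the cellKey function and the manhattan call are just one way to do it, not part of the original code):
#include <queue>
#include <unordered_set>
#include <vector>

struct Cell {
    int x, y, G, H;
    Cell* parent;
    int F() const { return G + H; }
};

// Min-heap on F, so the best open cell is always at the top.
struct ByF {
    bool operator()(const Cell* a, const Cell* b) const { return a->F() > b->F(); }
};

std::priority_queue<Cell*, std::vector<Cell*>, ByF> openList;
std::unordered_set<long long> visited;   // one entry per cell ever enqueued

inline long long cellKey(int x, int y) { return static_cast<long long>(x) * 1000003 + y; }

// When expanding a neighbour (nx, ny) of currentCell:
// if (visited.insert(cellKey(nx, ny)).second)
//     openList.push(new Cell{nx, ny, currentCell->G + 1, manhattan(nx, ny, goalCell), currentCell});
Dequeuing the best cell is then openList.top() followed by openList.pop(), both O(log n) or better, instead of the nested scans.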
Aren't you constantly backtracking?
The A* algorithm backtracks when the current best solution becomes worse than another previously visited route. In your case, since there are no walls, all routes are good and never die (and as MSalters correctly pointed out, there are a great many of them). When you take a step, your route becomes worse than all the others that are one step shorter.
If that is true, this may account for the time taken by your algorithm.

How to keep only the last duplicate when iterating through rows

The following code iterates through many data rows, calculates a score per row, and then sorts the rows according to that score:
unsigned count = 0;
score_pair* scores = new score_pair[num_rows];
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    scores[count].score = score;
    scores[count].doc_id = row->docid;
    count++;
}
assert(count <= num_rows);
qsort(scores, count, sizeof(score_pair), score_cmp);
Unfortunately, there are many duplicate rows with the same docid but different scores. Now I'd like to keep only the last score for any docid. The docids are unsigned ints, but usually big (=> no lookup array) - using a HashMap to look up the last count for a docid would probably be too slow (many millions of rows; it should only take seconds, not minutes...).
OK, I modified my code to use a std::map:
map<int, int> docid_lookup;
unsigned count = 0;
score_pair* scores = new score_pair[num_rows];
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    map<int, int>::iterator iter = docid_lookup.find(row->docid);
    if (iter != docid_lookup.end()) {
        scores[iter->second].score = score;
        scores[iter->second].doc_id = row->docid;
    } else {
        scores[count].score = score;
        scores[count].doc_id = row->docid;
        docid_lookup[row->docid] = count;
        count++;
    }
}
It works and the performance hit is not as bad as I expected - it now runs in a minute instead of 16 seconds, so it's roughly a factor of four. Memory usage has also gone up from about 1 GB to 4 GB.
The first thing I'd try would be a map or unordered_map: I'd be surprised if performance is a factor of 60 slower than what you did without any unique-ification. If the performance there isn't acceptable, another option is something like this:
// get the computed data into a vector
std::vector<score_pair>::size_type count = 0;
std::vector<score_pair> scores;
scores.reserve(num_rows);
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    scores.push_back(score_pair(score, row->docid));
}
assert(scores.size() <= num_rows);

// remove duplicate doc_ids
std::reverse(scores.begin(), scores.end());
std::stable_sort(scores.begin(), scores.end(), docid_cmp);
scores.erase(
    std::unique(scores.begin(), scores.end(), docid_eq),
    scores.end()
);

// order by score
std::sort(scores.begin(), scores.end(), score_cmp);
Note that the use of reverse and stable_sort is because you want the last score for each doc_id, but std::unique keeps the first. If you wanted the first score you could just use stable_sort, and if you didn't care what score, you could just use sort.
The best way of handling this is probably to pass reverse iterators into std::unique, rather than a separate reverse operation. But I'm not confident I can write that correctly without testing, and errors might be really confusing, so you get the unoptimised code...
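For what it's worth, here is one possible shape of that reverse-iterator version (equally untested, and it assumes the same docid_cmp, docid_eq and score_cmp predicates as above):
// sort by doc_id; stable_sort keeps the original input order within equal ids
std::stable_sort(scores.begin(), scores.end(), docid_cmp);
// walking the vector backwards, std::unique keeps the first element of each group
// it meets, which is the last row read for that doc_id; the survivors end up at
// the back of the vector, so everything before them is erased
scores.erase(scores.begin(),
             std::unique(scores.rbegin(), scores.rend(), docid_eq).base());
// order by score
std::sort(scores.begin(), scores.end(), score_cmp);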
Edit: just for comparison with your code, here's how I'd use the map:
std::map<int, float> scoremap;
while ((row = data.next_row())) {
    scoremap[row->docid] = calc_score(data.next_feature());
}
std::vector<score_pair> scores(scoremap.begin(), scoremap.end());
std::sort(scores.begin(), scores.end(), score_cmp);
Note that score_pair would need a constructor taking a std::pair<int,float>, which makes it non-POD. If that's not acceptable, use std::transform, with a function to do the conversion.
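For illustration, the std::transform variant could look roughly like this (score_pair stays a plain struct; the member names follow the question's code):
#include <algorithm>
#include <iterator>

std::vector<score_pair> scores;
scores.reserve(scoremap.size());
std::transform(scoremap.begin(), scoremap.end(), std::back_inserter(scores),
               [](const std::pair<const int, float>& p) {
                   score_pair sp;
                   sp.doc_id = p.first;
                   sp.score  = p.second;
                   return sp;
               });
std::sort(scores.begin(), scores.end(), score_cmp);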
Finally, if there is much duplication (say, on average 2 or more entries per doc_id), and if calc_score is non-trivial, then I would be looking to see whether it's possible to iterate the rows of data in reverse order. If it is, then it will speed up the map/unordered_map approach, because when you get a hit for the doc_id you don't need to calculate the score for that row, just drop it and move on.
I'd go for a std::map of docids. If you could create an appropriate hashing function, a hash map would be preferable, but I guess that's too difficult. And no - the std::map is not too slow. Access is O(log n), which is nearly as good as O(1) (O(1) is array access time, and hash map access, by the way).
Btw, if std::map is too slow, qsort's O(n log n) is too slow as well. And, using a std::map and iterating over its contents, you can perhaps save your qsort.
Some additions in response to the comment (by onebyone):
I did not go into implementation details, since there wasn't enough information on that.
qsort may behave badly with sorted data (depending on the implementation); std::map may not. This is a real advantage, especially if you read the values from a database that might output them ordered by key.
There was no word on the memory allocation strategy. Changing to a memory allocator with fast allocation of small objects may improve the performance.
Still - the fastest would be a hash map with an appropriate hash function. Since there's not enough information about the distribution of the keys, presenting one in this answer is not possible.
In short - if you ask general questions, you get general answers. This means - at least for me - looking at the time complexity in O-notation. Still, you were right: depending on different factors, the std::map may be too slow while qsort is still fast enough - it may also be the other way round in the worst case of qsort, where it has O(n²) complexity.
Unless I've misunderstood the question, the solution can be simplified considerably. At least as I understand it, you have a few million docid's (which are of type unsigned int) and for each unique docid, you want to store one 'score' (which is a float). If the same docid occurs more than once in the input, you want to keep the score from the last one. If that's correct, the code can be reduced to this:
std::map<unsigned, float> scores;
while ((row = data.next_row()))
    scores[row->docid] = calc_score(data.next_feature());
This will probably be somewhat slower than your original version since it allocates a lot of individual blocks rather than one big block of memory. Given your statement that there are a lot of duplicates in the docid's, I'd expect this to save quite a bit of memory, since it only stores data for each unique docid rather than for every row in the original data.
If you wanted to optimize this, you could almost certainly do so -- since it uses a lot of small blocks, a custom allocator designed for that purpose would probably help quite a bit. One possibility would be to take a look at the small-block allocator in Andrei Alexandrescu's Loki library. He's done more work on the problem since, but the one in Loki is probably sufficient for the task at hand -- it'll almost certainly save a fair amount of memory and run faster as well.
If your C++ implementation has it, and most do, try hash_map instead of std::map (it's sometimes available under std::hash_map).
If the lookups themselves are your computational bottleneck, this could be a significant speedup over std::map's binary tree.
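In current C++ this container is spelled std::unordered_map; the accumulation loop from the other answer stays the same:
#include <unordered_map>

std::unordered_map<unsigned, float> scores;   // average O(1) lookup/insert per row
while ((row = data.next_row()))
    scores[row->docid] = calc_score(data.next_feature());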
Why not sort by doc id first, calculate scores, then for any subset of duplicates use the max score?
On re-reading the question, I'd suggest a modification to how scores are read in. Keep in mind C++ isn't my native language, so this won't quite be compilable.
unsigned count = 0;
pair<int, score_pair>* scores = new pair<int, score_pair>[num_rows];
while ((row = data.next_row())) {
    float score = calc_score(data.next_feature());
    scores[count].second.score = score;
    scores[count].second.doc_id = row->docid;
    scores[count].first = count;          // remember the read order for tie-breaking
    count++;
}
assert(count <= num_rows);
qsort(scores, count, sizeof(*scores), pair_docid_cmp);
//getting the number of unique doc ids
int scoreCount = count > 0 ? 1 : 0;
for (int i = 1; i < (int)count; i++)
    if (scores[i-1].second.doc_id != scores[i].second.doc_id) scoreCount++;

score_pair* actualScores = new score_pair[scoreCount];
int at = -1;
int lastId = -1;
for (int i = 0; i < (int)count; i++)
{
    // the first entry of each doc id group is the last one read, thanks to pair_docid_cmp
    if (lastId != scores[i].second.doc_id)
    {
        actualScores[++at] = scores[i].second;
        lastId = scores[i].second.doc_id;
    }
}
qsort(actualScores, scoreCount, sizeof(score_pair), score_cmp);
Here pair_docid_cmp would compare first on docid, grouping the same docs together, and then by reverse read order, so that the last item read comes first within the sublist of items sharing a docid. This should only be about 2.5x the memory usage and roughly twice the running time.