Kd-Tree flawed K nearest neighbor - c++

Disclaimer: There are some bad practices in this following code
Hello, I just had a few questions on how to correctly format my KD tree K nearest neighbor search. Here is an example of my function.
void nearest_neighbor(Node *T, int K) {
if (T == NULL) return;
nearest_neighbor(T->left, K);
//do stuff find dist etc
if(?)nearest_neighbor(T->right, K);
}
This code is confusing so I will try to explain it. My function only takes the k value and a Node T. What I am trying to do is find the distance between the current node and every other value in the structure. These all work, the issue I'm having is understanding when and how to call the recursive calls nearest_neighbor(T->left/T->right,K) I know I am meant to prune the calls to the right side but I'm not sure how to do this. This is an multidimensional KD Tree by the way. Any guidance to better examples would be very appreciated.

I would advise you to implement like Wikipedia says, where for your specific question, this:
Starting with the root node, the algorithm moves down the tree
recursively, in the same way that it would if the search point were
being inserted (i.e. it goes left or right depending on whether the
point is lesser than or greater than the current node in the split
dimension).
answers the question. Of course you can have this image in mind:
where if you have more two dimensions like in the example, you simply split in the first dimension, then in the second, then in the third, then in the forth and so on, and then you follow a cyclic policy, so that when you reach the final dimension, you start from the first dimension again.

The general idea is to keep a global point closest to the target, updating with newly discovered points and never descending into an n-gon that can't possibly contain a point closer than the nearest to the target already found. I'll show it in C rather than C++. You can easily translate to object-oriented form.
#define N_DIM <k for the k-D tree>
typedef float COORD;
typedef struct point_s {
COORD x[N_DIM];
} POINT;
typedef struct node_s {
struct node_s *lft, *rgt;
POINT p[1];
} NODE;
POINT target[1]; // target for nearest search
POINT nearest[1]; // nearest found so far
POINT b0[1], b1[1]; // search bounding box
bool prune_search() {
// Return true if no point in the bounding box [b0..b1] is closer
// to the target than than the current value of nearest.
}
void search(NODE *node, int dim);
void search_lft(NODE *node, int dim) {
if (!node->lft) return;
COORD save = b1->p->x[dim];
b1->p->x[dim] = node->p->x[dim];
if (!prune_search()) search(node->lft, (dim + 1) % N_DIM);
b1->p->x[dim] = save;
}
void search_rgt(NODE *node, int dim) {
if (!node->rgt) return;
COORD save = b0->p->x[dim];
b0->p->x[dim] = node->p->x[dim];
if (!prune_search()) search(node->rgt, (dim + 1) % N_DIM);
b0->p->x[dim] = save;
}
void search(NODE *node, int dim) {
if (dist(node->p, target) < dist(nearest, target)) *nearest = *node->p;
if (target->p->x[dim] < node->p->x[dim]) {
search_lft(node, dim);
search_rgt(node, dim);
} else {
search_rgt(node, dim);
search_lft(node, dim);
}
}
/** Set *nst to the point in the given kd-tree nearest to tgt. */
void get_nearest(POINT *nst, POINT *tgt, NODE *root) {
*b0 = POINT_AT_NEGATIVE_INFINITY;
*b1 = POINT_AT_POSITIVE_INFINITY;
*target = *tgt;
*nearest = *root->p;
search(root, 0);
*nst = *nearest;
}
Note this is not the most economical implementation. It does some unnecessary nearest updates and pruning comparisons for simplicity. But its asymptotic performance is as expected for kd-tree NN. After you get this one working, you can use it as a base implementation to squeeze out the extra comparisons.

Related

Shortest path in Graph with time limit

Let's say I have a graph G with N vertices and M edges. Each edge has its length and time (let's say in minutes), which it takes to traverse that edge. I need to find the shortest path in the graph between the vertices 1 and N, which is performed in under T minutes time.
Since time is the more valuable resource and we care about traversing the graph in time, and only then with minimal length, I decided to use Dijkstra's algorithm, for which I considered the time of each edge as its weight. I added a vector to store the durations. Thus, the algorithm returns the least time, not the least length. A friend suggested this addition to my code:
int answer(int T) {
int l = 1;
int r = M; // a very big number
int answer = M;
while (l <= r) {
int mid = (l + r) / 2;
int time = dijkstra(mid); // the parameter mid serves as an upper bound for dijkstra and I relax the edge only if its length(not time) is less than mid
if (time <= T) {
answer = mid;
r = mid - 1;
} else {
l = mid + 1;
}
}
if (best == M) {
return -1; // what we return in case there is no path in the graph, which takes less than T minutes
}
return answer;
}
Here is the dijkstra method (part of class Graph with std::unordered_map<int, std::vector<Node>> adjacencyList member):
int dijkstra(int maxLength) {
std::priority_queue<Node, std::vector<Node>, NodeComparator> heap;//NodeComparator sorts by time of edge
std::vector<int> durations(this->numberOfVertices + 1, M);
std::set<int> visited;
// duration 1->1 is 0
durations[1] = 0;
heap.emplace(1, 0, 0);
while (!heap.empty()) {
int vertex = heap.top().endVertex;
heap.pop();
// to avoid repetition
if (visited.find(vertex) != visited.end()) {
continue;
}
for (Node node: adjacencyList[vertex]) {
// relaxation
if (node.length <= maxLength && durations[node.endVertex] > durations[vertex] + node.time) {
durations[node.endVertex] = durations[vertex] + node.time;
heap.emplace(node.endVertex, durations[node.endVertex], 0);
}
}
// mark as visited to avoid going through the same vertex again
visited.insert(vertex);
}
// return path time between 1 and N bounded by maxKilograms
return durations.back();
}
This seems to work but seems inefficient to me. To be frank, I don't understand his idea completely. It appears to me like randomly trying to find the best answer(because nobody said that the time of an edge is tied proportionally to its length). I tried searching for shortest path in graph with time limit but I found algorithms that find the fastest paths, not the shortest with a limit. Does an algorithm for this even exist? How can improve my solution?
What is this?
int time = dijkstra(mid);
It certainly isnt an implementation of the Dijkstra algorithm!
The Dijkstra algorithm requires the starting node and returns THE shortest path from the starting node to every other.
You are going to need a function that returns all the distinct paths between start and end nodes that take less than T. Then you can search them for the one that is cheapest.
Search graph for all distinct paths from start to end
Discard paths that take more then T
Select cheapest path.
Finding a resource constrained shortest path is NPHard. Most approaches to this problem employ a labelling scheme, which is a specialization of dynamic programming. You could use available libraries to accomplish this. See here for a boost implementation.

Depth First Search: Formatting output?

If I have the following graph:
Marisa Mariah
\ /
Mary---Maria---Marian---Maryanne
|
Marley--Marla
How should be Depth First Search function be implemented such that I get the output if "Mary" is my start point ?
Mary
Maria
Marisa
Mariah
Marian
Maryanne
Marla
Merley
I do realize that the number of spaces equal to depth of the vertex( name ) but I don't how to code that. Following is my function:
void DFS(Graph g, Vertex origin)
{
stack<Vertex> vertexStack;
vertexStack.push(origin);
Vertex currentVertex;
int currentDepth = 0;
while( ! vertexStack.empty() )
{
currentVertex = vertexStack.top();
vertexStack.pop();
if(currentVertex.visited == false)
{
cout << currentVertex.name << endl;
currentVertex.visited = true;
for(int i = 0; i < currentVertex.adjacencyList.size(); i++)
vertexStack.push(currentVertex.adjacencyList[i]);
}
}
}
Thanks for any help !
Just store the node and its depth your stack:
std::stack<std::pair<Vertex, int>> vertexStack;
vertexStack.push(std::make_pair(origin, 0));
// ...
std::pair<Vertex, int> current = vertexStack.top();
Vertex currentVertex = current.first;
int depth = current.second;
If you want to get fancy, you can extra the two values using std::tie():
Vertex currentVertex;
int depth;
std::tie(currentVertex, depth) = vertexStack.top();
With knowing the depth you'd just indent the output appropriately.
The current size of your stack is, BTW, unnecessarily deep! I think for a complete graph it may contain O(N * N) elements (more precisely, (N-1) * (N-2)). The problem is that you push many nodes which may get visited.
Assuming using an implicit stack (i.e., recursion) is out of question (it won't work for large graphs as you may get a stack overflow), the proper way to implement a depth first search would be:
push the current node and edge on the stack
mark the top node visited and print it, using the stack depth as indentation
if there is no node
if the top nodes contains an unvisited node (increment the edge iterator until such a node is found) go to 1.
otherwise (the edge iterator reached the end) remove the top node and go to 3.
In code this would look something like this:
std::stack<std::pair<Node, int> > stack;
stack.push(std::make_pair(origin, 0));
while (!stack.empty()) {
std::pair<Node, int>& top = stack.top();
for (; top.second < top.first.adjacencyList.size(); ++top.second) {
Node& adjacent = top.first.adjacencyList[top.second];
if (!adjacent.visited) {
adjacent.visted = true;
stack.push(std::make_pair(adjacent, 0));
print(adjacent, stack.size());
break;
}
}
if (stack.top().first.adjacencyList.size() == stack.top().second) {
stack.pop();
}
}
Let Rep(Tree) be the representation of the tree Tree. Then, Rep(Tree) looks like this:
Root
<Rep(Subtree rooted at node 1)>
<Rep(Subtree rooted at node 2)>
.
.
.
So, have your dfs function simply return the representation of the subtree rooted at that node and modify this value accordingly. Alternately, just tell every dfs call to print the representation of the tree rooted at that node but pass it the current depth. Here's an example implementation of the latter approach.
void PrintRep(const Graph& g, Vertex current, int depth)
{
cout << std::string(' ', 2*depth) << current.name << endl;
current.visited = true;
for(int i = 0; i < current.adjacencyList.size(); i++)
if(current.adjacencyList[i].visited == false)
PrintRep(g, current.adjacencyList[i], depth+1);
}
You would call this function with with your origin and depth 0 like this:
PrintRep(g, origin, 0);

Finding nearest neighbor(s) in a KD Tree

Warning: Fairly long question, perhaps too long. If so, I apologize.
I'm working on a program involving a nearest neighbor(s) search of a kd tree (in this example, it is an 11 dimensional tree with 3961 individual points). We've only just learned about them, and while I have a good grasp of what the tree does, I get very confused when it comes to the nearest neighbor search.
I've set up a 2D array of points, each containing a quality and a location, which looks like this.
struct point{
double quality;
double location;
}
// in main
point **parray;
// later points to an array of [3961][11] points
I then translated the data so it has zero mean, and rescaled it for unit variance. I won't post the code as it's not important to my questions. Afterwards, I built the points into the tree in random order like this:
struct Node {
point *key;
Node *left;
Node *right;
Node (point *k) { key = k; left = right = NULL; }
};
Node *kd = NULL;
// Build the data into a kd-tree
random_shuffle(parray, &parray[n]);
for(int val=0; val<n; val++) {
for(int dim=1; dim<D+1; dim++) {
kd = insert(kd, &parray[val][dim], dim);
}
}
Pretty standard stuff. If I've used random_shuffle() incorrectly, or if anything is inherently wrong about the structure of my tree, please let me know. It should shuffle the first dimension of the parray, while leaving the 11 dimensions of each in order and untouched.
Now I'm on to the neighbor() function, and here's where I've gotten confused.
The neighbor() function (last half is pseudocode, where I frankly have no idea where to start):
Node *neighbor (Node *root, point *pn, int d,
Node *best, double bestdist) {
double dist = 0;
// Recursively move down tree, ignore the node we are comparing to
if(!root || root->key == pn) return NULL;
// Dist = SQRT of the SUMS of SQUARED DIFFERENCES of qualities
for(int dim=1; dim<D+1; dim++)
dist += pow(pn[d].quality - root->key->quality, 2);
dist = sqrt(dist);
// If T is better than current best, current best = T
if(!best || dist<bestdist) {
bestdist = dist;
best = root;
}
// If the dist doesn't reach a plane, prune search, walk back up tree
// Else traverse down that tree
// Process root node, return
}
Here's the call to neighbor in main(), mostly uncompleted. I'm not sure what should be in main() and what should be in the neighbor() function:
// Nearest neighbor(s) search
double avgdist = 0.0;
// For each neighbor
for(int i=0; i<n; i++) {
// Should this be an array/tree of x best neighbors to keep track of them?
Node *best;
double bestdist = 1000000000;
// Find nearest neighbor(s)?
for(int i=0; i<nbrs; i++) {
neighbor(kd, parray[n], 1, best, &bestdist);
}
// Determine "distance" between the two?
// Add to total dist?
avgdist += bestdist;
}
// Average the total dist
// avgdist /= n;
As you can see, I'm stuck on these last two sections of code. I've been wracking my brain over this for a few days now, and I'm still stuck. It's due very soon, so of course any and all help is appreciated. Thanks in advance.
The kd-tree does not involve shuffling.
In fact, you will want to use sorting (or better, quickselect) to build the tree.
First solve it for the nearest neighbor (1NN). It should be fairly clear how to find the kNN once you have this part working, by keeping a heap of the top candidates, and using the kth point for pruning.

Finding adjacent nodes in a tree

I'm developing a structure that is like a binary tree but generalized across dimensions so you can set whether it is a binary tree, quadtree, octree, etc by setting the dimension parameter during initialization.
Here is the definition of it:
template <uint Dimension, typename StateType>
class NDTree {
public:
std::array<NDTree*, cexp::pow(2, Dimension)> * nodes;
NDTree * parent;
StateType state;
char position; //position in parents node list
bool leaf;
NDTree const &operator[](const int i) const
{
return (*(*nodes)[i]);
}
NDTree &operator[](const int i)
{
return (*(*nodes)[i]);
}
}
So, to initialize it- I set a dimension and then subdivide. I am going for a quadtree of depth 2 for illustration here:
const uint Dimension = 2;
NDTree<Dimension, char> tree;
tree.subdivide();
for(int i=0; i<tree.size(); i++)
tree[i].subdivide();
for(int y=0; y<cexp::pow(2, Dimension); y++) {
for(int x=0; x<cexp::pow(2, Dimension); x++) {
tree[y][x].state = ((y)*10)+(x);
}
}
std::cout << tree << std::endl;
This will result in a quadtree, the state of each of the values are initialized to [0-4][0-4].
([{0}{1}{2}{3}][{10}{11}{12}{13}][{20}{21}{22}{23}][{30}{31}{32}{33}])
I am having trouble finding adjacent nodes from any piece. What it needs to do is take a direction and then (if necessary) traverse up the tree if the direction goes off of the edge of the nodes parent (e.g. if we were on the bottom right of the quadtree square and we needed to get the piece to the right of it). My algorithm returns bogus values.
Here is how the arrays are laid out:
And here are the structures necessary to know for it:
This just holds the direction for items.
enum orientation : signed int {LEFT = -1, CENTER = 0, RIGHT = 1};
This holds a direction and whether or not to go deeper.
template <uint Dimension>
struct TraversalHelper {
std::array<orientation, Dimension> way;
bool deeper;
};
node_orientation_table holds the orientations in the structure. So in 2d, 0 0 refers to the top left square (or left left square).
[[LEFT, LEFT], [RIGHT, LEFT], [LEFT, RIGHT], [RIGHT, RIGHT]]
And the function getPositionFromOrientation would take LEFT, LEFT and return 0. It is just basically the opposite of the node_orientation_table above.
TraversalHelper<Dimension> traverse(const std::array<orientation, Dimension> dir, const std::array<orientation, Dimension> cmp) const
{
TraversalHelper<Dimension> depth;
for(uint d=0; d < Dimension; ++d) {
switch(dir[d]) {
case CENTER:
depth.way[d] = CENTER;
goto cont;
case LEFT:
if(cmp[d] == RIGHT) {
depth.way[d] = LEFT;
} else {
depth.way[d] = RIGHT;
depth.deeper = true;
}
break;
case RIGHT:
if(cmp[d] == LEFT) {
depth.way[d] = RIGHT;
} else {
depth.way[d] = LEFT;
depth.deeper = true;
}
break;
}
cont:
continue;
}
return depth;
}
std::array<orientation, Dimension> uncenter(const std::array<orientation, Dimension> dir, const std::array<orientation, Dimension> cmp) const
{
std::array<orientation, Dimension> way;
for(uint d=0; d < Dimension; ++d)
way[d] = (dir[d] == CENTER) ? cmp[d] : dir[d];
return way;
}
NDTree * getAdjacentNode(const std::array<orientation, Dimension> direction) const
{
//our first traversal pass
TraversalHelper<Dimension> pass = traverse(direction, node_orientation_table[position]);
//if we are lucky the direction results in one of our siblings
if(!pass.deeper)
return (*(*parent).nodes)[getPositionFromOrientation<Dimension>(pass.way)];
std::vector<std::array<orientation, Dimension>> up; //holds our directions for going up the tree
std::vector<std::array<orientation, Dimension>> down; //holds our directions for going down
NDTree<Dimension, StateType> * tp = parent; //tp is our tree pointer
up.push_back(pass.way); //initialize with our first pass we did above
while(true) {
//continue going up as long as it takes, baby
pass = traverse(up.back(), node_orientation_table[tp->position]);
std::cout << pass.way << " :: " << uncenter(pass.way, node_orientation_table[tp->position]) << std::endl;
if(!pass.deeper) //we've reached necessary top
break;
up.push_back(pass.way);
//if we don't have any parent we must explode upwards
if(tp->parent == nullptr)
tp->reverseBirth(tp->position);
tp = tp->parent;
}
//line break ups and downs
std::cout << std::endl;
//traverse upwards combining the matrices to get our actual position in cube
tp = const_cast<NDTree *>(this);
for(int i=1; i<up.size(); i++) {
std::cout << up[i] << " :: " << uncenter(up[i], node_orientation_table[tp->position]) << std::endl;
down.push_back(uncenter(up[i], node_orientation_table[tp->parent->position]));
tp = tp->parent;
}
//make our way back down (tp is still set to upmost parent from above)
for(const auto & i : down) {
int pos = 0; //we need to get the position from an orientation list
for(int d=0; d<i.size(); d++)
if(i[d] == RIGHT)
pos += cexp::pow(2, d); //consider left as 0 and right as 1 << dimension
//grab the child of treepointer via position we just calculated
tp = (*(*tp).nodes)[pos];
}
return tp;
}
For an example of this:
std::array<orientation, Dimension> direction;
direction[0] = LEFT; //x
direction[1] = CENTER; //y
NDTree<Dimension> * result = tree[3][0]->getAdjacentNode(direction);
This should grab the top right square within bottom left square, e.g. tree[2][1] which would have a value of 21 if we read its state. Which works since my last edit (algorithm is modified). Still, however, many queries do not return correct results.
//Should return tree[3][1], instead it gives back tree[2][3]
NDTree<Dimension, char> * result = tree[1][2].getAdjacentNode({ RIGHT, RIGHT });
//Should return tree[1][3], instead it gives back tree[0][3]
NDTree<Dimension, char> * result = tree[3][0].getAdjacentNode({ RIGHT, LEFT });
There are more examples of incorrect behavior such as tree[0][0](LEFT, LEFT), but many others work correctly.
Here is the folder of the git repo I am working from with this. Just run g++ -std=c++11 main.cpp from that directory if it is necessary.
here is one property you can try to exploit:
consider just the 4 nodes:
00 01
10 11
Any node can have up to 4 neighbor nodes; two will exist in the same structure (larger square) and you have to look for the other two in neighboring structures.
Let's focus on identifying the neighbors which are in the same structure: the neighbors for 00 are 01 and 10; the neighbors for 11 are 01 and 10. Notice that only one bit differs between neighbor nodes and that neighbors can be classified in horizontal and vertical. SO
00 - 01 00 - 01 //horizontal neighbors
| |
10 11 //vertical neighbors
Notice how flipping the MSB gets the vertical neighbor and flipping the LSB gets the horizontal node? Let's have a close look:
MSB: 0 -> 1 gets the node directly below
1 -> 0 sets the node directly above
LSB: 0 -> 1 gets the node to the right
1 -> 0 gets the node to the left
So now we can determine the node's in each direction assuming they exist in the same substructure. What about the node to the left of 00 or above 10?? According to the logic so far if you want a horizontal neighbor you should flip the LSB; but flipping it would retrieve 10 ( the node to the right). So let's add a new rule, for impossible operations:
you can't go left for x0 ,
you can't go right for x1,
you can't go up for 0x,
you can't go down for 1x
*impossible operations refers to operations within the same structure.
Let's look at the bigger picture which are the up and left neighbors for 00? if we go left for 00 of strucutre 0 (S0), we should end up with 01 of(S1), and if we go up we end up with node 10 of S(2). Notice that they are basically the same horizontal/ veritical neighbor values form S(0) only they are in different structures. So basically if we figure out how to jump from one structure to another we have an algorithm.
Let's go back to our example: going up from node 00 (S0). We should end up in S2; so again 00->10 flipping the MSB. So if we apply the same algorithm we use within the structure we should be fine.
Bottom line:
valid transition within a strucutres
MSB 0, go down
1, go up
LSB 0, go right
1, go left
for invalid transitions (like MSB 0, go up)
determine the neighbor structure by flipping the MSB for vertical and LSB for vertical
and get the neighbor you are looking for by transforming a illegal move in structure A
into a legal one in strucutre B-> S0: MSB 0 up, becomes S2:MSB 0 down.
I hope this idea is explicit enough
Check out this answer for neighbor search in octrees: https://stackoverflow.com/a/21513573/3146587. Basically, you need to record in the nodes the traversal from the root to the node and manipulate this information to generate the required traversal to reach the adjacent nodes.
The simplest answer I can think of is to get back your node from the root of your tree.
Each cell can be assigned a coordinate mapping to the deepest nodes of your tree. In your example, the (x,y) coordinates would range from 0 to 2dimension-1 i.e. 0 to 3.
First, compute the coordinate of the neighbour with whatever algorithm you like (for instance, decide if a right move off the edge should wrap to the 1st cell of the same row, go down to the next row or stay in place).
Then, feed the new coordinates to your regular search function. It will return the neighbour cell in dimension steps.
You can optimize that by looking at the binary value of the coordinates. Basically, the rank of the most significant bit of difference tells you how many levels up you should go.
For instance, let's take a quadtree of depth 4. Coordinates range from 0 to 15.
Assume we go left from cell number 5 (0101b). The new coordinate is 4 (0100b). The most significant bit changed is bit 0, which means you can find the neighbour in the current block.
Now if you go right, the new coordinate is 6 (0110b), so the change is affecting bit 1, which means you have to go up one level to access your cell.
All this being said, the computation time and volume of code needed to use such tricks seems hardly worth the effort to me.

A star algorithm without diagonal movement

Situation:
I'm trying to translate the A* algoritm into c++ code where there is no diagonal movement allowed but I'm having some strange behaviour.
My question: Is it needed to also take into account the diagonal cost even when there is no diagonal movement possible.
When I do the calculations without diagonal cost (with a 'heuristic' factor 10), I always have the same fscore of 80 (you can see this in the second picture where I calculate fscore, gscore and hscore) and this seems really strange to me. Because of this (almost all the nodes having a fscore of 80) the algorimth doesn't seem to work because to many nodes have the same smallest fscore of 80.
Or is this normal behaviour and will a A* star implementation without diagonal have to perform a lot more work (because more nodes have the same minimum fscore)?.
I don't think this question has anything to do with my code than with my reasoning but I'm posting it anyway:
pathFind.cpp
#include "pathFind.h"
#include <queue>
#include "node.h"
#include <QString>
#include <QDebug>
#include <iostream>
/** Pathfinding (A* algo) using Manhatten heuristics and assuming a monotonic, consistent
* heuristic (the enemies do not change position)
*
* TODO: add variable terrain cost
**/
//dimensions
const int horizontalSize = 20;
const int verticalSize = 20;
//nodes sets
static int closedNodes[horizontalSize][verticalSize]; //the set of nodes already evaluated
static int openNodes[horizontalSize][verticalSize]; // the set of nodes to be evaluated; initialy only containing start node
static int dir_map[horizontalSize][verticalSize]; //map of directions (contains parents-children connection)
//directions
const int dir=4;
static int dirX[dir]={1,0,-1,0};
static int dirY[dir]={0,1,0,-1};
//test class
static int map[horizontalSize][verticalSize]; //map of navigated nodes
pathFind::pathFind(){
//test
srand(time(NULL));
//create empty map
for (int y=0;y<verticalSize;y++){
for (int x=0;x<horizontalSize;x++){
map[x][y]=0;
}
}
//fillout matrix
for (int x=horizontalSize/8;x<horizontalSize*7/8;x++){
map[x][verticalSize/2]=1;
}
for (int y=verticalSize/8;y<verticalSize*7/8;y++){
map[horizontalSize/2][y]=1;
}
//randomly select start and finish locations
int xA,yA,xB,yB;
int n=horizontalSize;
int m=verticalSize;
xA=6;
yA=5;
xB = 14;
yB = 12;
qDebug() <<"Map Size (X,Y): "<<n<<","<<m<<endl;
qDebug()<<"Start: "<<xA<<","<<yA<<endl;
qDebug()<<"Finish: "<<xB<<","<<yB<<endl;
// get the route
clock_t start = clock();
QString route=calculatePath(xA, yA, xB, yB);
if(route=="") qDebug() <<"An empty route generated!"<<endl;
clock_t end = clock();
double time_elapsed = double(end - start);
qDebug()<<"Time to calculate the route (ms): "<<time_elapsed<<endl;
qDebug()<<"Route:"<<endl;
qDebug()<<route<<endl<<endl;
}
QString pathFind::calculatePath(const int & xStart, const int & yStart,const int & xFinish, const int & yFinish){
/** why do we maintain a priority queue?
* it's for maintaining the open list: everytime we acces the open list we need to find the node with the lowest
* fscore. A priority queue is a sorted list so we simply have to grab (pop) the first item of the list everytime
* we need a lower fscore.
*
* A priority queue is a data structure in which only the largest element can be retrieved (popped).
* It's problem is that finding an node in the queue is a slow operation.
**/
std::priority_queue<node> pq[2]; //we define 2 priority list which is needed for our priority change of a node 'algo'
static int index; //static and global variables are initialized to 0
static node *currentNode;
static node *neighborNode;
//first reset maps
resetMaps();
//create start node
static node * startNode;
startNode= new node(xStart,yStart,0,0);
startNode->updatePriority(xFinish, yFinish);
// push it into the priority queue
pq[index].push(*startNode);
//add it to the list of open nodes
openNodes[0][0] = startNode->getPriority();
//A* search
//while priority queue is not empty; continue
while(!pq[index].empty()){
//get current node with the higest priority from the priority list
//in first instance this is the start node
currentNode = new node(pq[index].top().getxPos(),
pq[index].top().getyPos(),
pq[index].top().getDistance(),
pq[index].top().getPriority());
//remove node from priority queue
pq[index].pop();
openNodes[currentNode->getxPos()][currentNode->getyPos()]=0; //remove node from open list
closedNodes[currentNode->getxPos()][currentNode->getyPos()]=1; //add current to closed list
//if current node = goal => finished => retrace route back using parents nodes
if (currentNode->getxPos()==xFinish && currentNode->getyPos()==yFinish){
//quit searching if goal is reached
//return generated path from finish to start
QString path="";
int x,y,direction; //in cpp you don't need to declare variables at the top compared to c
//currentNode is now the goalNode
x=currentNode->getxPos();
y=currentNode->getyPos();
while (!(x==xStart && y==yStart)){
/** We start at goal and work backwards moving from node to parent
* which will take us to the starting node
**/
direction=dir_map[x][y];
path =(direction+dir/2)%dir+path; //we work our way back
//iterate trough our dir_map using our defined directions
x+=dirX[direction];
y+=dirY[direction];
}
//garbage collection; the memory allocated with 'new' should be freed to avoid memory leaks
delete currentNode;
while (!pq[index].empty()){
pq[index].pop();
}
return path;
//return path;
} else {
//add all possible neighbors to the openlist + define parents for later tracing back
//(four directions (int dir): up, down, left & right); but we want to make it
//as extendable as possible
for (int i=0; i<dir; i++){
//define new x & y coord for neighbor
//we do this because we want to preform some checks before making this neighbore node
int neighborX = currentNode->getxPos()+dirX[i];
int neighborY = currentNode->getyPos()+dirY[i];
//check boundaries
//ignore if on closed list (was already evaluted) or if unwalkable (currently not implemented)
if (!(neighborX <0 || neighborY <0 || neighborX > horizontalSize || neighborY > verticalSize ||
closedNodes[neighborX][neighborY]==1)){
//ok -> generate neighbor node
neighborNode = new node (neighborX,neighborY,currentNode->getDistance(),currentNode->getPriority());
//calculate the fscore of the node
neighborNode->updatePriority(xFinish,yFinish);
neighborNode->updateDistance();
//if neighbor not in openset => add it
if(openNodes[neighborX][neighborY]==0){
//add it to open set
openNodes[neighborX][neighborY]=neighborNode->getPriority();
//add it to the priority queue (by dereferencing our neighborNode object
//pq is of type node; push inserts a new element;
//the content is initialized as a copy
pq[index].push(*neighborNode);
//set the parent to the node we are currently looking at
dir_map[neighborX][neighborY]=(i+dir/2)%dir;
//if neighbor is already on open set
//check if path to that node is a better one (=lower gscore) if we use the current node to get there
} else if(openNodes[neighborX][neighborY]>neighborNode->getPriority()) {
/** lower gscore: change parent of the neighbore node to the select square
* recalculate fscore
* the value of the fscore should also be changed inside the node which resides inside our priority queue
* however as mentioned before this is a very slow operation (is going to be the bottleneck of this implemention probably)
* we will have to manuall scan for the right node and than change it.
*
* this check is very much needed, but the number of times this if condition is true is limited
**/
//update fscore inside the open list
openNodes[neighborX][neighborY]=neighborNode->getPriority();
//update the parent node
dir_map[neighborX][neighborY]=(i+dir/2)%dir;
//we copy the nodes from one queue to the other
//except the -old-current node will be ignored
while (!((pq[index].top().getxPos()==neighborX) && (pq[index].top().getyPos()==neighborY))){
pq[1-index].push(pq[index].top());
pq[index].pop();
}
pq[index].pop(); //remove the -old-current node
/** ?? **/
if(pq[index].size()>pq[1-index].size()){ //??? is this extra check necessary?
index=1-index; //index switch 1->0 or 0->1
}
while(!pq[index].empty()){
pq[1-index].push(pq[index].top());
pq[index].pop();
}
/** ?? **/
index=1-index; //index switch 1->0 or 0->1
pq[index].push(*neighborNode); //and the -new-current node will be pushed in instead
} else delete neighborNode;
}
}
delete currentNode;
}
} return ""; //no path found
}
void pathFind::resetMaps(){
for (int y=0;y<horizontalSize;y++){
for (int x=0;x<verticalSize;x++){
closedNodes[x][y]=0;
openNodes[x][y]=0;
}
}
}
node.cpp
#include "node.h"
#include <stdio.h>
#include <stdlib.h>
/** constructor **/
node::node(int x,int y, int d,int p)
{
xPos=x;
yPos=y;
distance=d;
priority=p;
}
/** getters for variables **/
//current node xPosition
int node::getxPos() const {
return xPos;
}
//current node yPosition
int node::getyPos() const {
return yPos;
}
//gscore
int node::getDistance() const {
return distance;
}
//fscore
int node::getPriority() const {
return priority;
}
/** Updates the priority; the lower the fscore the higer the priority
* the fscore is the sum of:
* -path-cost (gscore) (which is the distance from the starting node to the current node)
* -heuristic estimate (hscore) (which is an estimate of the distance from the current node to the destination node)
*
**/
void node::updatePriority(const int & xDest, const int & yDest){
priority = distance + estimateDistance(xDest,yDest)*10;
}
void node::updateDistance(){//const int & direction
distance +=10;
}
/** Estimate function for the remaining distance to the goal
* here it's based on the Manhattan distance;
* which is the distance between two points in a grid based on a strictly horizontal & veritcal path;
* => sum of vertical & horizontal components
**/
const int & node::estimateDistance(const int & xDest, const int & yDest) const{
static int xDistance,yDistance,totalDistance;
xDistance=xDest-xPos;
yDistance=yDest-yPos;
totalDistance=abs(xDistance)+abs(yDistance);
return (totalDistance);
}
/** class functor (I think) to compare elements using:
* operator overloading: "<" gets overloaded which we are going to use in our priority queue
* to determine priority of a node in our queue;
* returns true if node a has a lower fscore compared to node b
*
* there is an ambiguity here: < -- >; better to make both >
* also prototype is now friend which could probably be replaced with this for the first
* argument; it advantage is that because of the friend function the operand order can be reversed
* this doesn't really looks to favor our application; so should I use it or not?
**/
bool operator<(const node & a, const node & b){
return a.getPriority() > b.getPriority();
}
I think your algorithm have no problem, and if there is no diagonal movements available, it won't be necessary to take it into account.
Manhattan distance is a simple heuristic and as you become closer to destination node the H function gives smaller numbers though the G function(distance from first node till here) gets bigger. As the result, many nodes have the same f score.
A* is full generic- the graph you give it can be any shape at all. It only cares about whether you can go from X to Y, not why. If diagonal movement is not allowed, it does not care. And it is quite fine and legitimate for many nodes to have the same f score- it is, in fact, expected.
i think it doesn't matter how many squares have the same f score Bcz you choose one among them. As said in the main algorithm of A*, "For the purposes of speed, it can be faster to choose the last one you added to the open list".
I hope it wouldn't be a probelm if you don't consider diagonal costs. Seems interesting how it give results.
I hope you already read this article. If not have a look at it clearly.