I get a segmentation fault in the call to
auto n1=std::make_shared<Node>(n,n->x+i,n->y+j);
after a few recursive calls. Strange thing is that it's always at the same point in time. Can anyone spot the problem?
This is an implementation for a dynamic programming problem and here I'm accumulating the costs of a path. I have simplified the cost function but in this example the problem still occurs.
void HorizonLineDetector::dp(std::shared_ptr<Node> n)
{
n->cost= 1 + n->prev->cost;
//Check if we reached the last column(done!)
if (n->x==current_edges.cols-1)
{
//Save the info in the last node if it's the cheapest path
if (last_node->cost > n->cost)
{
last_node->cost=n->cost;
last_node->prev=n;
}
}
else
{
//Check for neighboring pixels to see if they are edges, launch dp with all the ones that are
for (int i=0;i<2;i++)
{
for (int j=-1;j<2;j++)
{
if (i==0 && j==0) continue;
if (n->x+i >= current_edges.cols || n->x+i < 0 ||
n->y+j >= current_edges.rows || n->y+j < 0) continue;
if (current_edges.at<char>(n->y+j,n->x+i)!=0)
{
auto n1=std::make_shared<Node>(n,n->x+i,n->y+j);
//n->next.push_back(n1);
nlist.push_back(n1);
dp(n1);
}
}
}
}
}
class Node
{
public:
Node(){}
Node(std::shared_ptr<Node> p,int x_,int y_){prev=p;x=x_;y=y_;lost=0;}
Node(Node &n1){x=n1.x;y=n1.y;cost=n1.cost;lost=n1.lost;prev=n1.prev;}//next=n1.next;}
std::shared_ptr<Node> prev; //Previous and next nodes
int cost; //Total cost until now
int lost; //Number of steps taken without a clear path
int x,y;
Node& operator=(const Node &n1){x=n1.x;y=n1.y;cost=n1.cost;lost=n1.lost;prev=n1.prev;}//next=n1.next;}
Node& operator=(Node &&n1){x=n1.x;y=n1.y;cost=n1.cost;lost=n1.lost;prev=n1.prev;n1.prev=nullptr;}//next=n1.next;n1.next.clear();}
};
Your code looks like a pathological path search: it checks almost every path and doesn't keep track of nodes it has already checked, even though many of them can be reached in more than one way.
This will build recursion depth equal to the length of the longest path, and then the next longest path, and so on down to the shortest one. I.e., something like O(# of pixels) depth.
This is bad. And, as call stack depth is limited, it will crash.
The easy solution is to modify dp into dp_internal, and have dp_internal return a vector of nodes to process next. Then write dp, which calls dp_internal and repeats on its return value.
std::vector<std::shared_ptr<Node>>
HorizonLineDetector::dp_internal(std::shared_ptr<Node> n)
{
std::vector<std::shared_ptr<Node>> retval;
...
if (current_edges.at<char>(n->y+j,n->x+i)!=0)
{
auto n1=std::make_shared<Node>(n,n->x+i,n->y+j);
//n->next.push_back(n1);
nlist.push_back(n1);
retval.push_back(n1);
}
...
return retval;
}
then dp becomes:
void HorizonLineDetector::dp(std::shared_ptr<Node> n)
{
std::vector<std::shared_ptr<Node>> nodes={n};
while (!nodes.empty()) {
auto node = nodes.back();
nodes.pop_back();
auto new_nodes = dp_internal(node);
nodes.insert(nodes.end(), new_nodes.begin(), new_nodes.end());
}
}
but (A) this will probably just crash when the number of queued-up nodes gets ridiculously large, and (B) this just patches over the recursion-causes-crash problem; it doesn't make your algorithm suck less.
Use A*.
This involves keeping track of which nodes you have visited and what nodes to process next with their current path cost.
You then use heuristics to figure out which of the ones to process next you should check first. If you are on a grid of some sort, the heuristic is to use the shortest possible distance if nothing was in the way.
Add the cost to get to the node-to-process, plus the heuristic distance from that node to the destination. Find the node-to-process that has the least total. Process that one: you mark it as visited, and add all of its adjacent nodes to the list of nodes to process.
Never add a node to the list of nodes to process that you have already visited (as that is redundant work).
Once you have a solution, prune the list of nodes to process against any node whose current path value is greater than or equal to your solution. If you know your heuristic is a strong one (that it is impossible to get to the destination faster), you can even prune based off of the total of heuristic and current cost. Similarly, don't add to the list of nodes to process if it would be pruned by this paragraph.
The result is that your algorithm searches in a relatively straight line towards the target, and then expands outwards trying to find a way around any barriers. If there is a relatively direct route, it is used and the rest of the universe isn't even touched.
There are many optimizations on A* you can do, and even alternative solutions that don't rely on heuristics. But start with A*.
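A minimal sketch of the steps above, not a drop-in replacement for your dp. It assumes a hypothetical 4-connected grid given as strings where '#' is a wall and every step costs 1, with Manhattan distance as the heuristic; the function name astar and the Entry struct are made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Returns the number of steps on a shortest path from start to goal,
// or -1 if the goal cannot be reached. Cells are (row, column) pairs.
int astar(const std::vector<std::string>& grid,
          std::pair<int, int> start, std::pair<int, int> goal) {
    const int rows = static_cast<int>(grid.size());
    const int cols = rows ? static_cast<int>(grid[0].size()) : 0;
    if (rows == 0 || cols == 0) return -1;

    // Heuristic: distance if nothing were in the way (never overestimates here).
    auto h = [&](int r, int c) {
        return std::abs(r - goal.first) + std::abs(c - goal.second);
    };

    struct Entry { int f, g, r, c; };  // f = g + heuristic
    auto worse = [](const Entry& a, const Entry& b) { return a.f > b.f; };
    std::priority_queue<Entry, std::vector<Entry>, decltype(worse)> open(worse);

    // Best known cost to reach each cell (-1 = not reached) and the visited marks.
    std::vector<std::vector<int>> best(rows, std::vector<int>(cols, -1));
    std::vector<std::vector<bool>> visited(rows, std::vector<bool>(cols, false));

    best[start.first][start.second] = 0;
    open.push({h(start.first, start.second), 0, start.first, start.second});

    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};
    while (!open.empty()) {
        Entry cur = open.top();  // node-to-process with the least total
        open.pop();
        if (visited[cur.r][cur.c]) continue;  // stale duplicate entry
        visited[cur.r][cur.c] = true;
        if (cur.r == goal.first && cur.c == goal.second) return cur.g;

        for (int k = 0; k < 4; ++k) {
            int nr = cur.r + dr[k], nc = cur.c + dc[k];
            if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
            if (grid[nr][nc] == '#' || visited[nr][nc]) continue;
            int ng = cur.g + 1;
            // Skip if we already know a path to this cell that is at least as short.
            if (best[nr][nc] != -1 && best[nr][nc] <= ng) continue;
            best[nr][nc] = ng;
            open.push({ng + h(nr, nc), ng, nr, nc});
        }
    }
    return -1;
}
```

Because the first time the goal is taken off the queue its cost is already optimal (given this heuristic), the sketch stops there instead of pruning afterwards. To recover the path itself you would store a predecessor per cell alongside best.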
I have managed to find shortest path for unweighted graph using recursive dfs. Here is such an attempt.
void dfsHelper(graph*& g, int start,int end, bool*& visited, int& min, int i) {
visited[start] = true;
i = i + 1;
if (start == end) {
if (i<=min) { min = i; }
}
node* current = g->adj[start];
while (current != NULL) {
if (!visited[current->dest]) {
dfsHelper(g, current->dest,end, visited,min,i);
}
current = current->next;
}
visited[start] = false;
}
However, how should I approach this for an iterative DFS such as this one?
void dfsItr(graph*& g, int start, int end) {
bool* isVisited = new bool[g->numVertex];
for (int i = 0; i < g->numVertex; ++i) {
isVisited[i] = false;
}
stack<int> st;
isVisited[start] = true;
st.push(start);
while (!st.empty()) {
start = st.top();
cout << start << " ";
st.pop();
node* current = g->adj[start];
while (current != NULL) {
if (!isVisited[current->dest]) {
isVisited[current->dest] = true;
st.push(current->dest);
if (current->dest == end) {
cout << current->dest << endl;
}
}
current = current->next;
}
}
}
Is there any algorithm detailing the procedure to follow? I am well aware of finding the shortest path using the BFS algorithm as given here or as suggested here. My initial intuition as to why this idea works for BFS is that the traversal happens layer by layer and multiple children share the same parent in each layer, so it is easy to backtrack just by following the parent node. In iterative DFS this is not the case. Can someone shed some light on how to proceed? Is there any proven algorithm to tackle this scenario?
Thanks.
It is not entirely clear to me what you are asking about...
If you are asking how to optimise an iterative implementation of DFS, the one thing I'd do here is not use stack but write your own collection that has a LIFO interface and pre-allocates its memory.
Another opportunity for optimisation is to avoid stream operators, since they can be significantly slower than printf. Check out this answer section about performance. Also, does it really make sense to print to STDOUT all the time? If performance is key, this could be done once every couple of iterations, since I/O operations are really slow.
If you're asking which algorithm is better than the DFS approach, that is hard to answer, since it always depends on the given problem. If you want to find the best path between nodes, go for a BFS-based approach (e.g. Dijkstra's algorithm), since on unweighted graphs it performs far better than DFS (by the way, A* won't buy you anything here: with no weights and no useful heuristic it just collapses to Dijkstra, which on an unweighted graph is plain BFS). If you are more interested in this topic, you can find more tricks for optimising path-finding in this book series.
Last but not least, give some heuristics a try too. Maybe there's no need to do exhaustive search to find solution to your problem.
Here is an example that illustrates why depth-first search, even with some optimizations, can be a bad idea. (It's almost always a bad idea, but the illustration does not go that far).
Suppose your graph is the complete graph on nodes 0, ..., n, that is, the graph containing every possible edge. Suppose further that the edges always appear in order in the data structure, and you want to find the shortest path from 0 to n.
Naive depth-first search will explore (n-1)! paths before it finds the optimal path. Breadth-first search explores n paths. Both cases are essentially worst cases (edit: worst-case orderings for this graph) for their respective algorithms.
You could optimize depth-first search in a couple of ways:
1) Prune the search if the current path is one hop shorter than the best successful path so far, and is not a successful path.
2) More aggressively, each time you visit a node for the first time, store the length of the current path in the node. Each later time you visit, compare the length of the current path with the previously stored length. If the new path is shorter, store the new length. Otherwise, prune the search.
Of these two, (2) is the more aggressive optimization, yet it is still worse than breadth-first search in every respect. In breadth-first search, every time you pass through a node it is because you reached it by a shortest path, and at that point the node becomes a dead end for all further path traversals. Neither of these things is the case for depth-first search. Additionally, the (asymptotic) memory cost of storing the lengths is no better than that of using a breadth-first queue.
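For comparison, here is a rough sketch of the breadth-first version with parent links, which is what lets you read the path back once the target is reached. It assumes a hypothetical adjacency-list graph (adj[u] holds the neighbours of u) rather than the graph struct from the question, and the name bfsShortestPath is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <vector>

// Returns the vertices on one shortest path from start to end (inclusive),
// or an empty vector if end cannot be reached from start.
std::vector<int> bfsShortestPath(const std::vector<std::vector<int>>& adj,
                                 int start, int end) {
    std::vector<int> parent(adj.size(), -1);  // -1 marks "no parent"
    std::vector<bool> seen(adj.size(), false);
    std::queue<int> q;

    seen[start] = true;
    q.push(start);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        if (u == end) break;  // first time we dequeue end, its distance is minimal
        for (int v : adj[u]) {
            if (!seen[v]) {
                seen[v] = true;   // a node is marked once and never revisited
                parent[v] = u;    // remember how we got here
                q.push(v);
            }
        }
    }
    if (!seen[end]) return {};

    // Walk the parent links back from end to start, then flip the order.
    std::vector<int> path;
    for (int v = end; v != -1; v = parent[v]) path.push_back(v);
    std::reverse(path.begin(), path.end());
    return path;
}
```

Each node is enqueued at most once, so the work is linear in the number of vertices and edges, in contrast to the factorial blow-up of the naive depth-first search described above.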
I have a global unique path table which can be thought of as a directed un-weighted graph. Each node represents either a piece of physical hardware which is being controlled, or a unique location in the system. The table contains the following for each node:
A unique path ID (int)
Type of component (char - 'A' or 'L')
String which contains a comma separated list of path ID's which that node is connected to (char[])
I need to create a function which, given a starting and ending node, finds the shortest path between the two nodes. Normally this is a pretty simple problem, but here is the issue I am having: I have a very limited amount of memory/resources, so I cannot use any dynamic memory allocation (i.e. a queue/linked list). It would also be nice if it wasn't recursive (but it wouldn't be too big of an issue if it was, as the table/graph itself is really small. Currently it has 26 nodes, 8 of which will never be hit. At worst there would be about 40 nodes total).
I started putting something together, but it doesn't always find the shortest path. The pseudo code is below:
bool shortestPath(int start, int end)
    if start == end
        if pathTable[start].nodeType == 'A'
            Turn on part
        end if
        return true
    else
        mark the current node
        bool val
        for each node in connectedNodes
            if node is not marked
                val = shortestPath(node.PathID, end)
            end if
        end for
        if val == true
            if pathTable[start].nodeType == 'A'
                turn on part
            end if
            return true
        end if
    end if
    return false
end function
Anyone have any ideas how to either fix this code, or know something else that I could use to make it work?
----------------- EDIT -----------------
Taking Aasmund's advice, I looked into implementing a breadth-first search. Below is some C# code which I quickly threw together using some pseudocode I found online.
pseudo code found online:
Input: A graph G and a root v of G
procedure BFS(G,v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.adjacentEdges(t) do
            u ← G.adjacentVertex(t,e)
            if u is not marked:
                mark u
                enqueue u onto Q
    return none
C# code which I wrote using this code:
public static bool newCheckPath(int source, int dest)
{
Queue<PathRecord> Q = new Queue<PathRecord>();
Q.Enqueue(pathTable[source]);
pathTable[source].markVisited();
while (Q.Count != 0)
{
PathRecord t = Q.Dequeue();
if (t.pathID == pathTable[dest].pathID)
{
return true;
}
else
{
string connectedPaths = pathTable[t.pathID].connectedPathID;
for (int x = 0; x < connectedPaths.Length && connectedPaths != "00"; x = x + 3)
{
int nextNode = Convert.ToInt32(connectedPaths.Substring(x, 2));
PathRecord u = pathTable[nextNode];
if (!u.wasVisited())
{
u.markVisited();
Q.Enqueue(u);
}
}
}
}
return false;
}
This code runs just fine; however, it only tells me whether a path exists. That doesn't really work for me. Ideally, in the block "if (t.pathID == pathTable[dest].pathID)" I would like to have either a list or some other way to see which nodes I had to pass through to get from the source to the destination, so that I can process those nodes there, rather than returning a list to process elsewhere. Any ideas on how I could make that change?
The most effective solution, if you're willing to use static memory allocation (or automatic, as I seem to recall that the C++ term is), is to declare a fixed-size int array (of size 41, if you're absolutely certain that the number of nodes will never exceed 40). By using two indices to indicate the start and end of the queue, you can use this array as a ring buffer, which can act as the queue in a breadth-first search.
Alternatively: Since the number of nodes is so small, Bellman-Ford may be fast enough. The algorithm is simple to implement, does not use recursion, and the required extra memory is only a distance (int, or even byte in your case) and a predecessor id (int) per node. The running time is O(VE), alternatively O(V^3), where V is the number of nodes and E is the number of edges.
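As a rough sketch of the ring-buffer idea, here is a breadth-first search that uses only fixed-size arrays and also records a predecessor per node so the path can be listed afterwards. The Graph struct, addEdge and the constant names are hypothetical stand-ins for the path table (the comma-separated connection strings would have to be parsed into something like it), and node ids are assumed to run from 0 to MAX_NODES - 1.

```cpp
#include <cassert>

const int MAX_NODES = 40;               // assumption: never more than 40 nodes
const int QUEUE_SIZE = MAX_NODES + 1;   // one spare slot so head == tail means "empty"

// Hypothetical fixed-size adjacency storage standing in for the path table.
struct Graph {
    int degree[MAX_NODES];              // number of outgoing connections per node
    int adj[MAX_NODES][MAX_NODES];      // adj[n][k] = k-th node that n connects to
};

void addEdge(Graph& g, int from, int to) {
    g.adj[from][g.degree[from]++] = to;
}

// Writes the node ids of a shortest path (start first, end last) into pathOut
// and returns how many nodes are on it, or 0 if end is unreachable.
int shortestPath(const Graph& g, int start, int end, int pathOut[MAX_NODES]) {
    int queue[QUEUE_SIZE];              // ring buffer used as the BFS queue
    int pred[MAX_NODES];                // predecessor of each reached node
    bool marked[MAX_NODES] = {false};
    int head = 0, tail = 0;

    marked[start] = true;
    pred[start] = -1;
    queue[tail] = start;
    tail = (tail + 1) % QUEUE_SIZE;

    while (head != tail) {
        int t = queue[head];
        head = (head + 1) % QUEUE_SIZE;
        if (t == end) break;
        for (int k = 0; k < g.degree[t]; ++k) {
            int u = g.adj[t][k];
            if (!marked[u]) {
                marked[u] = true;
                pred[u] = t;
                queue[tail] = u;
                tail = (tail + 1) % QUEUE_SIZE;
            }
        }
    }
    if (!marked[end]) return 0;

    // Count the nodes on the path, then fill pathOut from the back.
    int length = 0;
    for (int v = end; v != -1; v = pred[v]) ++length;
    int i = length;
    for (int v = end; v != -1; v = pred[v]) pathOut[--i] = v;
    return length;
}
```

The caller can then loop over pathOut and turn on every part whose type is 'A'. Since each node is enqueued at most once, the buffer can never overflow.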
I have a binary tree and a method for the size of the longest path (the diameter):
int diameter(struct node * tree)
{
if (tree == 0)
return 0;
int lheight = height(tree->left);
int rheight = height(tree->right);
int ldiameter = diameter(tree->left);
int rdiameter = diameter(tree->right);
return max(lheight + rheight + 1, max(ldiameter, rdiameter));
}
I want the function to return also the exact path (list of all the nodes of the diameter).
How can I do it?
Thanks
You have two options:
A) Think.
B) Search. Among the first few google hits you can find this: http://login2win.blogspot.hu/2012/07/print-longest-path-in-binary-tree.html
Choose A) if you want to learn, choose B) if you do not care, only want a quick, albeit not necessarily perfect solution.
There are many possible solutions, some of them:
1) In a divide and conquer approach you will probably end up with maintaining the so far longest paths on both sides, and keep only the longer.
2) The quoted solution does two traversals, one for determining the diameter, and the second for printing. This is a nice trick to overcome the problem of not knowing whether we are at the deepest point in approach 1.
3) Instead of a depth first search, do a breadth first one. Use a queue. Proceed level by level, for each node storing the parent. When you reach the last level (no children added to queue), you can print the whole path easily, because the last printed node is on (one) longest path, and you have the parent links.
Add a property struct node * next to the node struct. Before the return statement, add a line like this tree->next = (ldiameter > rdiameter ? tree->left : tree->right) to get the longer path node as the next node. After calling diameter(root), you should be able to iterate through all of the next nodes from the root to print the largest path.
I think the following may work... compute the diameter as follows in O(N) time.
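// this is C++ code
int findDiameter(node *root, int &max_length, node* &max_dia_node, int parent[], node* parent_of_root){
    if(!root) return 0;
    // the root of the whole tree has no parent; record -1 for it
    parent[root->val] = parent_of_root ? parent_of_root->val : -1;
    // pass every argument along, with the current node as the children's parent
    int left = findDiameter(root->left, max_length, max_dia_node, parent, root);
    int right = findDiameter(root->right, max_length, max_dia_node, parent, root);
    if(left+right+1 > max_length){
        max_dia_node = root;
        max_length = left+right+1;
    }
    return 1 + max(left,right);
}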
So a number of things are happening in this function. First, max_length tracks the maximum diameter of the tree, and whenever it improves I assign the current node to max_dia_node.
This is the node that the maximum diameter passes through.
Using this information, we can find the deepest leaf in the left and in the right subtree of this node (max_dia_node). From those we can recover the actual nodes on the path via the "parent" array.
This takes two traversals of the tree.
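If carrying whole paths around is acceptable, a single traversal can also return the node list directly. This is only a sketch under assumptions: the node struct below (val, left, right) is guessed from the code above, and deepestUp and diameterPath are made-up names.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct node {            // assumed layout, mirroring the snippets above
    int val;
    node* left;
    node* right;
};

// Returns the values on the longest path from a leaf up to t (leaf first, t last).
// While doing so, keeps the longest leaf-to-leaf path seen so far in best.
std::vector<int> deepestUp(const node* t, std::vector<int>& best) {
    if (!t) return {};
    std::vector<int> l = deepestUp(t->left, best);    // deepest leaf ... left child
    std::vector<int> r = deepestUp(t->right, best);   // deepest leaf ... right child

    if (l.size() + r.size() + 1 > best.size()) {
        best = l;                                       // up the left side
        best.push_back(t->val);                         // through this node
        best.insert(best.end(), r.rbegin(), r.rend());  // down the right side
    }

    // Hand the longer side (left on ties) to the parent, extended by this node.
    std::vector<int> up = std::move(l.size() >= r.size() ? l : r);
    up.push_back(t->val);
    return up;
}

// Values of the nodes on one diameter path, from one end to the other.
std::vector<int> diameterPath(const node* root) {
    std::vector<int> best;
    deepestUp(root, best);
    return best;
}
```

The size of the returned vector is the diameter counted in nodes, matching lheight + rheight + 1 in the question. Copying into best can cost more than the pure length computation on deep trees, so treat this as the simple version rather than the fastest one.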
I am currently working on a A* search algorithm. The algorithm would just be solving text file mazes. I know that the A* algorithm is supposed to be very quick in finding the finish. Mine seems to take 6 seconds to find the path in a 20x20 maze with no walls. It does find the finish with the correct path it just takes forever to do so.
If I knew which part of code was the problem I would just post that but I really have no idea what is going wrong. So here is the algorithm that I use...
while(!openList.empty()) {
visitedList.push_back(openList[index]);
openList.erase(openList.begin() + index);
if(currentCell->x_coor == goalCell->x_coor && currentCell->y_coor == goalCell->y_coor)
{
FindBestPath(currentCell);
break;
}
if(map[currentCell->x_coor+1][currentCell->y_coor] != wall)
{
openList.push_back(new SearchCell(currentCell->x_coor+1,currentCell->y_coor,currentCell));
}
if(map[currentCell->x_coor-1][currentCell->y_coor] != wall)
{
openList.push_back(new SearchCell(currentCell->x_coor-1,currentCell->y_coor,currentCell));
}
if(map[currentCell->x_coor][currentCell->y_coor+1] != wall)
{
openList.push_back(new SearchCell(currentCell->x_coor,currentCell->y_coor+1,currentCell));
}
if(map[currentCell->x_coor][currentCell->y_coor-1] != wall)
{
openList.push_back(new SearchCell(currentCell->x_coor,currentCell->y_coor-1,currentCell));
}
for(int i=0;i<openList.size();i++) {
openList[i]->G = openList[i]->parent->G + 1;
openList[i]->H = openList[i]->ManHattenDistance(goalCell);
}
float bestF = 999999;
index = -1;
for(int i=0;i<openList.size();i++) {
if(openList[i]->GetF() < bestF) {
for(int n=0;n<visitedList.size();n++) {
if(CheckVisited(openList[i])) {
bestF = openList[i]->GetF();
index = i;
}
}
}
}
if(index >= 0) {
currentCell = openList[index];
}
}
I know this code is messy and not the most efficient way to do things, but I think it should still be faster than it is. Any help would be greatly appreciated.
Thanks.
Your 20x20 maze has no walls, and therefore many, many routes which are all the same length. I'd estimate trillions of equivalent routes, in fact. It doesn't seem so bad when you take that into account.
Of course, since your heuristic looks perfect, you should get a big benefit from excluding routes that are heuristically predicted to be precisely as long as the best route known so far. (This is safe if your heuristic is correct, i.e. never overestimates the remaining distance).
Here is a big hint.
If ever you find two paths to the same cell, you can always throw away the longer one. If there is a tie, you can throw away the second one to get there.
If you implement that, with no other optimizations, the search would become more than acceptably fast.
Secondly, the A* algorithm should only bother backtracking when the length to the current cell plus its heuristic exceeds the length plus heuristic of some other open node. If you implement that, it should find a path directly and stop. To facilitate that, you need to store paths in a priority queue (typically implemented with a heap), not a vector.
openList.erase is O(n), and the for-loop beginning with for(int i=0;i<openList.size();i++) is O(n^2) due to the call to CheckVisited - these are called every iteration, making your overall algorithm O(n^3). A* should be O(n log n).
Try changing openList to a priority-queue like it's supposed to be, and visitedList to a hash table. The entire for loop can then be replaced by a dequeue - make sure you check if visitedList.Contains(node) before enqueuing!
Also, there is no need to recalculate the ManHattenDistance for every node every iteration, since it never changes.
Aren't you constantly backtracking?
The A* algorithm backtracks when the current best solution becomes worse than another previously visited route. In your case, since there are no walls, all routes are good and never die (and, as MSalters correctly pointed out, there are several of them). When you take a step, your route becomes worse than all the others that are one step shorter.
If that is true, this may account for the time taken by your algorithm.
I'm trying to create a function that finds the average of some data within the nodes of a tree. The problem is, every node contains two pieces of data and unlike other BSTs, the primary data from which it is built is a string. Finding the average of number-based elements in a tree isn't an issue for me, but since each node contains a string (a person's name) and a seemingly random number (the weight of said person), the tree is actually in complete disarray, and I have no idea how to deal with it.
Here is my node so you see what I mean:
struct Node {
string name;
double weight;
Node* leftChild;
Node* rightChild;
};
Node* root;
Here's the function during one of its many stages:
// This isn't what I'm actually using so don't jump to conclusions
double nameTree::averageWeight(Node* parent, double total, int count) const
{
if (parent != NULL)
{ //nonsense, nonsense
averageWeight(parent->leftChild, total, count);
averageWeight(parent->rightChild, total, count);
count++;
total = total + parent->weight;
return total;
}
return (total / count);
}
In an effort to traverse the tree, I tried some recursion, but every time I manage to count and total everything, something gets screwy and it ends up doing return(total/count) each time. I've also tried an array implementation by traversing the tree and adding the weights to the array, but that didn't work because the returns and recursion interfered, or something.
And just because I know someone is going to ask, yes, this is for a school assignment. However, this is one out of like, 18 functions in a class so it's not like I'm asking anyone to do this for me. I've been on this one function for hours now and I've been up all night and my brain hurts so any help would be vastly appreciated!
You could try something like:
//total number of tree nodes
static int count=0;
// Calculate the total sum of the weights in the tree
double nameTree::calculateWeight(Node *parent)
{
    double total=0;
    if (parent != NULL)
    {
        //Calculate total weight for left sub-tree
        total+=calculateWeight(parent->leftChild);
        //Calculate weight for right sub-tree
        total+=calculateWeight(parent->rightChild);
        //add current node weight and count this node
        total+=parent->weight;
        count++;
    }
    //an empty (NULL) sub-tree contributes 0
    return total;
}
double nameTree::averageWeight()
{
    //reset the counter, otherwise a second call would count every node again
    count=0;
    double weightSum=calculateWeight(root);
    if(count!=0)
        return (weightSum/count);
    else
    {
        cout<<"The tree is empty";
        return 0;
    }
}
I don't have a compiler here but I believe it works.
To calculate the average you need two numbers: the total value and the number of elements in the set. You need to provide a function (recursive is probably the simplest) that will walk the tree and either return a pair<double,int> with those values or else modify some argument passed as reference to store the two values.
As for your code, averageWeight returns a double, but when you call it recursively you ignore (discard) the result. The count argument is passed by copy, which means that modifications made in the recursive calls are not visible to the caller (which then does not know how much parent->weight should weigh towards the result).
This should be enough to get you started.
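If you get stuck, the pair-returning variant could look roughly like this. It is a sketch written as free functions rather than members of nameTree, with the Node struct copied from the question; sumAndCount is a made-up name.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Node {
    std::string name;
    double weight;
    Node* leftChild;
    Node* rightChild;
};

// Walks the subtree under n and returns (sum of weights, number of nodes).
std::pair<double, int> sumAndCount(const Node* n) {
    if (n == nullptr) return {0.0, 0};   // empty subtree: nothing to add
    std::pair<double, int> l = sumAndCount(n->leftChild);
    std::pair<double, int> r = sumAndCount(n->rightChild);
    // Combine both children with this node; the results are used, not discarded.
    return {l.first + r.first + n->weight, l.second + r.second + 1};
}

// Average weight over the whole tree; 0 for an empty tree.
double averageWeight(const Node* root) {
    std::pair<double, int> totals = sumAndCount(root);
    return totals.second == 0 ? 0.0 : totals.first / totals.second;
}
```

Note that the ordering of the tree by name does not matter here: every node is visited exactly once regardless of how the tree is arranged.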