I am really stuck at the moment and I am going insane.
In the simplest of terms, when do you stop on Open and Closed lists with a depth first search?
Do you open and close every node until there are no nodes left?
Please help because I am going doolally here
Thank you
The Open list helps you traverse your tree properly in both depth-first and breadth-first searches. Think about the algorithm step by step. You are at a node with many children and you are going to expand one of them. After the expansion there has to be a mechanism to get back and continue your traversal. The Open list does that for you and tells you which node is to be expanded next. The choice of algorithm (depth first or breadth first) only determines the order in which children are inserted into the list.
The Closed list generally improves the speed of the algorithm. It prevents the algorithm from expanding previously visited nodes. Maybe you reach a node A that was already expanded through another branch. The Closed list lets you cut this branch and try another path.
Heuristics are useful for getting away from dead ends. In AI problems you usually face search spaces that have many wasted branches. As you pass through each step you can add the path cost to a variable, and taking that cost into account when you add expanded nodes to your Open list helps you avoid going down those branches. Otherwise you fall into a trap and the algorithm hangs.
Let me explain more with an example:
Consider the 15-puzzle. You are going to solve it with an algorithm, and you have to check all possible moves (in effect you are building a tree). When you move a tile in one direction, the tree lets you move it back in the reverse direction at the next level, right? Without a Closed list you would never get out of such loops, and your algorithm hangs.
That was the explanation of the Open and Closed lists. You asked when the algorithm finishes: you repeat the "expand a node and add its children to the Open list" step until you find your goal or the Open list becomes empty.
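To make the stopping rule concrete, here is a minimal sketch in C++. The adjacency-list `Graph` type and the function name `depthFirstSearch` are my own assumptions for illustration, not something from the question. The Open list is a stack (which is what makes the search depth first) and the Closed list is a set of already expanded nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical representation: graph[n] holds the successors of node n.
using Graph = std::vector<std::vector<int>>;

// Depth-first search with an explicit Open list (stack) and Closed list (set).
// Returns true if `goal` can be reached from `start`.
bool depthFirstSearch(const Graph& graph, int start, int goal) {
    assert(start >= 0 && static_cast<std::size_t>(start) < graph.size());
    std::vector<int> open{start};    // nodes waiting to be expanded (LIFO order)
    std::unordered_set<int> closed;  // nodes that have already been expanded

    while (!open.empty()) {          // stop when the Open list is empty...
        int node = open.back();
        open.pop_back();
        if (node == goal) return true;              // ...or when the goal is found
        if (!closed.insert(node).second) continue;  // already expanded via another branch
        for (int child : graph[node]) {
            if (closed.count(child) == 0) open.push_back(child);
        }
    }
    return false;  // every reachable node was expanded without finding the goal
}
```

Swapping the stack for a queue (take nodes from the front instead of the back) would give breadth-first search with the same two stopping conditions.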
I have the following simplified situation:
I need to create a tree like the following one:
I start at one node with a score of 3. From this node I calculate all possible next nodes which have the score 4, 6, 5 and 7. In the next step I only want to consider the two nodes with the highest score, in our case 6 and 7. From these two nodes I again calculate all possible next nodes. The highest score of all the next nodes are 12 and 13 so in the next step, I only want to consider these two nodes as my next start point. This means all nodes from the previous score 6 nodes are ignored from now on.
And so on...
I have no idea about graph theory right now (but I guess I have to do some research)
I looked for some libraries which might help me implement this in C++. I came across boost::graph, which looks promising at first glance. The downside is that it also looks quite complex.
My question is:
Do you think boost::graph can be easily used to implement a tree like this? Is it worth spending some days trying to learn boost::graph, or is it not the right library for me?
Is there a better library? I quite like boost and have used it a lot, but not the graph library of it.
I think my requirements are
calculation of my nodes is quite expensive, therefore I need a tree/graph which saves the "score" of each node and can be easily extended later.
it should be possible to simply say "from now on let's ignore those nodes" and focus on those two best ones.
I need to somehow easily be able to get access to the "end of my tree" at each step, so that I know which nodes are the current nodes I have to calculate the next possible nodes.
to calculate the score of my next nodes I need to be able to easily follow the tree back to its root (I do not only need the previous score but more information. My implementation would somehow require that each node is an object and a score and I need to have access to the members of those objects as well as to the score). This means I need to be able to reconstruct my way through the tree.
Note:
I guess it should be a kind of standard problem for graph theory. But right now I do not really know where to start my research. If you have any good literature, papers, descriptions, key words for google, for this problem please tell me.
Edit:
Many people have pointed out that an array might be sufficient. First of all: you might be right, and thank you for your comments. My example was highly simplified. At each step I would need to calculate up to ten thousand new possible nodes, and I need to make roughly 1000 iterations. The idea of only following a couple of nodes (2 in my example, but in my application most likely ~100 or something like "the best x percent") was simply to reduce the number of possibilities. I think it is obvious that the problem would otherwise explode quite quickly.
Edit2:
It seems like there is some confusion about the numbers:
Right now the code runs sequentially (no graph, just an array). For every possible new node I calculate a score based on some metrics. At the end I select the node which has the highest score and go on with the next iteration.
The idea how to implement it with a graph is the following:
Again for every possible node I calculate the score (this is the number over the lines that connect two nodes) but the thing we are interested in is the sum of all these scores through the graph which is the number I wrote in the nodes. This means I am interested in the path through the graph which leads to the highest sum of scores.
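What is described above is usually called beam search, which may be a useful keyword to look up. As a rough sketch of how it could look without any graph library: every kept node goes into one `std::vector` and remembers the index of its parent, so the path back to the root can always be rebuilt. The names `Node`, `Expand`, `beamSearch` and `pathToRoot`, and the `int payload` standing in for the real per-node data, are assumptions of mine for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// One kept node of the tree. `parent` is an index into the pool (-1 for the root).
struct Node {
    int parent;
    int payload;   // placeholder for the real per-node data
    double score;  // sum of the step scores along the path from the root
};

// Hypothetical callback: returns (payload, step score) for every possible successor.
using Expand = std::function<std::vector<std::pair<int, double>>(const std::vector<Node>&, int)>;

// Runs up to `iterations` rounds and keeps only the `beamWidth` best candidates
// per round. pool[0] must be the root. Returns the index of the best final node.
int beamSearch(std::vector<Node>& pool, const Expand& expand,
               std::size_t beamWidth, int iterations) {
    assert(!pool.empty() && beamWidth > 0);
    std::vector<int> frontier{0};  // the current "end of the tree"
    for (int it = 0; it < iterations; ++it) {
        std::vector<Node> candidates;
        for (int idx : frontier) {
            for (const auto& succ : expand(pool, idx)) {
                candidates.push_back({idx, succ.first, pool[idx].score + succ.second});
            }
        }
        if (candidates.empty()) break;
        std::size_t keep = std::min(beamWidth, candidates.size());
        std::partial_sort(candidates.begin(),
                          candidates.begin() + static_cast<std::ptrdiff_t>(keep),
                          candidates.end(),
                          [](const Node& a, const Node& b) { return a.score > b.score; });
        frontier.clear();
        for (std::size_t i = 0; i < keep; ++i) {  // only the survivors are stored
            pool.push_back(candidates[i]);
            frontier.push_back(static_cast<int>(pool.size()) - 1);
        }
    }
    return frontier.front();  // survivors are stored best first
}

// Follows the parent links back to the root and returns the payloads root first.
std::vector<int> pathToRoot(const std::vector<Node>& pool, int idx) {
    std::vector<int> path;
    for (int i = idx; i != -1; i = pool[i].parent) path.push_back(pool[i].payload);
    std::reverse(path.begin(), path.end());
    return path;
}
```

Because rejected candidates are never stored, the pool grows by at most `beamWidth` nodes per iteration, which should stay small even with ten thousand candidates per step. Note that beam search is a heuristic: the path with the truly highest sum can be pruned early.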
I have been trying for several weeks to understand how the clang AST
works, but I still have not answered the main question: how do I walk
this tree?
I read all the guides I could find, studied the doxygen documentation and
even watched a couple of lectures on YouTube, but understanding did not
come.
What I have understood so far:
1) The nodes of the AST have no common base type
2) To move through the tree you are supposed to use either RecursiveASTVisitor
or matchers. The first performs a recursive depth-first traversal, and the
second lets you search for the nodes you are interested in.
The problem is that with either option I cannot tell which part of the tree
I am in. I do not know how to determine when my visitor has moved to another
branch and when it is continuing deeper.
Ideally I would like to know the depth of the node I am visiting. Is that
possible?
I really like the output of the dump() function because it shows the
relationships between nodes clearly. However, I do not know how to get that
structure directly (rather than as text).
In general, the question is: can I build my own tree on the basis of the
AST, but with a uniform node type, and how do I do it?
Recently I asked a question on Stack Overflow asking for help to solve a problem. It is a travelling salesman problem where I have up to 40,000 cities but I only need to visit 15 of them.
I was advised to use Dijkstra with a priority queue to build a connectivity matrix for the 15 cities I need to visit and then do TSP on that matrix with DP. I had previously only used Dijkstra with O(n^2). After trying to figure out how to implement Dijkstra with a priority queue, I finally did it (enough to optimize from 240 seconds to 0.6 for 40,000 cities). But now I am stuck at the TSP part.
Here are the materials I used for learning TSP:
Quora
GeeksForGeeks
I sort of understand the algorithm (but not completely), but I am having trouble implementing it. Before this I have done dynamic programming with arrays that would be dp[int] or dp[int][int]. But now that my dp matrix has to be dp[subset][int], I have no idea how I should do this.
My questions are:
How do I handle the subsets with dynamic programming? (an example in C++ would be appreciated)
Do the algorithms I linked to allow visiting cities more than once, and if they don't what should I change?
Should I perhaps use another TSP algorithm instead? (I noticed there are several ways to do it). Keep in mind that I must get the exact value, not approximate.
Edit:
After some more research I stumbled across some competitive programming contest lectures from Stanford and managed to find TSP here (slides 26-30). The key is to represent the subset as a bitmask. This still leaves my other questions unanswered though.
Can any changes be made to that algorithm to allow visiting a city more than once? If it can be done, what are those changes? Otherwise, what should I try?
I think you can use the dynamic programming solution and add to each pair of nodes a second edge with the shortest path between them. See also this question: Variation of TSP which visits multiple cities.
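For the dp[subset][int] part, here is a minimal sketch of the bitmask idea in C++. It assumes the tour starts and ends at city 0 and that `dist` is the 15x15 matrix of shortest-path distances produced by the Dijkstra step; the function name `tspBitmask` is mine. Bit i of `mask` is set when city i has already been visited, so a subset is just an integer index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// dist[i][j]: shortest-path distance between relevant cities i and j.
// Returns the length of the cheapest tour that starts at city 0, visits
// every city and returns to city 0 (assumed tour shape).
long long tspBitmask(const std::vector<std::vector<long long>>& dist) {
    const std::size_t n = dist.size();
    assert(n >= 1 && n <= 20);  // the table has 2^n * n entries
    const long long INF = std::numeric_limits<long long>::max() / 4;
    const std::size_t full = (std::size_t{1} << n) - 1;

    // dp[mask][last]: cheapest way to start at 0, visit exactly the cities
    // in `mask`, and currently stand at `last`.
    std::vector<std::vector<long long>> dp(full + 1, std::vector<long long>(n, INF));
    dp[1][0] = 0;  // only city 0 visited, standing at city 0

    for (std::size_t mask = 1; mask <= full; ++mask) {
        for (std::size_t last = 0; last < n; ++last) {
            if (!(mask & (std::size_t{1} << last)) || dp[mask][last] == INF) continue;
            for (std::size_t next = 0; next < n; ++next) {
                if (mask & (std::size_t{1} << next)) continue;  // already visited
                const std::size_t grown = mask | (std::size_t{1} << next);
                dp[grown][next] = std::min(dp[grown][next],
                                           dp[mask][last] + dist[last][next]);
            }
        }
    }

    long long best = INF;
    for (std::size_t last = 0; last < n; ++last) {
        if (dp[full][last] != INF) best = std::min(best, dp[full][last] + dist[last][0]);
    }
    return best;
}
```

For 15 cities the table has 2^15 * 15 entries, which is small. Since `dist` holds shortest-path distances rather than direct roads, a leg between two relevant cities may pass through any other city on the original map, which is how revisiting is covered without changing the DP itself.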
Here is a TSP implementation, you will find the link of the implemented problem in the post.
The algorithms you linked don't allow visiting cities more than once.
For your third question, I think Phpdna's answer was good.
Can cities be visited more than once? Yes and no. In your first step, you reduce the problem to the 15 relevant cities. This results in a complete graph, i.e. one where every node is connected to every other node. The connection between two such nodes might involve multiple cities on the original map, including some of the relevant ones, but that shouldn't be relevant to your algorithm in the second step.
As for whether to use a different algorithm: I would perhaps do a depth-first search through the graph. Using a minimum spanning tree, you can compute an upper and a lower bound for the remaining cities, and use them to pick promising solutions and to discard hopeless ones (aka pruning). There has also been a lot of research on this topic, just search the web. For example, in cases where the map is actually Cartesian (i.e. the travelling costs are the distances between points on a plane), you can exploit this information to improve the algorithms a bit.
Lastly, if you really intend to increase the number of visited cities, you will find that the time for computing it increases vastly, so you will have to abandon your requirement for an exact solution.
Before you start throwing links to wikipedia and blogs in my face, please hear me out.
I'm trying to find the optimal algorithm/function to do a dependency sort on... stuff. Each item has a list of its dependencies.
I would like to have something iterator-based, but that's not very important.
What is important is that the algorithm points out exactly which items are part of the dependency cycle. I'd like to give detailed error information in this case.
Practically, I'm thinking of subclassing my items from a "dependency node" class, which has the necessary booleans/functions to get the job done. Cool (but descriptive) names are welcome :)
It's normally called a topological sort. Most books/papers/whatever that cover topological sorting will also cover cycle detection as a matter of course.
I don't exactly get why it is so hard to find the dependency cycle, if there is one! You just have to check whether there is any node you have already passed over while applying the BFS algorithm to find all the dependencies. If there is one, you just roll back the way you came, from the revisited node all the way up, and mark all the nodes until you reach the first visit of that node. All the nodes on that path will be marked as part of a cycle. (Just leave a comment and I'll give you code to do that if you need it.)
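Here is a rough sketch of a topological sort that also reports which items form a cycle. It uses a depth-first search rather than BFS, because with DFS the "way you came" is simply the current recursion path. Note that meeting an already finished node is not a cycle (two items may share a dependency); only meeting a node that is still on the current path is. The names `SortResult`, `Mark` and `dependencySort`, and the use of plain integer ids for items, are assumptions of mine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct SortResult {
    std::vector<int> order;  // dependencies first; empty if a cycle was found
    std::vector<int> cycle;  // items forming one cycle; empty if there is none
};

enum class Mark { Unvisited, InProgress, Done };

// deps[i] lists the ids of the items that item i depends on.
static bool visit(int node, const std::vector<std::vector<int>>& deps,
                  std::vector<Mark>& mark, std::vector<int>& path, SortResult& out) {
    mark[node] = Mark::InProgress;
    path.push_back(node);
    for (int d : deps[node]) {
        assert(d >= 0 && static_cast<std::size_t>(d) < deps.size());
        if (mark[d] == Mark::InProgress) {
            // d is on the current path: the cycle is everything from d onwards.
            out.cycle.assign(std::find(path.begin(), path.end(), d), path.end());
            return false;
        }
        if (mark[d] == Mark::Unvisited && !visit(d, deps, mark, path, out)) return false;
    }
    path.pop_back();
    mark[node] = Mark::Done;
    out.order.push_back(node);  // all dependencies of node are already in order
    return true;
}

SortResult dependencySort(const std::vector<std::vector<int>>& deps) {
    SortResult out;
    std::vector<Mark> mark(deps.size(), Mark::Unvisited);
    std::vector<int> path;
    for (std::size_t i = 0; i < deps.size(); ++i) {
        if (mark[i] == Mark::Unvisited && !visit(static_cast<int>(i), deps, mark, path, out)) {
            out.order.clear();  // no valid order exists
            break;
        }
    }
    return out;
}
```

This reports the first cycle it runs into, not every cycle in the graph. The three marks could just as well live as members of a "dependency node" base class, as suggested in the question.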
Is there any fast way to determine the size of the largest strongly connected component in a graph?
I mean, like, the obvious approach would mean determining every SCC (could be done using two DFS calls, I suppose) and then looping through them and taking the maximum.
I'm pretty sure there has to be some better approach if I only need to have the size of that component and only the largest one, but I can't think of a good solution. Any ideas?
Thanks.
Let me answer your question with another question -
How can you determine which value in a set is the largest without examining all of the values?
Firstly, you could use Tarjan's algorithm, which needs only one DFS instead of two. If you look at the algorithm closely, the SCCs form a DAG and the algorithm finds them in reverse topological order. So if you have a sense of the graph (like a visual representation) and you know that the relatively large SCCs occur at the end of the DAG, then you could stop the algorithm once the first few SCCs have been found.
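For reference, a minimal sketch of Tarjan's algorithm that only tracks the size of the largest component instead of storing all of them. The class name `LargestScc`, the wrapper `largestSccSize` and the adjacency-list input are my own assumptions. It is recursive, so very deep graphs may need an iterative version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// adj[v] lists the targets of the edges leaving vertex v.
class LargestScc {
public:
    explicit LargestScc(const std::vector<std::vector<int>>& adj)
        : adj_(adj), index_(adj.size(), -1), low_(adj.size(), 0),
          onStack_(adj.size(), false) {}

    std::size_t run() {
        for (std::size_t v = 0; v < adj_.size(); ++v) {
            if (index_[v] == -1) visit(static_cast<int>(v));
        }
        return largest_;
    }

private:
    void visit(int v) {
        index_[v] = low_[v] = counter_++;
        stack_.push_back(v);
        onStack_[v] = true;
        for (int w : adj_[v]) {
            assert(w >= 0 && static_cast<std::size_t>(w) < adj_.size());
            if (index_[w] == -1) {  // tree edge: recurse first
                visit(w);
                low_[v] = std::min(low_[v], low_[w]);
            } else if (onStack_[w]) {  // edge back into the current component
                low_[v] = std::min(low_[v], index_[w]);
            }
        }
        if (low_[v] == index_[v]) {  // v is the root of an SCC: pop it
            std::size_t size = 0;
            int w;
            do {
                w = stack_.back();
                stack_.pop_back();
                onStack_[w] = false;
                ++size;
            } while (w != v);
            largest_ = std::max(largest_, size);
        }
    }

    const std::vector<std::vector<int>>& adj_;
    std::vector<int> index_, low_;
    std::vector<bool> onStack_;
    std::vector<int> stack_;
    int counter_ = 0;
    std::size_t largest_ = 0;
};

std::size_t largestSccSize(const std::vector<std::vector<int>>& adj) {
    return LargestScc(adj).run();
}
```

This still visits every vertex and edge once, in line with the point made above that you cannot know the maximum without looking at everything; it only saves the second pass and the storage for the components.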