I am trying to write a program that uses a directed graph (which I know about but have never implemented) to simulate a transportation network.
The user will input a planet name followed by an integer representing the total number of nodes in the graph. The user will then describe each node one by one: they give it a name, give the number of neighbors that node has, and then the neighbors' names. The input will look like this:
some_planet 4
node1 2 node2 node3
node2 1 node4
node3 1 node4
node4 1 node1
I then just output which nodes node1 cannot reach. However, I have some questions about implementing this.
1) The big one is storing this stuff. What is an easy way? I was thinking about a LinkedList and thought I would link the locations. Then I could attach pointers to them corresponding to whatever the input is. However, I have no idea how to do that last part.
2) The next big one is searching the graph. I was planning on a recursive depth-first search. Is there anything wrong with this algorithm that you can see? I need to search for every node individually this way, though, so I will have to repeat it. Can I just throw everything in a for loop?
recursive-d-first-search(graph G, node start, node find)
    if (start == find)
        return true;
    else
        for every neighbor n of start
            success = recursive-d-first-search(G, n, find);
            if (success)
                return true;
    return false;
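For reference, this is roughly how the pseudocode could be transcribed into C++, assuming the graph is stored as an adjacency list keyed by node name (the Graph, dfs, and unreachableFrom names are only illustrative). Note the added visited set: the pseudocode above has no such set, so it would recurse forever on a cycle like node1 -> node2 -> node4 -> node1.

#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Adjacency list: node name -> names of its neighbors.
using Graph = std::unordered_map<std::string, std::vector<std::string>>;

// Depth-first search from `start`, looking for `find`.
bool dfs(const Graph& g, const std::string& start, const std::string& find,
         std::unordered_set<std::string>& visited) {
    if (start == find) return true;
    if (!visited.insert(start).second) return false;   // already explored; avoids cycles
    auto it = g.find(start);
    if (it == g.end()) return false;
    for (const std::string& n : it->second)
        if (dfs(g, n, find, visited))
            return true;
    return false;
}

// The "for loop" part: test every node for reachability from `src`.
std::vector<std::string> unreachableFrom(const Graph& g, const std::string& src) {
    std::vector<std::string> result;
    for (const auto& [name, neighbors] : g) {
        std::unordered_set<std::string> visited;
        if (name != src && !dfs(g, src, name, visited))
            result.push_back(name);
    }
    return result;
}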
I think you just need to use an adjacency matrix to store the whole graph's connectivity.
In your case, it would look like this:
1 2 3 4
1 0 1 1 0
2 0 0 0 1
3 0 0 0 1
4 1 0 0 0
If you use an adjacency matrix, I think breadth-first search is a good choice because it is easy to understand and implement. You need one queue to store the next nodes to be checked and one list to record which nodes have already been checked.
For example, to find which nodes node1 cannot reach, check row 1 and see that it reaches 2 and 3, so put 2 and 3 into the queue. Then take 2 off the queue, check row 2 and see that it reaches 4, so put 2 into the list and 4 into the queue. Keep looping, doing the same operation until the queue is empty.
In the end, the nodes that are not in the list are the ones node1 cannot reach.
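A minimal sketch of that breadth-first search over the adjacency matrix (the function and variable names are made up for illustration):

#include <queue>
#include <vector>

// adj[i][j] == 1 means there is an edge i -> j. Returns the nodes `start` cannot reach.
std::vector<int> unreachable(const std::vector<std::vector<int>>& adj, int start) {
    int n = static_cast<int>(adj.size());
    std::vector<bool> checked(n, false);   // the "list" of nodes already checked
    std::queue<int> next;                  // the queue of nodes waiting to be checked
    checked[start] = true;
    next.push(start);
    while (!next.empty()) {
        int u = next.front();
        next.pop();
        for (int v = 0; v < n; ++v)
            if (adj[u][v] == 1 && !checked[v]) {
                checked[v] = true;         // row u says u reaches v
                next.push(v);
            }
    }
    std::vector<int> result;
    for (int v = 0; v < n; ++v)
        if (!checked[v]) result.push_back(v);   // never reached from start
    return result;
}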
You can also use Boost::Graph if you don't feel like reinventing the wheel...
Note: The question is entirely changed.
In the following graph, we can traverse the entire graph if we select nodes 0 and 2. I am looking for an efficient algorithm that returns these two nodes. Note that this is neither the vertex-cover problem nor the dominating-set problem, since we don't need to select node 3. We say that if we select node 0, we can go to node 1 from there, and if we select node 2, we can go to node 3 and then node 4 from there.
If I run an SCC algorithm on it, it reports every vertex as its own SCC, and from that I can't get anywhere:
C:\>project2 ../../input.txt o.txt
Following are strongly connected components in given graph (Each line is a different SCC)
2
4
3
0
1
If there is no cycle in the graph i.e. the graph is a Directed Acyclic Graph (DAG), then we just need to count the indegrees for each node. The set of nodes with indegree 0 is the required set.
In case you are not familiar with indegree, if there is an edge a->b then indegree of b increases by 1.
This works because if there is an edge a->b, i.e. b has a nonzero indegree, then there is a node a from which b is reachable, so it is always better to include node a in the set instead of b. A node with indegree 0 has no other way to get visited unless we start with that node itself, so it must be included in the set.
In case there is a cycle in the graph, we first find the strongly connected components (SCCs). Then we build a new graph that treats each SCC as a single node, adding an edge for every edge of the initial graph that connects two different SCCs. The new graph (the condensation) is a DAG, so we can apply the above procedure to it to find the required set of nodes.
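A minimal sketch of the DAG case (the adjacency-list layout and the startNodes name are assumptions made for illustration; for a graph with cycles you would run the same count on the SCC condensation described above):

#include <vector>

// Count indegrees and return the nodes with indegree 0.
// adj[u] lists the targets of edges u -> v; node indices are 0..n-1.
std::vector<int> startNodes(const std::vector<std::vector<int>>& adj) {
    int n = static_cast<int>(adj.size());
    std::vector<int> indegree(n, 0);
    for (int u = 0; u < n; ++u)
        for (int v : adj[u])
            ++indegree[v];          // edge u -> v raises v's indegree by 1
    std::vector<int> result;
    for (int v = 0; v < n; ++v)
        if (indegree[v] == 0)
            result.push_back(v);    // nothing reaches v, so we must start there
    return result;
}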
My current idea is:
Start with point 0 and connect it with its nearest point. For each of the remaining nodes, insert it into all possible locations and keep the configuration that has the least cost.
So I start with point 0. The node closest to point 0 is point 1.
So I now have 0 -> 1 -> 0
For point 2 (and all the remaining nodes) I will check all possibilities on where the new node could be:
2 -> 0 -> 1 -> 2
0 -> 2 -> 1 -> 0
0 -> 1 -> 2 -> 0
From here I find that
0 -> 1 -> 2 -> 0 has the least total Euclidean distance, so that is the configuration I will keep.
I will keep using this logic for the rest of my nodes.
Is there an easy way to implement this in C++? My current thought is that maybe a linked list would be a good idea, but I'd like to be able to use vectors if possible. Does anyone have any tips on how to approach this?
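A rough, untested sketch of that insertion idea using std::vector (the Point, dist, and buildTour names are placeholders, not from the question) might look like this:

#include <cmath>
#include <limits>
#include <vector>

struct Point { double x, y; };

double dist(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Keeps the tour as a std::vector of point indices. Point 0 and its nearest
// neighbour seed the tour; each remaining point is tried in every gap and kept
// at the position that increases the tour length the least.
std::vector<int> buildTour(const std::vector<Point>& pts) {
    int n = static_cast<int>(pts.size());
    if (n < 2) return std::vector<int>(n, 0);
    int nearest = 1;
    for (int i = 2; i < n; ++i)
        if (dist(pts[0], pts[i]) < dist(pts[0], pts[nearest])) nearest = i;
    std::vector<int> tour = {0, nearest};          // implicitly closed back to 0
    for (int p = 1; p < n; ++p) {
        if (p == nearest) continue;
        int bestPos = 0;
        double bestExtra = std::numeric_limits<double>::max();
        for (int i = 0; i < (int)tour.size(); ++i) {
            int a = tour[i], b = tour[(i + 1) % tour.size()];
            double extra = dist(pts[a], pts[p]) + dist(pts[p], pts[b]) - dist(pts[a], pts[b]);
            if (extra < bestExtra) { bestExtra = extra; bestPos = i + 1; }
        }
        tour.insert(tour.begin() + bestPos, p);    // O(n) insert, fine for small inputs
    }
    return tour;
}

std::vector::insert is linear time, but for the small point counts typical of this kind of exercise that is usually fine, so a linked list is not needed.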
Have you thought about using a directed graph and then implementing Dijkstra's algorithm? In a directed graph, Dijkstra's algorithm will give you the shortest path from a starting node to all other nodes in the graph, not just the few nodes that you want.
I'm a real speed freak when it comes to algorithms, and in the plugins I made for a game.
The speed is... a bit... not satisfying. Especially while driving around with a car: if you do not follow your path, the path has to be recalculated, and that takes some time. So the in-game GPS stacks up many "wrong way" signals (and stacking up the signals means more calculations afterward, one for each wrong-way move), because I want a fast, live GPS system that updates constantly.
I changed the old algorithm (a simple Dijkstra implementation) to boost::dijkstra to calculate a path from node A to node B
(total node list is around ~15k nodes with ~40k connections, for curious people here is the map: http://gz.pxf24.pl/downloads/prv2.jpg (12 MB), edges in the red lines are the nodes),
but it didn't really increase in speed. (At least not noticeably, maybe 50 ms).
The information that is stored in the Node array is:
The ID of the Node,
The position of the node,
All the connections to the node (and which way it is connected to the other nodes, TO, FROM, or BOTH)
Distance to the connected nodes.
I'm curious if anybody knows some faster alternatives in C/C++?
Any suggestions (+ code examples?) are appreciated!
If anyone is interested in the project, here it is (source+binaries):
https://gpb.googlecode.com/files/RouteConnector_177.zip
In this video you can see what the gps-system is like:
http://www.youtu.be/xsIhArstyU8
As you can see, the red route updates slowly (well, for us gamers it is slow).
(By the way: the gaps between the red lines were fixed a long time ago :p)
Since this is a GPS, it must have a fixed destination. Instead of computing the path from your current node to the destination each time you change the current node, you can instead find the shortest paths from your destination to all the nodes: just run Dijkstra once starting from the destination. This will take about as long as an update takes right now.
Then, in each node, keep prev = the node previous to this on the shortest path to this node (from your destination). You update this as you compute the shortest paths. Or you can use a prev[] array outside of the nodes - basically whatever method you are using to reconstruct the path now should still work.
When moving your car, your path is given by currentNode.prev -> currentNode.prev.prev -> ....
This will solve the update lag and keep your path optimal, but you'll still have a slight lag when entering your destination.
You should consider this approach even if you plan on using A* or other heuristics that do not always give the optimal answer, at least if you still get lag with those approaches.
For example, if you have this graph:
1 - 2 cost 3
1 - 3 cost 4
2 - 4 cost 1
3 - 4 cost 2
3 - 5 cost 5
The prev array would look like this (computed when you compute the distances d[]):
node:  1  2  3  4  5
prev = 1  1  1  2  3
Meaning:
shortest path FROM  TO
              1     2   = prev[2], 2                  = 1, 2
              1     3   = prev[3], 3                  = 1, 3
              1     4   = prev[ prev[4] ], prev[4], 4 = 1, 2, 4 (fill in right to left)
              1     5   = prev[ prev[5] ], prev[5], 5 = 1, 3, 5
etc.
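A sketch of running Dijkstra once from the destination and recording prev[] as you go (the Edge struct, the adjacency-list layout, and the dijkstraFrom name are assumptions made for illustration; if your roads are one-way, feed in the reversed edges so the hops follow drivable directions):

#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge { int to; double w; };

// Computes d[v] = shortest distance from `dest` to v, and prev[v] = the node
// right before v on that path. Following prev[] from any node leads back to dest.
void dijkstraFrom(int dest, const std::vector<std::vector<Edge>>& adj,
                  std::vector<double>& d, std::vector<int>& prev) {
    int n = static_cast<int>(adj.size());
    d.assign(n, std::numeric_limits<double>::infinity());
    prev.assign(n, -1);
    using Item = std::pair<double, int>;                                 // (distance, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    d[dest] = 0.0;
    pq.push({0.0, dest});
    while (!pq.empty()) {
        auto [du, u] = pq.top();
        pq.pop();
        if (du > d[u]) continue;                 // stale queue entry, skip it
        for (const Edge& e : adj[u])
            if (d[u] + e.w < d[e.to]) {
                d[e.to] = d[u] + e.w;
                prev[e.to] = u;                  // u precedes e.to on the path from dest
                pq.push({d[e.to], e.to});
            }
    }
}

When the car moves, the route from currentNode is just currentNode, prev[currentNode], prev[prev[currentNode]], ... until you hit dest, so nothing has to be recomputed while the destination stays the same.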
To make the start instant, you can cheat in the following way.
Have a fairly small set of "major thoroughfare nodes". For each node define its "neighborhood" (all nodes within a certain distance). Store the shortest routes from every major thoroughfare node to every other one. Store the shortest routes to/from each node to its major thoroughfares.
If the two nodes are in the same neighborhood, you can calculate the best answer on the fly. Otherwise, consider only routes of the form "here, to a major thoroughfare node near me, to a major thoroughfare node near it, to it". Since you've already precalculated those, and there is only a limited number of combinations, you should be able to calculate a route on the fly very quickly.
Then the challenge becomes picking a set of major thoroughfare nodes. It should be a fairly small set of nodes that most good routes should go through - so pick a node every so often along major streets. A list of a couple of hundred should be more than good enough for your map.
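The on-the-fly combination step might look something like this sketch (it assumes the hub-to-hub table and the per-node hub lists have already been precomputed; all names here are made up):

#include <limits>
#include <vector>

struct HubLink { int hub; double cost; };   // a nearby thoroughfare node and the precomputed cost to it

// hubDist[a][b] is the precomputed cost between thoroughfare nodes a and b.
// fromSrc lists the hubs near the start, toDst the hubs near the destination.
double viaHubs(const std::vector<HubLink>& fromSrc,
               const std::vector<HubLink>& toDst,
               const std::vector<std::vector<double>>& hubDist) {
    double best = std::numeric_limits<double>::infinity();
    for (const HubLink& a : fromSrc)
        for (const HubLink& b : toDst) {
            double c = a.cost + hubDist[a.hub][b.hub] + b.cost;   // here -> hub A -> hub B -> there
            if (c < best) best = c;
        }
    return best;
}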
I need to keep track of indexes in a large text file. I have been keeping a std::map of indexes and accompanying data as a quick hack. If the user is on character 230,400 in the text, I can display any meta-data for the text.
Now that my maps are getting larger, I'm hitting some speed issues (as expected).
For example, if the text is modified at the beginning, I need to increment the indexes after that position in the map by one, an O(N) operation.
What's a good way to get this down to O(log N) complexity? I've been looking at AVL Arrays, which come close.
I'm hoping for O(log n) time for updates and searches. For example, if the user is on character 500,000 in the text array, I want to very quickly find if there is any meta data for that character.
(Forgot to add: The user can add meta data whenever they like)
Easy. Make a binary tree of offsets.
The value of any offset is computed by traversing the tree from the leaf to the root, adding a parent's value whenever the node you came from is that parent's right child.
Then, if you add text early in the file, you only need to update the nodes that dominate the offsets that changed. Say you added text before the very first offset: you add the number of characters inserted to the root node, and now half of your offsets have been corrected. Now traverse to the left child and add the offset there again; now 3/4 of the offsets have been updated. Continue traversing left children, adding the offset, until all the offsets are updated.
#OP:
Say you have a text buffer with 8 characters, and 4 offsets into the odd bytes:
the tree:        5
               /   \
              3     2
             / \   / \
            1   0 0   0

sum of right
children (indices):  1   3   5   7
Now say you inserted 2 bytes at offset 4. Buffer was:
01234567
Now it's
0123xx4567
So you modify just the nodes that dominate the parts of the array that changed. In this case only the root node needs to be modified.
the tree:        7
               /   \
              3     2
             / \   / \
            1   0 0   0

sum of right
children (indices):  1   3   7   9
The summation rule: walking from leaf to root, I add to my own value the value of my parent whenever I am that parent's right child.
To find whether there is an index at my current location, I start at the root and ask whether this node's offset is greater than my location. If yes, I traverse left and add nothing. If no, I traverse right and add the node's value to my running total. If at the end of the traversal my running total equals my location, then yes, there is an annotation there. You can do a similar traversal with a minimum and a maximum index to find the node that dominates all the indices in that range, which gives all the indices into the text I'm displaying.
Oh, and this is just a toy example. In reality you need to periodically rebalance the tree; otherwise, if you keep adding new indices in just one part of the file, the tree can get badly out of balance and worst-case performance would no longer be O(log2 n) but O(n). To keep the tree balanced you would need a self-balancing binary tree such as a red-black tree. That guarantees O(log2 n) performance, where n is the number of metadata entries.
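For concreteness, here is a toy, unbalanced C++ sketch of the structure described above (no rebalancing, no insertion of new annotations; the names are made up). On the example tree, shiftFrom(root, 4, 2) bumps only the root's value from 5 to 7, turning the stored offsets 1, 3, 5, 7 into 1, 3, 7, 9.

// Each node stores a delta rather than an absolute offset: a node's absolute
// offset is its own delta plus the delta of every ancestor whose RIGHT child
// lies on the path down to it.
struct OffsetNode {
    long delta;
    OffsetNode* left = nullptr;
    OffsetNode* right = nullptr;
};

// Is there an annotation exactly at text position `pos`?
// `acc` carries the deltas inherited from ancestors via right-child links.
bool hasAnnotation(const OffsetNode* n, long pos, long acc = 0) {
    if (!n) return false;
    long off = acc + n->delta;                   // this node's absolute offset
    if (off == pos) return true;
    if (pos < off)  return hasAnnotation(n->left, pos, acc);      // left subtree ignores n->delta
    return hasAnnotation(n->right, pos, acc + n->delta);          // right subtree inherits it
}

// Text was inserted: every annotation at offset >= pos moves right by `len`.
// Because a node's delta also feeds its whole right subtree, bumping one delta
// shifts that entire subtree at once, so only O(height) nodes are touched.
void shiftFrom(OffsetNode* n, long pos, long len, long acc = 0) {
    if (!n) return;
    long off = acc + n->delta;
    if (off >= pos) {
        n->delta += len;                         // shifts this node and its right subtree
        shiftFrom(n->left, pos, len, acc);       // left subtree may still hold offsets >= pos
    } else {
        shiftFrom(n->right, pos, len, acc + n->delta);
    }
}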
Don't store absolute indices! There's no way to do that and simultaneously get performance better than O(n): add a character at the beginning of the array and you'll have to increment n - 1 indices, no way around it.
But if you store substring lengths instead, you only have to change one per level of the tree structure, bringing you down to O(log n). My (untested) solution would be to use a Rope with metadata attached to the nodes - you might need to play around with that a bit, but I think it's a solid foundation.
Hope that helps!
As part of a programming assignment, I have to maintain linked lists in a text file. I am pretty comfortable with the linked list data structure, but not so much with files in C++. Can someone give me an idea or overview of how to approach it? I should be able to add or delete a linked list, and also add or delete nodes within a linked list, and I should reuse the space freed when something is deleted from a list. Each list has a number (integer); all nodes are the same size and contain an integer.
My idea would be:
1) Maintain a file with one entry per linked-list number:
0 - NULL
1 - head_offset for_linked_list_num 1
0 - NULL
1 - head_offset_for_linked_list_num 3
1 - head_offset_for_linked_list_num 3
1 - head_offset_for_linked_list_num 3
etc
where -1 is the terminator indication, and a 1 at position i indicates that the i-th list has a location (head offset) associated with it
2) Open another file and maintain the linked lists themselves like this:
data next_offset
data next_offset
data NULL
By doing this, I can keep track of the linked lists and efficiently add/delete/display a list.
To do this in C++, what are the functions I need to know and learn? I have very little time, and I can code it if I think of it in terms of a basic set of functions. Please advise. Thanks in advance.
One solution could be to give each list its own file. The list number would be the file name on disk. Inside the file, the first line is always the head node. Each line holds two pieces of data: the node's data followed by the line number (within the file) of the next node. This would allow you to reuse empty spots left by deletions. (A -1 for the next node's line number marks the end of the list.) For example:
1.txt:
(data) (next node's line number)
18 2
32 4
4 -1
5 3
So this link list would be 18 -> 32 -> 5 -> 4
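A small sketch of reading such a file and walking the list (it assumes the "(data) (next node's line number)" line above is just annotation, not actually stored in the file; the FileNode name is made up):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct FileNode { int data; int next; };   // next == -1 marks the end of the list

int main() {
    std::ifstream in("1.txt");
    std::vector<FileNode> nodes;            // nodes[k] holds line k+1 of the file
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        FileNode n;
        if (iss >> n.data >> n.next) nodes.push_back(n);
    }
    if (!nodes.empty()) {
        // The first line is always the head node; the stored numbers are 1-based line numbers.
        for (int lineNo = 1; lineNo != -1; lineNo = nodes[lineNo - 1].next)
            std::cout << nodes[lineNo - 1].data << ' ';   // prints: 18 32 5 4
    }
    std::cout << '\n';
}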
For the question of which functions to use, you might consult:
http://www.gnu.org/software/libc/manual/html_node/I_002fO-on-Streams.html#I_002fO-on-Streams
Reading through this section takes maybe 15 minutes and covers the basics.