I am writing code to sample a large graph and have run into memory problems.
possible_edges = set(itertools.combinations(list(sampled_nodes), 2))
sampled_graph = list(possible_edges.intersection(ori_edges))
The code is supposed to generate every pair of nodes in sampled_nodes, which gives all possible edges between those nodes, and then intersect that set with ori_edges to find which edges actually exist.
The problem is that when the graph is enormous, materializing every combination from itertools.combinations causes a memory error.
I've thought about writing a for loop to compute the intersection iteratively, but that takes too much time.
Any help from you guys would be appreciated. Thank you!
Found one solution: instead of taking the intersection of two sets, I avoid creating all the combinations for possible_edges in the first place.
I go through the edges in ori_edges and keep each one whose two nodes are both in the sampled node set.
sampled_graph = list(set([(e[0], e[1]) for e in ori_edges if int(e[0]) in result_nodes and int(e[1]) in result_nodes]))
This code doesn't build the huge collection of combinations, and the speed is acceptable.
So, I've implemented a directed graph using an unordered multimap. Each pair within the map is made up of two strings: the vertex and its adjacent vertex.
Now I am trying to determine whether my graph has a cycle and, if so, how big the cycle is. This is the code I have so far:
int findCycle(const unordered_multimap<string,string> & connectedURLVertices, string y, string key)
{
    string position;
    position=y.find(key);
    if(position!=string::npos)
    {
        return 1;
    }
    auto nodesToCheck=connectedURLVertices.equal_range(key);
    for(auto & node : nodesToCheck)
    {
        int z=findCycle(connectedURLVertices,y+key,node);
    }
}
I've walked through the code on paper and it seems to be logically correct, but I would appreciate it if anyone could take a look and see if I am on the right track or missing anything. Thanks!
To search for cycles in a graph, descend recursively through the arcs from some initial node until you either reach a node you have already visited on the current path (you can keep a std::set of visited nodes, or mark the nodes as you visit them), or exhaust all the nodes without meeting one again (no cycle). You can adjust the criterion for choosing the next arc, or the kind of search (depth-first, level by level, etc.), to find a cycle more quickly.
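The description above can be sketched as a depth-first search over the question's unordered_multimap representation. This is a minimal sketch under assumptions, not a definitive implementation: the names Graph, dfs and findCycleLength are hypothetical, only cycles reachable from the chosen start vertex are found, and the value returned is the length of the first cycle met, which is not necessarily the shortest or longest one. Tracking vertices in a map rather than searching a concatenated string with find also avoids matching a key that happens to be a substring of another key.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Each entry maps a vertex to one of its adjacent vertices (directed).
using Graph = std::unordered_multimap<std::string, std::string>;

// Depth-first search from `node`. `onPath` holds the depth of every vertex on
// the current path; `finished` holds vertices that were fully explored
// without finding a cycle. Returns the cycle length in edges, or 0 if none.
int dfs(const Graph& g, const std::string& node, int depth,
        std::unordered_map<std::string, int>& onPath,
        std::unordered_set<std::string>& finished) {
    onPath[node] = depth;
    auto range = g.equal_range(node);
    for (auto it = range.first; it != range.second; ++it) {
        const std::string& next = it->second;
        auto seen = onPath.find(next);
        if (seen != onPath.end()) {
            // The arc leads back to a vertex on the current path: a cycle.
            return depth - seen->second + 1;
        }
        if (finished.count(next) == 0) {
            int length = dfs(g, next, depth + 1, onPath, finished);
            if (length > 0) return length;
        }
    }
    onPath.erase(node);     // leaving the current path
    finished.insert(node);  // nothing reachable from here closes a cycle
    return 0;
}

// Length of the first cycle reachable from `start`, or 0 if there is none.
int findCycleLength(const Graph& g, const std::string& start) {
    std::unordered_map<std::string, int> onPath;
    std::unordered_set<std::string> finished;
    return dfs(g, start, 0, onPath, finished);
}
```

To cover a graph that is not fully reachable from one vertex, findCycleLength could be called once per key in the map.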
For school I am supposed to use recursive backtracking to solve a Boat puzzle. The user inputs a maximum weight for the boat, the number of item types, and a weight and value for each item type. More than one of each item type can be placed on the boat.
Our assignment states "The program should find a solution that fills the boat with selected valuable items such that the total value of the items in the boat is maximized while the total weight of the items stays within the weight capacity of the boat."
It also has a pretty specific template for the recursive backtracking algorithm.
Currently I am using contiguous lists of items to store the possible items and the items on the boat. The item struct includes int members for weight, value, count (of how many times it is used) and a unique code for printing purposes. I then have a Boat class which contains data members max_weight, current_weight, value_sum, and members for each of the contiguous lists, and then member functions needed to solve the puzzle. All of my class functions seem to be working perfectly and my recursion is indeed displaying the correct answer given the example input.
The thing I can't figure out is the condition for extra credit, which is: "Modify your program so that it displays the best solution, which has the lowest total weight. If there are two solutions with the same total weight, break the tie by selecting the solution with the least items in it." I've looked at it for a while, but I'm just not sure how I can change it to make sure the weight is minimized while also maximizing the value. Here is the code for my solution:
bool solve(Boat &boat) {
    if (boat.no_more()) {
        boat.print();
        return true;
    }
    else {
        int pos;
        for (int i = 0; i < boat.size(); i++){
            if (boat.can_place(i)) {
                pos = boat.add_item(i);
                bool solved = solve(boat);
                boat.remove_item(pos);
                if (solved) return true;
            }
        }
        return false;
    }
}
All functions do pretty much exactly what their names say. no_more returns true if none of the possible items will fit on the boat. size returns the size of the list of possible items. Adding and removing items changes the item count data and also the Boat's current_weight and value_sum members accordingly. The parameter of add_item, remove_item and can_place is the index of the possible item being used. To make sure the maximum value is found, the list of possible items is sorted in descending order by value in the Boat's constructor, which takes a list of possible items as a parameter.
Also here is an example of what input and output look like:
Any insight is greatly appreciated!
It turned out that the above solution was correct. The only reason I was getting an incorrect answer was my implementation of the no_more() function. In that function I was checking whether the weight of any item in the possible items list was less than the weight left on the boat. I should have been checking whether it was less than or equal to the weight left on the boat. A simple mistake.
The Wikipedia entry was indeed of use and I enjoyed the comic :)
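For the extra-credit tie-break described in the question, one possible approach is to explore every load instead of stopping at the first one, and to keep the best load under a three-level comparison: higher value first, then lower weight, then fewer items. The sketch below is self-contained and does not follow the assignment's template or the Boat class; Item, Load, better, search and bestLoad are hypothetical names, and it assumes "lowest total weight" means lowest among the loads of maximum value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Item { int weight; int value; };

// Summary of one candidate load of the boat.
struct Load { int value; int weight; int count; };

// True if `a` is preferable to `b`: more value, then less weight,
// then fewer items.
bool better(const Load& a, const Load& b) {
    if (a.value != b.value) return a.value > b.value;
    if (a.weight != b.weight) return a.weight < b.weight;
    return a.count < b.count;
}

// Recursive backtracking. Items are only added from index `first` onward,
// so each multiset of items is visited once rather than in every order.
void search(const std::vector<Item>& items, int maxWeight, std::size_t first,
            Load current, Load& best) {
    if (better(current, best)) best = current;
    for (std::size_t i = first; i < items.size(); ++i) {
        if (current.weight + items[i].weight <= maxWeight) {  // note: <=
            Load next{current.value + items[i].value,
                      current.weight + items[i].weight,
                      current.count + 1};
            search(items, maxWeight, i, next, best);  // `i` again: reuse allowed
        }
    }
}

// Best load for the given item types and capacity.
Load bestLoad(const std::vector<Item>& items, int maxWeight) {
    // Assumption: weights are positive, otherwise the recursion never ends.
    for (const Item& item : items) assert(item.weight > 0);
    Load best{0, 0, 0};
    search(items, maxWeight, 0, Load{0, 0, 0}, best);
    return best;
}
```

Because `current` is passed by value, undoing a choice happens implicitly when a call returns, which plays the role of remove_item in the question's code.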
Let me start off with saying that I have very basic knowledge of nodes and graphs.
My goal is to make a solver for a maze which is stored as an array. I know exactly how to implement the algorithm for solving (I'm actually implementing a couple of them), but I am very confused about how to implement the nodes that the solver will use in each empty cell.
Here is an example array:
char maze[5][10] = {   // 9 visible characters per row plus the terminating '\0'
    "#########",
    "#  #    #",
    "# ## ## #",
    "#     # #",
    "#########"
};
My solver starts at the top left and the solution (exit) is at the bottom right.
I've read up on how nodes work and how graphs are implemented, so here is how I think I need to make this:
Starting point will become a node
Each node will have as property the column and the row number
Each node will also have as property the visited state
Visited state can be visited, visited and leads to dead end, not visited
Every time a node gets visited, every directly adjacent, empty and not visited cell becomes the visited node's child
Every visited node gets put on top of the solutionPath stack (and marked on the map as '*')
Every node that led to a dead end is removed from the stack (and marked on the map as '~')
Example of finished maze:
"#########",
"#*~#****#",
"#*##*##*#",
"#****~#*#",
"#########"
Basically my question is: am I doing something really stupid here with my way of thinking (since I am really inexperienced with nodes), and if so, could you please explain why? Also, if possible, please point me to other websites with examples of graphs used in real-world applications so I can get a better grasp of them.
The answer really depends on what you find most important in the problem. If you're searching for efficiency and speed - you're adding way too many nodes. There's no need for so many.
The efficient method
Your solver only needs nodes at the start and end of the path, and at every possible corner on the map. Like this:
"#########",
"#oo#o  o#",
"# ## ## #",
"#o  oo#o#",
"#########"
There's no real need to test the other places on the map - you'll either HAVE TO walk through them, or won't need to bother testing them at all.
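As a rough illustration of how the 'o' cells above could be picked out automatically, the sketch below keeps every free cell that is not part of a straight corridor, which leaves corners, junctions and dead ends. The function name findNodes and the vector-of-strings maze are assumptions made for this example. In a general maze the start and end cells would have to be added separately if they happen to lie in the middle of a corridor.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Returns the (column, row) of every free cell that is a corner, junction
// or dead end, i.e. not a straight piece of corridor. '#' marks a wall.
std::vector<std::pair<int, int>> findNodes(const std::vector<std::string>& maze) {
    std::vector<std::pair<int, int>> nodes;
    auto isFree = [&](int x, int y) {
        return y >= 0 && y < static_cast<int>(maze.size()) &&
               x >= 0 && x < static_cast<int>(maze[y].size()) &&
               maze[y][x] != '#';
    };
    for (int y = 0; y < static_cast<int>(maze.size()); ++y) {
        for (int x = 0; x < static_cast<int>(maze[y].size()); ++x) {
            if (!isFree(x, y)) continue;
            bool left = isFree(x - 1, y), right = isFree(x + 1, y);
            bool up = isFree(x, y - 1), down = isFree(x, y + 1);
            // A straight corridor has exactly two free neighbours,
            // and they are opposite each other.
            bool horizontal = left && right && !up && !down;
            bool vertical = up && down && !left && !right;
            if (!horizontal && !vertical) nodes.emplace_back(x, y);
        }
    }
    return nodes;
}
```

For the example maze this yields eight nodes, matching the marked cells in the picture; the edges between them would then carry the number of corridor cells as their length.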
If it helps you - I have a template digraph class that I designed for simple graph representation. It's not very well written, but it's good enough to show a possible solution.
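#include <map>
#include <set>
#include <utility>

template <class _nodeType, class _edgeType>
class digraph
{
public:
    std::set<_nodeType> _nodes; // all nodes in the graph
    // edges, keyed by the positions of the two nodes they connect
    std::map<std::pair<unsigned int,unsigned int>,_edgeType> _edges;
};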
I use this class to find a path in a tower defence game using Dijkstra's algorithm. The representation should be sufficient for any other algorithm though.
Nodes can be of any given type - you'll probably end up using pair<unsigned int, unsigned int>. The _edges connect two _nodes by their position in the set.
The easy to code method
On the other hand - if you're looking for an easy to implement method - you just need to treat every free space in the array as a possible node. And if that's what you're looking for - there's no need for designing a graph, because the array represents the problem in a perfect way.
You don't need dedicated classes to solve it this way.
bool myMap[5][9]; // the array containing the map info, indexed as myMap[row][column]. 0 = impassable, 1 = passable
vector<pair<int,int>> route; // the way you need to go, as (column, row) pairs
pair<int,int> start = pair<int,int>(1,1); // the route starts at (1,1)
pair<int,int> end = pair<int,int>(7,3); // the route ends at (7,3)
route = findWay(myMap,start,end); // finding the way with the algorithm you code
Where findWay has a prototype of vector<pair<int,int>> findWay(const bool map[5][9], pair<int,int> begin, pair<int,int> end), and implements the algorithm you desire. Inside the function you'll probably need another two-dimensional array of type bool that records which places have already been tested.
When the algorithm finds a route, you usually have to read it in reverse, but that depends on the algorithm.
In your particular example, myMap would contain:
bool myMap[5][9] = {0,0,0,0,0,0,0,0,0,
0,1,1,0,1,1,1,1,0,
0,1,0,0,1,0,0,1,0,
0,1,1,1,1,1,0,1,0,
0,0,0,0,0,0,0,0,0};
And findWay would return a vector containing (1,1),(1,2),(1,3),(2,3),(3,3),(4,3),(4,2),(4,1),(5,1),(6,1),(7,1),(7,2),(7,3)
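As one possible findWay, here is a minimal breadth-first search sketch. It is only an illustration under assumptions: it uses std::array instead of a raw two-dimensional array, the names Grid, Cell, ROWS and COLS are introduced for this example, coordinates are (column, row) pairs, and movement is limited to the four orthogonal neighbours. Instead of a separate bool array it records, for each tested cell, the cell it was reached from, which also shows why the route comes out in reverse.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

constexpr int ROWS = 5;
constexpr int COLS = 9;
using Grid = std::array<std::array<bool, COLS>, ROWS>;  // true = passable
using Cell = std::pair<int, int>;                       // (column, row)

std::vector<Cell> findWay(const Grid& map, Cell begin, Cell end) {
    assert(map[begin.second][begin.first]);  // the start must be passable
    // cameFrom[row][column] is the cell we arrived from;
    // (-1,-1) means the cell has not been tested yet.
    std::array<std::array<Cell, COLS>, ROWS> cameFrom;
    for (auto& row : cameFrom) row.fill(Cell(-1, -1));

    std::queue<Cell> frontier;
    frontier.push(begin);
    cameFrom[begin.second][begin.first] = begin;

    const int dx[] = {1, -1, 0, 0};
    const int dy[] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        Cell cur = frontier.front();
        frontier.pop();
        if (cur == end) break;
        for (int d = 0; d < 4; ++d) {
            int x = cur.first + dx[d];
            int y = cur.second + dy[d];
            if (x < 0 || x >= COLS || y < 0 || y >= ROWS) continue;
            if (!map[y][x] || cameFrom[y][x].first != -1) continue;
            cameFrom[y][x] = cur;
            frontier.push(Cell(x, y));
        }
    }

    std::vector<Cell> route;
    if (cameFrom[end.second][end.first].first == -1) return route;  // no way
    // Walk back from the end to the start, then reverse the result.
    for (Cell c = end; c != begin; c = cameFrom[c.second][c.first]) {
        route.push_back(c);
    }
    route.push_back(begin);
    std::reverse(route.begin(), route.end());
    return route;
}
```

An empty vector signals that the end cannot be reached from the start.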
I have a simple, undirected tree T. I need to find a path A and another path B such that A and B have no vertex in common. The purpose is to maximize Len(A)*Len(B).
I figured this problem is similar to the Partition Problem, except that in the Partition Problem you have a set, whereas here you have an equivalence set. The solution is to find two non-crossing paths such that Len(A) ~ Len(B) ~ [(n-1)/2]. Is this correct? How should I implement such an algorithm?
First of all, I think you are looking at this problem the wrong way. As I understand it, you have a graph-related problem. What you do is:
Build a maximum spanning tree and find the length L.
Now, you say that the two paths can't have any vertex in common, so we have to remove an edge to achieve this. I assume that every edge weight in your graph is 1, so the lengths of the two paths A and B sum to L-1 after you remove an edge. The problem is now to remove an edge such that the product of len(A) and len(B) is maximized. You do that by removing the edge nearest the middle of L. Why? The problem is the same as maximizing the area of a rectangle with a fixed perimeter. A short YouTube video on the subject can be found here.
Note that if your edge weights are not equal to 1, you have a harder problem, because there may exist more than one maximum spanning tree, and you may be able to split them in different ways. If this is the case, write back and I will think about a solution, but I do not have one at hand.
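The rectangle analogy can be written out in one line. Writing a = len(A), b = len(B) and S = a + b for the fixed total (S = L - 1 under the unit-weight assumption above; the symbols a, b and S are introduced here only for illustration):

```latex
\[
  ab \;=\; a\,(S - a) \;=\; \frac{S^{2}}{4} - \left(a - \frac{S}{2}\right)^{2} \;\le\; \frac{S^{2}}{4},
\]
```

with equality exactly when a = S/2. For whole-number lengths the best split is therefore a = floor(S/2) and b = ceil(S/2), which is why the cut should be as close to the middle as possible.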
Best of luck.
I think there is a dynamic programming solution that is just about tractable if path length is just the number of links in the paths (so links don't have weights).
Work from the leaves up. At each node you need to keep track of the best pair of solutions confined to the subtree with root at that node, and, for each k, the best solution with a path of length k terminating in that node and a second path of maximum length somewhere below that node and not touching the path.
Given this info for all descendants of a node, you can produce similar info for that node, and so work your way up to the root.
You can see that this amount of information is required if you consider a tree that is in fact just a line of nodes. The best solution for a line of nodes is to split it in two, so if you have only worked out the best solution when the line is of length 2n + 1 you won't have the building blocks you need for a line of length 2n + 3.
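A simple baseline can be useful for checking an implementation of the dynamic programme. The sketch below is not the dynamic programme described above but a swapped-in brute force: since two vertex-disjoint paths in a tree are always separated by some edge, it removes each edge in turn, computes the longest path (diameter) in each of the two resulting parts, and keeps the largest product. It assumes path length means number of edges and that the tree is given as adjacency lists over nodes 0..n-1; Tree, height and bestProduct are hypothetical names. It runs in O(n^2) time and recurses to the depth of the tree, so it is only suitable for small inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Tree = std::vector<std::vector<int>>;  // adjacency lists, nodes 0..n-1

// Returns the longest downward path (in edges) starting at `node`, never
// crossing the removed edge (cutA, cutB). `diameter` is raised to the
// longest path found anywhere in the part of the tree that is visited.
int height(const Tree& t, int node, int parent, int cutA, int cutB,
           int& diameter) {
    int best1 = 0, best2 = 0;  // the two longest downward branches
    for (int next : t[node]) {
        if (next == parent) continue;
        if ((node == cutA && next == cutB) || (node == cutB && next == cutA)) {
            continue;  // this is the removed edge
        }
        int h = height(t, next, node, cutA, cutB, diameter) + 1;
        if (h > best1) { best2 = best1; best1 = h; }
        else if (h > best2) { best2 = h; }
    }
    diameter = std::max(diameter, best1 + best2);
    return best1;
}

// Largest Len(A)*Len(B) over all pairs of vertex-disjoint paths A and B.
long long bestProduct(const Tree& t) {
    long long best = 0;
    for (int a = 0; a < static_cast<int>(t.size()); ++a) {
        for (int b : t[a]) {
            if (a > b) continue;  // handle each undirected edge once
            int diameterA = 0, diameterB = 0;
            height(t, a, -1, a, b, diameterA);  // part containing a
            height(t, b, -1, a, b, diameterB);  // part containing b
            best = std::max(best, 1LL * diameterA * diameterB);
        }
    }
    return best;
}
```

On a line of six nodes, for instance, cutting the middle edge leaves two paths of length 2, giving a product of 4.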