I'm wondering whether it's possible to have dynamic edge weights in BGL. I'm writing a public transport navigator, so besides using time as the weight, it would be nice to favor staying on the same line over changing at every stop, even if changing would be 3 minutes faster; constant changes are just inconvenient.
Thanks for your help
Edit:
Or maybe there is a better library that can do this, which I should use instead?
I'm not entirely clear on what you mean by dynamic... the weights are presumably stored in edge properties; there's nothing to stop you updating the properties with new values as required.
If you mean that you want the edge weights to be a function-object (or "functor", if you must) rather than "just a value", then see this thread on the BGL users list; I haven't tried it myself. It makes me wonder how well the various graph algorithms that use edge weights cope with the weights changing while they're in progress (if the functor is called more than once and returns a different value each time)...
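One way to get the effect asked for in the question without weights that change mid-search is to keep every weight fixed and search over (stop, line) pairs instead, charging a fixed penalty for each change of line. The following is a minimal standard-library sketch of that idea, not BGL code; Ride, cheapestTrip and changePenalty are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for illustration; none of this is BGL API.
struct Ride {
    int to;       // destination stop
    int line;     // line that serves this hop
    int minutes;  // fixed travel time
};

// Cheapest cost from `from` to `to`, where cost = travel minutes plus
// `changePenalty` for every change of line. The search state is (stop, line),
// so no weight changes while the algorithm runs.
// Returns -1 if `to` is unreachable.
int cheapestTrip(const std::vector<std::vector<Ride> >& rides,
                 int from, int to, int changePenalty) {
    assert(from >= 0 && from < static_cast<int>(rides.size()));
    typedef std::pair<int, int> State;    // (stop, line); line -1 = not boarded yet
    typedef std::pair<int, State> Entry;  // (cost so far, state)
    std::map<State, int> best;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > queue;
    best[State(from, -1)] = 0;
    queue.push(Entry(0, State(from, -1)));
    while (!queue.empty()) {
        Entry top = queue.top();
        queue.pop();
        int cost = top.first;
        State state = top.second;
        if (cost > best[state]) continue;  // stale queue entry
        int stop = state.first;
        int line = state.second;
        if (stop == to) return cost;       // first arrival popped is the cheapest
        for (const Ride& r : rides[stop]) {
            bool change = (line != -1 && line != r.line);
            int next = cost + r.minutes + (change ? changePenalty : 0);
            State ns(r.to, r.line);
            std::map<State, int>::iterator it = best.find(ns);
            if (it == best.end() || next < it->second) {
                best[ns] = next;
                queue.push(Entry(next, ns));
            }
        }
    }
    return -1;
}
```

With stops 0, 1, 2, line 1 running 0 to 1 to 2 at 5 minutes per hop and line 2 running 1 to 2 in 3 minutes, a penalty of 0 picks the change (cost 8), while a penalty of 5 keeps the rider on line 1 (cost 10).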
I tried to play around with the partition strategy described in the reference documentation (https://tinkerpop.apache.org/docs/current/reference/). Initially, I expected that when I defined a specific partition key for a zone and wrote some vertices to it, it would index that specific zone and improve vertex lookup. Eventually, I realized that the partition key is just another property value defined on a vertex. In other words, this code is nothing more than a property value lookup, which leads to a full graph scan:
g.withStrategies(new PartitionStrategy(partitionKey: "_partition", writePartition: "a",
readPartitions: ["a"]));
I'm not sure what the underlying logic of PartitionStrategy is, but it does not seem to improve lookup if it really does a full graph scan. Correct me if I'm wrong.
From TinkerPop's perspective, PartitionStrategy just automatically modifies your Gremlin to take advantage of a particular property in the graph. TinkerPop doesn't know anything about your graph database's underlying indexing features, nor does it implement any. It is up to your graph to optimize such things. Some graphs might do that on their own, some might offer you the opportunity to create indices that would help improve the speed of PartitionStrategy, and others might do nothing at all, which means PartitionStrategy will not work well for every use case.
Going back to TinkerPop's perspective, the goal of PartitionStrategy (and SubgraphStrategy for that matter) is more to ease the manner with which Gremlin is written for use cases where parts of the graph need to be hidden. Without it, you would have lots and lots of repetitive filters mixed into your traversal which would muddy its readability.
Consider this bit of code:
graph = TinkerGraph.open()
strategy = new PartitionStrategy(partitionKey: "_partition", writePartition: "a", readPartitions: ["a"])
g = traversal().withEmbedded(graph).withStrategies(strategy)
g.addV().addE('link')
g.V().out().out().out()
The traversal is quite readable and straightforward. It is easy to understand the intent - a three step hop. But that's not really the traversal that executed. What executed was:
g.V().out().has('_partition',within("a")).
out().has('_partition',within("a")).
out().has('_partition',within("a"))
If you are using PartitionStrategy then you need to be sure it suits your graph database as well as your use case.
So I was trying to represent a certain transport system and apply some search algorithms to it. The system consists of a number of stations, so I think they can act as vertices, while the lines between them are a good fit for the edges. I do have a high-level idea of what I want to do and how the search might work, but I am not able to translate that into code.
Do I use a class to represent the stations, with each station an object? A tuple to store the coordinates? I would love any guidance on how to actually implement the algorithms and write the program itself.
I am thinking about using C++ for this.
You need at least a class Node/Station. Then you need either 1. a class Graph that contains a list of weighted connections between two Nodes, or 2. a list of nodes where each node has a list of weighted connections holding node pointers.
Usually you want your graph API to be able to return the neighbors of a node sorted by weight, so you would probably sort them in a call such as graph.build() after adding all the weights.
There is no real gain in adding "coordinates" for the stations unless it's a real application: it's much easier to just set costs between stations than to come up with good positions for them. Otherwise, just draw yourself a station map and label the edges by hand.
I'm guessing you want something like graph.path(a, b); with Dijkstra's algorithm you will easily be able to do that. I would recommend setting up the code that calls the algorithm before coding too much; that way, if you're about to represent the data in a bad way, you will find out earlier.
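As a rough illustration of option 2 together with a graph.path(a, b) call, here is a minimal sketch in which each station owns its list of weighted connections and path() runs Dijkstra's algorithm. All names (Station, Connection, addStation, connect) are hypothetical, and stations are referred to by index rather than by pointer to keep the sketch short.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; each station owns its connections.
struct Connection {
    int to;    // index of the neighbouring station
    int cost;  // e.g. minutes between the two stations
};

struct Station {
    std::string name;
    std::vector<Connection> connections;
};

class Graph {
public:
    int addStation(const std::string& name) {
        stations_.push_back(Station{name, {}});
        return static_cast<int>(stations_.size()) - 1;
    }
    // Lines run both ways, so store the connection in both directions.
    void connect(int a, int b, int cost) {
        assert(cost >= 0);  // Dijkstra's algorithm needs non-negative costs
        stations_[a].connections.push_back(Connection{b, cost});
        stations_[b].connections.push_back(Connection{a, cost});
    }
    // Station indices along the cheapest route from a to b, or empty if unreachable.
    std::vector<int> path(int a, int b) const {
        const int inf = std::numeric_limits<int>::max();
        std::vector<int> dist(stations_.size(), inf), prev(stations_.size(), -1);
        typedef std::pair<int, int> Entry;  // (distance, station)
        std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > queue;
        dist[a] = 0;
        queue.push(Entry(0, a));
        while (!queue.empty()) {
            Entry top = queue.top();
            queue.pop();
            int d = top.first, u = top.second;
            if (d > dist[u]) continue;  // stale queue entry
            if (u == b) break;          // destination settled
            for (const Connection& c : stations_[u].connections) {
                if (d + c.cost < dist[c.to]) {
                    dist[c.to] = d + c.cost;
                    prev[c.to] = u;
                    queue.push(Entry(dist[c.to], c.to));
                }
            }
        }
        std::vector<int> route;
        if (dist[b] == inf) return route;
        for (int v = b; v != -1; v = prev[v]) route.push_back(v);
        std::reverse(route.begin(), route.end());
        return route;
    }
private:
    std::vector<Station> stations_;
};
```

Writing the call site first (add three stations, connect them, ask for path(a, c)) is a quick way to check that this representation is convenient before investing in it.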
I am working on a graph implementation for a C++ class I am taking. This is what I came up with so far:
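#include <list>

struct Vertex; // forward declaration: Edge refers to Vertex before it is defined

struct Edge {
int weight;
Vertex *endpoints[2]; // always will have 2 endpoints, since i'm making this undirected
};
struct Vertex {
int data; // or id
std::list<Edge*> edges; // edges incident to this vertex
};
class Graph {
public:
// constructor, destructor, methods, etc.
private:
std::list<Vertex> vertices;
};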
It's a bit rough at the moment, but I guess I'm wondering... Am I missing something basic? It seems a bit too easy at the moment, and usually that means I'm designing it wrong.
My thought is that a graph is just a list of vertices, each of which has a list of edges, each of which has two vertex endpoints.
Other than some functions I'll put into the graph (like: shortest distance, size, add a vertex, etc), am I missing something in the basic implementation of these structs/classes?
Sometimes you need to design something like this and it is not immediately apparent what the most useful implementation and data representation is (for example, is it better to store a collection of points, a collection of edges, or both?). You'll run into this all the time.
You might find, for example, that your first constructor isn't something you'd actually want. It might be easier to have the Graph class create the Vertices rather than passing them in.
Rather than working within the class itself and playing a guessing game, take a step back and work on the client code first. For example, you'll want to create a Graph object, add some points, connect the points with edges somehow, etc.
The ordering of the calls you make from the client will come naturally, as will the parameters of the functions themselves. With this understanding of what the client will look like, you can start to implement the functions themselves, and it will be more apparent what the actual implementation should be.
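To make the client-first idea concrete, here is a small sketch: the function buildTriangle at the bottom stands for the client code written first, and the class above it is one interface that such client code might lead to. Every name here (addVertex, addEdge, degree, buildTriangle) is hypothetical and not taken from the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical interface, shaped by the client code at the bottom.
class Graph {
public:
    // The graph creates the vertex itself and hands back an id.
    int addVertex(int data) {
        data_.push_back(data);
        adjacency_.emplace_back();
        return static_cast<int>(data_.size()) - 1;
    }
    // Undirected: record the edge at both endpoints.
    void addEdge(int a, int b, int weight) {
        assert(valid(a) && valid(b));
        adjacency_[a].push_back(Half{b, weight});
        adjacency_[b].push_back(Half{a, weight});
    }
    std::size_t size() const { return data_.size(); }
    std::size_t degree(int v) const { return adjacency_[v].size(); }
private:
    struct Half { int other; int weight; };  // one endpoint's view of an edge
    bool valid(int v) const { return v >= 0 && v < static_cast<int>(data_.size()); }
    std::vector<int> data_;
    std::vector<std::vector<Half> > adjacency_;
};

// The client code: create a graph, add some points, connect them.
Graph buildTriangle() {
    Graph g;
    int a = g.addVertex(10);
    int b = g.addVertex(20);
    int c = g.addVertex(30);
    g.addEdge(a, b, 1);
    g.addEdge(b, c, 2);
    g.addEdge(c, a, 3);
    return g;
}
```

Here the client never touches a Vertex or Edge object directly, which is one possible outcome of letting the Graph class create the vertices instead of having them passed in.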
Comments about your implementation:
A graph is a collection of objects in which some pairs of objects are related. Therefore, your current implementation is one potential way of doing it; you model the objects and the relationship between them.
The advantages of your current implementation are primarily constant lookup time along an edge and generalizability. Lookup time: if you want to access the nth neighbor of node k, that can be done in constant time. Generalizability: this represents almost any graph someone could think of, especially if you replace the data type of weight and data with an object (or a Template).
The disadvantage of your current implementation is that it will probably be slower than ideal. Looking across an edge will be cheap, but it still takes two hops instead of one (node->edge->node). Furthermore, using a list of edges means it takes O(d) time to look up a specific edge, where d is the degree of the vertex. (Your reliance on pointers also requires that the graph fit in the memory of one computer; you'd have trouble with Facebook's graphs or the US road network. I doubt that parallel computing is a concern of yours at this point.)
Concerns when implementing a graph:
However, your question asks whether this is the best way. That's a difficult question, as several specific qualities of a graph come in to play.
Edge Information: If the way in which vertices are related doesn't matter (i.e., there is no weight or value to an edge), there is little point in using edge objects; this will only slow you down. Instead, each vertex can just keep a list of pointers to its neighbors, or a list of the IDs of its neighbors.
Construction: As you noticed in the comments, your current implementation requires that you have a vertex available before adding an edge. That is true in general. But you may want to create vertices on the fly as you add edges; this can make the construction look cleaner, but will take more time if the vertices have non-constant lookup time. If you know all the vertices before constructing the graph, it may be beneficial to explicitly create them first, then the edges.
Density: If the graph is sparse (i.e., the number of edges per vertex is approximately constant), then an adjacency list is again a good method. However, if it is dense, you can often get increased performance if you use an adjacency matrix. Every vertex holds a list of all other vertices in order, and so accessing any edge is a constant time operation.
Algorithm: What problems do you plan on solving on the graph? Several famous graph algorithms have different running times based on how the graph is represented.
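The density trade-off above can be sketched by putting the two representations side by side. This is an illustrative sketch only: MatrixGraph and ListGraph are hypothetical names, the graph is undirected, and a weight of 0 is assumed to mean "no edge".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adjacency matrix: one slot per ordered pair of vertices.
class MatrixGraph {
public:
    explicit MatrixGraph(std::size_t n) : n_(n), weights_(n * n, 0) {}
    void addEdge(std::size_t a, std::size_t b, int w) {
        assert(a < n_ && b < n_ && w != 0);  // 0 is reserved for "no edge"
        weights_[a * n_ + b] = w;
        weights_[b * n_ + a] = w;
    }
    // O(1) whatever the degree, at the cost of O(n^2) memory.
    int weight(std::size_t a, std::size_t b) const { return weights_[a * n_ + b]; }
private:
    std::size_t n_;
    std::vector<int> weights_;
};

// Adjacency list: each vertex stores only the neighbours it actually has.
class ListGraph {
public:
    explicit ListGraph(std::size_t n) : adjacency_(n) {}
    void addEdge(std::size_t a, std::size_t b, int w) {
        assert(a < adjacency_.size() && b < adjacency_.size() && w != 0);
        adjacency_[a].push_back(Neighbor{b, w});
        adjacency_[b].push_back(Neighbor{a, w});
    }
    // O(degree of a): scans a's neighbours; memory is O(vertices + edges).
    int weight(std::size_t a, std::size_t b) const {
        for (const Neighbor& n : adjacency_[a]) {
            if (n.to == b) return n.weight;
        }
        return 0;
    }
private:
    struct Neighbor { std::size_t to; int weight; };
    std::vector<std::vector<Neighbor> > adjacency_;
};
```

For a sparse graph the list version stores far less; for a dense graph the matrix's constant-time edge lookup tends to win.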
Addendum:
Have a look at this question for many more comments that may help you out:
Graph implementation C++
I need to represent directed graphs in Clojure. I'd like to represent each node in the graph as an object (probably a record) that includes a field called :edges that is a collection of the nodes that are directly reachable from the current node. Hopefully it goes without saying, but I would like these graphs to be immutable.
I can construct directed acyclic graphs with this approach as long as I do a topological sort and build each graph "from the leaves up".
This approach doesn't work for cyclic graphs, however. The one workaround I can think of is to have a separate collection (probably a map or vector) of all of the edges for an entire graph. The :edges field in each node would then have the key (or index) into the graph's collection of edges. Adding this extra level of indirection works because I can create keys (or indexes) before the things they (will) refer to exist, but it feels like a kludge. Not only do I need to do an extra lookup whenever I want to visit a neighboring node, but I also have to pass around the global edges collection, which feels very clumsy.
I've heard that some Lisps have a way of creating cyclic lists without resorting to mutation functions. Is there a way to create immutable cyclic data structures in Clojure?
You can wrap each node in a ref to give it a stable handle to point at (and allow you to modify the reference, which can start as nil). It is then possible to build cyclic graphs that way. This does have "extra" indirection, of course.
I don't think this is a very good idea though. Your second idea is a more common implementation. We built something like this to hold an RDF graph and it is possible to build it out of the core data structures and layer indices over the top of it without too much effort.
I've been playing with this the last few days.
I first tried making each node hold a set of refs to edges, and each edge hold a set of refs to the nodes. I set them equal to each other in a (dosync... (ref-set...)) type of operation. I didn't like this because changing one node requires a large amount of updates, and printing out the graph was a bit tricky. I had to override the print-method multimethod so the repl wouldn't stack overflow. Also any time I wanted to add an edge to an existing node, I had to extract the actual node from the graph first, then do all sorts of edge updates and that sort of thing to make sure everyone was holding on to the most recent version of the other thing. Also, because things were in a ref, determining whether something was connected to something else was a linear-time operation, which seemed inelegant. I didn't get very far before determining that actually performing any useful algorithms with this method would be difficult.
Then I tried another approach which is a variation of the matrix referred to elsewhere. The graph is a clojure map, where the keys are the nodes (not refs to nodes), and the values are another map in which the keys are the neighboring nodes and single value of each key is the edge to that node, represented either as a numerical value indicating the strength of the edge, or an edge structure which I defined elsewhere.
It looks like this, sort of, for 1->2, 1->3, 2->5, 5->2
(def graph {node-1 {node-2 edge12, node-3 edge13},
node-2 {node-5 edge25},
node-3 nil ;;no edge leaves from node 3
node-5 {node-2 edge52}}) ;; nodes 2 and 5 have an undirected edge
To access the neighbors of node-1 you go (keys (graph node-1)) or call the function defined elsewhere (neighbors graph node-1), or you can say ((graph node-1) node-2) to get the edge from 1->2.
Several advantages:
Constant time lookup of a node in the graph and of a neighboring node, or return nil if it doesn't exist.
Simple and flexible edge definition. A directed edge exists implicitly when you add a neighbor to a node entry in the map, and its value (or a structure for more information) is provided explicitly, or nil.
You don't have to look up the existing node to do anything to it. It's immutable, so you can define it once before adding it to the graph and then you don't have to chase it around getting the latest version when things change. If a connection in the graph changes, you change the graph structure, not the nodes/edges themselves.
This combines the best features of a matrix representation (the graph topology is in the graph map itself not encoded in the nodes and edges, constant time lookup, and non-mutating nodes and edges), and the adjacency-list (each node "has" a list of its neighboring nodes, space efficient since you don't have any "blanks" like a canonical sparse matrix).
You can have multiple edges between nodes, and if you accidentally define an edge which already exists exactly, the map structure takes care of making sure you are not duplicating it.
Node and edge identity is kept by clojure. I don't have to come up with any sort of indexing scheme or common reference point. The keys and values of the maps are the things they represent, not a lookup elsewhere or ref. Your node structure can be all nils, and as long as it's unique, it can be represented in the graph.
The only big-ish disadvantage I see is that for any given operation (add, remove, any algorithm), you can't just pass it a starting node. You have to pass the whole graph map and a starting node, which is probably a fair price to pay for the simplicity of the whole thing. Another minor disadvantage (or maybe not) is that for an undirected edge you have to define the edge in each direction. This is actually okay because sometimes an edge has a different value for each direction and this scheme allows you to do that.
The only other thing I see here is that because an edge is implicit in the existence of a key-value pair in the map, you cannot define a hyperedge (ie one which connects more than 2 nodes). I don't think this is a big deal necessarily since most graph algorithms I've come across (all?) only deal with an edge that connects 2 nodes.
I ran into this challenge before and concluded that it isn't possible using truly immutable data structures in Clojure at present.
However you may find one or more of the following options acceptable:
Use deftype with ":unsynchronized-mutable" to create a mutable :edges field in each node that you change only once during construction. You can treat it as read-only from then on, with no extra indirection overhead. This approach will probably have the best performance but is a bit of a hack.
Use an atom to implement :edges. There is a bit of extra indirection, but I've personally found reading atoms to be extremely efficient.
So, I was thinking about making a simple random world generator. This generator would create a starting "cell" that would have between one and four random exits (in the cardinal directions, something like a maze). After deciding those exits, I would generate a new random "cell" at each of those exits, and repeat whenever a player got near a part of the world that had not yet been generated. This concept would allow an "infinite" world of sorts, all randomly generated; however, I am unsure of how best to represent this internally.
I am using C++ (which doesn't really matter, I could implement any sort of data structure necessary). At first I thought of using a sort of directed graph in which each node would have directed edges to each cell surrounding it, but this probably won't work well if a user finds a spot in the world, backtracks, and comes back to that spot from another direction. The world might do some weird things, such as generate two cells at one location.
Any ideas on what kind of data structure might be the most effective for such a situation? Or am I doing something really dumb with my random world generation?
Any help would be greatly appreciated.
Thanks,
Chris
I recommend you read about graphs. This is exactly an application of random graph generation. Instead of 'cell' and 'exit' you are describing 'node' and 'edge'.
Plus, then you can do things like shortest path analysis, cycle detection and all sorts of other useful graph theory application.
This will help you understand about the nodes and edges:
and here is a finished application of these concepts. I implemented this in an OOP way: each node knew about its edges to other nodes. A popular alternative is to implement this using an adjacency list. I think the adjacency list concept is basically what user470379 described in his answer. However, his map solution allows for infinite graphs, while a traditional adjacency list does not. I love graph theory, and this is a perfect application of it.
Good luck!
-Brian J. Stianr-
A map< pair<int,int>, cell> would probably work well; the pair would represent the x,y coordinates. If there's no cell in the map at those coordinates, create a new cell. If you wanted to make it truly infinite, you could replace the ints with an arbitrary-length integer class that you would have to provide (such as a bigint).
If the world's cells are arranged in a grid, you can easily give them cartesian coordinates. If you keep a big list of existing cells, then before determining exits from a given cell, you can check that list to see if any of its neighbors already exist. If they do, and you don't want to have 1-way doors (directed graph?) then you'll have to take their exits into account. If you don't mind having chutes in your game, you can still choose exits randomly, just make sure that you link to existing cells if they're there.
Optimization note: checking a hash table to see if it contains a particular key is O(1).
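The suggestions above (a container keyed by coordinates, plus a check of existing neighbors before choosing exits) might be combined along these lines. This is a minimal sketch with hypothetical names (Cell, World, at, generated); it uses std::map as in the first suggestion, so lookups are O(log n) rather than the O(1) of a hash table, and it leaves out the question's rule that every cell has at least one exit.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <utility>

// Hypothetical types; exits are indexed 0 = N, 1 = E, 2 = S, 3 = W.
struct Cell {
    bool exits[4];
};

class World {
public:
    explicit World(unsigned seed) : rng_(seed) {}

    // Returns the cell at (x, y), generating it on the first visit only,
    // so approaching a spot from another direction never creates a second cell.
    const Cell& at(int x, int y) {
        std::pair<int, int> key(x, y);
        std::map<std::pair<int, int>, Cell>::iterator found = cells_.find(key);
        if (found != cells_.end()) return found->second;

        static const int dx[4] = {0, 1, 0, -1};
        static const int dy[4] = {1, 0, -1, 0};
        std::bernoulli_distribution coin(0.5);
        Cell cell;
        for (int dir = 0; dir < 4; ++dir) {
            std::map<std::pair<int, int>, Cell>::iterator neighbor =
                cells_.find(std::make_pair(x + dx[dir], y + dy[dir]));
            if (neighbor != cells_.end()) {
                // Existing neighbour: mirror its exit so there are no one-way doors.
                cell.exits[dir] = neighbor->second.exits[(dir + 2) % 4];
            } else {
                cell.exits[dir] = coin(rng_);  // unexplored side: choose randomly
            }
        }
        return cells_.insert(std::make_pair(key, cell)).first->second;
    }

    std::size_t generated() const { return cells_.size(); }

private:
    std::mt19937 rng_;
    std::map<std::pair<int, int>, Cell> cells_;
};
```

Swapping in std::unordered_map with a hash for the coordinate pair would give the average O(1) lookup mentioned above; the rest of the sketch would stay the same.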
Couldn't you have a hash (or STL set) that stored a collection of all grid coordinates that contain occupied cells?
Then when you are looking at creating a new cell, you can quickly check to see if the candidate cell location is already occupied.
(if you had finite space, you could use a 2d array - I think I saw this in a Byte magazine article back in ~1980-ish, but if I understand correctly, you want a world that could extend indefinitely)