I've been working on a program whose objective was to use recursion and an adjacency matrix to find how many possible routes a person could take through a subway system without going over any track more than once. That part was straightforward for me, but now I'm lost on program 2, which is to solve the same problem in C++ using three classes and recursion. The classes are supposed to be SubwaySystem, Station, and Track. I don't really know how to go about the transition from a simple adjacency matrix to three classes. It seems counterproductive, since it looks more complicated. I have been working on it for a while now and I can't seem to make use of all three classes.
What I have tried: I created one SubwaySystem with 12 Stations, each Station holding an array of Tracks. For example, Station A can only go to one station, B, so Station A has an array of 12 tracks of which only 1 is activated. However, I keep running into errors, since I tried to initialize the arrays in the Track class and then use them in the SubwaySystem class. Trying to use recursion to get all possible routes then makes it that much more difficult. I really don't know how to figure this out.
The adjacency matrix in my code maps out every connection from station to station. The stations are A - L, corresponding to each row/column. I don't know how to represent this in C++ without using an adjacency matrix.
My code in C (program 1):
#include <stdio.h>
void routesFinder(int row, int col);
char station[13] = "ABCDEFGHIJKL";
char order[25] = "A";
int subway[12][12] = {{0,1,0,0,0,0,0,0,0,0,0,0},
{1,0,1,1,1,1,0,0,0,0,0,0},
{0,1,0,0,1,0,0,0,0,0,0,0},
{0,1,0,0,1,0,0,0,0,0,0,0},
{0,1,1,1,0,0,1,1,0,0,0,0},
{0,1,0,0,0,0,0,1,0,0,0,0},
{0,0,0,0,1,0,0,0,0,0,1,0},
{0,0,0,0,1,1,0,0,1,1,1,0},
{0,0,0,0,0,0,0,1,0,0,1,0},
{0,0,0,0,0,0,0,1,0,0,1,0},
{0,0,0,0,0,0,1,1,1,1,0,1},
{0,0,0,0,0,0,0,0,0,0,1,0}};
int paths = 0, i = 1;
int main(){
routesFinder(0, 0); //start with first station row, first column
printf("\n%d days before repeating a route.\n", paths);
return 0;
}
void routesFinder(int row, int col) {
while (col < 12) { //go through columns of a row
if (subway[row][col] == 0) { // if no station is found in row
if (row == 11) { // station found
paths++;
printf("Route %d: %s.\n", paths, order);
return;
}
col++;
if (row != 11 && col == 12) { //backtracking from deadend
return;
}
}
if (subway[row][col] == 1) {
order[i] = station[col]; //add station to route
i++; //increment, prepare for next route
subway[row][col] = 0; //no track forward
subway[col][row] = 0; // or backward
routesFinder(col, 0); //recursion, look for path in new row
order[i] = '\0'; //remove route
i--; //decrement, prepare for next route
subway[row][col] = 1; //restore path
subway[col][row] = 1; // restore path
col++; //returning from deadend, check for next open path
if (row != 11 && col == 12) { //return from deadend
return;
}
}
}
}
In general, in C++ in particular and in object-oriented design in general,
each object should have its own unique role in the system. Each one encapsulates behavior and knowledge that are its own and sole responsibility.
As for your specific problem: without getting too deeply into it, I think the idea would be:
#include <iostream>
#include <string>
#include <vector>
class Track;
typedef std::vector<Track*> TrackList;
class Station
{
public:
Station( std::string name ) : _name( name ){};
~Station(){}
public:
const std::string& GetName() const
{ return _name; }
TrackList& GetTrackList()
{ return _trackList; }
void AddTrack( Track& track )
{ _trackList.push_back( &track ); }
private:
std::string _name;
TrackList _trackList;
};
class Track
{
public:
Track( Station& edgeA, Station& edgeB )
:
_edgeA( edgeA ),
_edgeB( edgeB ),
_wasVisited( false )
{
edgeA.AddTrack( *this );
edgeB.AddTrack( *this );
}
~Track(){}
public:
bool WasVisited() const
{ return _wasVisited; }
void SetVisited()
{ _wasVisited = true; }
public:
Station& GetEdgeA()
{ return _edgeA; }
Station& GetEdgeB()
{ return _edgeB; }
private:
Station& _edgeA;
Station& _edgeB;
bool _wasVisited;
};
class SubwaySystem
{
public:
SubwaySystem() {}
~SubwaySystem() {}
public:
void Traverse( Station& start )
{
TrackList& tracks = start.GetTrackList();
TrackList::iterator it = tracks.begin();
while ( it != tracks.end() )
{
if ( ! (*it)->WasVisited() )
{
std::cout << (*it)->GetEdgeA().GetName() << "-->" << (*it)->GetEdgeB().GetName() << ",";
(*it)->SetVisited();
Traverse( (*it)->GetEdgeB() );
}
++ it;
}
std::cout << std::endl;
}
};
int main()
{
Station A( "A" );
Station B( "B" );
Station C( "C" );
Station D( "D" );
Station E( "E" );
Track AB( A, B );
Track BC( B, C );
Track CA( C, A );
Track CD( C, D );
Track CE( C, E );
Track AE( A, E );
SubwaySystem subway;
subway.Traverse( A );
}
The output of this is (blank lines omitted):
A-->B,B-->C,C-->A,A-->E,C-->E,
C-->D,
Surely you can 'play' with the Traverse function and put the printing in other places,
select another end-of-recursion condition, etc.
Notice how clean main() is.
You just declare the Stations and the Tracks and the voodoo happens.
Adding more tracks is simple: just describe the link and that's all, the track is 'added' to the subway.
Other parts of the application are also very clean, as each class knows exactly what it should and nothing more.
One possible way would be to have the subway system hold control over all the stations. The stations would then have tracks that know the origin (which station they came from) and the destination (which station they go to).
The adjacency matrix would be broken up: the whole thing is represented inside the subway system, each row/column is represented by a station, and each 1 is represented by a track. There would be no track for a zero.
Which paths to take would be decided at the station level, based on which tracks have been used and which destinations have already been visited. The tracks could have a property that keeps track of whether they have been ridden on.
If you were doing this in C, you might have structures like this
#include <stdbool.h>
typedef struct node node;
typedef struct edge edge;
typedef struct graph graph;
struct graph { // subway system
node *nodes; // stations
};
struct node { // station
char *name;
edge *edges; // tracks connected to this station
node *next; // next node in graph
bool visited;
};
struct edge { // track
node *src; // from station
node *dst; // to station
edge *next; // next track, this station
bool visited;
};
Transforming that into classes should be easy, except that they might want you to use STL data structures instead of simply inlining the lists like I did.
The simple recursive graph algorithms map nicely to these data structures.
The idea of recursion for counting is a little hard to get, but let me try to explain at least that part.
So you know how strlen works, in C, right? You walk the array and keep a count. But here's a recursive version
unsigned int strlen(const char * string) {
if (*string == '\0') { return 0; }
else return 1 + strlen(string + 1);
}
Do you see how that works? Not that useful when walking an array where you can use a simple counter, but when you are dealing with issues where there are multiple possible combinations of doing things, or multiple ways of going, it works nicely. For example, if you wanted to count the number of nodes in a binary tree, you might do something like.
unsigned int treecount(NODE * node) {
if (node == NULL) { return 0;}
else return 1 + treecount(node->left) + treecount(node->right);
}
Hopefully that helps. Charlie Burns is probably right that doing it with a graph is a good idea.
Context
I'm currently implementing some form of the A* algorithm. I decided to use Boost's Fibonacci heap as the underlying priority queue.
My graph is being built while the algorithm runs. As the Vertex object I'm using:
class Vertex {
public:
Vertex(double, double);
double distance = std::numeric_limits<double>::max();
double heuristic = 0;
HeapData* fib;
Vertex* predecessor = nullptr;
std::vector<Edge*> adj;
double euclideanDistanceTo(Vertex* v);
};
My Edge looks like:
class Edge {
public:
Edge(Vertex*, double);
Vertex* vertex = nullptr;
double weight = 1;
};
In order to use Boost's Fibonacci heap, I've read that one should create a heap data object, which I did like this:
struct HeapData {
Vertex* v;
boost::heap::fibonacci_heap<HeapData>::handle_type handle;
HeapData(Vertex* u) {
v = u;
}
bool operator<(HeapData const& rhs) const {
return rhs.v->distance + rhs.v->heuristic < v->distance + v->heuristic;
}
};
Note that I included the heuristic and the actual distance in the comparator to get the A* behaviour I want.
My actual A* implementation looks like this:
boost::heap::fibonacci_heap<HeapData> heap;
HeapData fibs(startPoint);
startPoint->distance = 0;
startPoint->heuristic = getHeuristic(startPoint);
auto handles = heap.push(fibs);
(*handles).handle = handles;
while (!heap.empty()) {
HeapData u = heap.top();
heap.pop();
if (u.v->equals(endPoint)) {
return;
}
doSomeGraphCreationStuff(u.v); // this only creates vertices and edges
for (Edge* e : u.v->adj) {
double newDistance = e->weight + u.v->distance;
if (e->vertex->distance > newDistance) {
e->vertex->distance = newDistance;
e->vertex->predecessor = u.v;
if (!e->vertex->fib) {
if (!u.v->equals(endPoint)) {
e->vertex->heuristic = getHeuristic(e->vertex);
}
e->vertex->fib = new HeapData(e->vertex);
e->vertex->fib->handle = heap.push(*(e->vertex->fib));
}
else {
heap.increase(e->vertex->fib->handle);
}
}
}
}
Problem
The algorithm runs just fine if I use a very small heuristic (which degenerates A* to Dijkstra). If I introduce a stronger heuristic, however, the program throws an exception stating:
0xC0000005: Access violation writing location 0x0000000000000000.
in the unlink method of Boost's circular_list_algorithm.hpp. For some reason, next and prev are null. This is a direct consequence of calling heap.pop().
Note that heap.pop() works fine several times and does not crash immediately.
Question
What causes this problem and how can I fix it?
What I have tried
My first thought was that I accidentally called increase() even though distance + heuristic got bigger instead of smaller (according to the documentation, this can break stuff). This is not possible in my implementation, however, because I can only change a node if the distance got smaller. The heuristic stays constant. I tried to use update() instead of increase() anyway, without success.
I tried to set several breakpoints to get a more detailed view, but my data set is huge and I fail to reproduce the problem with smaller sets.
Additional Information
Boost Version: 1.76.0
C++14
The increase function is indeed right (instead of a decrease function), since all Boost heaps are implemented as max-heaps. We get a min-heap by reversing the comparator and swapping the roles of increase/decrease.
Okay, prepare for a ride.
First I found a bug
Next, I fully reviewed, refactored and simplified the code
When the dust settled, I noticed a behaviour change that looked like a potential logic error in the code
1. The Bug
As I commented on the question, the code complexity is high due to over-reliance on raw pointers without clear semantics.
While I was reviewing and refactoring the code, I found that this has, indeed, led to a bug:
e->vertex->fib = new HeapData(e->vertex);
e->vertex->fib->handle = heap.push(*(e->vertex->fib));
In the first line you create a HeapData object. You make the fib member point to that object.
The second line inserts a copy of that object (meaning, it's a new object, with a different object identity, or practically speaking: a different address).
So, now
e->vertex->fib points to a (leaked) HeapData object that does not exist in the queue, and
the actual queued HeapData copy has a default-constructed handle member, which means that the handle wraps a null pointer. (Check boost::heap::detail::node_handle<> in detail/stable_heap.hpp to verify this).
This would handsomely explain the symptom you are seeing.
2. Refactor, Simplify
So, after understanding the code I have come to the conclusion that
HeapData and Vertex should be merged: HeapData only served to link a handle to a Vertex, but you can already make the Vertex contain a Handle directly.
As a consequence of this merge
your vertex queue now actually contains vertices, expressing intent of the code
you reduce all of the vertex access by one level of indirection (reducing Law Of Demeter violations)
you can write the push operation in one natural line, removing the room for your bug to crop up. Before:
target->fib = new HeapData(target);
target->fib->handle = heap.push(*(target->fib));
After:
target->fibhandle = heap.push(target);
Your Edge class doesn't actually model an edge, but rather an "adjacency" - the target
part of the edge, with the weight attribute.
I renamed it OutEdge for clarity and also changed the vector to contain values instead of
dynamically allocated OutEdge instances.
I can't tell from the code shown, but I can almost guarantee these were
being leaked.
Also, OutEdge is only 16 bytes on most platforms, so copying them will be fine, and adjacencies are by definition owned by their source vertex (because including/moving it to another source vertex would change the meaning of the adjacency).
In fact, if you're serious about performance, you may want to use a boost::container::small_vector with a suitably chosen capacity if you know that e.g. the median number of edges is "small"
Your comparison can be "outsourced" to a function object
using Node = Vertex*;
struct PrioCompare {
bool operator()(Node a, Node b) const;
};
After which the heap can be typed as:
namespace bh = boost::heap;
using Heap = bh::fibonacci_heap<Node, bh::compare<PrioCompare>>;
using Handle = Heap::handle_type;
Your cost function violated more Law-Of-Demeter, which was easily fixed by adding a Literate-Code accessor:
Cost cost() const { return distance + heuristic; }
From quick inspection I think it would be more accurate to use infinity() over max() as initial distance. Also, use a constant for readability:
static constexpr auto INF = std::numeric_limits<Cost>::infinity();
Cost distance = INF;
You had a repeated check for xyz->equals(endPoint) to avoid updating the heuristic for a vertex. I suggest moving the update till after vertex dequeue, so the repetition can be gone (of both the check and the getHeuristic(...) call).
Like you said, we need to tread carefully around the increase/update fixup methods. As I read your code, the priority of a node is inversely related to the "cost" (cumulative edge-weight and heuristic values).
Because Boost Heap heaps are max heaps this implies that increasing the priority should match decreasing cost. We can just assert this to detect any programmer error in debug builds:
assert(target->cost() < previous_cost);
heap.increase(target->fibhandle);
With these changes in place, the code can read a lot quieter:
Cost AStarSearch(Node start, Node destination) {
Heap heap;
start->distance = 0;
start->fibhandle = heap.push(start);
while (!heap.empty()) {
Node u = heap.top();
heap.pop();
if (u->equals(destination)) {
return u->cost();
}
u->heuristic = getHeuristic(u);
doSomeGraphCreationStuff(u);
for (auto& [target, weight] : u->adj) {
auto curDistance = weight + u->distance;
// if cheaper route, queue or update queued
if (curDistance < target->distance) {
auto cost_prior = target->cost();
target->distance = curDistance;
target->predecessor = u;
if (target->fibhandle == NOHANDLE) {
target->fibhandle = heap.push(target);
} else {
assert(target->cost() < cost_prior);
heap.update(target->fibhandle);
}
}
}
}
return INF;
}
2(b) Live Demo
Adding some test data:
Live On Coliru
#include <boost/heap/fibonacci_heap.hpp>
#include <cassert>
#include <iostream>
#include <limits>
#include <vector>
using Cost = double;
struct Vertex;
Cost getHeuristic(Vertex const*) { return 0; }
void doSomeGraphCreationStuff(Vertex const*) {
// this only creates vertices and edges
}
struct OutEdge { // adjacency from implied source vertex
Vertex* target = nullptr;
Cost weight = 1;
};
namespace bh = boost::heap;
using Node = Vertex*;
struct PrioCompare {
bool operator()(Node a, Node b) const;
};
using Heap = bh::fibonacci_heap<Node, bh::compare<PrioCompare>>;
using Handle = Heap::handle_type;
static const Handle NOHANDLE{}; // for expressive comparisons
static constexpr auto INF = std::numeric_limits<Cost>::infinity();
struct Vertex {
Vertex(Cost d = INF, Cost h = 0) : distance(d), heuristic(h) {}
Cost distance = INF;
Cost heuristic = 0;
Handle fibhandle{};
Vertex* predecessor = nullptr;
std::vector<OutEdge> adj;
Cost cost() const { return distance + heuristic; }
Cost euclideanDistanceTo(Vertex* v);
bool equals(Vertex const* u) const { return this == u; }
};
// Now Vertex is a complete type, implement comparison
bool PrioCompare::operator()(Node a, Node b) const {
return a->cost() > b->cost();
}
Cost AStarSearch(Node start, Node destination) {
Heap heap;
start->distance = 0;
start->fibhandle = heap.push(start);
while (!heap.empty()) {
Node u = heap.top();
heap.pop();
if (u->equals(destination)) {
return u->cost();
}
u->heuristic = getHeuristic(u);
doSomeGraphCreationStuff(u);
for (auto& [target, weight] : u->adj) {
auto curDistance = weight + u->distance;
// if cheaper route, queue or update queued
if (curDistance < target->distance) {
auto cost_prior = target->cost();
target->distance = curDistance;
target->predecessor = u;
if (target->fibhandle == NOHANDLE) {
target->fibhandle = heap.push(target);
} else {
assert(target->cost() < cost_prior);
heap.update(target->fibhandle);
}
}
}
}
return INF;
}
int main() {
// a very very simple graph data structure with minimal helpers:
std::vector<Vertex> graph(10);
auto node = [&graph](int id) { return &graph.at(id); };
auto id = [&graph](Vertex const* node) { return node - graph.data(); };
// defining 6 edges
graph[0].adj = {{node(2), 1.5}, {node(3), 15}};
graph[2].adj = {{node(4), 2.5}, {node(1), 5}};
graph[1].adj = {{node(7), 0.5}};
graph[7].adj = {{node(3), 0.5}};
// do a search
Node startPoint = node(0);
Node endPoint = node(7);
Cost cost = AStarSearch(startPoint, endPoint);
std::cout << "Overall cost: " << cost << ", reverse path: \n";
for (Node node = endPoint; node != nullptr; node = node->predecessor) {
std::cout << " - " << id(node) << " distance " << node->distance
<< "\n";
}
}
Prints
Overall cost: 7, reverse path:
- 7 distance 7
- 1 distance 6.5
- 2 distance 1.5
- 0 distance 0
3. The Plot Twist: Lurking Logic Errors?
I felt uneasy about moving the getHeuristic() update around. I wondered
whether I might have changed the meaning of the code, even though the control
flow seemed to check out.
And then I realized that the behaviour had indeed changed. It is subtle. At first I thought
the old behaviour was just problematic. So, let's analyze:
The source of the risk is an inconsistency in node visitation vs. queue prioritization.
When visiting nodes, the condition to see whether the target became "less
distant" is expressed in terms of distance only.
However, the queue priority will be based on cost, which is different
from distance in that it includes any heuristics.
The problem lurking there is that it is possible to write code where the
fact that distance decreases NEED NOT guarantee that cost decreases.
Going back to the code, we can see that this is narrowly avoided, because the
getHeuristic update is only executed in the non-update path of the code.
In Conclusion
Understanding this made me realize that
the Vertex::heuristic field is intended merely as a "cached" version of the getHeuristic() function call
implying that that function is treated as if it is idempotent
that my version did change behaviour in that getHeuristic was now
potentially executed more than once for the same vertex (if visited again
via a cheaper path)
I would suggest to fix this by
renaming the heuristic field to cachedHeuristic
making an enqueue function to encapsulate the three steps for enqueuing a vertex:
simply omitting the endpoint check because it can at MOST eliminate a single invocation of getHeuristic for that node, probably not worth the added complexity
add a comment pointing out the subtlety of that code path
UPDATE: as discovered in the comments, we also need the inverse
operation (dequeue) to symmetrically update the handle so it reflects that
the node is no longer in the queue...
It also drives home the usefulness of having the precondition assert added before invoking Heap::increase.
Final Listing
With the above changes
encapsulated into a Graph object, that
also reads the graph from input like:
0 2 1.5
0 3 15
2 4 2.5
2 1 5
1 7 0.5
7 3 0.5
Where each line contains (source, target, weight).
A separate file can contain heuristic values for vertices index [0, ...),
optionally newline-separated, e.g. "7 11 99 33 44 55"
and now returning the arrived-at node instead of its cost only
Live On Coliru
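#include <boost/heap/fibonacci_heap.hpp>
#include <algorithm>
#include <cassert>
#include <deque>
#include <fstream>
#include <iostream>
#include <iterator>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>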
using Cost = double;
struct Vertex;
struct OutEdge { // adjacency from implied source vertex
Vertex* target = nullptr;
Cost weight = 1;
};
namespace bh = boost::heap;
using Node = Vertex*;
struct PrioCompare {
bool operator()(Node a, Node b) const;
};
using MutableQueue = bh::fibonacci_heap<Node, bh::compare<PrioCompare>>;
using Handle = MutableQueue::handle_type;
static const Handle NOHANDLE{}; // for expressive comparisons
static constexpr auto INF = std::numeric_limits<Cost>::infinity();
struct Vertex {
Vertex(Cost d = INF, Cost h = 0) : distance(d), cachedHeuristic(h) {}
Cost distance = INF;
Cost cachedHeuristic = 0;
Handle handle{};
Vertex* predecessor = nullptr;
std::vector<OutEdge> adj;
Cost cost() const { return distance + cachedHeuristic; }
Cost euclideanDistanceTo(Vertex* v);
};
// Now Vertex is a complete type, implement comparison
bool PrioCompare::operator()(Node a, Node b) const {
return a->cost() > b->cost();
}
class Graph {
std::vector<Cost> _heuristics;
Cost getHeuristic(Vertex* v) {
size_t n = id(v);
return n < _heuristics.size() ? _heuristics[n] : 0;
}
void doSomeGraphCreationStuff(Vertex const*) {
// this only creates vertices and edges
}
public:
Graph(std::string edgeFile, std::string heurFile) {
{
std::ifstream stream(heurFile);
_heuristics.assign(std::istream_iterator<Cost>(stream), {});
if (!stream.eof())
throw std::runtime_error("Unexpected heuristics");
}
std::ifstream stream(edgeFile);
size_t src, tgt;
double weight;
while (stream >> src >> tgt >> weight) {
_nodes.resize(std::max({_nodes.size(), src + 1, tgt + 1}));
_nodes[src].adj.push_back({node(tgt), weight});
}
if (!stream.eof())
throw std::runtime_error("Unexpected input");
}
Node search(size_t from, size_t to) {
assert(from < _nodes.size());
assert(to < _nodes.size());
return AStar(node(from), node(to));
}
size_t id(Node node) const {
// ugh, this is just for "pretty output"...
for (size_t i = 0; i < _nodes.size(); ++i) {
if (node == &_nodes[i])
return i;
}
throw std::out_of_range("id");
};
Node node(int id) { return &_nodes.at(id); };
private:
// simple graph data structure with minimal helpers:
std::deque<Vertex> _nodes; // reference stable when growing at the back
// search state
MutableQueue _queue;
void enqueue(Node n) {
assert(n && (n->handle == NOHANDLE));
// get heuristic before insertion!
n->cachedHeuristic = getHeuristic(n);
n->handle = _queue.push(n);
}
Node dequeue() {
Node node = _queue.top();
node->handle = NOHANDLE;
_queue.pop();
return node;
}
Node AStar(Node start, Node destination) {
_queue.clear();
start->distance = 0;
enqueue(start);
while (!_queue.empty()) {
Node u = dequeue();
if (u == destination) {
return u;
}
doSomeGraphCreationStuff(u);
for (auto& [target, weight] : u->adj) {
auto curDistance = u->distance + weight;
// if cheaper route, queue or update queued
if (curDistance < target->distance) {
auto cost_prior = target->cost();
target->distance = curDistance;
target->predecessor = u;
if (target->handle == NOHANDLE) {
// also caches heuristic
enqueue(target);
} else {
// NOTE: avoid updating heuristic here, because it
// breaks the queue invariant if heuristic increased
// more than decrease in distance
assert(target->cost() < cost_prior);
_queue.increase(target->handle);
}
}
}
}
return nullptr;
}
};
int main() {
Graph graph("input.txt", "heur.txt");
Node arrival = graph.search(0, 7);
std::cout << "reverse path: \n";
for (Node n = arrival; n != nullptr; n = n->predecessor) {
std::cout << " - " << graph.id(n) << " cost " << n->cost() << "\n";
}
}
Again, printing the expected
reverse path:
- 7 cost 7
- 1 cost 17.5
- 2 cost 100.5
- 0 cost 7
Note how the heuristics changed the cost, but not the optimal path in this case.
Over the last week, I have implemented a Digraph by parsing an input file. The graph is guaranteed to have no cycles. I have successfully created the graph, used methods to return the number of vertices and edges, and performed a topological sort of the graph. The graph is composed of different major courses and their prereqs. Here is my graph setup:
class vertex{
public:
typedef std::pair<int, vertex*> ve;
std::vector<ve> adjacency;
std::string course;
vertex(std::string c){
course = c;
}
};
class Digraph{
public:
typedef std::map<std::string, vertex *> vmap;
vmap work;
typedef std::unordered_set<vertex*> marksSet;
marksSet marks;
typedef std::deque<vertex*> stack;
stack topo;
void dfs(vertex* vcur);
void addVertex(std::string&);
void addEdge(std::string& from, std::string& to, int cost);
int getNumVertices();
int getNumEdges();
void getTopoSort();
};
The implementation
//function to add vertices to the graph
void Digraph::addVertex(std::string& course){
vmap::iterator iter = work.begin();
iter = work.find(course);
if(iter == work.end()){
vertex *v;
v = new vertex(course);
work[course] = v;
return;
}
}
//method to add edges to the graph
void Digraph::addEdge(std::string& from, std::string& to, int cost){
vertex *f = (work.find(from)->second);
vertex *t = (work.find(to)->second);
std::pair<int, vertex *> edge = std::make_pair(cost, t);
f->adjacency.push_back(edge);
}
//method to return the number of vertices in the graph
int Digraph::getNumVertices(){
return work.size();
}
//method to return the number of edges in the graph
int Digraph::getNumEdges(){
int count = 0;
for (const auto & v : work) {
count += v.second->adjacency.size();
}
return count;
}
//recursive function used by the topological sort method
void Digraph::dfs(vertex* vcur) {
marks.insert(vcur);
for (const auto & adj : vcur->adjacency) {
vertex* suc = adj.second;
if (marks.find(suc) == marks.end()) {
this->dfs(suc);
}
}
topo.push_front(vcur);
}
//method to calculate and print out a topological sort of the graph
void Digraph::getTopoSort(){
marks.clear();
topo.clear();
for (const auto & v : work) {
if (marks.find(v.second) == marks.end()) {
this->dfs(v.second);
}
}
// Display it
for (const auto v : topo) {
std::cout << v->course << "\n";
}
}
For the last part of my implementation, I have been trying to do 2 things: find the shortest path from the first vertex to every other vertex, and find the shortest path that visits every vertex and returns to the first one. I am completely lost on this implementation. From my reading, I assumed I need to use Dijkstra's algorithm to implement this. I have been trying for the last 3 days to no avail. Did I set up my digraph in a way that makes these steps hard to implement? Any guidance is appreciated.
The fact that there are no cycles makes the problem much simpler. Finding the shortest paths and a minimal "grand tour" are O(n).
Implement Dijkstra and run it, without a "destination" node; just keep going until all nodes have been visited. Once every node has been marked (with its distance to the root), you can start at any node and follow the shortest (and only) path back to the root by always stepping to the only neighbor whose distance is less than this one. If you want, you can construct these paths quite easily as you go, and mark each node with the full path back to the root, but copying those paths can push the cost to O(n^2) if you're not careful.
And once all the nodes are marked, you can construct a minimal grand tour. Start at the root; when you visit a node, iterate over its unvisited neighbors (i.e. all but the one you just came from), visiting each, then go back to the one you came from. (I can put this with more mathematical rigor, or give an example, if you like.)
I use Justin Heyes-Jones' implementation of the A* algorithm. My heuristic function is just Euclidean distance. The attached drawing (sorry for the bad quality) describes a specific situation: let's say we are going from node 1 to node 2. The shortest way would go through the nodes a - b - c - d - e. But the step-by-step A* with the Euclidean heuristic gives us the way through the following nodes: a - b' - c' - d' - e, and I understand why this happens. But what do I have to do to make it return the shortest path?
false shortest path finding by the astar
The code for the real road map import:
#include "search.h"
class ArcList;
class MapNode
{
public:
int x, y; // node coordinates
MapNode();
MapNode(int X, int Y);
float Get_h( const MapNode & Goal_node );
bool GetNeighbours( AStarSearch<MapNode> *astarsearch, MapNode *parent_node );
bool IsSamePosition( const MapNode &rhs );
void PrintNodeInfo() const;
bool operator == (const MapNode & other) const;
void setArcList( ArcList * list );
private:
ArcList * list;
};
class Arc
{
public:
MapNode A1;
MapNode B1;
Arc(const MapNode & a, const MapNode & b);
};
class ArcList
{
public:
void setArcs( const std::vector<Arc> & arcs );
void addArc( const Arc & arc );
size_t size() const;
bool addNeighbours( AStarSearch<MapNode> * astarsearch, const MapNode & neighbour );
private :
std::vector<Arc> arcs;
};
std::vector <MapNode> FindPath(const MapNode & StartNode, const MapNode & GoalNode)
{
AStarSearch<MapNode> astarsearch;
astarsearch.SetStartAndGoalStates( StartNode, GoalNode );
unsigned int SearchState;
unsigned int SearchSteps = 0;
do
{
if ( SearchSteps % 100 == 0)
std::cout << "making step " << SearchSteps << endl;
SearchState = astarsearch.SearchStep();
SearchSteps++;
}
while ( SearchState == AStarSearch<MapNode>::SEARCH_STATE_SEARCHING );
std::vector<MapNode> S;
if ( SearchState == AStarSearch<MapNode>::SEARCH_STATE_SUCCEEDED )
{
int steps = 0;
for ( MapNode * node = astarsearch.GetSolutionStart(); node != 0; node = astarsearch.GetSolutionNext() )
{
S.push_back(*node);
// node->PrintNodeInfo();
}
astarsearch.FreeSolutionNodes();
}
else if ( SearchState == AStarSearch<MapNode>::SEARCH_STATE_FAILED )
{
throw " SEARCH_FAILED ";
}
return S;
}
Function FindPath gives me the vector of the result nodes.
Here is addNeighbours method:
bool ArcList::addNeighbours( AStarSearch<MapNode> * astarsearch, const MapNode & target )
{
assert(astarsearch != 0);
bool found = false;
for (size_t i = 0; i < arcs.size(); i++ )
{
Arc arc = arcs.at(i);
if (arc.A1 == target)
{
found = true;
astarsearch->AddSuccessor( arc.B1 );
}
else if (arc.B1 == target )
{
found = true;
astarsearch->AddSuccessor( arc.A1 );
}
}
return found;
}
and get_h method:
float MapNode::Get_h( const MapNode & Goal_node )
{
float dx = x - Goal_node.x;
float dy = y - Goal_node.y;
return ( dx * dx + dy * dy );
}
I know that it's not the exact distance (no square root is taken here); this is done to save some machine resources when evaluating.
When you are using A* graph search, i.e. you only consider the first visit to a node and disregard the future visits, this can in fact happen when your heuristic is not consistent. If the heuristic is not consistent and you use a graph search, (you keep a list of visited states and if you have already encountered a state, you do not expand it again), your search doesn't give the correct answer.
However, when you use A* tree search with an admissible heuristic, you should get the correct answer. The difference in tree and graph search is that in the tree search you expand the state every time you encounter it. Hence even if at first your algorithm decides to take the longer b', c', d' path, later it returns to a, expands it again and finds out that the b, c, d path is in fact shorter.
Hence my advice is, either to use the tree search instead of the graph search, or choose a consistent heuristic.
For definition of consistent, see for example: Wikipedia: A* Search Algorithm
EDIT: While the above is still true, this heuristic is indeed consistent, I apologise for the confusion.
EDIT2: While the heuristic itself is admissible and consistent, the implementation wasn't. For the sake of performance you decided not to do the square root and that made your heuristic inadmissible and it was the reason why you got wrong results.
For the future, it is always better to first implement your algorithms as naively as possible. It usually helps to keep them more readable and they are less prone to bugs. If there are bugs, they are easier to spot. Hence, my final advice would be don't optimise unless you need it, or unless everything else is working well. Otherwise you may get into troubles.
It seems that taking the square root in the get_h method has solved it. It turns out that my heuristic wasn't admissible (at least I think this explains it). Special thanks to Laky and justinhj for helping out with this!
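For anyone hitting the same problem, here is a minimal sketch of the two heuristics side by side. It assumes edge costs are true straight-line lengths; Point2 is a hypothetical stand-in for MapNode, and the function names are mine, not part of the A* library used above.

```cpp
#include <cmath>

// Hypothetical stand-in for MapNode; only the coordinates matter here.
struct Point2 {
    float x;
    float y;
};

// Squared distance: cheap, but it overestimates the true distance
// whenever the nodes are more than 1 unit apart, so it is not admissible.
float squaredDistance(const Point2& a, const Point2& b) {
    float dx = a.x - b.x;
    float dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// True Euclidean distance: never exceeds the real path cost when edge
// costs are straight-line lengths, so it is an admissible heuristic.
float euclideanDistance(const Point2& a, const Point2& b) {
    return std::sqrt(squaredDistance(a, b));
}
```

For the points (0, 0) and (3, 4) the straight-line distance is 5, but the squared version reports 25, so A* would believe the goal is five times further away than it really is.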
I have a class(object), User. This user has 2 private attributes, "name" and "popularity". I store the objects into a vector (container).
From the container, I need to find the top 5 most popular users. How do I do that? (I have some ugly code, which I will post here; if you have a better approach, please let me know. Feel free to use another container if you think vector is not a good choice, but please use only map or multimap, list, vector or array, because I only know how to use these.) My current code is:
int top5 = 0, top4 = 0, top3 = 0, top2 = 0, top1 = 0;
vector<User>::iterator it;
for (it = user.begin(); it != user.end(); ++it)
{
if( it->getPopularity() > top5){
if(it->getPopularity() > top4){
if(it->getPopularity() > top3){
if(it->getPopularity() > top2){
if(it->getPopularity() > top1){
top1 = it->getPopularity();
continue;
} else {
top2 = it->getPopularity();
continue;
}
} else {
top3 = it->getPopularity();
continue;
}
}
} else {
top4 = it->getPopularity();
continue;
}
} else {
top5 = it->getPopularity();
continue;
}
}
I know the code is ugly and might be prone to error, so if you have better code, please do share with us (us == cpp newbie). Thanks
You can use the std::partial_sort algorithm to sort your vector so that the first five elements are sorted and the rest remains unsorted. Something like this (untested code):
bool compareByPopularity( User a, User b ) {
return a.getPopularity() > b.getPopularity();
}
// users is taken by value so the caller's vector is left untouched
vector<User> getMostPopularUsers( vector<User> users, size_t num ) {
if ( users.size() <= num ) {
// fewer users than requested: sort and return them all
sort( users.begin(), users.end(), compareByPopularity );
return users;
}
partial_sort( users.begin(), users.begin() + num, users.end(),
compareByPopularity );
return vector<User>( users.begin(), users.begin() + num );
}
Why don't you sort (std::sort or your own implementation of Quick Sort) the vector based on popularity and take the first 5 values?
Example:
bool UserCompare(User a, User b) { return a.getPopularity() > b.getPopularity(); }
...
std::sort(user.begin(), user.end(), UserCompare);
// Print first 5 users
If you just want the top 5 most popular users, then use std::partial_sort().
class User
{
private:
string name_m;
int popularity_m;
public:
User(const string& name, int popularity) : name_m(name), popularity_m(popularity) { }
friend ostream& operator<<(ostream& os, const User& user)
{
return os << "name:" << user.name_m << "|popularity:" << user.popularity_m << "\n";
}
int Popularity() const
{
return popularity_m;
}
};
bool Compare(const User& lhs, const User& rhs)
{
return lhs.Popularity() > rhs.Popularity();
}
int main()
{
// users is assumed to be a vector<User> that already holds at least 5 elements.
// c++0x. ignore if you don't want it.
auto compare = [](const User& lhs, const User& rhs) -> bool
{ return lhs.Popularity() > rhs.Popularity(); };
partial_sort(users.begin(), users.begin() + 5, users.end(), Compare);
copy(users.begin(), users.begin() + 5, ostream_iterator<User>(std::cout, "\n"));
}
First off, cache that it->getPopularity() so you don't have to keep repeating it.
Secondly (and this is much more important): Your algorithm is flawed. When you find a new top1 you have to push the old top1 down to the #2 slot before you save the new top1, but before you do that you have to push the old top2 down to the #3 slot, etc. And that is just for a new top1. You are going to have to do something similar for a new top2, a new top3, etc. The only one you can paste in without worrying about pushing things down the list is when you get a new top5. The correct algorithm is hairy. That said, the correct algorithm is much easier to implement when your topN is an array rather than a bunch of separate values.
Thirdly (and this is even more important than the second point): You shouldn't care about performance, at least not initially. The easy way to do this is to sort the entire list and pluck the first five off the top. If this suboptimal but simple algorithm doesn't affect your performance, done. Don't bother with the ugly but fast first N algorithm unless performance mandates that you toss the simple solution out the window.
Finally (and this is the most important point of all): That fast first N algorithm is only fast when the number of elements in the list is much, much larger than five. The default sort algorithm is pretty dang fast. It has to be wasting a lot of time sorting the dozens / hundreds of items you don't care about before a pushdown first N algorithm becomes advantageous. In other words, that pushdown insertion sort algorithm may well be a case of premature disoptimization.
Sort your objects, maybe with the library if this is allowed, and then simply select the first 5 elements. If your container gets too big you could probably use a std::list for the job.
Edit : #itsik you beat me to the sec :)
Try this pseudocode:
Declare top5 as an array of int[5], sorted in descending order // or use a min-heap
Initialize top5 as 5 -INF
For each element A
if A > top5[4] // or A > root-of-top5
Remove top5[4] from top5 // or pop min element from heap
Insert A into top5, keeping it sorted // or insert A to the heap
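The min-heap variant of the pseudocode can be sketched like this. It is only an illustration working on plain popularity values; the function name topN is hypothetical, and for User objects you would store the objects (or their indices) with a comparator on popularity instead of ints.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Returns the n largest values in descending order.
std::vector<int> topN(const std::vector<int>& values, std::size_t n) {
    // Min-heap: the smallest of the current top n sits at the root.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int v : values) {
        if (heap.size() < n) {
            heap.push(v);  // still filling the first n slots
        } else if (n > 0 && v > heap.top()) {
            heap.pop();    // evict the weakest of the current top n
            heap.push(v);
        }
    }
    // Popping yields ascending order, so fill the result back to front.
    std::vector<int> result(heap.size());
    for (std::size_t i = result.size(); i-- > 0;) {
        result[i] = heap.top();
        heap.pop();
    }
    return result;
}
```

Each element costs at most one heap operation on a heap of size n, so the whole pass is O(m log n) for m elements, and the input vector is never reordered.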
Well, I advise you to improve your code by using an array, list, or vector to store the top five, like this:
struct TopRecord
{
int index;
int pop;
} Top5[5];
for(int i = 0; i<5; i++)
{
Top5[i].index = -1;
// Set pop to a value low enough
Top5[i].pop = -1;
}
for(size_t i = 0; i < users.size(); i++)
{
int currentpop = users[i].getPopularity();
int currentindex = (int)i;
// Walk down the list; whenever the carried entry beats slot j,
// swap them so the displaced entry gets pushed further down.
for(int j = 0; j < 5; j++)
{
if(Top5[j].pop < currentpop)
{
std::swap(Top5[j].pop, currentpop);
std::swap(Top5[j].index, currentindex);
}
}
}
You may also consider Randomized Select if your aim is performance: it is designed for order statistics and runs in expected linear time, so you just need to run it 5 times. Or use the partial_sort solution provided above; either way works, depending on your aim.
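If you would rather not write Randomized Select by hand, the standard library's std::nth_element does a selection of the same kind (average linear time). A minimal sketch on plain popularity values, with a hypothetical function name topNBySelection; it is a swapped-in library call, not Randomized Select itself:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Returns the n largest values in descending order.
// Takes the vector by value so the caller's data is not reordered.
std::vector<int> topNBySelection(std::vector<int> values, std::size_t n) {
    if (n < values.size()) {
        // Selection step: afterwards the n largest values occupy
        // [begin, begin + n), in no particular order.
        std::nth_element(values.begin(), values.begin() + n, values.end(),
                         std::greater<int>());
        values.resize(n);
    }
    // Sort only the survivors (at most n of them).
    std::sort(values.begin(), values.end(), std::greater<int>());
    return values;
}
```

A single selection call replaces running the select five times, and only the five survivors get sorted afterwards.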
I need to use (not implement) an array-based version of Dijkstra's algorithm. The task is: given a set of line segments (obstacles) and start/end points, I have to find and draw the shortest path between the start and end points. I have done the calculating part etc., but I don't know how to use Dijkstra's with my code. My code is as follows:
class Point
{
public:
int x;
int y;
Point()
{
}
void CopyPoint(Point p)
{
this->x=p.x;
this->y=p.y;
}
};
class NeighbourInfo
{
public:
Point x;
Point y;
double distance;
NeighbourInfo()
{
distance=0.0;
}
};
class LineSegment
{
public:
Point Point1;
Point Point2;
NeighbourInfo neighbours[100];
LineSegment()
{
}
};
void main()//in this I use my classes and some code to fill out the datastructure
{
int NoOfSegments=i;
for(int j=0;j<NoOfSegments;j++)
{
for(int k=0;k<NoOfSegments;k++)
{
if( SimpleIntersect(segments[j],segments[k]) )
{
segments[j].neighbours[k].distance=INFINITY;
segments[j].neighbours[k].x.CopyPoint(segments[k].Point1);
segments[j].neighbours[k].y.CopyPoint(segments[k].Point2);
cout<<"Intersect"<<endl;
cout<<segments[j].neighbours[k].distance<<endl;
}
else
{
segments[j].neighbours[k].distance=
EuclidianDistance(segments[j].Point1.x,segments[j].Point1.y,segments[k].Point2.x,segments[k].Point2.y);
segments[j].neighbours[k].x.CopyPoint(segments[k].Point1);
segments[j].neighbours[k].y.CopyPoint(segments[k].Point2);
}
}
}
}
Now I have the distances from each segment to all other segments, and using this data (in NeighbourInfo) I want to use array-based Dijkstra's (a restriction) to trace out the shortest path between the start and end points. There is more code, but I have shortened the problem for the ease of the reader.
Please help! Thanks, and please no .NET libraries/code, as I am using core C++ only. Thanks in advance.
But I need the array-based version (strictly); I am not supposed to use any other implementation.
Dijkstras
This is how Dijkstra's works:
It's not a simple algorithm, so you will have to map it to your own code.
But good luck.
List<Nodes> found; // All processed nodes;
List<Nodes> front; // All nodes that have been reached (but not processed)
// This list is sorted by the cost of getting to this node.
List<Nodes> remaining; // All nodes that have not been explored.
remaining.remove(startNode);
front.add(startNode);
startNode.setCost(0); // Cost nothing to get to start.
while(!front.empty())
{
Node current = front.getFirstNode();
front.remove(current);
found.add(current);
if (current == endNode)
{ return current.cost(); // we found the end
}
List<Edge> edges = current.getEdges();
for(loop = edges.begin(); loop != edges.end(); ++loop)
{
Node dst = loop.getDst(); // destination of the edge currently being examined
if (found.find(dst) != found.end())
{ continue; // If we have already processed this node ignore it.
}
// The cost to get here. Is the cost to get to the last node.
// Plus the cost to traverse the edge.
int cost = current.cost() + loop.cost();
Node f = front.find(dst);
if (f != front.end())
{
f.setCost(std::min(f.cost(), cost));
continue; // If the node is on the front line just update the cost
// Then continue with the next node.
}
// Its a new node.
// remove it from the remaining and add it to the front (setting the cost).
remaining.remove(dst);
front.add(dst);
dst.setCost(cost);
}
}
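Since the question insists on an array-based version, here is a minimal sketch of the same idea using only fixed-size arrays and an adjacency matrix. The names N, INF, cost, dist and prev are all assumptions of mine; in the asker's program the matrix would presumably be filled from neighbours[k].distance, with an infinite cost wherever a segment blocks the way.

```cpp
#include <limits>

const int N = 5;  // hypothetical number of nodes
const double INF = std::numeric_limits<double>::infinity();

// Array-based Dijkstra on an adjacency matrix.
// cost[u][v] is the edge weight from u to v, or INF if there is no usable edge.
// On return dist[v] is the shortest distance from start to v and prev[v] is
// the node before v on that path (-1 for the start and for unreachable nodes).
void dijkstra(const double cost[N][N], int start, double dist[N], int prev[N]) {
    bool done[N];
    for (int v = 0; v < N; ++v) {
        dist[v] = INF;
        prev[v] = -1;
        done[v] = false;
    }
    dist[start] = 0.0;

    for (int step = 0; step < N; ++step) {
        // Linear scan for the closest node that is not yet finalised.
        int u = -1;
        for (int v = 0; v < N; ++v) {
            if (!done[v] && (u == -1 || dist[v] < dist[u])) {
                u = v;
            }
        }
        if (u == -1 || dist[u] == INF) {
            break;  // everything still pending is unreachable
        }
        done[u] = true;

        // Relax every edge leaving u.
        for (int v = 0; v < N; ++v) {
            if (!done[v] && cost[u][v] != INF && dist[u] + cost[u][v] < dist[v]) {
                dist[v] = dist[u] + cost[u][v];
                prev[v] = u;
            }
        }
    }
}
```

To draw the path, start at the end node and follow prev[] back until you reach -1. The linear scan makes this O(N^2), which is the usual trade-off of the array version against a priority-queue one.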