I have a wireless mesh network of nodes, each of which is capable of reporting its 'distance' to its neighbors, measured in (simplified) signal strength to them. The nodes are geographically in 3d space but because of radio interference, the distance between nodes need not be trigonometrically (trigonomically?) consistent. I.e., given nodes A, B and C, the distance between A and B might be 10, between A and C also 10, yet between B and C 100.
What I want to do is visualize the logical network layout in terms of connectness of nodes, i.e. include the logical distance between nodes in the visual.
So far my research has shown the multidimensional scaling (MDS) is designed for exactly this sort of thing. Given that my data can be directly expressed as a 2d distance matrix, it's even a simpler form of the more general MDS.
Now, there seem to be many MDS algorithms, see e.g. http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html and http://tapkee.lisitsyn.me/ . I need to do this in C++ and I'm hoping I can use a ready-made component, i.e. not have to re-implement an algo from a paper. So, I thought this: https://sites.google.com/site/simpmatrix/ would be the ticket. And it works, but:
The layout is not stable, i.e. every time the algorithm is re-run, the position of the nodes changes (see differences between image 1 and 2 below - this is from having been run twice, without any further changes). This is due to the initialization matrix (which contains the initial location of each node, which the algorithm then iteratively corrects) that is passed to this algorithm - I pass an empty one and then the implementation derives a random one. In general, the layout does approach the layout I expected from the given input data. Furthermore, between different runs, the direction of nodes (clockwise or counterclockwise) can change. See image 3 below.
The 'solution' I thought was obvious, was to pass a stable default initialization matrix. But when I put all nodes initially in the same place, they're not moved at all; when I put them on one axis (node 0 at 0,0 ; node 1 at 1,0 ; node 2 at 2,0 etc.), they are moved along that axis only. (see image 4 below). The relative distances between them are OK, though.
So it seems like this algorithm only changes distance between nodes, but doesn't change their location.
Thanks for reading this far - my questions are (I'd be happy to get just one or a few of them answered as each of them might give me a clue as to what direction to continue in):
Where can I find more information on the properties of each of the many MDS algorithms?
Is there an algorithm that derives the complete location of each node in a network, without having to pass an initial position for each node?
Is there a solid way to estimate the location of each point so that the algorithm can then correctly scale the distance between them? I have no geographic location of each of these nodes, that is the whole point of this exercise.
Are there any algorithms to keep the 'angle' at which the network is derived constant between runs?
If all else fails, my next option is going to be to use the algorithm I mentioned above, increase the number of iterations to keep the variability between runs at around a few pixels (I'd have to experiment with how many iterations that would take), then 'rotate' each node around node 0 to, for example, align nodes 0 and 1 on a horizontal line from left to right; that way, I would 'correct' the location of the points after their relative distances have been determined by the MDS algorithm. I would have to correct for the order of connected nodes (clockwise or counterclockwise) around each node as well. This might become hairy quite quickly.
Obviously I'd prefer a stable algorithmic solution - increasing iterations to smooth out the randomness is not very reliable.
Thanks.
EDIT: I was referred to cs.stackexchange.com and some comments have been made there; for algorithmic suggestions, please see https://cs.stackexchange.com/questions/18439/stable-multi-dimensional-scaling-algorithm .
Image 1 - with random initialization matrix:
Image 2 - after running with same input data, rotated when compared to 1:
Image 3 - same as previous 2, but nodes 1-3 are in another direction:
Image 4 - with the initial layout of the nodes on one line, their position on the y axis isn't changed:
Most scaling algorithms effectively set "springs" between nodes, where the resting length of the spring is the desired length of the edge. They then attempt to minimize the energy of the system of springs. When you initialize all the nodes on top of each other though, the amount of energy released when any one node is moved is the same in every direction. So the gradient of energy with respect to each node's position is zero, so the algorithm leaves the node where it is. Similarly if you start them all in a straight line, the gradient is always along that line, so the nodes are only ever moved along it.
(That's a flawed explanation in many respects, but it works for an intuition)
Try initializing the nodes to lie on the unit circle, on a grid or in any other fashion such that they aren't all co-linear. Assuming the library algorithm's update scheme is deterministic, that should give you reproducible visualizations and avoid degeneracy conditions.
If the library is non-deterministic, either find another library which is deterministic, or open up the source code and replace the randomness generator with a PRNG initialized with a fixed seed. I'd recommend the former option though, as other, more advanced libraries should allow you to set edges you want to "ignore" too.
I have read the codes of the "SimpleMatrix" MDS library and found that it use a random permutation matrix to decide the order of points. After fix the permutation order (just use srand(12345) instead of srand(time(0))), the result of the same data is unchanged.
Obviously there's no exact solution in general to this problem; with just 4 nodes ABCD and distances AB=BC=AC=AD=BD=1 CD=10 you cannot clearly draw a suitable 2D diagram (and not even a 3D one).
What those algorithms do is just placing springs between the nodes and then simulate a repulsion/attraction (depending on if the spring is shorter or longer than prescribed distance) probably also adding spatial friction to avoid resonance and explosion.
To keep a "stable" diagram just build a solution and then only update the distances, re-using the current position from previous solution as starting point. Picking two fixed nodes and aligning them seems a good idea to prevent a slow drift but I'd say that spring forces never end up creating a rotational momentum and thus I'd expect that just scaling and centering the solution should be enough anyway.
Related
I'm programming my first game and I have one last problem to solve. I need an algorithm to check if I can move a chosen ball to a chosen place.
Look at this picture:
The rule is, if I picked up the blue ball on the white background (in the very middle) I can move it to all the green spaces and I can't move it to the purple ones, cause they are sort of fenced by other balls. I naturally can't move it to the places taken by other balls. The ball can only move up, down, left and right.
Now I am aware that there is two already existing algorithms: A* and Dijkstra's algorithm that might be helpful, but they seem too complex for what I need (both using vectors or stuff that I weren't taught yet, I'm quite new to programming and this is my semester project). I don't need to find the shortest way, I just need to know whether the chosen destination place is fenced by other balls or not.
My board in the game is 9x9 array simply filled with '/' if it's an empty place or one of the 7 letters if it's taken.
Is there a way I can code the algorithm in a simple way?
[I went for the flood fill and it works just fine, thank you for all your help and if someone has a similar problem - I recommend using flood fill, it's really simple and quick]
I suggest using Flood fill algorithm:
Flood fill, also called seed fill, is an algorithm that determines the
area connected to a given node in a multi-dimensional array. It is
used in the "bucket" fill tool of paint programs to fill connected,
similarly-colored areas with a different color, and in games such as
Go and Minesweeper for determining which pieces are cleared. When
applied on an image to fill a particular bounded area with color, it
is also known as boundary fill.
In terms of complexity time, this algorithm will be equals the recursive one: O(N×M), where N and M are the dimensions of the input matrix. The key idea is that in both algorithms each node is processed at most once.
In this link you can find a guide to the implementation of the algorithm.
More specifically, as Martin Bonner suggested, there are some key concepts for the implementation:
Mark all empty cells as unknown (all full cells are unreachable)
Add the source cell to a set of routable cells
While the set is not empty:
pop an element from the set;
mark all adjacent unknown cells as "reachable" and add them to the set
All remaining unknown cells are unreachable.
PS: You may want to read Flood fill vs DFS.
You can do this very simply using the BFS(Breadth First Search) algorithm.
For this you need to study the Graph data structure. Its pretty simple to implement, once you understand it.
Key Idea
Your cells will act as vertices, whereas the edges will tell whether or not you will be able to move from one cell to another.
Once you have implemented you graph using an adjacency list or adjacency matrix representation, you are good to go and use the BFS algorithm to do what you are trying to do.
I performed edge detection on images (with Python 2-7 and OpenCV 3.2) and have results like the following picture, i.e. one-pixel-wide edges not necessarily closed (can have "loose ends"), and with possible holes :
Now I would like to get the "derivative" of these edges, meaning the "slope" at each point, as in the following image :
For the moment, the only way I managed to do it is very locally. For each point of the edge (in red in next "zoomed" picture), I create a circle around it (in pink), mask the circle with the edge to get the red point's neighbors, then compute the slope of these two neighbors.
However, it can be quite messy if edges have holes (which they often do) or are close to other edges (which they often are) and masking all the points is pretty computationally intensive, so I wonder if there is a better way.
My first idea was spline interpolation, but you need to give as input an ordered list of points, which you can't have for a given edge unless you use a pixel neighbor tracking algorithm which can also get quite messy in case of not-that-good edges.
I also thought of findContours but it needs closed edges or else it yields the contour of a one-pixel-wide edge, i.e. two lines on both side of the edges, started at an arbitrary location on the edge, in short it's a mess.
Is there a cleaner and more efficient way than my actual method to achieve what I want ? Does OpenCV have any resources or is its job done after edge detection (I think the latter is more probable !) ?
P.S. : "I don't think there is a better way" is an answer I'm ready to accept !
So, if I understood everything correctly, what you need is an ordered list of your points free from holes, because after that it seems you know how to proceed to obtain your result. So, you should concentrate in getting an ordered gapless list.
FindContours does output an ordered list, but probably not in the order you need. It groups connected pixels with a TOP-DOWN / LEFT-RIGHT priority. So, it swipes each row sequentially, when it hits a white pixel, it finds the first contour. So, in your image, the first contour it finds is actually the one on the right, since it has the closest to 0 Y value.
In the case of this particular image, if you rotate it 90 degrees you'll realize that it will actually order your contours and points in the way you need. But will this always be the case? Only you can tell. If there is a pre-process method to apply to your images that will guarantee that findContours will order your pixels in the correct way, the rest will be easy. If not, I suggest you create your own pixel-connectivity algorithm that will work as you need it to, since all your problems depends on getting an ordered list.
Once you have the ordered list, just interpolate the missing pixels.
If you have an ordered set of pixels, "closing the gaps" is easy, since you just need to find the gaps and interpolate between them as an approximation that probably wont hurt your algorithm.
I have a few ordered points (less than 10) in 2D coordinates system.
I have an agent moving in the system and I want to find the shortest path between those points following their order.
For background the agent can be given a position to go to with a thrust, and my objective is to plot the fastest course given the fact that the agent has a maximum thrust and maximum angular velocity.
After some research I realized that I may be looking for a curve fitting algorithm, but I don't know the underlying function since the points are randomly distributed in the coordinates system.
Please, help me find a solution to this problem.
I am open to any suggestion, my prefered programming language being C++.
I'm sure there is a pure mathematical solution such as spacecraft trajectory optimization for example, but here is how I would consider approaching it from a programming/annealing perspective.
Even though you need a continuous path, you might start the path search with discreet steps and a tolerance for getting close enough to each point as a start.
Assign a certain amount of time to each step and vary applied angular force and thrust at each step.
Calculate resulting angular momentum and position for the start of the next step.
Select the parameters for the next step either with a random search, or iterate through each to find the closest to the next target point (quantize the selection of angles and thrust to begin with). Repeat steps until you are close enough to the next target point. Then repeat to get to the next point.
Once you have a rough path you can start refining it (perhaps use the rough point from the previous run as target points in the new one) by reducing the time/size of the steps and tolerance to target points.
When evaluating the parameters' fitness at each step you might want to consider that once you reach a target point you also want to perhaps have momentum in the direction of the next point. This should be an emergent property if multiple paths are considered and the fitness function considers shortest total time.
c++ could help here if you use the std::priority_queue and std::map for example to keep track of the potential paths.
I do have 4 lists of the x and y coordinates of calibration points. Those are in no particular order and not alligned on any axis (they come from a real calibration picture with slight rotation and distortion) but the lists have the same indexing and cannot be sorted in such a way that each list is ascending/descending. They also hold no integer values but floating point. I am now trying to find the four neighbouring points for a given point.
E.g. searching for the neighbours of the point [150,150] would return [140,140], [140,160], [160,140], [160,160] (except for them actually beeing more like [139.581239,138.28812]).
At the moment I have to look through all calibration points for each point to check. There are about 500 calibration points.
Later during the process, I need to know the 4 neighbours for a random point within the 1600x1400 grid for multiple million times. So it is crucial to find those points as fast as possible to avoid calculation time of days or even weeks.
My first approach was checking each of the ~500 calibration points for each point to check and look at their relative position to the checking point (x_calib > x and y_calib > y would be somewhere in the top, right region of the point) and calculate their distance to it. The closest point in each region (top left, top right, lower left, lower right) would then be the respective neighbour point. That seems not the be efficient at all and takes a lot of time.
The second approach was creating a rainbow table for each of the 1600x1400 points and save the respective neighbours (to be exact, to save the index in the list of coordinates). Later on, the process would check this rainbow table at position [x,y,0], [x,y,1], [x,y,2] and [x,y,3] to get the 4 indices of the 4 neighbour points. Though calculating the rainbow table takes some time (~20 minutes for those ~2 million points), this approach speeds up the later processing. Unfortunatelly, this approach makes it difficult to debug the later steps of the process because it takes this much time before the rest even starts..
I still think there should be room for optimization and I would appreciate any suggestion or help to speed up the whole thing. I allready read about the kd-tree thing but did not quite see the possibility to use it here. I'm hoping that there's an approach for this kind of unsorted (and unsortable) list of points which is more efficient than the rainbow table - or which is at least faster at creating the table.
Thanks in advance!
I have a rectangular grid shaped DAG where the horizontal edges always point right and the vertical edges always point down. The edges have positive costs associated with them. Because of the rectangular format, the nodes are being referred to using zero-based row/column. Here's an example graph:
Now, I want to perform a search. The starting vertex will always be in the left column (column with index 0) and in the upper half of the graph. This means I'll pick the start to be either (0,0), (1,0), (2,0), (3,0) or (4,0). The goal vertex is always in the right column (column with index 6) and "corresponds" to the start vertex:
start vertex (0,0) corresponds to goal vertex (5,6)
start vertex (1,0) corresponds to goal vertex (6,6)
start vertex (2,0) corresponds to goal vertex (7,6)
start vertex (3,0) corresponds to goal vertex (8,6)
start vertex (4,0) corresponds to goal vertex (9,6)
I only mention this to demonstrate that the goal vertex will always be reachable. It's possibly not very important to my actual question.
What I want to know is what search algorithm should I use to find the path from start to goal? I am using C++ and have access to the Boost Graph Library.
For those interested, I'm trying to implement Fuchs' suggestions from his Optimal Surface Reconstruction from Planar Contours paper.
I looked at A* but to be honest didn't understand it and wasn't how the heuristic works or even whether I could come up with one!
Because of the rectangular shape and regular edge directions, I figured there might be a well-suited algorithm. I considered Dijkstra
but the paper I mention said there were quicker algorithms (but annoyingly for me doesn't provide an implementation), plus that's
single-source and I think I want single-pair.
So, this is your problem,
DAG no cycles
weights > 0
Left weight < Right weight
You can use a simple exhaustive search defining every possible route. So you have a O(NxN) algoririthm. And then you will choose the shortest path. It is not a very smart solution, but it is effective.
But I suppose you want to be smarter than that, let's consider that if a particular node can be reached from two nodes, you can find the minimum of the weights at the two nodes plus the cost for arriving to the current node. You can consider this as an extension of the previous exhaustive search.
Remember that a DAG can be drawn in a line. For DAG linearization here's an interesting resource.
Now you have just defined a recursive algorithm.
MinimumPath(A,B) = MinimumPath(MinimumPath(A,C)+MinimumPath(A,D)+,MinimumPath(...)+MinimumPath(...))
Of course the starting point of recursion is
MinimumPath(Source,Source)
which is 0 of course.
As far as I know, there isn't an out of the box algorithm from boost to do this. But this is quite straightforward to implement it.
A good implementation is here.
If, for some reason, you do not have a DAG, Dijkstra's or Bellman-Ford should be used.
if I'm not mistaken, from the explanation this is really an optimal path problem not a search problem since the goal is known. In optimization I don't think you can avoid doing an exhaustive search if you want the optimal path and not an approximate path.
From the paper it seems like you are really subdividing a space many times then running the algorithm. This would reduce your N to closer to a constant in the context of the entire surface making O(N^2) not so bad.
That being said perhaps dynamic programming would be a good strategy where the N would be bounded by the difference between your start and goal node. Here is an example form genomic alignment. Just an illustration to give you an idea of how it works.
Construct a N by N array of cost values all set to 0 or some default.
#
For i in size N:
For j in size N:
#cost of getting here from i-1 or j-1
cost[i,j] = min(cost[i-1,j] + i edge cost , cost[i,j-1] + j edge cost)
Once you have your table filled in, start at bottom right corner. Starting from your goal, Choose to go to the node with the lowest cost of getting there. Work your way backwards choosing the lowest value until you reach the start node (or rather the array entry corresponding to the start node). This should give you the optimal path very simply.
Dynamic programming works by solving the optimization on smaller sub-problems. In this case the sub-problems are optimal paths to preceding nodes. I think the rectangular nature of your graph makes this a good fit.