I have to find the shortest path from point D to R. These are fixed points.
This is an example of a situation:
The box also contains walls, through which you cannot pass across them, unless you break them. Each wall break costs you, let's say "a" points, where "a" is a positive integer.
Each move which doesn't involve a wall, costs you 1 point.
The mission is to find out of all the paths of minimum cost, the one with the least number of broken walls.
Since, the width of the box can go up to 100 cells, it's irrelevant to use backtracking. It's too slow. The only solution I came up is this one:
Go east or south if there are no walls
If south has a wall, check if west has wall. If west has wall, break south wall. If west doesn't have wall, go west, until you find a south cell without wall. Repeat this process with south and east until you exceed the cost of a broken wall in this order. If path from west goes into the same place as if I had broken the south wall and costs the same or less than "a" points, then use this path, else brake south wall.
If nothing above encounters, brake a south or east wall, depending on the box boundary.
Repeat steps 1, 2, 3 till the "passenger" arrives in point R. Between these 3 steps, there are "else-if" relations.
Can you come up with a better problem algorithm? I program in C++.
Use Dijkstra, but for costs give it 1 for a move that doesn't break a wall, and (a+0.00001) for breaking a wall. Then Dijkstra will give you what you want, the path that breaks the fewest walls among all minimal-cost paths.
Conceptually, imagine a traveler who can jump over walls -- while keeping track of the cost -- and can also split into two identical travelers when faced with a choice of two paths, so as to take them both (take that, Robert Frost!). Only one traveler moves at a time, the one who has incurred the lowest cost so far. That one moves, and writes on the floor "I reached here at a cost of only x". If I find such a note already there, if I got there more cheaply I erase the old note and write my own; if that other traveler got there more cheaply I commit suicide.
The two-part "cost first, then broken walls", can be represented as a pair (c, w) that is compared lexicographically. c is the cost, w is the number of broken walls. That makes it a "single thing" again (in some sense), so it's a thing that you can put into algorithms and so on that expect simply "a cost" (as an abstract thing that it may add an other cost to or compare to an other cost).
So we can just use A*, with a Manhattan Distance heuristic (perhaps there's something smarter that doesn't ignore walls completely, but this will work - underestimating the distance is admissible). The movement cost will, of course, not ignore walls. Neighbours will be all adjacent squares. All costs will be the pair I described above.
This could easily be modeled as a weighted graph and then apply Dijkstra's shortest path algorithm to it. Each square is a node. It is connected to the nodes of the squares it is adjacent to. The weight of the connections is either 1 or "a", based on whether there is a wall or not. This will get you the minimal cost. It's possible that the minimum cost and the minimum number of wall breaks could be different.
Here is a general algorithm (you'll have to do the implementation yourself):
Convert the matrix into a weighted graph:
For each entry in the matrix, create a Vertex.
For each Vertex, create an array of Edges, one for each neighboring Vertex.
For each Edge, define a weight according to the cost of breaking the wall between the two Vertices that the Edge is connecting.
Then, run the Dijkstra's algorithm (http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm) on the graph, starting from Vertex D. As an output, you will have the shortest (cheapest) path from Vertex D to any other Vertex on the graph, including Vertex R.
Related
I am attempting to use Q-learning to learn minesweeping behavior on a discreet version of Mat Buckland's smart sweepers, the original available here http://www.ai-junkie.com/ann/evolved/nnt1.html, for an assignment. The assignment limits us to 50 iterations of 2000 moves on a grid that is effectively 40x40, with the mines resetting and the agent being spawned in a random location each iteration.
I've attempted performing q learning with penalties for moving, rewards for sweeping mines and penalties for not hitting a mine. The sweeper agent seems unable to learn how to sweep mines effectively within the 50 iterations because it learns that going to specific cell is good, but after a the mine is gone it is no longer rewarded, but penalized for going to that cell with the movement cost
I wanted to attempt providing rewards only when all the mines were cleared in an attempt to make the environment static as there would only be a state of not all mines collected, or all mines collected, but am struggling to implement this due to the agent having only 2000 moves per iteration and being able to backtrack, it never manages to sweep all the mines in an iteration within the limit with or without rewards for collecting mines.
Another idea I had was to have an effectively new Q matrix for each mine, so once a mine is collected, the sweeper transitions to that matrix and operates off that where the current mine is excluded from consideration.
Are there any better approaches that I can take with this, or perhaps more practical tweaks to my own approach that I can try?
A more explicit explanation of the rules:
The map edges wrap around, so moving off the right edge of the map will cause the bot to appear on the left edge etc.
The sweeper bot can move up down, left or right from any map tile.
When the bot collides with a mine, the mine is considered swept and then removed.
The aim is for the bot to learn to sweep all mines on the map from any starting position.
Given that the sweeper can always see the nearest mine, this should be pretty easy. From your question I assume your only problem is finding a good reward function and representation for your agent state.
Defining a state
Absolute positions are rarely useful in a random environment, especially if the environment is infinite like in your example (since the bot can drive over the borders and respawn at the other side). This means that the size of the environment isn't needed for the agent to operate (we will actually need it to simulate the infinite space, tho).
A reward function calculates its return value based on the current state of the agent compared to its previous state. But how do we define a state? Lets see what we actually need in order to operate the agent like we want it to.
The position of the agent.
The position of the nearest mine.
That is all we need. Now I said erlier that absolute positions are bad. This is because it makes the Q table (you call it Q matrix) static and very fragile to randomness. So let's try to completely eliminate abosulte positions from the reward function and replace them with relative positions. Luckily, this is very simple in your case: instead of using the absolute positions, we use the relative position between the nearest mine and the agent.
Now we don't deal with coordinates anymore, but vectors. Lets calculate the vector between our points: v = pos_mine - pos_agent. This vector gives us two very important pieces of information:
the direction in which the nearst mine is, and
the distance to the nearest mine.
And these are all we need to make our agent operational. Therefore, an agent state can be defined as
State: Direction x Distance
of which distance is a floating point value and direction either a float that describes the angle or a normalized vector.
Defining a reward function
Given our newly defined state, the only thing we care about in our reward function is the distance. Since all we want is to move the agent towards mines, the distance is all that matters. Here are a few guesses how the reward function could work:
If the agent sweeps a mine (distance == 0), return a huge reward (ex. 100).
If the agent moves towards a mine (distance is shrinking), return a neutral (or small) reward (ex. 0).
If the agent moves away from a mine (distance is increasing), retuan a negative reward (ex. -1).
Theoretically, since we penaltize moving away from a mine, we don't even need rule 1 here.
Conclusion
The only thing left is determining a good learning rate and discount so that your agent performs well after 50 iterations. But, given the simplicity of the environment, this shouldn't even matter that much. Experiment.
This is a question from the Australian Informatics Olympiad
The question is:
Have you ever heard of Melodramia, my friend? It is a land of forbidden forests and boundless swamps, of sprinting heroes and dashing heroines. And it is home to two dragons, Rose and Scarlet, who, despite their competitive streak, are the best of friends.
Rose and Scarlet love playing Binary Snap, a game for two players. The game is played with a deck of cards, each with a numeric label from 1 to N. There are two cards with each possible label, making 2N cards in total. The game goes as follows:
Rose shuffles the cards and places them face down in front of Scarlet.
Scarlet then chooses either the top card, or the second-from-top card from the deck and reveals it.
Scarlet continues to do this until the deck is empty. If at any point the card she reveals has the same label as the previous card she revealed, the cards are a Dragon Pair, and whichever dragon shouts `Snap!' first gains a point.
After many millenia of playing, the dragons noticed that having more possible Dragon Pairs would often lead to a more exciting game. It is for this reason they have summoned you, the village computermancer, to write a program that reads in the order of cards in the shuffled deck and outputs the maximum number of Dragon Pairs that the dragons can find.
I'm not sure how to solve this. I thought of something which is wrong(choosing the maximum over all cards, when compared with its previous occurence for each card)
Here's my code as of now:
#include <iostream>
#include <fstream>
using namespace std;
int main() {
ifstream fin("snapin.txt");
ofstream fout("snapout.txt");
int n;
fin>>n;
int arr[(2*n)+1];
for(int i=0;i<2*n;i++){
fin>>arr[i];
}
int dp[(2*n) +1];
int maxi = 0;
int pos[n+1];
for(int i=0;i<n+1;i++){
pos[i] = -1;
}
int count = 0;
for(int i=2;i<(2*n)-2;i++){
if(pos[arr[i]] == -1){
pos[arr[i]] = i;
}else{
dp[i] = pos[arr[i]]+1;
maxi = max(dp[i],maxi);
}
dp[i] = max(dp[i],maxi);
}
fout<<dp[2*n -1];
}
Ok, let's get some basic measurements of the problem out of the way first:
There are 2N cards. 1 card is drawn at a time, without replacement. Therefore there are 2N draws, taking the deck from size 2N (before the first draw) to size 0 (after the last draw).
The final draw takes place from a deck of size 1, and must take the last remaining card.
The 2N-1 preceding draws have deck size 2N, ... 3, 2. For each of these you have a choice between the top two cards. 2N-1 decisions, each with 2 possibilities.
The brute force search space is therefore 22N-1.
That is exponential growth, every optimization scientist's favorite sort of challenge.
If N is small, say 20, the brute force method needs to search "only" a trillion possibilities, which you can get done in a few thousand seconds on a readily available PC that does a few billion operations per second (each solution takes more than one CPU instruction to check).
In N is not quite as small, perhaps 100, the brute force method is akin to breaking the encryption on government secrets.
Not happy with the brute force approach then? I'm not either.
Before we get to the optimal solution, let’s take a break to explore what the Markov assumption is and what it means for us. It shows up in different fields using different verbiage, but I’ll just paraphrase it in a way that is particularly useful for this problem involving gameplay choices:
Markov Assumption
A process is Markov if and only if The choices available to you in the future depend only on what you have now, and not how you got it.
A bad but often used real-world example is the stock market. Not only do taxation differences between short-term and long-term capital gains make history important in a small way, but investors do trend analysis and remember what stocks have done before, which affects future behavior in a big way.
A better example, especially for StackOverflow, is that of Turing machines and computer processors. What your program does next depends on the current instruction pointer and the contents of memory, but not the history of memory that’s since been overwritten. But there are many more. As we’ll see shortly, the Binary Snap problem can be formulated as Markov.
Now let’s talk about what makes the Markov assumption so important. For that, we’ll use the Travelling Salesman Problem. No, the Travelling International Salesman Problem. Still too messy. Let’s try the “Travelling International Salesman with a Single-Entry Visa Problem”. But we’ll go through all three of them briefly:
Travelling Salesman Problem
A salesman has to visit potential buyers in N cities. Plan an itinerary for the salesman which minimizes the total cost of visiting all N cities (variations: at least once / exactly once), given a matrix aj,k which is the cost of travel from city j to city k.
Another variation is whether the starting city is predetermined or not.
Travelling International Salesman Problem
The cities the salesman needs to visit are split between two (or more) nations. A subset of the cities have border crossings and have travel options to all cities. The other cities can only reach cities which are either in the same country or are border-equipped.
Alternatively, instead of cities along the border, use cities with international airports. Won’t make a difference in the end.
The cost matrix for this problem looks rather like the flag of the Dominican Republic. Travel between interior cities of country A is permitted, as is travel between interior cities of country B (blue fields). Border cities connect with interior and border cities in both countries (white cross). And direct travel between an interior city of country A and one of country B is impossible (red areas).
Travelling International Salesman with a Single-Entry Visa
Now not only does the salesman need to visit cities in both countries, but he can only cross the border once.
(For travel fanatics, assume he starts in a third country and has single-entry visas for both countries, so he can’t visit some of A, all of B, then return to A for the rest).
Let’s look at an extremely simple case first: Only one border city. We’ll use one additional trick, the one from proof by induction: We assume that all problems smaller than the current one can be solved.
It should be fairly obvious that the Markov assumption holds when the salesman reaches the border city. No matter what path he took through country A, he has exactly the same choice of paths through country B.
But there’s a really important point here: Any path through country A ending at the border and any path through country B starting at the border, can be combined into a feasible full itinerary. If we have two full itineraries x and y, and x spent more money in country A than y did, then even if x has a lower total cost than the total cost of y, we can plan a path better than both, using the portion of y in country A and the portion of x in country B. I’m going to call that “splicing”. The Markov assumption lets us do it, by making all roads leading to the border interchangeable!
In fact, we can look just at the cities of country A, pick the best of all routes to the border, and forget about all the other options as soon as (in our plan) the salesman steps across into B.
This means instead of having factorial(NA) * factorial(NB) routes to look at, there’s only factorial(NA) + factorial(NB). Which is pretty much factorial(NA) times better. Wow, is this Markov thing helpful or what?
Ok, that was too easy. Let’s mess it all up by having NAB border cities instead of just one. Now if I have a path x which costs less in country B and a path y which costs less in country A, but they cross the border in different cities, I can’t just splice them together. So I have to keep track of all the paths through all the cities again, right?
Not exactly. What if, instead of throwing away all the paths through country A except the best y path, I actually keep one path ending in each border city (the lowest cost of all paths ending in the same border city). Now, for any path x I look at in country B, I have a path yendpt(x) that uses the same border city, to splice it with. So I have to solve the country A and country B partitions each NAB times to find the best splice of a complete itinerary, for total work of NAB factorial(NA) + NAB factorial(NB) which is still way better than factorial(NA) * factorial(NB).
Enough development of tools. Let’s get back to the dragons, since they are they are subtle and quick to anger and I don’t want to be eaten or burnt to a crisp.
I claim that at any step T of the Binary Snap game, if we consider our “location” a pair of (card just drawn, card on top of deck), the Markov assumption will hold. These are the only things that determine our future options. All the cards below the top one in the deck must be in the same order no matter what we did before. And for knowing whether to count a Snap! with the next card, we need to know the last one taken. And that’s it!
Furthermore, there are N possible labels on the card last drawn, and N possible for the top card on the deck, for a total of N2 “border cities”. As we start playing the game, there are two choices on the first turn, two on the second, two on the third, so we start out with 2T possible game states (and a count of Snap!s for each). But by the pigeonhole principle, when 2T > N2, some of these plays must end in exactly the same game state (“border city”) as each other, and when that happens, we only need to keep the "dominating" one that got the best score on the way there.
Final complexity bound: 2*N timesteps, from no more than N2 game states, with 2 draw choices at each, equals an upper limit of 4*N3 simulated draws.
And that means the same trillion calculations that allowed us to do N=20 with the brute force method, now permit right around N=8000.
That makes the dragons happy, which makes us alive and well.
Implementation note: Since the challenge didn’t ask for the order of draws, but just the highest attainable number of snaps, all you data to keep track of in addition to the initial ordering of the cards is the time, T, and a 2-dimensional array (N rows, N columns) of the best score you can have and reach that state at time T.
Real world applications: If you take this approach and apply it to a digital radio (fixed uniform bit timing, discrete signal levels) receiving a signal using a convolutional error-correcting code, you have the Viterbi decoder. If you apply it to acquired medical data, with variable timing intervals and continuous signal levels, and add some other gnarly math, you get my doctoral project.
I've hit a snag in continuing my work in a C++ program, I'm not sure what the best way to approach my problem is. Here is the situation in non-programming terms: I have a list of children and each child has a specific weight, age, and happiness. I have a way that people can visually view the bones of the child that is specific to these characteristics. (Think of an MMO character customization where there are sliders for each characteristic and when you slide the weight slider to heavy, the walk cycle looks like the character is heavier).
Before, my system had a set walk cycle for each end of the spectrum for each characteristic. For example, there is one specific walk cycle for the heaviest walk, one for the lightest walk, one for youngest walk, etc. There was not middle input, the output was the position of the slider on the scale and the heaviest walk cycle and the lightest walk cycle were averaged by a specific percentage, the position of the slider.
Now to the problem, I have a large library of preset walk cycles and each walk cycle has a specific weight, age, and happiness. So, Joe has a weight of 4, an age of 7, and happiness level of 8 and Sally 2, 3, 5. When the sliders move to a the specific value (weight 5, age 8, happiness 7). However, only one slider can be moved at one time and the slider that was moved last is the most important characteristic to find the closest match to. I want to find in my library the child that has the closest to all three of these values and Joe will be the closest.
I was told to check out using a 3 dimensional array but I would rather use an array of child objects and do multiple searches on that array which, I am a rookie and I know the search will take a bit of computing time but I keep leaning towards using the single array. I could also use a two dimensional array but I'm not sure. What data structure would be the best to search for three values in?
Thank you for any help!
How many different values can each slider take? If there are say ten values for each slider this would mean there are 10*10*10=1000 different possible character classes. If your library has less than 1000 walk cycles just reading through them all looking for the nearest match is probably gonna be fast enough.
Of course if there are 100 values for each slider then you may want something more clever. My point is there are some thing that don't have to be optimized.
Also is your library of walk cycles fixed once and for all? If so perhaps you could pre compute the walk cycle for each setting of the sliders and write that to a static array.
I agree with Wilf that the number of walk cycles is critical, as even if there are say 100,000 cycles you could easily use a brute-force find-the-maximum over...
weight_factor * diff(candidate.weight, target.weight) +
age_factor * diff(candidate.age, target.age) +
happiness_factor * diff(candidate.happiness, target.happiness)
...where the factor for the last-moved slider was higher than the others.
For more cycles than that you'd want to limit the search space somewhat, and some indices would be useful, e.g.:
map<int, map< int, map<int, vector<Cycle*>> cycles_by_weight_age_happiness;
You'd populate that adding a pointer to each walk cycle - characterised by { weight, age, happiness } - to cycles[rw(weight)][ra(age)][rh(happiness)], where each of rw, ra and rh rounded the parameters by whatever granularity you liked (e.g. round weight down to nearest 5kgs, group ages by integer part of log base 1.5 of age, leave happiness alone). Then to search you evaluate the entries "around" your target { rw(weight), ra(age), rh(happiness) } indices... the further from there you deviate (especially on the last-slider-moved parameter, the less likely you are to find a better fit than you've already seen, so tune to taste.
The above indexing is a refinement of what I think Wilf intended - just using functions to decouple the mapping from value space into vectors in the index.
I have a rectangular grid shaped DAG where the horizontal edges always point right and the vertical edges always point down. The edges have positive costs associated with them. Because of the rectangular format, the nodes are being referred to using zero-based row/column. Here's an example graph:
Now, I want to perform a search. The starting vertex will always be in the left column (column with index 0) and in the upper half of the graph. This means I'll pick the start to be either (0,0), (1,0), (2,0), (3,0) or (4,0). The goal vertex is always in the right column (column with index 6) and "corresponds" to the start vertex:
start vertex (0,0) corresponds to goal vertex (5,6)
start vertex (1,0) corresponds to goal vertex (6,6)
start vertex (2,0) corresponds to goal vertex (7,6)
start vertex (3,0) corresponds to goal vertex (8,6)
start vertex (4,0) corresponds to goal vertex (9,6)
I only mention this to demonstrate that the goal vertex will always be reachable. It's possibly not very important to my actual question.
What I want to know is what search algorithm should I use to find the path from start to goal? I am using C++ and have access to the Boost Graph Library.
For those interested, I'm trying to implement Fuchs' suggestions from his Optimal Surface Reconstruction from Planar Contours paper.
I looked at A* but to be honest didn't understand it and wasn't how the heuristic works or even whether I could come up with one!
Because of the rectangular shape and regular edge directions, I figured there might be a well-suited algorithm. I considered Dijkstra
but the paper I mention said there were quicker algorithms (but annoyingly for me doesn't provide an implementation), plus that's
single-source and I think I want single-pair.
So, this is your problem,
DAG no cycles
weights > 0
Left weight < Right weight
You can use a simple exhaustive search defining every possible route. So you have a O(NxN) algoririthm. And then you will choose the shortest path. It is not a very smart solution, but it is effective.
But I suppose you want to be smarter than that, let's consider that if a particular node can be reached from two nodes, you can find the minimum of the weights at the two nodes plus the cost for arriving to the current node. You can consider this as an extension of the previous exhaustive search.
Remember that a DAG can be drawn in a line. For DAG linearization here's an interesting resource.
Now you have just defined a recursive algorithm.
MinimumPath(A,B) = MinimumPath(MinimumPath(A,C)+MinimumPath(A,D)+,MinimumPath(...)+MinimumPath(...))
Of course the starting point of recursion is
MinimumPath(Source,Source)
which is 0 of course.
As far as I know, there isn't an out of the box algorithm from boost to do this. But this is quite straightforward to implement it.
A good implementation is here.
If, for some reason, you do not have a DAG, Dijkstra's or Bellman-Ford should be used.
if I'm not mistaken, from the explanation this is really an optimal path problem not a search problem since the goal is known. In optimization I don't think you can avoid doing an exhaustive search if you want the optimal path and not an approximate path.
From the paper it seems like you are really subdividing a space many times then running the algorithm. This would reduce your N to closer to a constant in the context of the entire surface making O(N^2) not so bad.
That being said perhaps dynamic programming would be a good strategy where the N would be bounded by the difference between your start and goal node. Here is an example form genomic alignment. Just an illustration to give you an idea of how it works.
Construct a N by N array of cost values all set to 0 or some default.
#
For i in size N:
For j in size N:
#cost of getting here from i-1 or j-1
cost[i,j] = min(cost[i-1,j] + i edge cost , cost[i,j-1] + j edge cost)
Once you have your table filled in, start at bottom right corner. Starting from your goal, Choose to go to the node with the lowest cost of getting there. Work your way backwards choosing the lowest value until you reach the start node (or rather the array entry corresponding to the start node). This should give you the optimal path very simply.
Dynamic programming works by solving the optimization on smaller sub-problems. In this case the sub-problems are optimal paths to preceding nodes. I think the rectangular nature of your graph makes this a good fit.
I have a wireless mesh network of nodes, each of which is capable of reporting its 'distance' to its neighbors, measured in (simplified) signal strength to them. The nodes are geographically in 3d space but because of radio interference, the distance between nodes need not be trigonometrically (trigonomically?) consistent. I.e., given nodes A, B and C, the distance between A and B might be 10, between A and C also 10, yet between B and C 100.
What I want to do is visualize the logical network layout in terms of connectness of nodes, i.e. include the logical distance between nodes in the visual.
So far my research has shown the multidimensional scaling (MDS) is designed for exactly this sort of thing. Given that my data can be directly expressed as a 2d distance matrix, it's even a simpler form of the more general MDS.
Now, there seem to be many MDS algorithms, see e.g. http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html and http://tapkee.lisitsyn.me/ . I need to do this in C++ and I'm hoping I can use a ready-made component, i.e. not have to re-implement an algo from a paper. So, I thought this: https://sites.google.com/site/simpmatrix/ would be the ticket. And it works, but:
The layout is not stable, i.e. every time the algorithm is re-run, the position of the nodes changes (see differences between image 1 and 2 below - this is from having been run twice, without any further changes). This is due to the initialization matrix (which contains the initial location of each node, which the algorithm then iteratively corrects) that is passed to this algorithm - I pass an empty one and then the implementation derives a random one. In general, the layout does approach the layout I expected from the given input data. Furthermore, between different runs, the direction of nodes (clockwise or counterclockwise) can change. See image 3 below.
The 'solution' I thought was obvious, was to pass a stable default initialization matrix. But when I put all nodes initially in the same place, they're not moved at all; when I put them on one axis (node 0 at 0,0 ; node 1 at 1,0 ; node 2 at 2,0 etc.), they are moved along that axis only. (see image 4 below). The relative distances between them are OK, though.
So it seems like this algorithm only changes distance between nodes, but doesn't change their location.
Thanks for reading this far - my questions are (I'd be happy to get just one or a few of them answered as each of them might give me a clue as to what direction to continue in):
Where can I find more information on the properties of each of the many MDS algorithms?
Is there an algorithm that derives the complete location of each node in a network, without having to pass an initial position for each node?
Is there a solid way to estimate the location of each point so that the algorithm can then correctly scale the distance between them? I have no geographic location of each of these nodes, that is the whole point of this exercise.
Are there any algorithms to keep the 'angle' at which the network is derived constant between runs?
If all else fails, my next option is going to be to use the algorithm I mentioned above, increase the number of iterations to keep the variability between runs at around a few pixels (I'd have to experiment with how many iterations that would take), then 'rotate' each node around node 0 to, for example, align nodes 0 and 1 on a horizontal line from left to right; that way, I would 'correct' the location of the points after their relative distances have been determined by the MDS algorithm. I would have to correct for the order of connected nodes (clockwise or counterclockwise) around each node as well. This might become hairy quite quickly.
Obviously I'd prefer a stable algorithmic solution - increasing iterations to smooth out the randomness is not very reliable.
Thanks.
EDIT: I was referred to cs.stackexchange.com and some comments have been made there; for algorithmic suggestions, please see https://cs.stackexchange.com/questions/18439/stable-multi-dimensional-scaling-algorithm .
Image 1 - with random initialization matrix:
Image 2 - after running with same input data, rotated when compared to 1:
Image 3 - same as previous 2, but nodes 1-3 are in another direction:
Image 4 - with the initial layout of the nodes on one line, their position on the y axis isn't changed:
Most scaling algorithms effectively set "springs" between nodes, where the resting length of the spring is the desired length of the edge. They then attempt to minimize the energy of the system of springs. When you initialize all the nodes on top of each other though, the amount of energy released when any one node is moved is the same in every direction. So the gradient of energy with respect to each node's position is zero, so the algorithm leaves the node where it is. Similarly if you start them all in a straight line, the gradient is always along that line, so the nodes are only ever moved along it.
(That's a flawed explanation in many respects, but it works for an intuition)
Try initializing the nodes to lie on the unit circle, on a grid or in any other fashion such that they aren't all co-linear. Assuming the library algorithm's update scheme is deterministic, that should give you reproducible visualizations and avoid degeneracy conditions.
If the library is non-deterministic, either find another library which is deterministic, or open up the source code and replace the randomness generator with a PRNG initialized with a fixed seed. I'd recommend the former option though, as other, more advanced libraries should allow you to set edges you want to "ignore" too.
I have read the codes of the "SimpleMatrix" MDS library and found that it use a random permutation matrix to decide the order of points. After fix the permutation order (just use srand(12345) instead of srand(time(0))), the result of the same data is unchanged.
Obviously there's no exact solution in general to this problem; with just 4 nodes ABCD and distances AB=BC=AC=AD=BD=1 CD=10 you cannot clearly draw a suitable 2D diagram (and not even a 3D one).
What those algorithms do is just placing springs between the nodes and then simulate a repulsion/attraction (depending on if the spring is shorter or longer than prescribed distance) probably also adding spatial friction to avoid resonance and explosion.
To keep a "stable" diagram just build a solution and then only update the distances, re-using the current position from previous solution as starting point. Picking two fixed nodes and aligning them seems a good idea to prevent a slow drift but I'd say that spring forces never end up creating a rotational momentum and thus I'd expect that just scaling and centering the solution should be enough anyway.