I do not understand how the Koch curve is drawn using this function.
def koch(t, n):
    """Draws a koch curve with length n."""
    if n<3:
        fd(t, n)
        return
    m = n/3.0
    koch(t, m)
    lt(t, 60)
    koch(t, m)
    rt(t, 120)
    koch(t, m)
    lt(t, 60)
    koch(t, m)
The fd(t, n) command means object 't' will move forward by amount 'n'.
The rt(t, 120) and lt(t, 60) commands mean object 't' will turn right or left by the given angles.
So I gather that the author uses recursion in the function, but I do not understand how it calls itself so many times, as I am a beginner and have very limited logic skills.
As an example, say I called koch(t, 100). The if clause is bypassed because n > 3, which leads to the next line of code, m = n/3.0, so 100/3.0 is 33.3. This then leads to koch(t, 33.3), and as n > 3 still holds it recurses again to produce koch(t, 11.1), and so forth until we come to koch(t, 1.23).
Now as n = 1.23 and the if clause activates as soon as n < 3, we can run through the block of code, replacing all the koch(t, m) statements with fd(t, 1.23). As I see it, fd(), lt(), fd(), rt(), fd(), lt(), fd() should be executed only one time, since n < 3 as soon as n = 1.23. Or does it recurse again with 1.23 / 3.0, so that the code is run again with koch(t, 0.41)? Maybe that is because there is no else clause to end the function. However, the function does end, and if I choose a higher value for n the Koch curve is also larger, which confuses me more, as I can see no line in the code that tells it to repeat this function n times.
I apologize for the lack of clarity as I do not understand how to explain this clearly.
I think you may be looking at this from the wrong end to try to work it out. Consider first what happens if you call koch(t, 1). The if condition n < 3 is true, so the turtle just draws a straight line of length 1 and returns. Now call koch(t, 3). The condition is false, so m = 1 and the four calls to koch(t, 1), separated by the turns, draw something like this:
_/\_
Now what if you call koch(t, 9)? Try it on a piece of paper and you'll see that each of the straight lines in the picture above is replaced by a similar shape...
I found my problem after reading about recursion and testing some print statements in my console. What I did not understand was why choosing a larger n (length) produced a larger fractal. Choosing a larger n produces more nodes (children) in the recursion tree, and the turtle t only draws at the last nodes (the leaves), which are reached when n < 3. By then there are many leaves if n is large.
To understand recursion further, including how it works when there are two or more recursive calls in the block of code, as in this question, I have included a link to a helpful thread. I hope it helps anybody else who is stuck on this question and needs help understanding recursion.
Understanding recursion
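One way to see where the repetition comes from is to collect the commands instead of drawing them. The sketch below is not the original code: koch_moves and count_segments are hypothetical helpers that mirror the structure of koch(t, n) but return a list of commands, so no turtle is needed.

```python
def koch_moves(n):
    """Return the commands koch(t, n) would issue, as (name, value) pairs."""
    if n < 3:
        # Base case: a single straight segment, no further recursion.
        return [("fd", n)]
    m = n / 3.0
    # Same order as the original: four recursive calls separated by turns.
    return (koch_moves(m) + [("lt", 60)]
            + koch_moves(m) + [("rt", 120)]
            + koch_moves(m) + [("lt", 60)]
            + koch_moves(m))


def count_segments(n):
    """Count how many straight segments (fd commands) are drawn."""
    return sum(1 for name, _ in koch_moves(n) if name == "fd")
```

Each call with n >= 3 makes four more calls, so the number of segments multiplies by four per level. For koch(t, 100) the length is divided by 3 four times (33.3, 11.1, 3.70, 1.23) before the base case is reached, so count_segments(100) gives 4**4 = 256 segments of length about 1.23 each.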
I am new to reinforcement learning. I recently learned about approximate Q-learning, or feature-based Q-learning, in which you describe states by features to save space. I have tried to implement this in a simple grid game. Here, the agent is supposed to learn not to go into a firepit (signaled by an f) and to instead eat as many dots as possible. Here is the grid used:
...A
.f.f
.f.f
...f
Here A signals the agent's starting location. When implementing this, I set up two features. One was 1/((distance to closest dot)^2), and the other was (distance to firepit) + 1. When the agent enters a firepit, the program returns a reward of -100. If it goes to a non-firepit position that was already visited (and thus there is no dot to be eaten), the reward is -50. If it goes to an unvisited dot, the reward is +500. In the above grid, no matter what the initial weights are, the program never learns the correct weight values. Specifically, in the output, the first training session gains a score (how many dots it ate) of 3, but for all other training sessions the score is just 1, and the weights converge to incorrect values of -125 for weight 1 (distance to firepit) and 25 for weight 2 (distance to unvisited dot). Is there something specifically wrong with my code, or is my understanding of approximate Q-learning incorrect?
I have tried to play around with the rewards that the environment is giving and also with the initial weights. None of these have fixed the problem.
Here is the link to the entire program: https://repl.it/repls/WrongCheeryInterface
Here is what is going on in the main loop:
while(points != NUMPOINTS){
    bool playerDied = false;
    if(!start){
        if(!atFirepit()){
            r = 0;
            if(visited[player.x][player.y] == 0){
                points += 1;
                r += 500;
            }else{
                r += -50;
            }
        }else{
            playerDied = true;
            r = -100;
        }
    }
    //Update visited
    visited[player.x][player.y] = 1;
    if(!start){
        //This is based off the q learning update formula
        pairPoint qAndA = getMaxQAndAction();
        double maxQValue = qAndA.q;
        double sample = r;
        if(!playerDied && points != NUMPOINTS)
            sample = r + (gamma2 * maxQValue);
        double diff = sample - qVal;
        updateWeights(player, diff);
    }
    // checking end game condition
    if(playerDied || points == NUMPOINTS) break;
    pairPoint qAndA = getMaxQAndAction();
    qVal = qAndA.q;
    int bestAction = qAndA.a;
    //update player and q value
    player.x += dx[bestAction];
    player.y += dy[bestAction];
    start = false;
}
I would expect that both weights would still be positive, but one of them is negative (the one giving the distance to the firepit).
I also expected the program to learn over time that it is bad to enter a firepit and also bad, but not as bad, to go to an already visited position.
Probably not the answer you want to hear, but:
Have you tried implementing the simpler tabular Q-learning before approximate Q-learning? In your setting, with a few states and actions, it will work perfectly. If you are learning, I strongly recommend starting with the simpler cases in order to get a better understanding/intuition of how Reinforcement Learning works.
Do you know the implications of using approximators instead of learning the exact Q function? In some cases, due to the complexity of the problem (e.g., when the state space is continuous) you should approximate the Q function (or the policy, depending on the algorithm), but this may introduce some convergence problems. Additionally, in your case, you are trying to hand-pick some features, which usually requires in-depth knowledge of the problem (i.e., the environment) and the learning algorithm.
Do you understand the meaning of the hyperparameters alpha and gamma? You cannot choose them randomly. Sometimes they are critical to obtaining the expected results, though not always; it depends heavily on the problem and the learning algorithm. In your case, looking at the convergence curve of your weights, it's pretty clear that you are using a value of alpha that is too high. As you pointed out, after the first training session your weights remain constant.
Therefore, practical recommendations:
Be sure to solve your grid game using a tabular Q-learning algorithm before trying more complex things.
Experiment with different values of alpha, gamma and rewards.
Read more in depth about approximate RL. A very good and accessible book (starting from zero knowledge) is Sutton and Barto's classic book, Reinforcement Learning: An Introduction, which you can obtain for free and which was updated in 2018.
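To make the point about alpha concrete, here is a minimal Python sketch of the usual linear (feature-based) Q-learning weight update. It assumes the question's updateWeights does the standard w_i += alpha * diff * f_i step, which the posted code does not show; q_value and update_weights are hypothetical names, and the feature values are made up for illustration.

```python
def q_value(weights, features):
    """Linear Q estimate: the dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def update_weights(weights, features, reward, max_next_q,
                   alpha, gamma, terminal=False):
    """One approximate Q-learning step; returns the new weight list."""
    # The target ("sample") drops the bootstrap term on terminal steps.
    sample = reward if terminal else reward + gamma * max_next_q
    diff = sample - q_value(weights, features)
    # Every weight moves in proportion to its own feature value.
    return [w + alpha * diff * f for w, f in zip(weights, features)]


features = [1.0, 2.0]  # made-up feature values for one state/action
big_step = update_weights([0.0, 0.0], features, reward=10.0,
                          max_next_q=0.0, alpha=1.0, gamma=0.9,
                          terminal=True)
small_step = update_weights([0.0, 0.0], features, reward=10.0,
                            max_next_q=0.0, alpha=0.1, gamma=0.9,
                            terminal=True)
```

With a target of 10, the alpha = 1.0 step leaves the estimate q_value(big_step, features) at 50, far past the target, while the alpha = 0.1 step gives 5 and approaches the target gradually. Because features are not normalized here, a large alpha combined with large rewards such as +500 can throw the weights to extreme values in a single update.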
I am new to Python, coming from MATLAB, and long ago from C. I have written a script in MATLAB which simulates sediment transport in rivers as a Markov process. The code randomly places circles of a random diameter within a rectangular area of a specified dimension. The circles are non-uniform in size, drawn randomly from a specified range of sizes. I do not know how many times I will step through the circle placement operation, so I use a while loop to complete the process. In an attempt to be more community oriented, I am translating the MATLAB script to Python. I used the online tool OMPC to get started, and have been working through it manually from the auto-translated version (which was not that helpful, which is not surprising). To debug the code as I go, I compare the MATLAB-generated results against the results in Python.
It seems clear to me that I have declared variables in a way that introduces problems as calculations proceed in the script. Here are two examples of consistent problems between different instances of code execution. First, the code generated what I think are arrays within arrays, because the script is returning results which look like:
array([[ True]
[False]], dtype=bool)
This result was generated for the following code snippet at the overlap_logix operation:
CenterCoord_Array = np.asarray(CenterCoordinates)
Diameter_Array = np.asarray(Diameter)
dist_check = ((CenterCoord_Array[:,0] - x_Center) ** 2 + (CenterCoord_Array[:,1] - y_Center) ** 2) ** 0.5
radius_check = (Diameter_Array / 2) + radius
radius_check_update = np.reshape(radius_check,(len(radius_check),1))
radius_overlap = (radius_check_update >= dist_check)
# Now actually check the overlap condition.
if np.sum([radius_overlap]) == 0:
    # The new circle does not overlap so proceed.
    newCircle_Found = 1
    debug_value = 2
elif np.sum([radius_overlap]) == 1:
    # The new circle overlaps with one other circle
    overlap = np.arange(0,len(radius_overlap), dtype=int)
    overlap_update = np.reshape(overlap,(len(overlap),1))
    overlap_logix = (radius_overlap == 1)
    idx_true = overlap_update[overlap_logix]
    radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)
A similar result for the same run was produced for variables:
radius_check_update
radius_overlap
overlap_update
Here is the same code snippet for the working MATLAB version (as requested):
distcheck = ((Circles.CenterCoordinates(1,:)-x_Center).^2 + (Circles.CenterCoordinates(2,:)-y_Center).^2).^0.5;
radius_check = (Circles.Diameter ./ 2) + radius;
radius_overlap = (radius_check >= distcheck);
% Now actually check the overlap condition.
if sum(radius_overlap) == 0
    % The new circle does not overlap so proceed.
    newCircle_Found = 1;
    debug_value = 2;
elseif sum(radius_overlap) == 1
    % The new circle overlaps with one other circle
    temp = 1:size(radius_overlap,2);
    idx_true = temp(radius_overlap == 1);
    radius = distcheck(1,idx_true) - (Circles.Diameter(1,idx_true)/2);
In the Python version I have created arrays from lists to more easily operate on the contents (the first two lines of the code snippet). The array within array result and creating arrays to access data suggests to me that I have incorrectly declared variable types, but I am not sure. Furthermore, some variables have a size, for example, (2L,) (the numerical dimension will change as circles are placed) where there is no second dimension. This produces obvious problems when I try to use the array in an operation with another array with a size (2L,1L). Because of these problems I started reshaping arrays, and then I stopped because I decided these were hacks because I had declared one, or more than one variable incorrectly. Second, for the same run I encountered the following error:
TypeError: 'numpy.ndarray' object is not callable
for the operation:
radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)
which occurs at the bottom of the above code snippet. I have posted the entire script at the following link because it is probably more useful to execute the script for oneself:
https://github.com/smchartrand/MarkovProcess_Bedload
I have set-up the code to run with some initial parameter values so decisions do not need to be made; these parameter values produce the expected results in the MATLAB-based script, which look something like this when plotted:
So, I seem to specifically be having issues with operations on lines 151-165, depending on the test value np.sum([radius_overlap]) and I think it is because I incorrectly declared variable types, but I am really not sure. I can say with confidence that the Python version and the MATLAB version are consistent in output through the first step of the while loop, and code line 127 which is entering the second step of the while loop. Below this point in the code the above documented issues eventually cause the script to crash. Sometimes the script executes to 15% complete, and sometimes it does not make it to 5% - this is due to the random nature of circle placement. I am preparing the code in the Spyder (Python 2.7) IDE and will share the working code publicly as a part of my research. I would greatly appreciate any help that can be offered to identify my mistakes and misapplications of python coding practice.
I believe I have answered my own question, and maybe it will be of use for someone down the road. The main sources of instruction for me can be found at the following three web pages:
Stackoverflow Question 176011
SciPy FAQ
SciPy NumPy for Matlab users
The third web page was very helpful for me coming from MATLAB. Here is the modified and working python code snippet which relates to the original snippet provided above:
dist_check = ((CenterCoordinates[0,:] - x_Center) ** 2 + (CenterCoordinates[1,:] - y_Center) ** 2) ** 0.5
radius_check = (Diameter / 2) + radius
radius_overlap = (radius_check >= dist_check)
# Now actually check the overlap condition.
if np.sum([radius_overlap]) == 0:
    # The new circle does not overlap so proceed.
    newCircle_Found = 1
    debug_value = 2
elif np.sum([radius_overlap]) == 1:
    # The new circle overlaps with one other circle
    overlap = np.arange(0,len(radius_overlap[0]), dtype=int).reshape(1, len(radius_overlap[0]))
    overlap_logix = (radius_overlap == 1)
    idx_true = overlap[overlap_logix]
    radius = dist_check[idx_true] - (Diameter[0,idx_true] / 2)
In the end it was clear to me that it was more straightforward for this example to use numpy arrays vs. lists to store results for each iteration of filling the rectangular area. For the corrected code snippet this means I initialized the variables:
CenterCoordinates, and
Diameter
as numpy arrays whereas I initialized them as lists in the posted question. This made a few mathematical operations more straightforward. I was also incorrectly indexing into variables with parentheses () as opposed to the correct method using brackets []. Here is an example of a correction I made which helped the code execute as envisioned:
Incorrect: radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)
Correct: radius = dist_check[idx_true] - (Diameter[0,idx_true] / 2)
This example also shows that I had issues with array dimensions which I corrected variable by variable. I am still not sure if my working code is the most pythonic or most efficient way to fill a rectangular area in a random fashion, but I have tested it about 100 times with success. The revised and working code can be downloaded here:
Working Python Script to Randomly Fill Rectangular Area with Circles
Here is an image of the final result for a successful run of the working code:
The main lessons for me were (1) numpy arrays are more efficient for repetitive numerical calculations, and (2) the dimensionality of the arrays I created was not always what I expected it to be, so care must be taken when establishing arrays. Thanks to those who looked at my question and asked for clarification.
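The two pitfalls above (array shapes and bracket indexing) can be reproduced in isolation. The following is a simplified sketch, not the linked script: it keeps every array one-dimensional, and the circle coordinates, diameters and candidate circle are made-up values.

```python
import numpy as np

# Made-up data: three placed circles, x in row 0 and y in row 1.
centers = np.array([[1.0, 4.0, 7.0],
                    [1.0, 5.0, 1.0]])
diameters = np.array([2.0, 2.0, 2.0])   # shape (3,)
x_center, y_center, radius = 3.0, 1.0, 1.5  # candidate circle

# Distance from the candidate to every placed circle, shape (3,).
dist_check = np.hypot(centers[0] - x_center, centers[1] - y_center)
radius_check = diameters / 2 + radius
radius_overlap = radius_check >= dist_check  # boolean array, shape (3,)

# Pitfall: comparing a (3, 1) column against a (3,) array broadcasts
# to a (3, 3) matrix, which produces "arrays within arrays".
pitfall_shape = (radius_check.reshape(-1, 1) >= dist_check).shape

# Indices of overlapping circles; square brackets index, parentheses call.
idx_true = np.flatnonzero(radius_overlap)
if idx_true.size == 1:
    i = idx_true[0]
    # Shrink the candidate so it just touches the one circle it overlaps.
    radius = dist_check[i] - diameters[i] / 2
```

Keeping everything at shape (n,) avoids the reshape calls entirely, and np.flatnonzero replaces the arange-plus-mask construction used to recover the overlapping index. In this example only the first circle overlaps, and the candidate radius shrinks from 1.5 to 1.0.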
I'm working on problem 3 of Project Euler using Python, but I can't seem to solve the problem without running into the following error: "OverflowError: range() result has too many items"
I'm wondering if there's a way to increase the allowed range? My code looks as follows:
target = 600851475143
largest_prime_factor = 1
#find largest prime factor of target
for possible_factor in range(2,(target/2)+1):
    if target % possible_factor == 0:
        is_prime = True
        for i in range(2,(possible_factor/2)+1):
            if possible_factor % i == 0:
                is_prime = False
                break
        if is_prime:
            largest_prime_factor = possible_factor
print largest_prime_factor
If you run into limitations of your computer or language while trying to solve a puzzle problem, or if it takes too long, it is an indication that probably there exists a better way (read: algorithm) to solve the problem. In your case, you do not need to loop to target / 2 + 1 (though that is a good educated upper bound). You only need to go as far as ceil(sqrt(target)).
And, as a side note, in Python 2 you can overcome this limitation by using xrange instead of range. range builds the whole list in memory, while xrange creates a lazy sequence object that produces its values on demand. In Python 3, range itself returns such a lazy sequence type instead of a list.
Thanks to @Fernando for the clarification in the comments.
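As a sketch of the "better algorithm" idea, the function below uses trial division and stops once the candidate factor exceeds the square root of what is left. It goes one step beyond the answer by also dividing each factor out of the number as it is found, which removes the separate primality test. The function name is hypothetical, and the code runs under both Python 2 and Python 3.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by trial division."""
    largest = 1
    factor = 2
    # Any composite number has a factor no larger than its square root.
    while factor * factor <= n:
        if n % factor == 0:
            largest = factor
            n //= factor      # divide the factor out and try it again
        else:
            factor += 1
    if n > 1:
        # Whatever remains has no factor up to its square root: it is prime.
        largest = n
    return largest
```

Because every smaller factor has already been divided out, any factor found is automatically prime. largest_prime_factor(13195) returns 29, matching the example in the problem statement, and largest_prime_factor(600851475143) returns 6857 after only a few thousand loop iterations.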
I am trying to convert a Rainflow cycle counting algorithm which is in Fortran, which is a language I am not familiar with, into Matlab.
There is a ready made Rainflow I've downloaded for Matlab but that does not fit the requirements of my project so I'm trying to build one from scratch.
Here is the Fortran code:
      INTEGER BUFFER (4096), INDEX, VALUE, RANGE, MEAN, X, Y
      INDEX = 0
10    CONTINUE
      call 'get next peak/valley', VALUE
      INDEX = INDEX + 1
      BUFFER (INDEX) = VALUE
20    CONTINUE
      IF (INDEX.LT.3) THEN
c -- not enough points to form a cycle
        GOTO 10
      ELSE
        X = ABS (BUFFER(INDEX) - BUFFER(INDEX - 1))
        Y = ABS (BUFFER(INDEX - 1) - BUFFER(INDEX - 2))
        IF (X.GE.Y) THEN
c -- cycle has been closed
          RANGE = Y
          MEAN = (BUFFER(INDEX-1) + BUFFER(INDEX-2))/2
c -- remove the cycle
          INDEX = INDEX - 2
          BUFFER(INDEX) = BUFFER(INDEX+2)
c -- see if this value closes any more cycles
          GOTO 20
        ELSE
          GOTO 10
        END IF
      END IF
I had downloaded f2matlab (a Fortran to Matlab converter) but it requires a Fortran compiler which I do not have.
The bits I don't really understand how I can convert are:
The call 'get next… line (is this an input()?)
The BUFFER(4096) etc (is this a bit large to be a matrix in matlab?)
The GOTO/CONTINUE structure.
What do they mean, in English (or Matlab)?
I have seen "How to translate fortran goto state to matlab" and "translating loop from Fortran to MATLAB", but they do not help me very much.
This
call 'get next peak/valley', VALUE
isn't (currently) syntactically valid Fortran and I'm not sure whether any compiler of yore would have understood it either. I guess that it means get a VALUE for use in the following bits of code.
INTEGER BUFFER (4096)
is a simple declaration that BUFFER is a vector of 4096 integers, nothing to scare Matlab in that volume of data.
Finally, GOTO is an unconditional jump and the number following it is the label of the line to jump to, so GOTO 10 means execute the line with label 10 next. It was fairly common in FORTRAN of the vintage you are showing us to jump to a CONTINUE statement which is, in this context, a no-operation, execution continues to the next line.
In another context, with DO loops CONTINUE would have marked the end of the block of code inside the scope of the loop and would have a subtly different effect.
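To show how the two labels map onto structured loops, here is a sketch of the same control flow in Python rather than MATLAB. The function name rainflow is hypothetical, and it assumes the peaks and valleys are already available as a list, standing in for the 'get next peak/valley' call. The outer for loop plays the role of label 10 and the inner while loop the role of label 20; the same two-loop shape carries over to a MATLAB for/while pair.

```python
def rainflow(peaks):
    """Return (cycles, residual): closed cycles as (range, mean) pairs and
    the points left in the buffer that never closed a cycle."""
    buffer = []   # grows as needed, so no fixed size of 4096 is required
    cycles = []
    for value in peaks:               # label 10: get next peak/valley
        buffer.append(value)
        while len(buffer) >= 3:       # label 20: enough points for a cycle?
            x = abs(buffer[-1] - buffer[-2])
            y = abs(buffer[-2] - buffer[-3])
            if x < y:
                break                 # GOTO 10: read the next point
            # Cycle closed. The Fortran uses INTEGER arithmetic for the
            # mean; this sketch uses ordinary float division instead.
            cycles.append((y, (buffer[-2] + buffer[-3]) / 2.0))
            # Remove the cycle: drop its two points, keep the newest one.
            newest = buffer.pop()
            buffer.pop()
            buffer.pop()
            buffer.append(newest)
            # Falling through repeats the check, like GOTO 20.
    return cycles, buffer
```

For the made-up sequence [0, 5, 2, 4, 1, 6] this returns the cycles [(2, 3.0), (4, 3.0)] and leaves [0, 6] in the buffer.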
I'm a junior programmer and I know the basics of Pascal and C++. I made a player-versus-computer Tic Tac Toe game and the game is all finished.
The computer picks a random place to put its Os on the board, and that's not good.
I thought that I should write multiple procedures that check every winning position, so that the computer would either try to block the player's Xs or make a winning position, BUT this would have cost a lot of time because of all the ifs.
Then I thought of a simpler version with some kind of ifs, but it would still have taken a lot of time to do.
Then I thought deeper: what about a find-four game? How on earth would someone manage to check every available space, and how could someone write a function that checks absolutely every winning or developing position of the player or computer? Oh, and wait, that's not ALL: what if the player is playing some tricks to block the computer? How would the computer know that?!? For sure, that would take ages to program. And I am not even talking about something that seems more impossible: chess.
So here I am, telling myself that there SHOULD be a much simpler way for the computer to search and solve such problems than tons of ifs.
In this case, if any of you know a way of solving this, how can I write the simplest procedure to block and beat the player in a Tic Tac Toe game?
If someone wants to check my code or use it: http://pastebin.com/jhyUn7d1
What you're looking for is Minimax.
Using this algorithm the computer will never lose a Tic Tac Toe game (with perfect play from both sides the game ends in a draw), or you could limit the depth to which the computer analyzes the moves in order to achieve some kind of medium difficulty.
It's not hard to implement: you should be familiar with recursion and you're set. Of course the implementation differs according to your code, but the Wikipedia page offers a pretty good starting point.
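As a rough illustration, here is a minimal minimax sketch in Python rather than Pascal or C++. It assumes a flat list of nine cells holding "X", "O" or " ", with the computer playing "O"; that layout and the names WIN_LINES, winner and minimax are assumptions made for this example, not taken from the posted game.

```python
# The eight winning lines on a board indexed 0..8, row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that side has three in a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board, player):
    """Return (score, move) for the side to move.
    Scores are from O's point of view: +1 O wins, 0 draw, -1 X wins."""
    won = winner(board)
    if won == "O":
        return 1, None
    if won == "X":
        return -1, None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None                    # board full: draw
    other = "X" if player == "O" else "O"
    best_score, best_move = None, None
    for move in moves:
        board[move] = player              # try the move
        score, _ = minimax(board, other)  # let the opponent answer
        board[move] = " "                 # undo it
        if best_score is None:
            better = True
        elif player == "O":
            better = score > best_score   # O maximizes the score
        else:
            better = score < best_score   # X minimizes the score
        if better:
            best_score, best_move = score, move
    return best_score, best_move
```

Calling minimax(board, "O") on the current position returns the square the computer should take. No chain of ifs per pattern is needed: blocking and double threats fall out of the search, because any move that lets the opponent win scores -1 and is never preferred over a move that scores 0 or +1.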
A Tic Tac Toe algorithm is something like:
Take the spot if it wins you the game
Take the spot if you would otherwise lose (block)
Take corner
Take non-corner non-center
Take center
The short answer is "try all the different moves until the game is won, and record which ones lead to computer winning".
Long answer:
For a limited-size TTT game, the number of possible moves before the game is won isn't that large, so simply try each possible move, then recursively try all possible opponent moves, and keep going until the game ends. Give each move a "score" for how well it went (e.g. how many of the resulting games ended in success for the computer and how many in success for the opponent), and pick the move with the "best" result. Beware that you will probably end up with something that is nearly impossible to win against if you do it well.
I recently dealt with this, although my code was in C#.
I came up with a way of scoring each candidate move. The approach I took creates a score based on the number of moves that would then be required for a win (fewer moves needed results in a higher score).
My algorithm also considers the combined number of moves for multiple squares. As a result, the algorithm favors moves that would produce multiple potential wins (the only real tactic I know about for Tic Tac Toe). For example, it is possible sometimes to make a move that produces two potential wins that must be blocked. Since the opponent can only block one, it produces a win.
I posted my entire code and a description of it in the article A Tic-Tac-Toe Game Engine.
I did this once, a long time ago. I don't know if I still have the code lying around...
Anyway, I created a function, return type int, which was the square in which the computer should place its piece (assuming 0 is top left and 8 is bottom right square). Yours uses a 2D array, so would be a little different.
Anyway, for each row, column and diagonal, check to see if any two pieces on that row belong to the player. If they don't, check for the same but belonging to the computer. On the first row that this is true, check the remaining piece - if it's available, put the piece there for the win. If you have a player-dominated row, check that you don't already have a piece there and stick it in to block.
const int PlayerPiece = 1;
const int CPiece = 2;
const int Empty = 0;
int board[3][3];
if(board[0][0] == PlayerPiece && board[0][1] == PlayerPiece && board [0][2] == Empty)
{
    //Put_Your_Piece_In_[0][2]
}
You could then go on to changing it so that it could check each row i.e.
int numRows = 3;
for(int i = 0; i < numRows; i++)
{
    if(board[i][0] == PlayerPiece && board[i][1] == PlayerPiece && board[i][2] == Empty)
    {
        //Put_Piece_In_[i][2]
    }
}
Then, do the same for the columns and the diagonals.
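The row, column and diagonal checks can be folded into one loop over a table of lines. The sketch below is in Python rather than C++ and uses the flat 0 to 8 numbering mentioned earlier in this answer (0 is top left, 8 is bottom right). The names LINES, find_completing_move and choose_move are hypothetical, and preferring a winning move over a block is a choice made for this example.

```python
EMPTY, PLAYER, COMPUTER = 0, 1, 2

# Rows, columns and diagonals as index triples into a flat 9-cell board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def find_completing_move(board, piece):
    """Return the empty square of a line already holding two of `piece`,
    or None if there is no such line."""
    for line in LINES:
        values = [board[i] for i in line]
        if values.count(piece) == 2 and values.count(EMPTY) == 1:
            return line[values.index(EMPTY)]
    return None


def choose_move(board):
    """Win if possible, otherwise block the player, otherwise None."""
    move = find_completing_move(board, COMPUTER)
    if move is None:
        move = find_completing_move(board, PLAYER)
    return move
```

When choose_move returns None there is nothing to win or block, and the caller can fall back to any other rule, such as a random empty square. Unlike the single if shown above, counting pieces per line also catches a gap in the first or middle position of a line.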
You could always consider that Tic-Tac-Toe is essentially just a magic square, described fairly well here: http://www.sciforums.com/showthread.php?134281-An-isomorphism-Tic-Tac-Toe-on-Magic-Square
There is a perfect strategy for Tic-Tac-Toe available on Wikipedia. It is really simple. Due to the small size of the grid, the number of cases you need to test (e.g. whether there are 2 pieces in a row) is very small.