Minimax seems to do a great job of not losing, but it's very fatalistic in assuming the opponent won't make a mistake. Of course a lot of games are solved to a draw, but one should be playing for "Push as hard as possible for a win without risking losing", even when no forced wins are available.
That is, given two trees with the same (drawn) end position given optimal play, how could the algorithm be adjusted to prefer the one which is most likely to win if the opponent makes a sub-optimal move, or make the opponent more likely to slip up?
Using the simple example of Tic-Tac-Toe, a stronger player would often aim to set up forks and thereby guarantee wins. Even though the opponent could see such a trick coming and stop it beforehand, they're more likely to miss that than if you just put two Xs in an empty row and hope they momentarily forget what game they're playing. Similarly a strong player would tend to start in the centre or perhaps a corner, but in simple minimax there's no reason (since you can still force a draw) not to pick an edge square.
If I understand your question correctly, you're asking how to adjust the minimax algorithm so that it doesn't assume the opponent always makes the best move.
Look into the expectiminimax algorithm, a variation of minimax.
Essentially, instead of dealing with only min or max nodes, it introduces chance nodes, which weight each of the opponent's moves by the probability that the opponent will choose it.
To make it even simpler, you could assume the opponent selects each move (node) with equal probability.
In short, when it's the opponent's turn, instead of returning the minimum score, return the average score of their possible moves.
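A minimal sketch of that idea in Python, where `legal_moves`, `apply_move`, `is_terminal` and `score` are hypothetical placeholders for whatever interface your own engine provides:

```python
def expectimax(state, depth, maximizing):
    # legal_moves, apply_move, is_terminal and score are placeholders
    # for your engine's own game interface.
    if depth == 0 or is_terminal(state):
        return score(state)
    values = [expectimax(apply_move(state, m), depth - 1, not maximizing)
              for m in legal_moves(state)]
    if maximizing:
        return max(values)            # our turn: pick the best move
    return sum(values) / len(values)  # opponent: assume uniform random play
```

With a better opponent model you would replace the plain average with a probability-weighted one.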
How about tweaking "min" nodes?
In regular minimax, when evaluating a position for the opponent, the score is the minimum over all of their moves. Injecting some optimism (from the "max" player's point of view) into the search could be done by using a different function than the minimum.
Some things that could be tried out:
- using the second-worst score
- using a mix between the min and the average (or median)
Perhaps this should be tied to an optimism factor that increases with the depth of the node. That way you avoid counting on the opponent missing a very bad move near the root (which in most games would be a more obvious move), while allowing more optimism deeper in the tree.
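A minimal sketch of that min/average blend, using the same kind of hypothetical game interface as the expectiminimax sketch above:

```python
def soft_min(values, optimism):
    # optimism in [0, 1]: 0 gives the classic min, 1 gives a pure average.
    return (1 - optimism) * min(values) + optimism * (sum(values) / len(values))

def optimistic_minimax(state, depth, max_depth, maximizing):
    if depth == max_depth or is_terminal(state):
        return score(state)
    values = [optimistic_minimax(apply_move(state, m), depth + 1,
                                 max_depth, not maximizing)
              for m in legal_moves(state)]
    if maximizing:
        return max(values)
    # Optimism grows with depth: near the root, assume near-optimal replies;
    # deeper down, assume the opponent is more likely to slip up.
    return soft_min(values, optimism=depth / max_depth)
```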
I have a program which plays Connect Four against a human opponent using either the standard minimax algorithm or minimax with alpha-beta pruning. Both algorithms have a depth limit, after which they apply an evaluation function. As part of the project I had to create performance curves showing the number of nodes of the search tree expanded by each algorithm, per turn of the computer. The curves show a downward trend, as expected, since the number of possible states goes down as the game progresses. However, I can't explain why the number of nodes increases in the second turn of the computer, most prominently in the alpha-beta case, as can be seen in the performance curves.
These curves were built from a test game where the human played first, with a depth limit of 8 plies. Does anyone know why the curves are not strictly decreasing?
Minimax can see an increase in the number of nodes depending on how the game branches at different plies. Consider a silly game where, in the first 8 plies, both players are forced to "do nothing", and on the 9th ply, they get many more options. If your depth is 8 plies, you'll certainly see more nodes expanded once the minimax is allowed to reach the 9th ply.
This is also true with alpha-beta pruning. Furthermore, with alpha-beta pruning the order in which nodes are evaluated affects the number of nodes expanded, so the counts vary more unpredictably from turn to turn.
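To put rough numbers on that (these are the standard best/worst-case results, not measurements of the program in question):

```python
# With branching factor b and depth limit d, plain minimax visits about b^d
# leaves; alpha-beta with perfect move ordering visits only about 2*b^(d/2),
# and with poor ordering it degrades back toward b^d.
b, d = 7, 8  # Connect Four: at most 7 legal moves per ply, 8-ply depth limit
print("minimax (= alpha-beta worst case):", b ** d)                # 5764801
print("alpha-beta with perfect ordering:", 2 * b ** (d // 2) - 1)  # 4801
```

So a turn where the move ordering happens to be poor can easily expand more nodes than the previous turn, even though the game has progressed.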
I successfully implemented a negascout game engine, which works well, but deterministically. That means I can replay the same game over and over again, because for a given position the engine yields the same best move every time. This is unwanted in my case, because I want to compete with my algorithm in coding tournaments, and with the deterministic behaviour an opponent can easily write a program that wins by just replaying a sequence of winning moves against mine.
My question is, what is the most efficient and elegant way to make it less deterministic? I could add a random offset to my position evaluation, but I'm afraid this could worsen the evaluation quality. Is there a standard way to do this?
Just start from a random opening position. Don't add randomness to your engine until you've worked out the bugs. If two or more moves score equally, you can randomise among them in the move ordering, as in the sketch below.
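A minimal sketch of that tie-breaking, assuming a hypothetical `scored_moves` list of (score, move) pairs produced by your root search:

```python
import random

def pick_move(scored_moves):
    # Keep the search itself deterministic; only break ties at random.
    best = max(score for score, _ in scored_moves)
    candidates = [move for score, move in scored_moves if score == best]
    return random.choice(candidates)
```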
I am currently brainstorming strategies for computing, over a 2D array, the distance of every point from sets of points with specific attributes. A good example (and one of my likely uses) would be a landscape with pools of water on it: the idea would be to calculate the distance of every point in this landscape from water.
These are the criteria that I would like to abide by, and their reasoning:
1) Execution speed is my greatest concern. The terrain is dynamic and the code will need to run in a semi-continuous manner. What I mean by that is that there will be periods of terrain change which will require constant updates.
2) Memory overhead is not a major concern of mine. This will be run as the primary application.
3) It must be able to update dynamically. See #1 for the reasons behind this. These updates can be localized.
4) Multi-threading is a possibility. I already make extensive use of multi-threading, as my simulation is very CPU intensive. I'd prefer to avoid it here, since that would speed up development, but I can do it if necessary.
I have come up with the following possible approach and am looking for feedback and/or alternative suggestions.
1) Iterate through the entire array and collect, in a container class, the positions of the points that are next to those with the particular property. Assign a value of 1 to these points and 0 to the points with the property.
2) Use those positions to look up the adjacent points that are the next distance away, and place them in a second container class.
3) Repeat this process until no points are left unassigned.
4) Save the list of points exactly one unit away for future updates.
The idea is basically to flow outward from distance 0, saving computation by continually narrowing the list of points in the loop.
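That outward flow is essentially a multi-source breadth-first search. A minimal Python sketch, assuming the terrain is a 2D list and `is_water` is a hypothetical predicate for the special attribute:

```python
from collections import deque

def distance_map(grid, is_water):
    # Multi-source BFS: seed the queue with all water cells at distance 0,
    # then expand outwards one 4-neighbour step at a time.
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    frontier = deque()
    for r in range(rows):
        for c in range(cols):
            if is_water(grid[r][c]):
                dist[r][c] = 0
                frontier.append((r, c))
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                frontier.append((nr, nc))
    return dist
```

For localized updates you could seed the queue with just the cells around the change, though removing water (distances increasing) needs more care than adding it.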
1) The only other way of doing this well that I can think of would be with the Euclidean distance formula, but your way seems like it would use less CPU time (since the distance-formula approach must compute the distance from every cell to every special point).
2) Or, if I understand your goal correctly, you could iterate through once, saving all of the points with your special attribute in a container, and then iterate through one more time, computing the distance from each cell to each of the saved points and keeping the minimum (see the sketch below). Please comment and ask about it if this explanation is unclear.
If you want to run this computation in the background, you have no choice but to multi-thread the whole program. However, whether or not to multi-thread this particular process is up to you. If you go with the second option I provided, I think you will have cut CPU usage down enough to forgo multi-threading it. The more I think about it, the more I like the second option.
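For comparison, a sketch of that second option (same hypothetical `is_water` predicate; it costs O(cells × special points), so it only wins when the set of special points is small):

```python
import math

def distance_map_bruteforce(grid, is_water):
    # Pass 1: collect the special points. Pass 2: for every cell, take the
    # minimum Euclidean distance to that set. Assumes at least one such point.
    special = [(r, c) for r, row in enumerate(grid)
                      for c, v in enumerate(row) if is_water(v)]
    return [[min(math.hypot(r - sr, c - sc) for sr, sc in special)
             for c in range(len(grid[0]))]
            for r in range(len(grid))]
```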
I'm trying to simulate the damping of the hand on a guitar string, applied to an already recorded/sampled open-string sound. I've been trying a low-pass filter with a moving cutoff frequency, but that didn't make it sound like a damped string, just like a loss of the higher frequencies.
Could someone help me find good material on this, something a human could at least grasp a bit?
It's going to be implemented in C++, and I have searched and found almost everything about the Karplus-Strong string synthesis algorithm, but that's not what I want: I want the damping applied to a sample of an already recorded, really played string.
This is probably not as simple as you think. It is not just a matter of the right filter: the sound will also decay faster, and this is likely different for different frequencies.
If you have a guitar at your disposal, you could measure the sound spectrum over time when you strike a string normally, and once more while you damp it. You can measure the difference in the initial spectrum as well as the difference in decay rate.
You can apply this information to the sound you want to alter, but you'd need to convert the signal to a frequency-versus-time representation first (e.g. with a short-time Fourier transform).
But this may be far too complicated for what you had in mind. A simpler approach would be to first increase the decay by multiplying the signal by e^(-w*t), with w as the extra decay rate. You could split the signal into low-pass and high-pass components and apply different decay rates, with the high-frequency component getting the faster decay.
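A sketch of that split-band decay in Python/NumPy (easy to port to C++); the split frequency and decay rates here are made-up values to tune by ear or from the measurements described above:

```python
import numpy as np
from scipy.signal import butter, lfilter

def damp(signal, sample_rate, w_low=3.0, w_high=12.0, split_hz=1000.0):
    # Crude complementary split: Butterworth low-pass, high band = remainder.
    b, a = butter(4, split_hz / (sample_rate / 2), btype="low")
    low = lfilter(b, a, signal)
    high = signal - low
    # Extra exponential decay e^(-w*t), faster for the high band.
    t = np.arange(len(signal)) / sample_rate
    return low * np.exp(-w_low * t) + high * np.exp(-w_high * t)
```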
I'm bulk-loading data into a Django model, and have noticed that the number of objects loaded into memory before doing a commit affects the average time to save each object. I realise this can be due to many different factors, so I would rather focus on optimising this STEPSIZE variable.
What would be a simple algorithm for optimising this variable in real time, while taking into account that the optimum might also change during the process?
I imagine this would be some sort of gradient descent, with a bit of jitter to look for changes in the landscape? Is there a formally defined algorithm for this type of search?
I'd start out assuming that:
1) Your function increases monotonically in both directions away from the optimum.
2) You roughly know the size of the region in which the optimum will live.
Then I'd recommend a bracket and subdivide approach as follows:
Evaluate your function outwards from the previous optimum in both directions. Stop the search in each direction when a value higher than the previous optimum is reached. With the assumptions above, this gives you a bracketed interval in which the new optimum lives. Break this interval into left and right halves by evaluating its midpoint, choose whichever half has the lower values, and repeat recursively until the interval is small enough for your liking. A sketch of this loop is below.
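A minimal sketch, assuming `f(stepsize)` is a reasonably noise-free measurement of the average save time; in practice you'd average repeated timings and re-run this periodically, since the optimum drifts:

```python
def refine_optimum(f, prev_opt, step=1, tol=1):
    # Bracket: walk outwards from the previous optimum in both directions
    # until the measured value rises above the value at the previous optimum.
    base = f(prev_opt)
    lo = prev_opt
    while lo - step >= 1 and f(lo - step) < base:
        lo -= step
    hi = prev_opt
    while f(hi + step) < base:
        hi += step
    lo, hi = max(1, lo - step), hi + step  # optimum now lies in [lo, hi]
    # Subdivide: keep whichever half of the interval has the lower values.
    while hi - lo > tol:
        mid = (lo + hi) // 2
        if f(mid) < f(mid + 1):
            hi = mid
        else:
            lo = mid + 1
    return lo
```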