Donald Knuth's algorithm for Mastermind - can we do better?

I implemented Donald Knuth's 1977 algorithm for Mastermind: https://www.cs.uni.edu/~wallingf/teaching/cs3530/resources/knuth-mastermind.pdf
I was able to reproduce his results - 5 guesses to win in the worst case and 4.476 guesses on average.
And then I tried something different. I ran Knuth's algorithm repeatedly and shuffled the entire list of combinations randomly each time before starting. I was able to land on a strategy with 5 guesses to win in the worst case (like Knuth) but with 4.451 guesses to win on average. Better than Knuth.
Is there any previous work on outperforming Knuth's algorithm on average while maintaining the worst case? I could not find any indication of it on the web so far.
Thanks!
Alon

In the paper, Knuth describes how the strategy was chosen:
Table 1 was found by choosing at every stage a test pattern that minimizes the maximum number of remaining possibilities, over all conceivable responses by the codemaker. If this minimum can be achieved by a “valid” pattern (a pattern that makes “four black hits” possible), a valid one should be used. Subject to this condition, the first such test pattern in numeric order was selected. Fortunately this procedure turns out to guarantee a win in five moves.
So it is to some extent a greedy strategy (trying to make the most progress at each step, rather than overall), and moreover there's an ad-hoc tie-breaking strategy. This means that it need not be optimal in expected value, and indeed Knuth says exactly that:
The strategy in Table 1 isn’t optimal from the “expected number of moves” standpoint, but it is probably very close. One line that can be improved [...]
So already at the time the paper was published, Knuth was aware that it's not optimal and even had an explicit example.
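For concreteness, here is a rough sketch of the minimax step quoted above. The encoding of codes as plain ints and the feedback helper are assumptions made for illustration, and Knuth's tie-breaking rule (prefer a guess that can still be the answer, then take the first in numeric order) is only partly captured by iterating allCodes in numeric order:

#include <algorithm>
#include <utility>
#include <vector>

// feedback(guess, code) is assumed to return the (black, white) peg counts;
// it is not defined here.
int chooseGuess(const std::vector<int>& allCodes,
                const std::vector<int>& stillPossible,
                std::pair<int, int> (*feedback)(int guess, int code))
{
    int best = allCodes.front();
    int bestWorst = static_cast<int>(stillPossible.size()) + 1;
    for (int guess : allCodes) {
        // Count how many possibilities each (black, white) response would leave.
        int bucket[5][5] = {};
        for (int code : stillPossible) {
            std::pair<int, int> r = feedback(guess, code);
            ++bucket[r.first][r.second];
        }
        int worst = 0;
        for (int b = 0; b <= 4; ++b)
            for (int w = 0; w <= 4; ++w)
                worst = std::max(worst, bucket[b][w]);
        if (worst < bestWorst) {   // keep the guess whose worst response is smallest
            bestWorst = worst;
            best = guess;
        }
    }
    return best;
}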
When this paper was republished in his collection Selected Papers on Fun and Games (2010), he adds a 5-page addendum to the 6-page paper. In this addendum, he starts by mentioning randomization in the very first paragraph, and discusses the question of minimizing the expected number of moves. Analyzing it as the sum of all moves made over all 1296 possible codewords, he mentions a few papers:
His original algorithm gave 5801 (average of 5801/1296 ≈ 4.47608), and the minor improvement gives 5800 (≈ 4.4753).
Robert W. Irving, “Towards an optimum Mastermind strategy,” Journal of Recreational Mathematics 11 (1978), 81-87 [while staying within the “at most 5” achieves 5664 ⇒ ≈4.37]
E. Neuwirth, “Some strategies for Mastermind,” Zeitschrift für Operations Research 26 (1982), B257-B278 [achieves 5658 ⇒ ≈4.3657]
Kenji Koyama and Tony W. Lai, “An optimal Mastermind strategy,” Journal of Recreational Mathematics 25 (1993), 251-256 [achieves 5626 ⇒ ≈4.34104938]
The last of these is the best possible, as it was found with an exhaustive depth-first search. (Note that all of these papers can do slightly better in the expected number of moves, if you allow them to take 6 moves sometimes... I gave the numbers with the “at most 5” constraint because that's what the question here asks for.)
You can make this more general (harder) by assuming the codemaker is adversarial and does not choose uniformly at random among the 1296 possible codewords, but according to whatever distribution will make it hardest for the codebreaker. Finally he mentions a lot of work done by Tom Nestor, which conclusively settles many such questions.
You might have fun trying to follow up or reproduce these results (e.g. write the exhaustive search program). Enjoy!

As far as I know, there is no published work about this effect yet. I made this observation some time ago: one can get better results by not always choosing the (canonically) first trial out of the "one-step-lookahead set". I observed different results by not starting with 1122 but with, e.g., 5544. One can also choose randomly instead of using the canonically first one. Yes, I agree with you that this is an interesting point - but a very, very special one.


Evolutionary Algorithm without an objective function

I'm currently trying to find good parameters for my program (about 16 parameters and execution of the program takes about a minute). Evolutionary algorithms seemed like a nice idea and I wanted to see how they perform.
Unfortunately I don't have a good fitness function because the variance of my objective function is very high (I can not run it often enough without waiting until 2016). I can, however, compute which set of parameters is better (test two configurations against each other). Do you know if there are evolutionary algorithms that only use that information? Are there other optimization techniques more suitable? For this project I'm using C++ and MATLAB.
// Update: Thank you very much for the answers. Both look promising but I will need a few days to evaluate them. Sorry for the delay.
If your pairwise test gives a proper total ordering, i.e. if a >= b and b >= c implies a >= c (plus some other conditions), then maybe you can construct a ranking objective on the fly and use CMA-ES to optimize it. CMA-ES is an evolutionary algorithm and is invariant to order-preserving transformations of the function value and to angle-preserving transformations of the inputs. Furthermore, because it's a second-order method, its convergence is very fast compared to other derivative-free search heuristics, especially in higher-dimensional problems where random searches like genetic algorithms take forever.
If you can compare solutions in a pairwise fashion then some sort of tournament selection approach might be good. The Wikipedia article describes using it for a genetic algorithm, but it is easily applied to an evolutionary algorithm. What you do is repeatedly select a small set of solutions from the population and have a tournament among them. For simplicity the tournament size could be a power of 2. If it were 8, you would pair those 8 up at random and compare them, selecting 4 winners. Pair those up and select 2 winners. In a final round, select an overall tournament winner. This solution can then be mutated 1 or more times to provide member(s) for the next generation.
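A minimal sketch of that single-elimination tournament, with the expensive pairwise comparison passed in as a callable (all names here are placeholders, not anything from the question):

#include <algorithm>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

struct Individual { std::vector<double> params; };

// Runs one single-elimination tournament and returns the winner. The
// pairwise test ("does a beat b?") is supplied by the caller, since in the
// question it is an expensive head-to-head run of the program being tuned.
Individual tournamentWinner(const std::vector<Individual>& population,
                            std::size_t tournamentSize,
                            const std::function<bool(const Individual&, const Individual&)>& beats,
                            std::mt19937& rng)
{
    std::uniform_int_distribution<std::size_t> pick(0, population.size() - 1);
    std::vector<Individual> round;
    for (std::size_t i = 0; i < tournamentSize; ++i)
        round.push_back(population[pick(rng)]);        // contestants drawn at random

    while (round.size() > 1) {
        std::shuffle(round.begin(), round.end(), rng); // pair contestants up at random
        std::vector<Individual> winners;
        for (std::size_t i = 0; i + 1 < round.size(); i += 2)
            winners.push_back(beats(round[i], round[i + 1]) ? round[i] : round[i + 1]);
        if (round.size() % 2 == 1)
            winners.push_back(round.back());           // the odd one out gets a bye
        round = std::move(winners);
    }
    return round.front();
}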

Deterministically checking whether a large number is prime or composite?

I'm searching for an algorithm to primality test large (like 10^200) numbers.
Are there any good algorithms?
Ideally, I'd prefer an algorithm that isn't probabilistic.
Note: Numbers have over 50 and less than 200 digits.
If you're looking for a non-probabilistic test, you may want to check out the AKS primality testing algorithm, which runs in roughly O(log^6 n) time. For the number of digits you have, this is probably feasible.
That said, probabilistic primality tests are extremely good and many have exponentially small error rates. I would suggest using one of those unless there's a good reason not to.
EDIT: I just found this page containing several C++ implementations of AKS. I have no idea whether they work correctly or not, but they might be a good starting point.
Hope this helps!
Typically we would use a probable prime test. I recommend BPSW, which you can follow by a Frobenius test and/or some random-base Miller-Rabin tests if you want more certainty. This will be fast and arguably more certain than running some proof implementations.
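As a minimal sketch of running a probable-prime test in practice, GMP's mpz_probab_prime_p does the job (the exact mix of tests it performs - trial division, Miller-Rabin rounds, and in newer versions a BPSW-style test - depends on the GMP version); compile with -lgmp:

#include <cstdio>
#include <gmp.h>

int main()
{
    mpz_t n;
    // A 200-digit candidate would be read the same way; this short value
    // (2^61 - 1, a Mersenne prime) is just for illustration.
    mpz_init_set_str(n, "2305843009213693951", 10);

    // 0 = composite, 1 = probably prime, 2 = definitely prime.
    int result = mpz_probab_prime_p(n, 25);   // 25 rounds of testing
    std::printf("result = %d\n", result);

    mpz_clear(n);
    return 0;
}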
Assume you say that isn't good enough. Then you really want to use ECPP and get a certificate. Reasonable implementations are Primo or ecpp-dj. These can prove primality of 200 digit numbers in well under a second, and return a certificate that can be independently verified.
APR-CL is another reasonable method. The downside is that it doesn't return a certificate so you're trusting the implementation -- you get a "yes" or "no" output that is deterministically correct if the implementation was correct. Pari/GP uses APR-CL with its isprime command, and David Cleaver has an excellent open source implementation: mpz_aprcl. Those implementations have had some code review and daily use in various software so should be good.
AKS is a horrible method to use in practice. It doesn't return a certificate, and it's not too hard to find broken implementations, which completely defeats the point of using a proof method vs. good probable prime tests in the first place. It's also horrendously slow. 200 digit numbers are well past the practical point for any implementation I'm aware of. There is a "fast" one included in the previously mentioned ecpp-dj software so you can try it out, and there are quite a few other implementations to be found.
For some idea of speed, here are times of some implementations. I don't know of any implementations of AKS, APR-CL, or BPSW that are faster than the ones shown (please comment if you know of one). Primo starts off a bit slower than ecpp-dj shown, but at 500 or so digits it is faster, and has a better slope past that. It is the program of choice for large inputs (2,000-30,000 digits).

Reinventing The Wheel: Random Number Generator

So I'm new to C++ and am trying to learn some things. As such I am trying to make a Random Number Generator (RNG or PRNG if you will). I have basic knowledge of RNGs, like you have to start with a seed and then send the seed through the algorithm. What I'm stuck at is how people come up with said algorithms.
Here is the code I have to get the seed.
#include <ctime>

// Returns a seed derived from the current wall-clock time.
int getSeed()
{
    time_t randSeed = time(NULL);
    return static_cast<int>(randSeed);
}
Now I know that there are prebuilt RNGs in C++, but I'm looking to learn, not just copy other people's work and try to figure it out.
So if anyone could lead me to where I could read or show me examples of how to come up with algorithms for this I would be greatly appreciative.
First, just to clarify, any algorithm you come up with will be a pseudo random number generator and not a true random number generator. Since you would be making an algorithm (i.e. writing a function, i.e. making a set of rules), the random number generator would have to eventually repeat itself or do something similar which would be non-random.
Examples of truly random number generators are ones that capture random noise from nature and digitize it. These include:
http://www.fourmilab.ch/hotbits/
http://www.random.org/
You can also buy physical equipment that generates white noise (or some other source of randomness) and digitally capture it:
http://www.lavarnd.org/
http://www.idquantique.com/true-random-number-generator/products-overview.html
http://www.araneus.fi/products-alea-eng.html
In terms of pseudo random number generators, the easiest ones to learn (and ones that an average lay person could probably make on their own) are the linear congruential generators. Unfortunately, these are also some of the worst PRNGs there are.
Some guidelines for determining what is a good PRNG include:
Periodicity (what is the range of available numbers?)
Consecutive numbers (what is the probability that the same number will be repeated twice in a row?)
Uniformity (is it just as likely to pick numbers from a certain sub-range as from another sub-range?)
Difficulty in reverse engineering it (if it is close to truly random then someone should not be able to figure out the next number it generates based on the last few numbers it generated)
Speed (how fast can I generate a new number? Does it take 5 or 500 arithmetic operations?)
I'm sure there are others I am missing.
One of the more popular ones right now that is considered good in most applications (i.e. not cryptography) is the Mersenne Twister. As you can see from the link, it is a simple algorithm, perhaps only 30 lines of code. However, trying to come up with those 20 or 30 lines of code from scratch takes a lot of brainpower and study of PRNGs. Usually the most famous algorithms are designed by a professor or industry professional who has studied PRNGs for decades.
I hope you do study PRNGs and try to roll your own (try Knuth's Art of Computer Programming or Numerical Recipes as a starting place), but I just wanted to lay all this out because, at the end of the day (unless PRNGs will be your life's work), it's much better to just use something someone else has come up with. Also, along those lines, I'd like to point out that historically compilers, spreadsheets, etc. don't use what most mathematicians consider good PRNGs, so if you need a high-quality PRNG don't use the standard library one in C++, Excel, .NET, Java, etc. until you have researched what they are implemented with.
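For what it's worth, C++11's <random> header does expose the Mersenne Twister directly as std::mt19937, so a minimal usage sketch looks like this:

#include <iostream>
#include <random>

int main()
{
    std::random_device rd;                        // non-deterministic seed source
    std::mt19937 gen(rd());                       // MT19937 engine
    std::uniform_int_distribution<int> dist(1, 6);

    for (int i = 0; i < 5; ++i)
        std::cout << dist(gen) << ' ';            // five simulated die rolls
    std::cout << '\n';
    return 0;
}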
A linear congruential generator is commonly used and the Wiki article explains it pretty well.
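A minimal sketch of one, using the well-known constants from Numerical Recipes (any other well-studied parameter set works the same way):

#include <cstdint>

// Linear congruential generator: state = (a * state + c) mod 2^32.
// The modulus is implicit in 32-bit unsigned overflow.
class Lcg {
public:
    explicit Lcg(std::uint32_t seed) : state_(seed) {}

    std::uint32_t next()
    {
        state_ = 1664525u * state_ + 1013904223u;
        return state_;
    }

private:
    std::uint32_t state_;
};

// Usage: Lcg rng(getSeed()); std::uint32_t r = rng.next();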
To quote John von Neumann:
Anyone who considers arithmetical methods of producing random digits is of course in a state of sin.
This is taken from Chapter 3 Random Numbers of Knuth's book "The Art of Computer Programming", which must be the most exhaustive overview of the subject available. And once you have read it, you will be exhausted. You will also know why you don't want to write your own random number generator.
The correct solution is whatever best fulfills the requirements, and the requirements of every situation will be unique. This is probably the simplest way to go about it:
Create a large one-dimensional array populated with "real" random values.
"Seed" your pseudo-random generator by calculating the starting index from the system time.
Iterate through the array and return the next value for each call to your function.
Wrap around when it reaches the end.
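A minimal sketch of that lookup-table approach, with a tiny placeholder table standing in for values captured from a hardware source or a service like random.org:

#include <cstddef>
#include <ctime>

static const int kRandomTable[] = { 62, 7, 91, 44, 13, 85, 29, 70, 56, 3 };
static const std::size_t kTableSize = sizeof(kRandomTable) / sizeof(kRandomTable[0]);

// "Seed": the starting index is derived from the system time.
static std::size_t g_index = static_cast<std::size_t>(time(NULL)) % kTableSize;

int nextTableRandom()
{
    int value = kRandomTable[g_index];
    g_index = (g_index + 1) % kTableSize;   // wrap around at the end
    return value;
}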

A few sorting questions

I have found a way that improves (as far as I have tested) upon the quicksort algorithm beyond what has already been done. I am working on testing it and then I want to get the word out about it. However, I would appreciate some help with some things. So here are my questions. All of my code is in C++ by the way.
One of the sorts I have been comparing to my quicksort is the std::sort from the C++ Standard Library. However, it appears to be extremely slow. I am only sorting arrays of ints and longs, but it appears to be around 8-10 times slower than both my quicksort and a standard quicksort by Bentley and McIlroy (and maybe Sedgewick). Does anyone have any ideas as to why it is so slow? The code I use for the sort is just
std::sort(a,a+numelem);
where a is the array of longs or ints and numelem is the number of elements in the array. The numbers are very random, and I have tried different sizes as well as different amounts of repeated elements. I also tried qsort, but it is even worse as I expected.
Edit: Ignore this first question - it's been resolved.
I would like to find more good quicksort implementations to compare with my quicksort. So far I have a Bentley-McIlroy one and I have also compared with the first published version of Vladimir Yaroslavskiy's dual-pivot quicksort. In addition, I plan on porting timsort (which is a merge sort, I believe) and the optimized dual-pivot quicksort from the JDK 7 source. What other good quicksort implementations do you know about? If they aren't in C or C++ that might be okay because I am pretty good at porting, but I would prefer C or C++ ones if you know of them.
How would you recommend getting out the word about my additions to the quicksort? So far my quicksort seems to be significantly faster than all other quicksorts that I've tested it against. The main source of its speed is that it handles repeated elements much more efficiently than other methods that I've found. It almost completely eradicates worst case behavior without adding much time in checking for repeated elements. I posted about it on the Java forums, but got no response. I also tried writing to Jon Bentley because he was working with Vladimir on his dual-pivot quicksort and got no response (though I wasn't terribly surprised by this). Should I write a paper about it and put it on arxiv.org? Should I post in some forums? Are there some mailing lists to which I should post? I have been working on this for some time now and my method is legit. I do have some experience with publishing research because I am a PhD candidate in computational physics. Should I try approaching someone in the Computer Science department of my university? By the way, I have also developed a different dual-pivot quicksort, but it isn't better than my single-pivot quicksort (though it is better than Vladimir's dual-pivot quicksort with some datasets).
I really appreciate your help. I just want to add what I can to the computing world. I'm not interested in patenting this or any absurd thing like that.
If you have confidence in your work, definitely try to discuss it with someone knowledgeable at your university as soon as possible. It's not enough to show that your code runs faster than another procedure on your machine. You have to mathematically prove whatever performance gain you claim to have achieved through analysis of your algorithm. I'd say the first thing to do is make sure both algorithms you are comparing are implemented and compiled optimally - you may just be fooling yourself here. The likelihood of an individual achieving such a marked improvement upon such an important sorting method without already having thorough knowledge of its accepted variants just seems minuscule. However, don't let me discourage you. It should be interesting anyway. Would you be willing to post the code here?
...Also, since quicksort is especially vulnerable to worst-case scenarios, the tests you choose to run may have a huge effect, as will the choice of pivots. In general, I would say that any data set with a large number of equivalent elements or one that is already highly sorted is never a good choice for quicksort - and there are already well-known ways of combating that situation, and better alternative sorting methods.
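For reference, one of those well-known ways is three-way partitioning around the pivot, which groups the elements equal to the pivot so they are never recursed on again; a minimal sketch (pivot selection here is simply the middle element):

#include <utility>
#include <vector>

void quicksort3(std::vector<long>& a, int lo, int hi)
{
    if (lo >= hi) return;
    long pivot = a[lo + (hi - lo) / 2];
    int lt = lo, i = lo, gt = hi;
    // Invariant: a[lo..lt-1] < pivot, a[lt..i-1] == pivot, a[gt+1..hi] > pivot.
    while (i <= gt) {
        if (a[i] < pivot)      std::swap(a[lt++], a[i++]);
        else if (a[i] > pivot) std::swap(a[i], a[gt--]);
        else                   ++i;
    }
    quicksort3(a, lo, lt - 1);   // recurse only on the strictly-smaller part...
    quicksort3(a, gt + 1, hi);   // ...and the strictly-larger part
}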
If you have truly made a breakthrough and have the math to prove it, you should try to get it published in the Journal of the ACM. It's definitely one of the more prestigious journals for computer science.
The second best would be one of the IEEE journals such as Transactions on Software Engineering.

Rigor in capturing test cases for unit testing

Let's say we have a simple function defined in a pseudo language.
List<Numbers> SortNumbers(List<Numbers> unsorted, bool ascending);
We pass in an unsorted list of numbers and a boolean specifying ascending or descending sort order. In return, we get a sorted list of numbers.
In my experience, some people are better at capturing boundary conditions than others. The question is, "How do you know when you are 'done' capturing test cases"?
We can start listing cases now and some clever person will undoubtedly think of 'one more' case that isn't covered by any of the previous.
Don't waste too much time trying to think of every boundary condition. Your tests won't be able to catch every bug the first time around. The idea is to have tests that are pretty good, and then each time a bug does surface, write a new test specifically for that bug so that you never hear from it again.
Another note I want to make is about code coverage tools. In a language like C# or Java where you have many get/set and similar methods, you should not be shooting for 100% coverage. That means you are wasting too much time writing tests for trivial code. You only want 100% coverage on your complex business logic. If your full codebase is closer to 70-80% coverage, you are doing a good job. If your code coverage tool allows multiple coverage metrics, the best one is 'block coverage', which measures coverage of 'basic blocks'. Other types are class and method coverage (which don't give you as much information) and line coverage (which is too fine-grained).
How do you know when you are 'done' capturing test cases?
You don't. You can't get to 100% except for the most trivial cases. Also, 100% coverage (of lines, paths, conditions...) doesn't tell you you've hit all boundary conditions.
Most importantly, the test cases are not write-and-forget. Each time you find a bug, write an additional test. Check it fails with the original program, check it passes with the corrected program and add it to your test set.
An excerpt from The Art of Software Testing by Glenford J. Myers:
1. If an input condition specifies a range of values, write test cases for the ends of the range, and invalid-input test cases for situations just beyond the ends.
2. If an input condition specifies a number of values, write test cases for the minimum and maximum number of values and one beneath and beyond these values.
3. Use guideline 1 for each output condition.
4. Use guideline 2 for each output condition.
5. If the input or output of a program is an ordered set, focus attention on the first and last elements of the set.
6. In addition, use your ingenuity to search for other boundary conditions.
(I've only pasted the bare minimum for copyright reasons.)
Points 3. and 4. above are very important. People tend to forget boundary conditions for the outputs. 5. is OK. 6. really doesn't help :-)
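Applied to the SortNumbers example from the question, guidelines 1, 2 and 5 might yield a starter set along these lines (the std::vector-based signature and the trivial stand-in implementation are assumptions made so the cases actually run):

#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Stand-in for the real code under test.
std::vector<int> SortNumbers(std::vector<int> unsorted, bool ascending)
{
    std::sort(unsorted.begin(), unsorted.end());
    if (!ascending)
        std::reverse(unsorted.begin(), unsorted.end());
    return unsorted;
}

int main()
{
    // Minimum "number of values": the empty list and a single element.
    assert((SortNumbers({}, true).empty()));
    assert((SortNumbers({42}, true) == std::vector<int>{42}));

    // First and last elements of an ordered set: already-sorted and
    // reverse-sorted input, in both directions.
    assert((SortNumbers({1, 2, 3}, true) == std::vector<int>{1, 2, 3}));
    assert((SortNumbers({1, 2, 3}, false) == std::vector<int>{3, 2, 1}));

    // Ends of the value range, plus duplicates.
    assert((SortNumbers({INT_MAX, INT_MIN, 0, 0}, true)
            == std::vector<int>{INT_MIN, 0, 0, INT_MAX}));
    return 0;
}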
Short exam
This is more difficult than it looks. Myers offers this test:
The program reads three integer values from an input dialog. The three values represent the lengths of the sides of a triangle. The program displays a message that states whether the triangle is scalene, isosceles, or equilateral.
Remember that a scalene triangle is one where no two sides are equal, whereas an isosceles triangle has two equal sides, and an equilateral triangle has three sides of equal length. Moreover, the angles opposite the equal sides in an isosceles triangle also are equal (it also follows that the sides opposite equal angles in a triangle are equal), and all angles in an equilateral triangle are equal.
Write your test cases. How many do you have? Myers asks 14 questions about your test set and reports that highly qualified professional programmers average 7.8 out of a possible 14.
From a practical standpoint, I create a list of tests that I believe must pass prior to acceptance. I test these and automate where possible. Based on how much time I've estimated for the task or how much time I've been given, I extend my test coverage to include items that should pass prior to acceptance. Of course, the line between must and should is subjective. After that, I update automated tests as bugs are discovered.
#Keith
I think you nailed it. Code coverage is important to look at if you want to see how "done" you are, but 100% is a bit unrealistic as a goal. Striving for 75-90% will give you pretty good coverage without going overboard... don't test for the pure sake of hitting 100%, because at that point you are just wasting your time.
A good code coverage tool really helps.
100% coverage doesn't mean that it definitely is adequately tested, but it's a good indicator.
For .NET, NCover's quite good, but it is no longer open source.
#Mike Stone -
Yeah, perhaps that should have been "high coverage" - we aim for 80% minimum, past about 95% it's usually diminishing returns, especially if you have belt 'n' braces code.