How can we obtain the same number by taking Roman Numerals? - knuth

On page 2 of The Art of Computer Programming, Volume 1, Fascicle 1, Knuth (2005) writes, "The same number may also be obtained in a simpler way by taking Roman Numerals."
This is part of Knuth's humorous explanation of the identifying number of the MMIX Computer. The number 2009 is the average of the identifying numbers of 14 other computers. He goes on to say that we can also obtain 2009 by "taking Roman Numerals." How?
I have tried taking the Roman Numerals from the names of the 14 other computers. The sum exceeds 2009 and is far less than 28,126 (that is, 2009 × 14), so neither the sum nor the average works. Knuth could just mean to take the Roman Numerals of MMIX, and if that's it, then fine. Is there something else, though? I would love to know.
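A quick sketch (in Python, my own illustration, not anything from the fascicle) of the straightforward reading, namely that MMIX itself is the Roman numeral for 2009:

    VALUES = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}

    def roman_to_int(s):
        total = 0
        for i, c in enumerate(s):
            v = VALUES[c]
            # Subtract when a smaller numeral precedes a larger one (the IX in MMIX).
            total += -v if i + 1 < len(s) and VALUES[s[i + 1]] > v else v
        return total

    print(roman_to_int('MMIX'))  # 2009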
P.S.
Moderators, this question might not meet SO standards. In that case, please teach me where or how else to ask this question, so that I can better meet the community's standards.
References
Knuth, D. E. (2005). The art of computer programming: Volume 1, Fascicle 1: MMIX, a RISC computer for the new millennium. Upper Saddle River, New Jersey: Addison-Wesley.

Related

What is wrong with this Fortran write-statement? [duplicate]

Is it about performance, clean source code, compilers, ...? I know that many compilers allow longer single lines. But if this extension is possible without any compromise, then why does the Fortran standard strictly adhere to this rule?
I know that this is a very general question (Stack Overflow warns me that this question might be downvoted given its title), but I cannot find any resources that explain the logic behind the maximum line length of 132 characters in the modern Fortran standard.
Update Oct 22, 2019: See https://j3-fortran.org/doc/year/19/19-138r1.txt for a proposal accepted as a work item for the next 202X revision of the Fortran standard, which eliminates the maximum line length and continuation limits.
Take a look at the specification, section 3.3.1: ftp://ftp.nag.co.uk/sc22wg5/N001-N1100/N692.pdf
It's just a convention. Somebody decided that 132 would be OK. In the Fortran 66 standard it was 72.
Standards: https://gcc.gnu.org/wiki/GFortranStandards#Fortran_Standards_Documents
Usually, these limitations (like 80 or 132 characters per line) were dictated by terminals.
Just to illustrate, in a "funny" way, what it was like to code back in those days ;)
The first programming language I learned back in the 1980s was Fortran (FORTRAN77 to be exact)
Everybody was super excited because my group of students were the first ones allowed to use the brand new terminals that had just been set up in the room next to the computer. BTW: The computer was an IBM mainframe and it resided in a room the size of a small concert hall, about four times the size of the classroom with the 16 terminals.
I remember having more than once spent hours and hours debugging my code, only to find out that in one of my code lines I had again used the full line width of 80 characters that the terminal provided, instead of the 72 characters allowed by Fortran77. I used to call the language Fortran72 because of that restriction.
When I asked my tutor for the reason he pointed me to the stack of cardboard boxes in the hallway. It was rather a wall of boxes, 8m long and almost 2m high. All these boxes were full of unused punch cards that they did not need anymore after the installation of the terminals.
And yes, the punch cards only used 72 characters per code line because the remaining 8 were reserved for the sequence number of the card.
(Imagine dropping a stack of cards with no sequence numbers punched in.)
I am aware that I broke some rules and conventions here: I hope you still like that little piece of trivia and won't mind that my story does not exactly answer the original question. And yeah, it also repeats some information from previous answers.
The old IBM line printers had a 132-character width, so when IBM designed Fortran, that became the maximum line length.
The reason was the sequence numbers punched in columns 73-80 of the source code cards. When you dropped your program deck on the floor, they allowed you to bring the scrambled deck to a sorting machine (a large, 5-foot-long stand-alone machine) and sort the deck back into order.
A sequencer program read the deck and could punch a new deck with updated sequence numbers, so the programmer did not get involved in the numbering. You punched a new deck after every few dozen changes.
I did it many times between 1970 and 1990.
In the olden days the punch cards were also of finite length. I forget what was being used for terminals in the 90s, other than that they were long CRTs, and I do not recall the resolution... but it was NOT 2K pixels wide.

Donald Knuth algorithm for Mastermind - can we do better?

I implemented Donald Knuth's 1977 algorithm for Mastermind: https://www.cs.uni.edu/~wallingf/teaching/cs3530/resources/knuth-mastermind.pdf
I was able to reproduce his results: 5 guesses to win in the worst case and 4.476 guesses on average.
And then I tried something different. I ran Knuth's algorithm repeatedly and shuffled the entire list of combinations randomly each time before starting. I was able to land on a strategy with 5 guesses to win in the worst case (like Knuth) but with 4.451 guesses to win on average. Better than Knuth.
Is there any previous work trying to outperform Knuth's algorithm on average, while maintaining the worst case? I could not find any indication of it on the web so far.
Thanks!
Alon
In the paper, Knuth describes how the strategy was chosen:
Table 1 was found by choosing at every stage a test pattern that minimizes the maximum number of remaining possibilities, over all conceivable responses by the codemaker. If this minimum can be achieved by a “valid” pattern (a pattern that makes “four black hits” possible), a valid one should be used. Subject to this condition, the first such test pattern in numeric order was selected. Fortunately this procedure turns out to guarantee a win in five moves.
So it is to some extent a greedy strategy (trying to make the most progress at each step, rather than overall), and moreover there's an ad-hoc tie-breaking strategy. This means that it need not be optimal in expected value, and indeed Knuth says exactly that:
The strategy in Table 1 isn’t optimal from the “expected number of moves” standpoint, but it is probably very close. One line that can be improved [...]
So already at the time the paper was published, Knuth was aware that it's not optimal and even had an explicit example.
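As a concrete illustration of the greedy minimax step in that quote, here is a sketch in Python (6 colours, 4 pegs; the feedback function and the candidate bookkeeping are assumptions of this sketch, not code from the paper):

    from collections import Counter
    from itertools import product

    def feedback(guess, code):
        # Black pegs: right colour in the right position.
        black = sum(g == c for g, c in zip(guess, code))
        # White pegs: right colour, wrong position.
        white = sum(min(guess.count(v), code.count(v)) for v in set(guess)) - black
        return black, white

    def choose_guess(all_codes, candidates):
        # Worst case for a guess: the largest candidate group left after any response.
        def worst_case(guess):
            return max(Counter(feedback(guess, c) for c in candidates).values())
        # Minimize the worst case; break ties by preferring a "valid" guess
        # (one that could still be the answer), then by numeric order.
        return min(all_codes, key=lambda g: (worst_case(g), g not in candidates))

    all_codes = list(product(range(1, 7), repeat=4))  # the 1296 codewords

Shuffling the code list before taking that minimum, as in the question, only changes which of the tied guesses gets picked, which is exactly where the ad-hoc tie-breaking leaves room for improvement.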
When this paper was republished in his collection Selected Papers on Fun and Games (2010), he adds a 5-page addendum to the 6-page paper. In this addendum, he starts by mentioning randomization in the very first paragraph, and discusses the question of minimizing the expected number of moves. Analyzing it as the sum of all moves made over all 1296 possible codewords, he mentions a few papers:
His original algorithm gave 5801 (average of 5801/1296 ≈ 4.47608), and the minor improvement gives 5800 (≈ 4.4753).
Robert W. Irving, “Towards an optimum Mastermind strategy,” Journal of Recreational Mathematics 11 (1978), 81-87 [while staying within the “at most 5” achieves 5664 ⇒ ≈4.37]
E. Neuwirth, “Some strategies for Mastermind,” Zeitschrift für Operations Research 26 (1982), B257-B278 [achieves 5658 ⇒ ≈4.3657]
Kenji Koyama and Tony W. Lai, “An optimal Mastermind strategy,” Journal of Recreational Mathematics 25 (1993), 251-256 [achieves 5626 ⇒ ≈4.34104938]
The last of these is the best possible, as it was found with an exhaustive depth-first search. (Note that all of these papers can do slightly better in the expected number of moves, if you allow them to take 6 moves sometimes... I gave the numbers with the “at most 5” constraint because that's what the question here asks for.)
You can make this more general (harder) by assuming the codemaker is adversarial and does not choose uniformly at random among the 1296 possible codewords, but according to whatever distribution will make it hardest for the codebreaker. Finally he mentions a lot of work done by Tom Nestor, which conclusively settles many such questions.
You might have fun trying to follow up or reproduce these results (e.g. write the exhaustive search program). Enjoy!
As far as I know, there is no published work about this effect yet. I made this observation some time ago: one can get better results by not always choosing the (canonically) first trial out of the "one-step-lookahead" set. I observed different results by not starting with 1122 but with, e.g., 5544. One can also try to choose randomly instead of using the canonically first one. Yes, I agree with you, that is an interesting point - but a very, very special one.

making calculations with money

So I am making a program that calculates the weight of a book and charges you accordingly.
As you know, the trailing 0 of a decimal number is not kept in calculations, so how do I make the computer keep it when it is dealing with money?
P.S. Is it possible to make the computer display the pound sterling sign (£) in the program?
Edit: Sorry guys, what I meant was that a number like 1.50 will just be written as 1.5 by the computer, but money-wise the 0 is important, so how would I make sure that the program includes it?
I am going to assume you mean printing because otherwise the '0' makes no difference to the compiler.
Use the following; the "%.2f" format always prints two digits after the decimal point:
printf("%.2f ", 123.45678);
The computer can display '£', but I am not sure where you're displaying it. If it is within a print statement, you should be able to just write it out.
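As an aside, if you are free to pick the language and representation, decimal types keep trailing zeros exactly, which binary floats do not; a small sketch with Python's decimal module:

    from decimal import Decimal

    price = Decimal('1.50')   # stored with two decimal places
    print(price)              # 1.50, not 1.5
    print(f'£{price * 2}')    # £3.00, the scale is preserved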
Next time just be a little more careful about what you are asking.

Fortran 95: super large numbers for prime test

I'm pretty new to Fortran, as in I started learning it 2 days ago. I started learning Fortran because I was getting into prime numbers, and I wrote a program in Python that was so fast it could determine that 123098237 is a prime in 0.1 seconds.
Impressive, I know.
What's not impressive is when I try to find out whether (2^127)-1, or 170141183460469231731687303715884105727, is a prime number (it is, by the way). The program ran so long that I just ended up having to stop it.
So, I started looking for some faster languages to write it in, so I wrote the program in C.
It was faster, but the problem of super large prime numbers came into play.
I was going to see if there was a solution, but then I heard through the grapevine that, if you're programming with numbers, Fortran is the fastest and best way to go. I vaguely remember my stepdad's old Fortran 77 textbooks from college, but they were basically useless to me because they talked about working with punch cards. So I went online, got gfortran for Ubuntu 12.04 x86, got a couple of PDFs, and started learning. Before I knew it, I had made a program that received input and tested for primality, and it worked!
But, the same old problem came up, the number was too big.
And so, how do I handle big numbers like this with Fortran?
Fortran, like many other compiled languages, doesn't provide such large integers or operations on them out of the box. An up-to-date compiler ought to provide an integer type with 18 decimal digits, but no more than that.
If you want to program, in Fortran, data types and operations for such big integers use your favourite search engine on terms such as Fortran multiple precision. You could even search around here on SO for relevant questions and answers.
If you want to investigate the mathematics of such large integers stick with Python; you'll struggle to write software yourself which matches its speed of operations on multiple precision arithmetic. One of the reasons that Python takes a long time to determine the primality of a large number is that it takes a program, any program written in any language, a long time to determine the primality of a large number. If you dig around you're likely to find that the relevant Python routines actually call code written in C or something similarly low-level. Investigate, if you wish, the topic of the computational complexity of primality testing.
I'm not saying you won't be able to write code to outperform the Python intrinsics, just that you will find it a challenge.
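To see why Python is hard to beat here: its integers are arbitrary precision, and the three-argument pow does fast modular exponentiation, the workhorse of practical primality testing. A quick sketch:

    n = 2**127 - 1  # the Mersenne number from the question

    # One Fermat check, base 3: a necessary condition for primality,
    # not a proof, but it finishes in well under a second.
    print(pow(3, n - 1, n) == 1)  # True, consistent with n being prime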
Most languages provide certain standard intrinsic types which are fully adequate for solving standard scientific and engineering problems. You don't need 80 digit numbers to calculate the thickness of a bridge girder or plan a spacecraft orbit. It would be difficult to measure to that accuracy. In Fortran, if you want to do extra precision calculations (e.g., for number theory) you need to look to libraries that augment the language, e.g., mpfun90 at http://crd-legacy.lbl.gov/~dhbailey/mpdist/ or fmlib at http://myweb.lmu.edu/dmsmith/fmlib.html
I'll guess that your algorithm is trial division. If that's true, you need a better algorithm; the implementation language won't matter.
Pseudocode for the Miller-Rabin primality test is shown below. It's probabilistic, but you can reduce the chance of error by increasing the k parameter; beyond about k=25 the additional assurance is negligible:
function isPrime(n, k=5)
    if n < 2 then return False
    for p in [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
        if n % p == 0 then return n == p
    s, d = 0, n-1
    while d % 2 == 0
        s, d = s+1, d/2
    for i from 0 to k
        x = powerMod(randint(2, n-1), d, n)
        if x == 1 or x == n-1 then next i
        for r from 1 to s
            x = (x * x) % n
            if x == 1 then return False
            if x == n-1 then next i
        return False
    return True
I'll leave it to you to translate that to Fortran or some other language; if you're programming in C, there is a library called GMP that is frequently used for handling very large numbers, and the function shown above is built in to that library. It's very fast; even numbers that are hundreds of digits long should be classified as prime or composite almost instantly.
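If Python counts as "some other language", a direct rendering of the routine above might look like this (Python's built-in three-argument pow plays the role of powerMod):

    import random

    def is_prime(n, k=25):
        # Miller-Rabin, following the pseudocode above.
        if n < 2:
            return False
        for p in [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]:
            if n % p == 0:
                return n == p
        # Write n-1 as 2**s * d with d odd.
        s, d = 0, n - 1
        while d % 2 == 0:
            s, d = s + 1, d // 2
        for _ in range(k):
            x = pow(random.randint(2, n - 2), d, n)
            if x == 1 or x == n - 1:
                continue
            for _ in range(s - 1):
                x = (x * x) % n
                if x == n - 1:
                    break
            else:
                return False  # no witness of primality found for this base
        return True

    print(is_prime(2**127 - 1))  # True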
If you want to be certain of the primality of a number, there are other algorithms that can actually provide a proof of primality. But they are much more complicated, and much slower.
You might be interested in the essay Programming with Prime Numbers at my blog.

What is the maximum theoretically possible compression rate?

This is a theoretical question, so expect that many details here are not computable in practice or even in theory.
Let's say I have a string s that I want to compress. The result should be a self-extracting binary (it can be x86 assembly, but it can also be some other hypothetical Turing-complete low-level language) which outputs s.
Now, we can easily iterate through all possible such binaries and programs, ordered by size. Let B_s be the sub-list of these binaries that output s (of course B_s is uncomputable).
As every non-empty set of positive integers has a minimum, there must be a smallest program b_min_s in B_s.
For what languages (i.e. sets of strings) do we know something about the size of b_min_s? Maybe only an estimate. (I can construct some trivial examples where I can even calculate B_s and also b_min_s, but I am interested in more interesting languages.)
This is Kolmogorov complexity, and you are correct that it's not computable. If it were, you could create a paradoxical program of length n that printed a string with Kolmogorov complexity m > n.
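To spell the paradox out with a deliberately impossible sketch (K below is a hypothetical computable Kolmogorov-complexity function; no such function exists):

    from itertools import count, product

    def first_hard_string(bound):
        # Enumerate all binary strings in order of length and return the
        # first one whose complexity exceeds the bound.
        for n in count(1):
            for bits in product('01', repeat=n):
                s = ''.join(bits)
                if K(s) > bound:  # K is the hypothetical complexity oracle
                    return s

If this program plus its input bound could be encoded in fewer than bound bits, it would output a string of complexity greater than bound while itself being a shorter description of that string, a contradiction.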
Clearly, you can bound b_min_s for given inputs. However, as far as I know most of the efforts to do so have been existence proofs. For instance, there is an ongoing competition to compress English Wikipedia.
Claude Shannon estimated the information density of the English language to be somewhere between 0.6 and 1.3 bits per character in his 1951 paper Prediction and Entropy of Printed English (Bell System Technical Journal, vol. 30, pp. 50-64).
The maximal (average) compression rate possible is 1:1.
The number of possible inputs must equal the number of possible outputs, because you have to be able to map each output back to its input.
To be able to store the output you need a container of the same size as the minimal container for the input - giving a 1:1 compression rate.
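A quick counting check of that pigeonhole argument:

    n = 8
    inputs = 2 ** n                          # 2**8 = 256 strings of length 8
    shorter = sum(2 ** k for k in range(n))  # 255 strings of length < 8
    # 256 inputs but only 255 strictly shorter outputs: at least one input
    # of every length cannot be mapped to a shorter output.
    print(inputs, shorter)  # 256 255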
Basically, you need enough information to rebuild your original information. I guess the other answers are more helpful for your theoretical discussion, but just keep this in mind.