generating a list in arrays and use of functions - c++

I am wondering how to make an array with values that start at 1111 and go all the way up to 8888. I am asking this because I need to generate a list of 4 digit numbers with each digit ranging from 1-8. I would like to have this in a loop form. Also, I am lost on my functions trim, methodicalEliminate, guessAndEliminate, and guessThreeThenEliminate in my following program. Here are the directions:
This assignment focuses on the use of arrays in a program, including using one as a parameter to a function.
PROBLEM DESCRIPTION
In the game of Mastermind, a player is only given a finite number of guesses to try to identify the hidden combination (such as twelve guesses). Often that is the only constraint in playing the game.
But some players might make a competition with each other to see who can guess the other's combination in fewer tries. In this case, the problem is not only to come up with a strategy that can find the answer within a specified limit, but to find one that is likely to require the minimal number of guesses.
Here is where the computer comes in -- one can write a program that would try out different guessing strategies, and see how they work out. Since a computer can do the analysis and computations more rapidly than a person, it could just pretend to play Mastermind on our behalf using any strategy we choose, and tell us how long it took to do that.
OVERALL SOLUTION
Of course, it would be extremely difficult to teach the computer to reason along the same lines as a person. For example, if we guessed a combination 1111 and got one black peg, we would make a mental note that the answer has exactly one 1 in it, and then proceed to make other guesses with that one fact in mind. If we next guesses 1222 and got one white peg, we would know there were no 2's, and that the single 1 is not in the first position. But how to keep track of such information after a series of guesses would be rather hard.
Fortunately, for a computer simulation with an array, we can record all of our known facts in a different way. We just maintain a list of all possible answers that there could be, and then remove numbers from the list that could no longer be the solution. If our first guess tells us there is exactly one 1 digit, we would remove all the numbers that do not have that feature. When we find out there are no 2's, we eliminate all the values that contain 2's. Eventually, the only number left would be the correct answer.
SOME SIMPLE STRATEGIES
This is a strategy that many players use, resembling what was described above. Just methodically got through the possibilities in a straightforward fashion. The first guess of 1111 would answer how many 1's are in the solution; the next guess would answer how many 2's are in the solution, and also say something about where any 1's might be, and so on.
With our list approach, which contain a whole lot of possibilities in order beginning with 1111, 1112, 1113, 1114, etc., our next guess would always be the first in the list.
The next strategy is for those who like a little more excitement. The guesses appear to be more or less random, with the hopes that a lot more information can be discovered. Simulating this approach is surprisingly simple -- if you have a list of numbers, just pick one at random. If you have 837 possibilities to choose from in an array, just pick a random subscript in the range of 0 to 836.
This third strategy considers the possibility that answers that give similar results to a given guess are in a sense similar to each other. So to try to get a little more information, it will still pick some numbers at random without regard to how they were evaluated, and then only start thinking about the results.
To implement this one, let us just pick any three possible answers and guess them, temporarily ignoring how many black pegs and white pegs they earn us. Only after making those guesses will we trim the list of possibilities, then proceeding as the second strategy above.
SAMPLE INTERFACE
These are the sample results from the current implementation: Please enter a combination to try for, or 0 for a random value: 0
Guessing at 2475
Guessing 1111...
Guessing 2222...
Guessing 2333...
Guessing 2444...
Guessing 2455...
Guessing 2456...
Guessing 2475...
Methodical Eliminate required 7 tries.
Guessing 6452...
Guessing 2416...
Guessing 2485...
Guessing 2445...
Guessing 2435...
Guessing 2425...
Guessing 2475...
Guess and Eliminate required 7 tries.
Guessing 7872...
Guessing 6472...
Guessing 1784...
Guessing 2475...
Guess Three then Eliminate required 4 tries.
Play another game? (y/n) y
Please enter a combination to try for, or 0 for a random value: 0
Guessing at 4474
Guessing 1111...
Guessing 2222...
Guessing 3333...
Guessing 4444...
Guessing 4445...
Guessing 4464...
Guessing 4474...
Methodical Eliminate required 7 tries.
Guessing 3585...
Guessing 7162...
Guessing 4474...
Guess and Eliminate required 3 tries.
Guessing 8587...
Guessing 1342...
Guessing 1555...
Guessing 7464...
Guessing 6764...
Guessing 4468...
Guessing 4474...
Guess Three then Eliminate required 7 tries.
NOTE: This program allows each digit to go up to 8 instead of 6. Even though there are 4096 possible answers, it still finds them rather rapidly.
PROGRAM SPECIFICATIONS
The assigned program must implement all of the following functions. Additional ones are permitted as desired -- these below are required. Future assignments will not detail the functions as below -- but will instead require the students to design their own function descriptions in advance to writing the program. main:
Simply governs the overall behavior of the program. A number will be
chosen as the target combination, and then each strategy will attempt
to find it.
Calls: generateAnswer, (to compare all three must have the same answer)
methodicalEliminate, guessAndEliminate, guessThreeThenEliminate
generateAnswer:
Either lets the user at the keyboard choose the mystery combination,
or gives the option to have the computer generate a random combination.
(For a competitive game, it might be interesting to know what sorts
of combinations would be the hardest to guess!)
Parameters: none!
Returns: a 4-digit combination, each digit in the range 1 to 8
generateSearchSpace:
Populates an array with all possible combinations of four-digit
values in the range 1 to 8.
Parameters:
guesses (modified int array) list of guesses
length (output int) number of values in list
Pre-condition:
The array must be allocated to no fewer than 4096 elements.
trim:
Analyzes the response to a particular guess and then eliminates
any values from the list of possibilities that are no longer
possible answers. In each case, it assumes that a value in the
list is an answer, and evaluates the guess accordingly. If the
number of black and white pegs is not the same as those specified,
then it cannot be the correct answer.
Parameters:
guesses (modified int array) list of guesses
length (modified int) number of values in list
guess (input int) a guess that has been evaluated
black (input int) how many black pegs that guess earned
white (input int) how many white pegs that guess earned
Pre-condition:
black and white actually do contain the results of comparing
the guess with the actual answer
Post-condition:
length has been reduced (we learned something)
the viable answers occupy the first 'length' positions
in the guesses array (so the list is shorter)
Calls: evaluate
methodicalEliminate:
beginning with a list of all possible candidate answers
continually guesses the first element in the list, and
trim answers accordingly, until an answer is found
Parameter:
answer (input int) the actual answer
(necessary to get black/white pegs)
Returns: number of guesses required to find the answer
Calls: generateSearchSpace, evaluate, trim
gusssAndEliminate:
beginning with a list of all possible candidate answers
continually guesses a random element in the list, and
trim answers accordingly, until an answer is found
Parameter:
answer (input int) the actual answer
Returns: number of guesses required to find the answer
Calls: generateSearchSpace, evaluate, trim
gusssThreeThenEliminate:
beginning with a list of all possible candidate answers
first guesses three answers at random before trimming
the list of possibilites, and then narrows on the answer
one random guess at a time
Parameter:
answer (input int) the actual answer
Returns: number of guesses required to find the answer
Calls: generateSearchSpace, evaluate, trim
NOTE: These last functions use the correct answer to evaluate
each guess and then use the black/white pegs for the guessing
strategy. NOne of these strategies may peek at the answer to
decide what to do next!
ALSO: The following functions should also appear in this program
from the previous assignment, though they are not themselves
part of the grade for this one.
evaluate:
evaluates a combination by comparing it with the answer
Correctness is indicated by black pegs (correct digit in correct position)
and white pegs (correct digit in incorrect position)
Parameters:
answer (input int) the correct combination
guess (input int) the current guess
black (output int) number of black pegs
white (output int) number of white pegs
pre-conditions:
answer and guess are both 4-digit numbers with no zero digits
post-conditions:
black and white are both > 0 and their sum is <= 4
Calls: nthDigit, clearNthDigit
nthDigit:
identified the n'th digit of a combination
whether digits count from left to right or right to left is unspecified
Parameters:
combination (input int) combination to examine
position (input int) which digit to examine
(returned) (output int) the value of the actual digit
pre-conditions:
combination has the appropriate number of digits, and
0 < position <= number of digits
post-condition:
0 <= returned digit <= 9 (single digit)
clearNthDigit:
ears the n'th digit of a combination to zero, so it will no longer match
digits must be counted in the same manner as nthDigit above.
parameters:
combination (in/out int) combination to modify
position (input int) which digit to set to 0
pre-condition:
same as those for nthDigit above
post-condition:
corresponding digit is set to zero
Calls: nthDigit (optional, depending on the implementation)
Thank you for reading such this long question, and I hope you can help me on arrays!

Just because your guesses have the form of 1111 through to 8888, it does not mean they are numbers.
They are numbers if it makes sense to do arithmetic calculations on them. It does not make sense to define arithmetic calculations on the guesses: what would it mean to add a guess 4571 to a guess 6214?
If your guesses are not numbers, don't use a representation that is reserved for numbers. You can, however, use an array of four integers:
int guesses[4][4096];
int g = 0;
for (int i = 1; i <= 8; ++i)
for (int j = 1; j <= 8; ++j)
for (int k = 1; k <= 8; ++k)
for (int m = 1; m <= 8; ++m)
guesses[g++] = {i, j, k, m};
I am pretty convinced that putting all possible guesses into the array guesses like that is not a good idea either; the code mainly demonstrates how the guesses could be generated.
Go through the other functions, think of what high level operations should be performed on the remaining gueses (like "eliminate all guesses that has a certain number in the thrid position" etc). This should give you an idea about what would be a better data structure to replace the guesses array.

Related

Recursion to output possible outcomes of N number of coin flips

I'm trying to use recursion to output the possible outcomes of N number of coin flips. For instance, if I flip a coin 3 times the possible outputs could be TTT, TTH, THT, THH, HTT, HTH, HHT, and HHH. I'm not looking for an answer but a push in the right direction. Would this be best done with a character array? Or assigning H and T integer values?
Alternatively, since it can only ever be heads or tails, you could use a boolean value. This would be more efficient for memory and will also help avoid the need for error checking. But there is no single way of doing it, experiment and see what works best.
I would say integers. Look up permutations and simple combinatorics if you​haven't already. Remember, recursion operates on the principal of breaking a big problem into smaller ones.

Permutations of English Alphabet of a given length

So I have this code. Not sure if it works because the runtime for the program is still continuing.
void permute(std::vector<std::string>& wordsVector, std::string prefix, int length, std::string alphabet) {
if (length == 0) {
//end the recursion
wordsVector.push_back(prefix);
}
else {
for (int i = 0; i < alphabet.length(); ++i) {
permute(wordsVector, prefix + alphabet.at(i), length - 1, alphabet);
}
}}
where I'm trying to get all combinations of characters in the English alphabet of a given length. I'm not sure if the approach is correct at the moment.
Alphabet consists of A-Z in a string of length 26. WordsVectors holds all the different combinations of words. prefix is meant to pass through recursively until a word is made and length is self explanatory.
Example, if I give the length of 7 to the function, I expect a size of 26 x 25 x 24 x 23 x 22 x 21 x 20 = 3315312000 if I'm correct, following the formula for permutations.
I don't think programs are meant to run this long so either I'm hitting an infinite loop or something is wrong with my approach. Please advise. Thanks.
Surely the stack would overflow but concentrating on your question even if you write an iterative program it will take a long time ( not an infinite loop just very long )
[26L, 650L, 15600L, 358800L, 7893600L, 165765600L, 3315312000L, 62990928000L, 1133836704000L, 19275223968000L, 308403583488000L, 4626053752320000L, 64764752532480000L, 841941782922240000L, 10103301395066880000L, 111136315345735680000L, 1111363153457356800000L, 10002268381116211200000L, 80018147048929689600000L, 560127029342507827200000L, 3360762176055046963200000L, 16803810880275234816000000L, 67215243521100939264000000L, 201645730563302817792000000L, 403291461126605635584000000L, 403291461126605635584000000L]
The above list is the number of possibilities for 1<=n<=26. You can see as n increases number of possibilities increases tremendously. Say you have 1GHz processor that does 10^9 operations per second. Say consider number of possibilities for n=26 its 403291461126605635584000000L. Its evident that if you sit down to list all possibilities its so so long ( so so many years ) that
you will feel it has hit an infinite loop. Finally I have not looked that closely into your code , but in nutshell even if you write it correctly,iteratively and don't store (again can't have this much memory) and just print all possibilities its going to take long time for larger values of n.
EDIT
As jaromax and others said if you just want to write it for smaller values of n,
say less than 10-12 you can write an iterative program to list/print them. It will run quite fast for small values. But if you also want to store them them then n will have to be say less than 5 say. (Really depends on how much RAM is available or you could find some permutations write them to disk, then depends on how much disk memory you can spare, again refer the number of possibilities list I posted above. It gives a rough idea of both time and space complexity).
I think there could be quite a problem that you do this on stack. A large part of the calculation you do recursively and this means every time allocated space for function.
Try to reformulate it linearly. I think I had such a problem before.
Your question implies you think there are 26x25x24x ... permutations
Your code doesn't have anything I can see to avoid "AAAAAAA" being a permutation, in which case there are 26x26x26x ...
So in addition to being a very complicated way of counting in base 26, I think it's also giving bad answers?

Is long long in C++ known to be very nasty in terms of precision?

The Given Problem:
Given a theater with n rows, m seats, and a list of seats that are reserved. Given these values, determine how many ways two friends can sit together in the same row.
So, if the theater was a size of 2x3 and the very first seat in the first row was reserved, there would be 3 different seatings that these two guys can take.
The Problem That I'm Dealing With
The function itself is supposed to return the number of seatings that there are based on these constraints. The return value is a long long.
I've gone through my code many many times...and I'm pretty sure that it's right. All I'm doing is incrementing this one value. However, ALL of the values that my function return differ from the actual solution by 1 or 2.
Any ideas? And if you think that it's just something wrong with my code, please tell me. I don't mind being called an idiot just as long as I learn something.
Unless you're overflowing or underflowing, it definitely sounds like something is wrong with your code. For integral types, there are no precision ambiguities in c or c++
First, C++ doesn't have a long long type. Second, in C99, long long can represent any integral value from LLONG_MIN (<= -2^63) to LLONG_MAX (>= 2^63 - 1) exactly. The problem lies elsewhere.
Given the description of the problem, I think it is unambiguous.
Normally, the issue is that you don't know if the order in which the combinations are taken is important or not, but the example clearly disambiguate: if the order was important we would have 6 solutions, not 3.
What is the value that your code gives for this toy example ?
Anyway I can add a few examples with my own values if you wish, so that you can compare against them, I can't do much more for you unless you post your code. Obviously, the rows are independent so I'm only going to show the result row by row.
X occupied seat
. free seat
1: X..X
1: .X..
2: X...X
3: X...X..
5: ..X.....
From a computation point of view, I should note it's (at least) an O(N) process where N is the number of seats: you have to inspect nearly each seat once, except the first (and last) ones in case the second (and next to last) are occupied; and that's effectively possible to solve this linearly.
From a technic point of view:
make sure you initialize your variable to 0
make sure you don't count too many seats on toy example
I'd be happy to help more but I would not like to give you the full solution before you have a chance to think it over and review your algorithm calmly.

Find the numbers missing

If we have an array of all the numbers up to N (N < 10), what is the best way to find all the numbers that are missing.
Example:
N = 5
1 5 3 2 3
Output: 1 5 4 2 3
In the ex, the number 4 was the missing one and there were 2 3s, so we replaced the first one with 4 and now the array is complete - all the numbers up to 5 are there.
Is there any simple algorithm that can do this ?
Since N is really small, you can use F[i] = k if number i appears k times.
int F[10]; // make sure to initialize it to 0
for ( int i = 0; i < N; ++i )
++F[ numbers[i] ];
Now, to replace the duplicates, traverse your number array and if the current number appears more than once, decrement its count and replace it with a number that appears 0 times and increment that number's count. You can keep this O(N) if you keep a list of numbers that don't appear at all. I'll let you figure out what exactly needs to be done, as this sounds like homework.
Assume all numbers within the range 1 ≤ x ≤ N.
Keep 2 arrays of size N. output, used (as an associative array). Initialize them all to 0.
Scan from the right, fill in values to output unless it is used.
Check for unused values, and put them into the empty (zero) slots of output in order.
O(N) time complexity, O(N) space complexity.
You can use a set data structure - one for all the numbers up to N, one for the numbers you actually saw, and use a set difference.
One way to do this would be to look at each element of the array in sequence, and see whether that element has been seen before in elements that you've already checked. If so, then change that number to one you haven't seen before, and proceed.
Allow me to introduce you to my friend Schlemiel the Painter. Discovery of a more efficient method is left as a challenge for the reader.
This kind of looks like homework, please let us know if it isn't. I'll give you a small hint, and then I'll improve my answer if you confirm this isn't homework.
My tip for now is this: If you were to do this by hand, how would you do it? Would you write out an extra list of numbers of some time, would you read through the list (how many times?)? etc.
For simple problems, sometimes modelling your algorithm after an intuitive by-hand approach can work well.
Here's a link I read just today that may be helpful.
http://research.swtch.com/2008/03/using-uninitialized-memory-for-fun-and.html

Testing for Random Value - Thoughts on this Approach?

OK, I have been working on a random image selector and queue system (so you don't see the same images too often).
All was going swimmingly (as far as my crappy code does) until I got to the random bit. I wanted to test it, but how do you test for it? There is no Debug.Assert(i.IsRandom) (sadly) :D
So, I got my brain on it after watering it with some tea and came up with the following, I was just wondering if I could have your thoughts?
Basically I knew the random bit was the problem, so I ripped that out to a delegate (which would then be passed to the objects constructor).
I then created a class that pretty much performs the same logic as the live code, but remembers the value selected in a private variable.
I then threw that delegate to the live class and tested against that:
i.e.
Debug.Assert(myObj.RndVal == RndIntTester.ValuePassed);
But I couldn't help but think, was I wasting my time? I ran that through lots of iterations to see if it fell over at any time etc.
Do you think I was wasting my time with this? Or could I have got away with:
GateKiller's answer reminded me of this:
Update to Clarify
I should add that I basically never want to see the same result more than X number of times from a pool of Y size.
The addition of the test container basically allowed me to see if any of the previously selected images were "randomly" selected.
I guess technically the thing here being tested in not the RNG (since I never wrote that code) but the fact that am I expecting random results from a limited pool, and I want to track them.
Test from the requirement : "so you don't see the same images too often"
Ask for 100 images. Did you see an image too often?
There is a handy list of statistical randomness tests and related research on Wikipedia. Note that you won't know for certain that a source is truly random with most of these, you'll just have ruled out some ways in which it may be easily predictable.
If you have a fixed set of items, and you don't want them to repeat too often, shuffle the collection randomly. Then you will be sure that you never see the same image twice in a row, feel like you're listening to Top 20 radio, etc. You'll make a full pass through the collection before repeating.
Item[] foo = …
for (int idx = foo.size(); idx > 1; --idx) {
/* Pick random number from half-open interval [0, idx) */
int rnd = random(idx);
Item tmp = foo[idx - 1];
foo[idx - 1] = foo[rnd];
foo[rnd] = tmp;
}
If you have too many items to collect and shuffle all at once (10s of thousands of images in a repository), you can add some divide-and-conquer to the same approach. Shuffle groups of images, then shuffle each group.
A slightly different approach that sounds like it might apply to your revised problem statement is to have your "image selector" implementation keep its recent selection history in a queue of at most Y length. Before returning an image, it tests to see if its in the queue X times already, and if so, it randomly selects another, until it find one that passes.
If you are really asking about testing the quality of the random number generator, I'll have to open the statistics book.
It's impossible to test if a value is truly random or not. The best you can do is perform the test some large number of times and test that you got an appropriate distribution, but if the results are truly random, even this has a (very small) chance of failing.
If you're doing white box testing, and you know your random seed, then you can actually compute the expected result, but you may need a separate test to test the randomness of your RNG.
The generation of random numbers is
too important to be left to chance. -- Robert R. Coveyou
To solve the psychological problem:
A decent way to prevent apparent repetitions is to select a few items at random from the full set, discarding duplicates. Play those, then select another few. How many is "a few" depends on how fast you're playing them and how big the full set is, but for example avoiding a repeat inside the larger of "20", and "5 minutes" might be OK. Do user testing - as the programmer you'll be so sick of slideshows you're not a good test subject.
To test randomising code, I would say:
Step 1: specify how the code MUST map the raw random numbers to choices in your domain, and make sure that your code correctly uses the output of the random number generator. Test this by Mocking the generator (or seeding it with a known test value if it's a PRNG).
Step 2: make sure the generator is sufficiently random for your purposes. If you used a library function, you do this by reading the documentation. If you wrote your own, why?
Step 3 (advanced statisticians only): run some statistical tests for randomness on the output of the generator. Make sure you know what the probability is of a false failure on the test.
There are whole books one can write about randomness and evaluating if something appears to be random, but I'll save you the pages of mathematics. In short, you can use a chi-square test as a way of determining how well an apparently "random" distribution fits what you expect.
If you're using Perl, you can use the Statistics::ChiSquare module to do the hard work for you.
However if you want to make sure that your images are evenly distributed, then you probably won't want them to be truly random. Instead, I'd suggest you take your entire list of images, shuffle that list, and then remove an item from it whenever you need a "random" image. When the list is empty, you re-build it, re-shuffle, and repeat.
This technique means that given a set of images, each individual image can't appear more than once every iteration through your list. Your images can't help but be evenly distributed.
All the best,
Paul
What the Random and similar functions give you is but pseudo-random numbers, a series of numbers produced through a function. Usually, you give that function it's first input parameter (a.k.a. the "seed") which is used to produce the first "random" number. After that, each last value is used as the input parameter for the next iteration of the cycle. You can check the Wikipedia article on "Pseudorandom number generator", the explanation there is very good.
All of these algorithms have something in common: the series repeats itself after a number of iterations. Remember, these aren't truly random numbers, only series of numbers that seem random. To select one generator over another, you need to ask yourself: What do you want it for?
How do you test randomness? Indeed you can. There are plenty of tests for that. The first and most simple is, of course, run your pseudo-random number generator an enormous number of times, and compile the number of times each result appears. In the end, each result should've appeared a number of times very close to (number of iterations)/(number of possible results). The greater the standard deviation of this, the worse your generator is.
The second is: how much random numbers are you using at the time? 2, 3? Take them in pairs (or tripplets) and repeat the previous experiment: after a very long number of iterations, each expected result should have appeared at least once, and again the number of times each result has appeared shouldn't be too far away from the expected. There are some generators which work just fine for taking one or 2 at a time, but fail spectacularly when you're taking 3 or more (RANDU anyone?).
There are other, more complex tests: some involve plotting the results in a logarithmic scale, or onto a plane with a circle in the middle and then counting how much of the plots fell within, others... I believe those 2 above should suffice most of the times (unless you're a finicky mathematician).
Random is Random. Even if the same picture shows up 4 times in a row, it could still be considered random.
My opinion is that anything random cannot be properly tested.
Sure you can attempt to test it, but there are so many combinations to try that you are better off just relying on the RNG and spot checking a large handful of cases.
Well, the problem is that random numbers by definition can get repeated (because they are... wait for it: random). Maybe what you want to do is save the latest random number and compare the calculated one to that, and if equal just calculate another... but now your numbers are less random (I know there's not such a thing as "more or less" randomness, but let me use the term just this time), because they are guaranteed not to repeat.
Anyway, you should never give random numbers so much thought. :)
As others have pointed out, it is impossible to really test for randomness. You can (and should) have the randomness contained to one particular method, and then write unit tests for every other method. That way, you can test all of the other functionality, assuming that you can get a random number out of that one last part.
store the random values and before you use the next generated random number, check against the stored value.
Any good pseudo-random number generator will let you seed the generator. If you seed the generator with same number, then the stream of random numbers generated will be the same. So why not seed your random number generator and then create your unit tests based on that particular stream of numbers?
To get a series of non-repeating random numbers:
Create a list of random numbers.
Add a sequence number to each random number
Sort the sequenced list by the original random number
Use your sequence number as a new random number.
Don't test the randomness, test to see if the results your getting are desirable (or, rather, try to get undesirable results a few times before accepting that your results are probably going to be desirable).
It will be impossible to ensure that you'll never get an undesirable result if you're testing a random output, but you can at least increase the chances that you'll notice it happening.
I would either take N pools of Y size, checking for any results that appear more than X number of times, or take one pool of N*Y size, checking every group of Y size for any result that appears more than X times (1 to Y, 2 to Y + 1, 3 to Y + 2, etc). What N is would depend on how reliable you want the test to be.
Random numbers are generated from a distribution. In this case, every value should have the same propability of appearing. If you calculate an infinite amount of randoms, you get the exact distribution.
In practice, call the function many times and check the results. If you expect to have N images, calculate 100*N randoms, then count how many of each expected number were found. Most should appear 70-130 times. Re-run the test with different random-seed to see if the results are different.
If you find the generator you use now is not good enough, you can easily find something. Google for "Mersenne Twister" - that is much more random than you ever need.
To avoid images re-appearing, you need something less random. A simple approach would be to check for the unallowed values, if its one of those, re-calculate.
Although you cannot test for randomness, you can test that for correlation, or distribution, of a sequence of numbers.
Hard to test goal: Each time we need an image, select 1 of 4 images at random.
Easy to test goal: For every 100 images we select, each of the 4 images must appear at least 20 times.
I agree with Adam Rosenfield. For the situation you're talking about, the only thing you can usefully test for is distribution across the range.
The situation I usually encounter is that I'm generating pseudorandom numbers with my favourite language's PRNG, and then manipulating them into the desired range. To check whether my manipulations have affected the distribution, I generate a bunch of numbers, manipulate them, and then check the distribution of the results.
To get a good test, you should generate at least a couple orders of magnitude more numbers than your range holds. The more values you use, the better the test. Obviously if you have a really large range, this won't work since you'll have to generate far too many numbers. But in your situation it should work fine.
Here's an example in Perl that illustrates what I mean:
for (my $i=0; $i<=100000; $i++) {
my $r = rand; # Get the random number
$r = int($r * 1000); # Move it into the desired range
$dist{$r} ++; # Count the occurrences of each number
}
print "Min occurrences: ", (sort { $a <=> $b } values %dist)[1], "\n";
print "Max occurrences: ", (sort { $b <=> $a } values %dist)[1], "\n";
If the spread between the min and max occurrences is small, then your distribution is good. If it's wide, then your distribution may be bad. You can also use this approach to check whether your range was covered and whether any values were missed.
Again, the more numbers you generate, the more valid the results. I tend to start small and work up to whatever my machine will handle in a reasonable amount of time, e.g. five minutes.
Supposing you are testing a range for randomness within integers, one way to verify this is to create a gajillion (well, maybe 10,000 or so) 'random' numbers and plot their occurrence on a histogram.
****** ****** ****
***********************************************
*************************************************
*************************************************
*************************************************
*************************************************
*************************************************
*************************************************
*************************************************
*************************************************
1 2 3 4 5
12345678901234567890123456789012345678901234567890
The above shows a 'relatively' normal distribution.
if it looked more skewed, such as this:
****** ****** ****
************ ************ ************
************ ************ ***************
************ ************ ****************
************ ************ *****************
************ ************ *****************
*************************** ******************
**************************** ******************
******************************* ******************
**************************************************
1 2 3 4 5
12345678901234567890123456789012345678901234567890
Then you can see there is less randomness. As others have mentioned, there is the issue of repetition to contend with as well.
If you were to write a binary file of say 10,000 random numbers from your generator using, say a random number from 1 to 1024 and try to compress that file using some compression (zip, gzip, etc.) then you could compare the two file sizes. If there is 'lots' of compression, then it's not particularly random. If there isn't much of a change in size, then it's 'pretty random'.
Why this works
The compression algorithms look for patterns (repetition and otherwise) and reduces that in some way. One way to look a these compression algorithms is a measure of the amount of information in a file. A highly compressed file has little information (e.g. randomness) and a little-compressed file has much information (randomness)