If we have an array of all the numbers up to N (N < 10), what is the best way to find all the numbers that are missing?
Example:
N = 5
1 5 3 2 3
Output: 1 5 4 2 3
In the example, the number 4 was the missing one and there were two 3s, so we replaced the first 3 with 4 and now the array is complete: all the numbers up to 5 are there.
Is there any simple algorithm that can do this?
Since N is really small, you can use F[i] = k if number i appears k times.
int F[10] = {0}; // zero-initialized; F[i] counts how often i appears
for ( int i = 0; i < N; ++i )
    ++F[ numbers[i] ];
Now, to replace the duplicates, traverse your number array and if the current number appears more than once, decrement its count and replace it with a number that appears 0 times and increment that number's count. You can keep this O(N) if you keep a list of numbers that don't appear at all. I'll let you figure out what exactly needs to be done, as this sounds like homework.
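For reference, the steps described above could be filled in roughly as follows. This is only a sketch, not the answerer's own code: the function name fillMissing and the use of std::vector are assumptions made for illustration, and it assumes every value lies in 1..N, where N is the array length.

```cpp
#include <cstddef>
#include <vector>

// Replaces surplus duplicates in `numbers` with the values from 1..N that
// never appear, where N = numbers.size(). Assumes 1 <= numbers[i] <= N.
// (fillMissing is a hypothetical name, not taken from the answer.)
void fillMissing(std::vector<int>& numbers) {
    const int n = static_cast<int>(numbers.size());
    std::vector<int> count(n + 1, 0);      // count[v] = occurrences of v
    for (int v : numbers) ++count[v];

    std::vector<int> missing;              // values with count 0, ascending
    for (int v = 1; v <= n; ++v)
        if (count[v] == 0) missing.push_back(v);

    std::size_t next = 0;                  // next unused entry of `missing`
    for (int& v : numbers) {
        if (count[v] > 1) {                // surplus copy: overwrite it
            --count[v];
            v = missing[next++];
            ++count[v];
        }
    }
}
```

Keeping the missing values in their own list is what keeps the whole thing O(N): each replacement takes the next entry instead of searching the count array again.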
Assume all numbers within the range 1 ≤ x ≤ N.
Keep two arrays of size N: output and used (the latter acting as an associative array). Initialize both to 0.
Scan the input from the right, copying each value into the same position of output unless it is already marked as used.
Then collect the unused values and put them, in order, into the empty (zero) slots of output.
O(N) time complexity, O(N) space complexity.
You can use a set data structure - one for all the numbers up to N, one for the numbers you actually saw, and use a set difference.
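A minimal sketch of the set-difference idea, assuming the values are plain ints in 1..n. The helper name missingValues is made up for this example; std::set and std::set_difference are standard library facilities.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

// Returns the values in 1..n that do not occur in `numbers`, ascending.
// (missingValues is a hypothetical helper name.)
std::vector<int> missingValues(const std::vector<int>& numbers, int n) {
    std::set<int> all;                               // every number up to n
    for (int v = 1; v <= n; ++v) all.insert(v);
    std::set<int> seen(numbers.begin(), numbers.end());  // numbers actually seen

    std::vector<int> missing;
    std::set_difference(all.begin(), all.end(), seen.begin(), seen.end(),
                        std::back_inserter(missing));
    return missing;
}
```

This only finds the missing numbers; replacing the duplicates in the array is still a separate pass.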
One way to do this would be to look at each element of the array in sequence, and see whether that element has been seen before in elements that you've already checked. If so, then change that number to one you haven't seen before, and proceed.
Allow me to introduce you to my friend Schlemiel the Painter. Discovery of a more efficient method is left as a challenge for the reader.
This kind of looks like homework, please let us know if it isn't. I'll give you a small hint, and then I'll improve my answer if you confirm this isn't homework.
My tip for now is this: If you were to do this by hand, how would you do it? Would you write out an extra list of numbers of some kind, would you read through the list (how many times?), etc.?
For simple problems, sometimes modelling your algorithm after an intuitive by-hand approach can work well.
Here's a link I read just today that may be helpful.
http://research.swtch.com/2008/03/using-uninitialized-memory-for-fun-and.html
So, I'm working on an assignment for my intro to computer science class. The assignment is as follows.
There is an organism whose population can be determined according to
the following rules:
The organism requires at least one other organism to propagate. Thus,
if the population goes to 1, then the organism will become extinct in
one time cycle (e.g. one breeding season). In an unusual turn of
events, an even number of organisms is not a good thing. The
organisms will form pairs and from each pair, only one organism will
survive. If there are an odd number of organisms and this number is
greater than 1 (e.g., 3,5,7,9,…), then this is good for population
growth. The organisms cannot pair up and in one time cycle, each
organism will produce 2 other organisms. In addition, one other
organism will be created. (As an example, let us say there are 3
organisms. Since 3 is an odd number greater than 1, in one time
cycle, each of the 3 organisms will produce 2 others. This yields 6
additional organisms. Furthermore, there is one more organism
produced so the total will be 10 organisms, 3 originals, 6 produced by
the 3, and then 1 more.)
A: Write a program that tests initial populations from 1 to 100,000.
Find all populations that do not eventually become extinct.
Write your answer here:
B: Find the value of the initial population that eventually goes
extinct but that has the largest number of time cycles before it does.
Write your answer here:
The general idea of what I have so far (lacking syntax) is this, with P representing the population:
int generations = 0;
{
    if (P % 2 != 0) // I'll use the modulus operator: if P % 2 is not 0 then I'll know it's odd
        P = 3 * P + 1;
    else
        P = P / 2;
    generations = generations + 1;
}
The problem for me is that I'm uncertain how to tell what numbers will not go extinct or how to figure out which population takes the longest time to go extinct. Any suggestions would be helpful.
Basically what you want to do is this: wrap your code into a while-loop that exits if either P==1 or generations > someMaxValue.
Wrap this construct into a for-loop that counts from 1 to 100,000 and uses this count to set the initial P.
If you always store the generations after your while-loop (e.g. into an array) you can then search for the greatest element in the array.
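As a rough sketch of that structure (not a finished solution): the names cyclesToExtinction, longestLived and the cap parameter maxCycles are assumptions for illustration. The sketch counts cycles all the way down to a population of 0 rather than stopping at 1, and uses a 64-bit type for the population.

```cpp
#include <cstdint>
#include <vector>

// Cycles until the population reaches 0, or -1 if the cap is hit first.
// (Hypothetical helper; maxCycles plays the role of someMaxValue.)
int cyclesToExtinction(std::uint64_t p, int maxCycles) {
    int cycles = 0;
    while (p != 0) {
        if (cycles >= maxCycles) return -1;   // give up: possibly never extinct
        if (p == 1)          p = 0;           // a lone organism dies out
        else if (p % 2 == 0) p = p / 2;       // pairs: one survivor per pair
        else                 p = 3 * p + 1;   // odd > 1: two more each, plus one
        ++cycles;
    }
    return cycles;
}

struct Longest { int start; int cycles; };

// Tries every initial population 1..limit, stores the results in an array,
// then searches the array for the greatest element.
Longest longestLived(int limit, int maxCycles) {
    std::vector<int> cycles(limit + 1, 0);
    for (int start = 1; start <= limit; ++start)
        cycles[start] = cyclesToExtinction(start, maxCycles);

    Longest best{0, 0};
    for (int start = 1; start <= limit; ++start)
        if (cycles[start] > best.cycles) best = {start, cycles[start]};
    return best;
}
```

Any start for which cyclesToExtinction returns -1 is only a candidate for "never goes extinct"; hitting the cap does not prove it.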
This problem can actually be harder than it looks at first sight. First, you should use memoization to speed things up. For example, with 3 you get 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1 -> 0, so you know the answer for all those numbers as well (note that every power of 2 will go extinct).
But as pointed out by @Jerry, the problem is with the populations that never go extinct: it will be difficult to say when to actually stop. The only chance is that there will (always) be a recurrence (a number of organisms you have already passed once while examining the current starting population); then you can say for sure that the organisms will not go extinct.
Edit: I quickly hacked together a solution and, if it is correct, you are lucky: every population between 1 and 100,000 seems to eventually go extinct (my program terminated, so I didn't actually need to check for recurrences). I will not give you the solution for now so that you can try by yourself and learn, but according to my program the largest number of cycles is 351 (and the starting population is close to 3/4 of the range). According to a Google search for the Collatz conjecture, that is the correct number (they say 350 steps to reach a population of 1, where I'm adding one extra cycle to reach 0), and the initial population number agrees as well.
One additional hint: check for integer overflow, and use a 64-bit integer (unsigned __int64, unsigned long long) to calculate the population growth, as with a 32-bit unsigned int there is already an overflow in the range 1-100,000 (the population can indeed grow much higher intermediately). That was a problem in my initial solution, although it did not change the result. With 64-bit ints I was able to calculate up to 100,000,000 in relatively decent time (I didn't try more; optimized release MSVC build). For that I had to limit the memo table to the first 80,000,000 items so as not to run out of memory (compiled as 32-bit with LARGEADDRESSAWARE to be able to use up to 4 GB of memory; when compiled as 64-bit the table could of course be larger).
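To illustrate the memoization idea without giving the assignment away, here is one possible shape for the memo table. It is a sketch under assumptions: the name extinctionTable is invented, limit must be at least 1, and the walk assumes every start in range eventually reaches a known value (which matches the observation above for 1-100,000, but is not checked).

```cpp
#include <cstdint>
#include <vector>

// memo[p] = cycles until population p reaches 0 (0 means "not computed yet").
// Returns the table for every start in 1..limit. Requires limit >= 1.
// (extinctionTable is a hypothetical name.)
std::vector<int> extinctionTable(std::uint32_t limit) {
    std::vector<int> memo(limit + 1, 0);
    memo[1] = 1;                                   // 1 -> 0 takes one cycle
    std::vector<std::uint64_t> path;               // 64-bit: values can exceed limit
    for (std::uint32_t start = 2; start <= limit; ++start) {
        path.clear();
        std::uint64_t p = start;
        // Walk until we reach a value whose answer is already in the table.
        while (p > limit || memo[p] == 0) {
            path.push_back(p);
            p = (p % 2 == 0) ? p / 2 : 3 * p + 1;
        }
        int cycles = memo[p];
        // Unwind, recording every value on the path that fits in the table.
        for (auto it = path.rbegin(); it != path.rend(); ++it) {
            ++cycles;
            if (*it <= limit) memo[*it] = cycles;
        }
    }
    return memo;
}
```

Values on a path that exceed limit are simply not stored, which is the same trade-off as capping the memo table size mentioned above.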
I came across this question from a colleague.
Q: Given a huge list (say, some thousands) of positive integers with many values repeating in the list, how do you find those values occurring an odd number of times?
Like 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 1 2 3 4 5 6 1 2 3 4 5 1 2 3 4 1 2 3 1 2 1...
Here,
1 occurs 8 times
2 occurs 7 times (must be listed in output)
3 occurs 6 times
4 occurs 5 times (must be listed in output)
& so on... (the above set of values is only for explaining the problem but really there would be any positive numbers in the list in any order).
Originally we were looking at deriving a logic (to be implemented in C).
I suggested the following,
Use a hash table with the values from the list as the index/key, and update the count at the corresponding index every time the value is encountered while walking through the list. However, how to decide on the size of the hash table? I couldn't say for sure, though it might require a hash table as big as the list.
Once the list has been walked through and the hash table is populated (with the count of occurrences for each value/index), is the only way to list the values occurring an odd number of times to walk through the table and find them? Is that the only way to do it?
This might not be the best solution given this scenario.
Can you please suggest any other efficient way of doing it?
I searched SO, but the questions and answers there were about finding a single value occurring an odd number of times, and none were like the one I have described.
The relevance of this question is not known, but it seems to have been asked in his interview...
Please suggest.
Thank You,
If the values to be counted are bounded by even a moderately reasonable limit then you can just create an array of counters, and use the values to be counted as the array indices. You don't need a tight bound, and "reasonable" is somewhat a matter of platform. I would not hesitate to take this approach for a bound (and therefore array size) sufficient for all uint16_t values, and that's not a hard limit:
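#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UPPER_BOUND 65536
uint64_t count[UPPER_BOUND];

/* Tally the input: afterwards count[v] is the number of times v occurred. */
void count_values(size_t num_values, uint16_t values[num_values]) {
    size_t i;
    memset(count, 0, sizeof(count));
    for (i = 0; i < num_values; i += 1) {
        count[values[i]] += 1;
    }
}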
Since you only need to track even vs. odd counts, though, you really only need one bit per distinct value in the input. Squeezing it that far is a bit extreme, but this isn't so bad:
#define UPPER_BOUND 65536
uint8_t odd[UPPER_BOUND];

/* Track parity only: each occurrence of v flips odd[v] between 0 and 1. */
void count_values(size_t num_values, uint16_t values[num_values]) {
    size_t i;
    memset(odd, 0, sizeof(odd));
    for (i = 0; i < num_values; i += 1) {
        odd[values[i]] ^= 1;
    }
}
At the end, odd[i] contains 1 if the value i appeared an odd number of times, and it contains 0 if i appeared an even number of times.
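Reading the result out is then a single pass over the parity array. The following is a C++ rendering of the same idea as a sketch, with the counting and the readout in one function; the name oddOccurrences and the use of std::vector are assumptions for illustration, not part of the C code above.

```cpp
#include <cstdint>
#include <vector>

// Returns, in ascending order, every value that occurs an odd number of
// times in `values`. (oddOccurrences is a hypothetical name.)
std::vector<std::uint16_t> oddOccurrences(const std::vector<std::uint16_t>& values) {
    std::vector<std::uint8_t> odd(65536, 0);   // one parity flag per possible value
    for (std::uint16_t v : values) odd[v] ^= 1;

    std::vector<std::uint16_t> result;
    for (std::uint32_t v = 0; v < odd.size(); ++v)
        if (odd[v]) result.push_back(static_cast<std::uint16_t>(v));
    return result;
}
```

The readout costs time proportional to the value range (65536 here) rather than to the input length, which is the price of indexing directly by value.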
On the other hand, if the values to be counted are so widely distributed that an array would require too much memory, then the hash table approach seems reasonable. In that case, however, you are asking the wrong question. Rather than
how to decide on the size of the hash table?
you should be asking something along the lines of "what hash table implementation doesn't require me to manage the table size manually?" There are several. Personally, I have used UTHash successfully, though as of recently it is no longer maintained.
You could also use a linked list maintained in order, or a search tree. No doubt there are other viable choices.
You also asked
Once the list has been walked through and the hash table is populated (with the count of occurrences for each value/index), is the only way to list the values occurring an odd number of times to walk through the table and find them? Is that the only way to do it?
If you perform the analysis via the general approach we have discussed so far then yes, the only way to read out the result is to iterate through the counts. I can imagine alternative, more complicated, approaches wherein you switch numbers between lists of those having even counts and those having odd counts, but I'm having trouble seeing how whatever efficiency you might gain in readout could fail to be swamped by the efficiency loss at the counting stage.
In your specific case, you can walk the list and toggle the value's existence in a set. The resulting set will contain all of the values that appeared an odd number of times. However, this only works for that specific predicate, and the more generic count-then-filter algorithm you describe will be required if you wanted, say, all of the entries that appear an even number of times.
Both algorithms should be O(N) time and worst-case O(N) space, and the constants will probably be lower for the set-based algorithm, but you'll need to benchmark it against your data. In practice, I'd run with the more generic algorithm unless there was a clear performance problem.
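The set-toggling approach could look something like this in C++. It is a sketch only: the function name oddSet and the choice of std::unordered_set with unsigned values are assumptions for illustration.

```cpp
#include <unordered_set>
#include <vector>

// Toggles each value's membership in a set; what remains at the end is
// exactly the values seen an odd number of times.
// (oddSet is a hypothetical name.)
std::unordered_set<unsigned> oddSet(const std::vector<unsigned>& values) {
    std::unordered_set<unsigned> odd;
    for (unsigned v : values) {
        // erase() returns how many elements it removed (0 or 1 for a set).
        if (odd.erase(v) == 0) odd.insert(v);
    }
    return odd;
}
```

Unlike the count-then-filter approach, there is no separate readout pass: the set itself is the answer.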
So I have this code. I'm not sure whether it works, because the program is still running.
void permute(std::vector<std::string>& wordsVector, std::string prefix, int length, std::string alphabet) {
    if (length == 0) {
        // end the recursion
        wordsVector.push_back(prefix);
    }
    else {
        for (std::size_t i = 0; i < alphabet.length(); ++i) {
            permute(wordsVector, prefix + alphabet.at(i), length - 1, alphabet);
        }
    }
}
where I'm trying to get all combinations of characters in the English alphabet of a given length. I'm not sure if the approach is correct at the moment.
alphabet consists of A-Z in a string of length 26. wordsVector holds all the different combinations of words. prefix is passed through recursively until a word is made, and length is self-explanatory.
Example, if I give the length of 7 to the function, I expect a size of 26 x 25 x 24 x 23 x 22 x 21 x 20 = 3315312000 if I'm correct, following the formula for permutations.
I don't think programs are meant to run this long so either I'm hitting an infinite loop or something is wrong with my approach. Please advise. Thanks.
Surely the stack would overflow, but concentrating on your question: even if you write an iterative program it will take a long time (not an infinite loop, just very long).
[26L, 650L, 15600L, 358800L, 7893600L, 165765600L, 3315312000L, 62990928000L, 1133836704000L, 19275223968000L, 308403583488000L, 4626053752320000L, 64764752532480000L, 841941782922240000L, 10103301395066880000L, 111136315345735680000L, 1111363153457356800000L, 10002268381116211200000L, 80018147048929689600000L, 560127029342507827200000L, 3360762176055046963200000L, 16803810880275234816000000L, 67215243521100939264000000L, 201645730563302817792000000L, 403291461126605635584000000L, 403291461126605635584000000L]
The list above gives the number of possibilities for 1<=n<=26. You can see that as n increases, the number of possibilities grows tremendously. Say you have a 1 GHz processor that does 10^9 operations per second, and consider the number of possibilities for n=26, which is 403291461126605635584000000L. It is evident that if you sit down to list all the possibilities, it will take so long (roughly 4 x 10^17 seconds, on the order of 10^10 years) that you will feel it has hit an infinite loop. Finally, I have not looked closely at your code, but in a nutshell: even if you write it correctly and iteratively, and don't store the results (again, you can't have that much memory) but just print all the possibilities, it is going to take a long time for larger values of n.
EDIT
As jaromax and others said, if you just want to write it for smaller values of n, say less than 10-12, you can write an iterative program to list/print them. It will run quite fast for small values. But if you also want to store them, then n will have to be smaller, say less than 5. (It really depends on how much RAM is available. Alternatively, you could find some permutations and write them to disk, in which case it depends on how much disk space you can spare. Again, refer to the list of possibilities I posted above; it gives a rough idea of both time and space complexity.)
I think it could be quite a problem that you do this on the stack. A large part of the calculation is done recursively, which means space is allocated for the function on every call.
Try to reformulate it linearly (iteratively). I think I had such a problem before.
Your question implies you think there are 26x25x24x ... permutations.
Your code doesn't have anything I can see to avoid "AAAAAAA" being a permutation, in which case there are 26x26x26x ... results.
So in addition to being a very complicated way of counting in base 26, I think it's also giving bad answers?
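If the 26x25x24x ... count is what is wanted, the recursion has to skip letters already in the prefix. A minimal sketch of one way to do that follows; the names permuteDistinct and permutations and the used flags are assumptions for illustration, and it assumes the letters in alphabet are distinct.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Appends to `out` every arrangement of `length` distinct letters drawn from
// `alphabet`. used[i] marks letters currently in `prefix`.
// (Hypothetical helper; assumes the letters of `alphabet` are distinct.)
void permuteDistinct(std::vector<std::string>& out, std::string& prefix,
                     std::vector<bool>& used, const std::string& alphabet,
                     std::size_t length) {
    if (prefix.size() == length) {
        out.push_back(prefix);
        return;
    }
    for (std::size_t i = 0; i < alphabet.size(); ++i) {
        if (used[i]) continue;             // skip letters already in the prefix
        used[i] = true;
        prefix.push_back(alphabet[i]);
        permuteDistinct(out, prefix, used, alphabet, length);
        prefix.pop_back();                 // undo the choice before trying the next
        used[i] = false;
    }
}

std::vector<std::string> permutations(const std::string& alphabet, std::size_t length) {
    std::vector<std::string> out;
    std::string prefix;
    std::vector<bool> used(alphabet.size(), false);
    permuteDistinct(out, prefix, used, alphabet, length);
    return out;
}
```

With a 3-letter alphabet and length 2 this yields 3x2 = 6 strings, where the original code would yield 3x3 = 9. For the full 26-letter alphabet and length 7 the result is still far too large to hold in memory, so it only helps with correctness, not with the running time.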
I am wondering how to make an array with values that start at 1111 and go all the way up to 8888. I am asking this because I need to generate a list of 4 digit numbers with each digit ranging from 1-8. I would like to have this in a loop form. Also, I am lost on my functions trim, methodicalEliminate, guessAndEliminate, and guessThreeThenEliminate in my following program. Here are the directions:
This assignment focuses on the use of arrays in a program, including using one as a parameter to a function.
PROBLEM DESCRIPTION
In the game of Mastermind, a player is only given a finite number of guesses to try to identify the hidden combination (such as twelve guesses). Often that is the only constraint in playing the game.
But some players might make a competition with each other to see who can guess the other's combination in fewer tries. In this case, the problem is not only to come up with a strategy that can find the answer within a specified limit, but to find one that is likely to require the minimal number of guesses.
Here is where the computer comes in -- one can write a program that would try out different guessing strategies, and see how they work out. Since a computer can do the analysis and computations more rapidly than a person, it could just pretend to play Mastermind on our behalf using any strategy we choose, and tell us how long it took to do that.
OVERALL SOLUTION
Of course, it would be extremely difficult to teach the computer to reason along the same lines as a person. For example, if we guessed a combination 1111 and got one black peg, we would make a mental note that the answer has exactly one 1 in it, and then proceed to make other guesses with that one fact in mind. If we next guess 1222 and get one white peg, we would know there were no 2's, and that the single 1 is not in the first position. But how to keep track of such information after a series of guesses would be rather hard.
Fortunately, for a computer simulation with an array, we can record all of our known facts in a different way. We just maintain a list of all possible answers that there could be, and then remove numbers from the list that could no longer be the solution. If our first guess tells us there is exactly one 1 digit, we would remove all the numbers that do not have that feature. When we find out there are no 2's, we eliminate all the values that contain 2's. Eventually, the only number left would be the correct answer.
SOME SIMPLE STRATEGIES
This is a strategy that many players use, resembling what was described above. Just methodically go through the possibilities in a straightforward fashion. The first guess of 1111 would answer how many 1's are in the solution; the next guess would answer how many 2's are in the solution, and also say something about where any 1's might be, and so on.
With our list approach, which contains a whole lot of possibilities in order beginning with 1111, 1112, 1113, 1114, etc., our next guess would always be the first in the list.
The next strategy is for those who like a little more excitement. The guesses appear to be more or less random, with the hopes that a lot more information can be discovered. Simulating this approach is surprisingly simple -- if you have a list of numbers, just pick one at random. If you have 837 possibilities to choose from in an array, just pick a random subscript in the range of 0 to 836.
This third strategy considers the possibility that answers that give similar results to a given guess are in a sense similar to each other. So to try to get a little more information, it will still pick some numbers at random without regard to how they were evaluated, and then only start thinking about the results.
To implement this one, let us just pick any three possible answers and guess them, temporarily ignoring how many black pegs and white pegs they earn us. Only after making those guesses will we trim the list of possibilities, then proceeding as the second strategy above.
SAMPLE INTERFACE
These are the sample results from the current implementation:
Please enter a combination to try for, or 0 for a random value: 0
Guessing at 2475
Guessing 1111...
Guessing 2222...
Guessing 2333...
Guessing 2444...
Guessing 2455...
Guessing 2456...
Guessing 2475...
Methodical Eliminate required 7 tries.
Guessing 6452...
Guessing 2416...
Guessing 2485...
Guessing 2445...
Guessing 2435...
Guessing 2425...
Guessing 2475...
Guess and Eliminate required 7 tries.
Guessing 7872...
Guessing 6472...
Guessing 1784...
Guessing 2475...
Guess Three then Eliminate required 4 tries.
Play another game? (y/n) y
Please enter a combination to try for, or 0 for a random value: 0
Guessing at 4474
Guessing 1111...
Guessing 2222...
Guessing 3333...
Guessing 4444...
Guessing 4445...
Guessing 4464...
Guessing 4474...
Methodical Eliminate required 7 tries.
Guessing 3585...
Guessing 7162...
Guessing 4474...
Guess and Eliminate required 3 tries.
Guessing 8587...
Guessing 1342...
Guessing 1555...
Guessing 7464...
Guessing 6764...
Guessing 4468...
Guessing 4474...
Guess Three then Eliminate required 7 tries.
NOTE: This program allows each digit to go up to 8 instead of 6. Even though there are 4096 possible answers, it still finds them rather rapidly.
PROGRAM SPECIFICATIONS
The assigned program must implement all of the following functions. Additional ones are permitted as desired -- these below are required. Future assignments will not detail the functions as below -- but will instead require the students to design their own function descriptions in advance of writing the program.
main:
Simply governs the overall behavior of the program. A number will be
chosen as the target combination, and then each strategy will attempt
to find it.
Calls: generateAnswer, (to compare all three must have the same answer)
methodicalEliminate, guessAndEliminate, guessThreeThenEliminate
generateAnswer:
Either lets the user at the keyboard choose the mystery combination,
or gives the option to have the computer generate a random combination.
(For a competitive game, it might be interesting to know what sorts
of combinations would be the hardest to guess!)
Parameters: none!
Returns: a 4-digit combination, each digit in the range 1 to 8
generateSearchSpace:
Populates an array with all possible combinations of four-digit
values in the range 1 to 8.
Parameters:
guesses (modified int array) list of guesses
length (output int) number of values in list
Pre-condition:
The array must be allocated to no fewer than 4096 elements.
trim:
Analyzes the response to a particular guess and then eliminates
any values from the list of possibilities that are no longer
possible answers. In each case, it assumes that a value in the
list is an answer, and evaluates the guess accordingly. If the
number of black and white pegs is not the same as those specified,
then it cannot be the correct answer.
Parameters:
guesses (modified int array) list of guesses
length (modified int) number of values in list
guess (input int) a guess that has been evaluated
black (input int) how many black pegs that guess earned
white (input int) how many white pegs that guess earned
Pre-condition:
black and white actually do contain the results of comparing
the guess with the actual answer
Post-condition:
length has been reduced (we learned something)
the viable answers occupy the first 'length' positions
in the guesses array (so the list is shorter)
Calls: evaluate
methodicalEliminate:
beginning with a list of all possible candidate answers
continually guesses the first element in the list, and
trim answers accordingly, until an answer is found
Parameter:
answer (input int) the actual answer
(necessary to get black/white pegs)
Returns: number of guesses required to find the answer
Calls: generateSearchSpace, evaluate, trim
guessAndEliminate:
beginning with a list of all possible candidate answers
continually guesses a random element in the list, and
trim answers accordingly, until an answer is found
Parameter:
answer (input int) the actual answer
Returns: number of guesses required to find the answer
Calls: generateSearchSpace, evaluate, trim
guessThreeThenEliminate:
beginning with a list of all possible candidate answers
first guesses three answers at random before trimming
the list of possibilities, and then narrows on the answer
one random guess at a time
Parameter:
answer (input int) the actual answer
Returns: number of guesses required to find the answer
Calls: generateSearchSpace, evaluate, trim
NOTE: These last functions use the correct answer to evaluate
each guess and then use the black/white pegs for the guessing
strategy. None of these strategies may peek at the answer to
decide what to do next!
ALSO: The following functions should also appear in this program
from the previous assignment, though they are not themselves
part of the grade for this one.
evaluate:
evaluates a combination by comparing it with the answer
Correctness is indicated by black pegs (correct digit in correct position)
and white pegs (correct digit in incorrect position)
Parameters:
answer (input int) the correct combination
guess (input int) the current guess
black (output int) number of black pegs
white (output int) number of white pegs
pre-conditions:
answer and guess are both 4-digit numbers with no zero digits
post-conditions:
black and white are both >= 0 and their sum is <= 4
Calls: nthDigit, clearNthDigit
nthDigit:
identifies the n'th digit of a combination
whether digits count from left to right or right to left is unspecified
Parameters:
combination (input int) combination to examine
position (input int) which digit to examine
(returned) (output int) the value of the actual digit
pre-conditions:
combination has the appropriate number of digits, and
0 < position <= number of digits
post-condition:
0 <= returned digit <= 9 (single digit)
clearNthDigit:
clears the n'th digit of a combination to zero, so it will no longer match
digits must be counted in the same manner as nthDigit above.
parameters:
combination (in/out int) combination to modify
position (input int) which digit to set to 0
pre-condition:
same as those for nthDigit above
post-condition:
corresponding digit is set to zero
Calls: nthDigit (optional, depending on the implementation)
Thank you for reading such a long question, and I hope you can help me with arrays!
Just because your guesses have the form of 1111 through to 8888, it does not mean they are numbers.
They are numbers if it makes sense to do arithmetic calculations on them. It does not make sense to define arithmetic calculations on the guesses: what would it mean to add a guess 4571 to a guess 6214?
If your guesses are not numbers, don't use a representation that is reserved for numbers. You can, however, use an array of four integers:
int guesses[4096][4]; // 4096 guesses, four digits each
int g = 0;
for (int i = 1; i <= 8; ++i)
    for (int j = 1; j <= 8; ++j)
        for (int k = 1; k <= 8; ++k)
            for (int m = 1; m <= 8; ++m) {
                guesses[g][0] = i;
                guesses[g][1] = j;
                guesses[g][2] = k;
                guesses[g][3] = m;
                ++g;
            }
I am pretty convinced that putting all possible guesses into the array guesses like that is not a good idea either; the code mainly demonstrates how the guesses could be generated.
Go through the other functions and think about what high-level operations should be performed on the remaining guesses (like "eliminate all guesses that have a certain number in the third position", etc.). This should give you an idea of what would be a better data structure to replace the guesses array.
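If you do have to stick to the int-array interface from the assignment, here is a rough sketch of how generateSearchSpace and trim might fit together. The parameter lists follow the assignment text, but the bodies are only one possible reading. In particular, this evaluate counts digits directly instead of using nthDigit and clearNthDigit as the assignment suggests, so treat it as a stand-in rather than the expected implementation.

```cpp
// Fills guesses[0..4095] with 1111, 1112, ..., 8888 (digits 1-8 only).
// Pre-condition: the array has room for at least 4096 elements.
void generateSearchSpace(int guesses[], int& length) {
    length = 0;
    for (int a = 1; a <= 8; ++a)
        for (int b = 1; b <= 8; ++b)
            for (int c = 1; c <= 8; ++c)
                for (int d = 1; d <= 8; ++d)
                    guesses[length++] = a * 1000 + b * 100 + c * 10 + d;
}

// Counts black pegs (right digit, right place) and white pegs (right digit,
// wrong place). Stand-in implementation based on digit counts.
void evaluate(int answer, int guess, int& black, int& white) {
    int inAnswer[10] = {0}, inGuess[10] = {0};
    black = 0;
    for (int i = 0; i < 4; ++i) {
        int a = answer % 10, g = guess % 10;
        if (a == g) ++black;
        ++inAnswer[a];
        ++inGuess[g];
        answer /= 10;
        guess /= 10;
    }
    int matches = 0;                     // digits in common, in any position
    for (int d = 0; d < 10; ++d)
        matches += (inAnswer[d] < inGuess[d]) ? inAnswer[d] : inGuess[d];
    white = matches - black;
}

// Keeps only the candidates that would have produced the same pegs for
// `guess`, compacting them into the first `length` positions.
void trim(int guesses[], int& length, int guess, int black, int white) {
    int kept = 0;
    for (int i = 0; i < length; ++i) {
        int b, w;
        evaluate(guesses[i], guess, b, w);
        if (b == black && w == white) guesses[kept++] = guesses[i];
    }
    length = kept;
}
```

For example, if the first guess 1111 earns no pegs at all, trim keeps only the candidates without any 1 in them, and the next methodical guess is whatever now sits at guesses[0].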
I removed the code because it's homework. If you actually need the help, you can either look at the discussion I had with George B (below), or PM me.
Hi guys. This is a homework assignment. I have tested it against other sorting algorithms, and Q.S. is the only one that is crashing on some random inputs.
The program is quite long (with other stuff), but the input is randomly generated....
I spent a few hours tracing the code and still couldn't figure out any error....
Q.S. is probably very easy for the professionals, so I hope to receive advice on this implementation....
Any input is appreciated!
Q: What is "random"?
A: A portion of the generation code is included.
void randomArray(unsigned long*& A, unsigned long size)
{
    //Note that RAND_MAX is a little small for some compilers (2^15-1).
    //In order to test our algorithms on large arrays without huge
    //numbers of duplicates, we'll set the high-order and low-order
    //parts of the return value with two random values.
    A = new unsigned long[size];
    for(unsigned long i=0; i<size; i++)
        //cast before shifting so the shift is done on an unsigned value
        A[i] = ((unsigned long)rand()<<16) | (unsigned long)rand();
    //Another note: initially, if you want to test your program out with smaller
    //arrays and small numbers, just reduce A[i] mod k for some small value k as in the following:
    //A[i] = rand() % 16;
    //this may help you debug at first.
}
Q: What kind of error?
Well, I am not getting a compilation error. Without Q.S., I can run the other four sorting algorithms without problems (I can run the sorting continuously). When Q.S. is activated, after running the program one, two, or three times, or even on the first run, the program ends (I am using Eclipse, so the console ends).
enter the number of elements, or a negative number to quit: 5
{some arrays}
selection sort took 0 seconds.
merge sort took 0 seconds.
quick sort took 0 seconds.
heap sort took 0 seconds.
bucket sort took 0 seconds.
{output of 5 sorted arrays}
enter the number of elements, or a negative number to quit: 6
{some arrays}
selection sort took 0 seconds.
merge sort took 0 seconds.
quick sort took 0 seconds.
heap sort took 0 seconds.
bucket sort took 0 seconds.
{output of 5 sorted arrays}
enter the number of elements, or a negative number to quit: 8
{arrays}
--- console ends---
Again, the problem is that it crashes quite often, which suggests that there is a high possibility of an access violation, but after doing 10+ traces I don't see the problem.... (maybe I overloaded my brain stack -_- )
Thanks.
Hint:
q is unsigned (the result of the partition function)
so, q-1 is also unsigned
what if q is zero?
(It is homework so you have to figure it out I guess :) )
Trace your algorithm with the array {2,5,2}. Obviously your program crashes as soon as there are duplicate numbers in your list. The first call of Partition will return 2 as the index of r. Thus, the second call of quickSort(A,3,2) will access memory locations not within array boundaries. It's always a good idea to do the boundary checks for arrays manually and generate understandable output to more easily trace and debug your program.
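Since the original code was removed, the following is not a fix of it but a generic sketch of a quicksort over unsigned indices that guards both boundary cases mentioned in this thread. The Lomuto partition scheme and the name partitionRange are assumptions chosen for illustration; the asker's Partition may have worked differently.

```cpp
#include <utility>

// Lomuto partition of A[p..r] (inclusive); returns the pivot's final index.
// (partitionRange is a hypothetical name.)
unsigned long partitionRange(unsigned long* A, unsigned long p, unsigned long r) {
    unsigned long pivot = A[r];
    unsigned long i = p;                   // next slot for an element <= pivot
    for (unsigned long j = p; j < r; ++j)
        if (A[j] <= pivot) std::swap(A[i++], A[j]);
    std::swap(A[i], A[r]);
    return i;
}

// Sorts A[p..r] (inclusive). The caller must not pass an empty range.
void quickSort(unsigned long* A, unsigned long p, unsigned long r) {
    if (p >= r) return;                    // zero or one element: nothing to do
    unsigned long q = partitionRange(A, p, r);
    if (q > p) quickSort(A, p, q - 1);     // guard: q - 1 would wrap around when q == 0
    quickSort(A, q + 1, r);                // if q == r this hits the p >= r check
}
```

With unsigned indices, q - 1 for q == 0 becomes a huge positive number instead of -1, so a p < r test alone does not stop the recursion; that is why the left call is guarded separately. The same care applies to the top-level call: size - 1 wraps around for an empty array.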