Is it worth memoising a primality test? - c++

I have another backtracking challenge, in which I have to find all possible combinations of prime numbers that add up to a certain number. I finished the task using the general algorithm from Wikipedia, but for the number 100 it ran for more than an hour and still hadn't finished by the end of class. I was wondering: would memoisation (how do you spell that?) have significantly improved the algorithm's performance, i.e. made it noticeably faster? I am using C++, and the primality function is called a huge number of times. I am using recursive backtracking, which I seem to remember is roughly O(n!) for simple problems.

Create an array, external to the function that checks for primality and reachable from it (global or static, depending on the language). That array will contain all primes found so far, in increasing order.
If the number in question is in the array, return true.
If the number is less than or equal to the largest prime in the array but is not in it, return false.
Otherwise, check the number for divisibility by all known primes up to its square root.
If no known prime divides it, the number is prime: write it into the array and return true.
Otherwise return false.
That addition is simple enough. Do it and check how the timing changes.
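A minimal C++ sketch along those lines (the function and variable names are mine). It deviates slightly from the steps above: instead of appending whichever prime was just tested, it keeps the array as a gap-free run of consecutive primes and extends it on demand, so the trial division stays valid regardless of the order in which candidates arrive:

    #include <vector>

    // Sketch of the memoised primality test described above. `known` always
    // holds a run of consecutive primes starting at 2; if it does not yet
    // reach sqrt(n), the gap is filled on demand so the trial division below
    // stays valid no matter what order the candidates arrive in.
    bool is_prime(unsigned long long n) {
        static std::vector<unsigned long long> known = {2, 3};
        if (n < 2) return false;

        // Extend the list of known primes until known.back()^2 >= n.
        while (known.back() * known.back() < n) {
            for (unsigned long long c = known.back() + 2; ; c += 2) {
                bool composite = false;
                for (unsigned long long p : known) {
                    if (p * p > c) break;
                    if (c % p == 0) { composite = true; break; }
                }
                if (!composite) { known.push_back(c); break; }
            }
        }

        // n is prime iff no prime <= sqrt(n) divides it.
        for (unsigned long long p : known) {
            if (p * p > n) break;
            if (n % p == 0) return false;
        }
        return true;
    }

Each prime is found only once per program run, so repeated calls from the backtracking search become cheap lookups plus a short trial division.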


Time Complexity of Fibonacci series in Bottom Up approach (DP)

Algorithm in Bottom up approach
a[0] = 0, a[1] = 1
integer fibo(n)
    if a[n] == null
        a[n] = fibo(n-1) + fibo(n-2)
    return a[n]
How does this algorithm have a time complexity of O(N)?
For n = 5 it makes 8 calls.
Passes of the Fibonacci series in the bottom-up approach:
fibo(5) makes 8 calls going from the top down and another 8 calls returning from the bottom back to the top, so the total is 8 + 8 = 16 in my view. So how the time complexity is O(N) is unclear to me.
I found many similar questions answered here, but none of them addresses what I'm asking.
Some of these are:
Time Complexity of Fibonacci Series
Time Complexity of Fibonacci Algorithm
Any help would be appreciated. Thanks.
There are a couple of quick things to mention before answering your question about the time complexity. The reason for this is that the time complexity at least partially depends on these answers.
First, there seems to be a bug in your program: you have an array 'a' for the base cases (Fibonacci numbers 0 and 1) and some array 'm' which is set in the fibo function but never used again. More importantly, when you reach n=1 or n=0, you return the value of m[n], which is entirely unknown. So I'm going to assume the algorithm is rewritten as follows:
a[0] = 0, a[1] = 1
integer fibo(n)
    if a[n] == null
        a[n] = fibo(n-1) + fibo(n-2)
    return a[n]
Okay, second problem. Let's assume that a is always defined as at least n+1 integers; there needs to be enough room for the incoming data. This is important because C++ will happily let you write past the end of the array. That is out of bounds and wrong, but C++ doesn't give you that sort of protection; it is up to you as the programmer to verify boundary conditions like that. (I'm assuming C++ because the question is tagged with it. The code looks more like Python, which has its wrap-around indices, and those are problematic in their own right.)
Third, let's assume that you don't start with a new array 'a' for each run of the algorithm. This is important because if a stores already-calculated values then you will save time on calculation by not having to re-evaluate those values. That time savings is a great thing even if it won't affect how I calculate time complexity.
Great. Let's get started with your question. Let's use the image below to answer it. When you start the algorithm at n you are going to make two recursive calls for fibo(n-1) and fibo(n-2) BUT they do not happen simultaneously. Instead the first call for fibo(n-1) takes place and must be 100% complete before the second call for fibo(n-2) begins. That call is represented by the green line from n-1 on the nth line to the n-1th line.
Now, those green lines apply to each recursion down the line until you reach the fibo(1) call. That call terminates early because a[n] is NOT null. Finally the second call for fibo(0) is executed and it also terminates early because a[n] is not null. Okay, so much for the first set of recursive calls.
As each recursive call returns, the second call (represented by the orange broken line) is made, but a[n] is no longer null, so that call terminates early and the call returns up to the next layer.
So, let's count the number of calls. From n to 1 is n-1 recursive calls. At the end there is one additional call to fibo(0) so that is n recursive calls. Then on the way up there are n-2 additional calls which terminate early. So, altogether we have 2n-2 calls which is O(n).
Of course, if you call fibo(k) and then fibo(k+x) you will only need to do the first 2x calls because everything from fibo(k) down is already known. It is a considerable savings after the initial investment. Any questions?
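For concreteness, here is a small runnable C++ version of the memoised fibo above with a call counter (the harness is mine); the count for n = 5 matches the argument: the initial call plus 2n - 2 = 8 recursive calls.

    #include <cstdio>
    #include <vector>

    // Sketch of the memoised fibo() above, with a counter to verify the
    // "roughly 2n - 2 calls" argument. -1 plays the role of "null".
    static std::vector<long long> a;
    static int calls = 0;

    long long fibo(int n) {
        ++calls;
        if (a[n] == -1)
            a[n] = fibo(n - 1) + fibo(n - 2);
        return a[n];
    }

    int main() {
        int n = 5;
        a.assign(n + 1, -1);
        a[0] = 0;
        a[1] = 1;
        std::printf("fibo(%d) = %lld after %d calls\n", n, fibo(n), calls);
        // Prints: fibo(5) = 5 after 9 calls (the top-level call plus 2n - 2 = 8)
    }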
Regarding O(2n) = O(n), that is a good follow-up. Big-O complexity rules say that we are interested in the order of magnitude when comparing efficiency. So suppose you were looking at n = 1000: n steps is 1000, 2n steps is 2000, but n² steps is 1,000,000. O(n) is more or less the same as O(2n), but compared with O(n²) that is a huge difference. Similarly, n+1 = 1001 isn't much different from n. So, in general, we say that the leading term, the most important value in the expression, is what matters. We aren't really interested in extra terms, and we aren't really interested in specific coefficients, because they don't really affect the outcome.
If you still have questions, see this site for some additional information.
https://justin.abrah.ms/computer-science/big-o-notation-explained.html

how does IF affect complexity?

Let's say we have an array of 1,000,000 elements and we go through all of them to check something simple, for example whether the first character is "A". From my (very limited) understanding, the complexity will be O(n) and it will take some amount of time X. If I add another IF (not else if) to check, let's say, whether the last character is "G", how will that change the complexity? Will it double the complexity and the time, like O(2n) and 2X?
I would like to avoid taking into consideration the number of calculations different commands have to make. For example, I understand that Len() requires more calculations than a simple character comparison, but let's say the commands used in the IFs have (almost) the same cost.
O(2n) = O(n). Generalizing, O(kn) = O(n), with k being a constant. Sure, with two IFs it might take twice the time, but execution time will still be a linear function of input size.
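For illustration (the container and the count_matches name are made up), two constant-time checks per element are roughly twice the work of one, but still a constant amount per element, so the whole loop stays linear:

    #include <string>
    #include <vector>

    // Hypothetical example: two constant-time checks per element.
    // Roughly twice the work of a single check, but still O(n).
    int count_matches(const std::vector<std::string>& v) {
        int hits = 0;
        for (const std::string& s : v) {
            if (!s.empty() && s.front() == 'A') ++hits;  // first IF
            if (!s.empty() && s.back() == 'G') ++hits;   // second IF: adds a constant per element
        }
        return hits;   // ~2n comparisons in total: O(2n) = O(n)
    }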
Edit: Here and Here are explanations, with examples, of big-O notation that are not too mathematically oriented.
Asymptotic complexity (which is what big-O uses) is not dependent on constant factors: more specifically, you can multiply or divide the function by any constant factor and it will remain equivalent (i.e. O(2n) = O(n)).
Assuming an if-statement takes a constant amount of time, it will only add a constant factor to the complexity.
A "constant amount of time" means:
The time taken for that if-statement for a given element is not dependent on how many other elements there are in the array
So basically if it doesn't call a function which looks through the other elements in the array in some way or something similar to this
Any non-function-calling if-statement is probably fine (unless it contains a statement that goes through the array, which some languages allow)
Thus 2 (constant-time) if-statements evaluated for each element will be O(2n), but this is equal to O(n) (well, it might not really be 2n; more on that in the additional note).
See Wikipedia for more details and a more formal definition.
Note: Apart from not being dependent on constant factors, it is also not dependent on asymptotically smaller terms (terms which remain smaller regardless of how big n gets), e.g. O(n) = O(n + sqrt(n)). And big-O is just an upper bound, so saying it is O(n^9999) would also be correct (though saying that in a test / exam will probably get you 0 marks).
Additional note: The problem when not ignoring constant factors is - what classifies as a unit of work? There is no standard definition here. One way is to use the operation that takes the longest, but determining this may not always be straight-forward, nor would it always be particularly accurate, nor would you be able to generically compare complexities of different algorithms.
Some key points about time complexity:
Theta notation - exact bound. If the piece of code we are analysing contains an if/else and either branch has code whose running time grows with the input size, an exact bound can't be given, since either branch might be taken; Theta notation is not advisable in such cases. On the other hand, if both branches resolve to constant-time code, Theta notation can be applied.
Big O notation - upper bound. If the code has conditionals where either branch might grow with the input size n, we assume the maximum, i.e. the upper bound, of the time consumed by the code; hence we use Big O for such conditionals, assuming we take the path with the maximum time consumption. The cheaper path can then be treated as O(1) in an amortized analysis (assuming it contains no recursion that grows with the input size), and the Big O complexity is calculated for the most expensive path.
Big Omega notation - lower bound. This is the minimum guaranteed time that a piece of code takes, irrespective of the input. It is useful for cases where the time taken doesn't grow with the input size n but still consumes a significant amount of time k; in these cases a lower-bound analysis can be used.
Note: None of these notations depends on the input being best/average/worst case, and all of them can be applied to any piece of code.
So, as discussed above, Big O doesn't care about constant factors such as k; it only looks at how the time increases with respect to growth in n, which here gives O(kn) = O(n), i.e. linear.
PS: This post was about the relation between Big O and the evaluation of conditionals in amortized analysis.
It's related to a question I posted myself today.
In your example it depends on whether you can jump straight from the first to the last element; if you can't, then it also depends on the average length of each entry.
If, as you went down through the array, you had to read each full entry in order to evaluate your two if statements, then your order would be O(1,000,000 × N), where N is the average length of each entry. If N is variable then it will affect the order. An example is standard multiplication, where we perform Log(N) additions of an entry which is Log(N) in length, so the order is O(Log^2(N)), or if you prefer O((Log(N))^2).
On the other hand, if you can just check the first and last character, then N = 2 and is constant, so it can be ignored.
This is an important point, but you have to be careful: how do you decide whether your multiplier can be ignored? For example, say we were doing Log(N) additions of a Log(N/100)-digit number. Just because Log(N/100) is the smaller term doesn't mean we can ignore it. A multiplying factor cannot be ignored if it is variable.

Algorithm to find a duplicate entry in constant space and O(n) time

Given an array of N integers such that only one integer is repeated, find the repeated integer in O(n) time and constant space. There is no bound on the value of the integers or on N.
For example, given the array of 6 integers 23 45 67 87 23 47, the answer is 23.
(I hope this covers the ambiguous and vague parts.)
I searched the net but was unable to find any such question in which the range of the integers was not fixed.
Also, here is an example that answers a similar question to mine, but there the author created a hash table sized by the highest integer value in C++. But C++ does not allow you to create an array with 2^64 elements (on a 64-bit computer).
I am sorry I didn't mention it before: the array is immutable.
Jun Tarui has shown that any duplicate finder using O(log n) space requires at least Ω(log n / log log n) passes, which exceeds linear time. I.e. your question is provably unsolvable even if you allow logarithmic space.
There is an interesting algorithm by Gopalan and Radhakrishnan that finds duplicates in one pass over the input and O((log n)^3) space, which sounds like your best bet a priori.
Radix sort has time complexity O(kn) where k > log_2 n often gets viewed as a constant, albeit a large one. You cannot implement a radix sort in constant space obviously, but you could perhaps reuse your input data's space.
There are numerical tricks if you assume features about the numbers themselves. If almost all numbers between 1 and n are present, then simply add them up and subtract n(n+1)/2. If all the numbers are primes, you could cheat by ignoring the running time of division.
As an aside, there is a well-known lower bound of Ω(log_2(n!)) on comparison sorting, which suggests that google might help you find lower bounds on simple problems like finding duplicates as well.
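When the stronger assumption holds that the array contains exactly the numbers 1..n plus one extra copy of a single value, the "add them up" trick mentioned above is a one-pass, constant-space sketch (the function name is mine):

    #include <cstdint>
    #include <vector>

    // Only valid under the stated assumption: the array holds 1..n plus one
    // repeated value, so the sum exceeds n(n+1)/2 by exactly the duplicate.
    std::uint64_t find_dup_by_sum(const std::vector<std::uint64_t>& a) {
        std::uint64_t n = a.size() - 1;        // n distinct values + 1 repeat
        std::uint64_t sum = 0;
        for (std::uint64_t x : a) sum += x;
        return sum - n * (n + 1) / 2;
    }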
If the array isn't sorted, you can only do it in O(n log n).
Some approaches can be found here.
If the range of the integers is bounded, you can perform a counting sort variant in O(n) time. The space complexity is O(k) where k is the upper bound on the integers(*), but that's a constant, so it's O(1).
If the range of the integers is unbounded, then I don't think there's any way to do this, but I'm not an expert at complexity puzzles.
(*) It's O(k) since there's also a constant upper bound on the number of occurrences of each integer, namely 2.
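A minimal sketch of that counting approach, with k passed in explicitly as the assumed bound on the values:

    #include <vector>

    // Counting-sort-style duplicate finder, assuming values lie in [0, k).
    // O(n) time, O(k) space -- "constant" only if k is a fixed bound.
    int find_duplicate_bounded(const std::vector<int>& a, int k) {
        std::vector<int> count(k, 0);
        for (int x : a)
            if (++count[x] == 2)   // second occurrence -> this is the repeat
                return x;
        return -1;                 // no duplicate found
    }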
In the case where the entries are bounded by the length of the array, then you can check out Find any one of multiple possible repeated integers in a list and the O(N) time and O(1) space solution.
The generalization you mention is discussed in this follow up question: Algorithm to find a repeated number in a list that may contain any number of repeats and the O(n log^2 n) time and O(1) space solution.
The approach that would come closest to O(N) in time is probably a conventional hash table, where the hash entries are simply the numbers, used as keys. You'd walk through the list, inserting each entry in the hash table, after first checking whether it was already in the table.
Not strictly O(N), however, since hash search/insertion gets slower as the table fills up. And in terms of storage it would be expensive for large lists -- at least 3x and possibly 10-20x the size of the array of numbers.
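A sketch of that hash-table walk with std::unordered_set; the extra storage is exactly why it does not meet the constant-space requirement of the question:

    #include <unordered_set>
    #include <vector>

    // Hash-based duplicate finder: expected O(n) time, but O(n) extra space.
    int find_duplicate_hashed(const std::vector<int>& a) {
        std::unordered_set<int> seen;
        for (int x : a)
            if (!seen.insert(x).second)   // insert() reports whether x was new
                return x;
        return -1;                        // no repeat present
    }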
As was already mentioned by others, I don't see any way to do it in O(n).
However, you can try a probabilistic approach by using a Bloom Filter. It will give you O(n) if you are lucky.
Since extra space is not allowed, this can't be done without comparisons. The concept of the lower bound on the time complexity of comparison sorting can be applied here to argue that the problem, in its original form, can't be solved in O(n) in the worst case.
We can also do it by sorting first, although the sort makes this O(n log n) rather than O(n):
import java.util.Arrays;

public class DuplicateInOnePass {

    public static void duplicate() {
        int[] ar = {6, 7, 8, 8, 7, 9, 9, 10};
        Arrays.sort(ar);
        for (int i = 0; i < ar.length - 1; i++) {
            if (ar[i] == ar[i + 1])
                System.out.println("Duplicate element: " + ar[i]);
        }
    }

    public static void main(String[] args) {
        duplicate();
    }
}

Perfect hash function for a set of integers with no updates

In one of the applications I work on, it is necessary to have a function like this:
bool IsInList(int iTest)
{
    // Return whether iTest appears in the set of numbers.
}
The number list is known at app load-up (but is not always the same between two instances of the application) and will not change (or be added to) for the whole run of the program. The integers themselves may be large and have a large range, so it is not efficient to have a vector<bool>. Performance is an issue, as the function sits in a hot spot. I have heard about perfect hashing but could not find any good advice. Any pointers would be helpful. Thanks.
p.s. Ideally the solution wouldn't be a third-party library, because I can't use them here. Something simple enough to be understood and implemented manually would be great, if possible.
I would suggest using Bloom Filters in conjunction with a simple std::map.
Unfortunately the bloom filter is not part of the standard library, so you'll have to implement it yourself. However it turns out to be quite a simple structure!
A Bloom Filter is a data structure specialized for one question: is this element part of the set? It answers it with an incredibly tight memory requirement, and quite fast too.
The slight catch is that the answer is... special. Is this element part of the set?
No
Maybe (with a given probability depending on the properties of the Bloom Filter)
This looks strange until you look at the implementation, and it may require some tuning (there are several parameters) to lower the probability, but...
What is really interesting for you is that whenever it answers No, you have the guarantee that the element isn't part of the set.
As such, a Bloom Filter is ideal as a doorman for a binary tree or a hash map. Carefully tuned, it will let only very few false positives through. For example, gcc uses one.
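Since the standard library has no Bloom filter, here is a deliberately tiny sketch of the structure (the bit-array size, the two salted hashes and the class name are all my own choices, not a tuned design): a false answer is definitive, a true answer means "maybe", so you fall through to the real std::map only in that case.

    #include <bitset>
    #include <cstddef>
    #include <functional>

    class BloomFilter {
        static const std::size_t kBits = 1 << 20;              // ~1M bits = 128 KiB
        std::bitset<kBits> bits_;

        static std::size_t h(int x, std::size_t salt) {
            // Two cheap hash functions derived from std::hash with different salts.
            return std::hash<long long>()(1000003LL * x + static_cast<long long>(salt)) % kBits;
        }

    public:
        void insert(int x) {
            bits_.set(h(x, 0x9e3779b9));
            bits_.set(h(x, 0x85ebca6b));
        }

        // false -> definitely not in the set; true -> maybe (check the real container).
        bool maybeContains(int x) const {
            return bits_.test(h(x, 0x9e3779b9)) && bits_.test(h(x, 0x85ebca6b));
        }
    };

A lookup would then first ask the filter and only consult the std::map when it says "maybe".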
What comes to my mind is gperf. However, it is based on strings rather than numbers; still, part of the calculation can be tweaked to use numbers as input for the hash generator.
integers, strings, doesn't matter
http://videolectures.net/mit6046jf05_leiserson_lec08/
After the intro, at 49:38, you'll learn how to do this. The dot-product hash function is demonstrated since it has an elegant proof. Most hash functions are like voodoo black magic. Don't waste time here; find something that is FAST for your datatype and that offers an adjustable SEED for hashing. A good combination there is better than the alternative of growing the hash table.
At 54:30 the professor draws a picture of a standard way of doing perfect hashing. Minimal perfect hashing is beyond this lecture. (Good luck!)
It really all depends on what you mod by.
Keep in mind, the analysis he shows can be further optimized by knowing the hardware you are running on.
With std::map you get very good performance in 99.9% of scenarios. If your hot spot sees the same iTest value(s) multiple times, combine the map result with a temporary hash cache.
Int is one of the datatypes where it is possible to just do:
bool hash[UINT_MAX]; // stackoverflow ;)
And fill it up. If you don't care about negative numbers, then it's twice as easy.
A perfect hash function maps a set of inputs onto the integers with no collisions. Given that your input is a set of integers, the values themselves are a perfect hash function. That really has nothing to do with the problem at hand.
The most obvious and easy to implement solution for testing existence would be a sorted list or balanced binary tree. Then you could decide existence in log(N) time. I doubt it'll get much better than that.
For this problem I would use a binary search, assuming it's possible to keep the list of numbers sorted.
Wikipedia has example implementations that should be simple enough to translate to C++.
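Rather than translating the Wikipedia code, a minimal sketch with the standard library, assuming the list can be sorted once at load-up (InitList and the global are names I made up):

    #include <algorithm>
    #include <vector>

    // Build the sorted list once at application start, then answer IsInList
    // with std::binary_search in O(log N) per query.
    static std::vector<int> g_numbers;

    void InitList(std::vector<int> numbers) {
        std::sort(numbers.begin(), numbers.end());
        g_numbers = std::move(numbers);
    }

    bool IsInList(int iTest) {
        return std::binary_search(g_numbers.begin(), g_numbers.end(), iTest);
    }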
It's not necessary or practical to aim for mapping N distinct, randomly dispersed integers to N contiguous buckets - i.e. a perfect minimal hash - the important thing is to identify an acceptable ratio. To do this at run time, you can start by configuring a worst-acceptable ratio (say 1 to 20) and a no-point-being-better-than-this ratio (say 1 to 4), then randomly vary (e.g. by changing the prime numbers used) a fast-to-calculate hash algorithm to see how easily you can meet increasingly difficult ratios. For the worst-acceptable ratio you don't time out, or you fall back on something slower but reliable (a container, or displacement lists to resolve collisions). Then allow a second or ten (configurable) for each X% better, until you can't succeed at that ratio or you reach the no-point-being-better ratio...
Just so everyone's clear, this works for inputs only known at run time with no useful patterns known beforehand, which is why different hash functions have to be trialled or actively derived at run time. It is not acceptable to simply say "integer inputs form a hash", because there are collisions once they're %-ed into any sane array size. But you don't need to aim for a perfectly packed array either. Remember too that you can have a sparse array of pointers into a packed array, so little memory is wasted for large objects.
After working with it for a while, I came up with a number of hash functions that seemed to work reasonably well on strings, resulting in a unique, i.e. perfect, hashing.
Let's say the values ranged from L to H in the array. This yields a range R = H - L + 1.
Generally it was pretty big.
I then applied the modulus operator, from H down to L + 1, looking for a mapping that keeps the values unique but has a smaller range.
In your case you are using integers. Technically, they are already hashed, but the range is large.
It may be that you can get what you want simply by applying the modulus operator.
It may be that you need to put a hash function in front of it first.
It may also be that you can't find a perfect hash for it, in which case your container class should have a fallback position: binary search, or a map, or something like that, so that you can guarantee that the container will work in all cases.
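A sketch of that modulus search (the function name is mine). It searches upward from the number of keys, which finds the smallest collision-free modulus, rather than downward from H; the resulting h(x) = x % m is collision-free for this key set, though not minimal:

    #include <cstddef>
    #include <unordered_set>
    #include <vector>

    // Find the smallest modulus m under which all keys map to distinct
    // buckets. Negative keys are reinterpreted as unsigned before the %.
    std::size_t find_distinct_modulus(const std::vector<int>& keys) {
        if (keys.empty()) return 1;
        for (std::size_t m = keys.size(); ; ++m) {
            std::unordered_set<std::size_t> used;
            bool ok = true;
            for (int k : keys) {
                std::size_t bucket = static_cast<std::size_t>(static_cast<unsigned>(k)) % m;
                if (!used.insert(bucket).second) { ok = false; break; }   // collision at this modulus
            }
            if (ok) return m;   // every key landed in its own bucket
        }
    }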
A trie, or perhaps a van Emde Boas tree, might be a better bet for creating a space-efficient set of integers with lookup time that is constant in the number of objects in the data structure, assuming that even a std::bitset would be too large.

How to ensure that randomly generated numbers are not being repeated? [duplicate]

Possible Duplicates:
Unique (non-repeating) random numbers in O(1)?
How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N
I want to generate random numbers in a certain range, and I must be sure that each new number is not a duplicate of a former one. One solution is to store the previously generated numbers in a container and check each new number against it: if the number is already in the container, generate again; otherwise use it and add it to the container. But with each new number this operation becomes slower and slower. Is there a better approach, or any rand function that can work faster and still ensure uniqueness?
EDIT: Yes, there is a limit (for example from 0 to 1,000,000,000), but I want to generate 100,000 unique numbers! (It would be great if the solution used Qt features.)
Is there a range for the random numbers? If you have a limit for random numbers and you keep generating unique random numbers, then you'll end up with a list of all numbers from x..y in random order, where x-y is the valid range of your random numbers. If this is the case, you might improve speed greatly by simply generating the list of all numbers x..y and shuffling it, instead of generating the numbers.
I think there are 3 possible approaches; depending on the range size and the performance pattern needed, you can pick a different algorithm.
Create a random number and see if it is in a (sorted) list. If not, add it and return it; otherwise try another.
Your list will grow and consume memory with every number you need. If every number is 32 bits, it will grow by at least 32 bits every time.
Every new random number increases the hit ratio, which will make it slower.
O(n^2) - I think
Create a bit array with one entry for every number in the range. Mark an entry with 1/true once that number has been returned.
Every number now only takes 1 bit. This can still be a problem if the range is big, but each number only allocates 1 bit.
Every new random number increases the hit ratio, which will make it slower.
O(n*2)
Pre-populate a list with all the numbers, shuffle it, and return the Nth number.
The list will not grow and returning numbers will not get slower,
but generating the list might take a long time and a lot of memory.
O(1)
Depending on the speed needed, you could store the lists in a database; there's no need for them to be in memory except for speed.
Fill out a list with the numbers you need, then shuffle the list and pick your numbers from one end.
If you use a simple 32-bit linear congruential RNG (such as the so-called "Minimal Standard"), all you have to do is store the seed value you use and compare each generated number to it. If you ever reach that value again, your sequence is starting to repeat itself and you're out of values. This is O(1), but of course limited to 2^32-1 values (though I suppose you could use a 64-bit version as well).
There is a class of pseudo-random number generators that, I believe, has the properties you want: the Linear congruential generator. If defined properly, it will produce a list of integers from 0 to N-1, with no two numbers repeating until you've used all of the numbers in the list once.
#include <stdint.h>

/*
 * Choose these values as follows:
 *
 * The MODULUS and INCREMENT must be relatively prime.
 * The MULTIPLIER-1 must be divisible by all prime factors of the MODULUS.
 * The MULTIPLIER-1 must be divisible by 4, if the MODULUS is divisible by 4.
 *
 * In addition, the MODULUS must be <= 2**32 (0x0000000100000000ULL).
 *
 * A small example would be 8, 5, 3.
 * A larger example would be 256, 129, 251.
 * A useful example would be 0x0000000100000000ULL, 1664525, 1013904223.
 */
#define MODULUS    (0x0000000100000000ULL)
#define MULTIPLIER (1664525)
#define INCREMENT  (1013904223)

static uint64_t seed;

uint32_t lcg( void ) {
    uint64_t temp;
    temp = seed * MULTIPLIER + INCREMENT;   /* 64-bit intermediate product */
    seed = temp % MODULUS;                  /* 32-bit end result */
    return (uint32_t) seed;
}
All you have to do is choose a MODULUS such that it is larger than the number of numbers you'll need in a given run.
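A small usage sketch (my own harness, assumed to be compiled together with the generator above). The constants satisfy the conditions listed in the comment, so the period is the full 2^32: every value returned within one period is distinct, and no record of previously returned numbers is needed.

    #include <stdint.h>
    #include <stdio.h>

    uint32_t lcg(void);   /* defined in the snippet above */

    int main(void) {
        /* Draw a few values; within one full period none of them repeats. */
        for (int i = 0; i < 5; ++i)
            printf("%u\n", lcg());
        return 0;
    }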
It wouldn't be random if there were such a pattern, would it?
As far as I know you would have to store and filter out all unwanted numbers...
#include <algorithm>   // std::random_shuffle (deprecated since C++14; std::shuffle is the modern replacement)
#include <vector>

unsigned int N = 1000;
std::vector<unsigned int> vals(N);
for (unsigned int i = 0; i < vals.size(); ++i)
    vals[i] = i;
std::random_shuffle(vals.begin(), vals.end());

unsigned int random_number_1 = vals[0];
unsigned int random_number_2 = vals[1];
unsigned int random_number_3 = vals[2];
// etc
You could store the numbers in a vector and get them by index (1..n-1). After each random pick, remove the chosen number from the vector, then generate the next index in the interval 1..n-2, and so on.
If they can't be repeated, they aren't random.
EDIT:
Furthermore..
if they can't be repeated, they don't fit in a finite computer
How many random numbers do you need? Maybe you can apply a shuffle algorithm to a precalculated array of random numbers?
There is no way a random generator will output values that depend on previously output values, because then they wouldn't be random. However, you can improve performance by using different pools of random values, each combined with a different salt value, which will divide the quantity of numbers to check by the number of pools you have.
If the range of the random numbers doesn't matter, you could use a really large range and hope you don't get any collisions. If your range is billions of times larger than the number of elements you expect to create, the chance of a collision is small but still there. If the numbers don't have to have a truly random distribution, you could use a two-part number {counter}{random x digits}, which would ensure a unique number, but it wouldn't be randomly distributed.
There's not going to be a pure functional approach that isn't O(n^2) on the number of results returned so far - every time a number is generated you will need to check against every result so far. Additionally, think about what happens when you're returning e.g. the 1000th number out of 1000 - you will require on average 1000 tries until the random algorithm comes up with the last unused number, with each attempt requiring an average of 499.5 comparisons with the already-generated numbers.
It should be clear from this that your description as posted is not quite exactly what you want. The better approach, as others have said, is to take a list of e.g. 1000 numbers upfront, shuffle it, and then return numbers from that list incrementally. This will guarantee you're not returning any duplicates, and return the numbers in O(1) time after the initial setup.
You can allocate enough memory for an array of bits, with 1 bit for each possible number, and check/set the bit for every generated number. For example, for numbers from 0 to 65535 you will need only 8192 bytes (8 KB) of memory.
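A sketch of that idea wrapped in a small class (the class and method names are mine); note that it slows down as the range fills up and must not be called more times than there are values:

    #include <cstdlib>
    #include <vector>

    class UniqueRand {
        std::vector<bool> used_;                       // one bit per candidate value
    public:
        explicit UniqueRand(unsigned range) : used_(range, false) {}

        // Returns a value in [0, range) that has not been returned before.
        // The caller must not request more values than the range contains.
        unsigned next() {
            unsigned v;
            do {
                v = static_cast<unsigned>(std::rand()) % static_cast<unsigned>(used_.size());
            } while (used_[v]);
            used_[v] = true;
            return v;
        }
    };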
Here's an interesting solution I came up with:
Assume you have the numbers 1 to 1000 and you don't have enough memory.
You could put all 1000 numbers into an array and remove them one by one, but you'll run out of memory.
You could split the array in two, so you have an array of 1-500 and one empty array.
You could then check whether the number exists in array 1, or doesn't exist in the second array.
So, assuming you have 1000 numbers, you can get a random number from 1-1000. If it's less than 500, check array 1 and remove it if present. If it's NOT in array 2, you can add it.
This halves your memory usage.
If you propagate this using recursion, you can split your 500 array into a 250 array and an empty array.
Assuming empty arrays use no space, you can decrease your memory usage quite a bit.
Searching will be massively faster too, because if you break it down a lot and you generate a number such as 29: it's less than 500, less than 250, less than 125, less than 62, less than 31, greater than 15, so you do those 6 comparisons, then check an array containing on average 16/2 = 8 items in total.
I should patent this search, although I bet it already exists!
Especially given the desired number of values, you want a Linear Feedback Shift Register.
Why?
No shuffle step, nor a need to keep track of values you've already hit. As long as you go less than the full period, you should be fine.
It turns out that the Wikipedia article has some C++ code examples which are more tested than anything I would give you off the top of my head. Note that you'll want to be pulling values from inside the loops -- the loops just iterate the shift register through. You can see this in the snippet here.
(Yes, I know this was mentioned, briefly in the dupe -- saw it as I was revising. Given it hasn't been brought up here and is the best way to solve the poster's question, I think it should be brought up again.)
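The 16-bit Fibonacci LFSR example from the Wikipedia article boils down to something like this (taps 16, 14, 13, 11, which give a maximal-length polynomial; the seed is arbitrary but must be non-zero):

    #include <cstdint>
    #include <cstdio>

    // 16-bit Fibonacci LFSR. Starting from any non-zero state it steps
    // through all 65535 non-zero 16-bit values exactly once before repeating.
    int main() {
        std::uint16_t lfsr = 0xACE1u;                // any non-zero seed
        const std::uint16_t start = lfsr;
        std::uint32_t period = 0;
        do {
            std::uint16_t bit = ((lfsr >> 0) ^ (lfsr >> 2) ^ (lfsr >> 3) ^ (lfsr >> 5)) & 1u;
            lfsr = static_cast<std::uint16_t>((lfsr >> 1) | (bit << 15));
            ++period;
            // `lfsr` is the next non-repeating pseudo-random value; use it here.
        } while (lfsr != start);
        std::printf("period = %u\n", period);        // prints 65535
    }

For the 100,000 numbers in the question you would pick a wider register (or a wider maximal-length polynomial) so the period comfortably exceeds the count you need.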
Let's say size = 100,000; create an array of that size. Create random numbers and put them into the array. Which index does a number go to? randomNumber % size gives you the index.
When you put in the next number, use that function for the index and check whether a value already exists there. If it doesn't exist, store it; if it does, create a new number and try again. You can generate numbers very quickly this way. The disadvantage is that you will never get two numbers whose last sections are the same.
For example, with the last sections of
1231232444556
3458923444556
you will never have both of these numbers in your list, even though they are totally different, because their last sections are the same.
First off, there's a huge difference between random and pseudorandom. There's no way to generate perfectly random numbers from a deterministic process (such as a computer) without bringing in some physical process like latency between keystrokes or another entropy source.
The approach of saving all the numbers generated will slow down the computation rather quickly; the more numbers you have, the larger your storage needs, until you've filled up all available memory. A better method would be (as someone's already suggested) using a well known pseudorandom number generator such as the Linear Congruential Generator; it's super fast, requiring only modular multiplication and addition, and the theory behind it gets a lot of mention in Vol. 2 of Knuth's TAOCP. That way, the theory involved guarantees a rather large period before repetition, and the only storage needed are the parameters and seed used.
If you don't mind that a value can be calculated from the previous one, an LFSR or LCG is fine. If you don't want one output value to be derivable from another, you can use a block cipher in counter mode to generate the output sequence, provided that the cipher's block length is equal to the output length.
Use the HashSet generic class. This class does not store duplicate values: you can put all of your generated numbers into the HashSet and check whether a value already exists in it. A HashSet can determine the existence of an item very quickly, and it does not slow down as the collection grows, which is its biggest advantage.
For example:
HashSet<int> array = new HashSet<int>();
array.Add(1);
array.Add(2);
array.Add(1);

foreach (var item in array)
{
    Console.WriteLine(item);
}
Console.ReadKey();