Store 3-bit binary numbers in a C++ array

I have a program that takes 2 inputs, N and myarray[ ].
cin >> N;
cin >> myarray[];
In this example, say N=3, which means an integer array of size 3 has to be allocated, and suppose the entries of myarray[ ] are {1,2,3}.
Now I have a function createsubset() that creates all the possible subsets of the entries {1,2,3}. The logic that I am following is:
The total number of subsets of a set containing n elements is m=2^n, because each element can be either present or absent in a subset.
So for n=3 there are m=8 subsets, and the masks run from 0 (binary 000) to 7 (binary 111).
Now iterate from m=0 to m=7 to generate all the subsets (m=7, binary 111, corresponds to the full set itself):
Example:
m=0, binary=000, subset={ }
m=1, binary=001, subset={c}
m=2, binary=010, subset={b}
m=3, binary=011, subset={b,c}
and so on.
This is done by a function generate() that iterates from m=0 to m=7.
void generate()
{
    for (int m = 0; m < 8; m++)   // 8 = 2^3 subsets for n = 3
    {
        decimaltobinary(m);
    }
}
Now I have to store the output of the decimaltobinary() function (which is a 3-bit binary number) in an array that I will use later to create subsets. This is the part where I am stuck right now.
Can we store a multibit binary number in an array and use it directly?
Please help me regarding this.
Any suggestion regarding the createsubset() function is also welcome.

Numbers in C/C++ are already stored in binary, so there is no need to "convert" them. You can use any C/C++ unsigned integral type you want to store a 3-bit number, so, for example, storing them in a std::vector<unsigned char> would work fine.
Along the same line, rather than storing the numbers you read into an array (a fixed-size container), consider storing them in a vector (a variable-size container), because you don't know the size up front.
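As a minimal sketch of what createsubset() could look like with that scheme (the I/O and names here are illustrative, not from the question):

#include <iostream>
#include <vector>

// Sketch: enumerate all subsets of `elems` with a bitmask.
// Bit i of m says whether elems[i] belongs to the current subset,
// so no separate decimal-to-binary conversion step is needed.
void createsubset(const std::vector<int>& elems)
{
    unsigned total = 1u << elems.size();   // 2^n subsets, m = 0 .. 2^n - 1
    for (unsigned m = 0; m < total; ++m) {
        std::cout << "{ ";
        for (unsigned i = 0; i < elems.size(); ++i)
            if (m & (1u << i))             // test bit i of m
                std::cout << elems[i] << ' ';
        std::cout << "}\n";
    }
}

int main()
{
    unsigned n;
    std::cin >> n;
    std::vector<int> elems(n);
    for (auto& e : elems)
        std::cin >> e;
    createsubset(elems);
}

For the input 3 and {1,2,3}, this prints all eight subsets, from { } for m=0 up to { 1 2 3 } for m=7.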

Related

A simple exercise with a sequence

I am new to this and I need help with an exercise which seems very simple, but I have been thinking about it for hours.
I have a sequence of integers, and I have to return the ordered (least to greatest) subsequence whose consecutive elements all have the same difference. Example: {1,4,5,6,7,10} -> {4,5,6,7}
A possible algorithm:
Store your N integers in a vector<int> and sort it. Then for every i from 3 to n, check whether K[i] - K[i-1] equals K[i-1] - K[i-2].
Note that in C++, indices start from 0, not 1, so adapt the above accordingly (your i will run from 2 to n-1 instead of 3 to n).
I'm not going to write the code for you though, that's your homework.
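That said, the core check is only a few lines; here is a sketch, assuming the goal is to return the longest run of equal consecutive differences (the function name is mine):

#include <algorithm>
#include <vector>

// Sketch: after sorting, scan once and keep the longest stretch in which
// k[i] - k[i-1] stays equal to k[i-1] - k[i-2].
std::vector<int> longestEqualDiffRun(std::vector<int> k)
{
    std::sort(k.begin(), k.end());
    int bestStart = 0, bestLen = std::min<int>(k.size(), 2);
    int start = 0;                            // start of the current run
    for (int i = 2; i < (int)k.size(); ++i) {
        if (k[i] - k[i - 1] != k[i - 1] - k[i - 2])
            start = i - 1;                    // difference changed: new run
        if (i - start + 1 > bestLen) {
            bestLen = i - start + 1;
            bestStart = start;
        }
    }
    return { k.begin() + bestStart, k.begin() + bestStart + bestLen };
}

For {1,4,5,6,7,10} this returns {4,5,6,7}.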

Time Limit Exceeded in dealing with large arrays

I'm trying to solve this question:
As we all know, Varchas is going on, so FOC wants to organise an event called Finding Occurrence.
The task is simple:
Given an array A[1...N] of positive integers, there will be Q queries. In each query you will be given an integer, and you need to find out the frequency of that integer in the given array.
INPUT:
The first line of input comprises the integer N, the number of integers in the given array.
The next line will comprise N space-separated integers. The next line will be Q, the number of queries.
The next Q lines will each comprise a single integer whose occurrence you are supposed to find.
OUTPUT:
Output a single integer for each query: the frequency of the given integer.
Constraints:
1<=N<=100000
1<=Q<=100000
0<=A[i]<=1000000
And this is my code:
#include <iostream>
using namespace std;

int main()
{
    long long n = 0;
    cin >> n;
    long long a[1000000];
    for (int i = 1; i <= n; i++)
    {
        cin >> a[i];
    }
    long long q = 0;
    cin >> q;
    while (q--)
    {
        long long temp = 0, counter = 0;
        cin >> temp;
        for (int k = 1; k <= n; k++)
        {
            if (a[k] == temp)
                counter++;
        }
        cout << "\n" << counter;
        temp = 0;
        counter = 0;
    }
    return 0;
}
However, I encountered a 'Time Limit Exceeded' error. I suspect this is due to my failure to handle large arrays efficiently. Could someone tell me how to handle arrays of this size?
The failure is in the algorithm itself: note that for each query, you traverse the whole array. There are 100,000 queries and 100,000 elements. That means in the worst case you're traversing 100,000 * 100,000 = 10,000,000,000 elements, which won't finish in time. If you analyze the complexity using Big-O notation, your algorithm is O(nq), which is too slow for this problem, since n*q is large.
What you're supposed to do is calculate the counts before any query is made and store them in an array (this is why the range of A[i] is given). You should be able to do this by traversing the input only once (hint: you don't need to store the input in an array, you can just count directly).
By doing this, the precomputation is just O(n), and since n is small enough (as a rule of thumb, less than a million is small), it will finish in time.
Then you can answer each query instantly, making your program fast enough to be under the time limit.
Another thing you can improve is the data type of the array. The values stored in it won't be larger than a million, so you don't need long long, which uses more memory; plain int will do.
Your algorithm was inefficient: you read all the numbers into an array, then searched linearly through the array for each query.
What you should have done is build one array of counts. In other words, when you read the number 5, do count[5]++. Then for each query all you have to do is return the count from the array. For example, how many 5s were there in the input? Answer: count[5].
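A minimal sketch of that counting approach (the I/O speedups are my own addition):

#include <iostream>

int count[1000001];   // one slot per possible value, since A[i] <= 1000000

int main()
{
    std::ios::sync_with_stdio(false);   // speed up cin/cout for large input
    std::cin.tie(nullptr);

    int n;
    std::cin >> n;
    for (int i = 0; i < n; ++i) {
        int x;
        std::cin >> x;
        ++count[x];                     // no need to keep the number itself
    }

    int q;
    std::cin >> q;
    while (q--) {
        int x;
        std::cin >> x;
        std::cout << count[x] << '\n';  // O(1) per query
    }
}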
Since the maximum value can be 10^6, I think a count array might exceed the memory limit even if it fits in time. Another solution is to sort the array (you can do it in O(N log N) using the STL sort function) and answer each query with two binary searches: the first finds the first position where the element appears and the second finds the last position where it appears, so the answer for each query is lastPosition - firstPosition + 1.
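A sketch of that sort-and-search approach; std::equal_range performs the two binary searches in one call (using it instead of two separate searches is my substitution):

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    int n;
    std::cin >> n;
    std::vector<int> a(n);
    for (auto& x : a)
        std::cin >> x;

    std::sort(a.begin(), a.end());      // O(N log N)

    int q;
    std::cin >> q;
    while (q--) {
        int x;
        std::cin >> x;
        // iterators to the first occurrence of x and one past the last
        auto range = std::equal_range(a.begin(), a.end(), x);
        std::cout << (range.second - range.first) << '\n';
    }
}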

Sorting an integer array of 100 elements having only 3 distinct values in it

Suppose I have an array of 100 numbers. The only distinct values in the array are 1, 2 and 3. The values are randomly ordered throughout the array. For instance, the array might be populated as:
int values[100];
for (int i = 0; i < 100; i++)
    values[i] = 1 + rand() % 3;
How can I efficiently sort an array like this?
The fastest solution is not to "sort" at all:
Run through the array and count the number of occurrences of 1,2 and 3. These counts should hopefully fit in registers...
Fill the array with the right number of 1s, 2s and 3s, overwriting whatever is there already.
At the end you will have a fully sorted array.
In general, this can be a useful O(n) sorting algorithm (counting sort) when the range of possible values is very small compared to the size of the array.
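A minimal sketch of that count-and-refill idea (the function name is mine):

#include <algorithm>   // std::fill

// Counting sort specialised to the values 1..3: one pass to count,
// one pass to overwrite the array in sorted order.
void sort123(int values[], int size)
{
    int count[4] = { 0 };               // count[v] = occurrences of v
    for (int i = 0; i < size; ++i)
        ++count[values[i]];

    int pos = 0;
    for (int v = 1; v <= 3; ++v) {      // write all 1s, then 2s, then 3s
        std::fill(values + pos, values + pos + count[v], v);
        pos += count[v];
    }
}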
The Dutch national flag algorithm is the commonly cited algorithm for this, and it is actually the partition step in one of the variants of quicksort (1 corresponds to less than, 2 to equal to, and 3 to greater than). In that variant, you don't need to sort the middle portion.
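A sketch of that one-pass, three-way partition (again, naming is mine):

#include <utility>   // std::swap

// Dutch national flag: 1s migrate to the front, 3s to the back,
// 2s are left in the middle, all in a single pass.
void dutchFlagSort(int values[], int size)
{
    int low = 0, mid = 0, high = size - 1;
    while (mid <= high) {
        if (values[mid] == 1)
            std::swap(values[low++], values[mid++]);
        else if (values[mid] == 2)
            ++mid;
        else                             // values[mid] == 3
            std::swap(values[mid], values[high--]);
    }
}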

Generate a new element different from 1000 elements of an array

I was asked this question in an interview. Consider the scenario of punched cards, where each punched card has a 64-bit pattern. It was suggested I treat each card as an int, since each int is a collection of bits.
Also, consider that I have an array which already contains 1000 such cards. I have to generate a new element every time which is different from the previous 1000 cards. The integers (aka cards) in the array are not necessarily sorted.
Moreover, the question was for C++: where does the 64-bit int come from, and how can I generate a new card that is different from all the elements already present in the array?
There are 2^64 64-bit integers, a number that is so much larger than 1000 that the simplest solution would be to just generate a random 64-bit number, and then verify that it isn't in the table of already generated numbers. (The probability that it is there is infinitesimal, but you might as well be sure.)
Since most random number generators do not generate 64-bit values, you are left with either writing your own, or (much simpler) combining values, say by generating 8 random bytes and memcpying them into a uint64_t.
As for verifying that the number isn't already present, std::find is just fine for one or two new numbers; if you have to do a lot of lookups, sorting the table and using a binary search would be worthwhile. Or some sort of hash table.
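If C++11 is available, std::mt19937_64 produces 64-bit values directly, which saves the byte-combining step; a sketch (the names are mine):

#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Sketch: draw random 64-bit values until one is not among the cards.
// With 1000 cards out of 2^64 possible values, the loop almost always
// succeeds on the first try.
std::uint64_t generateNewCard(const std::vector<std::uint64_t>& cards)
{
    static std::mt19937_64 rng(std::random_device{}());
    std::uint64_t candidate;
    do {
        candidate = rng();
    } while (std::find(cards.begin(), cards.end(), candidate) != cards.end());
    return candidate;
}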
I may be missing something, but most of the other answers appear to me to be overly complicated.
Just sort the original array and then start counting from zero: if the current count is in the array, skip it; otherwise you have your next number. This algorithm is O(n), where n is the number of newly generated numbers: both sorting the fixed-size array and skipping existing numbers are constant costs. Here's an example:
#include <algorithm>
#include <iostream>

unsigned array[] = { 98, 1, 24, 66, 20, 70, 6, 33, 5, 41 };
unsigned count = 0;   // next candidate number
unsigned index = 0;   // current position in the sorted array

int main() {
    std::sort(array, array + 10);
    while (count < 100) {
        if (index < 10 && count > array[index])
            ++index;                              // skip past smaller array entries
        else {
            if (index >= 10 || count < array[index])
                std::cout << count << std::endl;  // count is not in the array
            ++count;
        }
    }
}
Here's an O(n) algorithm:
#include <algorithm>
#include <cstdint>
#include <vector>

std::int64_t generateNewValue(const std::vector<std::int64_t>& cards)
{
    return *std::max_element(cards.begin(), cards.end()) + 1;  // max + 1 is unused
}
Note: As #amit points out below, this will fail if INT64_MAX is already in the list.
As far as I'm aware, this is the only way you're going to get O(n). If you want to deal with that (fairly important) edge case, then you're going to have to do some kind of proper sort or search, which will take you to O(n log n).
#arne is almost there. What you need is a self-balancing interval tree, which can be built in O(n lg n) time.
Then take the top node, which will store some interval [i, j]. By the properties of an interval tree, both i-1 and j+1 are valid candidates for a new key, unless i = 0 or j = UINT64_MAX. If both hold, then you've stored 2^64 elements and you can't possibly generate a new one. Store the new element, which takes O(lg n) worst-case time.
I.e.: init takes O(n lg n) and generate takes O(lg n); both are worst-case figures. The great thing about this approach is that the top node will keep "growing" (storing larger intervals) and merging with its successor or predecessor, so the tree will actually shrink in terms of memory use, and eventually the time per operation decays to O(1). You also won't waste any numbers, so you can keep generating until you've got 2^64 of them.
This algorithm has O(N lg N) initialisation, O(1) query and O(N) memory usage. I assume you have some integer type which I will refer to as int64 and that it can represent the integers [0, int64_max].
Sort the numbers
Create a linked list containing intervals [u, v]
Insert [0, first number - 1] (skip this if the first number is 0)
For each of the remaining numbers, insert [prev number + 1, current number - 1]
Insert [last number + 1, int64_max]
You now have a list representing the numbers which are not used. You can simply iterate over them to generate new numbers.
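A sketch of that construction, using a sorted std::vector of intervals in place of the linked list (my substitution):

#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Build the list of [lo, hi] intervals of values NOT present in `cards`.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
freeIntervals(std::vector<std::uint64_t> cards)
{
    std::sort(cards.begin(), cards.end());
    std::vector<std::pair<std::uint64_t, std::uint64_t>> gaps;
    std::uint64_t next = 0;                    // smallest value not yet covered
    for (std::uint64_t c : cards) {
        if (c > next)
            gaps.push_back({ next, c - 1 });   // gap before this card
        if (c == UINT64_MAX)
            return gaps;                       // avoid overflow in c + 1
        next = std::max(next, c + 1);
    }
    gaps.push_back({ next, UINT64_MAX });      // tail interval after the last card
    return gaps;
}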
I think the way to go is to use some kind of hashing. So you store your cards in buckets based on, let's say, a MOD operation. Until you create some sort of indexing you are stuck with looping over the whole array.
If you have a look at the HashSet implementation in Java you might get a clue.
Edit: I assume you wanted them to be random numbers; if you don't mind a sequence, MAX+1 below is a good solution :)
You could build a binary tree of the already existing elements and traverse it until you find a node whose depth is not 64 and which has fewer than two child nodes. You can then construct a "missing" child node and have a new element. This should be fairly quick, on the order of about O(n) if I'm not mistaken.
bool seen[1001] = { false };
for (uint64_t card : cards)     // cards: the 1000 existing values
    if (card <= 1000)           // only the range 0..1000 matters
        seen[card] = true;
uint64_t result = 0;
while (seen[result]) ++result;  // pigeonhole: some value <= 1000 is unseen
Initialization:
Don't sort the list.
Create a new array 1000 long containing 0..999.
Iterate the list and, if any number is in the range 0..999, invalidate it in the new array by replacing the value in the new array with the value of the first item in the list.
Insertion:
Use an incrementing index into the new array. If the value in the new array at this index is not the value of the first element in the list, add it to the list; otherwise check the value from the next position in the new array.
When the new array is used up, refill it using 1000..1999 and invalidate existing values as above. Yes, this is looping over the list, but it doesn't have to be done for each insertion.
Near O(1), until the list gets so large that occasionally iterating it to invalidate the 'new' new array becomes significant. Maybe you could mitigate this by using a new array that grows, maybe always the size of the list?
Rgds,
Martin
Put them all into a hash table of size > 1000 and find an empty cell (this is the parking problem). Generate a key for that cell. This will of course work better for a bigger table size. The table needs only 1-bit entries.
EDIT: this is the pigeonhole principle.
This needs "modulo tablesize" (or some other "semi-invertible" function) as a hash function.
#include <cstdlib>   // rand

unsigned hashtab[1001] = { 0 };
unsigned long long numbers[1000] = { /* ... */ };

void init(void)
{
    unsigned idx;
    for (idx = 0; idx < 1000; idx++) {
        hashtab[numbers[idx] % 1001] += 1;   // count cards in each remainder bucket
    }
}

unsigned long long generate(void)
{
    unsigned idx;
    for (idx = 0; idx < 1001; idx++) {
        if (!hashtab[idx]) break;            // pigeonhole: some bucket must be empty
    }
    return idx + (unsigned long long)rand() * 1001;  // any number with this remainder
}
Based on the solution here: question on array and number
Since there are 1000 numbers, if we consider their remainders modulo 1001, at least one remainder will be missing. We can pick a number with that remainder as our missing number.
So we maintain an array of counts C[1001], where C[r] holds the number of integers with remainder r (upon dividing by 1001).
We also maintain a set of numbers for which C[j] is 0 (say using a linked list).
When we move the window over, we decrement the count of the first element (say remainder i), i.e. decrement C[i]. If C[i] becomes zero we add i to the set of numbers. We update the C array with the new number we add.
If we need one number, we just pick a random element from the set of j for which C[j] is 0.
This is O(1) for new numbers and O(n) initially.
This is similar to other solutions but not quite.
How about something simple like this:
1) Partition the array into numbers at or below 1000 and numbers above 1000
2) If all the numbers fit within the lower partition then choose 1001 (or any number greater than 1000) and we're done.
3) Otherwise we know that there must exist a number between 1 and 1000 that doesn't exist within the lower partition.
4) Create a 1000-element array of bools, or a 1000-element bitfield, or whatnot, and initialize it to all 0's
5) For each integer in the lower partition, use its value as an index into the array/bitfield and set the corresponding bool to true (i.e. do a radix sort)
6) Go over the array/bitfield and pick any unset value's index as the solution
This works in O(n) time; or, since we've bounded everything by 1000, it's technically O(1), but O(n) time and space in general. There are three passes over the data, which isn't necessarily the most elegant approach, but the complexity remains O(n).
You can create a new array with the numbers that are not in the original array, then just pick one from this new array. O(1)?

USACO: Subsets (Inefficient)

I am trying to solve Subsets from the USACO training gateway...
Problem Statement
For many sets of consecutive integers from 1 through N (1 <= N <= 39), one can partition the set into two sets whose sums are identical.
For example, if N=3, one can partition the set {1, 2, 3} in one way so that the sums of both subsets are identical:
{3} and {1,2}
This counts as a single partitioning (i.e., reversing the order counts as the same partitioning and thus does not increase the count of partitions).
If N=7, there are four ways to partition the set {1, 2, 3, ... 7} so that each partition has the same sum:
{1,6,7} and {2,3,4,5}
{2,5,7} and {1,3,4,6}
{3,4,7} and {1,2,5,6}
{1,2,4,7} and {3,5,6}
Given N, your program should print the number of ways a set containing the integers from 1 through N can be partitioned into two sets whose sums are identical. Print 0 if there are no such ways.
Your program must calculate the answer, not look it up from a table.
End
Before, I was running an O(N*2^N) algorithm, simply permuting through the set and finding the sums.
Finding out how horribly inefficient that was, I moved on to mapping the sum sequences...
http://en.wikipedia.org/wiki/Composition_(number_theory)
After many coding problems trying to weed out repetitions, it was still too slow, so I am back to square one :(.
Now that I look more closely at the problem, it looks like I should not try to find the sums themselves, but go directly to the number of partitions via some kind of formula.
If anyone can give me pointers on how to solve this problem, I'm all ears. I program in Java, C++ and Python.
Actually, there is a better and simpler solution: you should use dynamic programming instead. In your code, you would have an array of integers (whose size is the sum), where the value at index i represents the number of ways to partition the numbers so that one of the partitions has a sum of i. Here is what your code could look like in C++:
int values[N];      // the consecutive integers 1..N
int dp[sum + 1];    // sum is the sum of the consecutive integers, N*(N+1)/2

int solve() {
    if (sum % 2 == 1)            // an odd total can never be split evenly
        return 0;
    dp[0] = 1;                   // one way to reach sum 0: the empty subset
    for (int i = 0; i < N; i++) {
        int val = values[i];     // values contains the consecutive integers
        // iterate downwards so each integer is used at most once per subset
        for (int j = sum - val; j >= 0; j--) {
            dp[j + val] += dp[j];
        }
    }
    return dp[sum / 2] / 2;      // halve: each partition is counted from both sides
}
This gives you an O(N^3) solution (the inner loop runs over the sum, which is O(N^2), once for each of the N values), which is more than fast enough for this problem.
I haven't tested this code, so there might be a syntax error or something, but you get the point. Let me know if you have any more questions.
This is the same thing as finding the coefficient of the x^0 term in the polynomial (x^1 + 1/x^1)(x^2 + 1/x^2)...(x^n + 1/x^n), which can be computed within an upper bound of about O(n^3).