Given an odd-sized array of integers in which every integer appears twice except for a single one, how can this uncoupled integer be found in the most efficient way (in terms of both memory and time complexity)?
If you XOR all of them together, you'll end up with the lone (uncoupled) value.
That's because x XOR x is zero for all x values, and 0 XOR x is x.
By way of example, the following program outputs 99:
#include <stdio.h>
int main (void) {
    int num[] = { 1, 2, 3, 4, 5, 99, 1, 2, 3, 4, 5};
    unsigned int i;
    int accum;
    for (accum = 0, i = 0; i < sizeof(num)/sizeof(*num); i++)
        accum ^= num[i];
    printf ("%d\n", accum);
    return 0;
}
In terms of efficiency, it uses O(1) space and O(n) time in the best, average and worst case.
As pax suggests, XOR'ing all the elements together will give you your lone value.
int getUncoupled(int *values, int len)
{
    int uncoupled = 0;
    for(int i = 0; i < len; i++)
        uncoupled ^= values[i];
    return uncoupled;
}
However there is a slight caveat here: What's the difference between no uncoupled values and a set whose uncoupled value is zero? x^x^y^y = 0 but doesn't x^x^0^y^y also equal zero? Food for thought :)
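One way to sidestep that ambiguity, assuming the caller may look at the array length: an even length cannot contain exactly one unpaired element, so the length itself tells the two cases apart. The sketch below is only an illustration; the name findUnpaired and the use of std::optional (C++17) are choices made here, not part of the answers above.

```cpp
#include <optional>
#include <vector>

// Hypothetical helper: returns the unpaired value, or std::nullopt when the
// length is even (assumed here to mean that every value has a partner).
std::optional<int> findUnpaired(const std::vector<int>& values) {
    if (values.size() % 2 == 0) {
        return std::nullopt;  // even length: no single leftover element
    }
    int acc = 0;
    for (int v : values) {
        acc ^= v;  // paired values cancel out, the lone value survives
    }
    return acc;
}
```

With this, {1, 1, 2, 2} yields an empty result, while {1, 1, 0, 2, 2} yields 0.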
I am trying to push all the individual digits of BigNumber into an array called IndividualNumber. (See code below.) Somehow the code that I am trying to use doesn't work: it doesn't push the digits into the array. Can someone please explain to me why not?
int BigNumber = 2639;
array IndividualNumber;
for (int i = 0; i < 10; i++) {
IndividualNumber.push(BigNumber[i]);
}
//IndividualNumber should be [2, 6, 3, 9].
Thank you in advance and have a nice day.
There are a few problems with this code:
BigNumber is an integer but you are trying to index it like an array or pointer. One way to get the digits of a number in base 10 is to take the remainder when divided by powers of 10.
In C++ (which is used with Arduino), arrays have to be declared with a type and a capacity. The correct way to declare IndividualNumber with a capacity of 10 numbers for example would be something like:
int IndividualNumber[10];
To set the i-th element of the array, you use the following syntax:
IndividualNumber[i] = ...
With these corrected, a possible solution might look something like:
int BigNumber = 2639;
int IndividualNumber[10];
int temp = BigNumber;
for (int i = 0; i < 10; i++) {
    int digit = temp % 10; // Remainder on division by 10
    temp = temp / 10;
    IndividualNumber[i] = digit;
}
This will store up to 10 digits of a number in IndividualNumber, in reverse order (least significant digit first); any unused positions are filled with 0.
I have a list of 100 random integers. Each random integer has a value from 0 to 99. Duplicates are allowed, so the list could be something like
56, 1, 1, 1, 1, 0, 2, 6, 99...
I need to find the smallest integer (>= 0) that is not contained in the list.
My initial solution is this:
vector<int> integerList(100); //list of random integers
...
vector<bool> listedIntegers(101, false);
for (int theInt : integerList)
{
    listedIntegers[theInt] = true;
}
int smallestInt;
for (int j = 0; j < 101; j++)
{
    if (!listedIntegers[j])
    {
        smallestInt = j;
        break;
    }
}
But that requires a secondary array for book-keeping and a second (potentially full) list iteration. I need to perform this task millions of times (the actual application is in a greedy graph coloring algorithm, where I need to find the smallest unused color value with a vertex adjacency list), so I'm wondering if there's a clever way to get the same result without so much overhead?
It's been a year, but ...
One idea that comes to mind is to keep track of the interval(s) of unused values as you iterate the list. To allow efficient lookup, you could keep intervals as tuples in a binary search tree, for example.
So, using your sample data:
56, 1, 1, 1, 1, 0, 2, 6, 99...
You would initially have the unused interval [0..99], and then, as each input value is processed:
56: [0..55][57..99]
1: [0..0][2..55][57..99]
1: no change
1: no change
1: no change
0: [2..55][57..99]
2: [3..55][57..99]
6: [3..5][7..55][57..99]
99: [3..5][7..55][57..98]
Result (lowest value in lowest remaining interval): 3
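The interval bookkeeping above could be sketched roughly as follows. This is only an illustration under my own assumptions: a std::map keyed by interval start stands in for the binary search tree, and the names smallestUnused and maxValue are made up for the example.

```cpp
#include <map>
#include <vector>

// Hypothetical sketch: tracks the unused intervals of [0, maxValue] in a map
// (start -> inclusive end) and splits an interval whenever a value is seen.
int smallestUnused(const std::vector<int>& values, int maxValue) {
    std::map<int, int> unused;
    unused[0] = maxValue;  // initially everything is unused
    for (int v : values) {
        auto it = unused.upper_bound(v);    // first interval starting after v
        if (it == unused.begin()) continue; // no interval starts at or before v
        --it;                               // candidate interval containing v
        if (v > it->second) continue;       // v was already removed earlier
        const int lo = it->first;
        const int hi = it->second;
        unused.erase(it);
        if (lo < v) unused[lo] = v - 1;     // keep the part below v
        if (v < hi) unused[v + 1] = hi;     // keep the part above v
    }
    // Lowest value of the lowest remaining interval, or maxValue + 1 if none.
    return unused.empty() ? maxValue + 1 : unused.begin()->first;
}
```

For the sample data, smallestUnused({56, 1, 1, 1, 1, 0, 2, 6, 99}, 99) goes through the same interval states as listed above and returns 3. Each update costs O(log k) for k intervals, so for only 100 possible values this is unlikely to beat a plain flag array.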
I believe there is no faster way to do it. What you can do in your case is reuse the vector<bool>; you need just one such vector per thread.
A better approach, though, might be to reconsider the whole algorithm to eliminate this step altogether. Maybe you can update the least unused color on every step of the algorithm?
Since you have to scan the whole list no matter what, the algorithm you have is already pretty good. The only improvement I can suggest without measuring (that will surely speed things up) is to get rid of your vector<bool>, and replace it with a stack-allocated array of 4 32-bit integers or 2 64-bit integers.
Then you won't have to pay the cost of allocating an array on the heap every time, and you can get the first unused number (the position of the first 0 bit) much faster. To find the word that contains the first 0 bit, you only need to find the first one that isn't the maximum value, and there are bit twiddling hacks you can use to get the first 0 bit in that word very quickly.
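A minimal sketch of that idea, assuming all values fit in [0, 127] so that two 64-bit words are enough. The function name smallestUnusedBits is invented for the example, and the inner loop is a portable stand-in for a bit-scan instruction (in C++20, std::countr_one from <bit> could replace it).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: marks each value as one bit in two stack-allocated
// 64-bit words, then finds the first cleared bit.
int smallestUnusedBits(const std::vector<int>& values) {
    std::uint64_t used[2] = {0, 0};
    for (int v : values) {
        if (v < 0 || v > 127) continue;  // outside the assumed range: ignore
        used[v >> 6] |= std::uint64_t{1} << (v & 63);
    }
    for (int w = 0; w < 2; ++w) {
        if (used[w] == UINT64_MAX) continue;  // word is full, try the next one
        std::uint64_t freeBits = ~used[w];    // 1 bits mark unused values
        int bit = 0;
        while ((freeBits & 1) == 0) {         // portable "count trailing zeros"
            freeBits >>= 1;
            ++bit;
        }
        return w * 64 + bit;
    }
    return 128;  // every value in [0, 127] is present
}
```

Skipping full words first means the bit scan runs at most once per call.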
Your program is already very efficient, in O(n). Only a marginal gain can be found.
One possibility is to divide the range of possible values into blocks of size block, and to record the values not in an array of bool but in an array of int, where each int stores, as a bit mask, the values modulo block seen in that block.
In practice, we replace a loop of size N by a loop of size N/block plus a loop of size block.
Theoretically, we could select block = sqrt(N) = 10 in order to minimize the quantity N/block + block.
In the program hereafter, blocks of size 8 are selected, assuming that dividing integers by 8 and calculating values modulo 8 should be fast.
However, it is clear that a gain, if any, can be obtained only when the smallest missing value is rather large!
#include <vector>
constexpr int N = 100;
int find_min1 (const std::vector<int> &IntegerList) {
    constexpr int Size = 13;    // N / block, rounded up so that values 0..N fit
    constexpr int block = 8;
    constexpr int Vmax = 255;   // 2^block - 1: every bit of a block is set
    int listedBlocks[Size] = {0};
    // Record each value as one bit inside its block
    for (int theInt : IntegerList) {
        listedBlocks[theInt / block] |= 1 << (theInt % block);
    }
    for (int j = 0; j < Size; j++) {
        if (listedBlocks[j] == Vmax) continue;  // block is full, skip it
        int k = listedBlocks[j];
        // Scan the first non-full block for its lowest cleared bit
        for (int b = 0; b < block; b++) {
            if ((k%2) == 0) return block * j + b;
            k /= 2;
        }
    }
    return -1;
}
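Potentially, you can reduce the last step to O(1) by using some bit manipulation: in your case, use an __int128, set the corresponding bits in the first loop, and then call something like __builtin_ctz on the inverted mask (or __builtin_clz if the bits are stored from the most significant end), or use the appropriate bit hack.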
The best solution I could find for finding smallest integer from a set is https://codereview.stackexchange.com/a/179042/31480
Here is a C++ version. Note that it looks for the smallest missing positive integer (>= 1) and reorders A in place.
int solution(std::vector<int>& A)
{
    for (std::vector<int>::size_type i = 0; i != A.size(); i++)
    {
        while (0 < A[i] && A[i] - 1 < A.size()
               && A[i] != i + 1
               && A[i] != A[A[i] - 1])
        {
            int j = A[i] - 1;
            auto tmp = A[i];
            A[i] = A[j];
            A[j] = tmp;
        }
    }
    for (std::vector<int>::size_type i = 0; i != A.size(); i++)
    {
        if (A[i] != i+1)
        {
            return i + 1;
        }
    }
    return A.size() + 1;
}
Let's say I have n integers in an array a, and I want to iterate through all possible subsets of these integers, find the sum, and then do something with it.
What I immediately did was to create a bit field b, which indicated which numbers were included in the subset, and iterate through its possible values using ++b. Then, to compute the sum in each step, I had to iterate through all bits like this:
int sum = 0;
for (int i = 0; i < n; i++)
    if (b & 1ull<<i)   // 1ull so that the shift is valid for i up to 63
        sum += a[i];
Then I realized that if I iterated through the possible values of b in Gray code order, so that each time only a single bit is flipped, I wouldn't have to reconstruct the sum completely, but would only need to add or subtract the single value that is being added to or removed from the subset. It should work like this:
int sum = 0;
int whichBitToFlip = 0;
bool isBitSet = false;
for (int k = 0; whichBitToFlip < n; k++) {
    sum += (isBitSet ? -1 : 1)*a[whichBitToFlip];
    // do something with sum here
    whichBitToFlip = ???;
    isBitSet = ???;
}
But I can't figure out how to directly and efficiently compute whichBitToFlip. The desired values are basically sequence A007814. I know that I can compute the Gray code using the formula (k>>1)^k and xor it with the previous one, but then I need to find the position of the changed bit, which might not be much faster.
So is there any better way to determine these values (the index of the flipped bit), preferably without a loop, that is faster than recomputing the whole sum (of at most 64 values) every time?
To convert a bitmask to a bit index, you can use the ffs function (if you have one), which corresponds to a machine opcode on some machines.
Otherwise, the bit changed in the gray code corresponds to the ruler function:
0, 1, 0, 2, 0, 1, 0, 3, 0, 1...
for which there is a simple recursion. You can simulate the recursion with a stack (it will have maximum depth O(log N), so it's not much space), but probably ffs is a lot faster.
(By the way, even if you were to count bits one at a time from right to left, the increment function would be O(1) on average because the total number of trailing 0s in the integers from 1 to 2^k is 2^k - 1.)
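To make the ruler-function idea concrete, here is a small portable sketch. The names lowestSetBit and subsetSums are made up for the example, and the plain loop stands in for ffs or a compiler intrinsic. It collects all subset sums in a vector, so it is only meant for small n; a real use would process each sum in place.

```cpp
#include <cstdint>
#include <vector>

// Ruler function: index of the lowest set bit of k (k must be non-zero).
int lowestSetBit(std::uint64_t k) {
    int pos = 0;
    while ((k & 1) == 0) {
        k >>= 1;
        ++pos;
    }
    return pos;
}

// Hypothetical sketch: returns the sums of all 2^n subsets of a, visited in
// Gray-code order so that each step adds or removes exactly one element.
std::vector<long long> subsetSums(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());  // assumed small (well below 63)
    std::vector<long long> sums;
    long long sum = 0;
    std::uint64_t gray = 0;  // current subset as a bit mask
    sums.push_back(sum);     // the empty subset
    for (std::uint64_t k = 1; k < (std::uint64_t{1} << n); ++k) {
        const int bit = lowestSetBit(k);  // the bit that flips at step k
        gray ^= std::uint64_t{1} << bit;
        if ((gray >> bit) & 1) {
            sum += a[bit];  // element just entered the subset
        } else {
            sum -= a[bit];  // element just left the subset
        }
        sums.push_back(sum);
    }
    return sums;
}
```

For a = {1, 2, 4} the subsets are visited as 000, 001, 011, 010, 110, 111, 101, 100, giving the sums 0, 1, 3, 2, 6, 7, 5, 4.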
So I came up with this:
int sum = 0;
unsigned long grayPos = 0;
int graySign = 1;
for (unsigned long long k = 2; grayPos < n; k++) {
    sum += graySign*a[grayPos];
    // Do something with sum
#ifdef _M_X64
    grayPos = n;
    _BitScanForward64(&grayPos, k);
#else
    for (grayPos = 0; !(k&1ull<<grayPos); grayPos++);
#endif
    graySign = 2-(k>>grayPos&0x3);
}
It works really well: it brought down the execution time (compared with always recomputing the whole sum) from 254 to only 7 seconds for n = 32. I also found that counting trailing zeroes with the for loop is only slightly (~15%) slower than using _BitScanForward64, for the reasons mentioned by rici. So thanks.
How can I put the digits of a number into an array? For example, 11*11 = 121: how can I store 121 as 1,2,1 in an array, so that it looks like int arr[] = {1,2,1};? The one idea I have is to divide it by 10. The code I am trying is:
long mul , cube;
mul = num*num;
cube = num*num*num;
float unt = mul/10.0;
But how do I save the digit after the decimal point into the array? For example, if I have the number 2.3, I want to save 3 into the array.
I think you want to get the individual digits into an array.
You can get the digits into an array in reverse order by following the algorithm (let n be the number you want to split):
while (n > 0):
push (n mod 10) into the array --- this is one digit from 0..9
divide n by 10, ignoring the decimal part (ignoring remainder, that is)
For example, with n=97 you get:
(n mod 10) = 7, push that into array
divide 97 by 10 to get 9 (ignoring .7)
(n mod 10) = 9, push that into array
divide 9 by 10 to get 0
end algorithm
now you have [7, 9] in the array, then reverse the array to get the left-to-right ordering.
Following on from the OP's request for a code sample, and going by my understanding of the question, this implements the algorithm from another answer (Antti Huima):
#include <iostream>
#include <vector>
#include <algorithm>
std::vector<int> convert(int number)
{
    std::vector<int> result;
    if (number == 0) {
        result.push_back(0);
    }
    while (number) {
        int temp = number % 10;
        number /= 10;
        result.push_back(temp);
        //result.insert(result.begin(), temp); // alternative... inserts in "forward" order
    }
    // push_back inserts the elements in reverse order, so we reverse them again
    std::reverse(result.begin(), result.end());
    return result;
}
int main()
{
    int i = 123;
    auto result = convert(i);
    std::cout << i << std::endl;
    for (auto& j : result) {
        std::cout << j << ',';
    }
    std::cout << std::endl;
}
It's been implemented assuming certain basics (that can be altered or templated);
int base type and int as the array member
Variable array size is catered for (in the vector)
convert is a very arbitrary name
The main is just a demonstration of its use (so I was a little more liberal in C++11 language use).
(This is a generalization of: Finding duplicates in O(n) time and O(1) space)
Problem: Write a C++ or C function with time and space complexities of O(n) and O(1) respectively that finds the repeating integers in a given array without altering it.
Example: Given {1, 0, -2, 4, 4, 1, 3, 1, -2} function must print 1, -2, and 4 once (in any order).
EDIT: The following solution requires a duo-bit (to represent 0, 1, and 2) for each integer in the range of the minimum to the maximum of the array. The number of necessary bytes (regardless of array size) never exceeds (INT_MAX – INT_MIN)/4 + 1.
#include <stdio.h>
#include <stdlib.h>
void set_min_max(int a[], long long unsigned size,
                 int* min_addr, int* max_addr)
{
    long long unsigned i;
    if(!size) return;
    *min_addr = *max_addr = a[0];
    for(i = 1; i < size; ++i)
    {
        if(a[i] < *min_addr) *min_addr = a[i];
        if(a[i] > *max_addr) *max_addr = a[i];
    }
}
void print_repeats(int a[], long long unsigned size)
{
    long long unsigned i;
    int min = 0, max = 0;
    long long diff, q, r;
    unsigned char* duos;
    if(!size) return; /* nothing to scan */
    set_min_max(a, size, &min, &max);
    diff = (long long)max - (long long)min;
    duos = calloc(diff / 4 + 1, 1);
    if(!duos) return; /* allocation failed */
    for(i = 0; i < size; ++i)
    {
        diff = (long long)a[i] - (long long)min; /* index of duo-bit
                                                    corresponding to a[i]
                                                    in sequence of duo-bits */
        q = diff / 4; /* index of byte containing duo-bit in "duos" */
        r = diff % 4; /* offset of duo-bit */
        switch( (duos[q] >> (6 - 2*r )) & 3 )
        {
            case 0: duos[q] += (1 << (6 - 2*r));
                    break;
            case 1: duos[q] += (1 << (6 - 2*r));
                    printf("%d ", a[i]);
                    break;
        }
    }
    putchar('\n');
    free(duos);
}
int main(void)
{
    int a[] = {1, 0, -2, 4, 4, 1, 3, 1, -2};
    print_repeats(a, sizeof(a)/sizeof(int));
    return 0;
}
The definition of big-O notation is that its argument is a function f(x) such that, as the variable x tends to infinity, there exists a constant K for which the objective cost function is smaller than Kf(x). Typically f is chosen to be the smallest simple function that satisfies the condition. (It's pretty obvious how to lift the above to multiple variables.)
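For reference, the usual formal statement of that definition can be written as follows. The symbols T (the cost function) and x_0 (the threshold beyond which the bound holds) are names introduced here for the sketch, not taken from the text above.

```latex
T(x) = O\bigl(f(x)\bigr) \iff \exists K > 0,\ \exists x_0 \ \text{such that}\ \forall x \ge x_0:\ |T(x)| \le K\, f(x)
```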
This matters because that K, which you aren't required to specify, allows a whole multitude of complex behavior to be hidden out of sight. For example, if the core of the algorithm is O(n^2), it allows all sorts of other O(1), O(log n), O(n), O(n log n), O(n^(3/2)), etc. supporting bits to be hidden, even if for realistic input data those parts are what actually dominate. That's right, it can be completely misleading! (Some of the fancier bignum algorithms have this property for real. Lying with mathematics is a wonderful thing.)
So where is this going? Well, you can assume that int is a fixed size easily enough (e.g., 32-bit) and use that information to skip a lot of trouble and allocate fixed-size arrays of flag bits to hold all the information that you really need. Indeed, by using two bits per potential value (one bit to say whether you've seen the value at all, another to say whether you've printed it), you can handle the code with a fixed chunk of memory of 1GB in size. That will then give you enough flag information to cope with as many 32-bit integers as you might ever wish to handle. (Heck, that's even practical on 64-bit machines.) Yes, it's going to take some time to set that memory block up, but it's constant, so it's formally O(1) and drops out of the analysis. Given that, you then have constant (but whopping) memory consumption and linear time (you've got to look at each value to see whether it's new, seen once, etc.), which is exactly what was asked for.
It's a dirty trick though. You could also try scanning the input list to work out the range allowing less memory to be used in the normal case; again, that adds only linear time and you can strictly bound the memory required as above so that's constant. Yet more trickiness, but formally legal.
[EDIT] Sample C code (this is not C++, but I'm not good at C++; the main difference would be in how the flag arrays are allocated and managed):
#include <stdio.h>
#include <stdlib.h>
// Bit fiddling magic (unsigned words, so that shifting into bit 31 is well defined)
int is(const unsigned *ary, unsigned int value) {
    return (ary[value>>5] >> (value&31)) & 1u;
}
void set(unsigned *ary, unsigned int value) {
    ary[value>>5] |= 1u<<(value&31);
}
// Main loop
void print_repeats(int a[], unsigned size) {
    unsigned *seen, *done;
    unsigned i;
    // One flag bit per 32-bit value: 134217728 32-bit words (512MB) per array
    seen = calloc(134217728, sizeof(unsigned));
    done = calloc(134217728, sizeof(unsigned));
    if (!seen || !done) { // allocation can fail on small machines
        free(seen);
        free(done);
        return;
    }
    for (i=0; i<size; i++) {
        if (is(done, (unsigned) a[i]))
            continue;
        if (is(seen, (unsigned) a[i])) {
            set(done, (unsigned) a[i]);
            printf("%d ", a[i]);
        } else
            set(seen, (unsigned) a[i]);
    }
    printf("\n");
    free(done);
    free(seen);
}
int main(void) {
    int a[] = {1,0,-2,4,4,1,3,1,-2};
    print_repeats(a,sizeof(a)/sizeof(int));
    return 0;
}
Since you have an array of integers, you can use the straightforward solution of sorting the array (you didn't say it can't be modified) and printing duplicates. Integer arrays can be sorted in O(n) time and O(1) space using radix sort. Although in general it might require O(n) space, the in-place binary MSD radix sort can be trivially implemented using O(1) space (look here for more details).
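A rough sketch of what such an in-place binary MSD radix sort might look like, assuming a 32-bit int. The names sortKey, binaryRadixSort and repeatsAfterSort are invented for this example. The recursion is at most 32 levels deep, which is constant only in the "fixed word size" sense, and the returned vector exists purely to make the result easy to check; printing inside the loop would avoid that storage.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Maps an int to an unsigned key with the same ordering (flips the sign bit).
// Assumes int is 32 bits wide.
std::uint32_t sortKey(int v) {
    return static_cast<std::uint32_t>(v) ^ 0x80000000u;
}

// Partitions a[lo, hi) on one key bit, then recurses on the lower bits.
void binaryRadixSort(std::vector<int>& a, std::size_t lo, std::size_t hi, int bit) {
    if (hi - lo < 2 || bit < 0) return;
    const std::uint32_t mask = std::uint32_t{1} << bit;
    std::size_t i = lo;
    std::size_t j = hi;
    while (i < j) {
        if (sortKey(a[i]) & mask) {
            std::swap(a[i], a[--j]);  // bit is 1: move to the upper part
        } else {
            ++i;                      // bit is 0: stays in the lower part
        }
    }
    binaryRadixSort(a, lo, i, bit - 1);
    binaryRadixSort(a, i, hi, bit - 1);
}

// Sorts a in place (so it does alter the input) and reports each repeated
// value exactly once.
std::vector<int> repeatsAfterSort(std::vector<int>& a) {
    binaryRadixSort(a, 0, a.size(), 31);
    std::vector<int> repeats;
    for (std::size_t i = 1; i < a.size(); ++i) {
        const bool sameAsPrev = a[i] == a[i - 1];
        const bool alreadyReported = i >= 2 && a[i] == a[i - 2];
        if (sameAsPrev && !alreadyReported) repeats.push_back(a[i]);
    }
    return repeats;
}
```

On the example array {1, 0, -2, 4, 4, 1, 3, 1, -2} this leaves the array sorted and reports -2, 1 and 4.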
The O(1) space constraint is intractable.
The very fact of printing the array itself requires O(N) storage, by definition.
Now, feeling generous, I'll give you that you can have O(1) storage for a buffer within your program and consider that the space taken outside the program is of no concern to you, and thus that the output is not an issue...
Still, the O(1) space constraint feels intractable, because of the immutability constraint on the input array. It might not be, but it feels so.
And your solution overflows, because you try to store O(N) information in a finite datatype.
There is a tricky problem with definitions here. What does O(n) mean?
Konstantin's answer claims that the radix sort time complexity is O(n). In fact it is O(n log M), where the base of the logarithm is the radix chosen, and M is the range of values that the array elements can have. So, for instance, a binary radix sort of 32-bit integers will have log M = 32.
So this is still, in a sense, O(n), because log M is a constant independent of n. But if we allow this, then there is a much simpler solution: for each integer in the range (all 4294967296 of them), go through the array to see if it occurs more than once. This is also, in a sense, O(n), because 4294967296 is also a constant independent of n.
I don't think my simple solution would count as an answer. But if not, then we shouldn't allow the radix sort, either.
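For illustration, the "simple solution" could be sketched like this. To keep the example usable, the value range is passed in as the hypothetical parameters lo and hi rather than fixed to all 4294967296 values of a 32-bit int. The function name repeatsByRangeScan is also made up, and the result vector is only there for checking: printing instead would keep the extra space constant.

```cpp
#include <vector>

// Hypothetical sketch: for every candidate value in [lo, hi], scans the whole
// array and collects the value if it occurs at least twice.
std::vector<int> repeatsByRangeScan(const std::vector<int>& a, int lo, int hi) {
    std::vector<int> repeats;
    // long long counter so that ++v cannot overflow when hi is INT_MAX
    for (long long v = lo; v <= hi; ++v) {
        int count = 0;
        for (int x : a) {
            if (x == v && ++count == 2) break;  // second occurrence found
        }
        if (count == 2) repeats.push_back(static_cast<int>(v));
    }
    return repeats;
}
```

The running time is (hi - lo + 1) * n, which is "linear in n" only because the range is treated as a constant.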
I doubt this is possible. Assuming there is a solution, let's see how it works. I'll try to be as general as I can and show that it can't work... So, how does it work?
Without losing generality we could say we process the array k times, where k is fixed. The solution should also work when there are m duplicates, with m >> k. Thus, in at least one of the passes, we should be able to output x duplicates, where x grows when m grows. To do so, some useful information has been computed in a previous pass and stored in the O(1) storage. (The array itself can't be used, this would give O(n) storage.)
The problem: we have O(1) of information, and when we walk over the array we have to identify x numbers (to output them). We need O(1) storage that can tell us in O(1) time whether an element is in it. Or, said differently, we need a data structure that stores n booleans (of which x are true), uses O(1) space, and takes O(1) time to query.
Does this data structure exist? If not, then we can't find all duplicates in an array with O(n) time and O(1) space (or there is some fancy algorithm that works in a completely different manner???).
I really don't see how you can have only O(1) space and not modify the initial array. My guess is that you need an additional data structure. For example, what is the range of the integers? If it's 0..N-1 like in the other question you linked, you can have an additional count array of size N. Then in O(N) traverse the original array and increment the counter at the position of the current element. Then traverse the count array and print the numbers with count >= 2. Something like:
int* counts = new int[N](); // value-initialized, so every counter starts at 0
for(int i = 0; i < N; i++) {
    counts[input[i]]++;
}
for(int i = 0; i < N; i++) {
    if(counts[i] >= 2) cout << i << " ";
}
delete [] counts;
Say you can use the fact that you are not using all the space you have. You only need one more bit per possible value, and you have lots of unused bits in your 32-bit int values.
This has serious limitations, but works in this case. Numbers have to be between -n/2 and n/2, and if they repeat m times, they will be printed m/2 times. Note that it also alters the array.
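#include <stdio.h>
#include <stdint.h>
void print_repeats(int32_t a[], unsigned size) {
    /* topbit is the flag bit, mask covers the remaining value bits */
    const int32_t topbit = INT32_MIN, mask = INT32_MAX;
    int32_t val, pos;
    unsigned i;
    /* Clear the top bit everywhere so it can serve as a "seen" flag */
    for (i = 0; i < size; i++)
        a[i] &= mask;
    for (i = 0; i < size; i++) {
        val = a[i] & mask;
        if (val <= mask/2) {
            pos = val;                 /* non-negative value */
        } else {
            val += topbit;             /* recover the original negative value */
            pos = (int32_t)size + val; /* negatives index from the end */
        }
        if (a[pos] < 0) {
            printf("%d\n", (int)val);
            a[pos] &= mask;
        } else {
            a[pos] |= topbit;
        }
    }
}
int main(void) {
    int32_t a[] = {1, 0, -2, 4, 4, 1, 3, 1, -2};
    print_repeats(a, sizeof (a) / sizeof (a[0]));
    return 0;
}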
prints
4
1
-2