Is there any number repeated in the array?


There is an array of size n. Its values lie between 0 and n-1, the same range as the indices.
For example: array[4] = {0, 2, 1, 3}
I have to tell whether any number occurs more than once.
For example: array[5] = {3,4,1,2,4} -> return true because 4 is repeated.
This question has so many different solutions and I would like to know if this specific solution is alright (if yes, please prove, else refute).
My solution (let's look at the next example):
array: indices 0 1 2 3 4
values 3 4 1 2 0
So I suggest:
Compute the sum of the indices (4x5 / 2 = 10) and check that the sum of the values (3+4+1+2+0) is equal to it. If not, there is a repeated number.
In addition to the first condition, compute the product of the indices (except 0, so: 1x2x3x4) and check whether it equals the product of the values (except 0, so: 3x4x1x2).
=> If both conditions hold, I say that there is NO repeated number; otherwise, there IS a repeated number.
Is it correct? if yes, please prove it or show me a link. else, please refute it.

Why is your algorithm wrong?
Your solution is wrong. Here is a counterexample (there may be simpler ones, but I found this one quite quickly):
int arr[13] = {1, 1, 2, 3, 4, 10, 6, 7, 8, 9, 10, 11, 6};
Its sum is 78 and its product is 479001600. Now take the normal array of size 13:
int arr[13] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
It also has a sum of 78 and a product (ignoring the 0) of 479001600, so your algorithm does not work.
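The comparison can be checked mechanically. The following is a minimal sketch of the asker's test as I read it from the question; the function name `sum_product_test_passes` is my own, and 64-bit arithmetic is assumed to be wide enough, which only holds for small arrays.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of the proposed test: compare the sum of the values with the sum of
// the indices, and the product of the non-zero values with the product of the
// non-zero indices. Returns true when the test would claim "no repetition".
// Assumption: the array is small enough for the product to fit in 64 bits.
bool sum_product_test_passes(const std::vector<int>& values) {
    std::uint64_t sum = 0, product = 1;
    std::uint64_t expected_sum = 0, expected_product = 1;
    for (std::size_t i = 0; i < values.size(); ++i) {
        sum += static_cast<std::uint64_t>(values[i]);
        expected_sum += i;
        if (values[i] != 0) product *= static_cast<std::uint64_t>(values[i]);
        if (i != 0) expected_product *= i;
    }
    return sum == expected_sum && product == expected_product;
}
```

Called on the 13-element counterexample, the function returns true even though 1, 6 and 10 are repeated, which is exactly the failure described here.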
How to find counterexamples? [1]
To find a counterexample [2] [3]:
Take the array of values 0 to N - 1;
Pick two even numbers [3] M1 > 2 and M2 > 2 between 0 and N - 1 and replace each by its half (M1/2 and M2/2);
Replace P1 = M1/2 - 1 by 2 * P1 and P2 = M2/2 + 1 by 2 * P2.
In the original array you have:
Product = M1 * P1 * M2 * P2
Sum = 0 + M1 + P1 + M2 + P2
= M1 + M1/2 - 1 + M2 + M2/2 + 1
= 3/2 * (M1 + M2)
In the new array you have:
Product = M1/2 * 2 * P1 * M2/2 * 2 * P2
= M1 * P1 * M2 * P2
Sum = M1/2 + 2P1 + M2/2 + 2P2
= M1/2 + 2(M1/2 - 1) + M2/2 + 2(M2/2 + 1)
= 3/2 * M1 - 2 + 3/2 * M2 + 2
= 3/2 * (M1 + M2)
So both arrays have the same sum and product, but one has repeated values, so your algorithm does not work.
[1] This is one method of finding counterexamples; there are probably others.
[2] This is not exactly the method I used to find the first counterexample. There I used only one number M and the fact that you can replace 0 by 1 without changing the product. I propose a more general method here to avoid arguments such as "But I can add a check for 0 in my algorithm."
[3] That method does not work with small arrays, because you need two even numbers M1 > 2 and M2 > 2 such that M1/2 != M2 (and vice versa) and M1/2 - 1 != M2/2 + 1, which (I think) is not possible for any array with a size lower than 14.
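The construction can be tried out directly. This is only a sketch: the helper names are mine, and the choice N = 14, M1 = 8, M2 = 10 is an illustrative assumption that happens to satisfy the conditions listed above.

```cpp
#include <cstdint>
#include <vector>

// Builds the array 0..n-1 and applies the construction described above:
// m1 and m2 are halved, p1 = m1/2 - 1 and p2 = m2/2 + 1 are doubled.
// Assumption: m1 and m2 are even, greater than 2, and chosen so that the
// four touched positions are distinct and every new value stays below n.
std::vector<int> make_counter_example(int n, int m1, int m2) {
    std::vector<int> a(n);
    for (int i = 0; i < n; ++i) a[i] = i;
    const int p1 = m1 / 2 - 1;
    const int p2 = m2 / 2 + 1;
    a[m1] = m1 / 2;
    a[m2] = m2 / 2;
    a[p1] = 2 * p1;
    a[p2] = 2 * p2;
    return a;
}

// Sum of all values.
std::int64_t sum_of(const std::vector<int>& a) {
    std::int64_t s = 0;
    for (int v : a) s += v;
    return s;
}

// Product of the non-zero values (only meaningful for small arrays,
// since the product must fit in 64 bits).
std::uint64_t nonzero_product(const std::vector<int>& a) {
    std::uint64_t p = 1;
    for (int v : a) {
        if (v != 0) p *= static_cast<std::uint64_t>(v);
    }
    return p;
}
```

`make_counter_example(14, 8, 10)` yields {0, 1, 2, 6, 4, 5, 12, 7, 4, 9, 5, 11, 12, 13}, which repeats 4, 5 and 12 but has the same sum (91) and the same non-zero product (13!) as 0..13.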
What algorithms do work? [4]
Algorithm 1: O(n) time and space complexity.
If you can allocate a new array of size N, then:
#include <array>
#include <cstddef>

template <std::size_t N>
bool has_repetition (std::array<int, N> const& array) {
    std::array<bool, N> rep = {};  // rep[v] becomes true once v has been seen
    for (auto v: array) {
        if (rep[v]) {
            return true;
        }
        rep[v] = true;
    }
    return false;
}
Algorithm 2: O(n log n) time complexity and O(1) space complexity, with a mutable array.
You can simply sort the array:
#include <algorithm>
#include <array>
#include <cstddef>
#include <iterator>

template <std::size_t N>
bool has_repetition (std::array<int, N> &array) {
    std::sort(std::begin(array), std::end(array));
    // after sorting, equal values are adjacent
    auto it = std::begin(array);
    auto ne = std::next(it);
    while (ne != std::end(array)) {
        if (*ne == *it) {
            return true;
        }
        ++it; ++ne;
    }
    return false;
}
Algorithm 3: O(n^2) time complexity and O(1) space complexity, with a non-mutable array.
#include <array>
#include <cstddef>
#include <iterator>

template <std::size_t N>
bool has_repetition (std::array<int, N> const& array) {
    // compare every element with every later element
    for (auto it = std::begin(array); it != std::end(array); ++it) {
        for (auto jt = std::next(it); jt != std::end(array); ++jt) {
            if (*it == *jt) {
                return true;
            }
        }
    }
    return false;
}
[4] These algorithms work, but there may be others that perform better. These are only the simplest ones I could think of given some "restrictions".

What's wrong with your method?
Your method computes some statistics of the data and compares them with those expected for a permutation (= correct answers). While a violation of any of these comparisons is conclusive (the data cannot satisfy the constraint), the inverse is not necessarily the case. You only look at two statistics, and these are too few for sufficiently large data sets. Owing to the fact that the data are integer, the smallest number of data for which your method may fail is larger than 3.

If you are searching for duplicates in your array, there is a simple way:
const int N = 5;
int array[N] = {1,2,3,4,4};
for (int i = 0; i < N; i++) {
    for (int j = i + 1; j < N; j++) {
        if (array[j] == array[i]) {
            std::cout << "DUPLICATE FOUND\n";
            return true;
        }
    }
}
return false;
Another simple way to find duplicates is to use the std::set container, for example:
std::set<int> set_int;
set_int.insert(5);
set_int.insert(5);
set_int.insert(4);
set_int.insert(4);
set_int.insert(5);
std::cout << "\nsize " << set_int.size();
The output will be "size 2", because the set holds only 2 distinct values. If the set ends up smaller than the number of inserted values, there was a duplicate.
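The std::set idea can be wrapped into a function that stops at the first duplicate. This is a minimal sketch; the name `has_duplicates` and the use of std::vector as the input type are my assumptions, not part of the answer.

```cpp
#include <set>
#include <vector>

// Returns true as soon as a value is seen for the second time.
bool has_duplicates(const std::vector<int>& values) {
    std::set<int> seen;
    for (int v : values) {
        // insert() returns a pair; its bool member is false
        // when v was already present in the set
        if (!seen.insert(v).second) {
            return true;
        }
    }
    return false;
}
```

For {1, 2, 3, 4, 4} this returns true, and for {0, 2, 1, 3} it returns false.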

A more in depth explanation why your algorithm is wrong:
count the sum of the indices (4x5 / 2 = 10) and check that the values' sum (3+4+1+2+0) is equal to this sum. if not, there's repeated number.
Given any array A which has no duplicates, it is easy to create an array that meets your first requirement but now contains duplicates. Just take two values, subtract some value v from one of them, and add v to the other. Or take several values and change them so that their sum stays the same (as long as the new values are still within the 0 .. N-1 range). For N = 3 it is already possible to change {0,1,2} to {1,1,1}. For an array of size 3, there are 7 compositions that have the correct sum, and 1 of them is a false positive. For an array of size 4, 20 out of 44 have duplicates; for size 5 that's 261 out of 381; for size 6 that's 3612 out of 4332; and so on. It is safe to say that the number of false positives grows much faster than the number of real positives.
in addition to the first condition, get the multiplication of the indices(except 0. so: 1x2x3x4) and check if it's equal to the values' multiplication (except 0, so: 3x4x1x2).
The second requirement involves the product of all indices above 0. It is easy to see that this can never be a very strong restriction either. As soon as one of the indices is not prime, the product of all indices is no longer uniquely tied to the multiplicands, and a list of different values with the same product can be constructed. E.g. a pair of 2 and 6 can be replaced with 3 and 4, 2 and 9 can be replaced with 6 and 3, and so on. Obviously the number of false positives increases as the array gets larger and more non-prime values are used as multiplicands.
Neither of these requirements is really strong, and they cannot compensate for each other. Since 0 is not even considered for the second restriction, a false positive can be created fairly easily for arrays starting at size 5: any pair of 0 and 4 can simply be replaced with two 2's in any unique array, for example {2, 1, 2, 3, 2}.
What you would need is a result that is uniquely tied to the occurring values. You could tweak your second requirement into a more complex approach that skips the non-prime values and takes 0 into account. For example, you could use the first prime (2) as the multiplicand for 0, 3 as the multiplicand for 1, 5 as the multiplicand for 2, and so on. That would work (you would not need the first requirement), but this approach would be overly complex. A simpler way to get a unique result would be to OR the i-th bit for each value (0 => 1 << 0, 1 => 1 << 1, 2 => 1 << 2, and so on). Obviously it is faster to check whether a bit was already set by a recurring value than to wait for the final result. And this is conceptually the same as using a bool array/vector from the other examples!
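The bit-per-value idea might look like the sketch below. It assumes the array has at most 64 elements so that one 64-bit word is enough; the function name is hypothetical, and for larger arrays a bool array or std::vector<bool> would take the place of the single word.

```cpp
#include <cstdint>
#include <vector>

// Assumption: values.size() <= 64 and every value lies in [0, values.size()).
bool has_repetition_bitmask(const std::vector<int>& values) {
    std::uint64_t seen = 0;
    for (int v : values) {
        const std::uint64_t bit = std::uint64_t{1} << v;  // value v owns bit v
        if (seen & bit) {
            return true;  // bit already set: v occurred before
        }
        seen |= bit;
    }
    return false;
}
```

On the false positive {2, 1, 2, 3, 2} from above this returns true, while the permutation {3, 4, 1, 2, 0} gives false.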

Related

Does this problem have overlapping subproblems?

I am trying to solve this question on LeetCode.com:
You are given an m x n integer matrix mat and an integer target. Choose one integer from each row in the matrix such that the absolute difference between target and the sum of the chosen elements is minimized. Return the minimum absolute difference. (The absolute difference between two numbers a and b is the absolute value of a - b.)
So for input mat = [[1,2,3],[4,5,6],[7,8,9]], target = 13, the output should be 0 (since 1+5+7=13).
The solution I am referring to is below:
int dp[71][70 * 70 + 1] = {[0 ... 70][0 ... 70 * 70] = INT_MAX};
int dfs(vector<set<int>>& m, int i, int sum, int target) {
    if (i >= m.size())
        return abs(sum - target);
    if (dp[i][sum] == INT_MAX) {
        for (auto it = begin(m[i]); it != end(m[i]); ++it) {
            dp[i][sum] = min(dp[i][sum], dfs(m, i + 1, sum + *it, target));
            if (dp[i][sum] == 0 || sum + *it > target)
                break;
        }
    } else {
        // cout<<"Encountered a previous value!\n";
    }
    return dp[i][sum];
}
int minimizeTheDifference(vector<vector<int>>& mat, int target) {
    vector<set<int>> m;
    for (auto &row : mat)
        m.push_back(set<int>(begin(row), end(row)));
    return dfs(m, 0, 0, target);
}
I don't follow how this problem is solvable by dynamic programming. The states apparently are the row i and the sum (from row 0 to row i-1). Given that the problem constraints are:
m == mat.length
n == mat[i].length
1 <= m, n <= 70
1 <= mat[i][j] <= 70
1 <= target <= 800
My understanding is that we would never encounter a sum that we have previously encountered (all values are positive). Even the debug cout statement that I added does not print anything on the sample inputs given in the problem.
How could dynamic programming be applicable here?

This problem is NP-hard, since the 0-1 knapsack problem reduces to it pretty easily.
This problem also has a dynamic programming solution that is similar to the one for 0-1 knapsack:
Find all the sums you can make with a number from the first row (that's just the numbers in the first row).
For each subsequent row, add all the numbers from the ith row to all the previously accessible sums to find the sums you can get after i rows.
If you need to be able to recreate a path through the matrix, then for each sum at each level, remember the preceding one from the previous level.
There are indeed overlapping subproblems, because there will usually be multiple ways to get a lot of the sums, and you only have to remember and continue from one of them.
Here is your example:
sums from row 1: 1, 2, 3
sums from rows 1-2: 5, 6, 7, 8, 9
sums from rows 1-3: 12, 13, 14, 15, 16, 17, 18
As you see, we can make the target sum. There are a few ways:
7+4+2, 7+5+1, 8+4+1
Some targets like 15 have a lot more ways. As the size of the matrix increases, the amount of overlap tends to increase, and so this solution is reasonably efficient in many cases. The total complexity is in O(M * N * max_weight).
But, this is an NP-hard problem, so this is not always tractable -- max_weight can grow exponentially with the size of the problem.
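The row-by-row procedure described above can be sketched as follows. This is my own bottom-up variant, not the code from the question; the function name is hypothetical, and it assumes a non-empty matrix with non-empty rows and positive entries, as in the stated constraints.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Assumption: every row is non-empty and all entries are positive.
int minimize_difference(const std::vector<std::vector<int>>& mat, int target) {
    // reachable[s] != 0 means sum s can be formed from the rows seen so far
    std::vector<char> reachable(1, 1);  // before any row, only sum 0 exists
    for (const auto& row : mat) {
        const int max_in_row = *std::max_element(row.begin(), row.end());
        std::vector<char> next(reachable.size() + max_in_row, 0);
        for (std::size_t s = 0; s < reachable.size(); ++s) {
            if (!reachable[s]) continue;
            // many (s, v) combinations land on the same sum: that is the overlap
            for (int v : row) next[s + v] = 1;
        }
        reachable = std::move(next);
    }
    int best = INT_MAX;
    for (std::size_t s = 0; s < reachable.size(); ++s) {
        if (reachable[s]) {
            best = std::min(best, std::abs(static_cast<int>(s) - target));
        }
    }
    return best;
}
```

For the matrix {{1,2,3},{4,5,6},{7,8,9}} and target 13 this returns 0, matching the example. Each distinct sum is stored once per row, however many ways lead to it.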

Find the first element that is n times larger than current element in a list

It is easy to come up with an O(n) algorithm to solve this very famous question:
For every element in the list, find the first element that is larger than it. This can be done using a stack. But, what if I want to find the first element that is larger than n*current element?
More specifically:
Given an array [2, 5, 4, 7, 3, 8, 9, 6] and n = 2.
I want [5, -1, 9, -1, 8, -1, -1, -1]
For 2, 5 is the next element larger than n * 2, for 4, 9 is the next element larger than n * 4. For 5, there is no element larger than n * 5 so return -1 at that position.
Can we do better than O(n^2)?

I agree with the OP that the simple stack-based O(N) algorithm might not work when looking for the first element > 2x in the remaining array.
I found an O(N log N) solution for this, by the way.
It uses a Min-heap to maintain the frontier elements we are interested in.
Pseudo-code (runnable as Python):
import heapq

def get_2x_elements(input_list, multiplier=2):
    H = []  # min-heap of (value, index) tuples, ordered by value
    R = [-1] * len(input_list)  # results list
    for index, value in enumerate(input_list):
        # resolve every waiting element whose multiple is exceeded by value
        while H and multiplier * H[0][0] < value:
            _, waiting_index = heapq.heappop(H)
            R[waiting_index] = value
        heapq.heappush(H, (value, index))
    return R
Complexity-analysis:
1. Insertion/Extraction from min-heap = O(logN)
2. Number of such operations = N
Total-complexity = O(NlogN)
PS: This assumes we need the first >2x element from the remaining part of the list.
Re:
I made a Java implementation of your idea. Thanks @Serial Lazer
// needs: import java.util.List; import java.util.PriorityQueue;
private static class ValueAndIndexPair implements Comparable<ValueAndIndexPair> {
    public final double value;
    public final int index;
    public ValueAndIndexPair(double value, int index) {
        this.value = value;
        this.index = index;
    }
    @Override
    public int compareTo(ValueAndIndexPair other) {
        return Double.compare(value, other.value);
    }
}
public static double[] returnNextNTimeLargerElementIndex(final List<Double> valueList, double multiplier) {
    double[] result = new double[valueList.size()];
    PriorityQueue<ValueAndIndexPair> minHeap = new PriorityQueue<>();
    // Initialize O(n)
    for (int i = 0; i < valueList.size(); i++) {
        result[i] = -1.0;
    }
    if (valueList.size() <= 1) return result;
    // the heap stores each element already multiplied, together with its index
    minHeap.add(new ValueAndIndexPair(valueList.get(0) * multiplier, 0));
    for (int i = 1; i < valueList.size(); i++) {
        double currentElement = valueList.get(i);
        while (!minHeap.isEmpty() && minHeap.peek().value < currentElement) {
            result[minHeap.poll().index] = currentElement;
        }
        minHeap.add(new ValueAndIndexPair(currentElement * multiplier, i));
    }
    return result;
}

Sure, easily.
We just need a sorted version of the array (sorted elements plus their original index) and then we can do an efficient search (a modified binary search) that points us to the start of the elements that are larger than the current number (or a multiple of it, it doesn't matter). Those elements we can then search sequentially for the one with the smallest index (that is greater than the one of the current number, if so required).
Edit: It was pointed out that the algorithm may not be better than O(n²) because of the sequential search of the elements that satisfy the condition of being greater. This may be so, I'm not sure.
But note that we may build a more complex search structure that involves the index already, somehow. This is left as homework. :)

The stack-based solution offered at geeksforgeeks does not seem to maintain the order of the elements in the result even in its output:
input: int arr[] = { 11, 13, 21, 3 };
output:
11 -- 13
13 -- 21
3 -- -1
21 -- -1
After a minor modification to find the first element that is more than N times the current one, this algorithm fails to detect 9 for the element 4 from the given example.
input: int arr[] = { 2, 5, 4, 7, 3, 8, 9, 6 }; // as in this question
output:
2 * 2 --> 5
2 * 3 --> 8
6 -- -1
9 -- -1
8 -- -1
7 -- -1
4 -- -1
5 -- -1
Thus, the initial assumption that an existing O(N) solution applies does not quite hold for the expected result.

C++ : Inserting Operator + and - In an array and check if it is possible to create n

Problem: Given a number n, you have an array with n-1 elements: the 1st element is 1, the 2nd is 2, and so on up to the (n-1)th element, which is n-1. How can one check whether, by putting + or - in front of each element, the total can be made equal to n?
Example :
n = 3, Array = {1,2}
+1 +2 = 3 (True)
n = 4, Array = {1,2,3}
-1 + 2 + 3 = 4 (True)
n = 5, Array = {1,2,3,4}
No possible combination
I tried too long to think about it and still haven't come up with the right answer :(

If you are looking for a simple solvable/not solvable answer, then it seems the answer is very simple:
(sum - n) % 2 != 0 // => non-solvable
Here is result of an experiment:
When n gets larger it becomes easier to subtract necessary sum and there are plenty of possible solutions.
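The parity rule can be compared against an exhaustive search for small n. The sketch below is my own experiment, not the one referred to above; all function names are hypothetical, and `sum` is taken to be 1 + 2 + ... + (n-1).

```cpp
// Parity rule from the answer: solvable unless (sum - n) is odd.
bool solvable_by_parity(int n) {
    const int sum = n * (n - 1) / 2;  // 1 + 2 + ... + (n - 1)
    return (sum - n) % 2 == 0;
}

// Exhaustive check: bit k of mask decides the sign of the number k + 1.
// Assumption: n is small (the loop runs over 2^(n-1) sign choices).
bool solvable_by_brute_force(int n) {
    const int count = n - 1;
    for (unsigned mask = 0; mask < (1u << count); ++mask) {
        int total = 0;
        for (int k = 0; k < count; ++k) {
            total += (mask & (1u << k)) ? (k + 1) : -(k + 1);
        }
        if (total == n) return true;
    }
    return false;
}

// True if both methods give the same answer for every n in 1..max_n.
bool rules_agree_up_to(int max_n) {
    for (int n = 1; n <= max_n; ++n) {
        if (solvable_by_parity(n) != solvable_by_brute_force(n)) return false;
    }
    return true;
}
```

For the examples in the question, both functions report n = 3 and n = 4 as solvable and n = 5 as not solvable.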

Algorithm to determine that a 2x2 square contains the numbers 1-4 (no repeats)

What would be an applicable C++ algorithm to determine that a 2x2 square (say, represented by a 1d vector) contains the numbers 1-4? I can't think of this, although it is quite simple. I would prefer to not have a giant if statement.
Examples of appropriate squares
1 2
3 4
2 3
4 1
1 3
2 4
Inappropriate squares:
1 1
2 3
1 2
3 3
1 2
4 4

I would probably start with an unsigned int set to 0 (e.g., call it x). I'd assign one bit in x to each possible input number (e.g., 1->bit 0, 2->bit 1, 3->bit 2, 4->bit 3). As I read the numbers, I'd verify that the number was in range, and if it was, set the corresponding bit in x.
At the end, if all the numbers are different, I should have 4 bits of x set. If any of the numbers was repeated, some of those bits won't be set.
If you prefer, you could use std::bitset or std::vector<bool> instead of the bits in a single number. In this case a single number is probably easier though, because you can verify the presence of all four desired bits with a single comparison.
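A minimal sketch of this approach, with the range check included, could look like this. The function name and the std::vector<int> parameter are my assumptions.

```cpp
#include <vector>

// Returns true if the four entries are exactly the numbers 1 to 4.
bool is_valid_square(const std::vector<int>& square) {
    if (square.size() != 4) return false;
    unsigned x = 0;
    for (int v : square) {
        if (v < 1 || v > 4) return false;  // number out of range
        x |= 1u << (v - 1);                // 1 -> bit 0, ..., 4 -> bit 3
    }
    return x == 0xFu;  // all four bits set means no number was repeated
}
```

Since exactly four numbers are read, four set bits can only come from four different numbers, so the single comparison at the end is sufficient.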

bool valid(const unsigned square[]) {
    unsigned r = 0;
    for (int i = 0; i < 4; ++i)
        r |= 1u << square[i];  // set the bit belonging to each number
    return r == 30;            // 30 == binary 11110: bits 1 to 4 are all set
}
Just set the appropriate bits, and check whether all are set at the end.
Though it assumes the numbers are smaller than sizeof(unsigned) * CHAR_BIT.

Well if it's represented by a vector and we just want something that works:
bool isValidSquare(const std::vector<int>& square) {
    if (square.size() == 4) {
        std::set<int> uniqs(square.begin(), square.end());
        return uniqs.count(1) && uniqs.count(2) && uniqs.count(3) && uniqs.count(4);
    }
    return false;
}

Create a static bitset with the bits corresponding to 1-4 set, and another one with all bits unset.
Traverse through the vector, setting the respective bit in the 2nd set for current vector element.
Compare the 1st and 2nd set. If they match, the square is appropriate. Otherwise, it isn't.

You can use the standard library for this
#include <iostream>
#include <algorithm>
#include <vector>
int main()
{
    std::vector<int> input{1,5,2,4};
    std::sort(std::begin(input), std::end(input));
    std::cout << std::boolalpha
              << std::equal(std::begin(input), std::end(input), std::begin({1,2,3,4}));
}

Assuming your inputs are only the numbers 1 to 4 (an assumption based on your examples), you can XOR them and check whether the result is 4:
if ((tab[0] ^ tab[1] ^ tab[2] ^ tab[3]) == 4)
    // Matches !
I had the feeling this would work but am too tired to prove it mathematically; this Python program checks every combination and shows it is right:
numbers = [1, 2, 3, 4]
good_results = []
bad_results = []
for i in numbers:
    for j in numbers:
        for k in numbers:
            for l in numbers:
                res = i ^ j ^ k ^ l
                print("%i %i %i %i -> %i" % (i, j, k, l, res))
                if len(set([i, j, k, l])) == 4:  # this condition checks if i, j, k and l are different
                    good_results.append(res)
                else:
                    bad_results.append(res)
print(set(good_results))  # => {4}
print(set(bad_results))  # => {0, 1, 2, 3, 5, 6, 7}

Indices of objects in a list of non-redundant pairs

I am implementing a collision detection algorithm that stores the distance between all the objects in a single octree node. For instance, if there are 4 objects in the node, there is a distance between objects 1&2, 1&3, 1&4, 2&3, 2&4 and 3&4. The formula for the total number of pairs is t = n * (n-1) / 2, where t is the total number of pairs and n is the number of objects in a node.
My question is, how do I convert from a position in the list to a pair of objects. For instance, using the above list of pairs, 3 would return the pair 2&3.
To save space in memory, the list is just a list of floats for the distance instead of containing distance and pointers to 2 objects.
I am unsure how to mathematically convert the single list index to a pair of numbers. Any help would be great. I am hoping to be able to break this down to 2 functions, the first returns the first object in the pair and the second returns the second, both the functions taking 2 variables, one being the index and the other being the total objects in the node. If possible I would like to make a function without any looping or having a recursive function because this will be run in real time for my collision detection algorithm.

Better ordering
I suggest using colexicographical order, as in that case you won't have to supply the total number of objects. Order your pairs like this:
0:   1:   2:   3:   4:   5:   6:   7:   8:   9:   10:  11:  12:  13:  …
0&1, 0&2, 1&2, 0&3, 1&3, 2&3, 0&4, 1&4, 2&4, 3&4, 0&5, 1&5, 2&5, 3&5, …
You'll be able to extend this list to infinite length, so you can know the index of any pair without knowing the number of items. This has the benefit that when you add new items to your data structure, you'll only have to append to your arrays, not relocate existing entries. I've adjusted the indices to zero-based, as you tagged your question C++ so I assume you'll be using zero-based indexing. All my answer below assumes this ordering.
You can also visualize the colex ordering like this:
   a:  0   1   2   3   4   5  …
b:
1      0
2      1   2          index of
3      3   4   5        a&b
4      6   7   8   9
5     10  11  12  13  14
6     15  16  17  18  19  20
⋮      ⋮                       ⋱
Pair to single index
Let us first turn a pair into a single index. The trick is that for every pair, you look at the second position and imagine all the pairs that had a lesser number in that position. So for example for the pair 2&4 you first count all the pairs where the second number is less than 4. This is the number of possible ways to choose two items from a set of 4 (i.e. the numbers 0 through 3), so you could express this as a binomial coefficient 4C2. If you evaluate it, you end up with 4(4−1)/2=6. To that you add the first number, as this is the number of pairs with lower index but with the same number in the second place. For 2&4 this is 2, so the overall index of 2&4 is 4(4−1)/2+2=8.
In general, for a pair a&b the index will be b(b−1)/2+a.
int index_from_pair(int a, int b) {
    return b*(b - 1)/2 + a;
}
Single index to pair
One way to turn the single index i back into a pair of numbers would be increasing b until b(b+1)/2 > i, i.e. the situation where the next value of b would result in indices larger than i. Then you can find a as the difference a = i−b(b−1)/2. This approach by incrementing b one at a time involves using a loop.
pair<int, int> pair_from_index(int i) {
    int a, b;
    for (b = 0; b*(b + 1)/2 <= i; ++b)
        /* empty loop body */;
    a = i - b*(b - 1)/2;
    return make_pair(a, b);
}
You could also interpret b(b−1)/2 = i as a quadratic equation, which you can solve using a square root. The real b you need is the floor of the floating point b you'd get as the positive solution to this quadratic equation. As you might encounter problems due to rounding errors in this approach, you might want to check whether b(b+1)/2 > i. If that is not the case, increment b as you would do in the loop approach. Once you have b, the computation of a remains the same.
pair<int, int> pair_from_index(int i) {
    int b = (int)floor((sqrt(8*i + 1) + 1)*0.5);
    if (b*(b + 1)/2 <= i) ++b; // handle possible rounding error
    int a = i - b*(b - 1)/2;
    return make_pair(a, b);
}
Sequential access
Note that you only need to turn indices back to pairs for random access to your list. When iterating over all pairs, a set of nested loops is easier. So instead of
for (int i = 0; i < n*(n - 1)/2; ++i) {
    pair<int, int> ab = pair_from_index(i);
    int a = ab.first, b = ab.second;
    // do stuff
}
you'd better write
for (int i = 0, b = 1; b != n; ++b) {
    for (int a = 0; a != b; ++a) {
        // do stuff
        ++i;
    }
}

Based on my understanding of the question, one way to get a pair a&b (1-based, 2&3 in your example) from the index (0-based, 3 in your example) and the number of objects n (4 in your example) is:
t = n * (n - 1) / 2;
a = n - floor((1 + sqrt(1 + 8 * (t - index - 1))) / 2);
b = index + (n - a) * (n - a + 1) / 2 - t + a + 1;
Some credits to http://oeis.org/A002024
Generalized algorithms (for tuples rather than pairs) can be found at Calculate Combination based on position and http://saliu.com/bbs/messages/348.html, but they seem to involve calculating combinations in a loop.
Edit: a nicer formula for a (from the same source):
a = n - floor(0.5 + sqrt(2 * (t - index)));
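Put together as C++, the formulas above might read as follows. This is a sketch using the nicer formula for a; the function name `pair_from_index` and the std::pair return type are my choices, and it assumes 0 <= index < n * (n - 1) / 2 with values small enough for double precision to be exact.

```cpp
#include <cmath>
#include <utility>

// index is 0-based; the returned pair (a, b) is 1-based with a < b.
// Assumption: 0 <= index < n * (n - 1) / 2.
std::pair<int, int> pair_from_index(int index, int n) {
    const int t = n * (n - 1) / 2;  // total number of pairs
    const int a = n - static_cast<int>(std::floor(0.5 + std::sqrt(2.0 * (t - index))));
    const int b = index + (n - a) * (n - a + 1) / 2 - t + a + 1;
    return {a, b};
}
```

With n = 4 and index 3 this gives the pair 2&3, as in the question. There is no loop or recursion, only one square root per lookup.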
