Find an element in an array with limited comparisons?

Given X, M and N, where X is the element to search for, M is the array size and N is the number of leading elements we are allowed to access, how do we find X in the array with at most (N+1) comparisons?
For example,
A = [3,5,2,9,8,4,1,6,7], so M = 9.
Let N = 6 and X = 5. In this case we access only the first 6 elements of the array and check whether X is among them; here the answer is true. For X = 6 the answer is false.
This problem is not about time complexity; it is about the number of comparisons you make. For example, the brute-force method looks like this:
bool search(const vector<int>& A, int N, int X){
    for(int i=0; i<N; i++){ // [i < N is also a comparison, made about N times]
        if(A[i] != X) continue; // [N comparisons]
        else return true;
    }
    return false;
}
Time complexity is O(N), but the number of comparisons is about 2*N. How can this be reduced to (N+1)? I tried to solve it but could not find a solution. Is there actually one?

Idea
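Overwrite the (N+1)-th element (index N) with the value X and drop the range check. The loop is then guaranteed to find an element equal to X, provided the array really has an (N+1)-th element, that is, N < M. Once it stops, check the index at which it stopped (this is the last check you can perform): if it equals N, the loop hit the overwritten element and X is not among the first N elements.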
Analysis
Although this approach eliminates the duplicated comparisons, it still has one "extra" comparison:
bool search(int* a, int n, int x)
{
a[n] = x;
int idx = 0;
while (a[idx] != x) // n + 1 comparisons in case if value hasn't been found
++idx;
return idx < n; // (n + 2)-th comparison in case if value hasn't been found
}
Solution (not perfect, though)
I can see only one way to cut that extra comparison with this approach: use the fact that an integer value of zero converts to false and any non-zero integer value converts to true. With that, the code looks like this:
bool search(int* a, int n, int x)
{
a[n] = x;
int idx = 0;
while (a[idx] != x) // n + 1 comparisons in case if value hasn't been found
++idx;
return idx - n; // returns 0 only when idx == n, which means that value is not found
}
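Note that both versions above write to a[n], so the caller's array is modified and a[n] must be a valid slot. A minimal sketch, assuming the array holds at least n + 1 elements, that saves and restores the overwritten slot; the name `searchWithSentinel` and the use of `std::vector` are illustrative choices, not part of the original answer:

```cpp
#include <vector>

// Hypothetical helper: looks for x among the first n elements of a.
// Assumes 0 <= n < a.size(), so that a[n] is a valid slot for the sentinel.
bool searchWithSentinel(std::vector<int>& a, int n, int x) {
    const int saved = a[n];   // remember the element the sentinel overwrites
    a[n] = x;                 // sentinel: the loop below stops at idx <= n
    int idx = 0;
    while (a[idx] != x)       // at most n + 1 element comparisons
        ++idx;
    a[n] = saved;             // restore the caller's data
    return idx - n;           // 0 (false) only when the loop stopped at the sentinel
}
```

With the array from the question and n = 6, searching for 5 yields true and searching for 6 yields false, and the array is left unchanged afterwards.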

Related

Accessing an uninitialized value, most probably in a vector

I am doing an exercise where I need to find positive integers p and q that are factors of another natural number n,
following the formula n = p*q*q, where p is a squarefree number.
However, for some instances of the program my compiler detects a memory error saying that I am accessing an uninitialized value.
The logic I tried is as follows. Firstly, I took the number that needs to be factored (name it n). Next I found all factors of the number n and placed them in a vector. After that, check if every element of that vector is squarefree. If true, put the element in another vector(a vector of squarefree factors of the number n). After that, go through every element of the vector of squarefree factors and solve the equation q=sqrt(n/p) where p is the squarefree factor from the vector. Additionally, I check the condition if(int(sqrt(n/p))==sqrt(n/p)) because the square root needs to be a positive integer.
#include <iostream>
#include <vector>
#include <cmath>
using namespace std;
// Function that checks if the number is squarefree
bool isSquareFree(int n)
{
if (n % 2 == 0)
n = n / 2;
if (n % 2 == 0)
return false;
for (int i = 3; i <= sqrt(n); i += 2)
{
if (n % i == 0)
{
n = n / i;
if (n % i == 0)
return false;
}
}
return true;
}
void Factorise_the_number(int n, int &p, int &q)
{
if (n <= 0)
return; // a void function cannot return a value
vector<int> factors(0); // vector of factors
vector<int> sqfree_factors(0); // vector of squarefree factors
int sqfree_number; // the number "p"
int squared; // is essentially the number "q"
for (int i = 1; i <= n / 2; i++)
{
if (n % i == 0)
factors.push_back(i); // takes all factors of the number "n"
}
for (int i = 0; i < factors.size(); i++)
{
if (isSquareFree(factors.at(i)))
sqfree_factors.push_back(factors.at(i));
} // checks if each factor is squarefree. if yes, put it in a separate vector
for (auto x : sqfree_factors)
{
if (int(sqrt(n / x)) == sqrt(n / x))
{ // if true, we found the numbers
squared = sqrt(n / x);
sqfree_number = x;
break;
}
}
p = sqfree_number;
q = squared;
}
int main()
{
int n, p = 0, q = 0;
cin >> n;
Factorise_the_number(n, p, q);
cout << p << " " << q;
return 0;
}
For example, my program works if I enter the number 99, but doesn't work if I enter 39. Can anyone give any insight?
Thanks!

As you said, for 39 it doesn't work. Have you checked what it's doing with 39? You should do it, as it is the best way to debug your program.
Let's have a look at it together. First it tries to find all the factors, and it finds 1, 3 and 13: this looks fine.
Then, it checks whether each of those numbers is squarefree, and they all are: this also looks correct.
Then, it checks whether any of the squarefree factors satisfies the equality you are looking for. None of them does (39 is 3 x 13, there's no way it can contain a squared factor). This means that if (int(sqrt(n / x)) == sqrt(n / x)) is never true, and that block is never run. What are the values of sqfree_number and squared at that point? They were never initialised. Using uninitialised values leads to "undefined behaviour", that is, your program can do anything. In this case, p and q end up containing arbitrary values.
How can you fix it? Consider this: if n doesn't satisfy your equation, that is, it can't be expressed as p*q*q, what, exactly, should the program output? Would your output, as it is now, ever make sense? No. This means you have to modify your program so that it covers a case you hadn't considered.
One way is to add a bool found = false; just before your final for loop. When you find the factors, set that variable to true before breaking. Then, outside the loop, check it: is it true? Then you can return the correct values. But if it's still false, it means the equality doesn't hold, and you can't return correct values. You have to find a way to signal this to the caller (which is your main function), so that it can print an appropriate message.
And how can you signal it? In general, you could change your Factorise_the_number (by the way, function names usually start with a lowercase letter; uppercase letters are usually used for classes) to return a bool. Or you could use a trick: return a special value for p and q that cannot be the result of the calculation, such as -1. Then, before printing, check: if the values are -1, it means the number can't be expressed as p*q*q.
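The bool-returning variant described above could be sketched as follows. This is only an illustration: the names `factoriseNumber` and `isSquareFreeSketch` are hypothetical, the squarefree test is a plain trial division rather than the question's function, and the candidate range for p stops at n / 2 as in the question's code.

```cpp
#include <cmath>

// Hypothetical helper (trial division): true when no square greater than 1 divides n.
bool isSquareFreeSketch(int n) {
    for (int i = 2; i * i <= n; ++i)
        if (n % (i * i) == 0) return false;
    return true;
}

// Sketch: looks for a squarefree p and a q with n == p * q * q.
// Returns true and fills p and q on success; returns false and leaves
// p and q untouched when nothing is found (this plays the role of "found").
bool factoriseNumber(int n, int& p, int& q) {
    if (n <= 0) return false;
    for (int cand = 1; cand <= n / 2; ++cand) {   // same range as the question's code
        if (n % cand != 0 || !isSquareFreeSketch(cand)) continue;
        const int rest = n / cand;
        const int root = static_cast<int>(std::lround(std::sqrt(static_cast<double>(rest))));
        if (root * root == rest) {                // integer check, no floating-point equality
            p = cand;
            q = root;
            return true;
        }
    }
    return false;
}
```

The caller can then print p and q only when the function returns true, and an explanatory message otherwise. Whether the candidate range should also include n itself (which would let a squarefree n be written as n*1*1) is a design decision the original answer does not address.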

count number of partitions of a set with n elements into k subsets

This program counts the number of partitions of a set with n elements into k subsets. I am confused by the line return k*countP(n-1, k) + countP(n-1, k-1);
Can someone explain what is happening here?
Why are we multiplying by k?
NOTE: I know this is not the best way to calculate the number of partitions; that would be DP.
// A C++ program to count number of partitions
// of a set with n elements into k subsets
#include<iostream>
using namespace std;
// Returns count of different partitions of n
// elements in k subsets
int countP(int n, int k)
{
// Base cases
if (n == 0 || k == 0 || k > n)
return 0;
if (k == 1 || k == n)
return 1;
// S(n+1, k) = k*S(n, k) + S(n, k-1)
return k*countP(n-1, k) + countP(n-1, k-1);
}
// Driver program
int main()
{
cout << countP(3, 2);
return 0;
}

Each countP call implicitly considers a single element in the set; let's call it A.
The countP(n-1, k-1) term comes from the case where A is in a set by itself. In this case, we just have to count how many ways there are to partition all the other elements (N-1) into (K-1) subsets, since A takes up one subset by itself.
The k*countP(n-1, k) term, then, comes from the case where A is not in a set by itself. So we figure out the number of ways of partitioning all the other (N-1) values into K subsets, and multiply by K because there are K possible subsets we could add A to.
For example, consider the set [A,B,C,D], with K=2.
The first case, countP(n-1, k-1), describes the following situation:
{A, BCD}
The second case, k*countP(n-1, k), describes the following cases:
2*({BC,D}, {BD,C}, {B,CD})
Or:
{ABC,D}, {ABD,C}, {AB,CD}, {BC,AD}, {BD,AC}, {B,ACD}
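As a check on this example, the recurrence can be expanded by hand for n = 4 and k = 2. Here S(n, k) is shorthand for countP(n, k), and the values S(2, 2) = S(2, 1) = S(3, 1) = 1 come from the base cases in the code:

```latex
\begin{aligned}
S(3,2) &= 2\,S(2,2) + S(2,1) = 2 \cdot 1 + 1 = 3,\\
S(4,2) &= 2\,S(3,2) + S(3,1) = 2 \cdot 3 + 1 = 7.
\end{aligned}
```

The total of 7 agrees with the listing: 1 partition in which A is alone, plus 6 in which A joins one of the 2 subsets of each of the 3 partitions of {B, C, D}.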

How do we get countP(n, k)? Assume we have already divided the previous n-1 elements into a certain number of partitions, and now we take the n-th element and try to make k partitions.
We have two options:
either
we have divided the previous n-1 elements into k partitions (there are countP(n-1, k) ways of doing this) and we put the n-th element into one of these partitions (we have k choices). So we have k*countP(n-1, k).
or:
we have divided the previous n-1 elements into k-1 partitions (there are countP(n-1, k-1) ways of doing this) and we make the n-th element a partition of its own to reach k partitions (we have only 1 choice: putting it separately). So we have countP(n-1, k-1).
So we sum them up and get the result.

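What you mentioned are the Stirling numbers of the second kind, which count the number of ways to partition a set of n objects into k non-empty subsets and are denoted by $S(n,k)$ or $\left\{ {n \atop k} \right\}$.
The recurrence relation is:
$S(n+1,k) = k\,S(n,k) + S(n,k-1)$ for k > 0, with initial conditions:
$S(0,0) = 1$ and $S(n,0) = S(0,n) = 0$ for n > 0.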
Calculating it with dynamic programming is faster than the naive recursive approach:
#include <vector>
int secondKindStirlingNumber(int n, int k) {
    // sf[i][j] holds S(i, j). Zero-initialising the table gives
    // S(i, 0) = 0 for i > 0 and S(i, j) = 0 for j > i.
    std::vector<std::vector<int>> sf(n + 1, std::vector<int>(k + 1, 0));
    sf[0][0] = 1; // S(0, 0) = 1
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= k; j++) {
            sf[i][j] = j * sf[i - 1][j] + sf[i - 1][j - 1];
        }
    }
    return sf[n][k];
}

Based on this, a partition of a set is a grouping of the set's elements into non-empty subsets, in such a way that every element is included in one and only one of the subsets. So the total number of partitions of an n-element set is the Bell number, which is calculated as below:
[Image: Bell number formula]
Hence if you want to convert the formula to a recursive function it will be like:
k*countP(n-1,k) + countP(n-1, k-1);

c++ How to find the minimum number of elements from a vector that sum to a given number

I want to find the minimum number of elements from an array whose sum is equal to a given number.

Looks like an easy dynamic programming task.
Suppose there are N elements in the array A and you want the minimum number of elements whose sum is S. Then we can solve the problem in O(N x S) time.
Let dp[i][j] be the minimum number of elements among the first i elements whose sum is j, for 1 <= i <= N and 0 <= j <= S. Then for A[i] <= j <= S:
dp[i][j] = min(dp[i - 1][j], 1 + dp[i - 1][j - A[i]]),
and for j < A[i] the element cannot be used, so dp[i][j] = dp[i - 1][j].
We can assume that dp[0][0] = 0 and dp[0][j] = infinity for 0 < j <= S.
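The recurrence above might be sketched in C++ as follows. This is an illustration rather than the answerer's code: the function name `minElementsForSum` is made up, the elements are assumed to be non-negative, and the 2-D table is replaced by a single row updated from right to left (a common space-saving variant that computes the same values).

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Sketch: minimum number of elements of a that sum exactly to s, or -1 if
// no subset does. Assumes every element is non-negative and s >= 0.
int minElementsForSum(const std::vector<int>& a, int s) {
    const int INF = INT_MAX;              // stands in for "infinity"
    std::vector<int> dp(s + 1, INF);      // dp[j] plays the role of dp[i][j]
    dp[0] = 0;                            // the empty subset sums to 0
    for (int x : a) {
        // Descending j so that dp[j - x] still refers to the previous row,
        // which means each element is used at most once.
        for (int j = s; j >= x; --j) {
            if (dp[j - x] != INF)
                dp[j] = std::min(dp[j], dp[j - x] + 1);
        }
    }
    return dp[s] == INF ? -1 : dp[s];
}
```

For example, with the elements 1, 2, 3, 4, 5 and a target of 9 this yields 2 (4 + 5), and with the elements 2, 4 and a target of 5 it yields -1.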

The simplest way is to solve it recursively.
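#include <climits>
#include <cstddef>
#include <vector>
// Returns the fewest elements of sorted_list[start..] that sum to goal,
// or INT_MAX (standing in for infinity) when no combination exists.
int find_sum(int goal, const std::vector<int>& sorted_list, std::size_t start = 0) {
    if (goal == 0) return 0; // base case: nothing left to reach
    int best_result = INT_MAX;
    for (std::size_t i = start; i < sorted_list.size(); i++) {
        // use element i, then solve the rest with the elements after it
        int result = find_sum(goal - sorted_list[i], sorted_list, i + 1);
        if (result != INT_MAX && result + 1 < best_result) best_result = result + 1;
    }
    return best_result;
}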
There are many ways to optimize this algorithm and may be fundamentally better algorithms as well, but I was trying to keep it very simple.
One optimization would be to store the best combination to get to a given number in a hash table. The naive algorithm suffers from the same drawbacks as a recursive fibonacci solver in that it is constantly re-solving duplicate sub-problems.
I haven't actually run this:
#include <map>
#include <utility>
#include <vector>
using namespace std;
// (goal, starting position) -> number of values needed to sum to goal.
// The cache is global, so it is only valid for one list at a time.
map<pair<int, int>, int> cache;
// returns -1 on no result, >= 0 is found result
int find(int goal, const vector<int>& sorted_list, int starting_position = 0) {
    // recursive base case
    if (goal == 0) return 0;
    // check the cache so as not to re-compute solved sub-problems; the key
    // includes the starting position because the answer depends on which
    // elements are still available
    auto key = make_pair(goal, starting_position);
    auto cache_result = cache.find(key);
    if (cache_result != cache.end()) {
        return cache_result->second;
    }
    // find the best possibility
    int best_result = -1;
    for (int i = starting_position; i < static_cast<int>(sorted_list.size()); i++) {
        if (sorted_list[i] <= goal) {
            auto maybe_result = find(goal - sorted_list[i], sorted_list, i + 1);
            if (maybe_result >= 0 && (best_result < 0 || maybe_result + 1 < best_result)) {
                best_result = maybe_result + 1;
            }
        }
    }
    // cache the computed result so it can be re-used if needed
    cache[key] = best_result;
    return best_result;
}

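Try sorting it in ascending order and, while you iterate through it, keep a temporary sum to which you add each element until you reach the required sum. If adding an element would take you over the sum, skip that element and continue. Something like this:
int global_min = INT_MAX; // needs <climits>; INT_MAX means "nothing found yet"
for (i = 0; i < nr_elem; i++) {
    minimum = 0;
    temp_sum = 0;
    for (j = i; j < nr_elem; j++) { // each pass starts one element later
        if (temp_sum + elem[j] > req_sum)
            continue; // adding elem[j] would overshoot, so skip it
        temp_sum += elem[j];
        minimum += 1;
    }
    // keep the smallest count among the passes that hit the sum exactly
    if (temp_sum == req_sum && minimum < global_min)
        global_min = minimum;
}
Not the most elegant or the most efficient method, and because it is greedy it is not guaranteed to find the true minimum (or any solution at all), but it is simple to try.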

Next lexical "permutation" algorithm

I wrote a program that solves a generalized version of the 24 game. That is, given a set of n numbers, is there a way to perform binary operations on them such that they compute to a target number?
To do this, I viewed possible expressions as a char array consisting of either 'v' or 'o', where 'v' is a placeholder for a value and 'o' is a placeholder for an operation. Note that if there are n values, there must be n-1 operations.
The program currently checks every permutation of {'o','o',...,'v','v',...} in lexicographical order and tests whether the prefix expression is valid. For example, when n = 4, the following expressions are considered valid:
{'o','o','o','v','v','v','v'}
{'o','v','o','v','o','v','v'}
The following expressions are not valid:
{'v','o','v','o','o','v','v'}
{'o','o','v','v','v','o','v'}
My question is: does there exist an efficient algorithm to get the next permutation that is valid in some sort of ordering? The goal is to eliminate having to check whether an expression is valid for every permutation.
Moreover, if such an algorithm exists, is there an O(1)-time way to compute the kth valid permutation?
What I have so far
I hypothesize that a prefix expression A of length 2n-1 is valid if and only if
number of operations < number of values for each A[i:2n-1)
where 0<=i<2n-1 (the subarray starting at i and ending, non-inclusive, at 2n-1).
Moreover, that implies there are exactly (1/n)C(2n-2,n-1) valid permutations, where C(n,k) is n choose k.
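The hypothesized suffix condition can be checked in a single right-to-left pass. The sketch below is illustrative only: the name `isValidPrefixPattern` is made up, the pattern is taken as a `std::string` rather than a char array, and the final check that there is exactly one more value than operations is an added assumption (it always holds for permutations of n values and n-1 operations).

```cpp
#include <string>

// Sketch: true when every suffix of the pattern has more 'v' than 'o',
// and the whole pattern has exactly one more 'v' than 'o'.
bool isValidPrefixPattern(const std::string& pattern) {
    if (pattern.empty()) return false;
    int values = 0, operations = 0;
    // Scan from the last character to the first, so that after each step
    // the counters describe one suffix A[i:2n-1).
    for (auto it = pattern.rbegin(); it != pattern.rend(); ++it) {
        if (*it == 'v') ++values; else ++operations;
        if (operations >= values) return false;  // this suffix has too few values
    }
    return values == operations + 1;
}
```

Applied to the four examples from the question, it accepts "ooovvvv" and "ovovovv" and rejects "vovoovv" and "oovvvov".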

Here's how to generate the ov-patterns. The details behind the code below are in Knuth Volume 4A (or at least alluded to; I might have worked one of the exercises). You can use the existing permutation machinery to permute the values every which way before changing patterns.
The code
#include <cstdio>
namespace {
void FirstTree(int f[], int n) {
for (int i = n; i >= 0; i--) f[i] = 2 * i + 1;
}
bool NextTree(int f[], int n) {
int i = 0;
while (f[i] + 1 == f[i + 1]) i++;
f[i]++;
FirstTree(f, i - 1);
return i + 1 < n;
}
void PrintTree(int f[], int n) {
int i = 0;
for (int j = 0; j < 2 * n; j++) {
if (j == f[i]) {
std::putchar('v');
i++;
} else {
std::putchar('o');
}
}
std::putchar('v');
std::putchar('\n');
}
}
int main() {
constexpr int kN = 4;
int f[1 + kN];
FirstTree(f, kN);
do {
PrintTree(f, kN);
} while (NextTree(f, kN));
}
generates the output
ovovovovv
oovvovovv
ovoovvovv
oovovvovv
ooovvvovv
ovovoovvv
oovvoovvv
ovoovovvv
oovovovvv
ooovvovvv
ovooovvvv
oovoovvvv
ooovovvvv
oooovvvvv
There's a way to get the kth tree, but in time O(n) rather than O(1). The magic words are unranking binary trees.

What is the fastest way to find longest 'consecutive numbers' streak in vector ?

I have a sorted std::vector<int> and I would like to find the longest 'streak of consecutive numbers' in this vector and then return both the length of it and the smallest number in the streak.
To visualize it for you :
suppose we have :
1 3 4 5 6 8 9
I would like it to return: maxStreakLength = 4 and streakBase = 3
There might be occasion where there will be 2 streaks and we have to choose which one is longer.
What is the best (fastest) way to do this ? I have tried to implement this but I have problems with coping with more than one streak in the vector. Should I use temporary vectors and then compare their lengths?

No, you can do this in one pass through the vector, storing only the start point and length of the longest streak found so far. You also need far fewer than 'N' comparisons. [*]
Hint: if you already have, say, a match of length 4 ending at the 5th position (value 6), which position do you have to check next?
[*] Working out the likely O( ) complexity is left as an exercise to the reader ;-)
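One possible reading of the hint, sketched under the assumption that the vector is sorted ascending and contains no duplicates: once a streak of length best is known, a longer streak starting at index i must satisfy v[i + best] - v[i] == best, so that single element can be checked first and whole stretches can be skipped. The function name `longestStreak` and the return convention are illustrative choices, not from the original answer.

```cpp
#include <utility>
#include <vector>

// Sketch of the skipping idea. Assumes v is sorted ascending with no duplicates.
// Returns {maxStreakLength, streakBase}; {0, 0} for an empty vector.
std::pair<int, int> longestStreak(const std::vector<int>& v) {
    const int n = static_cast<int>(v.size());
    if (n == 0) return {0, 0};
    int best = 1, bestStart = 0;
    int i = 0;                       // always the first index of a run
    while (i + best < n) {
        if (v[i + best] - v[i] == best) {
            // v[i..i+best] is consecutive, so this streak beats the record;
            // extend it to its end.
            int j = i + best;
            while (j + 1 < n && v[j + 1] == v[j] + 1) ++j;
            best = j - i + 1;
            bestStart = i;
            i = j + 1;
        } else {
            // There is a gap somewhere in v[i..i+best]; walk back from the
            // far end to the last gap. No longer streak can start before it.
            int k = i + best;
            while (v[k] == v[k - 1] + 1) --k;
            i = k;
        }
    }
    return {best, v[bestStart]};
}
```

For the question's input 1 3 4 5 6 8 9 this gives a length of 4 and a base of 3. The worst case is still linear, for instance when the whole vector is one streak.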

It would be interesting to see if the fact that the array is sorted can be exploited somehow to improve the algorithm. The first thing that comes to mind is this: if you know that all numbers in the input array are unique, then for a range of elements [i, j] in the array, you can immediately tell whether elements in that range are consecutive or not, without actually looking through the range. If this relation holds
array[j] - array[i] == j - i
then you can immediately say that elements in that range are consecutive. This criterion, obviously, uses the fact that the array is sorted and that the numbers don't repeat.
Now, we just need to develop an algorithm which will take advantage of that criterion. Here's one possible recursive approach:
Input of recursive step is the range of elements [i, j]. Initially it is [0, n-1] - the whole array.
Apply the above criterion to range [i, j]. If the range turns out to be consecutive, there's no need to subdivide it further. Send the range to output (see below for further details).
Otherwise (if the range is not consecutive), divide it into two equal parts [i, m] and [m+1, j].
Recursively invoke the algorithm on the lower part ([i, m]) and then on the upper part ([m+1, j]).
The above algorithm performs a binary partition of the array and a recursive descent of the partition tree using the left-first approach. This means that it finds adjacent subranges with consecutive elements in left-to-right order. All you need to do is join the adjacent subranges together. When you receive a subrange [i, j] that was "sent to output" at step 2, you have to concatenate it with the previously received subranges if they are indeed consecutive, or start a new range if they are not. All the while you have to keep track of the "longest consecutive range" found so far.
That's it.
The benefit of this algorithm is that it detects subranges of consecutive elements "early", without looking inside these subranges. Obviously, its worst-case performance (if there are no consecutive subranges at all) is still O(n). In the best case, when the entire input array is consecutive, this algorithm will detect it instantly. (I'm still working on a meaningful O estimation for this algorithm.)
The usability of this algorithm is, again, undermined by the uniqueness requirement. I don't know whether it is something that is "given" in your case.
Anyway, here's a possible C++ implementation
#include <cassert>
#include <utility>
#include <vector>
typedef std::vector<int> vint;
typedef std::pair<vint::size_type, vint::size_type> range;
class longest_sequence
{
public:
const range& operator ()(const vint &v)
{
current = max = range(0, 0);
process_subrange(v, 0, v.size() - 1);
check_record();
return max;
}
private:
range current, max;
void process_subrange(const vint &v, vint::size_type i, vint::size_type j);
void check_record();
};
void longest_sequence::process_subrange(const vint &v,
vint::size_type i, vint::size_type j)
{
assert(i <= j && v[i] <= v[j]);
assert(i == 0 || i == current.second + 1);
if (v[j] - v[i] == j - i)
{ // Consecutive subrange found
assert(v[current.second] <= v[i]);
if (i == 0 || v[i] == v[current.second] + 1)
// Append to the current range
current.second = j;
else
{ // Range finished
// Check against the record
check_record();
// Start a new range
current = range(i, j);
}
}
else
{ // Subdivision and recursive calls
assert(i < j);
vint::size_type m = (i + j) / 2;
process_subrange(v, i, m);
process_subrange(v, m + 1, j);
}
}
void longest_sequence::check_record()
{
assert(current.second >= current.first);
if (current.second - current.first > max.second - max.first)
// We have a new record
max = current;
}
int main()
{
int a[] = { 1, 3, 4, 5, 6, 8, 9 };
std::vector<int> v(a, a + sizeof a / sizeof *a);
range r = longest_sequence()(v);
return 0;
}

I believe that this should do it?
size_t beginStreak = 0;
size_t streakLen = 1;
size_t longest = 0;
size_t longestStart = 0;
for (size_t i=1; i < vec.size(); i++) {
if (vec[i] == vec[i-1] + 1) {
streakLen++;
}
else {
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}
beginStreak = i;
streakLen = 1;
}
}
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}

You can't solve this problem in less than O(N) time. Imagine your list is the first N-1 even numbers, plus a single odd number (chosen from among the first N-1 odd numbers). Then there is a single streak of length 3 somewhere in the list, but worst case you need to scan the entire list to find it. Even on average you'll need to examine at least half of the list to find it.

Similar to Rodrigo's solutions but solving your example as well:
#include <vector>
#include <cstdio>
#define len(x) (sizeof(x) / sizeof((x)[0]))
using namespace std;
int nums[] = {1,3,4,5,6,8,9};
int streakBase = nums[0];
int maxStreakLength = 1;
void updateStreak(int currentStreakLength, int currentStreakBase) {
if (currentStreakLength > maxStreakLength) {
maxStreakLength = currentStreakLength;
streakBase = currentStreakBase;
}
}
int main(void) {
vector<int> v;
for(size_t i=0; i < len(nums); ++i)
v.push_back(nums[i]);
int lastBase = v[0], currentStreakBase = v[0], currentStreakLength = 1;
for(size_t i=1; i < v.size(); ++i) {
if (v[i] == lastBase + 1) {
currentStreakLength++;
lastBase = v[i];
} else {
updateStreak(currentStreakLength, currentStreakBase);
currentStreakBase = v[i];
lastBase = v[i];
currentStreakLength = 1;
}
}
updateStreak(currentStreakLength, currentStreakBase);
printf("maxStreakLength = %d and streakBase = %d\n", maxStreakLength, streakBase);
return 0;
}
