Binary search with returned index in STL? - c++

I need a binary search function.
I couldn't find any function in the standard library that will return the index of the found item, and if it wasn't found, will return the bitwise complement of the index of the next element that is larger than the item I looked for.
What is the function I am looking for?
Edit:
I need to insert an item into a sorted vector and keep it sorted. That's why I need the bitwise-complemented index.

I'm quite certain the standard library doesn't include anything to do precisely what you're asking for.
To get what you want, you'll probably want to start from std::lower_bound or std::upper_bound, and convert the iterator it returns into an index, then complement the index if the value wasn't found.
lower_bound will find the position of the first item with that value (if any).
upper_bound will find the position just past the last item with that value (again, if any).
Both will return an iterator to the next larger item if the specified value isn't present (or .end() if there is no larger item).
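A minimal sketch of that approach, assuming a std::vector<int>. The name find_or_complement is hypothetical and not a standard library function; it follows the convention the question describes, returning the index when the value is found and otherwise the bitwise complement of the index of the next larger element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the standard library.
// Returns the index of the first element equal to `value`. If there is none,
// returns ~i (always negative), where i is the index at which `value` could
// be inserted while keeping `v` sorted.
std::ptrdiff_t find_or_complement(const std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));  // precondition: v is sorted
    auto it = std::lower_bound(v.begin(), v.end(), value);
    std::ptrdiff_t index = it - v.begin();
    return (it != v.end() && *it == value) ? index : ~index;
}
```

With v = {1, 3, 5, 7}, searching for 5 gives 2 and searching for 4 gives ~2. Whenever the result is negative, the caller recovers the insertion point by complementing it again.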

As far as I know, there is no simple STL function that returns an index into a sorted vector. However, you can use the sample function below:
/**
* @param v - sorted vector instance
* @param data - value to search
* @return 0-based index if data found, -1 otherwise
*/
int binary_search_find_index(const std::vector<int>& v, int data) {
    auto it = std::lower_bound(v.begin(), v.end(), data);
    if (it == v.end() || *it != data) {
        return -1;
    }
    // std::distance yields a signed difference; narrow it explicitly to int
    return static_cast<int>(std::distance(v.begin(), it));
}

This code should work fine
auto itr = lower_bound(v.begin(), v.end(), key) ;
index = distance(v.begin(), itr);
More about std::lower_bound() -
https://www.geeksforgeeks.org/stdlower_bound-in-c/

Clearly, this "will return the bitwise complement" is a big deal for you and I do not understand what you mean. That said, lookup std::upper_bound and see if it does what you want.

using STL we can find the index
vector<int> vec{1,2,3,4,5,6,7,8,9} ;
vector<int> :: iterator index;
index=lower_bound(vec.begin(),vec.end(),search_data);
return (index-vec.begin());

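// Returns the 1-based position of val in the sorted range, or 0 if it is absent.
template <class ForwardIterator, class T>
int bin_search(ForwardIterator first, ForwardIterator last, const T& val)
{
    ForwardIterator low = std::lower_bound(first, last, val);
    if (low != last && !(val < *low)) {
        // std::distance also works for iterators that do not support operator-
        return static_cast<int>(std::distance(first, low)) + 1;
    } else {
        return 0;
    }
}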

int a = 0, b = n-1;
while (a <= b) {
    int k = a + (b-a)/2; // same midpoint as (a+b)/2, but cannot overflow
    if (array[k] == x)
    {
        // x found at index k
        break;
    }
    if (array[k] < x) a = k+1;
    else b = k-1;
}

Using Jonathan's answer, a one line function, for the case where we want to know at which index the value is or should have been if it does not exist:
int searchInsert(vector<int>& nums, int target) {
return lower_bound(nums.begin(), nums.end(), target) - nums.begin();
}
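If the end goal is the sorted insert mentioned in the question, the index is not strictly needed, because the iterator from std::lower_bound can be passed straight to insert. A small sketch, with insert_sorted as a hypothetical name:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: inserts `value` into an already sorted vector and
// returns the index at which it was placed.
std::size_t insert_sorted(std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));  // precondition: v is sorted
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    auto inserted = v.insert(pos, value);  // iterator to the new element
    return static_cast<std::size_t>(inserted - v.begin());
}
```

The search is logarithmic, but the insert itself still shifts the following elements, so each call is linear in the size of the vector.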

I was working on a similar problem where I needed to insert an element in a vector while keeping it sorted. The solution I came up with is:
The function (a modified binary search) returns the position where the value should be inserted.
int Position(const vector<int>& arr, size_t size, int val)
{
    int start = 0, end = (int)size - 1;
    while (start <= end) // compare the indices, not the elements they refer to
    {
        int middle = start + (end - start) / 2;
        if (arr[middle] < val)
        {
            start = middle + 1;
        }
        else if (arr[middle] > val)
        {
            end = middle - 1;
        }
        else // arr[middle] == val
        {
            return middle;
        }
    }
    // Not found: start is the index of the first element greater than val.
    // Inserting here would need a non-const reference to the vector, so the caller does it.
    return start;
}
int main()
{
    vector<int> x; int val; cin >> val;
    int mid = Position(x, x.size(), val);
    x.insert(x.begin() + mid, val);
}

Related

Converting an iterative function to a recursive function without changing parameters

I want to convert the iterative template function getSmallest into a recursive function without changing anything in main (no changing function parameters etc.) because in class we are being taught to always keep the public interfaces of our functions the same (so if we work in a big project, we don't start changing things that break the whole program)
Here's the program I wish to convert:
// PRE: 0 <= start < end <= length of arr
// PARAM: arr = array of integers
// start = start index of sub-array
// end = end index of sub-array + 1
// POST: returns index of smallest value in arr{start:end}
template <class T>
int getSmallest(T arr[], int start, int end) {
int smallest = start;
for (int i = start + 1; i < end; ++i) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
}
return smallest;
}
I have spent the last few hours scouring class notes, the internet and stackoverflow for any help but it all seems not related to my problem, so I am asking here.
Here is the best attempt I came up with:
template <class T>
int getSmallest(T arr[], int start, int end)
{
int smallest = 0; //this should only run on the first recursion, not the rest
i = start;
if (i == end-1)
{
return smallest;
}
else
{
if (arr[i] < arr[smallest])
{
smallest = i;
}
return getSmallest<T>(arr, i, end);
}
}
I can't seem to make int smallest = 0; only run on the first recursion and while my program compiles, it is functionally useless.
Any help would be appreciated. Thanks!
Rather than a recursive function of depth O(N) (as in OP's attempt), how about a recursive function of depth O(log(N))?
Divide the array in 2 each recursion.
template <class T>
int getSmallest(T arr[], int start, int end) {
    if (start + 1 == end) {
        return start;
    }
    int mid = start + (end-start)/2; // mid = (start + end)/2 may overflow
    int left = getSmallest(arr, start, mid); // left half [start, mid)
    int right = getSmallest(arr, mid, end);  // right half [mid, end)
    return arr[right] < arr[left] ? right : left; // on ties keep the earlier index, like the loop
}
There are three conditions:
The array has no elements.
The array has 1 element.
The array has more than 1 element
For the first condition we can return -1; for the second we return the index of the only element; and for the third we compare the first element with the smallest element found in the rest of the array, and return the index of the smaller of the two:
template<class T>
int getSmallest(T arr[], int start, int end) {
if (end - start == 0)
return -1;
if (end - start == 1)
return start;
int idx = getSmallest(arr, start + 1, end);
if (arr[start] < arr[idx])
return start;
return idx;
}
The general way to transform an iterative function is to replace each loop with a helper function. For example:
for (int i = start + 1; i < end; i++) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
}
We first rewrite this into a while loop:
int i = start + 1;
while (i < end) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
i++;
}
We then ask ourselves "what variables defined before the loop do we use in the loop?" Answer: arr, i, end, and smallest. These become the inputs of our function.
We then ask ourselves: what state was modified in the loop that we later use outside the loop? Answer: the value of smallest. This becomes our output.
We thus write the following helper function:
int helper_function(int i, int end, T arr[], int smallest) {
if (i < end) {
if (arr[i] < arr[smallest]) {
smallest = i;
}
i = i + 1;
return helper_function(i, end, arr, smallest);
} else {
return smallest;
}
}
which can be rewritten to
int helper_function(int i, int end, T arr[], int smallest) {
return i < end ? helper_function(i + 1,
end,
arr,
arr[i] < arr[smallest] ? i : smallest)
: smallest;
}
And replace the loop with
smallest = helper_function(i, end, arr, smallest);
So the code at this stage is
int helper_function(int i, int end, T arr[], int smallest) {
return i < end ? helper_function(i + 1,
end,
arr,
arr[i] < arr[smallest] ? i : smallest)
: smallest;
}
int get_smallest(T arr[], int start, int end) {
int smallest = start;
int i = start + 1;
smallest = helper_function(i, end, arr, smallest);
return smallest;
}
And of course we can dramatically simplify get_smallest - we don't need to redefine smallest and then return it, for example. So we get the following code for get_smallest:
int get_smallest(T arr[], int start, int end) {
return helper_function(start + 1, end, arr, start);
}
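Put together as a compilable sketch that keeps the question's getSmallest interface unchanged. The template headers and the name getSmallestHelper are additions for illustration; the snippets above leave them out.

```cpp
#include <cassert>

// Recursive replacement for the loop: `smallest` carries the loop state
// from one call to the next.
template <class T>
int getSmallestHelper(int i, int end, T arr[], int smallest) {
    return i < end
        ? getSmallestHelper(i + 1, end, arr, arr[i] < arr[smallest] ? i : smallest)
        : smallest;
}

// Same signature as the iterative original, so callers do not change.
template <class T>
int getSmallest(T arr[], int start, int end) {
    assert(0 <= start && start < end);  // the original's stated precondition
    return getSmallestHelper(start + 1, end, arr, start);
}
```

Like the loop it replaces, this returns the index of the first occurrence when the smallest value appears more than once.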
What do you do if there's more than one level of loop nesting? Get rid of the loops one at a time starting from the outermost loop.
Note that in this special case, it's possible to solve the problem with a different recursive algorithm. That algorithm would look like this:
int get_smallest(T arr[], int start, int end) {
if (start + 1 < end) {
int result = get_smallest(arr, start + 1, end);
return arr[result] < arr[start] ? result : start;
} else {
return start;
}
}
This is possible because of the fact that the min function is associative.
Before there were lambda functions in C++ (c++11 and newer has them), there was no other way than creating helper functions with an extra argument for the state (e.g. the accumulator or the smallest value or whatever is wanted). Other languages including old and dusty Pascal have nested functions.
template <class T>
T smallest(const T* first, const T* last) {
// iterative implementation
}
turns, with the help of lambda functions into:
template <class T>
T smallest(const T* first, const T* last) {
    T result = 0; // dubious in itself, because we do not want to assume too much about what T actually is.
    // A lambda cannot refer to itself by name, so the recursion goes through std::function (from <functional>).
    std::function<void(const T*)> loop = [&result, &loop, last] (const T* current) {
        if (current < last) {
            if (*current < result) {
                result = *current;
            }
            loop(current + 1);
        }
    };
    loop(first);
    return result;
}
Because the implementation is only truly generic if we do not assume too much about T, the line T result = 0; is already too much: we would need some generic 0 for every conceivable type. And even that would be wrong, because for an array containing only positive numbers the function would return 0. The search would have to start from the largest possible value of T, not from 0, for the whole thing to work correctly.
What we also silently assumed is that there is an operator<(T x, T y) defined for T and that it is the one we want (imagine we want to use it for T = std::string - case sensitive comparison? utf8 aware? case insensitive?).
So, even though changing interfaces in code is often not a good idea, sometimes there simply should be an improvement. Because, let's admit it, the design of this function is bad: it tries to sell us more genericity than it can truly grant.
template <class T>
const T* smallest(const T* first, const T* last) {
    if (first == last)
        return last;
    // std::function (from <functional>) lets the lambda call itself; the pointers stay const T* throughout.
    std::function<const T*(const T*, const T*, const T*)> loop =
        [&loop] (const T* first, const T* last, const T* smallest) -> const T* {
            if (first != last) {
                if (*first < *smallest) {
                    return loop(first + 1, last, first);
                } else {
                    return loop(first + 1, last, smallest);
                }
            } else {
                return smallest;
            }
        };
    return loop(first + 1, last, first);
}
This is an improvement, as it only assumes an operator<(...) and does not need any special starting value of T. It also shows more clearly how the "state" of the recursion is passed along (which makes this implementation tail recursive). With luck, the C++ compiler optimizes the tail call and avoids unnecessary stack usage.
The next level of improvement would be to add an extra argument, which allows passing a compare function to smallest. And because we then as well could pass a "greater than" to this function, we could try and find a more general name.
template <class T>
T* selectLast( const T* first, const T* last, bool (*predicate)(const T*, const T*)) {
// ...
}
or
template <class T, class Pred>
T* selectLast( const T* first, const T* last, Pred predicate) {
// ...
}
Or we could generalize even more (since we are already in the realm of higher-order functions) and just offer a reduce/fold function. And as such a function already exists, we would use std::reduce() (available since C++17) from the C++ standard library header file <numeric>.
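As a sketch of that last step, the search for the smallest value can be written as a fold. std::accumulate from <numeric> is used here as the sequential fold; std::reduce takes the same four arguments. The function name smallest_value and the default std::less comparison are assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <numeric>

// Folds the range [first, last) down to its smallest element.
// The first element seeds the fold, so no "neutral" value of T is needed.
template <class T, class Compare = std::less<T>>
T smallest_value(const T* first, const T* last, Compare less = Compare()) {
    assert(first != last);  // an empty range has no smallest element
    return std::accumulate(first + 1, last, *first,
                           [&less](const T& best, const T& x) {
                               return less(x, best) ? x : best;
                           });
}
```

Passing std::greater<int>() as the third argument selects the largest element instead. For this particular fold the standard library also offers std::min_element in <algorithm>, which returns an iterator rather than a value.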

Maintain an unordered_map but at the same time need the lowest of its mapped values at every step

I have an unordered_map<int, int> which is updated at every step of a for loop. But at the end of the loop, I also need the lowest of the mapped values. Traversing it to find the minimum in O(n) is too slow. I know there exists MultiIndex container in boost but I can't use boost. What is the simplest way it can be done using only STL?
Question:
Given an array A of positive integers, call a (contiguous, not
necessarily distinct) subarray of A good if the number of different
integers in that subarray is exactly K.
(For example, [1,2,3,1,2] has 3 different integers: 1, 2, and 3.)
Return the number of good subarrays of A.
My code:
class Solution {
public:
int subarraysWithKDistinct(vector<int>& A, int K) {
int left, right;
unordered_map<int, int> M;
for (left = right = 0; right < A.size() && M.size() < K; ++right)
M[A[right]] = right;
if (right == A.size())
return 0;
int smallest, count;
smallest = numeric_limits<int>::max();
for (auto p : M)
smallest = min(smallest, p.second);
count = smallest - left + 1;
for (; right < A.size(); ++right)
{
M[A[right]] = right;
while (M.size() > K)
{
if (M[A[left]] == left)
M.erase(A[left]);
++left;
}
smallest = numeric_limits<int>::max();
for (auto p : M)
smallest = min(smallest, p.second);
count += smallest - left + 1;
}
return count;
}
};
Link to the question: https://leetcode.com/problems/subarrays-with-k-different-integers/
O(n) is not slow, in fact it is the theoretically fastest possible way to find the minimum, as it's obviously not possible to find the minimum of n items without actually considering each of them.
You could update the minimum during the loop, which is trivial if the loop only adds new items to the map but becomes much harder if the loop may change existing items (and may increase the value of the until-then minimum item!), but ultimately, this also adds O(n) amount of work, or more, so complexity-wise, it's not different from doing an extra loop at the end (obviously, the constant can be different - the extra loop may be slower than reusing the original loop, but the complexity is the same).
As you said, there are data structures that make it more efficient (O(log n) or even O(1)) to retrieve the minimum item, but at the cost of increased complexity to maintain this data structure during insertion. These data structures only make sense if you frequently need to check the minimum item while inserting or changing items, not if you need to know the minimum only at the end of the loop, as you described.
I made a simple class to make it work although it's far from perfect, it's good enough for the above linked question.
class BiMap
{
public:
void insert(int key, int value)
{
auto itr = M.find(key);
if (itr == M.cend())
M.emplace(key, S.insert(value).first);
else
{
S.erase(itr->second);
M[key] = S.insert(value).first;
}
}
void erase(int key)
{
auto itr = M.find(key);
S.erase(itr->second);
M.erase(itr);
}
int operator[] (int key)
{
return *M.find(key)->second;
}
int size()
{
return M.size();
}
int minimum()
{
return *S.cbegin();
}
private:
unordered_map<int, set<int>::const_iterator> M;
set<int> S;
};
class Solution {
public:
int subarraysWithKDistinct(vector<int>& A, int K) {
int left, right;
BiMap M;
for (left = right = 0; right < A.size() && M.size() < K; ++right)
M.insert(A[right], right);
if (right == A.size())
return 0;
int count = M.minimum() - left + 1;
for (; right < A.size(); ++right)
{
M.insert(A[right], right);
while (M.size() > K)
{
if (M[A[left]] == left)
M.erase(A[left]);
++left;
}
count += M.minimum() - left + 1;
}
return count;
}
};
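The BiMap above keeps the mapped values in a std::set, so it relies on no two keys sharing a value, which holds here because the values are array indices. A sketch of a variant without that restriction, using std::multiset; the class name MinTrackingMap and its members are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <unordered_map>

// Hypothetical key -> value map that also reports the smallest mapped value.
class MinTrackingMap {
public:
    // Inserts the key or overwrites its value: O(log n).
    void insert(int key, int value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            values_.erase(it->second);           // erasing by iterator removes one copy only
            it->second = values_.insert(value);
        } else {
            index_.emplace(key, values_.insert(value));
        }
    }
    // Removes the key if present: O(log n).
    void erase(int key) {
        auto it = index_.find(key);
        if (it == index_.end()) return;
        values_.erase(it->second);
        index_.erase(it);
    }
    std::size_t size() const { return index_.size(); }
    // Smallest mapped value: O(1).
    int minimum() const {
        assert(!values_.empty());
        return *values_.begin();
    }
private:
    std::multiset<int> values_;                                    // all mapped values, ordered
    std::unordered_map<int, std::multiset<int>::iterator> index_;  // key -> its entry in values_
};
```

Multiset iterators stay valid when other elements are inserted or erased, which is what makes storing them in the hash map safe.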

Increasing Sequence C++

I try to solve this challenge on CodeFights, but, it doesn't work. My best solution got 25/26 (time limit exceeded on the last test) but I deleted that because I tried it yesterday (it was O(n^2)). Now I tried a new one in O(n). I am very tired and I really want to get this done today, so please help me.
Here are the statements:
Given a sequence of integers as an array, determine whether it is possible to obtain a strictly increasing sequence by removing no more than one element from the array.
Example
For sequence = [1, 3, 2, 1], the output should be
almostIncreasingSequence(sequence) = false;
There is no one element in this array that can be removed in order to get a strictly increasing sequence.
For sequence = [1, 3, 2], the output should be
almostIncreasingSequence(sequence) = true.
You can remove 3 from the array to get the strictly increasing sequence [1, 2]. Alternately, you can remove 2 to get the strictly increasing sequence [1, 3].
And here is my code until now... (poor code):
#include <iostream>
#include <vector>
#include <algorithm>
bool almostIncreasingSequence(std::vector<int> sequence)
{
int count = 0;
for(int i = 0; i < sequence.size()-1; i++)
{
if(sequence[i] > sequence[i+1])
{
count++;
sequence.erase(sequence.begin(), sequence.begin() + i);
i--;
}
if(count == 2)
return false;
}
return true;
}
int main()
{
std::cout << std::endl;
return 0;
}
Here is a C++11 solution with O(N) runtime:
constexpr auto Max = std::numeric_limits<std::size_t>::max();
bool is_sorted_but_skip(const std::vector<int>& vec, std::size_t index = Max){
auto Start = index == 0 ? 1 : 0;
auto prev = vec[Start];
for(std::size_t i = Start + 1; i < vec.size(); i++){
if(i == index) continue;
if(prev >= vec[i]) return false;
prev = vec[i];
}
return true;
}
bool almostIncreasingSequence(std::vector<int> v)
{
auto iter = std::adjacent_find(v.begin(), v.end(), [](int L, int R){ return L >= R; });
if(is_sorted_but_skip(v, std::distance(v.begin(), iter)))
return true;
return is_sorted_but_skip(v, std::distance(v.begin(), std::next(iter)));
}
We use std::adjacent_find to find the first element that is greater than or equal to its successor; iter points to it. Then we check that the sequence is strictly sorted while skipping iter's position.
If it is not, we check that the sequence is strictly sorted while skipping the position iter+1.
Worst-case complexity: three linear scans.
Here's a hint (well, almost a solution really):
If you see a decrease between one element to the next, then you have to remove one of them (*).
Now, what if you find two decreases, between two disjoint pairs of elements? That's right :-)
Keeping that in mind, you should be able to solve your problem using a linear scan and a bit of constant-time work.
(*) excluding the first and the last pair of elements.
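One way the hint could be turned into code, as a sketch rather than the answerer's own solution: count the decreases in a single pass, and at the first one check whether dropping either member of the pair would repair the order. The function name almost_increasing is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Returns true if removing at most one element leaves a strictly increasing sequence.
bool almost_increasing(const std::vector<int>& s) {
    int decreases = 0;
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (s[i - 1] >= s[i]) {
            if (++decreases > 1) return false;  // two decreases cannot be fixed by one removal
            // Dropping s[i-1] works if s[i] still exceeds the element before the pair.
            bool can_drop_prev = (i < 2) || s[i - 2] < s[i];
            // Dropping s[i] works if the element after the pair still exceeds s[i-1].
            bool can_drop_cur = (i + 1 >= s.size()) || s[i - 1] < s[i + 1];
            if (!can_drop_prev && !can_drop_cur) return false;
        }
    }
    return true;
}
```

Nothing is erased from the vector, so the whole check is a single O(N) scan with constant extra space.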
This is still O(N^2), because you delete the first element of the vector in each iteration. Don't delete the first element and don't i-- in the loop.
If you must erase the numbers (you don't, but still), at least do it from the end of the list. That way erasing a number is probably an O(1) operation (I'm not 100% sure that's how std::vector is implemented).
You really don't have to erase the numbers.
#include<iostream>
#include<vector>
using namespace std;
bool almostIncreasingSequence( vector<int> sequence ); // return type must match the definition below
int main(){
int array[] = {40, 50, 60, 10, 20, 30};
std::vector<int> vect (array, array + sizeof(array) / sizeof(int) );
bool ret = almostIncreasingSequence(vect);
if( ret ){
std::cout<<"Array is strictly increasing.";
}
else{
std::cout<<"Array is not strictly increasing.";
}
return 0;
}
bool almostIncreasingSequence(std::vector<int> sequence) {
int val = 0;
int currentBig = sequence.at(0);
for (int i = 1; i < sequence.size(); i++){
if( currentBig < sequence.at(i))
{
currentBig = sequence.at(i);
}
else{
val++;
if( val>1)
{
return false;
}
if( i > 1 ){
if (sequence.at(i) > sequence.at(i-2)){
if( currentBig < sequence.at(i) ){
}
else{
currentBig = sequence.at(i);
}
}
}
else{
currentBig = sequence.at(i);
}
}
}
return true;
}

How to find the minimal missing integer in a list in an STL way

I want to find the minimal missing positive integer in a given list. That is, given a list of positive integers (i.e. larger than 0), possibly with duplicates, how do I find the smallest of the integers that are missing from it?
There is always at least one missing element from the sequence.
For example given
std::vector<int> S={9,2,1,10};
The answer should be 3, because the missing integers are 3,4,5,6,7,8,11,... and the minimum is 3.
I have come up with this:
int min_missing( std::vector<int> & S)
{
int max = *std::max_element(S.begin(), S.end());
int min = *std::min_element(S.begin(), S.end());
int i = min;
for(; i!=max and std::find(S.begin(), S.end(), i) != S.end() ; ++i);
return i;
}
This is O(n*m) in time, where m = max - min (each std::find is a linear scan), but I cannot figure out whether there is a more efficient C++ STL way to do this.
This is not an exercise but I am doing a set of problems for self-improvement , and I have found this to be a very interesting problem. I am interested to see how I can improve this.
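You could use std::sort, and then use std::adjacent_find with a custom predicate.
int f(std::vector<int> v)
{
    if (v.empty())
        return 1; // nothing is present, so 1 is missing
    std::sort(v.begin(), v.end());
    if (v.front() > 1)
        return 1;
    // y > x+1 rather than y != x+1, so that duplicates are not mistaken for gaps
    auto i = std::adjacent_find( v.begin(), v.end(), [](int x, int y)
    {
        return y > x+1;
    } );
    if (i != v.end())
    {
        return *i + 1;
    }
    return v.back() + 1; // no gap: the answer is one past the largest element
}
When no gap exists, the function returns one past the largest element, and for an empty vector it returns 1.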
Find the first missing positive in O(n) time and constant extra space.
Basically, when you read a value a, swap it into the slot where it belongs, A[a-1]; for example, 2 should end up in A[1].
class Solution {
public:
/**
* @param A: a vector of integers
* @return: an integer
*/
int firstMissingPositive(vector<int> A) {
// write your code here
int n = A.size();
for(int i=0;i<n;)
{
if(A[i]==i+1)
i++;
else
{
if(A[i]>=1&&A[i]<=n&& A[A[i]-1]!=A[i])
swap(A[i],A[A[i]-1]);
else
i++;
}
}
for(int i=0;i<n;i++)
if(A[i]!=i+1)
return i+1;
return n+1;
}
};
Assuming the data are sorted first:
auto missing_data = std::mismatch(S.cbegin(), S.cend()-1, S.cbegin() + 1,
[](int x, int y) { return (x+1) == y;});
EDIT
As your input data are not sorted, the simplest solution is to sort them first:
std::vector<int> data(S.size());
std::partial_sort_copy (S.cbegin(), S.cend(), data.begin(), data.end());
auto missing_data = std::mismatch (data.cbegin(), data.cend()-1, data.cbegin()+1,
[](int x, int y) { return (x+1) == y;});
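The std::mismatch call returns a pair of iterators rather than the missing value itself. A sketch of how it could be wrapped up, additionally removing duplicates with std::unique (the question allows them in the input) and treating a missing 1 separately; the function name min_missing_sorted is hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Smallest positive integer that does not occur in `data`.
int min_missing_sorted(std::vector<int> data) {  // taken by value: a copy is sorted
    std::sort(data.begin(), data.end());
    data.erase(std::unique(data.begin(), data.end()), data.end());
    if (data.empty() || data.front() > 1) return 1;
    assert(data.front() == 1);  // assumption from the question: positive integers only
    // Compare every element with its successor; stop at the first pair that is not consecutive.
    auto gap = std::mismatch(data.cbegin(), data.cend() - 1, data.cbegin() + 1,
                             [](int x, int y) { return x + 1 == y; });
    // gap.first points at the element before the gap, or at the last element if there is no gap.
    return *gap.first + 1;
}
```

For the question's example {9, 2, 1, 10} this yields 3. The sort dominates the cost, so the function is O(n log n).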
You can use the algorithms of the C++ standard library in your code.
#include <algorithm> // std::sort
Using std::sort from <algorithm>:
std::vector<int> v={9,2,5,1,3};
std::sort(v.begin(),v.end());
std::cout << v[0]; // prints the smallest element, 1
I hope I understood what you are looking for.
You can do this by keeping a counter that holds the smallest integer not yet seen, together with a set of the values seen so far that are larger than that counter. Whenever an input value equals the counter, advance the counter, removing elements from the front of the set for as long as they match it, until the counter reaches a missing integer.
Please see below for implementation.
template<typename I> typename I::value_type solver(I b, I e)
{
constexpr typename I::value_type maxseen=
std::numeric_limits<typename I::value_type>::max();
std::set<typename I::value_type> seen{maxseen};
typename I::value_type minnotseen(1);
for(I p=b; p!=e;++p)
{
if(*p == minnotseen)
{
while(++minnotseen == *seen.begin())
{
seen.erase(seen.begin());
}
} else if( *p > minnotseen)
{
seen.insert(*p);
}
}
return minnotseen;
}
In case your sequence is in a vector, you would call it like this:
solver(sequence.begin(),sequence.end());
The algorithm uses a counter, the set of pending larger values, and a few iterators. If K is the largest number of values the set holds at any time, it takes O(N log K) time and O(K) space.
Complexity (order of growth): the answer assumes that this set stays of constant size as the input grows, i.e. K = 1, which gives O(N) time and O(1) space. In the worst case, for example an input in which the value 1 comes last, the set holds nearly all N values, which gives O(N log N) time and O(N) space. (see comments)

Find which number appears most in a vector

I have some numbers stored in a std::vector<int>. I want to find which number appears most in the vector.
e.g. in the vector
1 3 4 3 4 2 1 3 2 3
the element that occurs the most is 3.
Is there any algorithm (STL or whatever) that does this ?
Sort it, then iterate through it and keep a counter that you increment when the current number is the same as the previous number and reset to 0 otherwise. Also keep track of what was the highest value of the counter thus far and what the current number was when that value was reached. This solution is O(n log n) (because of the sort).
Alternatively you can use a hashmap from int to int (or if you know the numbers are within a limited range, you could just use an array) and iterate over the vector, increasing the_hashmap[current_number] by 1 for each number. Afterwards iterate through the hashmap to find its largest value (and the key belonging to it). This requires a hashmap data structure though (unless you can use arrays, which will also be faster); since C++11 the standard library provides one as std::unordered_map.
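A sketch of the hashmap variant, assuming C++11 so that std::unordered_map is available. The function name most_frequent is hypothetical, and which value wins a tie is left unspecified because the iteration order of the map is unspecified.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Returns a value that occurs most often in `v`.
int most_frequent(const std::vector<int>& v) {
    assert(!v.empty());  // an empty vector has no most frequent value
    // First pass: count the occurrences of every value.
    std::unordered_map<int, int> counts;
    for (int x : v) ++counts[x];
    // Second pass: pick the entry with the largest count.
    auto best = std::max_element(
        counts.begin(), counts.end(),
        [](const std::pair<const int, int>& a, const std::pair<const int, int>& b) {
            return a.second < b.second;
        });
    return best->first;
}
```

Both passes are linear on average, so this runs in expected O(n) time at the cost of one map entry per distinct value.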
If you want to avoid sorting your vector v, use a map:
int max = 0;
int most_common = -1;
map<int,int> m;
for (auto vi = v.begin(); vi != v.end(); vi++) {
m[*vi]++;
if (m[*vi] > max) {
max = m[*vi];
most_common = *vi;
}
}
This requires more memory and has a very similar expected runtime. The memory required should be on the order of a full vector copy, less if there are many duplicate entries.
Try this
int FindMode(vector<int> value)
{
int index = 0;
int highest = 0;
for (unsigned int a = 0; a < value.size(); a++)
{
int count = 1;
int Position = value.at(a);
for (unsigned int b = a + 1; b < value.size(); b++)
{
if (value.at(b) == Position)
{
count++;
}
}
if (count >= index)
{
index = count;
highest = Position;
}
}
return highest;
}
This is how i did it:
int max=0,mostvalue=a[0];
for(std::size_t i=0;i<a.size();i++)
{
    int co = (int)count(a.begin(), a.end(), a[i]);
    if(co > max)
    {
        max = co;
        mostvalue = a[i];
    }
}
I just don't know how fast it is, i.e. O() ? If someone could calculate it and post it here that would be fine.
Here is a generic solution for finding the most common element in an iterator range. With std::map it runs in O(n log n); swapping in std::unordered_map would make it O(n) on average. You use it simply by doing:
int commonest = most_common(my_vector.begin(), my_vector.end());
The value type is extracted from the iterator using iterator_traits<>.
template<class InputIt, class T = typename std::iterator_traits<InputIt>::value_type>
T most_common(InputIt begin, InputIt end)
{
std::map<T, int> counts;
for (InputIt it = begin; it != end; ++it) {
if (counts.find(*it) != counts.end()) {
++counts[*it];
}
else {
counts[*it] = 1;
}
}
return std::max_element(counts.begin(), counts.end(),
[] (const std::pair<T, int>& pair1, const std::pair<T, int>& pair2) {
return pair1.second < pair2.second;})->first;
}