Find First Missing Element in a vector - c++

This question has been asked before, but I cannot find it for C++.
If I have a vector and a starting number, does <algorithm> provide a way to find the next highest missing number?
I can obviously write this as a nested loop; I just can't shake the feeling that I'm reinventing the wheel.
For example, given: vector<int> foo{13,8,3,6,10,1,7,0};
The starting number 0 should find 2.
The starting number 6 should find 9.
The starting number -2 should find -1.
EDIT:
Thus far all the solutions require sorting. This may in fact be required, but a temporary sorted vector would have to be created to accommodate this, as foo must remain unchanged.

At least as far as I know, there's no standard algorithm that directly implements exactly what you're asking for.
If you wanted to do it with something like O(N log N) complexity, you could start by sorting the input. Then use std::upper_bound to find the position just past the (last instance of the) number you've asked for, if present. From there, you'd scan for the first pair of consecutive elements in the collection that differ by more than 1.
One way to do this in real code would be something like this:
#include <iostream>
#include <algorithm>
#include <vector>
#include <numeric>
#include <iterator>
int find_missing(std::vector<int> x, int number) {
    std::sort(x.begin(), x.end());
    auto pos = std::upper_bound(x.begin(), x.end(), number);
    // Either nothing greater than number is stored, or the next stored value leaves a gap.
    if (pos == x.end() || *pos - number > 1)
        return number + 1;
    else {
        std::vector<int> diffs;
        std::adjacent_difference(pos, x.end(), std::back_inserter(diffs));
        auto pos2 = std::find_if(diffs.begin() + 1, diffs.end(), [](int x) { return x > 1; });
        return *(pos + (pos2 - diffs.begin() - 1)) + 1;
    }
}
int main() {
std::vector<int> x{ 13, 8, 3, 6, 10, 1,7, 0};
std::cout << find_missing(x, 0) << "\n";
std::cout << find_missing(x, 6) << "\n";
}
This is somewhat less than what you'd normally think of as optimal to provide the external appearance of a vector that can/does remain un-sorted (and unmodified in any way). I've done that by creating a copy of the vector, and sorting the copy inside the find_missing function. Thus, the original vector remains unmodified. The disadvantage is obvious: if the vector is large, copying it can/will be expensive. Furthermore, this ends up sorting the vector for every query instead of sorting once, then carrying out as many queries as desired on it.
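If many queries are expected, the copy-and-sort cost can be paid once instead of on every call. The following is only a minimal sketch of that idea, not part of the code above: the class name MissingFinder and its member next_missing are made up for illustration, and it assumes int values and that number + 1 does not overflow.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: keeps a sorted, duplicate-free copy of the input,
// so the caller's vector stays untouched and queries do not re-sort.
class MissingFinder {
public:
    explicit MissingFinder(std::vector<int> values) : sorted_(std::move(values)) {
        std::sort(sorted_.begin(), sorted_.end());
        sorted_.erase(std::unique(sorted_.begin(), sorted_.end()), sorted_.end());
    }

    // Smallest value greater than number that is not stored.
    int next_missing(int number) const {
        auto pos = std::upper_bound(sorted_.begin(), sorted_.end(), number);
        int candidate = number + 1;
        // Walk the run of consecutive stored values that starts at candidate.
        while (pos != sorted_.end() && *pos == candidate) {
            ++pos;
            ++candidate;
        }
        return candidate;
    }

private:
    std::vector<int> sorted_;
};
```

Each query then costs one binary search plus the length of the consecutive run it has to walk.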

So I thought I'd post an answer. I don't know of anything in <algorithm> that accomplishes this directly, but in combination with vector<bool> you can do this in O(N) time (two linear passes).
template <typename T>
T find_missing(const std::vector<T>& v, T elem) {
    std::vector<bool> range(v.size());
    elem++;
    // Mark every value that falls in [elem, elem + v.size()).
    std::for_each(v.begin(), v.end(), [&](const T& i) {
        if (i >= elem && i - elem < range.size()) range[i - elem] = true;
    });
    auto result = std::distance(range.begin(), std::find(range.begin(), range.end(), false));
    return result + elem;
}

First you need to sort the vector. Use std::sort for that.
std::lower_bound finds the first element that is greater than or equal to a given element (the range has to be at least partitioned with respect to that element; a sorted range qualifies).
From there you iterate while you have consecutive elements.
Dealing with duplicates: One way is the way I went: consider consecutive and equal elements when iterating. Another approach is to add a prerequisite that the vector / range contains unique elements. I chose the former because it avoids erasing elements.
Here is how you eliminate duplicates from a sorted vector:
v.erase(std::unique(v.begin(), v.end()), v.end());
My implementation:
// finds the first missing element in the vector v
// prerequisite: v must be sorted
auto firstMissing(std::vector<int> const &v, int elem) -> int {
auto low = std::lower_bound(std::begin(v), std::end(v), elem);
if (low == std::end(v) || *low != elem) {
return elem;
}
while (low + 1 != std::end(v) &&
(*low == *(low + 1) || *low + 1 == *(low + 1))) {
++low;
}
return *low + 1;
}
And a generalized version:
// finds the first missing element in the range [first, last)
// prerequisite: the range must be sorted
template <class It, class T = decltype(*std::declval<It>())>
auto firstMissing(It first, It last, T elem) -> T {
auto low = std::lower_bound(first, last, elem);
if (low == last || *low != elem) {
return elem;
}
while (std::next(low) != last &&
(*low == *std::next(low) || *low + 1 == *std::next(low))) {
std::advance(low, 1);
}
return *low + 1;
}
Test case:
int main() {
    auto v = std::vector<int>{13, 8, 3, 6, 10, 1, 7, 7, 7, 0};
    std::sort(v.begin(), v.end());
    for (auto n : {-2, 0, 5, 6, 20}) {
        std::cout << n << ": " << firstMissing(v, n) << std::endl;
    }
    return 0;
}
Result:
-2: -2
0: 2
5: 5
6: 9
20: 20
A note about sorting: From the OP's comments he was searching for a solution that wouldn't modify the vector.
You have to sort the vector for an efficient solution. If modifying the vector is not an option you could create a copy and work on it.
If you are hell-bent on not sorting, there is a brute force solution (very very inefficient - O(n^2)):
auto max = std::max_element(std::begin(v), std::end(v));
if (elem > *max) {
return elem;
}
auto i = elem;
while (std::find(std::begin(v), std::end(v), i) != std::end(v)) {
++i;
}
return i;

First solution:
Sort the vector. Find the starting number with a binary search, then walk forward until the next stored value is no longer the previous value plus 1.
This will take O(NlogN) where N is the size of vector.
Second solution:
If the range of numbers is small e.g. (0,M) you can create boolean vector of size M. For each number of initial vector make the boolean of that index true. Later you can see next missing number by checking the boolean vector. This will take O(N) time and O(M) auxiliary memory.

Related

Finding smallest number >= x not present in the given sorted array

I am having difficulty writing a modified binary search algorithm that returns the smallest number greater than or equal to X which is not present in the sorted array.
For example, if the array is {1,2,3,5,6} and x = 2 then answer is 4. Please guide me how to write the binary search for this. I have to answer this in O(log n) time for each x. Since I am taking this array as input which will initially take linear time, you may do some kind of preprocessing on the array initially if you want.
x is also taken as input and may or may not be present in the array.
The input array may have repeating elements.
My input numbers can be in the range [0,10^9] and hence first putting all the missing values in the array is not feasible because of space constraints.
Also, you can do preprocessing which takes O(n) time since you are taking the array as input in linear time. After that, there will be let us say 10^6 queries of X, which you have to answer, each in O(log n) time
If I understand correctly, you are allowed to do any kind of preprocessing, and only finding the result for different x must be O(log n). If that's the case, finding the result after preprocessing isn't a big deal. O(log n) search algorithms do exist. Good candidates are std::binary_search or std::lower_bound.
A very naive approach is to prepare a vector with all missing elements and then std::lower_bound on that:
#include <iostream>
#include <vector>
#include <algorithm>
int main() {
std::vector<int> input{1,2,3,5,6,10,12};
std::vector<int> missing_elements{4,7,8,9,11};
int x = 2;
auto it = std::lower_bound(missing_elements.begin(),missing_elements.end(),x);
std::cout << *it << "\n";
}
Populating missing_elements takes a single pass over the sorted input, plus the time to write out every missing value. However, a missing_elements with a size of the order of 10^9 is of course not feasible. Also this approach is extremely wasteful for input like [1,100000000] (not in terms of query time complexity, but in terms of runtime and memory usage).
An idea put up by Jarod42 in a comment is to prepare a vector of segments and then std::lower_bound on that. First assuming preprocessing has already been done:
#include <iostream>
#include <vector>
#include <algorithm>
int find_first_missing(const std::vector<std::pair<int,int>>& segments,int x){
std::pair<int,int> p{x,x};
auto it = std::lower_bound(segments.begin(),segments.end(),p,[](auto a,auto b){
return a.second < b.second;
});
if (it == segments.end()) return x;
if (it->first > x) return x;
return it->second+1;
}
int main() {
std::vector<int> input{1,2,3,5,6,10,12};
std::vector<std::pair<int,int>> segments{{1,3},{5,6},{10,10},{12,12}};
for (int x=0; x<13;++x) std::cout << x << " -> " << find_first_missing(segments,x) << "\n";
}
Output:
0 -> 0
1 -> 4
2 -> 4
3 -> 4
4 -> 4
5 -> 7
6 -> 7
7 -> 7
8 -> 8
9 -> 9
10 -> 11
11 -> 11
12 -> 13
Because input is sorted, and segments is sorted, we can use a custom comparator that only compares the end of the segment. The vector of segments is also sorted with respect to that comparator. The call to lower_bound returns an iterator to the segment where either x is inside or x is lower than the segment, hence if (it->first > x) return x; otherwise we know that it->second+1 is the next missing number.
Now it is only left to create the vector of segments:
#include <iostream>
#include <vector>
#include <algorithm>
#include <cassert>
std::vector<std::pair<int,int>> segment(const std::vector<int>& input){
std::vector<std::pair<int,int>> result;
if (input.size() == 0) return result;
int current_start = input[0];
for (int i=1;i<input.size();++i){
if (input[i-1] == input[i] || input[i-1]+1 == input[i]) continue;
result.push_back({current_start,input[i-1]});
current_start = input[i];
}
result.push_back({current_start,input.back()});
return result;
}
int main() {
std::vector<int> input{1,2,3,5,6,10,12};
std::vector<std::pair<int,int>> expected{{1,3},{5,6},{10,10},{12,12}};
auto result = segment(input);
for (const auto& e : result){
std::cout << e.first << " " << e.second << "\n";
}
assert(expected == result);
}
If x is not present in the array, then return x.
If x is present, say it's at position l. Let missing(i) denote the number of values missing to the left of position i. In a 1-indexed array of distinct values, this equals A[i]-i (any constant offset cancels in the difference used here). Moving right from l, missing(i) - missing(l) stays 0 for as long as the values are consecutive, so you can use a modified binary search to find p, the last position where missing(p) - missing(l) = 0. Then A[p]+1 is the first missing number greater than x.
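The description above can be sketched as follows. This is an illustration under stated assumptions rather than the answerer's own code: the function name first_missing_distinct is hypothetical, the array is 0-indexed, and it must be sorted and free of duplicates (a std::unique pass during preprocessing would ensure that). Comparing a[p] - a[l] with p - l is the same test as missing(p) - missing(l) = 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest value >= x that is absent from a.
// Assumption: a is sorted in ascending order and has no duplicates.
int first_missing_distinct(const std::vector<int>& a, int x) {
    assert(std::is_sorted(a.begin(), a.end()));
    auto it = std::lower_bound(a.begin(), a.end(), x);
    if (it == a.end() || *it != x) return x;  // x itself is missing

    const std::size_t l = static_cast<std::size_t>(it - a.begin());
    // Binary search for the last p >= l with a[p] - a[l] == p - l,
    // i.e. the end of the run of consecutive values that contains x.
    std::size_t lo = l, hi = a.size() - 1;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo + 1) / 2;  // upper middle
        const long long value_gap = static_cast<long long>(a[mid]) - a[l];
        if (value_gap == static_cast<long long>(mid - l)) {
            lo = mid;      // still inside the run
        } else {
            hi = mid - 1;  // a gap occurs at or before mid
        }
    }
    return a[lo] + 1;
}
```

Both the std::lower_bound call and the loop halve the search range each step, so a query costs O(log n) regardless of how long the run is.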
Take a look at this:
#include <iostream>
#include <vector>
using namespace std;
int greaterValue(const vector<int>& elements, int x){
    int low = 0,
        high = elements.size() - 1,
        answer = x;  // start at x itself: the result must be >= x, not > x
    while (low <= high) {
        int mid = (low + high) / 2;
        if (elements[mid] <= answer) {
            if (elements[mid] == answer) {
                // answer is present, so try the next value and
                // reopen the search range to the right
                answer++;
                high = elements.size() - 1;
            }
            low = mid + 1;
        }
        else {
            high = mid - 1;
        }
    }
    return answer;
}
int main() {
vector<int> elements = { 1, 2, 3, 5, 6 };
int x = 2;
int result = greaterValue(elements, x);
cout << "The element is: " << result;
return 0;
}
Test:
{ 1, 2, 3, 5, 6 }
Result:
The element is: 4
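Time complexity:
O(k log n), where k is the length of the run of consecutive values that starts at x, because the search range is reopened each time answer is incremented. This is O(log n) only when k is bounded by a constant.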

Arranging odd and even numbers in a vector C++

I have this problem: given a vector with n numbers, rearrange the numbers so that the even ones are on odd positions and the odd ones are on even positions. E.g. if I have the vector 2 6 7 8 9 3 5 1, the output should be 2 7 6 9 8 3 5 1. The count starts from 1, so position 1 (which is actually index 0) should hold an even number, position 2 (which is actually index 1) should hold an odd number, and so on. This is easy if there are as many odd numbers as even numbers, say 4 even and 4 odd numbers in the vector, but what if the number of odd numbers differs from the number of even numbers, as in the example above? How do I solve that? I attached the code of one of my attempts, but it doesn't work. Can I get some help, please? I ask you to keep it simple, meaning only vectors and such. No weird methods or anything, because I'm a beginner and I only know the basics. Thanks in advance!
I have to mention that n_initial is declared globally and is the number of vector elements, and v_initial is the initial vector with the elements that need to be rearranged.
The task says to add the remaining numbers to the end of the vector. For example, if there are 3 odd and 5 even numbers, the 2 extra even numbers should be placed at the end of the vector.
void vector_pozitii_pare_impare(int v_initial[])
{
int v_pozitie[50],c1=0,c2=1;
for (i = 0; i < n_initial; i++)
{
if (v_initial[i] % 2 == 0)
{
bool isTrue = 1;
for (int k = i + 1; k < n_initial; k++)
{
if (v_initial[k] % 2 != 0)
isTrue = 0;
}
if (isTrue)
{
v_pozitie[c1] = v_initial[i];
c1++;
}
else
{
v_pozitie[c1] = v_initial[i];
c1 += 2;
}
}
else
{
bool isTrue = 1;
for (int j = i + 1; j < n_initial; j++)
{
if (v_initial[j] % 2 == 0)
{
isTrue = 0;
}
if (isTrue)
{
v_pozitie[c2] = v_initial[i];
c2++;
}
else
{
v_pozitie[c2] = v_initial[i];
c2 += 2;
}
}
}
}
This may not be a perfect solution and it just popped out right off my mind without being tested or verified, but it's just to give you an idea.
(Let A,B,C,D be odd numbers and 0,1,2 even numbers correspondingly)
Given:
A 0 B C D 1 2 (random ordered list of odd/even numbers)
Wanted:
A 0 B 1 C 2 D (input sequence altered to match the wanted odd/even criteria)
Next, we invent the steps required to get from given to wanted:
// look at 'A' -> match, next
// Result: A 0 B C D 1 2
// look at '0' -> match, next
// Result: A 0 B C D 1 2
// look at 'B' -> match, next
// Result: A 0 B C D 1 2
// look at 'C' -> mismatch, remember index and find first match starting from index+1
// Result: A 0 B C D ->1<- 2
// now swap the numbers found at the remembered index and the found one.
// Result: A 0 B 1 D C 2
// continue until the whole list has been consumed.
As I said, this algorithm may not be perfect, but my intention is to give you an example on how to solve these kinds of problems. It's not good to always think in code first, especially not with a problem like this. So you should first think about where you start, what you want to achieve and then carefully think of how to get there step by step.
I feel I have to mention that I did not provide an example in real code, because once you got the idea, the execution should be pretty much straight forward.
Oh, and just a small remark: Almost nothing about your code is C++.
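For readers who would still like to see the swap steps in code, here is a minimal sketch. It is an assumption-laden illustration, not the answerer's implementation: the function name alternate_in_place is hypothetical, and it follows the question's convention (even values at index 0, 2, ...) rather than the A 0 B ... example, which starts with an odd value.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Rearranges v in place so that even values sit at even indices and odd
// values at odd indices for as long as both kinds are available.
void alternate_in_place(std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        const bool want_even = (i % 2 == 0);
        auto fits = [want_even](int value) { return (value % 2 == 0) == want_even; };
        if (fits(v[i])) continue;  // match, next
        // Mismatch: remember index i and find the first match after it.
        auto match = std::find_if(v.begin() + i + 1, v.end(), fits);
        if (match == v.end()) break;  // only one parity is left; it forms the tail
        std::swap(v[i], *match);
    }
}
```

With the question's input 2 6 7 8 9 3 5 1 this yields 2 7 6 9 8 3 5 1. Each mismatch triggers a linear search, so the worst case is O(n^2), and the swaps do not necessarily preserve the relative order of the values.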
A simple solution that is not very efficient would be to split the vector into two vectors that contain the even and the odd numbers, then alternately take one from the even and one from the odd vector, and finally append the remainder from whichever vector is not yet exhausted.
Here is some C++ (it uses vectors, but you can use an array the same way if you adapt the iterator arithmetic).
I did not test it, but the principle should be clear; it is not very efficient, though.
EDIT: The answer by #AAAAAAAAARGH outlines a better algorithmic idea that is in-place and more efficient.
void change_vector_even_uneven(std::vector<unsigned>& in_vec){
std::vector<unsigned> even;
std::vector<unsigned> uneven;
for (auto it = in_vec.begin(); it != in_vec.end(); it++){
if ((*it) % 2 == 0) even.push_back(*it);
else uneven.push_back(*it);
}
auto even_it = even.begin();
auto uneven_it = uneven.begin();
for (auto it = in_vec.begin(); it != in_vec.end(); it++){
if (even_it == even.end()){
(*it) = (*uneven_it);
uneven_it++;
continue;
}
if (uneven_it == uneven.end()){
(*it) = (*even_it);
even_it++;
continue;
}
if ((it - in_vec.begin()) % 2 == 0){
(*it) = (*even_it);
even_it++;
}
else{
(*it) = (*uneven_it);
uneven_it++;
}
}
}
The solution is simple. We sort the even and odd values into two data structures. In a loop, we iterate over all source values. If a value is even (val % 2 == 0), we add it at the end of a std::deque for evens; if it is odd, we add it to a std::deque for odds.
Later, we will extract the values from the front of each std::deque.
So, we have a first-in, first-out principle.
The std::deque is well suited to this purpose.
Later, we run a loop with an alternating branch in it. We alternately extract data from the even queue and then from the odd queue. If a queue is empty, we do not extract data from it.
We do not need an additional std::vector and can reuse the old one.
With that, we do not need to care whether the numbers of evens and odds are equal. It works either way.
Please see below one of many possible solutions:
#include <iostream>
#include <vector>
#include <deque>
int main() {
std::vector testData{ 2, 6, 7, 8, 9, 3, 5, 1 };
// Show initial data
std::cout << "\nInitial data: ";
for (const int i : testData) std::cout << i << ' ';
std::cout << '\n';
// We will use a deques to store odd and even numbers
// With that we can efficiently push back and pop front
std::deque<int> evenNumbers{};
std::deque<int> oddNumbers{};
// Sort the original data into the specific container
for (const int number : testData)
if (number % 2 == 0)
evenNumbers.push_back(number);
else
oddNumbers.push_back(number);
// Take alternating the data from the even and the odd values
bool takeEven{ true };
for (size_t i{}; !evenNumbers.empty() || !oddNumbers.empty(); ) { // run until both queues are drained
if (takeEven) { // Take even numbers
if (not evenNumbers.empty()) { // As long as there are even values
testData[i] = evenNumbers.front(); // Get the value from the front
evenNumbers.pop_front(); // Remove first value
++i;
}
}
else { // Now we take odd numbers
if (not oddNumbers.empty()) { // As long as there are odd values
testData[i] = oddNumbers.front(); // Get the value from the front
oddNumbers.pop_front(); // Remove first value
++i;
}
}
// Next take the other container
takeEven = not takeEven;
}
// Show result
std::cout << "\nResult: ";
for (const int i : testData) std::cout << i << ' ';
std::cout << '\n';
return 0;
}
Here is yet another solution (using STL), in case you want a stable result (that is, the order of your values is preserved).
#include <algorithm>
#include <iterator>
#include <vector>
auto ints = std::vector<int>{ 2, 6, 7, 8, 9, 3, 5, 1 };
// split list to even/odd sections -> [2, 6, 8, 7, 9, 3, 5, 1]
const auto it = std::stable_partition(
ints.begin(), ints.end(), [](auto value) { return value % 2 == 0; });
auto results = std::vector<int>{};
results.reserve(ints.size());
// merge both parts with equal size
auto a = ints.begin(), b = it;
while (a != it && b != ints.end()) {
results.push_back(*a++);
results.push_back(*b++);
}
// copy remaining values to end of list
std::copy(a, it, std::back_inserter(results));
std::copy(b, ints.end(), std::back_inserter(results));
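The result is [2, 7, 6, 9, 8, 3, 5, 1]. The complexity is O(n), provided std::stable_partition can allocate its temporary buffer; otherwise it falls back to O(n log n).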
This answer, like some of the others, divides the data and then reassembles the result. The standard library std::partition_copy is used to separate the even and odd numbers into two containers. Then the interleave function assembles the result by alternately copying from two input ranges.
#include <algorithm>
#include <iostream>
#include <vector>
template <typename InIt1, typename InIt2, typename OutIt>
OutIt interleave(InIt1 first1, InIt1 last1, InIt2 first2, InIt2 last2, OutIt dest)
{
for (;;) {
if (first1 == last1) {
return std::copy(first2, last2, dest);
}
*dest++ = *first1++;
if (first2 == last2) {
return std::copy(first1, last1, dest);
}
*dest++ = *first2++;
}
}
void reorder_even_odd(std::vector<int> &data)
{
auto is_even = [](int value) { return (value & 1) == 0; };
// split
std::vector<int> even, odd;
std::partition_copy(begin(data), end(data), back_inserter(even), back_inserter(odd), is_even);
// merge
interleave(begin(even), end(even), begin(odd), end(odd), begin(data));
}
int main()
{
std::vector<int> data{ 2, 6, 7, 8, 9, 3, 5, 1 };
reorder_even_odd(data);
for (int value : data) {
std::cout << value << ' ';
}
std::cout << '\n';
}
As suggested, I am using vectors and the STL.
v_pozitie will start with (even, odd) pairs and end with the integers that could not be paired.
I therefore keep three write positions into v_pozitie (no temporary containers are needed to compute the result): one for the next even slot, one for the next odd slot, and one for the tail, which avoids push_back. I would code it this way:
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>
// v_pozitie must already have the same size as v_initial.
void vector_pozitii_pare_impare(const std::vector<int>& v_initial, std::vector<int>& v_pozitie) {
    // n % 2 != 0 also recognises negative odd numbers.
    const std::size_t nodd = static_cast<std::size_t>(
        std::count_if(v_initial.begin(), v_initial.end(), [](int n) { return n % 2 != 0; }));
    const std::size_t neven = v_initial.size() - nodd;
    const std::size_t npair = std::min(neven, nodd);  // number of (even, odd) pairs
    std::size_t evenLeft = npair, oddLeft = npair;    // paired slots still free
    std::size_t ieven = 0, iodd = 1, iend = 2 * npair;
    for (int s : v_initial) {
        if (s % 2 == 0 && evenLeft > 0) {
            v_pozitie[ieven] = s;
            ieven += 2;
            --evenLeft;
        }
        else if (s % 2 != 0 && oddLeft > 0) {
            v_pozitie[iodd] = s;
            iodd += 2;
            --oddLeft;
        }
        else v_pozitie[iend++] = s;  // leftovers go after the pairs
    }
}
int main (int argc, char* argv []) {
const int N = 8;
int tab [N] = {2, 6, 7, 8, 9, 3, 5, 1};
std::vector<int> v_initial (tab, (int*)&tab [N]);
std::cout << "\tv_initial == ";
std::for_each (v_initial.begin (), v_initial.end (), [] (const int& s) {std::cout << s << " ";});
std::cout << std::endl;
std::vector<int> v_pozitie (v_initial.size (), -1);
vector_pozitii_pare_impare (v_initial, v_pozitie);
std::cout << "\tv_pozitie == ";
std::for_each (v_pozitie.begin (), v_pozitie.end (), [] (const int& s) {std::cout << s << " ";});
std::cout << std::endl;
}

Find the last non-zero element in a std::vector

I'm trying to find the index of the last non-zero element in a std::vector<double>. If the last element in the vector is non-zero then it should return the index of that last element.
I believe I can use std::find_if_not, reverse iterators and std::distance, based on this:
std::find_if_not(amounts.rbegin(), amounts.rend(), 0.0)
where amounts is a std::vector<double>, but I'm having difficulty in combining this with std::distance and a forward iterator amounts.begin().
Also, is there a way I can introduce a predicate to compare on, say a tolerance of 1e-8?
I'm using C++11.
Example:
std::vector<double> v{1.32, 1.423, 2.543, 3.534, 4.2, 0};
// C++11-compatible: member rbegin()/rend() and an explicitly typed lambda parameter
auto result1 = std::find_if(v.rbegin(), v.rend(), [](double v) { return std::fabs(v - 0) > std::numeric_limits<double>::epsilon(); });
if (result1 != v.rend()) {
    std::cout << *result1 << "\n";
    std::cout << std::distance(v.begin(), (result1 + 1).base());
}
outputs:
4.2
4
[edit]
More explanation of:
std::fabs(v - 0) > std::numeric_limits<double>::epsilon()
The question asked:
"Also, is there a way I can introduce a predicate to compare on, say a tolerance of 1e-8?"
This is such a tolerance check; you can replace epsilon with some other value, such as 1e-8.
A simple for loop can also do the trick, see live sample: http://ideone.com/dVNOKk
#include <iostream>
#include <vector>
int main() {
std::vector<int> v{1, 2, 3, 4, 1, 2, 3, 0, 4, 1, 2, 3, 4, 0, 0, 0};
for (int i = static_cast<int>(v.size()) - 1; i >= 0; --i) {
if (v.at(i) != 0) {
std::cout << "Last non-zero at: " << i << '\n';
break;
}
}
return 0;
}
Output: Last non-zero at: 12
"but I'm having difficulty in combining this with std::distance and a forward iterator amounts.begin()"
Reverse iterators have member function base that returns a non-reverse iterator with the relation &*(rit.base() - 1) == &*rit. So, you can use the following:
std::distance(amounts.begin(), found.base()) - 1;
Another option:
amounts.size() - std::distance(amounts.rbegin(), found) - 1
"Also, is there a way I can introduce a predicate to compare on, say a tolerance of 1e-8?"
Yes. In fact, you must use a predicate, even for the exact comparison, since that's what std::find_if_not expects as the third parameter (instead of value of an element).
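Putting the two points together, a minimal C++11 sketch might look like this. The function name last_nonzero_index, the default tolerance of 1e-8 (taken from the question), and the choice to return -1 when every element is within the tolerance are illustrative assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the last element whose magnitude exceeds tol, or -1 if there is none.
std::ptrdiff_t last_nonzero_index(const std::vector<double>& amounts, double tol = 1e-8) {
    // Search from the back for the first element that is not "zero".
    auto found = std::find_if(amounts.rbegin(), amounts.rend(),
                              [tol](double value) { return std::fabs(value) > tol; });
    if (found == amounts.rend()) return -1;
    // found.base() points one past the element found refers to.
    return std::distance(amounts.begin(), found.base()) - 1;
}
```

Using std::find_if with a "non-zero" predicate is equivalent to std::find_if_not with an "is zero" predicate; either way the third argument has to be a callable, not a value.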

Finding number of pairs having sum as 0 in two different arrays

If I have two separate sorted arrays containing an equal number of entries, and I need to find the number of pairs (the two numbers must come from different arrays) having sum = 0 in linear time, how can I do that?
I can easily do it in O(n^2) but how to do it in linear time?
OR should I merge the two arrays and then proceed?
Thanks!
You don't need the arrays to be sorted.
Stick the numbers from one of the arrays into a hash table. Then iterate over the other array. For each number n, see if -n is in the hash table.
(If either array can contain duplicates, you need to take some care around handling them.)
P.S. You can exploit the fact that the arrays are sorted. Just iterate over them from the opposite ends once, looking for items that have the same value but the opposite signs. I leave figuring out the details as an exercise (hint: think of the merge step of merge sort).
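The hash-table idea can be sketched as below. This is one possible reading, with assumptions spelled out: the function name count_zero_pairs_hashed is hypothetical, duplicates are handled by counting every (index, index) combination, and keys are widened to long long so that negating the smallest int is safe.

```cpp
#include <unordered_map>
#include <vector>

// Counts pairs (i, j) with a[i] + b[j] == 0; neither input needs to be sorted.
long long count_zero_pairs_hashed(const std::vector<int>& a, const std::vector<int>& b) {
    // Value -> number of times it occurs in a.
    std::unordered_map<long long, long long> occurrences;
    for (int value : a) ++occurrences[value];

    long long pairs = 0;
    for (int value : b) {
        // Every occurrence of -value in a pairs up with this element of b.
        auto it = occurrences.find(-static_cast<long long>(value));
        if (it != occurrences.end()) pairs += it->second;
    }
    return pairs;
}
```

With a reasonable hash function this runs in expected O(n) time and uses O(n) extra memory.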
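Try this (i walks arr1 from the smallest value up, j walks arr2 from the largest value down):
for (i = 0, j = n - 1; i < n && j >= 0;)
{
    if (arr1[i] + arr2[j] == 0)
    {
        count++;
        i++;
        j--;
    }
    else if (arr1[i] + arr2[j] > 0)
    {
        j--;
    }
    else
    {
        i++;
    }
}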
Following may help:
std::size_t count_zero_pair(const std::vector<int>& v1, const std::vector<int>& v2)
{
assert(is_sorted(v1.begin(), v1.end()));
assert(is_sorted(v2.begin(), v2.end()));
std::size_t res = 0;
auto it1 = v1.begin();
auto it2 = v2.rbegin();
while (it1 != v1.end() && it2 != v2.rend()) {
const int sum = *it1 + *it2;
if (sum < 0) {
++it1;
} else if (0 < sum) {
++it2;
} else { // sum == 0
// may be more complicated depending
// how you want to manage duplicated pairs
++it1;
++it2;
++res;
}
}
return res;
}
If they are already sorted, you can traverse them, one from left to right, one from right to left:
Take two pointers, and put one at the very left of one array, the other at the very right of the other array. Look at both values you currently point at. If the absolute value of one of these values is greater than that of the other, advance the pointer at the value with the greater absolute value. If the absolute values are equal, report both values and advance both pointers. Stop as soon as the pointer coming from the left reaches a positive value, or the pointer from the right reaches a negative value. After that, do the same with the pointers starting at the respective other ends of the arrays.
This is essentially the solution proposed by #Matthias with an added pointer to catch duplicates. If there is a string of duplicate values in arr2, searchStart will always point to the one with the highest index so that we can check the entire string against the next value in arr1. All values in arr1 are explicitly checked, so no extra duplicate handling is required.
int pairCount = 0;
for (int base=0, searchStart=arr2Size-1; base<arr1Size; base++) {
int searchCurrent = searchStart;
while (arr1[base]+arr2[searchCurrent] > 0) {
searchCurrent--;
if (searchCurrent < 0) break;
}
searchStart=searchCurrent;
if (searchStart < 0) break;
while (searchCurrent >= 0 && arr1[base]+arr2[searchCurrent] == 0) { // stop before running off the front of arr2
std::cout << "arr1[" << base << "] + arr2[" << searchCurrent << "] = ";
std::cout << "[" << arr1[base] << "," << arr2[searchCurrent] << "]\n";
pairCount++;
searchCurrent--;
}
}
std::cout << "pairCount = " << pairCount << "\n";
Given the arrays:
arr1[] = {-5, -3, -3, -2, -1, 0, 2, 4, 4, 5, 8};
arr2[] = {-7, -5, -5, -4, -3, -2, 1, 3, 4, 5, 6, 7, 8};
we get:
arr1[0] + arr2[9] = [-5,5]
arr1[1] + arr2[7] = [-3,3]
arr1[2] + arr2[7] = [-3,3]
arr1[4] + arr2[6] = [-1,1]
arr1[6] + arr2[5] = [2,-2]
arr1[7] + arr2[3] = [4,-4]
arr1[8] + arr2[3] = [4,-4]
arr1[9] + arr2[2] = [5,-5]
arr1[9] + arr2[1] = [5,-5]
pairCount = 9
Now we come to the question of time complexity. The construction of searchStart is such that each value in arr1 can incur one extra comparison with a value in arr2 (but no more than one). Otherwise, for arrays with no duplicates, this checks each value in arr2 at most once, so this algorithm runs in O(n).
If duplicate values are present, however, it complicates things a bit. Consider the arrays:
arr1 = {-3, -3, -3}
arr2 = { 3, 3, 3}
Clearly, since all O(n²) pairs equal zero, we have to count all O(n²) pairs. This means that in the worst case, the algorithm is O(n²) and this is the best we can do. It is possibly more constructive to say that the complexity is O(n + p) where p is the number of matching pairs.
Note that if you only want to count the number of matches rather than printing them all, you can do this in linear time as well. Just change when searchStart is updated to when the last match is found and keep a counter that equals the number of matches found for the current searchStart. Then if the next arr1[base] matches arr2[searchStart], add the counter to the number of pairs.
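A counting-only version could look like the sketch below. It is a variant of the idea rather than a line-by-line edit of the loop above: instead of a per-searchStart counter it measures the length of each run of equal values in both arrays and multiplies the two lengths. The function name count_zero_sum_pairs is hypothetical, and both inputs are assumed to be sorted in ascending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts all pairs (i, j) with a[i] + b[j] == 0 in O(n), duplicates included.
long long count_zero_sum_pairs(const std::vector<int>& a, const std::vector<int>& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    long long pairs = 0;
    std::size_t i = 0;         // walks a from the smallest value up
    std::size_t j = b.size();  // walks b from the largest value down; b[j - 1] is current
    while (i < a.size() && j > 0) {
        const long long sum = static_cast<long long>(a[i]) + b[j - 1];
        if (sum < 0) {
            ++i;
        } else if (sum > 0) {
            --j;
        } else {
            // Measure the run of equal values on each side, then count
            // every combination of the two runs at once.
            const int va = a[i], vb = b[j - 1];
            long long run_a = 0, run_b = 0;
            while (i < a.size() && a[i] == va) { ++i; ++run_a; }
            while (j > 0 && b[j - 1] == vb) { --j; --run_b; run_b += 2; }
            pairs += run_a * run_b;
        }
    }
    return pairs;
}
```

Every element of either array is visited a bounded number of times, so the {-3, -3, -3} / {3, 3, 3} case is counted as 9 pairs without enumerating them.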

What is the fastest way to find longest 'consecutive numbers' streak in vector ?

I have a sorted std::vector<int> and I would like to find the longest 'streak of consecutive numbers' in this vector and then return both the length of it and the smallest number in the streak.
To visualize it for you :
suppose we have :
1 3 4 5 6 8 9
I would like it to return: maxStreakLength = 4 and streakBase = 3
There might be occasion where there will be 2 streaks and we have to choose which one is longer.
What is the best (fastest) way to do this ? I have tried to implement this but I have problems with coping with more than one streak in the vector. Should I use temporary vectors and then compare their lengths?
No, you can do this in one pass through the vector, storing only the start point and length of the longest streak found so far. You also need far fewer than N comparisons. [*]
Hint: if you already have, say, a 4-long match ending at the 5th position (=6), which position do you have to check next?
[*] Left as an exercise for the reader to work out the likely O( ) complexity ;-)
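One way to read the hint is as a skip-ahead search: once a streak of length L is known, only a streak of length L + 1 is interesting, so the next position worth checking is L places past the candidate start. The sketch below is my interpretation, not the answerer's code. The function name longest_streak is hypothetical, and it assumes the vector is sorted and contains no duplicates, because only then does v[e] - v[s] == e - s prove that everything in between is consecutive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {streakBase, maxStreakLength}, or {0, 0} for an empty vector.
// Assumption: v is sorted in ascending order and has no duplicates.
std::pair<int, std::size_t> longest_streak(const std::vector<int>& v) {
    if (v.empty()) return {0, 0};
    assert(std::is_sorted(v.begin(), v.end()));
    std::size_t best_start = 0, best_len = 1;
    std::size_t s = 0;  // candidate start of a longer streak
    while (s + best_len < v.size()) {
        std::size_t e = s + best_len;  // index a longer streak would have to reach
        if (static_cast<long long>(v[e]) - v[s] == static_cast<long long>(best_len)) {
            // v[s..e] is consecutive; extend it as far as it goes.
            while (e + 1 < v.size() && v[e + 1] == v[e] + 1) ++e;
            best_start = s;
            best_len = e - s + 1;
            s = e + 1;
        } else {
            // There is a gap inside v[s..e]; restart just after the last gap.
            std::size_t k = e;
            while (k > s && v[k] == v[k - 1] + 1) --k;
            s = k;
        }
    }
    return {v[best_start], best_len};
}
```

For 1 3 4 5 6 8 9 this gives streakBase 3 and maxStreakLength 4. When two streaks tie, the earlier one is kept.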
It would be interesting to see if the fact that the array is sorted can be exploited somehow to improve the algorithm. The first thing that comes to mind is this: if you know that all numbers in the input array are unique, then for a range of elements [i, j] in the array, you can immediately tell whether elements in that range are consecutive or not, without actually looking through the range. If this relation holds
array[j] - array[i] == j - i
then you can immediately say that elements in that range are consecutive. This criterion, obviously, uses the fact that the array is sorted and that the numbers don't repeat.
Now, we just need to develop an algorithm which will take advantage of that criterion. Here's one possible recursive approach:
1. Input of the recursive step is the range of elements [i, j]. Initially it is [0, n-1], the whole array.
2. Apply the above criterion to range [i, j]. If the range turns out to be consecutive, there's no need to subdivide it further. Send the range to output (see below for further details).
3. Otherwise (if the range is not consecutive), divide it into two equal parts [i, m] and [m+1, j].
4. Recursively invoke the algorithm on the lower part ([i, m]) and then on the upper part ([m+1, j]).
The above algorithm performs a binary partition of the array and a recursive descent of the partition tree using the left-first approach. This means that it finds adjacent subranges with consecutive elements in left-to-right order. All you need to do is join the adjacent subranges together. When you receive a subrange [i, j] that was "sent to output" at step 2, you concatenate it with the previously received subranges if together they are still consecutive, or you start a new range if they are not. All the while, you have to keep track of the "longest consecutive range" found so far.
That's it.
The benefit of this algorithm is that it detects subranges of consecutive elements "early", without looking inside these subranges. Obviously, its worst-case performance (if there are no consecutive subranges at all) is still O(n). In the best case, when the entire input array is consecutive, this algorithm will detect it instantly. (I'm still working on a meaningful O estimation for this algorithm.)
The usability of this algorithm is, again, undermined by the uniqueness requirement. I don't know whether it is something that is "given" in your case.
Anyway, here's a possible C++ implementation
#include <cassert>
#include <utility>
#include <vector>
typedef std::vector<int> vint;
typedef std::pair<vint::size_type, vint::size_type> range;
class longest_sequence
{
public:
const range& operator ()(const vint &v)
{
current = max = range(0, 0);
process_subrange(v, 0, v.size() - 1);
check_record();
return max;
}
private:
range current, max;
void process_subrange(const vint &v, vint::size_type i, vint::size_type j);
void check_record();
};
void longest_sequence::process_subrange(const vint &v,
vint::size_type i, vint::size_type j)
{
assert(i <= j && v[i] <= v[j]);
assert(i == 0 || i == current.second + 1);
if (v[j] - v[i] == j - i)
{ // Consecutive subrange found
assert(v[current.second] <= v[i]);
if (i == 0 || v[i] == v[current.second] + 1)
// Append to the current range
current.second = j;
else
{ // Range finished
// Check against the record
check_record();
// Start a new range
current = range(i, j);
}
}
else
{ // Subdivision and recursive calls
assert(i < j);
vint::size_type m = (i + j) / 2;
process_subrange(v, i, m);
process_subrange(v, m + 1, j);
}
}
void longest_sequence::check_record()
{
assert(current.second >= current.first);
if (current.second - current.first > max.second - max.first)
// We have a new record
max = current;
}
int main()
{
int a[] = { 1, 3, 4, 5, 6, 8, 9 };
std::vector<int> v(a, a + sizeof a / sizeof *a);
range r = longest_sequence()(v);
return 0;
}
I believe that this should do it?
size_t beginStreak = 0;
size_t streakLen = 1;
size_t longest = 0;
size_t longestStart = 0;
for (size_t i=1; i < vec.size(); i++) {
if (vec[i] == vec[i-1] + 1) {
streakLen++;
}
else {
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}
beginStreak = i;
streakLen = 1;
}
}
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}
You can't solve this problem in less than O(N) time. Imagine your list is the first N-1 even numbers, plus a single odd number (chosen from among the first N-1 odd numbers). Then there is a single streak of length 3 somewhere in the list, but worst case you need to scan the entire list to find it. Even on average you'll need to examine at least half of the list to find it.
Similar to Rodrigo's solution, but solving your example as well:
#include <vector>
#include <cstdio>
#define len(x) sizeof(x) / sizeof(x[0])
using namespace std;
int nums[] = {1,3,4,5,6,8,9};
int streakBase = nums[0];
int maxStreakLength = 1;
void updateStreak(int currentStreakLength, int currentStreakBase) {
if (currentStreakLength > maxStreakLength) {
maxStreakLength = currentStreakLength;
streakBase = currentStreakBase;
}
}
int main(void) {
vector<int> v;
for(size_t i=0; i < len(nums); ++i)
v.push_back(nums[i]);
int lastBase = v[0], currentStreakBase = v[0], currentStreakLength = 1;
for(size_t i=1; i < v.size(); ++i) {
if (v[i] == lastBase + 1) {
currentStreakLength++;
lastBase = v[i];
} else {
updateStreak(currentStreakLength, currentStreakBase);
currentStreakBase = v[i];
lastBase = v[i];
currentStreakLength = 1;
}
}
updateStreak(currentStreakLength, currentStreakBase);
printf("maxStreakLength = %d and streakBase = %d\n", maxStreakLength, streakBase);
return 0;
}