Find index of Nth occurrence of a number using Binary Search

Find index of Nth occurrence of a number using Binary Search - c++

I have a finite array whose elements are only -1,0 or 1. I want to find the index of Nth occurrence of a number (say 0).
I can iterate through the entire array, but I'm looking for a faster approach. I can think of using Binary Search, but having trouble modelling the algorithm. How do I proceed with Binary Search in this case?

You cannot do this without at least one pass of O(N) pre-processing. From the standpoint of information theory alone, you must know elements [0:k-1] to decide whether element [k] is the one you want.
If you're going to make this search many times, then you can make a simple linear pass over the array, counting each element as you go. Store the indices in a 2-D array, so you can directly index whatever occurrence you want.
For instance, given [-1 0 1 1 -1 -1 0 0 0 -1 1], you can convert this to a 3xN array, idx:
[[0 4 5 9]
 [1 6 7 8]
 [2 3 10]]
The Nth occurrence of element I is idx[I+1][N-1].
After that initial O(N) pass, your look-up is O(1) time, using O(N) space.
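A minimal sketch of that table in C++, assuming every element is -1, 0 or 1. The names OccurrenceTable, buildTable and nthOccurrence are made up for illustration, and the signed return value with -1 for "not found" is just one possible convention.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Positions grouped by value: row 0 holds -1, row 1 holds 0, row 2 holds 1.
using OccurrenceTable = std::array<std::vector<std::size_t>, 3>;

// One O(n) pass: append each index to the row of the value found there.
// Assumes every element of data is -1, 0 or 1.
OccurrenceTable buildTable(const std::vector<int>& data) {
    OccurrenceTable idx;
    for (std::size_t i = 0; i < data.size(); ++i) {
        idx[static_cast<std::size_t>(data[i] + 1)].push_back(i);
    }
    return idx;
}

// O(1) look-up of the n-th (1-based) occurrence of value.
// Returns -1 if value occurs fewer than n times.
long nthOccurrence(const OccurrenceTable& idx, int value, std::size_t n) {
    const std::vector<std::size_t>& row = idx[static_cast<std::size_t>(value + 1)];
    if (n == 0 || n > row.size()) return -1;
    return static_cast<long>(row[n - 1]);
}
```

With the sample array above, the 3rd occurrence of 1 is found at index 10 and the 1st occurrence of 0 at index 1.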

The OP stated that the ordered structure is important and that the vector or array is unsorted. To the best of my knowledge there is no faster search algorithm than linear for unsorted data. Here are a few links for references:
gamedev.net
quora.com
discuss.codechef.com
ubuntuforums.org
With the above links as references, there should be enough evidence to conclude that if the data in the array or vector is unsorted and must keep its order, there is no choice but to use linear iteration. A hashing technique may be possible, but it can be tricky, and binary search only works on sorted data.
- Here is a good linear algorithm to find the Nth occurrence of T in data.
To solve your problem of finding the Nth occurrence of element T in a given unsorted array, vector or container you can use this simple function template:
It takes 3 parameters:
a const reference to the container that is populated with data
a const unsigned value N where N is the Nth occurrence.
and a const template type T that you are searching for.
It returns an unsigned value for the index location within the container
of the Nth occurrence of element T
template<class T>
unsigned RepititionSearch( const std::vector<T>& data, const unsigned N, const T element ) {
    // N is 1-based, so a valid N lies in [1, data.size()]
    if ( data.empty() || N == 0 || N > data.size() ) {
        return -1; // converts to the largest unsigned value, used as "not found"
    }
    unsigned counter = 0; // occurrences of element seen so far
    unsigned i = 0;       // number of elements examined so far
    for ( auto e : data ) {
        if ( element == e ) {
            ++counter;
            i++;
        } else {
            i++;
        }
        if ( counter == N ) {
            return i - 1;
        }
    }
    return -1;
}
Breakdown of the algorithm
It first does some sanity checks:
It checks to see if the container is empty
It checks the value N to see if it is within bounds of [1,container.size()]
If any of these fail, it returns -1 (which, for an unsigned return type, is the largest unsigned value); in production code this might throw an exception or an error
We then have a need for 2 incrementing counters:
1 for the current index location
1 for the number of occurrences of element T
We then use a range-based for loop (C++11 or higher)
We go through each e in data
We check to see if the element passed into the function is equal to the current e in data
If the check passes, we pre-increment counter and post-increment i; otherwise we only post-increment i
After incrementing the counters we check whether counter is equal to the N passed into the function
If the check passes we return the value of i-1 since containers are 0 based
If the check fails we continue to the next iteration of the loop and repeat the process
If all e in data have been checked and counter never reached N, we leave the for loop and the function returns -1; in production code this might throw an exception or return an error.
The worst case is either that there is no match, or that the Nth occurrence of T happens to be the very last e in data. Both cost O(n) in the size of the container, which is linear, and for basic containers this should be efficient enough. If the container supports array indexing, access to an item at a known index is O(1).
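For comparison, the same linear scan can also be phrased with std::find from the standard library. This is only a sketch: nthIndexOf is a hypothetical name, and returning a signed -1 for "not found" is an assumption that differs from the unsigned return type used above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the 0-based index of the n-th (1-based) occurrence of element,
// or -1 if element occurs fewer than n times.
template <class T>
long nthIndexOf(const std::vector<T>& data, std::size_t n, const T& element) {
    if (n == 0) return -1;
    auto it = data.begin();
    while (true) {
        it = std::find(it, data.end(), element);  // jump to the next match
        if (it == data.end()) return -1;          // ran out of matches
        if (--n == 0) return static_cast<long>(it - data.begin());
        ++it;                                     // resume after this match
    }
}
```

The cost is still linear in the container size; the only change is that the per-element comparison loop is delegated to std::find.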
Note: This would be the answer that I feel should solve the problem, if you are interested in a breakdown of how the design process of designing or modeling such an algorithm works you can refer to my reference answer here
AFAIK I do not think there is a better way to do this with unsorted array data, but don't quote me on it.

Since you are searching an array, vector or other container for the index of the Nth occurrence of some element T, this post may be of some help to you.
In your question and in some of the comments you stated explicitly that your container is unsorted, that you were thinking of using a binary search, and that you were having trouble modeling the algorithm.
This post serves as an example of the development process behind designing such an algorithm, which may help you achieve what you are looking for.
The search algorithm here is a linear one, because a binary search is not suitable for your current needs.
The same process of building an algorithm can be applied to other kinds of algorithms, including binary searches, hash tables, etc.
- 1st Build
#include <iostream>
#include <vector>
struct Index {
static unsigned counter; // Static Counter
unsigned location; // index location of Nth element
unsigned count; // How many of this element up to this point
Index() : location( 0 ), count( 0 ) {}
};
unsigned Index::counter = 0;
// These typedefs are not necessarily needed;
// just used to make reading of code easier.
typedef Index IndexZero;
typedef Index IndexPos1;
typedef Index IndexNeg1;
template<class T>
class RepititionSearch {
public:
// Some Constants to compare against: don't like "magic numbers"
const T NEG { -1 };
const T ZERO { 0 };
const T POS { 1 };
private:
std::vector<T> data_; // The actual array or vector of data to be searched
std::vector<Index> indices_; // A vector of Indexes - record keeping to prevent multiple full searches.
public:
// Instantiating a search object requires an already populated container
explicit RepititionSearch ( const std::vector<T>& data ) : data_( data ) {
// make sure indices_ is empty upon construction.
indices_.clear();
}
// method to find the Nth occurrence of object A
unsigned getNthOccurrence( unsigned NthOccurrence, T element ) {
// Simple bounds checking: occurrences are counted from 1
if ( NthOccurrence == 0 || NthOccurrence > data_.size() ) {
// Can throw error or print message...;
return -1;
}
IndexZero zeros;
IndexPos1 ones;
IndexNeg1 negOnes;
// Clear out the indices_ so that each consecutive call is correct
indices_.clear();
unsigned idx = 0;
for ( auto e : data_ ) {
if ( element == e && element == NEG ) {
++negOnes.counter;
negOnes.location = idx;
negOnes.count = negOnes.counter;
indices_.push_back( negOnes );
}
if ( element == e && element == ZERO ) {
++zeros.counter;
zeros.location = idx;
zeros.count = zeros.counter;
indices_.push_back( zeros );
}
if ( element == e && element == POS ) {
++ones.counter;
ones.location = idx;
ones.count = ones.counter;
indices_.push_back( ones );
}
idx++;
} // for each T in data_
// Reset static counters
negOnes.counter = 0;
zeros.counter = 0;
ones.counter = 0;
// Now that we saved a record: find the nth occurance
// This will not search the full vector unless it is last element
// This has early termination. Also this vector should only be
// a percentage of the original data vector's size in elements.
for ( auto index : indices_ ) {
if ( index.count == NthOccurrence) {
// We found a match
return index.location;
}
}
// Not Found
return -1;
}
};
int main() {
// using the sample array or vector from User: Prune's answer!
std::vector<char> vec{ -1, 0, 1, 1, -1, -1, 0, 0, 0, -1, 1 };
RepititionSearch <char> search( vec );
unsigned idx = search.getNthOccurrence( 3, 1 );
std::cout << idx << std::endl;
std::cout << "\nPress any key and enter to quit." << std::endl;
char q;
std::cin >> q;
return 0;
}
// output:
10
The value of 10 is the correct answer as the 3rd occurrence of the value 1 is at location 10 in the original vector since vectors are 0 based. The vector of indices is only used as book keeping for faster search.
If you noticed I even made this a class template to accept any basic type T that'll be stored in std::vector<T> as long as T is comparable, or has operators defined for it.
AFAIK I do not think that there is any other searching method that is faster than this for the type of search that you are striving for, but don't quote me on it. However I think I can optimize this code a little more... just need some time to look at it closer.
This may appear to be a bit crazy but this does work: just a bit of fun playing around with the code
int main() {
std::cout <<
RepititionSearch<char>( std::vector<char>( { -1, 0, 1, 1, -1, -1, 0, 0, 0, -1, 1 } ) ).getNthOccurrence( 3, 1 )
<< std::endl;
}
It can be done on a single line and printed to the console without naming an instance of the class; a temporary object is created and used directly.
- 2nd Build
Now this may not necessarily make the algorithm faster, but it cleans up the code a bit for readability. Here I removed the typedefs and used a single Index object. The three if statements then contain duplicate code, so I moved it into a private helper function; this is how simple the algorithm looks now.
struct Index {
unsigned location;
unsigned count;
static unsigned counter;
Index() : location(0), count(0) {}
};
unsigned Index::counter = 0;
template<class T>
class RepititionSearch {
public:
const T NEG { -1 };
const T ZERO { 0 };
const T POS { 1 };
private:
std::vector<T> data_;
std::vector<Index> indices_;
public:
explicit RepititionSearch( const std::vector<T>& data ) : data_( data ) {
indices_.clear();
}
unsigned getNthOccurrence( unsigned NthOccurrence, T element ) {
if ( NthOccurrence == 0 || NthOccurrence > data_.size() ) {
return -1;
}
indices_.clear();
Index index;
unsigned i = 0;
for ( auto e : data_ ) {
if ( element == e && element == NEG ) {
addIndex( index, i );
}
if ( element == e && element == ZERO ) {
addIndex( index, i );
}
if ( element == e && element == POS ) {
addIndex( index, i );
}
i++;
}
index.counter = 0;
for ( auto idx : indices_ ) {
if ( idx.count == NthOccurrence ) {
return idx.location;
}
}
return -1;
}
private:
void addIndex( Index& index, unsigned inc ) {
++index.counter;
index.location = inc;
index.count = index.counter;
indices_.push_back( index );
}
};
- 3rd Build
To make this completely generic, so that it finds the Nth occurrence of any element T, the above can be reduced to the following. I also removed the static counter from Index and moved it to the private section of RepititionSearch; it just made more sense to place it there.
struct Index {
unsigned location;
unsigned count;
Index() : location(0), count(0) {}
};
template<class T>
class RepititionSearch {
private:
static unsigned counter_;
std::vector<T> data_;
std::vector<Index> indices_;
public:
explicit RepititionSearch( const std::vector<T>& data ) : data_( data ) {
indices_.clear();
}
unsigned getNthOccurrence( unsigned NthOccurrence, T element ) {
if ( NthOccurrence == 0 || NthOccurrence > data_.size() ) {
return -1;
}
indices_.clear();
Index index;
unsigned i = 0;
for ( auto e : data_ ) {
if ( element == e ) {
addIndex( index, i );
}
i++;
}
counter_ = 0;
for ( auto idx : indices_ ) {
if ( idx.count == NthOccurrence ) {
return idx.location;
}
}
return -1;
}
private:
void addIndex( Index& index, unsigned inc ) {
++counter_;
index.location = inc;
index.count = counter_;
indices_.push_back( index );
}
};
template<class T>
unsigned RepititionSearch<T>::counter_ = 0;
- 4th Build
I have also written the same algorithm without a vector to hold the index information. This version doesn't need the Index struct at all and doesn't need a helper function either. It looks like this:
template<class T>
class RepititionSearch {
private:
static unsigned counter_;
std::vector<T> data_;
public:
explicit RepititionSearch( const std::vector<T>& data ) : data_( data ) {}
unsigned getNthOcc( unsigned N, T element ) {
if ( N == 0 || N > data_.size() ) {
return -1;
}
unsigned i = 0;
for ( auto e : data_ ) {
if ( element == e ) {
++counter_;
i++;
} else {
i++;
}
if ( counter_ == N ) {
counter_ = 0;
return i-1;
}
}
counter_ = 0;
return -1;
}
};
template<class T>
unsigned RepititionSearch<T>::counter_ = 0;
Since we removed the dependency on the secondary vector and the need for a helper function, we don't even need a class to hold the container; we can just write a function template that takes a vector and applies the same algorithm. There is also no need for a static counter in this version.
- 5th Build
template<class T>
unsigned RepititionSearch( const std::vector<T>& data, unsigned N, T element ) {
if ( data.empty() || N == 0 || N > data.size() ) {
return -1;
}
unsigned counter = 0;
unsigned i = 0;
for ( auto e : data ) {
if ( element == e ) {
++counter;
i++;
} else {
i++;
}
if ( counter == N ) {
return i - 1;
}
}
return -1;
}
Yes this is a lot to take in; but these are the steps that are involved in the process of writing and designing an algorithm and refining it down to simpler code. As you have seen I have refined this code about 5 times. I went from using a struct, a class, typedefs, and a static member with multiple stored containers, to removing the typedefs and putting the repeatable code into a helper function, to removing the dependency of a secondary container & the helper function, down to not even needing a class at all and just creating a function that does what it is supposed to do.
You can apply a similar approach to these steps into building a function that does what you want or need it to do. You can use the same process to write a function that will do a binary search, hash table, etc.
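If you still want to model this as a binary search, one possible way (my assumption, not something the builds above do) is to pay for a single O(n) pass that records a running count of the target value. That running count never decreases, so std::lower_bound can locate the Nth occurrence in O(log n) per query. The names buildPrefixCounts and nthByBinarySearch are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// prefix[i] = number of occurrences of target in data[0..i].
// The sequence never decreases, so it can be binary searched.
std::vector<std::size_t> buildPrefixCounts(const std::vector<int>& data, int target) {
    std::vector<std::size_t> prefix(data.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == target) ++count;
        prefix[i] = count;
    }
    return prefix;
}

// The first position whose running count reaches n is the n-th occurrence.
// Returns -1 if n is 0 or target occurs fewer than n times.
long nthByBinarySearch(const std::vector<std::size_t>& prefix, std::size_t n) {
    if (n == 0) return -1;
    auto it = std::lower_bound(prefix.begin(), prefix.end(), n);
    if (it == prefix.end()) return -1;
    return static_cast<long>(it - prefix.begin());
}
```

This only pays off when the same array is queried many times; for a single query the plain linear scan does less work.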

Related

How do I Optimize my C++ key-value program to have a faster runtime?

This is a2.hpp, the file that can be edited. As far as I know the code is correct, just too slow. I am honestly lost here; I know my for loops are probably what's slowing me down so much. Maybe use an iterator?
// <algorithm>, <list>, <vector>
// YOU CAN CHANGE/EDIT ANY CODE IN THIS FILE AS LONG AS SEMANTICS IS UNCHANGED
#ifndef A2_HPP
#define A2_HPP
#include <algorithm>
#include <list>
#include <vector>
class key_value_sequences {
private:
std::list<std::vector<int>> seq;
std::vector<std::vector<int>> keyref;
public:
// YOU SHOULD USE C++ CONTAINERS TO AVOID RAW POINTERS
// IF YOU DECIDE TO USE POINTERS, MAKE SURE THAT YOU MANAGE MEMORY PROPERLY
// IMPLEMENT ME: SHOULD RETURN SIZE OF A SEQUENCE FOR GIVEN KEY
// IF NO SEQUENCE EXISTS FOR A GIVEN KEY RETURN 0
int size(int key) const;
// IMPLEMENT ME: SHOULD RETURN POINTER TO A SEQUENCE FOR GIVEN KEY
// IF NO SEQUENCE EXISTS FOR A GIVEN KEY RETURN nullptr
const int* data(int key) const;
// IMPLEMENT ME: INSERT VALUE INTO A SEQUENCE IDENTIFIED BY GIVEN KEY
void insert(int key, int value);
}; // class key_value_sequences
int key_value_sequences::size(int key) const {
//checks if the key is invalid or the count vector is empty.
if(key<0 || keyref[key].empty()) return 0;
// subtract 1 because the first element is the key to access the count
return keyref[key].size() -1;
}
const int* key_value_sequences::data(int key) const {
//checks if key index or ref vector is invalid
if(key<0 || keyref.size() < static_cast<unsigned int>(key+1)) {
return nullptr;
}
// ->at(1) accesses the count (skipping the key) with a pointer
return &keyref[key].at(1);
}
void key_value_sequences::insert(int key, int value) {
//checks if key is valid and if the count vector needs to be resized
if(key>=0 && keyref.size() < static_cast<unsigned int>(key+1)) {
keyref.resize(key+1);
std::vector<int> val;
seq.push_back(val);
seq.back().push_back(key);
seq.back().push_back(value);
keyref[key] = seq.back();
}
//the index is already valid
else if(key >=0) keyref[key].push_back(value);
}
#endif // A2_HPP
This is a2.cpp, this just tests the functionality of a2.hpp, this code cannot be changed
// DO NOT EDIT THIS FILE !!!
// YOUR CODE MUST BE CONTAINED IN a2.hpp ONLY
#include <iostream>
#include "a2.hpp"
int main(int argc, char* argv[]) {
key_value_sequences A;
{
key_value_sequences T;
// k will be our key
for (int k = 0; k < 10; ++k) { //the actual tests will have way more than 10 sequences.
// v is our value
// here we are creating 10 sequences:
// key = 0, sequence = (0)
// key = 1, sequence = (0 1)
// key = 2, sequence = (0 1 2)
// ...
// key = 9, sequence = (0 1 2 3 4 5 6 7 8 9)
for (int v = 0; v < k + 1; ++v) T.insert(k, v);
}
T = T;
key_value_sequences V = T;
A = V;
}
std::vector<int> ref;
if (A.size(-1) != 0) {
std::cout << "fail" << std::endl;
return -1;
}
for (int k = 0; k < 10; ++k) {
if (A.size(k) != k + 1) {
std::cout << "fail";
return -1;
} else {
ref.clear();
for (int v = 0; v < k + 1; ++v) ref.push_back(v);
if (!std::equal(ref.begin(), ref.end(), A.data(k))) {
std::cout << "fail 3 " << A.data(k) << " " << ref[k];
return -1;
}
}
}
std::cout << "pass" << std::endl;
return 0;
} // main
If anyone could help me improve my codes efficiency I would really appreciate it, thanks.

First, I'm not convinced your code is correct. In insert, if the key is valid you create a new vector and insert it into the sequence. That sounds wrong, as it should only happen for a new key, but if your tests pass it might be fine.
Performance wise:
Avoid std::list. Linked lists have terrible performance on today's hardware because they break pipelining, caching and pre-fetching. Always use std::vector instead. If the payload is really big and you are worried about copies, use std::vector<std::unique_ptr<T>>.
Try to avoid copying vectors. In your code you have keyref[key] = seq.back() which copies the vector, but should be fine since it's only one element.
Otherwise there are no obvious performance problems. Try to benchmark and profile your program and see where the slow parts are. Usually there are one or two places that you need to optimize to get great performance. If it's still too slow, ask another question where you post your results so that we can better understand the problem.

I will join Sorin in saying don't use std::list if avoidable.
So you use key as a direct index. Where does it say it is non-negative? Where does it say it's less than 100000000?
void key_value_sequences::insert(int key, int value) {
//checks if key is valid and if the count vector needs to be resized
if(key>=0 && keyref.size() < static_cast<unsigned int>(key+1)) {
keyref.resize(key+1); // could be large
std::vector<int> val; // don't need this temporary.
seq.push_back(val); // seq is useless?
seq.back().push_back(key);
seq.back().push_back(value);
keyref[key] = seq.back(); // we now have 100000000-1 empty indexes
}
//the index is already valid
else if(key >=0) keyref[key].push_back(value);
}
Can it be done faster? Depending on your key range, yes it can. You will need to implement a flat_map or hash_map.
C++11 concept code for a flat_map version.
// effectively a binary search; keyref must stay sorted by key (element 0)
// find_it also has to be declared as a const member in the class
std::vector<std::vector<int>>::const_iterator key_value_sequences::find_it(int key) const {
    return std::lower_bound(keyref.begin(), keyref.end(), key,
        [](const std::vector<int>& check, int k) {
            return check[0] < k; // key is 0-element
        });
}
void key_value_sequences::insert(int key, int value) {
    auto found = find_it(key);
    std::vector<std::vector<int>>::iterator entry;
    // at the end or not found
    if (found == keyref.cend() || found->front() != key) {
        entry = keyref.insert(found, std::vector<int>{key}); // add entry holding only the key
    } else {
        entry = keyref.begin() + (found - keyref.cbegin()); // const_iterator -> iterator
    }
    entry->emplace_back(value); // update entry, whether new or old.
}
const int* key_value_sequences::data(int key) const {
    auto found = find_it(key);
    // lower_bound may land on a different key, so compare it as well
    if (found == keyref.end() || found->front() != key)
        return nullptr;
    // at(1) skips the key stored at index 0; take its address for the pointer
    return &found->at(1);
}
(hope I got that right ...)
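For the hash_map option, a rough sketch could look like the following. It assumes std::unordered_map is allowed, which the assignment's header comment (<algorithm>, <list>, <vector>) may not permit, and it uses a hypothetical class name to avoid clashing with the original. Only size, data and insert mirror the interface from the question.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// One vector of values per key; no slot is allocated for unused keys,
// so a single key of 100000000 costs one entry instead of a huge resize.
class key_value_sequences_hashed {
    std::unordered_map<int, std::vector<int>> seqs;

public:
    // Number of values stored for key, or 0 if the key was never inserted.
    int size(int key) const {
        auto it = seqs.find(key);
        return it == seqs.end() ? 0 : static_cast<int>(it->second.size());
    }

    // Pointer to the contiguous values for key, or nullptr if absent.
    const int* data(int key) const {
        auto it = seqs.find(key);
        return it == seqs.end() ? nullptr : it->second.data();
    }

    // Appends value to the sequence for key, creating it on first use.
    void insert(int key, int value) { seqs[key].push_back(value); }
};
```

Unlike the original, this version does not store the key as element 0 of each sequence, so size needs no "minus one" and data points straight at the first value. Note that it would also accept negative keys if they were inserted.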

Sorting one std::vector based on the content of another [duplicate]

This question already has answers here:
How can I sort two vectors in the same way, with criteria that uses only one of the vectors?
(9 answers)
Closed 9 months ago.
I have several std::vector, all of the same length. I want to sort one of these vectors, and apply the same transformation to all of the other vectors. Is there a neat way of doing this? (preferably using the STL or Boost)? Some of the vectors hold ints and some of them std::strings.
Pseudo code:
std::vector<int> Index = { 3, 1, 2 };
std::vector<std::string> Values = { "Third", "First", "Second" };
Transformation = sort(Index);
Index is now { 1, 2, 3};
... magic happens as Transformation is applied to Values ...
Values are now { "First", "Second", "Third" };

friol's approach is good when coupled with yours. First, build a vector consisting of the numbers 0…n-1, along with iterators to the elements of the vector dictating the sorting order:
typedef vector<int>::const_iterator myiter;
vector<pair<size_t, myiter> > order(Index.size());
size_t n = 0;
for (myiter it = Index.begin(); it != Index.end(); ++it, ++n)
order[n] = make_pair(n, it);
Now you can sort this array using a custom sorter:
struct ordering {
bool operator ()(pair<size_t, myiter> const& a, pair<size_t, myiter> const& b) {
return *(a.second) < *(b.second);
}
};
sort(order.begin(), order.end(), ordering());
Now you've captured the order of rearrangement inside order (more precisely, in the first component of the items). You can now use this ordering to sort your other vectors. There's probably a very clever in-place variant running in the same time, but until someone else comes up with it, here's one variant that isn't in-place. It uses order as a look-up table for the new index of each element.
template <typename T>
vector<T> sort_from_ref(
vector<T> const& in,
vector<pair<size_t, myiter> > const& reference
) {
vector<T> ret(in.size());
size_t const size = in.size();
for (size_t i = 0; i < size; ++i)
ret[i] = in[reference[i].first];
return ret;
}

typedef std::vector<int> int_vec_t;
typedef std::vector<std::string> str_vec_t;
typedef std::vector<size_t> index_vec_t;
class SequenceGen {
public:
SequenceGen (int start = 0) : current(start) { }
int operator() () { return current++; }
private:
int current;
};
class Comp{
int_vec_t& _v;
public:
Comp(int_vec_t& v) : _v(v) {}
bool operator()(size_t i, size_t j){
return _v[i] < _v[j];
}
};
index_vec_t indices(3);
std::generate(indices.begin(), indices.end(), SequenceGen(0));
//indices are {0, 1, 2}
int_vec_t Index = { 3, 1, 2 };
str_vec_t Values = { "Third", "First", "Second" };
std::sort(indices.begin(), indices.end(), Comp(Index));
//now indices are {1,2,0}
Now you can use the "indices" vector to index into "Values" vector.
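As a sketch of that last step, the sorted indices can be used to build a reordered copy of any parallel vector. applyIndices and sortedValuesExample are illustrative names that do not appear in the answer above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds a reordered copy: result[k] = values[indices[k]].
template <typename T>
std::vector<T> applyIndices(const std::vector<T>& values,
                            const std::vector<std::size_t>& indices) {
    std::vector<T> result;
    result.reserve(indices.size());
    for (std::size_t i : indices) {
        result.push_back(values.at(i));  // at() throws on an out-of-range index
    }
    return result;
}

// The example from the question: indices {1, 2, 0} put the strings in sorted order.
std::vector<std::string> sortedValuesExample() {
    return applyIndices<std::string>({"Third", "First", "Second"}, {1, 2, 0});
}
```

The same indices vector can be reused for every other vector of the same length, which is the point of sorting indices rather than the data itself.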

Put your values in a Boost Multi-Index container then iterate over to read the values in the order you want. You can even copy them to another vector if you want to.

Only one rough solution comes to my mind: create a vector that is the sum of all other vectors (a vector of structures, like {3,Third,...},{1,First,...}) then sort this vector by the first field, and then split the structures again.
Probably there is a better solution inside Boost or using the standard library.

You can probably define a custom "facade" iterator that does what you need here. It would store iterators to all your vectors or alternatively derive the iterators for all but the first vector from the offset of the first. The tricky part is what that iterator dereferences to: think of something like boost::tuple and make clever use of boost::tie. (If you wanna extend on this idea, you can build these iterator types recursively using templates but you probably never want to write down the type of that - so you either need c++0x auto or a wrapper function for sort that takes ranges)

I think what you really need (but correct me if I'm wrong) is a way to access elements of a container in some order.
Rather than rearranging my original collection, I would borrow a concept from Database design: keep an index, ordered by a certain criterion. This index is an extra indirection that offers great flexibility.
This way it is possible to generate multiple indices according to different members of a class.
using namespace std;
template< typename Iterator, typename Comparator >
struct Index {
vector<Iterator> v;
Index( Iterator from, Iterator end, const Comparator& c ){ // const&: also accepts temporaries
v.reserve( std::distance(from,end) );
for( ; from != end; ++from ){
v.push_back(from); // no deref!
}
sort( v.begin(), v.end(), c );
}
};
template< typename Iterator, typename Comparator >
Index<Iterator,Comparator> index ( Iterator from, Iterator end, const Comparator& c ){
return Index<Iterator,Comparator>(from,end,c);
}
struct mytype {
string name;
double number;
};
template< typename Iter >
struct NameLess : public binary_function<Iter, Iter, bool> {
bool operator()( const Iter& t1, const Iter& t2 ) const { return t1->name < t2->name; }
};
template< typename Iter >
struct NumLess : public binary_function<Iter, Iter, bool> {
bool operator()( const Iter& t1, const Iter& t2 ) const { return t1->number < t2->number; }
};
void indices() {
mytype v[] = { { "me" , 0.0 }
, { "you" , 1.0 }
, { "them" , -1.0 }
};
mytype* vend = v + sizeof(v) / sizeof(v[0]); // portable replacement for MSVC's _countof
Index<mytype*, NameLess<mytype*> > byname( v, vend, NameLess<mytype*>() );
Index<mytype*, NumLess <mytype*> > bynum ( v, vend, NumLess <mytype*>() );
assert( byname.v[0] == v+0 );
assert( byname.v[1] == v+2 );
assert( byname.v[2] == v+1 );
assert( bynum.v[0] == v+2 );
assert( bynum.v[1] == v+0 );
assert( bynum.v[2] == v+1 );
}

A slightly more compact variant of xtofl's answer, for when you just want to iterate through all your vectors in the order given by a single keys vector. Create a permutation vector and use it to index into your other vectors.
#include <boost/iterator/counting_iterator.hpp>
#include <vector>
#include <algorithm>
std::vector<double> keys = ...
std::vector<double> values = ...
std::vector<size_t> indices(boost::counting_iterator<size_t>(0u), boost::counting_iterator<size_t>(keys.size()));
std::sort(begin(indices), end(indices), [&](size_t lhs, size_t rhs) {
return keys[lhs] < keys[rhs];
});
// Now to iterate through the values array.
for (size_t i: indices)
{
std::cout << values[i] << std::endl;
}
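If Boost is not available, the same permutation vector can be built with std::iota from <numeric>. This is a sketch under that assumption; sortedPermutation is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the permutation that sorts keys in ascending order:
// keys[result[0]] <= keys[result[1]] <= ...
std::vector<std::size_t> sortedPermutation(const std::vector<double>& keys) {
    std::vector<std::size_t> indices(keys.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});  // 0, 1, ..., n-1
    std::sort(indices.begin(), indices.end(),
              [&keys](std::size_t lhs, std::size_t rhs) { return keys[lhs] < keys[rhs]; });
    return indices;
}
```

For keys {3.0, 1.0, 2.0} this yields {1, 2, 0}, which can then drive the same range-based loop over values.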

ltjax's answer is a great approach - which is actually implemented in boost's zip_iterator http://www.boost.org/doc/libs/1_43_0/libs/iterator/doc/zip_iterator.html
It packages together into a tuple whatever iterators you provide it.
You can then create your own comparison function for a sort based on any combination of iterator values in your tuple. For this question, it would just be the first iterator in your tuple.
A nice feature of this approach is that it allows you to keep the memory of each individual vector contiguous (if you're using vectors and that's what you want). You also don't need to store a separate index vector of ints.

This would have been an addendum to Konrad's answer, as it is an approach for an in-place variant of applying the sort order to a vector. Since the edit won't go through, I will put it here.
Here is an in-place variant with a slightly higher time complexity, due to the primitive operation of checking a boolean. The additional space is a vector<bool>, whose implementation can be space efficient depending on the compiler. That extra vector can be eliminated if the given order itself may be modified. Here is an example of what the algorithm does.
If the order is 3 0 4 1 2, the movement of the elements as indicated by the position indices would be 3--->0; 0--->1; 1--->3; 2--->4; 4--->2.
template<typename T>
struct applyOrderinPlace
{
void operator()(const vector<size_t>& order, vector<T>& vectoOrder)
{
vector<bool> indicator(order.size(),0);
size_t start = 0, cur = 0, next = order[cur];
size_t indx = 0;
T tmp;
while(indx < order.size())
{
//find unprocessed index
if(indicator[indx])
{
++indx;
continue;
}
start = indx;
cur = start;
next = order[cur];
tmp = vectoOrder[start];
while(next != start)
{
vectoOrder[cur] = vectoOrder[next];
indicator[cur] = true;
cur = next;
next = order[next];
}
vectoOrder[cur] = tmp;
indicator[cur] = true;
}
}
};
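The remark about dropping the extra vector when the order may be modified could be sketched like this. It is my own reading of that idea, not code from the answer: applyOrderConsuming is a hypothetical name, the function assumes order is a valid permutation of 0..n-1, and it overwrites order with the identity permutation as it goes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reorders values in place so that new values[i] == old values[order[i]].
// order doubles as the "done" marker: a slot k is final once order[k] == k.
template <typename T>
void applyOrderConsuming(std::vector<std::size_t>& order, std::vector<T>& values) {
    assert(order.size() == values.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        std::size_t cur = i;
        // Follow the cycle that starts at i until it points back to i.
        while (order[cur] != i) {
            const std::size_t next = order[cur];
            std::swap(values[cur], values[next]);
            order[cur] = cur;  // mark this slot as final
            cur = next;
        }
        order[cur] = cur;  // close the cycle
    }
}
```

With order 3 0 4 1 2 and values a b c d e, the result is d a e b c, matching the movements listed above, and order ends up as 0 1 2 3 4.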

Here is a relatively simple implementation using index mapping between the ordered and unordered names that will be used to match the ages to the ordered names:
void ordered_pairs()
{
std::vector<std::string> names;
std::vector<int> ages;
// read input and populate the vectors
populate(names, ages);
// print input
print(names, ages);
// sort pairs
std::vector<std::string> sortedNames(names);
std::sort(sortedNames.begin(), sortedNames.end());
std::vector<int> indexMap;
for(unsigned int i = 0; i < sortedNames.size(); ++i)
{
for (unsigned int j = 0; j < names.size(); ++j)
{
if (sortedNames[i] == names[j])
{
indexMap.push_back(j);
break;
}
}
}
// use the index mapping to match the ages to the names
std::vector<int> sortedAges;
for(size_t i = 0; i < indexMap.size(); ++i)
{
sortedAges.push_back(ages[indexMap[i]]);
}
std::cout << "Ordered pairs:\n";
print(sortedNames, sortedAges);
}
For the sake of completeness, here are the functions populate() and print():
void populate(std::vector<std::string>& n, std::vector<int>& a)
{
std::string prompt("Type name and age, separated by white space; 'q' to exit.\n>>");
std::string sentinel = "q";
while (true)
{
// read input
std::cout << prompt;
std::string input;
getline(std::cin, input);
// exit input loop
if (input == sentinel)
{
break;
}
std::stringstream ss(input);
// extract input
std::string name;
int age;
if (ss >> name >> age)
{
n.push_back(name);
a.push_back(age);
}
else
{
std::cout <<"Wrong input format!\n";
}
}
}
and:
void print(const std::vector<std::string>& n, const std::vector<int>& a)
{
if (n.size() != a.size())
{
std::cerr <<"Different number of names and ages!\n";
return;
}
for (unsigned int i = 0; i < n.size(); ++i)
{
std::cout <<'(' << n[i] << ", " << a[i] << ')' << "\n";
}
}
And finally, main() becomes:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
void ordered_pairs();
void populate(std::vector<std::string>&, std::vector<int>&);
void print(const std::vector<std::string>&, const std::vector<int>&);
//=======================================================================
int main()
{
std::cout << "\t\tSimple name - age sorting.\n";
ordered_pairs();
}
//=======================================================================
// Function Definitions...

// C++ program to demonstrate sorting in vector
// of pair according to 2nd element of pair
#include <iostream>
#include<string>
#include<vector>
#include <algorithm>
using namespace std;
// Driver function to sort the vector elements
// by second element of pairs
bool sortbysec(const pair<char,int> &a,
const pair<char,int> &b)
{
return (a.second < b.second);
}
int main()
{
// declaring vector of pairs
vector< pair <char, int> > vect;
// Initialising 1st and 2nd element of pairs
// with array values
//int arr[] = {10, 20, 5, 40 };
//int arr1[] = {30, 60, 20, 50};
char arr[] = { 'a', 'b', 'c' };
int arr1[] = { 4, 7, 1 };
int n = sizeof(arr)/sizeof(arr[0]);
// Entering values in vector of pairs
for (int i=0; i<n; i++)
vect.push_back( make_pair(arr[i],arr1[i]) );
// Printing the original vector(before sort())
cout << "The vector before sort operation is:\n" ;
for (int i=0; i<n; i++)
{
// "first" and "second" are used to access
// 1st and 2nd element of pair respectively
cout << vect[i].first << " "
<< vect[i].second << endl;
}
// Using sort() function to sort by 2nd element
// of pair
sort(vect.begin(), vect.end(), sortbysec);
// Printing the sorted vector(after using sort())
cout << "The vector after sort operation is:\n" ;
for (int i=0; i<n; i++)
{
// "first" and "second" are used to access
// 1st and 2nd element of pair respectively
cout << vect[i].first << " "
<< vect[i].second << endl;
}
getchar();
return 0;`enter code here`
}**

With lambdas and the STL algorithms (the auto lambda parameters require C++14), based on answers from Konrad Rudolph and Gabriele D'Antona:
template< typename T, typename U >
std::vector<T> sortVecAByVecB( std::vector<T> & a, std::vector<U> & b ){
// zip the two vectors (A,B)
std::vector<std::pair<T,U>> zipped(a.size());
for( size_t i = 0; i < a.size(); i++ ) zipped[i] = std::make_pair( a[i], b[i] );
// sort according to B
std::sort(zipped.begin(), zipped.end(), []( auto & lop, auto & rop ) { return lop.second < rop.second; });
// extract sorted A
std::vector<T> sorted;
std::transform(zipped.begin(), zipped.end(), std::back_inserter(sorted), []( auto & pair ){ return pair.first; });
return sorted;
}

So many people have asked this question and nobody came up with a satisfactory answer. Here is a std::sort helper that sorts two vectors simultaneously, taking into account the values of only one of them. This solution is based on a custom RandomIt (random-access iterator) and operates directly on the original vector data, without temporary copies, structure rearrangement or additional indices:
C++, Sort One Vector Based On Another One

Trying to combine like terms for a templated polynomial class using recursion in c++

I'm teaching myself C++.
I'm trying to combine polynomials. For this I have defined straightforward classes:
Polynomial<T>, Term<T> and Coefficient<T> (which may also just be
complex<T>) using simple value composition. I have defined the required operator overloads.
Polynomials compare by sorting their terms (std::sort).
I am working on combineLikeTerms(); This method when called will first call
another member method that will sort this vector of Terms. For example:
4x^3 + 5x^2 + 3x - 4
would be a possible resulting sorted vector.
Question:
I am using two iterators on this vector and I'm trying to merge adjacent terms
of the same order.
Let's say our initial vector after being sorted is this:
4x^3 - 2x^3 + x^3 - 2x^2 + x ...
After the function completes its iterations, the temp stack vector would then
look like this: 2x^3 + x^3 - 2x^2 + x ... If we look, there are still like terms,
so this needs to be refactored again.
How do I do this? I'm thinking of using recursion.
// ------------------------------------------------------------------------- //
// setPolynomialByDegreeOfExponent()
// should be called before combineLikeTerms
template <class T>
void Polynomial<T>::setPolynomialByDegreeOfExponent()
{
unsigned int uiIndex = _uiNumTerms - 1;
if ( uiIndex < 1 )
{
return;
}
struct _CompareOperator_
{
bool operator() ( math::Term<T> a, Term<T> b )
{
return ( a.getDegreeOfTerm() > b.getDegreeOfTerm() );
} // operator()
};
stable_sort( _vTerms.begin(), _vTerms.end(), _CompareOperator_() );
} // setPolynomialByDegreeOfExponent
// ------------------------------------------------------------------------- //
// addLikeTerms()
template <class T>
bool Polynomial<T>::addLikeTerms( const Term<T>& termA, const Term<T>& termB, Term<T>& result ) const
{
if ( termA.termsAreAlike( termB ) )
{
result = termA + termB;
return true;
}
return false;
} // addLikeTerms
// ------------------------------------------------------------------------- //
// combineLikeTerms()
template <class T>
void Polynomial<T>::combineLikeTerms()
{
// First We Order Our Terms.
setPolynomialByDegreeOfExponent();
// Nothing To Do Then
if ( _vTerms.size() == 1 )
{
return;
}
Term<T> result; // Temp Variable
// No Need To Do The Work Below This If Statement This Is Simpler
if ( _vTerms.size() == 2 )
{
if ( addLikeTerms( _vTerms.at(0), _vTerms.at(1), result ) )
{
_vTerms.clear();
_vTerms.push_back( result );
}
return;
}
// For 3 Or More Terms
std::vector<Term<T>> vTempTerms; // Temp storage
typename std::vector<Term<T>>::iterator it = _vTerms.begin();
typename std::vector<Term<T>>::iterator it2 = _vTerms.begin()+1;
bool bFound = addLikeTerms( *it, *it2, result );
while ( it2 != _vTerms.end() )
{
if ( bFound )
{
// Odd Case Last Three Elems
if ( (it2 == (_vTerms.end()-2)) && ((it2+1) == (_vTerms.end()-1)) )
{
vTempTerms.push_back( result );
vTempTerms.push_back( _vTerms.back() );
break;
}
// Even Case Last Two Elems
else if ( (it2 == (_vTerms.end()-1)) && (it == (_vTerms.end()-2)) )
{
vTempTerms.push_back( result );
break;
}
else
{
vTempTerms.push_back( result );
it += 2; // Increment by 2
it2 += 2; // Increment by 2
bFound = addLikeTerms( *it, *it2, result );
}
}
else {
// Push Only First One
vTempTerms.push_back( *it );
it++; // Increment By 1
it2++; // Increment By 1
// Test Our Second Iterator
if ( it2 == _vTerms.end() )
{
vTempTerms.push_back( *(--it2) ); // same as using _vTerms.back()
}
else
{
bFound = addLikeTerms( *it, *it2, result );
}
}
}
// Now That We Have Went Through Our Container, We Need To Update It
_vTerms.clear();
_vTerms = vTempTerms;
// At This point our stack variable should contain all elements from above,
// however this temp variable can still have like terms in it.
// ??? Where do I call the recursion and how do I define the base case
// to stop the execution of the recursion where the base case is a
// sorted std::vector of Term<T> objects that no two terms that are alike...
// I do know that the recursion has to happen after the above while loop
} // combineLikeTerms
Can someone help me find the next step? I'd be happy to hear about any bugs/efficiency issues in the code shown.
I love c++

Here's my take on it in modern C++.
Note the extra optimization of dropping terms with an effective coefficient of zero
Self contained sample: http://liveworkspace.org/code/ee68769826a80d4c7dc314e9b792052b
Update: posted a c++03 version of this http://ideone.com/aHuB8
#include <algorithm>
#include <vector>
#include <functional>
#include <iostream>
template <typename T>
struct Term
{
T coeff;
int exponent;
};
template <typename T>
struct Poly
{
typedef Term<T> term_t;
std::vector<term_t> _terms;
Poly(std::vector<term_t> terms) : _terms(terms) { }
void combineLikeTerms()
{
if (_terms.empty())
return;
std::vector<term_t> result;
std::sort(_terms.begin(), _terms.end(),
[] (term_t const& a, term_t const& b) { return a.exponent > b.exponent; });
term_t accum = { T(), 0 };
for(auto curr=_terms.begin(); curr!=_terms.end(); ++curr)
{
if (curr->exponent == accum.exponent)
accum.coeff += curr->coeff;
else
{
if (accum.coeff != 0)
result.push_back(accum);
accum = *curr;
}
}
if (accum.coeff != 0)
result.push_back(accum);
std::swap(_terms, result); // only update if no exception
}
};
int main()
{
Poly<int> demo({ { 4, 1 }, { 6, 7 }, {-3, 1 }, { 5, 5 } });
demo.combineLikeTerms();
for (auto it = demo._terms.begin(); it!= demo._terms.end(); ++it)
std::cout << (it->coeff>0? " +" : " ") << it->coeff << "x^" << it->exponent;
std::cout << "\n";
}

You need to look at the polynomial as a sequence of pairs (coefficient,variable):
[(coefficient1,variable1),(coefficient2,variable2),(coefficient3,variable3),...]
As you describe, you iterate through this from left to right, combining two adjacent pairs into one whenever the variable part is identical (this of course assumes that the list has already been sorted by the variable part!).
Now what happens when there are three or more elements in this list that share their variables? Well, then just keep combining them. There is no need for recursion or anything complicated, really.
At any point during the iteration you compare the variable part of the current pair with the variable part last seen. If they are identical, you combine the two pairs and simply continue. If the next pair you get still has the same variable part as the one last seen, well then you combine them again. If you do this correctly, there shouldn't be any duplicates left.
Here is an example of how to do this. It creates a new pair list and then iterates through the input list. For each item of the input list, it either combines the item with the one last pushed to the new list or adds it to the new list as a new element:
#include <utility>
#include <vector>
#include <string>
#include <iostream>
typedef std::vector<std::pair<float,std::string>> Polynomial;
Polynomial combine_like_terms(const Polynomial &poly)
{
if (poly.empty())
return poly;
/* Here we store the new, cleaned-up polynomial: */
Polynomial clean_poly;
/* Now we iterate: */
auto it = begin(poly);
clean_poly.push_back(*it);
++it;
while (it != end(poly)) {
if (clean_poly.back().second == it->second)
clean_poly.back().first += it->first; // Like term found!
else
clean_poly.push_back(*it); // Sequence of like-terms ended!
++it;
}
return clean_poly;
}
int main()
{
Polynomial polynomial {
{ 1.0 , "x^2" },
{ 1.4 , "x^3" },
{ 2.6 , "x^3" },
{ 0.2 , "x^3" },
{ 2.3 , "x" },
{ 0.7 , "x" }
};
Polynomial clean_polynomial = combine_like_terms(polynomial);
for (auto term : clean_polynomial)
std::cout << '(' << term.first << ',' << term.second << ")\n";
std::cout.flush();
return 0;
}
You can easily make this templated again if you need to – I used float for the coefficients and strings for the variable part. It's really just a code example to show how this can be done easily without recursion or lots of iterators used in parallel.
Oh, and the code is written for C++11. Again, it's just a model and can be adjusted for C++03.

Storing set of non-overlapping ranges and finding whether a value is present in any one of the ranges strictly

I have a set of ranges :
Range1 ---- (0-10)
Range2 ---- (15-25)
Range3 ---- (100-1000) and likewise.
I would like to store only the bounds, since that is more efficient than storing every value of a large range.
Now I need to search for a number, say 14. In this case, 14 is not present in any of the ranges, whereas 16 (say) is present in one of the ranges.
I would need a function
bool search(ranges, searchvalue)
{
if searchvalues present in any of the ranges
return true;
else
return false;
}
How can this best be done? The ranges are strictly non-overlapping, and the important criterion is that the search has to be as efficient as possible.

One possibility is to represent ranges as a pair of values and define a suitable comparison function. The following should consider one range less than another if its bounds are smaller and there is no overlap. As a side effect, this comparison function doesn't let you store overlapping ranges in the set.
To look up an integer n, it can be treated as a range [n, n]:
#include <set>
#include <iostream>
typedef std::pair<int, int> Range;
struct RangeCompare
{
//overlapping ranges are considered equivalent
bool operator()(const Range& lhv, const Range& rhv) const
{
return lhv.second < rhv.first;
}
};
bool in_range(const std::set<Range, RangeCompare>& ranges, int value)
{
return ranges.find(Range(value, value)) != ranges.end();
}
int main()
{
std::set<Range, RangeCompare> ranges;
ranges.insert(Range(0, 10));
ranges.insert(Range(15, 25));
ranges.insert(Range(100, 1000));
std::cout << in_range(ranges, 14) << ' ' << in_range(ranges, 16) << '\n';
}

The standard way to handle this is through so-called interval trees. Basically, you augment an ordinary red-black tree with additional information so that each node x contains an interval x.int and the key of x is the low endpoint, x.int.low, of the interval. Each node x also contains a value x.max, which is the maximum value of any interval endpoint stored in the subtree rooted at x. Now you can determine x.max given interval x.int and the max values of node x’s children as follows:
x.max = max(x.int.high, x.left.max, x.right.max)
This implies that, with n intervals, insertion and deletion run in O(lg n) time. In fact, it is possible to update the max attributes after a rotation in O(1) time. Here is how to search the interval tree T for a node whose interval overlaps a query interval i (a single value v is the interval [v, v]):
INTERVAL-SEARCH(T, i)
    x = T.root
    while x is different from T.nil and i does not overlap x.int
        if x.left is different from T.nil and x.left.max is greater than or equal to i.low
            x = x.left
        else
            x = x.right
    return x
The complexity of the search procedure is O(lg n) as well.
To see why, see CLRS Introduction to algorithms, chapter 14 (Augmenting Data Structures).
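As a rough illustration of the search above, here is a minimal sketch in C++. It assumes a plain, unbalanced binary search tree instead of a red-black tree, so the O(lg n) bound only holds while the tree happens to stay balanced. The names Node, insert and interval_search are made up for this example and are not from CLRS.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>

// Hypothetical node type: closed interval [low, high], keyed on low.
struct Node {
    int low, high;
    int max;                              // largest high endpoint in this subtree
    std::unique_ptr<Node> left, right;
    Node(int l, int h) : low(l), high(h), max(h) {}
};

// Plain BST insert (no rebalancing); max is updated on the way down.
void insert(std::unique_ptr<Node>& root, int low, int high) {
    assert(low <= high);
    if (!root) {
        root.reset(new Node(low, high));
        return;
    }
    root->max = std::max(root->max, high);
    insert(low < root->low ? root->left : root->right, low, high);
}

// Returns a node whose interval overlaps [low, high], or nullptr if none does.
const Node* interval_search(const Node* x, int low, int high) {
    while (x && (high < x->low || x->high < low)) {
        if (x->left && x->left->max >= low)
            x = x->left.get();            // an overlap, if any, is on the left
        else
            x = x->right.get();
    }
    return x;
}
```

For the ranges in the question you would insert (0, 10), (15, 25) and (100, 1000), then test a value v with interval_search(root.get(), v, v) != nullptr.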

You could put something together based on std::map and std::map::upper_bound:
Assuming you have
std::map<int,int> ranges; // key is start of range, value is end of range
You could do the following:
bool search(const std::map<int,int>& ranges, int searchvalue)
{
auto p = ranges.upper_bound(searchvalue);
// p->first > searchvalue
if(p == ranges.begin())
return false;
--p; // p->first <= searchvalue
return searchvalue >= p->first && searchvalue <= p->second;
}
I'm using C++11, if you use C++03, you'll need to replace "auto" by the proper iterator type.
EDIT: replaced pseudo-code inrange() by explicit expression in return statement.

A good solution is the following. It is O(log(n)).
A critical condition is non overlapping ranges.
#include <set>
#include <iostream>
#include <assert.h>
template <typename T> struct z_range
{
T s , e ;
z_range ( T const & s,T const & e ) : s(s<=e?s:e), e(s<=e?e:s)
{
}
};
template <typename T> bool operator < (z_range<T> const & x , z_range<T> const & y )
{
if ( x.e<y.s)
return true ;
return false ;
}
int main(int , char *[])
{
std::set<z_range<int> > x;
x.insert(z_range<int>(20,10));
x.insert(z_range<int>(30,40));
x.insert(z_range<int>(5,9));
x.insert(z_range<int>(45,55));
if (x.find(z_range<int>(15,15)) != x.end() )
std::cout << "I have it" << std::endl ;
else
std::cout << "not exists" << std::endl ;
}

If you have ranges ri = [ai, bi], you could sort all the ai, put them into an array, and use binary search to find the largest ai with ai <= x.
After you have found this element, you have to check whether x <= bi.
This is suitable if you have big numbers. If, on the other hand, you have either a lot of memory or small numbers, you can think about putting those ranges into a bool array. This may be suitable if you have a lot of queries:
bool ar[];
ar[0..10] = true;
ar[15..25] = true;
// ...
bool check(int searchValues) {
return ar[searchValues];
}
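A minimal sketch of the binary-search variant from this answer, assuming the ranges are closed, non-overlapping and kept in a vector sorted by lower bound. The name in_sorted_ranges is made up for illustration. std::upper_bound finds the first range that starts after x, so the only candidate is the range just before it:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: `ranges` holds closed, non-overlapping [a, b] pairs
// sorted by their lower bound a.
bool in_sorted_ranges(const std::vector<std::pair<int, int>>& ranges, int x) {
    assert(std::is_sorted(ranges.begin(), ranges.end()));
    // First range whose lower bound is strictly greater than x.
    auto it = std::upper_bound(
        ranges.begin(), ranges.end(), x,
        [](int value, const std::pair<int, int>& r) { return value < r.first; });
    if (it == ranges.begin())
        return false;                     // x lies below every lower bound
    --it;                                 // range with the largest a such that a <= x
    return x <= it->second;
}
```

Each query costs O(log n) comparisons, and the vector stores only the two bounds per range.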

Since the ranges are non-overlapping, the only thing left to do is to perform a search within the range that fits the value. If the values are ordered within the ranges, searching is even simpler. Here is a summary of search algorithms.
With respect to C++ you also can use algorithms from STL or even functions provided by the containers, e. g. set::find.

So, this assumes the ranges are continuous (i.e. range [100,1000] contains all numbers between 100 and 1000):
#include <iostream>
#include <map>
#include <algorithm>
bool is_in_ranges(std::map<int, int> ranges, int value)
{
return
std::find_if(ranges.begin(), ranges.end(),
[&](std::pair<int,int> pair)
{
return value >= pair.first && value <= pair.second;
}
) != ranges.end();
}
int main()
{
std::map<int, int> ranges;
ranges[0] = 10;
ranges[15] = 25;
ranges[100] = 1000;
std::cout << is_in_ranges(ranges, 14) << '\n'; // 0
std::cout << is_in_ranges(ranges, 16) << '\n'; // 1
}
In C++03, you'd need a functor instead of a lambda function:
struct is_in {
is_in(int x) : value(x) {}
bool operator()(std::pair<int, int> pair)
{
return value >= pair.first && value <= pair.second;
}
private:
int value;
};
bool is_in_ranges(std::map<int, int> ranges, int value)
{
return
std::find_if(ranges.begin(), ranges.end(), is_in(value)) != ranges.end();
}

Erasing elements in std::vector by using indexes

I've a std::vector<int> and I need to remove all elements at given indexes (the vector usually has high dimensionality). I would like to know, which is the most efficient way to do such an operation having in mind that the order of the original vector should be preserved.
Although, I found related posts on this issue, some of them needed to remove one single element or multiple elements where the remove-erase idiom seemed to be a good solution.
In my case, however, I need to delete multiple elements and since I'm using indexes instead of direct values, the remove-erase idiom can't be applied, right?
My code is given below and I would like to know if it's possible to do better than that in terms of efficiency?
bool find_element(const vector<int> & vMyVect, int nElem){
return (std::find(vMyVect.begin(), vMyVect.end(), nElem)!=vMyVect.end()) ? true : false;
}
void remove_elements(){
srand ( time(NULL) );
int nSize = 20;
std::vector<int> vMyValues;
for(int i = 0; i < nSize; ++i){
vMyValues.push_back(i);
}
int nRandIdx;
std::vector<int> vMyIndexes;
for(int i = 0; i < 6; ++i){
nRandIdx = rand() % nSize;
vMyIndexes.push_back(nRandIdx);
}
std::vector<int> vMyResult;
for(int i=0; i < (int)vMyValues.size(); i++){
if(!find_element(vMyIndexes,i)){
vMyResult.push_back(vMyValues[i]);
}
}
}

I think it could be more efficient if you just sort your indices and then delete those elements from your vector from the highest to the lowest. Deleting the highest index first will not invalidate the lower indices you want to delete, because only the elements after the deleted one change their index.
Whether it is really more efficient will depend on how fast the sorting is. One more advantage of this solution is that you don't need a copy of your value vector; you can work directly on the original vector. The code should look something like this:
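... fill up the vectors ...
sort (vMyIndexes.begin(), vMyIndexes.end());
// drop duplicate indices; otherwise a repeated index would erase an extra element
vMyIndexes.erase(unique(vMyIndexes.begin(), vMyIndexes.end()), vMyIndexes.end());
for(int i=vMyIndexes.size() - 1; i >= 0; i--){
vMyValues.erase(vMyValues.begin() + vMyIndexes[i]);
}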

To avoid moving the same elements many times, we can move them in ranges between the deleted indexes:
// fill vMyIndexes, take care about duplicated values
vMyIndexes.push_back(-1); // to handle range from 0 to the first index to remove
vMyIndexes.push_back(vMyValues.size()); // to handle range from the last index to remove and to the end of values
std::sort(vMyIndexes.begin(), vMyIndexes.end());
std::vector<int>::iterator last = vMyValues.begin();
for (size_t i = 1; i != vMyIndexes.size(); ++i) {
size_t range_begin = vMyIndexes[i - 1] + 1;
size_t range_end = vMyIndexes[i];
std::copy(vMyValues.begin() + range_begin, vMyValues.begin() + range_end, last);
last += range_end - range_begin;
}
vMyValues.erase(last, vMyValues.end());
P.S. fixed a bug, thanks to Steve Jessop that patiently tried to show me it

Erase-remove multiple elements at given indices
Update: after the feedback on performance from @kory, I've modified the algorithm not to use flagging and to move/copy elements in chunks (not one-by-one).
Notes:
indices need to be sorted and unique
uses std::move (replace with std::copy for c++98):
Github
Live example
Code:
template <class ForwardIt, class SortUniqIndsFwdIt>
inline ForwardIt remove_at(
ForwardIt first,
ForwardIt last,
SortUniqIndsFwdIt ii_first,
SortUniqIndsFwdIt ii_last)
{
if(ii_first == ii_last) // no indices-to-remove are given
return last;
typedef typename std::iterator_traits<ForwardIt>::difference_type diff_t;
typedef typename std::iterator_traits<SortUniqIndsFwdIt>::value_type ind_t;
ForwardIt destination = first + static_cast<diff_t>(*ii_first);
while(ii_first != ii_last)
{
// advance to an index after a chunk of elements-to-keep
for(ind_t cur = *ii_first++; ii_first != ii_last; ++ii_first)
{
const ind_t nxt = *ii_first;
if(nxt - cur > 1)
break;
cur = nxt;
}
// move the chunk of elements-to-keep to new destination
const ForwardIt source_first =
first + static_cast<diff_t>(*(ii_first - 1)) + 1;
const ForwardIt source_last =
ii_first != ii_last ? first + static_cast<diff_t>(*ii_first) : last;
std::move(source_first, source_last, destination);
// std::copy(source_first, source_last, destination) // c++98 version
destination += source_last - source_first;
}
return destination;
}
Usage example:
std::vector<int> v = /*...*/; // vector to remove elements from
std::vector<int> ii = /*...*/; // indices of elements to be removed
// prepare indices
std::sort(ii.begin(), ii.end());
ii.erase(std::unique(ii.begin(), ii.end()), ii.end());
// remove elements at indices
v.erase(remove_at(v.begin(), v.end(), ii.begin(), ii.end()), v.end());

What you can do is split the vector (actually any non-associative container) in two
groups, one corresponding to the indices to be erased and one containing the rest.
template<typename Cont, typename It>
auto ToggleIndices(Cont &cont, It beg, It end) -> decltype(std::end(cont))
{
int helpIndx(0);
return std::stable_partition(std::begin(cont), std::end(cont),
[&](typename Cont::value_type const& val) -> bool {
return std::find(beg, end, helpIndx++) != end;
});
}
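You can then delete up to (or from) the split point to erase (or keep only)
the elements corresponding to the indices. The call below erases from the split point to the end, so it keeps only the elements at indices 0, 2 and 4: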
std::vector<int> v;
v.push_back(0);
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);
v.push_back(5);
int ar[] = { 2, 0, 4 };
v.erase(ToggleIndices(v, std::begin(ar), std::end(ar)), v.end());
If the 'keep only by index' operation is not needed, you can use remove_if instead of stable_partition (O(n) vs O(n log n) complexity).
To work for C arrays as containers the lambda function should be
[&](decltype(*(std::begin(cont))) const& val) -> bool
{ return std::find(beg, end, helpIndx++) != end; }
but then the .erase() method is no longer an option

If you want to ensure that every element is only moved once, you can simply iterate through each element, copy those that are to remain into a new, second container, do not copy the ones you wish to remove, and then delete the old container and replace it with the new one :)
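A minimal sketch of that copy-based idea, assuming the indices to remove are already sorted in ascending order (duplicates are tolerated). The name copy_without is made up for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a copy of `values` without the elements at
// the positions listed in `sorted_indices` (ascending; duplicates are fine).
std::vector<int> copy_without(const std::vector<int>& values,
                              const std::vector<std::size_t>& sorted_indices) {
    assert(std::is_sorted(sorted_indices.begin(), sorted_indices.end()));
    std::vector<int> kept;
    kept.reserve(values.size());
    auto skip = sorted_indices.begin();
    for (std::size_t i = 0; i < values.size(); ++i) {
        // Consume every index equal to the current position.
        bool remove = false;
        while (skip != sorted_indices.end() && *skip == i) {
            remove = true;
            ++skip;
        }
        if (!remove)
            kept.push_back(values[i]);    // each surviving element is copied once
    }
    return kept;
}
```

The result can then replace the old container, for example with vMyValues = copy_without(vMyValues, indices). This runs in O(n + k) time for n values and k indices, at the cost of a second vector.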

This is an algorithm based on Andriy Tylychko's answer, packaged so that it is easier and faster to use without having to pick the answer apart. It also removes the need to add -1 at the beginning of the indices list and the number of items at the end. It also includes some debugging code to make sure the indices are valid (sorted, and valid indexes into items).
template <typename Items_it, typename Indices_it>
auto remove_indices(
Items_it items_begin, Items_it items_end
, Indices_it indices_begin, Indices_it indices_end
)
{
static_assert(
std::is_same_v<std::random_access_iterator_tag
, typename std::iterator_traits<Items_it>::iterator_category>
, "Can't remove items this way unless Items_it is a random access iterator");
size_t indices_size = std::distance(indices_begin, indices_end);
size_t items_size = std::distance(items_begin, items_end);
if (indices_size == 0) {
// Nothing to erase
return items_end;
}
// Debug check to see if the indices are already sorted and are less than
// size of items.
assert(indices_begin[0] < items_size);
assert(std::is_sorted(indices_begin, indices_end));
auto last = items_begin;
auto shift = [&last, &items_begin](size_t range_begin, size_t range_end) {
std::copy(items_begin + range_begin, items_begin + range_end, last);
last += range_end - range_begin;
};
size_t last_index = -1;
for (size_t i = 0; i != indices_size; ++i) {
shift(last_index + 1, indices_begin[i]);
last_index = indices_begin[i];
}
shift(last_index + 1, items_size);
return last;
}
Here is an example of usage:
template <typename T>
std::ostream& operator<<(std::ostream& os, std::vector<T>& v)
{
for (auto i : v) {
os << i << " ";
}
os << std::endl;
return os;
}
int main()
{
using std::begin;
using std::end;
std::vector<int> items = { 1, 3, 6, 8, 13, 17 };
std::vector<int> indices = { 0, 1, 2, 3, 4 };
std::cout << items;
items.erase(
remove_indices(begin(items), end(items), begin(indices), end(indices))
, std::end(items)
);
std::cout << items;
return 0;
}
Output:
1 3 6 8 13 17
17
The headers required are:
#include <algorithm> // std::copy, std::is_sorted
#include <iterator>
#include <vector>
#include <iostream> // only needed for output
#include <cassert>
#include <type_traits>
And a Demo can be found on godbolt.org.
