C++ "select" algorithm


Among the functionality in <algorithm> I can't seem to find one of the most basic operations I can think of: selecting a subset of a collection (for example, return all the odd numbers, all the employees that have status == 'employed', all items that cost less than 20 dollars).
So, given a list of ints like
vector<int> ints {1, 9, 3, 27, 5, 19, 3, 8, 2, 12};
vector<int> evens = ?
vector<int> greaterThan7 = ?
How do I find those that are even and those that are greater than 7?

If you want something more functional, you can check out the Boost.Range library. Specifically, the filtered adaptor:
for (int i : ints | filtered([](int i){return i > 7;}))
{
...
}
This gives you a lazy view, without constructing a new container.
You can get the same from Eric Niebler's range-v3:
for (int i : view::filter(ints, [](int i){return i > 7;}))
{
...
}
with the benefit that you can also assign the result to a vector (so you can choose whether it is lazy or eager, which Boost.Range does not allow).
std::vector<int> greaterThan7 = view::filter(ints, [](int i){return i > 7;});
std::vector<int> sameThing = ints | view::filter([](int i){return i > 7;});

You can use std::copy_if. For example:
vector<int> ints {1, 9, 3, 27, 5, 19, 3, 8, 2, 12};
vector<int> evens;
std::copy_if( ints.begin(), ints.end(), std::back_inserter( evens ),
[]( int x ) { return x % 2 == 0; } );
Here is a demonstrative program
#include <iostream>
#include <algorithm>
#include <iterator>
#include <vector>
int main()
{
std::vector<int> ints { 1, 9, 3, 27, 5, 19, 3, 8, 2, 12 };
std::vector<int> evens;
std::copy_if( ints.begin(), ints.end(), std::back_inserter( evens ),
[]( int x ) { return x % 2 == 0; } );
for ( int x : evens ) std::cout << x << ' ';
std::cout << std::endl;
}
Its output is
8 2 12
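The question also asks for the elements greater than 7. The same std::copy_if call can be wrapped in a small reusable function. This is only a sketch: select_if is a made-up name, not a standard library function, and it assumes the source is a std::vector.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of the standard library):
// returns a copy of the elements of `source` for which `pred` returns true.
template <typename T, typename Pred>
std::vector<T> select_if(const std::vector<T>& source, Pred pred)
{
    std::vector<T> result;
    std::copy_if(source.begin(), source.end(), std::back_inserter(result), pred);
    return result;
}
```

With this, the two vectors from the question could be written as select_if(ints, [](int x) { return x % 2 == 0; }) and select_if(ints, [](int x) { return x > 7; }).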

Depending on what your exact requirements are, consider std::stable_partition (or std::partition). It reorders elements in the range such that all which satisfy a predicate come first. You can think of it as splitting the range into a "subset" and a "not subset" part. Here is an example:
#include <algorithm>
#include <vector>
#include <iostream>
int main()
{
using std::begin;
using std::end;
using std::cbegin;
using std::cend;
std::vector<int> ints { 1, 9, 3, 27, 5, 19, 3, 8, 2, 12 };
auto const greater_than_7 = [](int number) { return number > 7; };
auto const iter_first_not_greater_than_7 = std::stable_partition(begin(ints), end(ints), greater_than_7);
for (auto const_iter = cbegin(ints); const_iter != iter_first_not_greater_than_7; ++const_iter)
{
std::cout << *const_iter << "\n";
}
}
If, however, you are fine with copying each matching element to a new collection, for example because the source range must not be modified, then use std::copy_if.
Perhaps what you are really looking for is a view of an unmodifiable range. In this case, you are approaching the problem from the wrong direction. You don't need a particular algorithm; a more natural solution to the problem would be a filtering iterator, like for example Boost's Filter Iterator. You can use the one in Boost or study its implementation to learn how you could write filtering iterators yourself.
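To give an idea of what writing a filtering iterator yourself involves, here is a deliberately minimal sketch. It is not Boost's implementation: the names filter_iterator_sketch and make_filter are invented for illustration, and it supports only dereference, pre-increment and != (enough for a hand-written loop).

```cpp
#include <iterator>
#include <vector>

// Hypothetical minimal filtering iterator: wraps an iterator pair and a
// predicate, and skips every element for which the predicate is false.
template <typename Iter, typename Pred>
class filter_iterator_sketch
{
public:
    filter_iterator_sketch(Iter current, Iter last, Pred pred)
        : current_(current), last_(last), pred_(pred)
    {
        skip_non_matching(); // position on the first matching element
    }

    typename std::iterator_traits<Iter>::reference operator*() const
    {
        return *current_;
    }

    filter_iterator_sketch& operator++()
    {
        ++current_;
        skip_non_matching();
        return *this;
    }

    bool operator!=(const filter_iterator_sketch& other) const
    {
        return current_ != other.current_;
    }

private:
    // Advance until a matching element or the end of the range is reached.
    void skip_non_matching()
    {
        while (current_ != last_ && !pred_(*current_))
            ++current_;
    }

    Iter current_;
    Iter last_;
    Pred pred_;
};

// Hypothetical convenience function so the template arguments are deduced.
template <typename Iter, typename Pred>
filter_iterator_sketch<Iter, Pred> make_filter(Iter first, Iter last, Pred pred)
{
    return filter_iterator_sketch<Iter, Pred>(first, last, pred);
}
```

A loop would run from make_filter(ints.begin(), ints.end(), pred) to make_filter(ints.end(), ints.end(), pred); the source range is never copied or modified.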

Related

C++ sort table by column while preserving row contents

Given a row-major table of type std::vector<std::vector<T>> (where T is a less-comparable type like int or std::string), I'd like to sort the table by a specific column while preserving the row contents (i.e. a row can only be moved as a whole, not the individual cells).
For example, given this table:
2 8 1 4
3 7 6 7
3 3 4 9
8 6 3 4
7 1 5 7
Sorting by the 3rd column (index 2), the desired result would be:
2 8 1 4
8 6 3 4
3 3 4 9
7 1 5 7
3 7 6 7
What is the STL way of achieving this?
One solution I can think of is copying the column that should be sorted into an associative container (for example std::unordered_map<T, std::size_t> where the key is the cell value and the value is the row index), then sorting the map by key (using std::sort()), extracting the resulting row index order and using that to re-order the rows in the original table.
However, this solution seems non-elegant and rather verbose when writing it as actual code.
What are possible, "nice" solutions to implement this?
Note: The table type of std::vector<std::vector<T>> is a given and cannot be changed/modified.

Use a comparator that compares the elements in the chosen column.
std::vector<std::vector<T>> vec;
// add elements to vec
int idx = 2;
std::sort(vec.begin(), vec.end(), [idx](const std::vector<T>& a, const std::vector<T>& b) {
return a.at(idx) < b.at(idx);
});
Full working example:
#include <iostream>
#include <vector>
#include <algorithm>
typedef int T;
int main() {
std::vector<std::vector<T>> vec = {
{2, 8, 1, 4},
{3, 7, 6, 7},
{3, 3, 4, 9},
{8, 6, 3, 4},
{7, 1, 5, 7}
};
int idx = 2;
std::sort(vec.begin(), vec.end(), [idx](const std::vector<T>& a, const std::vector<T>& b) {
return a.at(idx) < b.at(idx);
});
for (size_t i = 0; i < vec.size(); i++) {
for (size_t j = 0; j < vec[i].size(); j++) {
std::cout << vec[i][j] << (j + 1 < vec[i].size() ? ' ' : '\n');
}
}
return 0;
}

You can do this by using a custom projection function for std::ranges::sort.
#include <algorithm>
#include <vector>
int main() {
std::vector<std::vector<int>> v{
{2, 8, 1, 4},
{3, 7, 6, 7},
{3, 3, 4, 9},
{8, 6, 3, 4},
{7, 1, 5, 7}
};
int col = 2;
std::ranges::sort(
v, {}, [&col](auto& x) { return x[col]; }
);
}

std::sort can optionally take a comparator argument.
comparison function object ... which returns true if the first argument is less than (i.e. is ordered before) the second.
The signature of the comparison function should be equivalent to the following:
bool cmp(const Type1 &a, const Type2 &b);
So if you've got a std::vector<std::vector<T>> called vec (and assuming the type T is orderable by the < operator) and we want to sort based on the third column, we can write
std::sort(
std::begin(vec),
std::end(vec),
[](const std::vector<T>& a, const std::vector<T>& b) { return a[2] < b[2]; }
);
If you want to change the column, simply change the 2. It can also be a variable provided that you capture that variable in the lambda.
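As a sketch of the variable-column case, the call can be wrapped in a function that captures the column index by value. The name sort_by_column is made up for this example, and it assumes every row has more than col cells (at() throws std::out_of_range otherwise).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts the rows of `table` by the cell in column `col`.
// Rows are moved as a whole; the cells inside a row are never rearranged.
template <typename T>
void sort_by_column(std::vector<std::vector<T>>& table, std::size_t col)
{
    std::sort(table.begin(), table.end(),
              [col](const std::vector<T>& a, const std::vector<T>& b) {
                  return a.at(col) < b.at(col); // bounds-checked access
              });
}
```

std::sort does not promise to keep rows with equal keys in their original relative order; if that matters, std::stable_sort takes the same arguments.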

C++20 way:
std::ranges::sort(v, std::less<> {}, [](const auto& vv) { return vv[2]; });

There is another way to sort the array, and that is to not sort the array.
Instead, sort an array of indices that point to the data item:
#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
int main() {
std::vector<std::vector<int>> vec = {
{2, 8, 1, 4},
{3, 7, 6, 7},
{3, 3, 4, 9},
{8, 6, 3, 4},
{7, 1, 5, 7}
};
// index starts out as 0,1,2,3,4 (one entry per row)
std::vector<int> index(vec.size());
std::iota(index.begin(), index.end(), 0);
// desired column
int idx = 2;
// sort index array based on vec's column value
std::sort(index.begin(), index.end(), [&](int n1, int n2)
{ return vec[n1][idx] < vec[n2][idx]; });
// Output results using the index array
for (size_t i = 0; i < vec.size(); ++i)
{
for (size_t j = 0; j < vec[index[i]].size(); ++j)
std::cout << vec[index[i]][j] << " ";
std::cout << "\n";
}
}
Output:
2 8 1 4
8 6 3 4
3 3 4 9
7 1 5 7
3 7 6 7
One advantage to this is that you're not swapping entire vectors around during the sort. Instead, a simple int is swapped.
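If the rows are eventually needed in sorted order as a table, the index array can be applied afterwards. The following is only a sketch; sorted_row_order and reorder_rows are names invented for this example, and the second function copies the rows into a new table.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the row indices of `table`, ordered by column `col`.
template <typename T>
std::vector<std::size_t> sorted_row_order(const std::vector<std::vector<T>>& table,
                                          std::size_t col)
{
    std::vector<std::size_t> index(table.size());
    std::iota(index.begin(), index.end(), std::size_t{0}); // 0, 1, 2, ...
    std::sort(index.begin(), index.end(),
              [&](std::size_t a, std::size_t b) { return table[a][col] < table[b][col]; });
    return index;
}

// Hypothetical helper: builds a new table whose rows follow `order`.
template <typename T>
std::vector<std::vector<T>> reorder_rows(const std::vector<std::vector<T>>& table,
                                         const std::vector<std::size_t>& order)
{
    std::vector<std::vector<T>> result;
    result.reserve(order.size());
    for (std::size_t row : order)
        result.push_back(table[row]); // copy the whole row
    return result;
}
```

The original table stays untouched, which can be useful when several views sorted by different columns are needed at once.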

Getting index of unique elements in a vector

Is there any STL/Boost function in C++ that allows me to find the indices of all unique elements in a vector?
I have seen many solutions to find unique elements, but I need their index.
vector<int> v = { 1,1,1, 2,2,2,2, 3, 3, ,4,5,5,5,5,5,5 };// already sorted
Either I need the first index of each unique element
vector<int> unique_index={0,3,7,9,10};
or I need the last index of each unique element
vector<int> unique_index={2,6,8,9,15};

A simple way (aside from just keeping track of what the last element was) is to use a std::set to test whether the current element is unique among the elements seen so far, and to populate your unique indexes as you go. This collects, in a single pass, the index at which each distinct value is first seen, e.g.
#include <iostream>
#include <vector>
#include <set>
int main (void) {
std::vector<int> v = { 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 5, 5 },
uniqueidx{};
std::set<int> s{};
for (size_t i = 0; i < v.size(); i++)
if (s.insert(v[i]).second)
uniqueidx.push_back(i);
for (const auto i : uniqueidx)
std::cout << i << '\n';
}
Example Use/Output
$ ./bin/set_index_of_unique_in_vector
0
3
7
10
11
(note: the last two values are 10 and 11, not 9 and 10 -- you are missing a value in your vector initialization, e.g. 3, ,4)
If you just wanted a simple-old loop to do the same thing, you could use:
#include <iostream>
#include <vector>
int main (void) {
std::vector<int> v = { 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 5, 5 },
uniqueidx{};
for (size_t i = 0; i < v.size(); i++)
if (!i || v[i-1] != v[i])
uniqueidx.push_back(i);
for (const auto i : uniqueidx)
std::cout << i << '\n';
}
(same output)
The benefit of the approach with std::set is that you leave the determination of uniqueness up to std::set; with the simple loop, it's up to you (and the loop relies on the vector already being sorted).
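The question also asks for the last index of each value. For a sorted vector, the simple loop can be mirrored by looking at the next element instead of the previous one. A sketch, with last_indices_of_runs as a made-up name:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: for a sorted vector, returns the index of the last
// element in each run of equal values.
std::vector<std::size_t> last_indices_of_runs(const std::vector<int>& v)
{
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (i + 1 == v.size() || v[i] != v[i + 1]) // last element, or value changes next
            result.push_back(i);
    return result;
}
```

Like the simple loop above, this relies on equal values being adjacent, so it is only meaningful for sorted input.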
Look things over and let me know if you have questions.

Similar to David's answer of using std::set, you could also use std::map with its member function try_emplace(key, value):
#include <iostream>
#include <vector>
#include <map>
int main (void) {
std::vector<int> v = { 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 5, 5 };
std::map<int, int> m;
for (size_t i = 0; i < v.size(); i++)
{
// `i` is only entered if the `m[v[i]]` isn't filled yet.
m.try_emplace(v[i], i);
}
for (auto [valueFromV, indexFromV] : m)
{
std::cout << indexFromV << '\n';
}
}

The following code finds the elements that occur exactly once, together with their indices; you can adapt it to use int instead of std::string.
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <iterator>
#include <algorithm>
std::map<std::string, int> get_unique_indices(const std::vector<std::string>& products) {
std::vector<std::string> tempProducts(products);
std::map<std::string, int> result;
std::sort(std::begin(tempProducts), std::end(tempProducts));
for (auto it1 = std::begin(tempProducts), it2 = it1; it1 != std::end(tempProducts); it1 = it2) {
int duplication = 0;
for (; it2 != std::end(tempProducts) && (*it2 == *it1); ++it2) {
duplication++;
}
if (duplication == 1) {
result.insert({ *it1, std::find(std::begin(products), std::end(products), *it1) - std::begin(products) });
}
}
return result;
}
int main()
{
using namespace std;
vector<string> products = { "apple", "orange", "lemon", "apple", "kivi", "orange", "kivi", "melon"};
auto result = get_unique_indices(products);
for (const auto& [key, value] : result) {
std::cout << key << " " << value << std::endl;
}
return 0;
}
First, I create another vector holding a copy of the original data (because sorting changes the positions of the items).
Then I use two iterators to find duplicate items and count them. Items that occur exactly once are saved in a map together with their index in the original vector.
This code requires C++17 or later (it uses structured bindings).

How to compare two vectors for equality?

I have the following program:
std::vector<int> nums = {1, 2, 3, 4, 5};
std::vector<int> nums2 = {5, 4, 3, 2, 1};
bool equal = std::equal(nums.begin(), nums.end(), nums2.begin());
if (equal)
{
cout << "Both vectors are equal" << endl;
}
There are two vectors that have equal elements. std::equal does not work here because it goes sequentially and compares corresponding elements. Is there a way to check that both these vectors are equal, and get true in my case, without sorting? In my real code I do not have integers but custom objects, which are compared for equality by pointer.

You can construct a std::unordered_set from each vector, then compare those, as shown in the code snippet below:
#include <iostream>
#include <vector>
#include <unordered_set>
using namespace std;
int main()
{
std::vector<int> nums = { 1, 2, 3, 4, 5 };
std::vector<int> nums2 = { 5, 4, 3, 2, 1 };
std::vector<int> nums3 = { 5, 4, 9, 2, 1 };
std::unordered_set<int> s1(nums.begin(), nums.end());
std::unordered_set<int> s2(nums2.begin(), nums2.end());
std::unordered_set<int> s3(nums3.begin(), nums3.end());
if (s1 == s2) {
std::cout << "1 and 2 are equal";
}
else {
std::cout << "1 and 2 are different";
}
std::cout << std::endl;
if (s1 == s3) {
std::cout << "1 and 3 are equal";
}
else {
std::cout << "1 and 3 are different";
}
std::cout << std::endl;
return 0;
}
However, there are some points to bear in mind:
For vectors of custom type objects, you would need to provide an operator== for that type (but that would have to be done anyway; otherwise, how could you say whether the two vectors have the same contents?).
Vectors containing duplicate entries will create sets that have those duplicates removed. Thus, {1, 2, 2, 3} will show as equal to {1, 2, 3}.
You will also need to provide a std::hash specialization for your custom type. For a trivial class, bob, which just wraps an integer, that hash, and the required operator==, could be defined as shown below; you can then replace the <int> specializations in the above example with <bob> and it will work. (This cppreference article explains more about the hash.)
class bob {
public:
int data;
bob(int arg) : data{ arg } { }
};
bool operator==(const bob& lhs, const bob& rhs)
{
return lhs.data == rhs.data;
}
template<> struct std::hash<bob> {
std::size_t operator()(bob const& b) const noexcept {
return static_cast<size_t>(b.data);
}
};

The std::is_permutation standard library algorithm does exactly what you want. It returns true if both ranges contain the same elements in any order.
It can be slow for large inputs (the worst case is O(n^2) comparisons), but it only requires equality comparison, and it doesn't require a temporary container or additional memory allocation.
#include <algorithm>
#include <iostream>
#include <vector>
int main()
{
std::vector<int> nums = { 1, 2, 3, 4, 5 };
std::vector<int> nums2 = { 5, 4, 3, 2, 1 };
bool equal = std::is_permutation(nums.begin(), nums.end(), nums2.begin(), nums2.end());
if (equal) {
std::cout << "Both vectors are equal" << std::endl;
}
}

You're looking for a set or a multiset. C++ standard library does have implementations of these data structures:
std::set
std::multiset
std::unordered_set
std::unordered_multiset
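As a sketch of how a multiset handles vectors with duplicates, the comparison could look like this. The function name same_elements is invented for the example, and the element type is assumed to have operator== and a std::hash specialization.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: true if `a` and `b` hold the same values with the
// same multiplicities, in any order.
template <typename T>
bool same_elements(const std::vector<T>& a, const std::vector<T>& b)
{
    if (a.size() != b.size())
        return false; // different lengths can never match
    std::unordered_multiset<T> ma(a.begin(), a.end());
    std::unordered_multiset<T> mb(b.begin(), b.end());
    return ma == mb; // compares contents, ignoring order
}
```

Unlike the std::unordered_set comparison, {1, 2, 2, 3} and {1, 2, 3} are reported as different here, because a multiset keeps repeated values.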

If I had to invent an algorithm to do this from scratch, because for whatever reason using sets or sorts doesn't work, I would consider removing matches. This will be inefficient - it's O(n^2) - but it will work, and inefficiency is unlikely to be an issue in many cases:
bool compare(A, B)
copy B to BB
for each a in A
if BB contains a
remove a from BB
else
return false
if BB is empty
return true
return false
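One possible C++ rendering of this pseudocode, as a sketch only: the name equal_ignoring_order is made up, and the copy of B is taken by passing the second vector by value. Only operator== is required of the element type.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical C++ version of the pseudocode above; runs in O(n^2) time.
template <typename T>
bool equal_ignoring_order(const std::vector<T>& a, std::vector<T> bb) // bb is the copy of B
{
    for (const T& x : a) {
        auto it = std::find(bb.begin(), bb.end(), x);
        if (it == bb.end())
            return false;   // x has no remaining match in B
        bb.erase(it);       // remove exactly one matching occurrence
    }
    return bb.empty();      // leftovers mean B had extra elements
}
```

Erasing only one occurrence per match is what makes duplicates count correctly.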

Here is one more way to solve your task, using std::sort. It also works correctly if the vectors have duplicate elements (as the example below shows). It is also quite fast: O(N*log(N)) time on average, which in real life will be close to the solutions with std::unordered_set or std::unordered_multiset. Note that it reorders both vectors, so sort copies if the original order matters.
Yes, I see that the OP asked for a solution without sorting. Note, however, that std::set sorts indirectly by keeping its elements in a tree, and even std::unordered_set orders elements in a sense, by hash value (hash bucketing is a kind of radix sorting). std::is_permutation avoids ordering altogether, but pays for it with a quadratic worst case. So this task probably can't be solved efficiently without some kind of indirect sorting or other ordering of the whole vector.
#include <algorithm>
#include <vector>
#include <iostream>
int main() {
std::vector<int> nums1 = {1, 2, 3, 4, 1, 5}, nums2 = {5, 4, 3, 1, 2, 1};
std::sort(nums1.begin(), nums1.end());
std::sort(nums2.begin(), nums2.end());
std::cout << std::boolalpha << (nums1 == nums2) << std::endl;
}
Output:
true

Kick out duplicate entries across vectors

I have vectors and I would like to retrieve one vector that contains all entries which aren't duplicated anywhere in all input vectors.
#include <vector>
int main() {
std::vector<int> a = {2, 1, 3};
std::vector<int> b = {99, 1, 3, 5, 4};
std::vector<int> c = {5, 6, 7, 1};
// magic to retrieve {2, 99, 4, 6, 7} (order doesn't matter)
}
Is there a library function that can help perform this task efficiently?
I'm not tied to using vectors. The solution could include lists, sets, or whatever are most appropriate for the task.

Using unordered_map, O(N) space complexity and O(N) time complexity:
#include <vector>
#include <unordered_map>
#include <iostream>
std::vector<int>
get_unique_values(std::initializer_list<std::vector<int>> vectors)
{
std::unordered_map<int, size_t> tmp;
auto insert_value_in_tmp = [&tmp](int v) {
auto i = tmp.find(v);
if (i == tmp.end())
tmp[v] = 1;
else if (i->second != 2)
i->second = 2;
};
for ( auto& vec : vectors) {
for ( auto vec_value : vec ) {
insert_value_in_tmp(vec_value);
}
}
std::vector<int> result;
for (auto v : tmp) {
if (v.second == 1)
result.push_back(v.first);
}
return result;
}
int main() {
std::vector<int> a = {2, 1, 3};
std::vector<int> b = {99, 3, 5, 4};
std::vector<int> c = {5, 6, 7};
std::vector<int> result = get_unique_values({a,b,c});
for (auto v : result) {
std::cout << v << " ";
}
std::cout << '\n';
return 0;
}

Efficiently sort subset of the vector that defines the order

I have a vector that defines the order of items (0..N-1), e.g.
{5, 0, 4, 3, 2, 1, 7, 6}.
I have to sort subsets of that vector. So, for {0, 1, 2, 5} I should get {5, 0, 2, 1}.
I tested the following solutions:
Create a set of items in a subset, then clear the subset, go through the ordering vector, adding only items in the set.
Create a new sorted vector by going through the ordering vector, adding only items found in the subset by std::lower_bound.
The second solution seems much faster, although it needs subset to be sorted. Are there any better solutions? I am using C++/STL/Qt, but the problem is probably not language-dependent.

Check this code:
#include <iostream>
#include <algorithm>
#include <vector>
struct cmp_subset
{
std::vector<int> vorder;
cmp_subset(const std::vector<int>& order)
{
vorder.resize(order.size());
for (std::size_t i = 0; i < order.size(); ++i)
vorder.at(order[i]) = static_cast<int>(i); // vorder[item] = position of item in order
}
bool operator()(int lhs, int rhs) const
{
return vorder[lhs] < vorder[rhs];
}
};
int main()
{
std::vector<int> order = {5, 0, 4, 3, 2, 1, 7, 6};
std::vector<int> subset = {0, 1, 2, 5};
for (auto x : subset)
std::cout << x << ' ';
std::cout << '\n';
std::sort(subset.begin(), subset.end(), cmp_subset(order));
for (auto x : subset)
std::cout << x << ' ';
std::cout << '\n';
return 0;
}
The code is copied from here
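For comparison, the first approach described in the question (mark the items of the subset, then walk the ordering vector) needs no sorting at all. A sketch, assuming the items are exactly 0..N-1 and every subset item occurs in the ordering vector; order_subset is a made-up name:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the items of `subset` in the order given by `order`.
std::vector<int> order_subset(const std::vector<int>& order, const std::vector<int>& subset)
{
    // in_subset[item] is true when item belongs to the subset.
    std::vector<bool> in_subset(order.size(), false);
    for (int item : subset)
        in_subset.at(static_cast<std::size_t>(item)) = true;

    std::vector<int> result;
    result.reserve(subset.size());
    for (int item : order) // walk the items in their defined order
        if (in_subset.at(static_cast<std::size_t>(item)))
            result.push_back(item);
    return result;
}
```

This takes time proportional to the length of the ordering vector, so the comparator-based std::sort above may still be the better choice when the subset is much smaller than N.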
