C++ sort table by column while preserving row contents - c++

Given a row-major table of type std::vector<std::vector<T>> (where T is a less-comparable type like int or std::string), I'd like to sort the table by a specific column while preserving the row contents (i.e. a row can only be moved as a whole, not the individual cells).
For example, given this table:
2 8 1 4
3 7 6 7
3 3 4 9
8 6 3 4
7 1 5 7
Sorting by the 3rd column (index 2), the desired result would be:
2 8 1 4
8 6 3 4
3 3 4 9
7 1 5 7
3 7 6 7
What is the STL way of achieving this?
One solution I can think of is copying the column that should be sorted into an associative container (for example std::unordered_map<T, std::size_t> where the key is the cell value and the value is the row index), then sorting the map by key (using std::sort()), extracting the resulting row index order and using that to re-order the rows in the original table.
However, this solution seems non-elegant and rather verbose when writing it as actual code.
What are possible, "nice" solutions to implement this?
Note: The table type of std::vector<std::vector<T>> is a given and cannot be changed/modified.

Use a comparator that compares the rows by the element in the chosen column.
std::vector<std::vector<T>> vec;
// add elements to vec
int idx = 2;
std::sort(vec.begin(), vec.end(), [idx](const std::vector<T>& a, const std::vector<T>& b) {
return a.at(idx) < b.at(idx);
});
Full working example:
#include <iostream>
#include <vector>
#include <algorithm>
typedef int T;
int main() {
std::vector<std::vector<T>> vec = {
{2, 8, 1, 4},
{3, 7, 6, 7},
{3, 3, 4, 9},
{8, 6, 3, 4},
{7, 1, 5, 7}
};
int idx = 2;
std::sort(vec.begin(), vec.end(), [idx](const std::vector<T>& a, const std::vector<T>& b) {
return a.at(idx) < b.at(idx);
});
for (size_t i = 0; i < vec.size(); i++) {
for (size_t j = 0; j < vec[i].size(); j++) {
std::cout << vec[i][j] << (j + 1 < vec[i].size() ? ' ' : '\n');
}
}
return 0;
}
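If rows with equal keys should keep their original relative order, std::stable_sort can be swapped in for std::sort. A minimal sketch, assuming a helper named sort_by_column (a hypothetical name, not part of the answer above) and assuming every row has more than col elements:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts the rows of a table by the values in column `col`,
// moving each row as a whole.
// std::stable_sort keeps rows with equal keys in their original relative order.
// Assumes every row has more than `col` elements; at() throws std::out_of_range otherwise.
template <typename T>
void sort_by_column(std::vector<std::vector<T>>& table, std::size_t col) {
    std::stable_sort(table.begin(), table.end(),
                     [col](const std::vector<T>& a, const std::vector<T>& b) {
                         return a.at(col) < b.at(col);
                     });
}
```

Calling sort_by_column(vec, 2) on the table from the question should give the desired row order.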

You can do this by using a custom projection function for std::ranges::sort.
#include <algorithm>
#include <vector>
int main() {
std::vector<std::vector<int>> v{
{2, 8, 1, 4},
{3, 7, 6, 7},
{3, 3, 4, 9},
{8, 6, 3, 4},
{7, 1, 5, 7}
};
int col = 2;
std::ranges::sort(
v, {}, [&col](auto& x) { return x[col]; }
);
}

std::sort can optionally take a comparator argument.
comparison function object ... which returns true if the first argument is less than (i.e. is ordered before) the second.
The signature of the comparison function should be equivalent to the following:
bool cmp(const Type1 &a, const Type2 &b);
So if you've got a std::vector<std::vector<T>> called vec (and assuming the type T is orderable by the < operator) and we want to sort based on the third column, we can write
std::sort(
std::begin(vec),
std::end(vec),
[](const std::vector<T>& a, const std::vector<T>& b) { return a[2] < b[2]; }
);
If you want to change the column, simply change the 2. It can also be a variable provided that you capture that variable in the lambda.
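To illustrate the captured-variable case, here is a small sketch. The factory name column_less is hypothetical; it simply returns a lambda that captures the column index by value (C++14 or later, because of the deduced return type):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a "less than" comparator for a column chosen at run time.
// The column index is captured by value, so the comparator stays valid after
// the variable it was created from goes out of scope.
// Assumes every row has more than `col` elements (operator[] does no bounds check).
template <typename T>
auto column_less(std::size_t col) {
    return [col](const std::vector<T>& a, const std::vector<T>& b) {
        return a[col] < b[col];
    };
}
```

With that in place, the call becomes std::sort(std::begin(vec), std::end(vec), column_less<T>(col)); for whatever col holds at run time.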

C++20 way:
std::ranges::sort(v, std::less<> {}, [](const auto& vv) { return vv[2]; });

There is another way to sort the array, and that is to not sort the array.
Instead, sort an array of indices that point to the data item:
#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
int main() {
std::vector<std::vector<int>> vec = {
{2, 8, 1, 4},
{3, 7, 6, 7},
{3, 3, 4, 9},
{8, 6, 3, 4},
{7, 1, 5, 7}
};
// index starts out as 0,1,2,3,4
std::vector<int> index(vec.size());
std::iota(index.begin(), index.end(), 0);
// desired column
int idx = 2;
// sort index array based on vec's column value
std::sort(index.begin(), index.end(), [&](int n1, int n2)
{ return vec[n1][idx] < vec[n2][idx]; });
// Output results using the index array
for (size_t i = 0; i < vec.size(); ++i)
{
for (size_t j = 0; j < vec[index[i]].size(); ++j)
std::cout << vec[index[i]][j] << " ";
std::cout << "\n";
}
}
Output:
2 8 1 4
8 6 3 4
3 3 4 9
7 1 5 7
3 7 6 7
One advantage to this is that you're not swapping entire vectors around during the sort. Instead, a simple int is swapped.
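If the table itself has to end up reordered, as the question asks, the sorted index array can be applied afterwards. A rough sketch, assuming two hypothetical helpers (sorted_row_order and apply_row_order are names made up for illustration) and assuming every row has more than col elements:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: returns the row order that sorts `table` by column `col`.
// The table itself is not modified.
template <typename T>
std::vector<std::size_t> sorted_row_order(const std::vector<std::vector<T>>& table,
                                          std::size_t col) {
    std::vector<std::size_t> index(table.size());
    std::iota(index.begin(), index.end(), std::size_t{0});  // 0, 1, 2, ...
    std::sort(index.begin(), index.end(), [&](std::size_t a, std::size_t b) {
        return table[a][col] < table[b][col];
    });
    return index;
}

// Hypothetical helper: rebuilds the table in the given row order.
// Each row is moved into the result, so no cell is copied.
// Assumes `order` is a permutation of 0..table.size()-1.
template <typename T>
void apply_row_order(std::vector<std::vector<T>>& table,
                     const std::vector<std::size_t>& order) {
    std::vector<std::vector<T>> result;
    result.reserve(table.size());
    for (std::size_t i : order) {
        result.push_back(std::move(table[i]));
    }
    table = std::move(result);
}
```

Keeping the two steps separate means the same index order can also be reused to reorder other containers that run parallel to the table.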

Related

Sort 2d C++ array by first element in subarray

I am trying to sort a c++ subarray by first element.
My code is currently set up like this:
int umbrellas[3][2] = {{5, 6}, {2, 7}, {9, 20}};
int n = sizeof(umbrellas) / sizeof(umbrellas[0]);
sort(umbrellas, umbrellas + n, greater<int>());
The sort function doesn't seem to be functioning properly and when I run the code it generates errors. Is there a way to sort the array from
{{5, 6}, {2, 7}, {9, 20}}
into
{{2, 7}, {5, 6}, {9, 20}}
?
Use a std::vector of std::vector as your container and the sort becomes much easier. std::vector is generally the preferred container in C++, and using the standard algorithms on it is simpler and more direct, without any substantial overhead.
Define your data as
std::vector<std::vector<int>> umbrellas{
{5, 6},
{2, 7},
{9, 20}
};
Now you can use a custom comparator lambda that takes two rows by const reference and returns true when the first element of the first row is smaller than the first element of the second row.
std::sort(umbrellas.begin(),
umbrellas.end(),
[](const std::vector<int> &above, const std::vector<int> &below)
{
return (above[0] < below[0]);
});
Printing the rows:
for (auto &&row : umbrellas) {
for (auto element : row) {
std::cout << element << " ";
}
std::cout << "\n";
}
gives:
2 7
5 6
9 20
Taking this to C++20 it's even easier:
std::ranges::sort(umbrellas, std::less(),
[](const auto &v) { return v[0];});
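If the dimensions are fixed at compile time, as in the question, std::array is another option. The original call fails because the rows of int[3][2] are raw arrays, which cannot be assigned, and because greater<int> compares single ints rather than rows. A sketch under those assumptions (Row and sort_by_first are hypothetical names):

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Assumption: each row has exactly two elements, as in the question.
using Row = std::array<int, 2>;

// Hypothetical helper: sorts fixed-size rows in ascending order of their first element.
// Unlike raw int[2] rows, std::array rows are assignable, so std::sort can move them.
template <std::size_t N>
void sort_by_first(std::array<Row, N>& rows) {
    std::sort(rows.begin(), rows.end(),
              [](const Row& a, const Row& b) { return a[0] < b[0]; });
}
```

Usage would look like std::array<Row, 3> umbrellas = {{ {5, 6}, {2, 7}, {9, 20} }}; followed by sort_by_first(umbrellas);.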
If time complexity doesn't matter, this bubble sort on the raw array achieves the desired result with O(n^2) complexity:
int arr[3][2] = {{5, 6}, {2, 7}, {9, 20}};
int n = sizeof(arr) / sizeof(arr[0]);
for(int i = 0 ; i < n - 1; i++){
for(int j = 0 ; j < n - 1 ; j++){
if(arr[j][0] > arr[j + 1][0])
swap(arr[j],arr[j + 1]);
}
}

How to replace the 2D vector corresponding to the values?

I am trying to replace the elements in a 2D vector (vector<vector<int>>). I want to change the elements not only by one value, but by a list, which means, for example, change 1,3,4,5,8,9 to 1,2,3,4,5,6 one-to-one correspondence. I have made a very slow code with double loops. Is there any way to speed up the process, with new function or sort the element? Because my 2D vector is very big, 3*300000 actually. My example code is below:
int myints[] = { 1,3,4,5,8,9 };
int myints2[] = { 1,2,3,4,5,6 };
std::vector<int> vals (myints, myints+6);
std::vector<int> vals2 (myints2, myints2+6);
vector<vector<int>> V0(3);
V0[0]={1,4,5};
V0[1]={3,1,8};
V0[2]={1,9,4};
for (size_t j = 0; j < V0.size(); j++)
{
for (int i = 0; i < vals.size(); i++)
replace(V0[j].begin(), V0[j].end(), vals[i], vals2[i]);
};
The ideal output V0 should be
1 3 4
2 1 5
1 6 3
You can use an unordered_map to replace each value directly, instead of searching through the whole vector for each replacement:
#include <vector>
#include <unordered_map>
#include <algorithm>
#include <iostream>
using namespace std;
int main()
{
unordered_map<int, int> replacements{{1, 1}, {3, 2}, {4, 3}, {5, 4}, {8, 5}, {9, 6}};
vector<vector<int>> v0(3);
v0[0] = {1, 4, 5};
v0[1] = {3, 1, 8};
v0[2] = {1, 9, 4};
for_each(v0.begin(), v0.end(), [&](vector<int>& v)
{
transform(v.begin(), v.end(), v.begin(), [&](int val)
{
auto it = replacements.find(val);
return it != replacements.end() ? it->second : val;
});
});
// Print
for (auto& v : v0)
{
cout << "[ ";
for (auto val : v)
{
cout << val << ", ";
}
cout << "]" << endl;
}
return 0;
}
Output:
[ 1, 3, 4, ]
[ 2, 1, 5, ]
[ 1, 6, 3, ]
In C++17, you may also choose a parallel execution policy in for_each and/or transform, since all the changes can be done in parallel.
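If the keys are small non-negative integers, as in the question, a plain vector indexed by the value may be an alternative to the hash map. This is only a sketch under that assumption; remap_values is a hypothetical name, and the table needs one slot per value up to the largest key:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: replaces from[i] with to[i] everywhere in `rows`.
// Assumes all keys in `from` are non-negative and `from` and `to` have equal length.
// Values that are not listed in `from` are left unchanged.
void remap_values(std::vector<std::vector<int>>& rows,
                  const std::vector<int>& from, const std::vector<int>& to) {
    int max_key = -1;
    for (int key : from) max_key = std::max(max_key, key);

    // Start with the identity mapping: table[v] == v means "leave v unchanged".
    std::vector<int> table(static_cast<std::size_t>(max_key + 1));
    std::iota(table.begin(), table.end(), 0);
    for (std::size_t i = 0; i < from.size(); ++i) {
        table[static_cast<std::size_t>(from[i])] = to[i];
    }

    // One lookup per cell, with no searching.
    for (auto& row : rows) {
        for (int& v : row) {
            if (v >= 0 && static_cast<std::size_t>(v) < table.size()) {
                v = table[static_cast<std::size_t>(v)];
            }
        }
    }
}
```

Each cell is looked up in the original mapping exactly once, so a replaced value is never replaced a second time.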

Index of vector containing the global minimum

Given a vector of vectors, is there an optimal way to determine the index of the vector which holds the global minimum?
What is the complexity in Big-O notation?
#include <algorithm>
#include <iostream>
#include <vector>
unsigned getMinimumIndex(std::vector<std::vector<unsigned>> const& a) {
if (!a.size())
return 0;
unsigned ret = 0; unsigned temp; unsigned global = 1 << 31;
for (std::size_t idx = 0; idx < a.size(); ++idx) {
if ((temp = *std::min_element(std::begin(a[idx]), std::end(a[idx]))) < global) {
global = temp;
ret = idx;
}
}
return ret;
}
int main() {
std::vector<std::vector<unsigned>> a = {{2, 4, 6, 8}, {3, 9, 5, 7},
{3, 4, 4, 3}, {2, 8, 3, 2},
{4, 4, 4, 0}, {1, 2, 3, 4}};
std::cout << getMinimumIndex(a); // the vector at index 4 possesses the value '0'
return 0;
}
Since neither the outer vector nor the numbers inside each inner vector are sorted, you have to check every number to find the smallest value.
Thus the complexity is O(n), where n is the total number of elements across all vectors.
You can either use iterators like you did or simply use two for loops and access the elements with a[i][j] (which might be slightly faster, although optimizing compilers usually generate equivalent code for both).
Also, since the values are unsigned, you can stop as soon as you find a 0.
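A possible variant that puts these points together. min_row_index is a hypothetical name; unlike the code in the question, it skips empty inner vectors (dereferencing the result of std::min_element on an empty range is undefined behaviour) and returns a.size() when there is no element at all:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical variant of getMinimumIndex: returns the index of the inner vector
// holding the smallest value, or a.size() if no inner vector has any element.
// On ties, the first inner vector containing the minimum wins.
std::size_t min_row_index(const std::vector<std::vector<unsigned>>& a) {
    std::size_t best_row = a.size();  // "not found" marker
    unsigned best = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i].empty()) continue;   // min_element on an empty range returns end()
        const unsigned m = *std::min_element(a[i].begin(), a[i].end());
        if (best_row == a.size() || m < best) {
            best = m;
            best_row = i;
        }
        if (best == 0) break;         // an unsigned value cannot be smaller than 0
    }
    return best_row;
}
```

The work is still linear in the total number of elements; the early exit only helps when a 0 shows up before the end.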

C++ "select" algorithm

Among the functionality found in <algorithm> I can't seem to find one of the most basic operations I can think of: selecting a subset of a collection (for example, return all the odd numbers, all the employees that have status == 'employed', all items that cost less than 20 dollars).
So, given a list of ints like
vector<int> ints {1, 9, 3, 27, 5, 19, 3, 8, 2, 12};
vector<int> evens = ?
vector<int> greaterThan7 = ?
How to find those that are even and those that are greater than 7?
If you want something more functional, you can check out the boost range library. Specifically, filtered:
for (int i : ints | filtered([](int i){return i > 7;}))
{
...
}
This gives you a lazy view, without constructing a new container.
You can get the same from Eric Niebler's range-v3:
for (int i : view::filter(ints, [](int i){return i > 7;}))
{
...
}
with the benefit that you can just assign that to a vector too (so you can choose if it's lazy or eager, which Boost.Ranges does not allow).
std::vector<int> greaterThan7 = view::filter(ints, [](int i){return i > 7;});
std::vector<int> sameThing = ints | view::filter([](int i){return i > 7;});
For example
vector<int> ints {1, 9, 3, 27, 5, 19, 3, 8, 2, 12};
vector<int> evens;
std::copy_if( ints.begin(), ints.end(), std::back_inserter( evens ),
[]( int x ) { return x % 2 == 0; } );
Here is a demonstrative program
#include <iostream>
#include <algorithm>
#include <iterator>
#include <vector>
int main()
{
std::vector<int> ints { 1, 9, 3, 27, 5, 19, 3, 8, 2, 12 };
std::vector<int> evens;
std::copy_if( ints.begin(), ints.end(), std::back_inserter( evens ),
[]( int x ) { return x % 2 == 0; } );
for ( int x : evens ) std::cout << x << ' ';
std::cout << std::endl;
}
Its output is
8 2 12
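The same std::copy_if call can be wrapped in a small reusable function, which gets close to the "select" the question asks for. A minimal sketch; select_if is a hypothetical name, not a standard algorithm:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns a new vector holding the elements of `input`
// that satisfy `pred`, in their original order. The input is not modified.
template <typename T, typename Pred>
std::vector<T> select_if(const std::vector<T>& input, Pred pred) {
    std::vector<T> result;
    std::copy_if(input.begin(), input.end(), std::back_inserter(result), pred);
    return result;
}
```

With it, the two lines from the question could read vector<int> evens = select_if(ints, [](int x) { return x % 2 == 0; }); and vector<int> greaterThan7 = select_if(ints, [](int x) { return x > 7; });.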
Depending on what your exact requirements are, consider std::stable_partition (or std::partition). It reorders elements in the range such that all which satisfy a predicate come first. You can think of it as splitting the range into a "subset" and a "not subset" part. Here is an example:
#include <algorithm>
#include <vector>
#include <iostream>
int main()
{
using std::begin;
using std::end;
using std::cbegin;
using std::cend;
std::vector<int> ints { 1, 9, 3, 27, 5, 19, 3, 8, 2, 12 };
auto const greater_than_7 = [](int number) { return number > 7; };
auto const iter_first_not_greater_than_7 = std::stable_partition(begin(ints), end(ints), greater_than_7);
for (auto const_iter = cbegin(ints); const_iter != iter_first_not_greater_than_7; ++const_iter)
{
std::cout << *const_iter << "\n";
}
}
If, however, you are fine with copying each matching element to a new collection, for example because the source range must not be modified, then use std::copy_if.
Perhaps what you are really looking for is a view of an unmodifiable range. In this case, you are approaching the problem from the wrong direction. You don't need a particular algorithm; a more natural solution to the problem would be a filtering iterator, like for example Boost's Filter Iterator. You can use the one in Boost or study its implementation to learn how you could write filtering iterators yourself.

Efficiently sort subset of the vector that defines the order

I have the vector that defines the order of items (0..N-1), e.g.
{5, 0, 4, 3, 2, 1, 7, 6}.
I have to sort subsets of that vector. So, for {0, 1, 2, 5} I should get {5, 0, 2, 1}.
I tested the following solutions:
Create a set of items in a subset, then clear the subset, go through the ordering vector, adding only items in the set.
Create a new sorted vector by going through the ordering vector, adding only items found in the subset by std::lower_bound.
The second solution seems much faster, although it needs subset to be sorted. Are there any better solutions? I am using C++/STL/Qt, but the problem is probably not language-dependent.
Check this code:
#include <iostream>
#include <algorithm>
#include <vector>
struct cmp_subset
{
std::vector<int> vorder;
cmp_subset(const std::vector<int>& order)
{
vorder.resize(order.size());
for (std::size_t i = 0; i < order.size(); ++i)
vorder.at(order[i]) = i;
}
bool operator()(int lhs, int rhs) const
{
return vorder[lhs] < vorder[rhs];
}
};
int main()
{
std::vector<int> order = {5, 0, 4, 3, 2, 1, 7, 6};
std::vector<int> subset = {0, 1, 2, 5};
for (auto x : subset)
std::cout << x << ' ';
std::cout << '\n';
std::sort(subset.begin(), subset.end(), cmp_subset(order));
for (auto x : subset)
std::cout << x << ' ';
std::cout << '\n';
return 0;
}
The code is copied from here
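For comparison, the first approach from the question (walking the ordering vector and keeping only the wanted items) can be written with a flag array instead of a set. A sketch, assuming order is a permutation of 0..N-1 and every subset item lies in that range; order_subset is a hypothetical name:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the items of `subset` arranged as they appear in `order`.
// Assumes `order` is a permutation of 0..N-1 and every subset item is in that range.
// Runs in O(N + k) for N ordered items and k subset items; the subset need not be sorted.
std::vector<int> order_subset(const std::vector<int>& order,
                              const std::vector<int>& subset) {
    // wanted[item] is true when `item` belongs to the subset.
    std::vector<bool> wanted(order.size(), false);
    for (int item : subset) {
        wanted[static_cast<std::size_t>(item)] = true;
    }

    std::vector<int> result;
    result.reserve(subset.size());
    for (int item : order) {
        if (wanted[static_cast<std::size_t>(item)]) {
            result.push_back(item);
        }
    }
    return result;
}
```

Whether this beats sorting the subset with a comparator probably depends on how large the subset is relative to N: the walk always touches all N items, while the sort only touches the k subset items once the position table is built.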