I am working on a problem in C++ that involves calculations on arrays of objects sorted along different attributes. Suppose I have an array of 5 objects with 3 different attributes (A, B, and C), and I sort the array according to each attribute individually. This gives me three arrays that tell me for each object where its position would be if I were to sort them along the corresponding attribute.
std::vector<int> A_Order = { 1, 3, 0, 2, 4 };
std::vector<int> B_Order = { 2, 4, 3, 1, 0 };
std::vector<int> C_Order = { 0, 2, 4, 3, 1 };
Now I want to split the objects into 2 subsets, always picking a position N in one of the orderings, where all objects at position x <= N go into the first subset, and all x > N go into the next subset. If I do this on attribute A at N = 2, I get the following:
std::vector<int> A_OrderSub0 = { 1, 0, 2 };
std::vector<int> B_OrderSub0 = { 2, 3, 1 };
std::vector<int> C_OrderSub0 = { 0, 4, 3 };
std::vector<int> A_OrderSub1 = { 3, 4 };
std::vector<int> B_OrderSub1 = { 4, 0 };
std::vector<int> C_OrderSub1 = { 2, 1 };
To perform the next iteration of calculations efficiently, I once again need the subsets to be ordered along the attributes and get the resulting positions (in the example above, I need B_OrderSub1 = { 4, 0 } to become B_OrderSub1 = { 1, 0 }). What is the most efficient way for me to reuse the "global positions", which I only have to get once at the start, to then get the "locally ordered positions" for the objects each time I split them into subsets?
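One possible way to get the locally ordered positions, sketched under the assumption that each subset is available as a vector of global positions (as in B_OrderSub1 above): sort the subset's slots by global position and assign ranks in that order. The helper name localRanks is hypothetical, and this is not claimed to be the most efficient approach.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: converts the global positions of a subset's objects
// into local ranks 0..k-1 that preserve their relative order.
std::vector<int> localRanks(const std::vector<int>& globalPos) {
    // byPos holds the subset slots, to be ordered by global position.
    std::vector<int> byPos(globalPos.size());
    std::iota(byPos.begin(), byPos.end(), 0);
    std::sort(byPos.begin(), byPos.end(),
              [&](int a, int b) { return globalPos[a] < globalPos[b]; });

    // The slot that comes rank-th in that order receives local rank `rank`.
    std::vector<int> local(globalPos.size());
    for (int rank = 0; rank < static_cast<int>(byPos.size()); ++rank)
        local[byPos[rank]] = rank;
    return local;
}
```

This costs O(k log k) for a subset of k objects. If the subsets are large relative to the full array, a counting pass over the global position range may be cheaper, but that is untested here.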
I have my large sparse symmetric matrix stored as Compressed Sparse Row (CSR) using Intel MKL. For the sake of example, let's assume my symmetric sparse matrix is a 5x5:
A =
1 -1 0 -3 0
-1 5 0 0 0
0 0 4 6 4
-3 0 6 7 0
0 0 4 0 -5
values = {1, -1, -3, 5, 4, 6, 4, 7, -5}; // symmetric sparse matrix
columns = {0, 1, 3, 1, 2, 3, 4, 3, 4}; // zero-based
rowIndex = {0, 3, 4, 7, 8, 9}; // zero-based
I am trying to find a submatrix of A given the rows and columns, e.g., A(1:3, 2:4):
A(1:3,2:4) =
0 0 0
4 6 4
6 7 0
values = {4, 6, 4, 6, 7}; // General sparse matrix (sub-matrix is not necessarily symmetric)
columns = {0, 1, 2, 0, 1}; // zero-based
rowIndex = {0, 0, 3, 5}; // zero-based
I would be grateful to know how matrix indexing can be done. One way I can think of is to convert CSR to coordinate format (COO), apply the indexing, and then convert back to CSR, but I don't think that is efficient.
Could someone let me know of an efficient or a common way of sparse matrix-indexing?
The trick is to look up values in the lower triangle by the output column (which is their row). You can keep an index into the data for each row, since you visit the entries in column order as you progress in row order for the output.
With the expositional type
struct CSR { // sometimes implicitly symmetric
std::vector<...> vals;
std::vector<int> cols,rowStart;
};
we have
// Return the [r0,r1) by [c0,c1) submatrix, never
// using any symmetry it might have.
CSR submatrix(const CSR &sym,int r0,int r1,int c0,int c1) {
const int m=r1-r0,n=c1-c0;
std::vector<int> finger(sym.rowStart.begin()+c0,sym.rowStart.begin()+c1);
CSR ret;
ret.rowStart.reserve(m+1);
ret.rowStart.push_back(0);
for(int r=0,rs=r0;r<m;++r,++rs) {
// (Strictly) lower triangle:
for(int cs=c0,c=0;cs<rs&&cs<c1;++cs,++c) // stay within the requested columns (finger has n entries)
for(int &f=finger[c],f1=sym.rowStart[cs+1];f<f1;++f) {
const int cf=sym.cols[f];
if(cf>rs) break;
if(cf==rs) {
ret.vals.push_back(sym.vals[f]);
ret.cols.push_back(c);
}
}
// Copy the relevant subsequence of the upper triangle:
for(int f=sym.rowStart[rs],f1=sym.rowStart[rs+1];f<f1;++f) {
const int c=sym.cols[f]-c0;
if(c<0) continue;
if(c>=n) break;
ret.vals.push_back(sym.vals[f]);
ret.cols.push_back(c);
}
ret.rowStart.push_back(ret.vals.size());
}
return ret;
}
For large matrices, the upper triangle loop could be optimized by using a binary search to find the relevant range of f.
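A minimal sketch of that binary search, assuming the column indices within each row are sorted in ascending order (as they are in the example). The helper name columnRange is hypothetical; it only locates the range of f and leaves the copying to the caller.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: returns the half-open range [first,last) of positions
// in `cols` that belong to `row` and whose column lies in [c0,c1).
// Assumes the columns of each row are stored in ascending order.
std::pair<int, int> columnRange(const std::vector<int>& cols,
                                const std::vector<int>& rowStart,
                                int row, int c0, int c1) {
    const auto rowBegin = cols.begin() + rowStart[row];
    const auto rowEnd = cols.begin() + rowStart[row + 1];
    // First stored entry with column >= c0.
    const auto lo = std::lower_bound(rowBegin, rowEnd, c0);
    // First stored entry with column >= c1, searched from lo onward.
    const auto hi = std::lower_bound(lo, rowEnd, c1);
    return {static_cast<int>(lo - cols.begin()),
            static_cast<int>(hi - cols.begin())};
}
```

The upper triangle loop could then run f over the returned range and drop its continue/break tests. Whether this is faster than the linear scan depends on how many entries each row holds.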
I have a large size unsorted array, each element contains a unique integer number,
std::vector<size_t> Vec= {1, 5, 3, 7, 18...}
I need to shuffle the vector as follows: given a specific number, look it up and swap it with the number at a new desired position. This swap needs to be done many times.
Currently I use another vector, PositionLookup, to remember and update the positions after every swap. Is there a more efficient way or data structure for this?
Current solution,
//look for a specific number "key" and swap it with the number in desired position "position_new"
void shuffle(size_t key, size_t position_new)
{
size_t temp = Vec[position_new]; // main vector
size_t position_old = PositionLookup[key]; // auxiliary vector
Vec[position_old] = temp;
PositionLookup[temp] = position_old;
Vec[position_new] = key;
PositionLookup[key] = position_new;
}
A couple microoptimizations to start with: If the vector has a fixed size, you could use a std::array or a plain C array instead of a std::vector. You can also use the most compact integer type that can hold all the values in the vector (e.g. std::int8_t/signed char for values in the interval [-128,127], std::uint16_t/unsigned short for values in the interval [0,65535], etc.)
The bigger optimization opportunity: Since the values themselves never change, only their indexes, you only need to keep track of the indexes.
Suppose for simplicity's sake the values are 0 through 4. In that case we can have an array
std::array<std::int8_t, 5> indices{{2, 3, 1, 4, 0}};
Each entry gives the position, in an imaginary array, of the value equal to its own index; here the imaginary array is 4, 2, 0, 1, 3. In other words, indices[0] is 2, which is the position of 0 in the imaginary array.
Then to swap the positions of 0 and 1 you only need to do
std::swap(indices[0], indices[1]);
Which makes the indices array 3, 2, 1, 4, 0 and the imaginary array 4, 2, 1, 0, 3.
Of course the imaginary array's values might not be the same as its indices.
If the (sorted) values are something like -2, -1, 0, 1, 2 you could obtain the index from the value by adding 2, or if they're 0, 3, 6, 9, 12 you could divide by 3, or if they're -5, -3, -1, 1, 3 you could add 5 then divide by 2, etc. The inverse operation gives the value back from the index.
If the values don't follow a defined pattern, you can create a second array to look up the value that goes with an index.
std::array<std::int8_t, 5> indices{{2, 3, 1, 4, 0}};
constexpr std::array<std::int8_t, 5> Values{{1, 3, 5, 7, 18}};
// Imaginary array before: 18, 5, 1, 3, 7
std::swap(indices[0], indices[1]);
// Imaginary array after: 18, 5, 3, 1, 7
const auto index_to_value = [&](decltype(indices)::value_type idx) noexcept {
return Values[idx];
};
const auto value_to_index = [&](decltype(Values)::value_type val) noexcept {
return std::lower_bound(Values.begin(), Values.end(), val)
- Values.begin();
};
It's the same thing if the values aren't known until runtime, just obviously the values lookup table can't be const or constexpr.
std::array<std::int8_t, 5> indices{{2, 3, 1, 4, 0}};
std::array<std::int8_t, 5> values; // Not known yet at compile-time
// ... set `values` at runtime to e.g. -93, -77, -64, 8, 56
// Imaginary array before: 56, -64, -93, -77, 8
std::swap(indices[0], indices[1]);
// Imaginary array after: 56, -64, -77, -93, 8
const auto index_to_value = [&](decltype(indices)::value_type idx) noexcept {
return values[idx];
};
const auto value_to_index = [&](decltype(values)::value_type val) noexcept {
return std::lower_bound(values.cbegin(), values.cend(), val)
- values.cbegin();
};
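To check the bookkeeping, the imaginary array can be rebuilt from the two tables. This is only a sketch for verification; the function name materialize is hypothetical, and it assumes indices[v] holds the position of the v-th entry of the values table, as in the snippets above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: rebuilds the "imaginary" array from the position table.
// indices[v] is the position of values[v] in the imaginary array.
template <std::size_t N>
std::array<std::int8_t, N> materialize(const std::array<std::int8_t, N>& indices,
                                       const std::array<std::int8_t, N>& values) {
    std::array<std::int8_t, N> out{};
    for (std::size_t v = 0; v < N; ++v)
        out[static_cast<std::size_t>(indices[v])] = values[v];
    return out;
}
```

With indices 2, 3, 1, 4, 0 and values 1, 3, 5, 7, 18 this yields 18, 5, 1, 3, 7, matching the "before" comment above. Rebuilding is O(N), so it is meant for debugging rather than for use after every swap.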
I've got a C-style array called board that contains some char's. I'm trying to create a std::array or std::vector (either would be fine, although std::array would be preferable) to store all the indices of board that are a certain value (in my case, 0).
This code I wrote is functional and works well:
std::vector<int> zeroes;
zeroes.reserve(16);
//board has 16 elements, so zeroes.size() will never be larger than 16.
//I used this reserve for speedup - the compiler doesn't require it.
for (int i = 0; i < 16; ++i)
{
if (board[i] == 0)
{
zeroes.push_back(i);
}
}
However, from past experience, whenever a std function exists that could replace part of my code, it is terser and hence stylistically preferred and also faster. My function seems like a fairly basic operation - I know there is a standard function* to access the index of an array that contains a value when that value only occurs once** in the array. So, is there a standard function to create an array of the indices that contain a value, assuming that more than one such index exists?
* Technically, two nested function calls: int x = std::distance(board, std::find(board, board + 16, 0));. See the accepted answer here.
** Well, it still works if more than one index with the desired value is present, but it returns only the first such index, which isn't very useful in my context.
Edit:
As one of the answers misunderstood the question, I'll clarify what I'm seeking. Let's say we have:
char board[16] = {0, 2, 0, 4,
2, 4, 8, 2,
0, 0, 8, 4,
2, 0, 0, 2};
Now, the indices which I'm looking for are {0, 2, 8, 9, 13, 14} because board[0] = 0, board[2] = 0, board[8] = 0, etc. and these are the only numbers which satisfy that property.
Here's a solution using std::iota and std::remove_if:
#include <algorithm>
#include <iostream>
#include <numeric> // std::iota
#include <vector>
int main () {
const std::size_t board_size = 16;
char board [board_size] = {
0, 2, 0, 4,
2, 4, 8, 2,
0, 0, 8, 4,
2, 0, 0, 2
};
// Initialize a zero-filled vector of the appropriate size.
std::vector<int> zeroes(board_size);
// Fill the vector with index values (0 through board_size - 1).
std::iota(zeroes.begin(), zeroes.end(), 0);
// Remove the index values that do not correspond to zero elements in the board.
zeroes.erase(std::remove_if(zeroes.begin(), zeroes.end(), [&board] (int i) {
return board[i] != 0;
}), zeroes.end());
// Output the resulting contents of the vector.
for (int i : zeroes) {
std::cout << i << std::endl;
}
}
Output of the program (demo):
0
2
8
9
13
14
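For comparison, a single-pass variant that avoids filling the vector and then erasing from it. It is essentially the loop from the question wrapped in a reusable function; the name indicesOf is hypothetical and it is not a standard library facility.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: collects every index i with arr[i] == value,
// in ascending order, using one pass over a C-style array.
template <typename T, std::size_t N>
std::vector<int> indicesOf(const T (&arr)[N], const T& value) {
    std::vector<int> result;
    result.reserve(N);  // at most N matches, so no reallocation occurs
    for (std::size_t i = 0; i < N; ++i) {
        if (arr[i] == value) {
            result.push_back(static_cast<int>(i));
        }
    }
    return result;
}
```

For the board above, indicesOf(board, char(0)) would return the same six indices as the program output. No speed advantage over either version is claimed for a 16-element array.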
Consider the case where “rowPtr”, “colInd” and “values” in a struct are dynamically allocated with the same number of elements. In this scenario, what is the fastest way (without copying, if possible) to sort the elements of colInd so that the corresponding elements of rowPtr and values change positions along with them?
struct csr
{
int rows;
int cols;
int nzmax;
int *rowPtr;
int *colInd;
double *values;
};
// A simple example without a struct. Just based on arrays
double values[10] = {0.2135, 0.8648, 7, 0.3446, 0.1429, 6, 0.02311, 0.3599, 0.0866, 8 };
int rowPtr[10] = { 0, 3, 6, 10, 2, -1, 24, -4, 1, 11 };
int colInd[10] = { 0, 2, 4, 1, 2, 3, 0, 1, 2, 4 };
// sort colInd and simultaneously change positions in rowPtr and values
// After sorting (values rounded):
// values = {0.214, 0.023, 0.345, 0.360, 0.865, 0.143, 0.087, 6.0, 7.0, 8.0};
// rowPtr = {0, 24, 10, -4, 3, 2, 1, -1, 6, 11};
// colInd = {0, 0, 1, 1, 2, 2, 2, 3, 4, 4};
I suggest putting the three arrays into an array of struct and sorting the array of struct.
struct csr_data
{
int rowPtr;
int colInd;
double value;
};
and
struct csr
{
int rows;
int cols;
int nzmax;
csr_data* data_array;
};
You can sort an array of csr_data using any of the three member variables. When they are sorted, all elements of csr_data will be rearranged regardless of which member you use to sort the data by.
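A minimal sketch of sorting such an array by colInd, assuming the caller knows the element count. The function name sortByColumn is hypothetical. std::stable_sort is used here so that entries with equal columns keep their original relative order.

```cpp
#include <algorithm>

struct csr_data {
    int rowPtr;
    int colInd;
    double value;
};

// Hypothetical helper: sorts `count` entries by column index.
// Each struct moves as a unit, so rowPtr and value follow colInd.
void sortByColumn(csr_data* data, int count) {
    std::stable_sort(data, data + count,
                     [](const csr_data& a, const csr_data& b) {
                         return a.colInd < b.colInd;
                     });
}
```

std::stable_sort may allocate a temporary buffer. If the order among equal columns does not matter, std::sort with the same comparator works in place.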
I'm beginning programming, so sorry for my lack of knowledge.
How can I arrange the elements of a vector in a specific order? I would like to reorder them so that no two equal elements are next to each other.
For example vector contains:
{1, 2, 2, 2, 3, 3, 4, 4, 4}
and I'd like it to be like:
{1, 2, 4, 3, 4, 2, 3, 2, 4}
Thanks for help.
edit:
Hello again, I found a solution, but it is not the best one. Maybe you can take a look and correct it?
map<unsigned,unsigned> Map;
for(vector<unsigned>::iterator i=V.begin();i!=V.end();++i)
{
map<unsigned,unsigned>::iterator f=Map.find(*i);
if(f==Map.end()) Map[*i]=1;
else ++f->second;
}
for(bool more=true;more;)
{
more=false;
for(map<unsigned,unsigned>::iterator i=Map.begin();i!=Map.end();++i)
{
if(i->second)
{
--i->second;
cout<<i->first<<", ";
more=true;
}
}
}
Now, for { 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4 } it gives me { 1, 2, 3, 4, 2, 3, 4, 2, 4, 4, 4, 4 } instead of e.g { 4, 1, 4, 2, 4, 3, 4, 2, 4, 3, 4, 2 }. How can it be done? Thanks
credits: _13th_Dragon
1. Count the occurrences of each value.
2. Starting with the most-frequent value, alternate it with less-frequent values.
In order to achieve (1), one can simply use std::map<V, unsigned>. However, for the second, one needs an ordered set of std::pair<V, unsigned int>, ordered by the second value. Since we want to keep track of how many times we need to use a given value, the second value cannot be constant. Also, we don't want to change the order if we happen to decrease the count of a given value much. All in all we get
#include <iostream>
#include <vector>
#include <algorithm>
#include <map>
// In the pair, first is the original value, while
// second is the number occurrences of that value.
typedef std::pair<int, unsigned> value_counter;
int main(){
std::vector<int> sequence = { 0, 1, 3, 3, 4, 1, 2, 2, 2, 2 , 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4 };
std::map<int, unsigned> count;
for( auto i : sequence){
count[i]++;
}
std::vector<value_counter> store( count.size() );
std::copy(count.begin(), count.end(), store.begin());
// Sort by the second value
std::sort(store.begin(), store.end(),
[](const value_counter& a, const value_counter& b){
return a.second > b.second;
});
std::vector<int> result;
// We need two indices, one for the current value
// and the other one for the alternative
for(unsigned i = 0, j = 1; i < store.size(); ++i){
while(store[i].second > 0){
result.push_back(store[i].first);
store[i].second--;
if(store[i].second == 0)
continue;
if( j <= i)
j = i + 1;
while(j < store.size() && store[j].second == 0)
++j;
if(j >= store.size()){
std::cerr << "Not enough elements for filling!" << std::endl;
return 1;
}
else {
result.push_back(store[j].first);
store[j].second--;
}
}
}
for( auto r : result){
std::cout << r << " ";
}
}
Instead of using a typedef you could create an alternative counter which has better names than first and second, but that makes copying from the map a little bit more verbose.