How to access elements in vectors in Lists more efficiently? - c++

I have a problem where I want to combine a list of vectors, all of the same type, in a particular fashion. I want the first element of my resultant vector to be the first element of the first vector in my list, the second element should be the first element of the second vector, the third, the first of the third and so on until n where n is length of my list and then element n+1 should be the second element of the first vector. This repeats until finished.
Currently, I am doing it like this:
CharacterVector measure(nrows * expansion);
CharacterVector temp(nrows);
for(int i=0; i < measure.size(); i++){
temp = values[i % expansion];
measure[i] = temp[i / expansion];
}
return(measure);
Where values is the List of CharacterVectors. This seems incredibly inefficient, overwriting temp every single time but I don't know of a better way to access the elements in values. I don't know a lot of C++ but I assume there must be a better way.
Any and all help is greatly appreciate!
EDIT:
All vectors in 'values are of the same length nrows and values has expansion elements in it.

What you need is the ListOf<CharacterVector> class. As the name implies, it represents an R list which only contains CharacterVector.
The code below uses it to extract the second element of each character vector from the list. Should not be hard to adapt it to your expansion algorithm, but your example was not reproducible without a bit more context.
#include <Rcpp.h>
using namespace Rcpp ;
// [[Rcpp::export]]
CharacterVector second( ListOf<CharacterVector> values ){
int n = values.size() ;
CharacterVector res(n);
for(int i=0; i<n; i++){
res[i] = values[i][1] ;
}
return res ;
}
Then, you sourceCpp this and try it on some sample data:
> data <- list(letters, letters, LETTERS)
> second(data)
[1] "b" "b" "B"
Now about your assumption:
This seems incredibly inefficient, overwriting temp every single time
Creating a CharacterVector is pretty fast, there is no deep copy of data, so this should not have been an issue in the first place.

You can preconstruct the vector and can easily know at which positions the elements of the first vector should go.. i.e. measure[0], measure[n], measure[n*2] etc.. where n = mylist.size(). Here i assume of course that each vector in the list has equal size. Untested code:
CharacterVector measure(nrows * expansion);
for(int i=0; i < values.size(); ++i)
{
CharacterVector& temp = values[i];
int newPosition = i;
for( int j=0; j < temp.size(); ++j)
{
measure[newPosition ] = temp[j];
newPosition += expansion;
}
}
return(measure);

Related

Sort array of n elements which has k sorted sections

What is the best way to sort an section-wise sorted array as depicted in the second image?
The problem is performing a quick-sort using Message Passing Interface. The solution is performing quick-sort on array sections obtained by using MPI_Scatter() then joining the sorted
pieces using MPI_Gather().
Problem is that the array as a whole is unsorted but sections of it are.
Merging the sub-sections similarly to this solution seems like the best way of sorting the array, but considering that the sub-arrays are already within a single array other sorting algorithms may prove better.
The inputs for a sort function would be the array, it's length and the number of equally sorted sub-sections.
A signature would look something like int* sort(int* array, int length, int sections);
The sections parameter can have any value between 1 and 25. The length parameter value is greater than 0, a multiple of sections and smaller than 2^32.
This is what I am currently using:
int* merge(int* input, int length, int sections)
{
int* sub_sections_indices = new int[sections];
int* result = new int[length];
int section_size = length / sections;
for (int i = 0; i < sections; i++) //initialisation
{
sub_sections_indices[i] = 0;
}
int min, min_index, current_index;
for (int i = 0; i < length; i++) //merging
{
min_index = 0;
min = INT_MAX;
for (int j = 0; j < sections; j++)
{
if (sub_sections_indices[j] < section_size)
{
current_index = j * section_size + sub_sections_indices[j];
if (input[current_index] < min)
{
min = input[current_index];
min_index = j;
}
}
}
sub_sections_indices[min_index]++;
result[i] = min;
}
return result;
}
Optimizing for performance
I think this answer that maintains a min-heap of the smallest item of each sub-array is the best way to handle arbitrary input. However, for small values of k, think somewhere between 10 and 100, it might be faster to implement the more naive solutions given in the question you linked to; while maintaining the min-heap is only O(log n) for each step, it might have a higher overhead for small values of n than the simple linear scan from the naive solutions.
All these solutions create a copy of the input, and they maintain O(k) state.
Optimizing for space
The only way to save space I see is to sort in-place. This will be a problem for the algorithms mentioned above. An in-place algorithm will have two swap elements, but any swaps will likely destroy the property that each sub-array is sorted, unless the larger of the swapped pair is re-sorted into the sub-array it is being swapped to, which will result in an O(n²) algorithm. So if you really do need to conserve memory, I think a regular in-place sorting algorithm would have to be used, which defeats your purpose.

Void value not ignored as it ought to be (Trying to assign void to non-void variable?)

vector<vector<int>> matrixReshape(vector<vector<int>>& nums, int r, int c) {
int row = nums.size();
int col = nums[0].size();
vector<vector<int>> newNums;
if((row*col) < (r*c)){
return nums;
}
else{
deque<int> storage;
for(int i = 0; i < row; i++){
for(int k = 0; k < col; k++){
storage.push_back(nums[i][k]);
}
}
for(int j = 0; j < r; j++){
for(int l = 0; l < c; l++){
newNums[j][l] = storage.pop_front();
}
}
}
return newNums;
}
Hey guys, I am having a problem where I am getting the said error of the title above 'Void value not ignored as it ought to be'. When I looked up the error message, the tips stated "This is a GCC error message that means the return-value of a function is 'void', but that you are trying to assign it to a non-void variable. You aren't allowed to assign void to integers, or any other type." After reading this, I assumed my deque was not being populated; however, I can not find out why my deque is not being populated. If you guys would like to know the problem I am trying to solve, I will be posting it below. Also, I cannot run this through a debugger since it will not compile :(. Thanks in advance.
In MATLAB, there is a very useful function called 'reshape', which can reshape a matrix into a new one with different size but keep its original data.
You're given a matrix represented by a two-dimensional array, and two positive integers r and c representing the row number and column number of the wanted reshaped matrix, respectively.
The reshaped matrix need to be filled with all the elements of the original matrix in the same row-traversing order as they were.
If the 'reshape' operation with given parameters is possible and legal, output the new reshaped matrix; Otherwise, output the original matrix.
Example 1:
Input:
nums =
[[1,2],
[3,4]]
r = 1, c = 4
Output:
[[1,2,3,4]]
Explanation:
The row-traversing of nums is [1,2,3,4]. The new reshaped matrix is a 1 * 4 matrix, fill it row by row by using the previous list.
This line has two problems:
newNums[j][l] = storage.pop_front();
First, pop_front() doesn't return the element that was popped. To get the first element of the deque, use storage[0]. Then call pop_front() to remove it.
You also can't assign to newNums[j][i], because you haven't allocated those elements of the vectors. You can pre-allocate all the memory by declaring it like this.
vector<vector<int>> newNums(r, vector<int>(c));
So the above line should be replaced with:
newNums[j][l] = storage[0];
storage.pop_front();

knapsack with weight only

if i had given the maximum weight say w=20 .and i had given a set on weights say m=[5,7,12,18] then how could i calculate the max possible weight that we can hold inside the maximum weight using the m. in this case the answer is 19.by adding 12+7=19. and my code is giving me 18.please help me in this.
int weight(int W, vector<int> &m) {
int current_weight = 0;
int temp;
for (int i = 0; i < w.size(); i++) {
for (int j = i + 1; j < m.size(); j++) {
if (m[i] < m[j]) {
temp = m[j];
m[j] = m[i];
m[i] = temp;
}
}
}
for (size_t i = 0; i < m.size(); ++i) {
if (current_weight + m[i] <= W) {
current_weight += m[i];
}
}
return current_weight;
}
The problem you describe looks more like a version of the maximum subset sum problem. Basically, there is nothing wrong with your implementaion in the first place; apparently you have correctly implemented a greedy algorithm for the problem. That being said, this algorithm fails to generate an optimal solution for every input. The instance you have found is such an example.
However, the problem can be solved using a different approach termed dynamic programming, which can be seen as form of organization of a recursive formulation of the solution.
Let m = { m_1, ... m_n } be the set of positive item sizes and W a capscity constraint where n is a positive integer. Organize an array A[n][W] as a state space where
A[i][j] = the maximum weight at most j attainable for the set of items
with indices from 0 to i if such a solution exists and
minus infinity otherwise
for each i in {1,...,n} and j in {1,...,W}; for ease of presentation, suppose that A has a value of minus infinity everywhere else. Note that for each such i and j the recurrence relation
A[i][j] = min { A[i-1][W-m_j] + m_j, A[i-1][W] }
holds, where the first case corresponds to selecting item i into the solution and the second case corresponds to not selecting item i into the solution.
Next, organize a loop which fills this table in an order of increasing values of i and j, where the initialization for i = 1 has to be done before. After filling the state space, the maximum feasible value in the last colum
max{ A[n][j] : j in {1,...,W}, A[n][j] is not minus infinity }
yields the optimal solution. If the associated set of items is also desired, either some backtracking or suitable auxiliary data structures have to be used.
So it feels like this solution can be a trivial change to the commonly existing 0-1 knapsack problem, by passing the copy of the weight array as the value array.

Comparisons of strings with c++

I used to have some code in C++ which stores strings as a series of characters in a character matrix (a string is a row). The classes Character matrix and LogicalVector are provided by Rcpp.h:
LogicalVector unq_mat( CharacterMatrix x ){
int nc = x.ncol() ; // Get the number of columns in the matrix.
LogicalVector out(nc); // Make a logical (bool) vector of the same length.
// For every col in the matrix, assess whether the column contains more than one unique character.
for( int i=0; i < nc; i++ ) {
out[i] = unique( x(_,i) ).size() != 1 ;
}
return out;
}
The logical vector identifies which columns contain more than one unique character. This is then passed back to the R language and used to manipulate a matrix. This is a very R way of thinking of doing this. However I'm interested in developing my thinking in C++, I'd like to write something that achieves the above: So finds out which characters in n strings are not all the same, but preferably using the stl classes like std::string. As a conceptual example given three strings:
A = "Hello", B = "Heleo", C = "Hidey". The code would point out that positions/characters 2,3,4,5 are not one value, but position/character 1 (the 'H') is the same in all strings (i.e. there is only one unique value). I have something below that I thought worked:
std::vector<int> StringsCompare(std::vector<string>& stringVector) {
std::vector<int> informative;
for (int i = 0; i < stringVector[0].size()-1; i++) {
for (int n = 1; n < stringVector.size()-1; n++) {
if (stringVector[n][i] != stringVector[n-1][i]) {
informative.push_back(i);
break;
}
}
}
return informative;
}
It's supposed to go through every character position (0 to size of string-1) with the outer loop, and with the inner loop, see if the character in string n is not the same as the character in string n-1. In cases where the character is all the same, for example the H in my hello example above, this will never be true. For cases where the characters in the strings are different the inter loops if statement will be satisfied, the character position recorded, and the inner loop broken out of. I then get a vector out containing the indicies of the characters in the n strings where the characters are not all identical. However these two functions give me different answers. How else can I go through n strings char by char and check they are not all identical?
Thanks,
Ben.
I expected #doctorlove to provide an answer. I'll enter one here in case he does not.
To iterate through all of the elements of a string or vector by index, you want i from 0 to size()-1. for (int i=0; i<str.size(); i++) stops just short of size, i.e., stops at size()-1. So remove the -1's.
Second, C++ arrays are 0-based, so you must adjust (by adding 1 to the value that is pushed into the vector).
std::vector<int> StringsCompare(std::vector<std::string>& stringVector) {
std::vector<int> informative;
for (int i = 0; i < stringVector[0].size(); i++) {
for (int n = 1; n < stringVector.size(); n++) {
if (stringVector[n][i] != stringVector[n-1][i]) {
informative.push_back(i+1);
break;
}
}
}
return informative;
}
A few things to note about this code:
The function should take a const reference to vector, as the input vector is not modified. Not really a problem here, but for various reasons, it's a good idea to declare unmodified input references as const.
This assumes that all the strings are at least as long as the first. If that doesn't hold, the behavior of the code is undefined. For "production" code, you should include a check for the length prior to extracting the ith element of each string.

Merge sorted arrays - Efficient solution

Goal here is to merge multiple arrays which are already sorted into a resultant array.
I've written the following solution and wondering if there is a way to improve the solution
/*
Goal is to merge all sorted arrays
*/
void mergeAll(const vector< vector<int> >& listOfIntegers, vector<int>& result)
{
int totalNumbers = listOfIntegers.size();
vector<int> curpos;
int currow = 0 , minElement , foundMinAt = 0;
curpos.reserve(totalNumbers);
// Set the current position that was travered to 0 in all the array elements
for ( int i = 0; i < totalNumbers; ++i)
{
curpos.push_back(0);
}
for ( ; ; )
{
/* Find the first minimum
Which is basically the first element in the array that hasn't been fully traversed
*/
for ( currow = 0 ; currow < totalNumbers ; ++currow)
{
if ( curpos[currow] < listOfIntegers[currow].size() )
{
minElement = listOfIntegers[currow][curpos[currow] ];
foundMinAt = currow;
break;
}
}
/* If all the elements were traversed in all the arrays, then no further work needs to be done */
if ( !(currow < totalNumbers ) )
break;
/*
Traverse each of the array and find out the first available minimum value
*/
for ( ;currow < totalNumbers; ++currow)
{
if ( listOfIntegers[currow][curpos[currow] ] < minElement )
{
minElement = listOfIntegers[currow][curpos[currow] ];
foundMinAt = currow;
}
}
/*
Store the minimum into the resultant array
and increment the element traversed
*/
result.push_back(minElement);
++curpos[foundMinAt];
}
}
The corresponding main goes like this.
int main()
{
vector< vector<int> > myInt;
vector<int> result;
myInt.push_back(vector<int>() );
myInt.push_back(vector<int>() );
myInt.push_back(vector<int>() );
myInt[0].push_back(10);
myInt[0].push_back(12);
myInt[0].push_back(15);
myInt[1].push_back(20);
myInt[1].push_back(21);
myInt[1].push_back(22);
myInt[2].push_back(14);
myInt[2].push_back(17);
myInt[2].push_back(30);
mergeAll(myInt,result);
for ( int i = 0; i < result.size() ; ++i)
{
cout << result[i] << endl;
}
}
You can generalize Merge Sort algorithm and work with multiple pointers. Initially, all of them are pointing to the beginning of each array. You maintain these pointers sorted (by the values they point to) in a priority queue. In each step, you remove the smallest element in the heap in O(log n) (n is the number of arrays). You then output the element pointed by the extracted pointer. Now you increment this pointer in one position and if you didn't reach the end of the array, reinsert in the priority queue in O(log n). Proceed this way until the heap is not empty. If there are a total of m elements, the complexity is O(m log n). The elements are output in sorted order this way.
Perhaps I'm misunderstanding the question...and I feel like I'm misunderstanding your solution.
That said, maybe this answer is totally off-base and not helpful.
But, especially with the number of vectors and push_back's you're already using, why do you not just use std::sort?
#include <algorithm>
void mergeAll(const vector<vector<int>> &origList, vector<int> &resultList)
{
for(int i = 0; i < origList.size(); ++i)
{
resultList.insert(resultList.end(), origList[i].begin(), origList[i].end());
}
std::sort(resultList.begin(), resultList.end());
}
I apologize if this is totally off from what you're looking for. But it's how I understood the problem and the solution.
std::sort runs in O(N log (N)) http://www.cppreference.com/wiki/stl/algorithm/sort
I've seen some solution on the internet to merge two sorted arrays, but most of them were quite cumbersome. I changed some of the logic to provide the shortest version I can come up with:
void merge(const int list1[], int size1, const int list2[], int size2, int list3[]) {
// Declaration & Initialization
int index1 = 0, index2 = 0, index3 = 0;
// Loop untill both arrays have reached their upper bound.
while (index1 < size1 || index2 < size2) {
// Make sure the first array hasn't reached
// its upper bound already and make sure we
// don't compare outside bounds of the second
// array.
if ((list1[index1] <= list2[index2] && index1 < size1) || index2 >= size2) {
list3[index3] = list1[index1];
index1++;
}
else {
list3[index3] = list2[index2];
index2++;
}
index3++;
}
}
If you want to take advantage of multi-threading then a fairly good solution would be to just merge 2 lists at a time.
ie suppose you have 9 lists.
merge list 0 with 1.
merge list 2 with 3.
merge list 4 with 5.
merge list 6 with 7.
These can be performed concurrently.
Then:
merge list 0&1 with 2&3
merge list 4&5 with 6&7
Again these can be performed concurrently.
then merge list 0,1,2&3 with list 4,5,6&7
finally merge list 0,1,2,3,4,5,6&7 with list 8.
Job done.
I'm not sure on the complexity of that but it seems the obvious solution and DOES have the bonus of being multi-threadable to some extent.
Consider the priority-queue implementation in this answer linked in a comment above: Merging 8 sorted lists in c++, which algorithm should I use
It's O(n lg m) time (where n = total number of items and m = number of lists).
All you need is two pointers (or just int index counters), checking for minimum between array A and B, copying the value over to the resultant list, and incrementing the pointer of the array the minimum came from. If you run out of elements on one source array, copy the remainder of the second to the resultant and you're done.
Edit:
You can trivially expand this to N arrays.
Edit:
Don't trivially expand this to N arrays :-). Do two at a time. Silly me.
If you are merging very many vector together, then you could speed up performance by using a sort of tree to determine which vector contains the smallest element. This is probably not necessary for your application, but comment if it is and I'll try to work it out.
You could just stick them all into a multiset. That will handle the sorting for you.