Merging 3 sets of arrays - c++

I am fairly new to C++ and I have some doubts about merging three arrays.
For example:
x = 2,3,1,4,5
y = 1,3,5,7,9
z = 3,5,4,6,1
I would like to merge them into:
w = 2,1,3,3,3,5,1,5,4,4,7,6,5,9,1
I have been searching on Google, but all I find is how to merge the arrays and put them in ascending order.
What I actually need:
1st from x, 1st from y, 1st from z, 2nd from x, 2nd from y, 2nd from z ............ 5th from z
Thank you very much!

It's just a matter of making a loop with i from 0 to 4 and mapping every i to the corresponding element of the array w.
Here's the skeleton of the algorithm:
std::array<int, 5> x, y, z;
std::array<int, 15> w;
for (int i = 0; i < 5; i++) {
    w[i*3]     = x[i];
    w[i*3 + 1] = y[i];
    w[i*3 + 2] = z[i];
}
A complete, compilable version is shown below.
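This is a minimal sketch using the sample values from the question; the final loop just prints the result for verification:

#include <array>
#include <cstddef>
#include <iostream>

int main()
{
    std::array<int, 5> x {2, 3, 1, 4, 5};
    std::array<int, 5> y {1, 3, 5, 7, 9};
    std::array<int, 5> z {3, 5, 4, 6, 1};
    std::array<int, 15> w {};

    // element i of x, y and z goes to positions 3i, 3i+1 and 3i+2 of w
    for (std::size_t i = 0; i < x.size(); i++) {
        w[i * 3]     = x[i];
        w[i * 3 + 1] = y[i];
        w[i * 3 + 2] = z[i];
    }

    for (int v : w) std::cout << v << ' ';   // 2 1 3 3 3 5 1 5 4 4 7 6 5 9 1
    std::cout << '\n';
}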
If you are using std::vector, then the algorithm gets a little bit trickier. You'll need to find the maximum size, using std::max for example, and perform a loop based on that value. Then whenever a vector has run out of elements, you'll need to skip it. Here's the skeleton again:
std::vector<int> x, y, z;
std::vector<int> w;
std::size_t max = std::max({x.size(), y.size(), z.size()});
for (std::size_t i = 0; i < max; i++) {
    if (x.size() > i) w.push_back(x[i]);
    if (y.size() > i) w.push_back(y[i]);
    if (z.size() > i) w.push_back(z[i]);
}

As long as you know the sizes of x, y, z, and w, this is fairly straightforward to solve.
In C++, unlike other higher-level programming languages, most array-based operations are not handled by special functions. Instead, the user is required to write a loop to do this task themselves.
In your case, assuming that x, y, z, and w are all declared and defined properly, the most straightforward way is probably using a for loop, as follows:
int i;
for (i = 0; i < size_of_x; i++) {
    w[i*3]     = x[i];
    w[i*3 + 1] = y[i];
    w[i*3 + 2] = z[i];
}
Notice that the variable size_of_x will need to be defined for this to work.
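For example, with plain C-style arrays, size_of_x can be derived with sizeof; a minimal sketch (the variable names follow the answer above, and the three arrays are assumed to have equal length):

#include <cstddef>

int main()
{
    int x[] = {2, 3, 1, 4, 5};
    int y[] = {1, 3, 5, 7, 9};
    int z[] = {3, 5, 4, 6, 1};
    int w[15];

    std::size_t size_of_x = sizeof(x) / sizeof(x[0]);   // number of elements, not bytes
    for (std::size_t i = 0; i < size_of_x; i++) {
        w[i * 3]     = x[i];
        w[i * 3 + 1] = y[i];
        w[i * 3 + 2] = z[i];
    }
}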

You might also want to consider the fact that the lengths of the arrays may differ.
#include <algorithm>   // for std::max

int lenX = sizeof(x) / sizeof(x[0]);   // sizeof alone gives bytes, not element counts
int lenY = sizeof(y) / sizeof(y[0]);
int lenZ = sizeof(z) / sizeof(z[0]);
int totalLength = lenX + lenY + lenZ;
int maxLength = std::max(lenX, std::max(lenY, lenZ));
int *resArray = new int[totalLength];
int j = 0;
for (int i = 0; i < maxLength; i++)
{
    if (i < lenX)
    {
        resArray[j] = x[i];
        j++;
    }
    if (i < lenY)
    {
        resArray[j] = y[i];
        j++;
    }
    if (i < lenZ)
    {
        resArray[j] = z[i];
        j++;
    }
}
It won't be the fastest solution, but it can handle arrays of different lengths.
Edit:
Do not forget to free the memory you've allocated with new (delete[] resArray; once you are done with it).
You can also consider using std::vector instead.
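If the arrays become std::vector, the manual new/delete[] bookkeeping disappears entirely. A minimal sketch of the same interleaving idea (the function name is just illustrative):

#include <algorithm>
#include <cstddef>
#include <vector>

std::vector<int> interleave(const std::vector<int>& x,
                            const std::vector<int>& y,
                            const std::vector<int>& z)
{
    std::vector<int> res;
    res.reserve(x.size() + y.size() + z.size());   // one allocation up front
    std::size_t maxLength = std::max({x.size(), y.size(), z.size()});
    for (std::size_t i = 0; i < maxLength; ++i) {
        if (i < x.size()) res.push_back(x[i]);
        if (i < y.size()) res.push_back(y[i]);
        if (i < z.size()) res.push_back(z[i]);
    }
    return res;   // memory is released automatically when the vector is destroyed
}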

How about some C++11?
#include <vector>

int main()
{
    std::vector<int> x {2,3,1,4,5}, y {1,3,5,7,9}, z {3,5,4,6,1};
    std::vector<int> w;
    // assumes x, y and z all have the same size
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        w.insert(w.end(), {x[i], y[i], z[i]});
    }
}

Related

Looping Through Large, Multidimensional Array Using Rcpp

I am trying to create models that involve looping through large multidimensional arrays (e.g. dimensions = 20 x 1000 x 60), which run very slowly the way I code them in R. I downloaded Rcpp and have been trying to implement such a model, since C++ handles loops very well. Normally, I would write such a function in R as:
fun <- function(x,y,z){
f <- array(0, dim = c(18,50,10));
for (i in 1:18){
for (j in 1:50){
for (l in 1:10){
f[i,j,l] <- (i*j/10) + l;
}
}
}
return(f[x,y,z])
}
and as expected the function yields:
> fun(10,20,5)
[1] 25
This is what I thought the equivalent code in Rcpp should look like:
cppFunction('
double fun(int x, int y, int z){
int f[18][50][10] = {0};
for (int i = 1; i > 18; i++){
for (int j = 1; j > 50; j++){
for (int l = 1; l > 10; l++){
f[i][j][l] = (i * j/10) + l;
}
}
}
return f[x][y][z];
}
')
but I am getting 0's anytime I go to use the function.
> fun(10,20,5)
[1] 0
The actual models I'll be implementing use backward iteration, so I do need the arrays as part of the function. Alternatively, returning the array itself would also work for my purposes, but I haven't had luck with that either.
Any help would be sincerely appreciated.
Thanks
Remember that C++ is 0-indexed. You need to start your indexing at 0 rather than 1 as in R. You also need to make sure that your loops only continue while the values of i, j, and l are less than the dimensions of the array (so switch > for <). Your array also needs to be an array of double rather than int, and the division should be by 10.0 rather than 10, otherwise the integer division truncates the fractional part that R keeps:
Rcpp::cppFunction('
double fun(int x, int y, int z){
    double f[18][50][10] = {0};
    for (int i = 0; i < 18; i++){
        for (int j = 0; j < 50; j++){
            for (int l = 0; l < 10; l++){
                f[i][j][l] = (i * j / 10.0) + l;
            }
        }
    }
    return f[x][y][z];
}
')
Testing gives:
fun(10, 20, 5)
#> [1] 25
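If you also want the "returning the array itself" alternative mentioned in the question, one possible sketch (this is an assumption on my part, not part of the answer above) uses an Rcpp::NumericVector with a dim attribute, since R stores arrays in column-major order:

Rcpp::cppFunction('
Rcpp::NumericVector fun_array() {
    // R lays the 18 x 50 x 10 array out in column-major order, so element
    // (i, j, l) sits at linear index i + 18*j + 18*50*l.
    Rcpp::NumericVector f(Rcpp::Dimension(18, 50, 10));
    for (int i = 0; i < 18; i++)
        for (int j = 0; j < 50; j++)
            for (int l = 0; l < 10; l++)
                f[i + 18 * j + 18 * 50 * l] = (i * j / 10.0) + l;
    return f;
}
')

In R, f <- fun_array() then returns an ordinary 18 x 50 x 10 array, where f[11, 21, 6] corresponds to i = 10, j = 20, l = 5 in the C++ loop.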

Why am I getting "unknown signal 11" with this knapsack problem solver?

Task
Given n gold bars, find the maximum weight of gold that fits into a bag of capacity W.
Input
The first line contains the capacity W of the knapsack and the number n of bars of gold. The next line contains n integers.
Output
The max weight of gold that fits into a knapsack of capacity W.
Constraints
1 <= W <= 10000; 1<= n <= 300; 0 <= w0, w1, w2, ... , w(n-1) <= 100000
Code
#include <iostream>
#include <vector>
using std::vector;
int optimal_weight(int W, vector<int> w) {
int n = w.size() + 1;
int wt = W + 1;
int array [n][wt];
int val = 0;
for(int i = 0; i < wt; i++) array [0][i] = 0;
for(int i = 0; i < n; i++) array [i][0] = 0;
for(int i = 1; i< n; i++) {
for(int j = 1; j < wt; j++ ){
array[i][j] = array [i-1][j];
if (w[i-1] <= j) {
val = array[i-1][j - w[i-1]] + w[i-1];
if(array[i][j] < val) array[i][j] = val;
}
}
}
//printing the grid
// for(int i=0; i < n; i++) {
// for(int j=0; j < wt; j++) {
// cout<<array[i][j]<<" ";
// }
// cout<<endl;
// }
// cout<<endl;
return array [n-1][wt-1];
}
int main() {
int n, W;
std::cin >> W >> n;
vector<int> w(n);
for (int i = 0; i < n; i++) {
std::cin >> w[i];
}
std::cout << optimal_weight(W, w) << '\n';
}
The above code works fine for smaller inputs, but gives an unknown signal 11 error on the platform I wish to submit to. My best guess is a segmentation fault, but I haven't been able to debug it for quite some time now. Any help is much appreciated!
First note that your code doesn't work. That is, it doesn't compile when you adhere strictly to the C++ language standard, as C++ does not support variable-length arrays (as noted by @Evg in a comment; some compilers offer this as an extension).
The main reason for excluding those from C++ is probably why you're experiencing issues for larger problem sizes: the danger of stack overflow, the namesake of this website (as noted by @huseyinturgulbuyukisik in a comment). Variable-length arrays are allocated on the stack, whose size is limited. When you exceed it, you might attempt to write to a segment of memory that is not allocated to your process, triggering Linux signal 11, also known as SIGSEGV - the segmentation violation signal.
Instead of stack-based allocation, you should allocate your memory on the heap. A straightforward way to do so would be using the std::vector container (whose default allocator does indeed allocate on the heap). Thus, you would write:
std::vector<int> vec(n * wt);
and instead of array[i][j] you'd use vec[i * wt + j].
Now, this is not as convenient as using array[x][y]; to get that convenience back you can, for example, write a helper lambda to access individual elements, e.g.
auto array_element = [&vec, wt](int x, int y) -> int& { return vec[x * wt + y]; };
With this lambda available (note the -> int& return type, which makes it return a reference so you can assign through it), you can now write statements such as array_element(i, j) = array_element(i - 1, j);
or use a multi-dimensional container (std::vector<std::vector<int>> would work but it's ugly and wasteful IMHO; unfortunately, the standard library doesn't have a single-allocation multi-dimensional equivalent of that).
Other suggestions, not regarding a solution to your signal 11 issue:
Use more descriptive variable names, e.g. weight instead of wt and capacity instead of W. I'd also consider sub_solutions_table or solutions_table instead of array, and might also rename i and j according to the semantics of the dynamic-programming table.
You never actually need more than 2 rows of the solutions table; why not just allocate one row for the current iteration and one row for the previous iteration, and have appropriate pointers switch between them?
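A minimal sketch of that two-row idea (keeping the recurrence from the question; the names are illustrative, and swapping the two vectors plays the role of the pointer switch):

#include <algorithm>
#include <cstddef>
#include <vector>

int optimal_weight(int capacity, const std::vector<int>& w)
{
    // prev is row i-1 of the DP table, curr is row i; both live on the heap
    std::vector<int> prev(capacity + 1, 0), curr(capacity + 1, 0);
    for (std::size_t i = 0; i < w.size(); ++i) {
        for (int j = 0; j <= capacity; ++j) {
            curr[j] = prev[j];                                       // skip bar i
            if (w[i] <= j)
                curr[j] = std::max(curr[j], prev[j - w[i]] + w[i]);  // or take it
        }
        std::swap(prev, curr);   // the current row becomes the previous one
    }
    return prev[capacity];
}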
Replace
int array[n][wt];
with
vector<vector<int>> array(n, vector<int>(wt));
so the table is allocated on the heap rather than the stack.

c++ Vector optimizations

I have a vector like this:
struct cords {
double x, y;
};
struct road {
cords start, end;
};
vector<road> roads;
And I have a function in my class that works terribly slowly. The main purpose of the function is to take all pairs of elements from the vector and do some math with them. I'm not changing the values of the vector's items inside, just reading them very often.
The first problem I noticed was that the loop itself wasn't fast enough, which is why I'm using:
unsigned maxI = roads.size();
unsigned maxJ = roads.size();
for (unsigned i = 0; i < maxI; i++) {
for (unsigned j = i + 1; j < maxJ; j++) {
...
}
}
That gave a reasonable time improvement to this function, but not enough.
As I said earlier, the body is just math and a few conditions, with vector accesses like roads[j].end.y.
Next, I noticed that if I do
for (unsigned i = 0; i < maxI; i++) {
cords point1 = roads[i].start;
cords point2 = roads[i].end;
for (unsigned j = i + 1; j < maxJ; j++) {
and use point1 and point2 instead of indexing into roads each time, it runs almost twice as fast.
I just don't get why this is happening, and how I can improve it further.
UPD: I'm not sure, but this might be compiler-dependent; I'm using VS2015 with its built-in compiler.
If there is no need to modify point1 and point2 temporarily in the inner loop, avoid copying them; just take const references to them to improve speed.
const cords &point1 = roads[i].start;
const cords &point2 = roads[i].end;
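A minimal sketch of how the loops might look with references throughout (the math in the inner loop is only a placeholder, since the real computation isn't shown in the question):

#include <cmath>
#include <cstddef>
#include <vector>

struct cords { double x, y; };
struct road  { cords start, end; };

double process(const std::vector<road>& roads)
{
    double total = 0.0;
    const std::size_t n = roads.size();
    for (std::size_t i = 0; i < n; ++i) {
        const cords& point1 = roads[i].start;   // aliases into the vector, no copies
        const cords& point2 = roads[i].end;
        for (std::size_t j = i + 1; j < n; ++j) {
            const road& other = roads[j];
            // placeholder math using both endpoints of both roads
            total += std::hypot(point1.x - other.start.x, point1.y - other.start.y)
                   + std::hypot(point2.x - other.end.x,   point2.y - other.end.y);
        }
    }
    return total;
}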

Bucket sort and User input

Here's the problem I'm working on: a user gives me an unspecified number of points on a standard x,y coordinate plane, where 0 < x^2 + y^2 <= 1. (x squared plus y squared, just for clarity).
Here is an example of the input:
0.2 0.38
0.6516 -0.1
-0.3 0.41
-0.38 0.2
From there, I calculate the distance of those points from the origin, (0, 0). Here is the function I use to find the distance and push it into a vector of doubles, B.
void findDistance(double x = 0, double y = 0) {
double x2 = pow(x, 2);
double y2 = pow(y, 2);
double z = x2 + y2;
double final = sqrt(z);
B.push_back(final);
}
Then, I want to bucket sort vector B, where there are n buckets for n points. Here is my current build of the bucketSort:
void bucketSort(double arr[], int n)
{
vector<double> b[n];
for (int i=0; i<n; i++)
{
int bi = n*arr[i];
b[bi].push_back(arr[i]);
}
for (int i=0; i<n; i++)
sort(b[i].begin(), b[i].end());
int index = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < b[i].size(); j++)
arr[index++] = b[i][j];
}
My problem is I can't get bucketSort to work without crashing. I get a windows message saying the program has stopped working. Now, I know the function works, but only when I initialize the vector and fill it at the same time. This is an example of a call that works:
double arr[] = {0.707107, 0.565685, 0.989949, 0.848528 };
int n = sizeof(arr)/sizeof(arr[0]);
bucketSort(arr, n);
So far, I've yet to find any other way of building and passing the array that the function will accept and run. I need to find a way to take the points, compute the distances, and sort the distances. Here is the main I'm currently using, which fails:
int main(){
int number;
while (cin >> number){
A.push_back(number); }
int q = 0; double r = 0; double d = 0;
while (q < (A.size() - 1)){
findDistance(A[q], A[q+1]);
q += 2;
}
double arr[B.size()]; copy(B.begin(), B.end(), arr);
int n = (sizeof(B) + sizeof(B[0])) / sizeof(B[0]);
bucketSort(arr, n);
int w = 0;
while (w < y){ cout << arr[w] << endl; w++; }
The arr copy was created in some strange debugging attempt, sorry if that's unclear. The results of the distance function are stored in B, copied into arr, and arr is what I try to sort. The user inputs are given through the command prompt, using the syntax listed at the beginning. The output should be something like:
0.42941
0.49241
0.50804
0.65923
If anyone can offer suggestions for edits to either of the functions that would make this work, the assistance would be greatly appreciated.
Here are a few issues to work on:
Your input loop reads into an int, so it will stop as soon as it hits a non-integer token. Change number to a double.
Your size calculation
int n = (sizeof(B) + sizeof(B[0])) / sizeof(B[0]);
I am not sure what you are trying to do here, but sizeof on a vector is not what you want. I think replacing this with:
int n = B.size();
is what you want.
I am not sure why you needed to convert the vector to an array to do the bucket sort - much easier to just pass the vector through to the bucket sort, then the size comes with the vector.
Change the bucketSort function to take a reference to a vector:
void bucketSort(vector<double> &arr)
{
    int n = arr.size();   // the size now comes from the vector that was passed in
    ...
and just pass B into the function. The rest of the code should be the same.
Also a portability note: not every compiler supports variable sized arrays, you are better off sticking with vector wherever possible.
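Putting those pieces together, a minimal sketch of a vector-based bucketSort might look like this (the std::min clamp is an extra guard I've added for a distance of exactly 1.0, which would otherwise index one past the last bucket; it isn't in the original code):

#include <algorithm>
#include <vector>

void bucketSort(std::vector<double>& arr)
{
    const int n = static_cast<int>(arr.size());
    if (n == 0) return;

    std::vector<std::vector<double>> b(n);                          // n buckets, allocated on the heap
    for (int i = 0; i < n; i++) {
        int bi = std::min(n - 1, static_cast<int>(n * arr[i]));     // bucket index; clamp handles arr[i] == 1.0
        b[bi].push_back(arr[i]);
    }
    for (int i = 0; i < n; i++)
        std::sort(b[i].begin(), b[i].end());                        // sort each bucket
    int index = 0;
    for (int i = 0; i < n; i++)
        for (double v : b[i])
            arr[index++] = v;                                       // concatenate the buckets back into arr
}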

All possible combination. Faster way

I have a vector of numbers between 1 and 100 (this is not important) which can have anywhere between 3 and 1,000,000 values.
I'd appreciate help getting all unique* 3-value combinations from that vector.
*Unique
Example: I have in the array the following values: 1[0] 5[1] 7[2] 8[3] 7[4] (the [x] is the index)
In this case 1[0] 5[1] 7[2] and 1[0] 5[1] 7[4] are different, but 1[0] 5[1] 7[2] and 7[2] 1[0] 5[1] are the same (a duplicate).
My algorithm is a little slow when I work with a lot of values (for example 1,000,000), so what I want is a faster way to do it.
for(unsigned int x = 0;x<vect.size()-2;x++){
for(unsigned int y = x+1;y<vect.size()-1;y++){
for(unsigned int z = y+1;z<vect.size();z++)
{
// do thing with vect[x],vect[y],vect[z]
}
}
}
In fact it is very very important that your values are between 1 and 100! Because with a vector of size 1,000,000 you have a lot of numbers that are equal and you don't need to inspect all of them! What you can do is the following:
Note: the following code is just an outline! It may lack sufficient error checking and is just here to give you the idea, not for copy paste!
Note2: When I wrote the answer, I assumed the numbers to be in the range [0, 99]. Then I read that they are actually in [1, 100]. Obviously this is not a problem and you can either -1 all the numbers or even better, change all the 100s to 101s.
bool exists[100] = {0}; // exists[i] means whether i exists in your vector
for (unsigned int i = 0, size = vect.size(); i < size; ++i)
exists[vect[i]] = true;
Then, you do similar to what you did before:
for (unsigned int x = 0; x < 98; x++)
    if (exists[x])
        for (unsigned int y = x+1; y < 99; y++)
            if (exists[y])
                for (unsigned int z = y+1; z < 100; z++)
                    if (exists[z])
                    {
                        // {x, y, z} is an answer
                    }
Another thing you can do is spend more time in preparation to have less time generating the pairs. For example:
int nums[100]; // from 0 to count are the numbers you have
int count = 0;
for (unsigned int i = 0, size = vect.size(); i < size; ++i)
{
    bool exists = false;
    for (int j = 0; j < count; ++j)
        if (vect[i] == nums[j])
        {
            exists = true;
            break;
        }
    if (!exists)
        nums[count++] = vect[i];
}
Then
for (unsigned int x = 0; x < count-2; x++)
    for (unsigned int y = x+1; y < count-1; y++)
        for (unsigned int z = y+1; z < count; z++)
        {
            // {nums[x], nums[y], nums[z]} is an answer
        }
Let us consider 100 to be a variable, so let's call it k, and the actual numbers present in the array as m (which is smaller than or equal to k).
With the first method, you have O(n) preparation and O(m^2*k) operations to search for the value which is quite fast.
In the second method, you have O(nm) preparation and O(m^3) for generation of the values. Given your values for n and m, the preparation takes too long.
You could actually merge the two methods to get the best of both worlds, so something like this:
int nums[100]; // from 0 to count are the numbers you have
int count = 0;
bool exists[100] = {0}; // exists[i] means whether i exists in your vector
for (unsigned int i = 0, size = vect.size(); i < size; ++i)
{
    if (!exists[vect[i]])
        nums[count++] = vect[i];
    exists[vect[i]] = true;
}
Then:
for (unsigned int x = 0; x < count-2; x++)
    for (unsigned int y = x+1; y < count-1; y++)
        for (unsigned int z = y+1; z < count; z++)
        {
            // {nums[x], nums[y], nums[z]} is an answer
        }
This method has O(n) preparation and O(m^3) cost to find the unique triplets.
Edit: It turned out that for the OP, the same number in different locations is considered a different value. If that is really the case, then I'm sorry, there is no faster solution. The reason is that the number of possible combinations is C(n, 3) (a binomial coefficient), and even though you generate each one of them in O(1), that count is simply too big.
There's really nothing that can be done to speed up the loop body you have there. Consider that with a vector of size 1M, you are making on the order of 10^17 loop iterations.
Producing all combinations like that is a combinatorial explosion, which means that you won't be able to solve it practically once the input size becomes large enough. Your only option would be to leverage specific knowledge of your application (what you need the results for, and how exactly they will be used) to work around the issue if possible.
Possibly you can sort your input, make it unique, and pick x[a], x[b] and x[c] with a < b < c. The sort will be O(n log n) and picking the combinations will be cubic in the number of unique values. Still, you will have fewer triplets to iterate over:
std::vector<int> x = original_vector;
std::sort(x.begin(), x.end());
x.erase(std::unique(x.begin(), x.end()), x.end());   // keep one copy of each value
for (std::size_t a = 0; a + 2 < x.size(); ++a)
    for (std::size_t b = a + 1; b + 1 < x.size(); ++b)
        for (std::size_t c = b + 1; c < x.size(); ++c)
            issue_triplet(x[a], x[b], x[c]);   // placeholder for whatever you do with the triplet
Depending on your actual data, you may be able to speed it up significantly by first building a vector that has at most three entries for each value and iterating over that instead.
As r15habh pointed out, I think the fact that the values in the array are between 1-100 is in fact important.
Here's what you can do: make one pass through the array, reading values into a unique set. This by itself is O(n) time complexity. The set will have no more than 100 elements, which means O(1) space complexity.
Now since you need to generate all 3-item combinations, you'll still need 3 nested loops, but instead of operating on the potentially huge array, you'll be operating on a set that has at most 100 elements.
Overall time complexity depends on your original data set. For a small data set, time complexity will be O(n^3). For a large data set, it will approach O(n).
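A minimal sketch of that approach (assuming, as this answer does, that only the distinct values matter rather than the original indices):

#include <cstddef>
#include <set>
#include <vector>

void unique_triples(const std::vector<int>& vect)
{
    std::set<int> unique_vals(vect.begin(), vect.end());           // one pass; at most 100 distinct values
    std::vector<int> v(unique_vals.begin(), unique_vals.end());    // indexable copy for the nested loops
    for (std::size_t x = 0; x + 2 < v.size(); ++x)
        for (std::size_t y = x + 1; y + 1 < v.size(); ++y)
            for (std::size_t z = y + 1; z < v.size(); ++z) {
                // {v[x], v[y], v[z]} is one combination
            }
}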
If I understand your application correctly, you can use a tuple instead and store it in either a set or a hash table, depending on your requirements. If the orientation of the triple matters, make sure you rotate it so that, say, the largest element comes first; if orientation doesn't matter, just sort the tuple. A version using Boost and integers:
#include <iostream>
#include <set>
#include <algorithm>
#include "boost/tuple/tuple.hpp"
#include "boost/tuple/tuple_comparison.hpp"

int main()
{
    typedef boost::tuple< int, int, int > Tri;
    typedef std::set< Tri > TriSet;
    TriSet storage;

    // 1 duplicate
    int exampleData[4][3] = { { 1, 2, 3 }, { 2, 3, 6 }, { 5, 3, 2 }, { 2, 1, 3 } };
    for( unsigned int i = 0; i < sizeof( exampleData ) / sizeof( exampleData[0] ); ++i )
    {
        std::sort( exampleData[i], exampleData[i] + ( sizeof( exampleData[i] ) / sizeof( exampleData[i][0] ) ) );
        if( !storage.insert( boost::make_tuple( exampleData[i][0], exampleData[i][1], exampleData[i][2] ) ).second )
            std::cout << "Duplicate!" << std::endl;
        else
            std::cout << "Not duplicate!" << std::endl;
    }
}
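For what it's worth, the same idea also works without Boost on any C++11 compiler by using std::tuple; a minimal sketch (not part of the original answer):

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <tuple>

int main()
{
    std::set<std::tuple<int, int, int>> storage;

    // 1 duplicate, as in the Boost example above
    int exampleData[4][3] = { { 1, 2, 3 }, { 2, 3, 6 }, { 5, 3, 2 }, { 2, 1, 3 } };
    for (auto& tri : exampleData)
    {
        std::sort(std::begin(tri), std::end(tri));   // make the comparison order-insensitive
        bool inserted = storage.insert(std::make_tuple(tri[0], tri[1], tri[2])).second;
        std::cout << (inserted ? "Not duplicate!" : "Duplicate!") << std::endl;
    }
}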
}