How to extract columns of a 2D array in c++? - c++

My task is to count the number of elements greater than an element aij in the corresponding row i and column j, for every element of a 2D array in C++. My approach is to extract the ith row and jth column, sort them, and traverse the sorted array with a counter variable until the element aij is found.
But the problem is in extracting the entire row i and entire column j for every such element. I know that the row can easily be extracted with std::copy function in c++.
int **adj=new int *[n];
for(r=0;r<m;r++)
for(c=0;c<n;c++)
cin>>adj[r][c];
int buf[n];
std::copy(adj[i], adj[i] + n, buf);
But how to extract the corresponding jth column?
I can easily do it with a looping structure like:
int buf[m];
for(r=0;r<m;r++)
buf[r]=adj[r][j];
but this will increase time complexity keeping in mind that this operation is required for every element of the array. Any better way to do this?

If you decide to write a program in C++, then:
Stop using plain C-style arrays. There is no reason whatsoever for C-style arrays. Never use them again. Simply stop this.
Stop using raw pointers. Now and forever. Do not use raw pointers.
Do not use new. Never.
The language C++, which you want to use, does not support VLAs (variable-length arrays). Do not use C-style arrays in the first place, and certainly not VLAs (like int buf[m];).
Especially, do not use such constructs if you do not understand how they work.
In your first line, you are writing
int **adj=new int *[n];
With that you are allocating an array of pointers. Those pointers are not initialized. They point to somewhere random in memory.
And with
for(r=0;r<m;r++)
for(c=0;c<n;c++)
cin>>adj[r][c];
You are reading user input and writing it into random memory, somewhere undefined, corrupting the heap and most likely causing a crash.
With
int buf[n];
std::copy(adj[i], adj[i] + n, buf);
you copy some random values into buf. It will look like it works. But it is only by accident.
In the future, please use std::vector or std::array (if you know the dimensions at compile time). For 2-dimensional arrays, use a vector of vectors.
See the following example:
#include <algorithm>
#include <iostream>
#include <vector>
int main()
{
const size_t numberOfRows = 3;
const size_t numberOfColumns = 4;
std::vector<std::vector<int>> a2d(numberOfRows, std::vector<int>(numberOfColumns));
// Fill a2d with data
for (size_t row = 0; row < a2d.size(); ++row) {
for (size_t col = 0; col < a2d.front().size(); ++col) {
std::cin >> a2d[row][col];
}
}
// Get 2nd row
std::vector<int> row(numberOfColumns);
std::copy(a2d[1].begin(), a2d[1].end(), row.begin());
return 0;
}
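The example above copies a row. For the column the question asks about, a vector of vectors still needs a loop over the rows, because the elements of a column are not stored next to each other. A minimal sketch, where `extractColumn` is a hypothetical helper name and the matrix is assumed to be rectangular:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original answer): copies column `col`
// of a rectangular vector of vectors into a new vector.
std::vector<int> extractColumn(const std::vector<std::vector<int>>& matrix,
                               std::size_t col)
{
    std::vector<int> column;
    column.reserve(matrix.size());
    for (const auto& row : matrix) {
        assert(col < row.size());  // every row must contain this column
        column.push_back(row[col]);
    }
    return column;
}
```

Calling `extractColumn(a2d, 1)` would return the second column of `a2d`. The cost is one pass over the rows per call, so it does not by itself remove the per-element overhead the question is worried about.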

But the problem is in extracting the entire row i and entire column j for every such element.
The algorithm you are trying to implement doesn't need to copy and sort the row and the column every time. You can copy and sort each row and each column once, then reuse those for every element. While time-consuming, it should be asymptotically faster than traversing the rows and columns multiple times to count the greater values.
See e.g. the following implementation.
#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>
int main()
{
std::vector<std::vector<int>> a {
{3, 5, 1, 2},
{8, 0, -2, 7},
{1, -5, 3, 6},
};
// Make a transposed copy. That's really cache unfriendly
auto a_t = std::vector<std::vector<int>>(a[0].size(), std::vector<int>(a.size()));
for (size_t r = 0; r < a.size(); ++r)
{
for (size_t c = 0; c < a[r].size(); ++c)
{
a_t[c][r] = a[r][c];
}
}
// Sort the rows of a_t (columns of a)
for (auto & row : a_t)
{
std::sort(row.begin(), row.end());
}
auto c = std::vector<std::vector<int>>(a.size(), std::vector<int>(a[0].size()));
for (size_t i = 0; i < c.size(); ++i)
{
// Sort a (copied) row at a time.
auto row_copy(a[i]);
std::sort(row_copy.begin(), row_copy.end());
// The columns have already been copied and sorted,
// now it just takes a couple of binary searches.
for (size_t j = 0; j < c[i].size(); ++j)
{
auto it_r = std::upper_bound(row_copy.cbegin(), row_copy.cend(), a[i][j]);
auto it_c = std::upper_bound(a_t[j].cbegin(), a_t[j].cend(), a[i][j]);
c[i][j] = std::distance(it_r, row_copy.cend())
+ std::distance(it_c, a_t[j].cend());
}
}
for (auto const & row : c)
{
for (auto i : row)
std::cout << std::setw(3) << i;
std::cout << '\n';
}
}

Related

How to insert many values in arrays efficiently (with C++17 or earlier)?

Given an array holding m groups of 3 values each, so that its size is n = m * 3. After every group of three values a new value should be inserted, so that the size afterwards is n = m * 4.
My naive approach would be to create a larger array with the new size and then iterate over the old array, copying the values into the new one and, after every group of three, advancing one extra step in the new array to fill in the new value.
The example code below might help understanding what I try to achieve.
I guess this might not be efficient. Unfortunately this array interleaving is carried out often. Therefore, the code should be efficient to avoid long computing times.
What are some other strategies, and are there tools in the current standard library to use? The example uses an array, but I am open to other containers.
Edit: Example code
#include <omp.h>
#include <cstdlib>
int main(int argc, char* argv[])
{
const auto arrSize = 52428800; // entries for the initial array
auto arr = new double[arrSize * 3]; // 3 sequential elements are a group
#pragma omp parallel for
for( auto i = 0; i < arrSize*3; ++i)
{
arr[i] = 1.0*i/arrSize;
}
auto larr = new double[arrSize * 4]; // larger array with space for interleaving elements
#pragma omp parallel
#pragma omp for
for (auto i = 0; i < arrSize; ++i) // after each element group of three a new element should be inserted
{
auto k = 0;
auto j = i*3;
for(k=0; k < 3; ++k)
{
larr[i*4+k] = arr[j+k];
}
larr[i*4+k] = 0;
}
delete[] arr;
delete[] larr;
return EXIT_SUCCESS;
}
Do your elements need to be sequential in memory? If not you can have a vector of vectors and do the interleaving when accessing the vectors.
Personally, I would use a custom container class that wraps a vector< vector<foo>>:
template <class T>
class interleaved_vector
{
std::vector< std::vector <T> > data;
public:
void interleave(const std::vector <T> & v)
{
data.push_back(v);
}
/* I am a little bit rusty on move-semantics, so I don't guarantee this is
correct */
void interleave(std::vector<T> &&v)
{
data.push_back(std::move(v));
}
// Access the data in an interleaved way
// This assumes all the interleaved vectors have the same size.
T operator[] (size_t i) const
{
size_t i1 = i % data.size(); // which of the interleaved vectors
size_t i2 = i / data.size(); // position inside that vector
return data[i1][i2];
}
T & operator[] (size_t i)
{
size_t i1 = i % data.size(); // which of the interleaved vectors
size_t i2 = i / data.size(); // position inside that vector
return data[i1][i2];
}
};
Now, you can declare a container of interleaved ints:
interleaved_vector<int> iv;
// populate the vector
std::vector<int> v(m);
iv.interleave(v);
iv.interleave(v);
iv.interleave(v);
// populate it using move-semantic
iv.interleave(std::move(v));
// read the n-th element after interleaving
int i = iv[n];
// set the n-th element after interleaving
iv[n] = 1234;
This is not a complete container class, just a generic idea. It lacks iterators, etc. How to complete it and better fit it to your use case is left as an exercise for the reader.
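If the elements do need to be contiguous in memory, the copy described in the question can be sketched with std::vector and std::copy_n. This is only an illustration under assumptions: `padGroups` is a hypothetical name, the source size is assumed to be a multiple of the group size, and no claim is made that it is faster than the hand-written loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `src` (consecutive groups of `group` values)
// into a new vector and writes `fill` after each group.
std::vector<double> padGroups(const std::vector<double>& src,
                              std::size_t group, double fill)
{
    assert(group > 0 && src.size() % group == 0);
    const std::size_t groups = src.size() / group;
    std::vector<double> dst(groups * (group + 1));
    for (std::size_t g = 0; g < groups; ++g) {
        // Copy one whole group, then write the extra value right after it.
        auto out = std::copy_n(src.begin() + g * group, group,
                               dst.begin() + g * (group + 1));
        *out = fill;
    }
    return dst;
}
```

Each iteration touches only its own group, so the loop over `g` could carry the same OpenMP pragma as the loop in the question. Whether that pays off should be measured.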

Dynamically allocating memory

I am new to C++ and programming in general, so I apologize if this is a trivial question. I am trying to initialize 2 arrays of size [600][600] and type str, but my program keeps crashing. I think this is because these 2 arrays exceed the memory limits of the stack. Also, N is given by the user, so I am not quite sure if I can use new here because it is not a constant expression.
My code:
#include<iostream>
using namespace std;
struct str {
int x;
int y;
int z;
};
int main(){
int N;
cin>>N;
str Array1[N][N]; //N can be up to 200
str Array2[N][N];
};
How could I allocate them on the heap? I know that for a 1-D array I can use a vector, but I don't know if this can somehow be applied to a 2-D array.
How 2-or-more-dimensional arrays work in C++
A 1D array is simple to implement and dereference. Assuming the array name is arr, it only requires one dereference to get access to an element.
Arrays with 2 or more dimensions, whether dynamic or stack-based, require more steps to create and access. To draw an analogy between a matrix and this, if arr is a 2D array and you want access to a specific element, let's say arr[row][col], there are actually 2 dereferences in this step. The first one, arr[row], gives you access to the row-th row of col elements. The second and final one, arr[row][col] reaches the exact element that you need.
Because arr[row][col] requires 2 dereferences for one to gain access, arr is no longer a pointer, but a pointer to pointer. With regards to the above, the first dereference gives you a pointer to a specific row (a 1D array), while the second dereference gives the actual element.
Thus, dynamic 2D arrays require you to have a pointer to pointer.
To allocate a dynamic 2D array with size given at runtime
First, you need to create an array of pointers to your data type of choice. Using std::string as the element type (the same pattern works for your str struct), one way of doing it is:
int N;
std::cin >> N;
std::string **matrix = new std::string*[N];
You have allocated an array of row pointers. The final step is to loop through all the elements and allocate the columns themselves:
for (int index = 0; index < N; ++index) {
matrix[index] = new std::string[N];
}
Now you can dereference it just like you would a normal 2D grid:
// assuming you have stored data in the grid
for (int row = 0; row < N; ++row) {
for (int col = 0; col < N; ++col) {
std::cout << matrix[row][col] << std::endl;
}
}
One thing to note: dynamic arrays are more computationally-expensive than their regular, stack-based counterparts. If possible, opt to use STL containers instead, like std::vector.
Edit: To free the matrix, you go "backwards":
// free each row's array first
for (int row = 0; row < N; ++row) {
delete [] matrix[row];
}
// free the list of rows
delete [] matrix;
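When allocating a 2D array in C++ with a single call to the new operator, you declare a pointer-to-array, type (*name)[N], and then allocate with new type[N][N]. Only the first dimension may be a runtime value, so N must be a compile-time constant here.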
For example, you can declare and allocate for your Array1 as follows:
#define N 200
struct str {
int x, y, z;
};
int main (void) {
str (*Array1)[N] = new str[N][N]; /* allocate */
/* use Array1 as 2D array */
delete [] Array1; /* free memory */
}
However, ideally, you would want to let the C++ containers library type vector handle the memory management for you. For instance, you can write:
#include<vector>
..
std::vector <std::vector <str>> Array1;
Then, to fill Array1, fill a temporary std::vector<str> tmp; for each row (1D array) of str and then call Array1.push_back(tmp); to add the filled tmp vector to your Array1. Your access can still use 2D indexing (e.g. Array1[a][b].x, Array1[a][b].y, ...), but you benefit from the automatic memory management provided by the container. This is much more robust and less error-prone than handling the memory yourself.
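When N is already known at the point of creation, the grid can also be sized in one step instead of being built with push_back. A minimal sketch, assuming the `str` struct from the question; `makeGrid` and `setCell` are hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct str {
    int x;
    int y;
    int z;
};

// Hypothetical helper: builds an n-by-n grid whose storage lives on the heap.
// Each str is value-initialized, so x, y and z start at 0.
std::vector<std::vector<str>> makeGrid(std::size_t n)
{
    return std::vector<std::vector<str>>(n, std::vector<str>(n));
}

// Hypothetical helper: checked write, showing that 2D indexing still works.
void setCell(std::vector<std::vector<str>>& grid,
             std::size_t row, std::size_t col, str value)
{
    assert(row < grid.size() && col < grid[row].size());
    grid[row][col] = value;
}
```

With this, `auto Array1 = makeGrid(N);` after reading N from the user gives the same `Array1[a][b].x` access, and the memory is released automatically when `Array1` goes out of scope.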
Normally, you can allocate memory on the heap by using the 'new' operator.
Hope this can help you:
// Example program
#include <iostream>
struct str {
int x;
int y;
int z;
};
int main()
{
int N;
std::cin>>N;
str **Array1 = new str*[N]; //N can be up to 200
for (int i = 0; i < N; ++i) {
Array1[i] = new str[N];
}
// set value
for (int row = 0; row < N; ++row) {
for (int col = 0; col < N; ++col) {
Array1[row][col].x=10;
Array1[row][col].y=10;
Array1[row][col].z=10;
}
}
// get value
for (int row = 0; row < N; ++row) {
for (int col = 0; col < N; ++col) {
std::cout << Array1[row][col].x << std::endl;
std::cout << Array1[row][col].y << std::endl;
std::cout << Array1[row][col].z << std::endl;
}
}
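// release the memory: each row first, then the array of row pointers
for (int i = 0; i < N; ++i) {
delete[] Array1[i];
}
delete[] Array1;
}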

how to refer to a column of a 2D array (matrix) in C++

I have declared a 2D array in the following way (please note that I'm a complete beginner!)
double **A=new double*[10];
for(int i=0;i<10;i++)
A[i]=new double[5];
So I guess this already defines a matrix of size 10 by 5.
I know that to refer to its row I can use
A[i]
But the question is, how to refer to a column of A? something like A[][i]?
You cannot access a column directly.
You can either access a row by A[i] (which is an array itself) or an element A[i][j] (which is a single double in your case).
If you want to get a column, you have to iterate through the array:
for(unsigned int i = 0; i < 10; i++)
{
double value = A[i][2]; // do something with value
}
This accesses the third column.
So it is useful to think about whether you want to create a 10x5 or a 5x10 matrix. If you often need to work with just a row or a column, it may be a good idea to choose the array layout accordingly (here, by switching columns and rows).
EDIT:
Here is some simplified explanation:
Imagine the following code
int** A = new int*[2];
for(int i=0;i<2;i++)
A[i]=new int[3];
// more init code
Then the array in memory may look like this (the original figure is not preserved): A points to a block of two row pointers, and each of them points to its own separately allocated block of three ints.
So it is simple to see that a row (the "blue row" in the figure) can be accessed directly, as you have its start address in A[0].
But if you want the third element of every subarray, you have to iterate through A and add 2 to every start address, especially as there is no guaranteed fixed distance in memory between the subarrays if you use heap memory via "new".
But you can often speed up computations by choosing the layout of your arrays well. One could, for example, store the second matrix transposed when implementing matrix multiplication.
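The matrix multiplication remark can be sketched as follows. This is an illustration, not a tuned implementation; `Matrix`, `transpose` and `multiply` are hypothetical names, and all matrices are assumed to be non-empty and rectangular.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: returns the transpose of a rectangular matrix.
Matrix transpose(const Matrix& m)
{
    assert(!m.empty());
    Matrix t(m[0].size(), std::vector<double>(m.size()));
    for (std::size_t r = 0; r < m.size(); ++r)
        for (std::size_t c = 0; c < m[r].size(); ++c)
            t[c][r] = m[r][c];
    return t;
}

// Computes a * b, reading b through its transpose so that the innermost
// loop walks along two rows instead of one row and one column.
Matrix multiply(const Matrix& a, const Matrix& b)
{
    assert(!a.empty() && a[0].size() == b.size());
    const Matrix bt = transpose(b);
    Matrix result(a.size(), std::vector<double>(bt.size(), 0.0));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < bt.size(); ++j)
            for (std::size_t k = 0; k < bt[j].size(); ++k)
                result[i][j] += a[i][k] * bt[j][k];
    return result;
}
```

The transpose costs one extra pass over `b`, which is paid once and then reused for every row of `a`.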
Another way, using pointers.
// Matrix dimensions
int n_rows = 10;
int n_cols = 5;
// create rows as pointers to columns
double **A = new double*[n_rows];
// create columns
for (int i = 0; i < n_rows; i++)
{
A[i] = new double[n_cols];
}
// Fill our matrix
int count=0;
for (int i = 0; i < n_rows; i++)
{
for (int j = 0; j < n_cols; j++)
{
A[i][j]=count;
++count;
}
}
// Create pointers columns
double ***A_t = new double**[n_cols];
// create matrix pointers rows
for (int i = 0; i < n_cols; i++)
{
A_t[i] = new double*[n_rows];
}
// And fill it with pointers to transposed main matrix elements
for (int i = 0; i < n_rows; i++)
{
for (int j = 0; j < n_cols; j++)
{
A_t[j][i]=&A[i][j];
}
}
// output first row/column for each case
for (int i = 0; i < n_cols; i++)
{
cout << *(A[0]+i) << endl;
}
cout << "-------------"<< endl;
for (int i = 0; i < n_rows; i++)
{
cout << **(A_t[0]+i) << endl;
}
// Work with matrices here
// Don't forget to clean everything.
A 2D array looks something like the image linked below.
If your 2D array is a[i][j], then i indexes the rows and j indexes the columns.
If you want to access the columns of the first row, you can do something like this:
a[0][1] a[0][2] a[0][3]
See the link for a clearer picture.
http://i.stack.imgur.com/21Bqr.png
Rows and columns are sort of abstracted away, you can consider the spatial orientation to go in either direction if you want (10 columns, or 5 columns if you see what I mean). But obviously you need to keep your use consistent.
It probably makes sense to keep the 'outer' array the column, so that A[x][y] makes sense in a cartesian coordinate type sense. From your example, you want to be indexing like A[i][] (i.e. your i index is the column, or X coordinate).
You cannot refer to a column in this way. This is because you don't really have a matrix; you specified an array of arrays. The arrays are your rows, but the columns are not directly stored. If you want to get a whole column, you have to run through all rows, read the value stored in that column, and store it in a different array.
auto column = new double[10];
for(int i = 0; i < 10; i++){
column[i] = A[i][2]; // index 2 is the third column
}
Elements of any row can be referred by arr[row_num][i].
Similarly elements of any column can be referred by arr[i][col_num].
Note that indexes are zero-based in C/C++. So if your column/row size is x, i can vary from 0 to x-1.
As you are a beginner, I would also like to tell you a bit more about arrays in C/C++. Firstly, I would suggest you read about pointers if you are not familiar with them.
When you declare an array, say, int arr[10], arr[0] means the first element. Also, arr + 0 (or simply arr) means the address of the first element. Similarly, arr[i] means the ith element, and arr + i means the address of the ith element. To read the value at an address in C/C++, you can use the value-at operator (*), e.g. *(arr + i) is equivalent to arr[i], i.e. the ith element. Also, the address-of operator (&) gives the address of an element: &arr[i] is equivalent to (arr + i).
If arr is a 2-D array, arr[i][j] means the jth element of the ith row, and arr[i] means the address of the first element of the ith row. C/C++ arrays are row-major, which means the first row is stored first, then the second, and so on. We always have to specify the row size when declaring a 2-D array.
Note: In pointer arithmetic, arr+i and arr+i+1 do not advance the address by 1 byte, but by the size of the element being pointed to.
So, to refer to a row, we can do the following:
//note that arr[row_num] is an address
int * new_1d_arr = arr[row_num];
for(int i = 0; i < row_size; i++)
cout << new_1d_arr[i] << endl;
Similarly, we can also refer to a column, but it is a bit more complex: we have to increment the index not by 1 but by row_size, because arrays in C/C++ are row-major and we have to skip over row_size elements to get to the next element in the same column.
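The stride-based column access can be sketched like this. It assumes the elements sit in one contiguous row-major block, such as a flattened buffer; it does not apply to an array of separately allocated rows like the `double **A` in the question. `copyColumn` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: `base` points at the first element of a contiguous
// row-major block with `rows` rows of `row_size` elements each.
std::vector<int> copyColumn(const int* base, std::size_t rows,
                            std::size_t row_size, std::size_t col_num)
{
    assert(base != nullptr && col_num < row_size);
    std::vector<int> column(rows);
    for (std::size_t i = 0; i < rows; ++i)
        column[i] = base[i * row_size + col_num];  // step by one full row
    return column;
}
```

For a block holding 1 2 3 / 4 5 6 with row_size 3, column 1 is read from offsets 1 and 4.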

How to create a two dimensional array of given size in C++

I need to create a square matrix of a given size. I know how to create a dynamic one-dimensional array of a given size. Doesn't the same work for two-dimensional arrays, like the line below?
cin>>size;
int* a[][]=new int[size][size]
int* a[][]=new int[size][size]
No, this doesn't work.
main.cpp:4: error: only the first dimension of an allocated array may have dynamic size
new int[size][size];
^~~~
If the size of the rows were fixed then you could do:
// allocate an array with `size` rows and 10 columns
int (*array)[10] = new int[size][10];
In C++ you can't have raw arrays with two dimensions where both dimensions are dynamic. This is because raw array indexing works in terms of pointers; for example, in order to access the second row a pointer to the first needs to be incremented by the size of the row. But when the size of a row is dynamic the array doesn't know that size and so C++ doesn't know how to figure out how to do the pointer increment.
If you want an array with multiple dynamic dimensions, then you need to either structure the array allocations such that C++'s default array indexing logic can handle it (such as the top answers to this duplicate question), or you need to implement the logic for figuring out the appropriate pointer increments yourself.
For an array where each row has the same size I would recommend against using multiple allocations such as those answers suggest, or using a vector of vectors. Using a vector of vectors addresses the difficulty and dangerousness of doing the allocations by hand, but it still uses more memory than necessary and doesn't allow faster memory access patterns.
A different approach, flattening the multi-dimensional array, can make for code as easy to read and write as any other approach, doesn't use extra memory, and can perform much, much better.
A flattened array means you use just a single-dimensional array that has the same number of elements as your desired 2D array, and you perform arithmetic for converting between the multi-dimensional indices and the corresponding single-dimensional index. With new it looks like:
int *arr = new int[row_count * column_count];
Row i, column j in the 2D array corresponds to arr[column_count*i + j]. arr[n] corresponds to the element at row n/column_count and column n%column_count. For example, in an array with 10 columns, row 0, column 0 corresponds to arr[0]; row 0, column 1 corresponds to arr[1]; row 1, column 0 corresponds to arr[10]; row 1, column 1 corresponds to arr[11].
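The two conversions can be written down as small functions. This is a sketch; `flatIndex`, `positionOf` and `Position` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Position {
    std::size_t row;
    std::size_t column;
};

// 2D position -> index into the flattened array.
std::size_t flatIndex(std::size_t row, std::size_t column,
                      std::size_t column_count)
{
    assert(column < column_count);
    return column_count * row + column;
}

// Index into the flattened array -> 2D position.
Position positionOf(std::size_t n, std::size_t column_count)
{
    assert(column_count > 0);
    return Position{n / column_count, n % column_count};
}
```

With 10 columns, `flatIndex(1, 1, 10)` gives 11 and `positionOf(11, 10)` gives row 1, column 1, matching the examples above.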
You should avoid doing manual memory management using raw new and delete, such as in the case of int *arr = new int[size];. Instead resource management should be wrapped up inside a RAII class. One example of a RAII class for managing dynamically allocated memory is std::vector.
std::vector<int> arr(row_count * column_count);
arr[column_count*i + j]
You can further wrap the logic for computing indices up in another class:
#include <vector>
class Array2d {
std::vector<int> arr;
int columns;
public:
Array2d(int rows, int columns)
: arr(rows * columns)
, columns(columns)
{}
struct Array2dindex { int row; int column; };
int &operator[] (Array2dindex i) {
return arr[columns*i.row + i.column];
}
};
#include <iostream>
int main() {
int size;
std::cin >> size;
Array2d arr(size, size);
for (int i = 0; i < size; ++i) {
for (int j = 0; j < size; ++j) {
arr[{i, j}] = 100;
}
}
for (int i = 0; i < size; ++i) {
for (int j = 0; j < size; ++j) {
std::cout << arr[{i, j}] << ' ';
}
std::cout << '\n';
}
}
If you're using C++11 you can also use std::array.
const int iRows = 3, iCols = 3; // number of rows and columns
std::array<std::array<int, iCols>, iRows> matrix;
// fill with 1,2,3 4,5,6 7,8,9
for(int i=0;i<iRows;++i)
for(int j=0;j<iCols;++j)
matrix[i][j] = i * iCols + j + 1;
This class also allows for bounds checking by using the function std::array::at, which (just like operator[]) returns a const reference if the array object is const-qualified, or a reference if it is not. Please note that std::array is not a variable-sized array type like std::vector.
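A short sketch of the bounds checking mentioned above: at() throws std::out_of_range for an invalid index, whereas operator[] performs no check. `Matrix3`, `makeSequential` and `tryGet` are hypothetical names; the dimensions are the 3 by 3 from the snippet.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

constexpr std::size_t iRows = 3, iCols = 3;
using Matrix3 = std::array<std::array<int, iCols>, iRows>;

// Fills the matrix with 1,2,3 4,5,6 7,8,9.
Matrix3 makeSequential()
{
    Matrix3 m{};
    for (std::size_t i = 0; i < iRows; ++i)
        for (std::size_t j = 0; j < iCols; ++j)
            m[i][j] = static_cast<int>(i * iCols + j + 1);
    assert(m.at(0).at(0) == 1);  // sanity check on the first element
    return m;
}

// Hypothetical helper: stores the element in `out` and returns true when
// (row, col) is valid; returns false when at() reports an invalid index.
bool tryGet(const Matrix3& m, std::size_t row, std::size_t col, int& out)
{
    try {
        out = m.at(row).at(col);  // throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

In real code the exception would usually be allowed to propagate; catching it here only makes the behaviour easy to observe.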
You can use std::vector:
std::vector<std::vector<int*>> a(size, std::vector<int*>(size));
This will create a dynamically allocated 2D array of int* with width and height equal to size.
Or the same with new:
int*** a = new int**[size];
for (size_t i = 0; i < size; ++i)
a[i] = new int*[size];
...
for (size_t i = 0; i < size; ++i)
delete[] a[i];
delete[] a;
Note that there's no new[][] operator in C++, you just have to call new[] twice.
However, if you want to do it with new and delete instead of std::vector, you should use smart pointers instead of raw pointers, for example:
std::unique_ptr<std::unique_ptr<int*[]>[]> a(new std::unique_ptr<int*[]>[size]);
for (size_t i = 0; i < size; ++i)
a[i].reset(new int*[size]);
...
// No need to call `delete`, std::unique_ptr does it automatically.

C/C++ How a 3 dimensional array is stored in memory and what is the fastest way to traverse it

I am trying to understand how a 3-dimensional array is stored in memory, and how that differs from the way a std::vector is stored.
This is how I understand they are stored (std::vector in the same way, with the difference that it makes full use of memory blocks):
a[0][0][0] a[0][0][1] a[0][0][2]... a[0][1][0] a[0][1][1] ... a[1][0][0] a[1][0][1]...
My goal is to find the most efficient way to traverse an array.
For example, I have the array:
v[1000][500][3];
So which is the more efficient way to traverse it?
for(i = 0; i < 1000; i++)
{
for(j = 0; j < 500; j++)
{
for(k = 0; k < 3; ++k)
{
//some operation
}
}
}
Or maybe it would be more efficient to declare the array as
v[3][500][1000]
and to traverse it as
for(i = 0; i < 3; i++) {
for(j = 0; j < 500; j++)
{
for(k = 0; k < 1000; ++k)
{
//some operation
}
} }
Is there any command-line tool to visualize how arrays are stored?
You're right in your representation of arrays in memory: the values are contiguous. So an int v[2][2][2] initialized to 0 would look like:
[[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
As far as performance goes, you want to access data that is as close together as possible to avoid data cache misses. So iterating with the outermost dimension in the outermost loop and the innermost dimension in the innermost loop is a good thing, since consecutive innermost elements are located next to each other.
Something that might happen with your first example, though, is that the compiler might optimize the inner loop (if the right conditions are met) and unroll it, so you would save some time there by skipping branching.
Since both your examples already iterate in the right order, I would say profile them and see which is faster.
std::vector also stores its elements contiguously in memory, but since it has one dimension, locality applies by default (provided you aren't accessing it randomly). The good side of a vector is that it can grow, whereas an array can't (automatically, anyway).
When the memory is contiguous (e.g., a compile-time array a[D1][D2][D3]), the most efficient way to traverse a multidimensional array is to use a pointer. The element a[i][j][k] is actually *(&a[0][0][0] + (i*D2*D3 + j*D3 + k)). Thus, initialize a pointer p to the beginning address, then repeatedly evaluate *(p++):
#include <iostream>
int main() {
int a[2][3]={{1,2,3},{4,5,6}};
int *p = &a[0][0];
for( int i=0; i<6; ++i ){
std::cout<<*(p++)<<std::endl;
}
return 0;
}
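For an array declared as a[D1][D2][D3], the row-major offset of a[i][j][k] from the first element can be written out as a small function. This is a sketch; `offset3d` is a hypothetical name, and the first dimension D1 is not needed for the arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Offset (in elements) of a[i][j][k] from a[0][0][0] for an array declared
// as T a[D1][D2][D3]: i*D2*D3 + j*D3 + k, written here in nested form.
std::size_t offset3d(std::size_t i, std::size_t j, std::size_t k,
                     std::size_t d2, std::size_t d3)
{
    assert(j < d2 && k < d3);
    return (i * d2 + j) * d3 + k;
}
```

For the v[1000][500][3] array from the question, moving from v[0][0][0] to v[1][0][0] skips 500 * 3 = 1500 elements, while moving to v[0][0][1] skips just one. That is why the last index belongs in the innermost loop.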
To make it visible:
#include <iostream>
int main()
{
int a[][3] = { { 0, 1, 2 }, { 3, 4, 5 } };
int* p = reinterpret_cast<int*>(a);
for(unsigned i = 0; i < 6; ++i) {
std::cout << *(p + i);
}
std::cout << std::endl;
return 0;
}
This shows row-major order; see: http://en.wikipedia.org/wiki/Row-major_order
Given this, you should iterate row by row to make good use of the cache. With N dimensions you get a similar layout, where each element of the outermost dimension is a block of data with N-1 dimensions.