C++ Removing empty elements from array

I only want to add a[i] into the result array if the condition is met, but this method causes empty elements in the array as it adds to result[i]. Is there a better way to do this?
for(int i=0; i<N; i++)
{
if(a[i]>=lower && a[i]<=upper)
{
count++;
result[i]=a[i];
}
}

You can let result start out empty and only push_back a[i] when the condition is met:
std::vector<...> result;
for (int i = 0; i < N; i++)
{
if (a[i] >= lower && a[i] <= upper)
{
result.push_back(a[i]);
}
}
You can leave out count, as result.size() will tell you how many elements satisfied the condition.
For a more modern solution, as Some programmer dude suggested, you can use std::copy_if in combination with std::back_inserter to achieve the same thing:
std::vector<...> result;
std::copy_if(a.begin(), a.end(), std::back_inserter(result),
[&](auto n) {
return n >= lower && n <= upper;
});

Built-in arrays in C++ are dumb.
Once passed to a function they decay to a pointer to the first element and no longer know their length.
If you write arr[i] you have to be sure that i is not out of bounds. If it is, the behavior is undefined, because you don't know which part of memory you have written over. You could just as well overwrite a different variable or the beginning of another array.
So when you add results to an array, the array must already have been created with enough space.
The boilerplate of deleting and re-creating dumb arrays so that the array can grow is handled very efficiently by the std::vector container, which remembers the number of elements stored, the number of elements that could be stored, and the array itself. Every time you add an element when the reserved space is full, it allocates a larger array (typically 1.5 or 2 times the old size, depending on the implementation) and moves the data over. A single such push_back is O(n), but because it happens rarely, appending is amortized O(1).
Then the answer from Stack Danny applies.
Also consider emplace_back instead of push_back: it constructs the element in place from the constructor arguments you pass, which can avoid a temporary and a copy. When you pass an already constructed object (or a plain int, as here), it behaves like push_back.
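The growth behavior described above can be observed by watching capacity() while appending. This is only a minimal sketch: the helper name count_reallocations is made up for illustration, and because the growth factor is implementation-defined, the count without reserve() will differ between standard libraries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how often push_back had to grow the buffer
// while appending n ints. If reserve_first is true, space is reserved up front.
std::size_t count_reallocations(std::size_t n, bool reserve_first)
{
    std::vector<int> v;
    if (reserve_first)
        v.reserve(n);                       // capacity() >= n afterwards

    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) { // capacity changed: the buffer was reallocated
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set to true the function returns 0, because the standard guarantees no reallocation until size() exceeds the reserved capacity. Without it, the result is a small number that depends on the library's growth policy.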

count=0;
for(int i=0; i<N; i++)
{
if(a[i]>=lower && a[i]<=upper)
{
result[count] = a[i]; // write to the next free slot first...
count++; // ...then advance, so result[0] is not skipped
}
}
Try this.
Your code was copying each matching a[i] into result[i], i.e. to the same index, so every non-matching position left a gap.
For example, if a[0] and a[2] meet the required condition, but a[1] doesn't, then your code will do the following:
result[0] = a[0];
result[2] = a[2];
Notice how result[1] remains empty because a[1] didn't meet the required condition. To avoid empty positions in the result array, use another variable for copying instead of i.
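The idea of a separate write index can be sketched as a small function. The name copy_in_range and its signature are assumptions made for this example, not part of the original code; the caller is assumed to provide a result buffer with room for n elements.

```cpp
#include <cstddef>

// Copies every element of a[0..n) that lies in [lower, upper] to the front
// of result and returns how many were copied.
// Assumption: result points to a buffer with room for at least n ints.
std::size_t copy_in_range(const int* a, std::size_t n, int lower, int upper, int* result)
{
    std::size_t count = 0;                  // next free slot in result
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] >= lower && a[i] <= upper) {
            result[count] = a[i];           // write first...
            ++count;                        // ...then advance the write position
        }
    }
    return count;
}
```

The returned count doubles as the number of valid entries in result, so no slot between 0 and count-1 is left empty.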

Related

Function to check if an array is a permutation

I have to write a function which accepts an int array parameter and checks to see if it is a permutation.
I tried this so far:
bool permutationChecker(int arr[], int n){
for (int i = 0; i < n; i++){
//Check if the array is the size of n
if (i == n){
return true;
}
if (i == arr[n]){
return true;
}
}
return false;
}
but the output says some arrays are permutations even though they are not.

When you write i == arr[n], that doesn't check whether i is in the array; that checks whether the element at position n is i. Now, that's even worse here, as the array size is n, so there's no valid element at position n: it's UB, array is overindexed.
If you'd like to check whether i is in the array, you need to scan each element of the array. You can do this using std::find(). Either that, or you might sort (a copy of) the array, then check if i is at position i:
bool isPermutation(int arr[], int n){
int* arr2 = new int[n]; // consider using std::array<> / std::vector<> if allowed
std::copy(arr, arr + n, arr2);
std::sort(arr2, arr2 + n);
for (int i = 0; i < n; i++){
if (i != arr2[i]){
delete[] arr2;
return false;
}
}
delete[] arr2;
return true;
}
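The std::find() variant mentioned above could look roughly like this. It is a sketch under the same assumption as the sorted version (a permutation of 0..n-1); the function name isPermutationByFind is made up for the example.

```cpp
#include <algorithm>

// Returns true if arr[0..n) contains each of 0, 1, ..., n-1 exactly once.
// Quadratic time, but the input is left untouched and nothing is allocated.
bool isPermutationByFind(const int arr[], int n)
{
    for (int i = 0; i < n; ++i) {
        // There are n slots and n required values, so if every value
        // is present, none of them can appear twice.
        if (std::find(arr, arr + n, i) == arr + n)
            return false;
    }
    return true;
}
```

Compared with the sorting approach this trades speed (O(n^2) instead of O(n log n)) for not needing a copy of the array.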

One approach to checking that the input contains one of each value is to create an array of flags (acting like a set), and for each value in your input you set the flag to true for the corresponding index. If that flag is already set, then it's not unique. And if the value is out of range then you instantly know it's not a permutation.
Now, you would normally expect to allocate additional data for this temporary set. But, since your function accepts the input as non-constant data, we can use a trick where you use the same array but store extra information by making values negative.
It will even work for all positive int values, since as of C++20 the standard now guarantees 2's complement representation. That means for every positive integer, a negative integer exists (but not the other way around).
bool isPermutation(int arr[], int n)
{
// Ensure everything is within the valid range.
for (int i = 0; i < n; i++)
{
if (arr[i] < 1 || arr[i] > n) return false;
}
// Check for uniqueness. For each value, use it to index back into the array and then
// negate the value stored there. If already negative, the value is not unique.
int count = 0;
while (count < n)
{
int index = std::abs(arr[count]) - 1;
if (arr[index] < 0)
{
break;
}
arr[index] = -arr[index];
count++;
}
// Undo any negations done by the step above
for (int i = 0; i < count; i++)
{
int index = std::abs(arr[i]) - 1;
arr[index] = std::abs(arr[index]);
}
return count == n;
}
Let me be clear that using tricky magic is usually not the kind of solution you should go for because it's inevitably harder to understand and maintain code like this. That should be evident simply by looking at the code above. But let's say, hypothetically, you want to avoid any additional memory allocation, your data type is signed, and you want to do the operation in linear time... Well, then this might be useful.
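For comparison, the plain flag-array approach described at the start of this answer (the one that does allocate a temporary set) might be sketched as follows. The name isPermutationWithFlags is an assumption for this example, and like the code above it treats a permutation as the values 1..n. It assumes n is not negative.

```cpp
#include <cstddef>
#include <vector>

// Returns true if arr[0..n) contains each of 1, 2, ..., n exactly once.
// Assumption: n >= 0. The input is not modified.
bool isPermutationWithFlags(const int arr[], int n)
{
    // seen[v - 1] becomes true once the value v has been encountered.
    std::vector<bool> seen(static_cast<std::size_t>(n), false);
    for (int i = 0; i < n; ++i) {
        const int value = arr[i];
        if (value < 1 || value > n) return false;   // out of range
        if (seen[value - 1]) return false;          // duplicate
        seen[value - 1] = true;
    }
    return true;
}
```

This version is also linear in time, works on const input, and is easier to read, at the cost of n extra bits of memory.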

A permutation p of id=[0,...,n-1] is a bijection of id onto itself. Therefore, no value in p may repeat and no value may be <0 or >=n. To check for permutations you somehow have to verify these properties. One option is to sort p and compare it for equality to id. Another is to count how many distinct values occur.
Your approach would almost work if you checked (i == arr[i]) instead of (i == arr[n]) but then you would need to sort the array beforehand, otherwise only id will pass your check. Furthermore the check (i == arr[n]) exhibits undefined behaviour because it accesses one element past the end of the array. Lastly the check (i == n) doesn't do anything because i goes from 0 to n-1 so it will never be == n.
With this information you can repair your code, but beware that this approach will destroy the original input.
If you are forced to play with arrays, perhaps your array has fixed size. For example: int arr[3] = {0,1,2};
If this were the case you could use the fact that the size is known at compile time and use an std::bitset. [If not use your approach or one of the others given here.]
template <std::size_t N>
bool isPermutation(int const (&arr)[N]) {
std::bitset<N> bits;
for (int a: arr) {
if (static_cast<std::size_t>(a) < N)
bits.set(a);
}
return bits.all();
}
You don't have to pass the size because C++ can infer it at compile time. This solution also does not allocate additional dynamic memory, but it will run into problems for large arrays (say > 1 million entries) because the std::bitset lives in automatic storage and therefore on the stack.

Segmentation fault in online compilers

The code below works fine in gdb and VS code but other online compilers keep throwing "segmentation fault". Can anyone please help me with this? Every question I try to solve I keep getting this error.
For example:
Given an array of integers. Find the Inversion Count in the array.
Inversion Count: For an array, inversion count indicates how far (or close) the array is from being sorted. If the array is already sorted then the inversion count is 0. If an array is sorted in the reverse order then the inversion count is the maximum.
Formally, two elements a[i] and a[j] form an inversion if a[i] > a[j] and i < j.
Code:
long long int inversionCount(long long arr[], long long N) {
vector<long long> v;
long long int count = 0;
for (int i = 0; i < N; i++) v[i]= arr[i];
auto min = min_element(arr, arr+ N);
auto max = max_element(arr, arr+ N);
swap(v[0], *min);
v.erase(max);
v.push_back(*max);
for (int i = 0; i < N; i++) {
if(v[i] > v[i+1]) {
swap(v[i],v[i+1]);
count++;
}
return count;
}
}

You have a number of problems here. For one, you try to use elements in v without ever allocating space for them (i.e., you're using subscripts to refer to elements of v even though its size is still zero). I'd usually use the constructor that takes two iterators to copy an existing collection (and pointers can be iterators too).
std::vector<long long> v { arr, arr+N};
Assuming you fix that, this:
v.erase(max);
... invalidates max and every other iterator or reference to any point between the element max previously pointed to, and the end of the collection. Which means that this:
v.push_back(*max);
...is attempting to dereference an invalid iterator, which produces undefined behavior.
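As a reference point while debugging, the definition from the question (a[i] > a[j] with i < j) can be counted directly. This is a brute-force sketch, not the approach the question's code attempts; the name inversionCountBruteForce is made up here, and its O(N^2) running time is likely too slow for large inputs on an online judge, where a merge-sort-based count is the usual choice.

```cpp
#include <cstddef>
#include <vector>

// Counts pairs (i, j) with i < j and arr[i] > arr[j], straight from the definition.
long long inversionCountBruteForce(const long long arr[], long long N)
{
    // Copy with the iterator-pair constructor, as suggested above,
    // so the vector really has N elements before it is indexed.
    const std::vector<long long> v(arr, arr + N);

    long long count = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = i + 1; j < v.size(); ++j)
            if (v[i] > v[j])
                ++count;
    return count;
}
```

Every index used here is below v.size(), so there is no out-of-bounds access and no iterator is kept across a modification of the vector.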

Sort Array By Parity the result is not robust

I am a new programmer and I am trying to sort a vector of integers by their parities, i.e. put even numbers in front of odds. The order among the odd or even numbers themselves doesn't matter. For example, given an input [3,1,2,4], the output can be [2,4,3,1] or [4,2,1,3], etc. Below is my C++ code. Sometimes I get lucky and the vector is sorted properly, sometimes it isn't. I exported the odd and even containers and they look correct, but when I try to combine them the result is messed up. Can someone please help me debug?
class Solution {
public:
vector<int> sortArrayByParity(vector<int>& A) {
unordered_multiset<int> even;
unordered_multiset<int> odd;
vector<int> result(A.size());
for(int C:A)
{
if(C%2 == 0)
even.insert(C);
else
odd.insert(C);
}
merge(even.begin(),even.end(),odd.begin(),odd.end(),result.begin());
return result;
}
};

If you just need even values before odds and not a complete sort I suggest you use std::partition. You give it two iterators and a predicate. The elements where the predicate returns true will appear before the others. It works in-place and should be very fast.
Something like this:
std::vector<int> sortArrayByParity(std::vector<int>& A)
{
std::partition(A.begin(), A.end(), [](int value) { return value % 2 == 0; });
return A;
}
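To check the result of such a partition without relying on a particular output order, std::is_partitioned (C++11) can be used with the same predicate. The helper names below (isEven, partitionByParity, evensComeFirst) are assumptions for this sketch.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical predicate, named here only for the example.
bool isEven(int value) { return value % 2 == 0; }

// Moves all even values in front of all odd values, in place.
// The relative order inside each group is not preserved.
void partitionByParity(std::vector<int>& values)
{
    std::partition(values.begin(), values.end(), isEven);
}

// True if no even value appears after an odd one.
bool evensComeFirst(const std::vector<int>& values)
{
    return std::is_partitioned(values.begin(), values.end(), isEven);
}
```

If the relative order within the even and odd groups did matter, std::stable_partition would be the variant to reach for.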

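That happens because std::merge assumes that both input ranges are sorted (it is the merge step of merge sort), and an unordered_multiset is not. Instead, just use the insert function of vector. Note that result must start out empty here (declare it as vector<int> result; rather than vector<int> result(A.size());), otherwise the inserted values end up after A.size() zeros:
result.insert(result.end(), even.begin(), even.end());
result.insert(result.end(), odd.begin(), odd.end());
return result;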

There is no need to create three separate vectors. As you have allocated enough space in the result vector, that vector can be used as the final vector also to store your sub vectors, storing the separated odd and even numbers.
The value of using a vector, which under the covers is an array, is to avoid inserts and moves. Arrays/Vectors are fast because they allow immediate access to memory as an offset from the beginning. Take advantage of this!
The code simply keeps an index to the next odd and even indices and then assigns the correct cell accordingly.
class Solution {
public:
// As this function does not access any members, it can be made static
static std::vector<int> sortArrayByParity(std::vector<int>& A) {
std::vector<int> result(A.size());
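std::size_t even_index = 0; // uint is not a standard type; std::size_t matches A.size()
std::size_t odd_index = A.size()-1; // wraps around for an empty A, but is never used then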
for(int element: A)
{
if(element%2 == 0)
result[even_index++] = element;
else
result[odd_index--] = element;
}
return result;
}
};

Taking advantage of the fact that you don't care about the order among the even or odd numbers themselves, you could use a very simple algorithm to sort the array in-place:
// Assume helper function is_even() and is_odd() are defined.
void sortArrayByParity(std::vector<int>& A)
{
int i = 0; // scanning from beginning
int j = A.size()-1; // scanning from end
do {
while (i < j && is_even(A[i])) ++i; // A[i] is an even at the front
while (i < j && is_odd(A[j])) --j; // A[j] is an odd at the back
if (i >= j) break;
// Now A[i] must be an odd number in front of an even number A[j]
std::swap(A[i], A[j]);
++i;
--j;
} while (true);
}
Note that the function above returns void, since the vector is sorted in-place. If you do want to return a sorted copy of input vector, you'd need to define a new vector inside the function, and copy the elements right before every ++i and --j above (and of course do not use std::swap but copy the elements cross-way instead; also, pass A as const std::vector<int>& A).
// Assume helper function is_even() and is_odd() are defined.
std::vector<int> sortArrayByParity(const std::vector<int>& A)
{
std::vector<int> B(A.size());
int i = 0; // scanning from beginning
int j = A.size()-1; // scanning from end
do {
while (i < j && is_even(A[i])) {
B[i] = A[i];
++i;
}
while (i < j && is_odd(A[j])) {
B[j] = A[j];
--j;
}
if (i >= j) {
if (i == j) B[i] = A[i]; // copy the one element neither scan has reached
break;
}
// Now A[i] must be an odd number in front of an even number A[j]
B[i] = A[j];
B[j] = A[i];
++i;
--j;
} while (true);
return B;
}
In both cases (in-place or out-of-place) above, the function has complexity O(N), N being number of elements in A, much better than the general O(N log N) for sorting N elements. This is because the problem doesn't actually sort much -- it only separates even from odd. There's therefore no need to invoke a full-fledged sorting algorithm.

call to condition on for loop (c++)

Here is a simple question I have been wondering about for a long time :
When I do a loop such as this one :
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
As the condition i < myVector.size() is checked each time, should I store the size of the array inside a variable before the loop to prevent the call to size() each iteration ? Or is the compiler smart enough to do it itself ?
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
And I would extend the question with a more complex condition such as i < myVector.front()/myVector.size()
Edit : I don't use myVector inside the loop, it is just here to give the ending condition. And what about the more complex condition ?

The answer depends mainly on the contents of your loop–it may modify the vector during processing, thus modifying its size.
However if the vector is just scanned you can safely store its size in advance:
for (int i = 0, mySize = myVector.size(); i < mySize ; ++i) {
// my loop
}
although in most classes the functions like 'get current size' are just inline getters:
class XXX
{
public:
int size() const { return mSize; }
....
private:
int mSize;
....
};
so the compiler can easily reduce the call to just reading the int variable, consequently prefetching the length gives no gain.

If you are not changing anything in the vector (adding/removing) during the for loop (which is the normal case), I would use a range-based for loop. Taking the element by const reference avoids copying each one:
for (const auto& object : myVector)
{
//here some code
}
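or, if you cannot use C++11, I would use iterators. Since auto in this sense is also C++11, the iterator type has to be spelled out (T stands for your element type):
for (std::vector<T>::iterator it = myVector.begin(); it != myVector.end(); ++it)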
{
//here some code
}

I'd say that
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
is a bit safer than
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
because the value of myVector.size() may change (as a result of, e.g., push_back(value) inside the loop), and with the cached size you might miss some of the elements.
If you are 100% sure that the value of myVector.size() is not going to change, then both are the same thing.
Yet, the first one is a bit more flexible than the second (other developer may be unaware that the loop iterates over fixed size and he might change the array size). Don't worry about the compiler, he's smarter than both of us combined.
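The difference between the two loop forms only shows up when the loop body changes the vector. A minimal sketch, with made-up function names, where the body appends one element on the first iteration:

```cpp
#include <cstddef>
#include <vector>

// Appends one extra element during the loop and returns how many elements
// were visited when size() is re-read on every iteration.
std::size_t visitWithLiveSize(std::vector<int> v)
{
    std::size_t visited = 0;
    bool appended = false;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (!appended) { v.push_back(0); appended = true; }
        ++visited;
    }
    return visited;
}

// Same loop, but the size is read once before the loop starts,
// so the appended element is never visited.
std::size_t visitWithCachedSize(std::vector<int> v)
{
    std::size_t visited = 0;
    bool appended = false;
    const std::size_t cachedSize = v.size();
    for (std::size_t i = 0; i < cachedSize; ++i) {
        if (!appended) { v.push_back(0); appended = true; }
        ++visited;
    }
    return visited;
}
```

For a three-element input the first function visits four elements and the second visits three. Whether the extra visit is wanted depends on the loop; if the body never touches the vector, both forms behave the same.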

The overhead is very small.
vector.size() does not recalculate anything expensive; it returns a stored size or, in common implementations, the difference of two stored pointers.
It is safer than caching the value, because the vector's size changes whenever an element is pushed onto or popped from the vector.
Compilers can optimize the repeated call away if, and only if, they can determine that the vector is not changed by ANYTHING while the for loop runs.
That is difficult to do if there are threads involved,
but if there isn't any threading going on, it's very easy to optimize.

Any smart compiler will probably optimize this out. However just to be sure I usually lay out my for loops like this:
for (int i = myvector.size() -1; i >= 0; --i)
{
}
A couple of things are different:
The iteration is done the other way around. Although this shouldn't be a problem in most cases. If it is I prefer David Haim's method.
The --i is used rather than a i--. In theory the --i is faster, although on most compilers it won't make a difference.
If you don't care about the index this:
for (int i = myvector.size(); i > 0; --i)
{
}
would also be an option, although in general I don't use it because it is a bit more confusing than the first and will not gain you any performance.
For a type like std::vector or std::list an iterator is the preferred method:
for (std::vector</*vectortype here*/>::iterator i = myVector.begin(); i != myVector.end(); ++i)
{
}

How to avoid reallocation using the STL (C++)

This question is derived from the topic:
vector reserve c++
I am using a datastructure of the type vector<vector<vector<double> > >. It is not possible to know the size of each of these vector (except the outer one) before items (doubles) are added. I can get an approximate size (upper bound) on the number of items in each "dimension".
A solution with the shared pointers might be the way to go, but I would like to try a solution where the vector<vector<vector<double> > > simply has .reserve()ed enough space (or in some other way has allocated enough memory).
Will A.reserve(500) (assuming 500 is the size or, alternatively, an upper bound on the size) be enough to hold "2D" vectors of large size, say [1000][10000]?
The reason for my question is mainly because I cannot see any way of reasonably estimating the size of the interior of A at the time of .reserve(500).
An example of my question:
vector<vector<vector<int> > > A;
A.reserve(500+1);
vector<vector<int> > temp2;
vector<int> temp1 (666,666);
for(int i=0;i<500;i++)
{
A.push_back(temp2);
for(int j=0; j< 10000;j++)
{
A.back().push_back(temp1);
}
}
Will this ensure that no reallocation is done for A?
If temp2.reserve(100000) and temp1.reserve(1000) were added at creation, will this ensure that no reallocation occurs at all?
In the above please disregard the fact that memory could be wasted due to conservative .reserve() calls.
Thank you all in advance!

Your example will cause a lot of copying and allocations.
vector<vector<vector<double>>> A;
A.reserve(500+1);
vector<vector<double>> temp2;
vector<double> temp1 (666,666);
for(int i=0;i<500;i++)
{
A.push_back(temp2);
for(int j=0; j< 10000;j++)
{
A.back().push_back(temp1);
}
}
Q: Will this ensure that no reallocation is done for A?
A: Yes.
Q: If temp2.reserve(100000) and temp1.reserve(1000) were added at creation, will this ensure that no reallocation occurs at all?
A: Here temp1 already knows its own length at creation time and will not be modified, so adding temp1.reserve(1000) will only force an unneeded reallocation.
I don't know what the vector classes copy in their copy constructor; using A.back().reserve(10000) should work for this example.
Update: Just tested with g++; the capacity of temp2 is not copied, so temp2.reserve(10000) will not work.
And please use the source formatting when you post code, it makes it more readable :-).

How can reserving 500 entries in A beforehand be enough for [1000][1000]?
You need to give A room for at least 1000 entries (your actual upper bound), and then, for every entry of A, reserve another 1000 or so (again the upper bound, but for the second dimension).
i.e.
A.resize(UPPERBOUND); // resize, not reserve: the entries must exist before A[i] is used
for(std::size_t i = 0; i < A.size(); ++i)
A[i].reserve(UPPERBOUND);
BTW, reserve reserves the number of elements, not the number of bytes.

The reserve function will work properly for your vector A, but will not work as you are expecting for temp1 and temp2.
The temp1 vector is initialized with a given size, so it will be set with the proper capacity and you don't need to use reserve with this as long as you plan to not increase its size.
Regarding temp2, the capacity attribute is not carried over in a copy. Considering whenever you use push_back function you are adding a copy to your vector, code like this
vector<vector<double>> temp2;
temp2.reserve(1000);
A.push_back(temp2); //A.back().capacity() == 0
only increases the allocated memory of a temporary that will be deallocated soon; it does not increase the capacity of the element stored in the vector as you expect. If you really want to use a vector of vectors as your solution, you will have to do something like this
vector<vector<double>> temp2;
A.push_back(temp2);
A.back().reserve(1000); //A.back().capacity() == 1000
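The same pattern, wrapped into small helpers so it can be checked in isolation. This is a sketch on a two-level vector for brevity, and the names appendReservedRow and capacityOfCopy are made up. Note that the standard does not say what capacity a copy gets; the g++ observation above is an implementation detail, which is why the second helper only reports the value.

```cpp
#include <cstddef>
#include <vector>

// Appends an empty inner vector to A and reserves room for `expected`
// doubles in the stored element (not in a temporary that gets copied).
void appendReservedRow(std::vector<std::vector<double> >& A, std::size_t expected)
{
    A.push_back(std::vector<double>());
    A.back().reserve(expected);     // capacity() of the stored row is now >= expected
}

// Reports the capacity of a copy made from an empty source that had
// `expected` elements reserved. Implementations are free to drop the
// spare capacity when copying.
std::size_t capacityOfCopy(std::size_t expected)
{
    std::vector<double> source;
    source.reserve(expected);
    std::vector<double> copy(source);
    return copy.capacity();
}
```

Reserving after the push_back is the only one of the two orders that is guaranteed to leave the stored row with the requested capacity.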

I had the same issue one day. A clean way to do this (I think) is to write your own Allocator and use it for the inner vectors (last template parameter of std::vector<>). The idea is to write an allocator that doesn't actually allocate memory but simply returns the right address inside the memory of your outer vector. You can easily compute this address if you know the size of each previous vector.

In order to avoid copy and reallocation for a datastructure such as vector<vector<vector<double> > >, i would suggest the following:
vector<vector<vector<double> > > myVector(FIXED_SIZE);
In order to 'assign' a value to it, don't define your inner vectors until you actually know their required dimensions, then use swap() instead of assignment:
vector<vector<double> > innerVector( KNOWN_DIMENSION );
myVector[i].swap( innerVector );
Note that push_back() will do a copy operation and might cause reallocation, while swap() won't (assuming same allocator types are used for both vectors).

It seems to me that you need a real matrix class instead of nesting vectors. Have a look at boost, which has some strong sparse matrix classes.
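If the data is dense and upper bounds on the dimensions are known, the "matrix class instead of nested vectors" idea can also be sketched without any library. The class below is a hypothetical illustration (it is not Boost's API and it is not sparse): it keeps all elements in one contiguous std::vector and computes the index by hand, shown here for two dimensions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal dense matrix: one contiguous allocation, row-major.
class Matrix2D
{
public:
    Matrix2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // No bounds checking: the caller must keep r < rows() and c < cols().
    double& at(std::size_t r, std::size_t c)       { return data_[r * cols_ + c]; }
    double  at(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;   // element (r, c) lives at index r * cols_ + c
};
```

All memory is allocated once in the constructor, so filling the matrix never triggers a reallocation. The price is that the dimensions are fixed up front and unused cells still occupy memory.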

Ok, now I have done some small scale testing on my own. I used a "2DArray" obtained from http://www.tek-tips.com/faqs.cfm?fid=5575 to represent a structure that allocates its memory statically. For the dynamic allocation I used vectors almost as indicated in my original post.
I tested the following code (hr_time is a timing routine found on the web which, due to the anti-spam rules, I unfortunately cannot link to, but credits to David Bolton for providing it)
#include <vector>
#include "hr_time.h"
#include "2dArray.h"
#include <iostream>
using namespace std;
int main()
{
vector<int> temp;
vector<vector<int> > temp2;
CStopWatch mytimer;
mytimer.startTimer();
for(int i=0; i<1000; i++)
{
temp2.push_back(temp);
for(int j=0; j< 2000; j++)
{
temp2.back().push_back(j);
}
}
mytimer.stopTimer();
cout << "With vectors without reserved: " << mytimer.getElapsedTime() << endl;
vector<int> temp3;
vector<vector<int> > temp4;
temp3.reserve(1001);
mytimer.startTimer();
for(int i=0; i<1000; i++)
{
temp4.push_back(temp3);
for(int j=0; j< 2000; j++)
{
temp4.back().push_back(j);
}
}
mytimer.stopTimer();
cout << "With vectors with reserved: " << mytimer.getElapsedTime() << endl;
int** MyArray = Allocate2DArray<int>(1000,2000);
mytimer.startTimer();
for(int i=0; i<1000; i++)
{
for(int j=0; j< 2000; j++)
{
MyArray[i][j]=j;
}
}
mytimer.stopTimer();
cout << "With 2DArray: " << mytimer.getElapsedTime() << endl;
//Test
for(int i=0; i<1000; i++)
{
for(int j=0; j< 200; j++)
{
//cout << "My Array stores :" << MyArray[i][j] << endl;
}
}
return 0;
}
It turns out that there is approx a factor 10 for these sizes. I should thus reconsider if dynamic allocation is appropriate for my application since speed is of utmost importance!

Why not subclass the inner containers and reserve() in constructors ?

If the matrix does get really large and sparse I'd try a sparse matrix lib too. Otherwise, before messing with allocators, I'd try replacing vector with deque. A deque doesn't move its existing elements when it grows at either end and offers almost as fast random access as a vector.

This was more or less answered here. So your code would look something like this:
vector<vector<vector<double> > > foo(maxdim1,
vector<vector<double> >(maxdim2,
vector<double>(maxdim3)));
