Modifying set while iterating gives me segfault - c++

I am currently working on a function working with a vector of sets of int.
I want my function merge( ) to merge all sets that share an int in common, so for example I want this to happen :
Input:
[0] - 0, 1, 2
[1] - 1, 3
[2] - 0, 3
[3] - 4, 5
[4] - 6, 7, 8
[5] - 8, 9
Then it will output this vector:
[0] - 0, 1, 2, 3
[1] - 4, 5
[2] - 6, 7, 8, 9
I have already written this function; the code is shown below.
I have commented almost every line so that it is not too difficult to understand!
// Merges all sets that shares at least one int
//
// PARAMETERS...
// vectorE : vector of sets of int
void mergeStates( std::vector< std::set< int > >& vectorE )
{
// For every set of ints
for( auto &currentSet : vectorE )
{
// For every ints of the set
for( auto currentInt : currentSet )
{
// The two for( ) loops down there allow me to iterate over
// every int of every set of the vectorE
for( auto setToCheck : vectorE )
{
// If the set is different from the one we're already targeting
if( currentSet != setToCheck )
{
for( auto intToCheck : setToCheck )
{
// if we have found an int that is the same as the one we're targeting
if( intToCheck == currentInt )
{
// Merge
currentSet.insert( setToCheck.begin(), setToCheck.end() );
// Deleting the set we copied from, because we won't need it anymore
for(auto setToErase = vectorE.begin() ; setToErase != vectorE.end() ; ){
if( *setToErase == setToCheck )
setToErase = vectorE.erase( setToErase );
else
++setToErase;
}
}
}
}
}
}
}
}
Every time I run my program, I get a segfault when it comes to deleting the set we copied from. Where is my error?
Edit: I got it to work!
Alright, thanks guys. I simply made my parameter const and added a return value, so that I can add every constructed set I need to a new vector and return this vector :-)

The problem isn't modifying any set, it's modifying the vector.
Erasing something from a vector shifts the elements after it. First, this means iterators into the vector after the erased position (the for-range loop uses iterators internally) are no longer valid. Second, if the shifting copied and overwrote sets (instead of moving them), all your iterators into the sets will no longer be valid.
The result is lots of undefined behavior in your code.
Also, your innermost loop is not a good way to go about erasing the set, even if the method was valid. It's very, very inefficient.
You need to rethink, at the very least, the way you're erasing elements. But I think that coming up with a generally better algorithm would be the better approach.

Try to make a new vector instead of modifying the original one:
std::vector<std::set<int>> mergeStates(const std::vector<std::set<int>> & vectorE ) {
std::vector<std::set<int>> new_vector;
...
return new_vector;
}
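One way the elided body could be filled in is sketched below. This is only a minimal sketch under stated assumptions, not the original poster's final code: the helper name sharesElement is hypothetical, and a merged set is assumed to be allowed to move to the end of the result. The input vector is never modified, so no iterator into it can be invalidated.

```cpp
#include <algorithm>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: true if the two sets have at least one int in common.
bool sharesElement(const std::set<int>& a, const std::set<int>& b) {
    return std::any_of(a.begin(), a.end(),
                       [&b](int x) { return b.count(x) != 0; });
}

// Builds a new vector in which all sets sharing an int have been merged.
std::vector<std::set<int>> mergeStates(const std::vector<std::set<int>>& vectorE) {
    std::vector<std::set<int>> result;  // invariant: sets in here are pairwise disjoint
    for (const auto& input : vectorE) {
        std::set<int> merged = input;
        std::vector<std::set<int>> kept;
        for (auto& existing : result) {
            if (sharesElement(merged, existing)) {
                // Absorb every already-built set that overlaps the new one
                merged.insert(existing.begin(), existing.end());
            } else {
                kept.push_back(std::move(existing));
            }
        }
        kept.push_back(std::move(merged));
        result = std::move(kept);  // replace the whole vector instead of erasing from it
    }
    return result;
}
```

Each pass rebuilds the result vector rather than erasing from it, which trades some copying for simplicity.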

You are using the std::vector::erase function which invalidates iterators. Consequently your code inside the range based for loop tries to access the iterator past the container end.

The end iterator used by the range-based for is determined prior to the loop. Since you erase() during the iteration, the end actually changes. Obtaining the iterator from the result of erase() is insufficient, as the end has also changed. I think you can get away with not using range-based for loops for the ranges you erase from.
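As a rough illustration of that suggestion, an explicit iterator loop can re-read end() on every iteration and continue from the iterator that erase() returns. The function name eraseSetsContaining and its parameters are made up for this sketch.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: removes every set that contains `value`.
void eraseSetsContaining(std::vector<std::set<int>>& sets, int value) {
    // sets.end() is evaluated again on each iteration, so it is never stale
    for (auto it = sets.begin(); it != sets.end(); /* advanced in the body */) {
        if (it->count(value) != 0) {
            it = sets.erase(it);  // erase() returns the iterator following the removed element
        } else {
            ++it;
        }
    }
}
```

This only protects the loop that owns the iterator; any outer loop iterating over the same vector would still be invalidated.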

Related

Removing first three elements of 2d array C++

So here's my problem.. I have a 2d array of 2 char strings.
9D 5C 6S 9D KS 4S 9D
9S
If 3 are found, I need to delete the first 3 based on the first char.
My problem is that I segfault on almost anything I do...
pool is the 2d vector
selection = "9S";
while(col != GameBoard::pool.size() ){
while(GameBoard::pool[col][0].at(0) == selection.at(0) || cardsRem!=0){
if(GameBoard::pool[col].size() == 1){
GameBoard::pool.erase(GameBoard::pool.begin() + col);
cardsRem--;
}
else{
GameBoard::pool[col].pop_back();
cardsRem--;
}
}
if(GameBoard::pool[col][0].at(0) != selection.at(0)){
col++;
}
}
I've tried a series of for loops etc., and no luck! Any thoughts would save my sanity!
So I've tried to pull out a code segment to replicate it. But I can't...
If I run my whole program in a loop it will eventually throw a segfault. If I run that exact code in the same circumstance it doesn't... I'm trying to figure out what I'm missing. I'll get back if I figure out exactly where my issue is.
So in the end the issue is not this code itself; I've got memory leaks or something somewhere that add up and eventually crash my program... It tends to happen in the same method each time, I guess.
A safer and more efficient way to erase some elements from a container is to apply the erase-remove idiom.
For instance, your snippet can be rewritten as follows:
using card_t = std::string;
std::vector<std::vector<card_t>> decks = {
{"9D", "5C", "6S", "9D", "KS", "4S", "9D"},
{"9S"}
};
card_t selection{"9S"};
// Predicate specifying which cards should be removed
auto has_same_rank = [rank = selection.at(0)] (card_t const& card) {
return card.at(0) == rank;
};
auto & deck = decks.at(0);
// 'std::remove_if' removes all the elements satisfying the predicate from the range
// by moving the elements that are not to be removed at the beginning of the range
// and returns a past-the-end iterator for the new end of the range.
// 'std::vector::erase' removes from the vector the elements from the iterator
// returned by 'std::remove_if' up to the end iterator. Note that it invalidates
// iterators and references at or after the point of the erase, including the
// end() iterator (it's the most common cause of errors in code like OP's).
deck.erase(std::remove_if(deck.begin(), deck.end(), has_same_rank),
deck.end());
So for anyone else in the future who comes across this...
The problem was that I was deleting elements from the vector inside a loop whose stop condition was the vector's size. The size was read beforehand, and although the code tried to account for that, it was still possible for the loop bound to stay locked in at 8 while the vector actually held only 6 elements.
The solution was to save the positions to delete in a vector and then delete them outside of the loop. I imagine there is a better, more technical answer to this, but it works as intended now!
for (int col = 0; col < size; ++col)
{
if(GameBoard::pool[col][0].at(0) == selection.at(0)){
while(GameBoard::pool[col][0].at(0) == selection.at(0) && cardsRem !=0){
if( GameBoard::pool[col].size() > 1 ){
GameBoard::pool[col].pop_back();
cardsRem--;
}
if(GameBoard::pool[col].size() <2){
toDel.insert ( toDel.begin() , col );
//GameBoard::pool.erase(GameBoard::pool.begin() + col);
cardsRem--;
size--;
}
}
}
}
for(int i = 0; i< toDel.size(); i++){
GameBoard::pool.erase(GameBoard::pool.begin() + toDel[i]);
}

Remove duplicates without using any STL containers

I was asked the following question in a 30-minute interview:
Given an array of integers, remove the duplicates without using any STL containers. For example:
For the input array [1,2,3,4,5,3,3,5,4] the output should be:
[1,2,3,4,5];
Note that the first 3, 4 and 5 have been included, but the subsequent ones have been removed since we have already included them once in the output array. How do we do this without using an extra STL container?
In the interview, I assumed that we only have positive integers and suggested using a bit array to mark off every element present in the input (assume every element in the input array as an index of the bit array and update it to 1). Finally, we could iterate over this bit vector, populating (or displaying) the unique elements. However, he was not satisfied with this approach. Any other methods that I could have used?
Thanks.
Just use std::sort() and std::unique():
int arr[] = { 1,2,3,4,5,3,3,5,4 };
std::sort( std::begin(arr), std::end(arr) );
auto end = std::unique( std::begin(arr), std::end(arr) );
We can first sort the array, then check whether each element is equal to the previous one, and finally produce the answer with the help of a second array that is 2 elements larger than the input, like this.
Initialize the second array with a value that the first array will never contain (any number outside the allowed range). Suppose 0 for simplicity; then:
int arr1[] = { 1,2,3,4,5,3,3,5,4 };
int arr2[] = { 0,0,0,0,0,0,0,0,0,0,0 };
std::sort( std::begin(arr1), std::end(arr1) );
int position=1;
arr2[0] = arr1[0];
for(int* i=std::begin(arr1)+1;i!=std::end(arr1);i++){
if((*i)!=(*(i-1))){
arr2[position] = (*i);
position++;
}
}
int size = 0;
for(int* i=std::begin(arr2);i!=std::end(arr2);i++){
if((*i)!=(*(i+1))){
size++;
}
else{
break;
}
}
int ans[size];
for(int i=0;i<size;i++){
ans[i]=arr2[i];
}
Easy algorithm in O(n^2):
void remove_duplicates(Vec& v) {
// range end
auto it_end = end(v);
for (auto it = begin(v); it != it_end; ++it) {
// remove elements matching *it
it_end = remove(it+1, it_end, *it);
}
// erase now-unused elements
v.erase(it_end, end(v));
}
See also erase-remove idiom
Edit: This is assuming you get a std::vector in, but it would work with C-style arrays too, you would just have to implement the erasure yourself.
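For a C-style array, one possible way to "implement the erasure yourself" is to compact the kept elements to the front and return the new logical length. This is a slightly different O(n^2) formulation from the vector version above, not a transcription of it, and the name removeDuplicates is an assumption made for this sketch.

```cpp
#include <cstddef>

// Hypothetical helper: keeps the first occurrence of each value, preserves
// the original order, and returns the number of elements that remain.
// Elements at positions >= the returned length are leftovers to be ignored.
std::size_t removeDuplicates(int* arr, std::size_t n) {
    std::size_t kept = 0;  // arr[0 .. kept) holds the unique values found so far
    for (std::size_t i = 0; i < n; ++i) {
        bool seen = false;
        for (std::size_t j = 0; j < kept; ++j) {
            if (arr[j] == arr[i]) {
                seen = true;
                break;
            }
        }
        if (!seen) {
            arr[kept++] = arr[i];  // kept <= i, so nothing unread is overwritten
        }
    }
    return kept;
}
```

The caller then treats only the first `kept` elements as the result, since a raw array cannot shrink.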

Fast algorithm to remove odd elements from vector

Given a vector of integers, I want to write a fast (not the obvious O(n^2)) algorithm to remove all odd elements from it.
My idea is: iterate through the vector until the first odd element, then copy everything before it to the end of the vector (calling push_back), and so on until we have looked through all original elements (except the copied ones), then remove all of the original elements, so that only the vector's tail survives.
I wrote the following code to implement it:
void RemoveOdd(std::vector<int> *data) {
size_t i = 0, j, start, end;
uint l = (*data).size();
start = 0;
for (i = 0; i < l; ++i)
{
if ((*data)[i] % 2 != 0)
{
end = i;
for (j = start, j < end, ++j)
{
(*data).push_back((*data)[j]);
}
start = i + 1;
}
}
(*data).erase((*data).begin(), i);
}
but it gives me lots of errors which I can't fix. I'm very new to programming, so I expect that all of them are elementary and stupid.
Please help me with error corrections or another algorithm implementation. Any suggestions and explanations will be very much appreciated. It would also be better not to use the algorithm library.
You can use the remove-erase idiom.
data.erase(std::remove_if(data.begin(), data.end(),
[](int item) { return item % 2 != 0; }), data.end());
You don't really need to push_back anything (or erase elements at the front, which requires repositioning all that follows) to remove elements according to a predicate... Try to understand the "classic" inplace removal algorithm (which ultimately is how std::remove_if is generally implemented):
void RemoveOdd(std::vector<int> & data) {
int rp = 0, wp = 0, sz = data.size();
for(; rp<sz; ++rp) {
if(data[rp] % 2 == 0) {
// if the element is a keeper, write it in the "write pointer" position
data[wp] = data[rp];
// increment so that next good element won't overwrite this
wp++;
}
}
// shrink to include only the good elements
data.resize(wp);
}
rp is the "read" pointer - it's the index to the current element; wp is the "write" pointer - it always points to the location where we'll write the next "good" element, which is also the "current length" of the "new" vector. Every time we have a good element we copy it in the write position and increment the write pointer. Given that wp <= rp always (as rp is incremented once at each iteration, and wp at most once per iteration), you are always overwriting either an element with itself (so no harm is done), or an element that has already been examined and either has been moved to its correct final position, or had to be discarded anyway.
This version is written with specific types (vector<int>), a specific predicate, indexes, and "regular" (non-move) assignment, but it can easily be generalized to any container with forward iterators (as is done in std::remove_if) and erase.
Even if the generic standard library algorithm works well in most cases, this is still an important algorithm to keep in mind: there are often cases where the generic library version isn't sufficient, and knowing the underlying idea is useful for implementing your own version.
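The generalization to forward iterators mentioned above might look roughly like the sketch below. It is an illustration of the idea, not the actual standard library source; the names removeIfSketch and removeOdd are made up here.

```cpp
#include <utility>
#include <vector>

// Sketch of a remove_if-style algorithm: compacts the elements to keep to the
// front of [first, last) and returns the new logical end of the range.
template <class ForwardIt, class Pred>
ForwardIt removeIfSketch(ForwardIt first, ForwardIt last, Pred shouldRemove) {
    ForwardIt write = first;  // plays the role of wp in the index version
    for (ForwardIt read = first; read != last; ++read) {
        if (!shouldRemove(*read)) {
            if (write != read) {
                *write = std::move(*read);  // skip self-move when nothing was removed yet
            }
            ++write;
        }
    }
    return write;
}

// Usage: the same odd-element removal, expressed through the generic version.
void removeOdd(std::vector<int>& data) {
    auto newEnd = removeIfSketch(data.begin(), data.end(),
                                 [](int item) { return item % 2 != 0; });
    data.erase(newEnd, data.end());
}
```

Returning an iterator instead of resizing keeps the algorithm independent of the container, which is why the erase step stays with the caller.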
For a pure algorithm implementation, you don't need to push back elements. In the worst-case scenario (all data odd), you will do more than n^2 copies.
Keep two pointers: one for iterating (i) and one for placing (placed). Iterate over the whole vector (i++), and if data[i] is even, write it to data[placed] and increment placed. At the end, reduce the length to placed; all elements after that are unnecessary.
remove_if does this for you ;)
void DeleteOdd(std::vector<int> & m_vec) {
int i= 0;
for(i= 0; i< m_vec.size(); ++i) {
if(m_vec[i] & 0x01)
{
m_vec.erase(m_vec.begin()+i);
i--;
}
}
m_vec.resize(i);
}

Modifying a data structure while iterating over it

What happens when you add elements to a data structure such as a vector while
iterating over it? Can I not do this?
I tried this and it breaks:
int main() {
vector<int> x = { 1, 2, 3 };
int j = 0;
for (auto it = x.begin(); it != x.end(); ++it) {
x.push_back(j);
j++;
cout << j << " .. ";
}
}
Iterators are invalidated by some operations that modify a std::vector.
Other containers have various rules about when iterators are and are not invalidated. This is a post (by yours truly) with details.
By the way, the entrypoint function main() MUST return int:
int main() { ... }
What happens when you add elements to a data structure such as a vector while iterating over it? Can I not do this?
The iterator would become invalid IF the vector resizes itself. So you're safe as long as the vector doesn't resize itself.
I would suggest you to avoid this.
The short explanation why resizing invalidates iterator:
Initially the vector has some capacity (which you can query by calling vector::capacity()). As you add elements and it becomes full, it allocates a larger block of memory, copies the elements from the old memory to the newly allocated memory, and then frees the old memory. The problem is that the iterator still points to the old memory, which has been deallocated. That is how resizing invalidates iterators.
Here is simple demonstration. Just see when the capacity changes:
std::vector<int> v;
for(int i = 0 ; i < 100 ; i++ )
{
std::cout <<"size = "<<v.size()<<", capacity = "<<v.capacity()<<std::endl;
v.push_back(i);
}
Partial Output:
size = 0, capacity = 0
size = 1, capacity = 1
size = 2, capacity = 2
size = 3, capacity = 4
size = 4, capacity = 4
size = 5, capacity = 8
size = 6, capacity = 8
size = 7, capacity = 8
size = 8, capacity = 8
size = 9, capacity = 16
size = 10, capacity = 16
See the complete output here : http://ideone.com/rQfWe
Note: capacity() tells the maximum number of elements the vector can contain without allocating new memory, and size() tells the number of elements the vector currently contains.
It's not a good idea to do it.
You could think about the case where your vector would need to be resized after a push_back. It would then need to be moved to a bigger memory spot and your iterators would now be invalid.
It's a bad idea in general, because if the vector is resized, the iterator will become invalid (it's wrapping a pointer into the vector's memory).
It's also not clear what your code is really trying to do. If the iterator somehow didn't become invalid (suppose it was implemented as an index), I'd expect you to have an infinite loop there - the end would never be reached because you're always adding elements.
Assuming you want to loop over the original elements, and add one for each, one solution would be to add the new elements to a second vector, and then concatenate that at the end:
vector<int> temp;
// ...
// Inside loop, do this:
temp.push_back(j);
// ...
// After loop, do this to insert all new elements onto end of x
x.insert(x.end(), temp.begin(), temp.end());
While you used vector as an example, there are other STL containers that can have elements pushed back without invalidating iterators. Pushing back an element onto a std::list doesn't require any reallocation of existing elements, as they aren't stored contiguously (a list instead consists of nodes linked together by pointers), so iterators remain valid because the node they internally point to still resides at the same address.
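A small sketch of the std::list case follows. The function name appendDoubled is hypothetical, and the loop is bounded by the original element count so that it does not chase the growing end of the list.

```cpp
#include <cstddef>
#include <list>

// Appends one new element per original element while an iterator into the
// same list is in use. std::list::push_back does not invalidate iterators.
void appendDoubled(std::list<int>& values) {
    const std::size_t originalCount = values.size();
    auto it = values.begin();
    for (std::size_t k = 0; k < originalCount; ++k, ++it) {
        values.push_back(*it * 2);  // `it` stays valid: existing nodes never move
    }
}
```

Bounding the loop by the original size also sidesteps the infinite loop that comparing against end() would cause while appending.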
If you need to do it this way, you can reserve the maximum number of elements you could add. This stops the vector from needing to reallocate, which should prevent crashes.
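The reserve approach could be sketched as follows, assuming the number of appended elements is known up front (here, one per original element). The function name appendTenfold is hypothetical. Note that even without reallocation, push_back still invalidates a previously stored past-the-end iterator, so this sketch never stores one.

```cpp
#include <cstddef>
#include <vector>

// Appends one new element per original element while iterating with an
// iterator. This is only safe because capacity is reserved beforehand.
void appendTenfold(std::vector<int>& x) {
    const std::size_t originalCount = x.size();
    x.reserve(originalCount * 2);  // enough room, so push_back will not reallocate
    auto it = x.begin();           // must be obtained after reserve()
    for (std::size_t j = 0; j < originalCount; ++j, ++it) {
        x.push_back(*it * 10);     // iterators before the insertion point stay valid
    }
}
```

If the reserved amount turns out to be too small, the behavior goes back to being undefined, so this is fragile compared with collecting into a second vector.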

Erasing multiple objects from a std::vector?

Here is my issue: let's say I have a std::vector with ints in it.
Let's say it has 50,90,40,90,80,60,80.
I know I need to remove the second, fifth and third elements. I don't necessarily always know the order of elements to remove, nor how many. The issue is that erasing an element changes the index of the other elements. Therefore, how could I erase these and compensate for the index change? (Sorting then linearly erasing with an offset is not an option.)
Thanks
I am offering several methods:
1. A fast method that does not retain the original order of the elements:
Assign the current last element of the vector to the element to erase, then erase the last element. This will avoid big moves and all indexes except the last will remain constant. If you start erasing from the back, all precomputed indexes will be correct.
void quickDelete( int idx )
{
vec[idx] = vec.back();
vec.pop_back();
}
I see this essentially is a hand-coded version of the erase-remove idiom pointed out by Klaim ...
2. A slower method that retains the original order of the elements:
Step 1: Mark all vector elements to be deleted, i.e. with a special value. This has O(|indexes to delete|).
Step 2: Erase all marked elements using v.erase( remove (v.begin(), v.end(), special_value), v.end() );. This has O(|vector v|).
The total run time is thus O(|vector v|), assuming the index list is shorter than the vector.
3. Another slower method that retains the original order of the elements:
Use a predicate and remove_if as described in https://stackoverflow.com/a/3487742/280314 . To make this efficient while respecting the requirement of not "sorting then linearly erasing with an offset", my idea is to implement the predicate using a hash table and adjust the indexes stored in the hash table as the deletion proceeds on returning true, as Klaim suggested.
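Method 1 above (overwrite with the last element, then pop) can be applied to several indexes if they are processed from the back. The sketch below makes that concrete under some assumptions: the name quickDeleteAll is made up, all indexes are assumed to be in range, and the index list is sorted inside the function, which the question would rather avoid.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Removes the elements at the given indexes without preserving the order of
// the remaining elements. Indexes are assumed to be valid for `vec`.
void quickDeleteAll(std::vector<int>& vec, std::vector<std::size_t> indices) {
    // Highest index first, so smaller precomputed indexes stay untouched
    std::sort(indices.begin(), indices.end(), std::greater<std::size_t>());
    // Drop repeated indexes so no element is removed twice
    indices.erase(std::unique(indices.begin(), indices.end()), indices.end());
    for (std::size_t idx : indices) {
        vec[idx] = vec.back();  // overwrite the doomed element with the last one
        vec.pop_back();         // then drop the last slot in O(1)
    }
}
```

Each removal is constant time, so the total cost is dominated by sorting the index list.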
Using a predicate and the algorithm remove_if you can achieve what you want : see http://www.cplusplus.com/reference/algorithm/remove_if/
Don't forget to erase the item (see remove-erase idiom).
Your predicate will simply hold the idx of each value to remove and decrease all indexes it keeps each time it returns true.
That said if you can afford just removing each object using the remove-erase idiom, just make your life simple by doing it.
Erase the items backwards. In other words erase the highest index first, then next highest etc. You won't invalidate any previous iterators or indexes so you can just use the obvious approach of multiple erase calls.
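A minimal sketch of erasing backwards, assuming the indexes are valid and that sorting a copy of the index list is acceptable. The name eraseIndices is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Erases the elements at the given indexes, preserving the order of the rest.
void eraseIndices(std::vector<int>& vec, std::vector<std::size_t> indices) {
    // Sort descending so each erase only shifts elements that are already done with
    std::sort(indices.begin(), indices.end(), std::greater<std::size_t>());
    indices.erase(std::unique(indices.begin(), indices.end()), indices.end());
    for (std::size_t idx : indices) {
        vec.erase(vec.begin() + static_cast<std::ptrdiff_t>(idx));
    }
}
```

Every erase call still shifts the tail of the vector, so this is simple rather than fast when many elements are removed.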
I would move the elements which you don't want to erase to a temporary vector and then replace the original vector with this.
While this answer by Peter G. in variant one (the swap-and-pop technique) is the fastest when you do not need to preserve the order, here is the unmentioned alternative which maintains the order.
With C++17 and C++20 the removal of multiple elements from a vector is possible with standard algorithms. The run time is O(N * Log(N)) due to std::stable_partition. There are no external helper arrays, no excessive copying, everything is done inplace. Code is a "one-liner":
template <class T>
inline void erase_selected(std::vector<T>& v, const std::vector<int>& selection)
{
v.resize(std::distance(
v.begin(),
std::stable_partition(v.begin(), v.end(),
[&selection, &v](const T& item) {
return !std::binary_search(
selection.begin(),
selection.end(),
static_cast<int>(static_cast<const T*>(&item) - &v[0]));
})));
}
The code above assumes that selection vector is sorted (if it is not the case, std::sort over it does the job, obviously).
To break this down, let us declare a number of temporaries:
// We need an explicit item index of an element
// to see if it should be in the output or not
int itemIndex = 0;
// The checker lambda returns `true` if the element is NOT in `selection`, i.e. it should be kept
auto filter = [&itemIndex, &selection](const T& item) {
return !std::binary_search(
selection.begin(),
selection.end(),
itemIndex++);
};
This checker lambda is then fed to std::stable_partition algorithm which is guaranteed to call this lambda only once for each element in the original (unpermuted !) array v.
auto end_of_selected = std::stable_partition(
v.begin(),
v.end(),
filter);
The end_of_selected iterator points right after the last element which should remain in the output array, so we can now resize v down. To calculate the number of remaining elements we use std::distance between the two iterators.
v.resize(std::distance(v.begin(), end_of_selected));
This is different from the code at the top (it uses itemIndex to keep track of the array element). To get rid of the itemIndex, we capture the reference to source array v and use pointer arithmetic to calculate itemIndex internally.
Over the years (on this and other similar sites) multiple solutions have been proposed, but usually they employ multiple "raw loops" with conditions and some erase/insert/push_back calls. The idea behind stable_partition is explained beautifully in this talk by Sean Parent.
This link provides a similar solution (and it does not assume that selection is sorted - std::find_if instead of std::binary_search is used), but it also employs a helper (incremented) variable which disables the possibility to parallelize processing on larger arrays.
Starting from C++17, there is a new first argument to std::stable_partition (the ExecutionPolicy) which allows auto-parallelization of the algorithm, further reducing the run-time for big arrays. To make yourself believe this parallelization actually works, there is another talk by Hartmut Kaiser explaining the internals.
Would this work:
void DeleteAll(vector<int>& data, const vector<int>& deleteIndices)
{
vector<bool> markedElements(data.size(), false);
vector<int> tempBuffer;
tempBuffer.reserve(data.size()-deleteIndices.size());
for (vector<int>::const_iterator itDel = deleteIndices.begin(); itDel != deleteIndices.end(); itDel++)
markedElements[*itDel] = true;
for (size_t i=0; i<data.size(); i++)
{
if (!markedElements[i])
tempBuffer.push_back(data[i]);
}
data = tempBuffer;
}
It's an O(n) operation, no matter how many elements you delete. You could gain some efficiency by reordering the vector inline (but I think this way it's more readable).
This is non-trivial because as you delete elements from the vector, the indexes change.
[0] hi
[1] you
[2] foo
>> delete [1]
[0] hi
[1] foo
If you keep a counter of times you delete an element and if you have a list of indexes you want to delete in sorted order then:
int counter = 0;
for (int k : IndexesToDelete) {
events.erase(events.begin()+ k + counter);
counter -= 1;
}
You can use this method, if the order of the remaining elements doesn't matter
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector< int> vec;
vec.push_back(1);
vec.push_back(-6);
vec.push_back(3);
vec.push_back(4);
vec.push_back(7);
vec.push_back(9);
vec.push_back(14);
vec.push_back(25);
cout << "The elements before " << endl;
for(int i = 0; i < vec.size(); i++) cout << vec[i] <<endl;
vector< bool> toDeleted;
int YesOrNo = 0;
for(int i = 0; i<vec.size(); i++)
{
cout<<"You need to delete this element? "<<vec[i]<<", if yes enter 1 else enter 0"<<endl;
cin>>YesOrNo;
if(YesOrNo)
toDeleted.push_back(true);
else
toDeleted.push_back(false);
}
//Deleting, beginning from the last element to the first one
for(int i = toDeleted.size()-1; i>=0; i--)
{
if(toDeleted[i])
{
vec[i] = vec.back();
vec.pop_back();
}
}
cout << "The elements after" << endl;
for(int i = 0; i < vec.size(); i++) cout << vec[i] <<endl;
return 0;
}
Here's an elegant solution in case you want to preserve the indices. The idea is to replace the values you want to delete with a special value that is guaranteed not to be used anywhere, and then, at the very end, perform the erase itself:
std::vector<int> vec = {1, 2, 3, 4, 5, 6, 7, 8, 9};
// marking 3 elements to be deleted
vec[2] = std::numeric_limits<int>::lowest();
vec[5] = std::numeric_limits<int>::lowest();
vec[3] = std::numeric_limits<int>::lowest();
// erase
vec.erase(std::remove(vec.begin(), vec.end(), std::numeric_limits<int>::lowest()), vec.end());
// print values => 1 2 5 7 8 9
for (const auto& value : vec) std::cout << ' ' << value;
std::cout << std::endl;
It's very quick if you delete a lot of elements because the deletion itself is happening only once. Items can also be deleted in any order that way.
If you use a struct instead of an int, you can still mark an element of that struct, e.g. dead=true, and then use remove_if instead of remove:
struct MyObj
{
int x;
bool dead = false;
};
std::vector<MyObj> objs = {{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}};
objs[2].dead = true;
objs[5].dead = true;
objs[3].dead = true;
objs.erase(std::remove_if(objs.begin(), objs.end(), [](const MyObj& obj) { return obj.dead; }), objs.end());
// print values => 1 2 5 7 8 9
for (const auto& obj : objs) std::cout << ' ' << obj.x;
std::cout << std::endl;
This one is a bit slower, around 80% the speed of the remove.