How to move all the pointers from one vector to another? - c++

Basically what I want to do is remove some of the pointers inside my vector, but I found out that it can be quite slow to do that in the middle of the vector.
So I have a vector that already has data inside:
std::vector<Class*> vec1; // This already contains pointers
I'll iterate through vec1 and will add some of the pointers to another vector (vec2): vec2.push_back(vec1.at(index))
Now what I would like to do is something like vec1 = vec2, but I don't know if this is the best (most efficient) way to do that.
What would be the best way to do that?
I tried:
1. While looping through vec1, simply erasing what I need to remove from it:
it = vec1.erase(it)
2. While looping through vec1, moving the last item to the current index and calling pop_back:
vec1.at(index) = vec1.back();
vec1.pop_back();
3. Setting some attribute on the object the pointer points to while looping through vec1, and then using std::remove_if:
vec1.erase(std::remove_if(vec1.begin(), vec1.end(), shouldBeRemoved), vec1.end());
4. Now I'm trying to generate a new vector while looping through vec1 and adding the pointers I want to keep, then "swapping" or "moving" the contents of this new vector to vec1.
Apparently when doing it the 4th way, the pointers get invalidated :(
I would love to see what you guys suggest me. A big thank you to everyone that is willing to help!

You can just use std::remove_if to conditionally remove items from a vector. This algorithm will shift items that need to be kept over to the front. Follow it up with a std::vector::erase call to actually remove the items not shifted to the front.
This is similar to your option 3, but you don't need to set an attribute first - just use a predicate that determines if the item should be kept or not, and avoid having to pass over the vector twice.
If you don't want to do it in-place, but want to fill a new vector, then std::copy_if does that.
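As a minimal sketch of both options for a vector of pointers: `Item` and the body of `shouldBeRemoved` below are hypothetical stand-ins for the asker's `Class` and removal condition, not anything from the question. Neither function deletes the pointed-to objects; they only drop the pointers from the container.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the asker's Class.
struct Item {
    int value;
};

// Hypothetical predicate: true when the pointer should be dropped.
bool shouldBeRemoved(const Item* p) {
    return p->value < 0;
}

// In place: remove_if shifts the kept pointers to the front in one pass,
// then a single erase trims the leftover tail.
void removeInPlace(std::vector<Item*>& vec) {
    vec.erase(std::remove_if(vec.begin(), vec.end(), shouldBeRemoved),
              vec.end());
}

// Out of place: copy only the kept pointers into a new vector.
std::vector<Item*> keptCopy(const std::vector<Item*>& vec) {
    std::vector<Item*> out;
    out.reserve(vec.size());
    std::copy_if(vec.begin(), vec.end(), std::back_inserter(out),
                 [](const Item* p) { return !shouldBeRemoved(p); });
    return out;
}
```

Both variants keep the relative order of the surviving pointers.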

Removing things from a vector should be done with the erase-remove idiom.
It is well covered here: https://en.wikipedia.org/wiki/Erase%E2%80%93remove_idiom
The basic idea is to shift the kept elements forward first and then erase the unneeded tail in one go, which is faster than erasing elements one at a time and shifting the rest after each erase. The example from that page:
v.erase( std::remove( v.begin(), v.end(), 5 ), v.end() );
But in general: if you have a lot of add/erase steps in your algorithm, you should consider a std::list, where removing an element in the middle is much cheaper once you have an iterator to it.

Your attempt #2 suggests that you're not interested in the order of the elements. remove_if can cost you performance because it preserves the relative order of the items you keep, which may require a substantial number of shifts.
Swapping and popping has its own problem: popping the back after every single removal is unnecessary work, because the tail can be trimmed once at the end.
So combine the ideas: overwrite each removed element with the last element not yet swapped out (the last one the first time, the second to last the second time, and so on), then erase the tail items in a single call once the loop is complete. That should give you the fastest algorithm.
Some of the comments suggest that a copy is faster than a swap. That is true for a single copy, but when a vector ends up copying many elements many times over, the swap approach will be significantly faster.
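The combined approach described above could be sketched roughly as follows. The function name `unorderedRemoveIf` is made up for illustration, and the sketch assumes the order of the surviving elements does not matter.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Removes every element for which shouldRemove returns true.
// Does NOT preserve the order of the remaining elements.
template <typename T, typename Pred>
void unorderedRemoveIf(std::vector<T>& vec, Pred shouldRemove) {
    std::size_t last = vec.size();  // one past the last element not yet swapped out
    std::size_t i = 0;
    while (i < last) {
        if (shouldRemove(vec[i])) {
            --last;
            if (i != last) {
                vec[i] = std::move(vec[last]);  // fill the hole from the back
            }
            // i is not advanced: the element just moved in still needs checking
        } else {
            ++i;
        }
    }
    // One erase at the end instead of a pop_back per removal.
    vec.erase(vec.begin() + static_cast<std::ptrdiff_t>(last), vec.end());
}
```

Each removed element costs at most one move, regardless of where it sits in the vector.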

Related

Does inserting an element in vector by re-sizing the vector every time takes more time?

I got a decision making problem here. In my application, I need to merge two vectors. I can't use stl algorithms since data order is important (It should not be sorted.).
Both vectors contain data that can sometimes be the same, or 75% different in the worst case.
Currently I am torn between two approaches:
Approach 1:
a. take an element in the smaller vector.
b. compare it with the elements in bigger one.
c. If element matches then skip it (I don't want duplicates).
d. If element is not found in bigger one, calculate proper position to insert.
e. re-size the bigger one to insert the element (multiple time re-size may happen).
Approach 2:
a. Iterate through vectors to find matched element positions.
b. Resize the bigger one at a go by calculating total size required.
c. Take smaller vector and go to elements which are not-matched.
d. Insert the element in appropriate position.
Kindly help me to choose the proper one. And if there is any better approach or simpler techniques (like stl algorithms), or easier container than vector, please post here. Thank you.
You shouldn't be focusing on the resizes. In approach 1, you should use vector.insert() so you don't actually need to resize the vector yourself. This may cause reallocations of the underlying buffer to happen automatically, but std::vector is carefully implemented so that the total cost of these operations will be small.
The real problem with your algorithm is the insert, and maybe the search (which you didn't detail). When you insert into a vector anywhere except at the end, all the elements after the insertion point must be moved up in memory, and this can be quite expensive.
If you want this to be fast, you should build a new vector from your two input vectors, by appending one element at a time, with no inserting in the middle.
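One way this could look is sketched below. The question never says how the "proper position" of an element is computed, so the sketch assumes the merged order is simply first appearance (all of `a`, then the elements of `b` not seen before). The name `mergeUnique` and the use of std::unordered_set for the duplicate check are illustrative choices, not part of the original answer.

```cpp
#include <unordered_set>
#include <vector>

// Assumption: merged order = order of first appearance, a before b.
std::vector<int> mergeUnique(const std::vector<int>& a,
                             const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());  // a single allocation up front
    std::unordered_set<int> seen;      // tracks values already appended

    auto appendNew = [&](const std::vector<int>& src) {
        for (int x : src) {
            // insert().second is true only the first time x is seen
            if (seen.insert(x).second) {
                out.push_back(x);      // always appended at the end
            }
        }
    };
    appendNew(a);
    appendNew(b);
    return out;
}
```

Every element is appended at the end of the new vector, so nothing is ever shifted to make room in the middle.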
It doesn't look like you can do this in better time complexity than O(n log n), because removing duplicates from a plain vector by comparison takes O(n log n) time. So using a set to remove duplicates might be the best thing you can do.
n here is number of elements in both vectors.
Depending on your actual setup (like if you're adding object pointers to a vector instead of copying values into one), you might get significantly faster results using a std::list. std::list allows constant-time insertion at a known position, which is going to be a huge performance gain.
Doing insertion might be a little awkward but is completely do-able by only changing a few pointers (inexpensive) vs insertion via a vector which moves every element out of the way to put the new one down.
If they need to end up as vectors, you can then convert the list to a vector with something like (untested)
std::list<thing> things;
//efficiently combine the vectors into a list
//since list is MUCH better for inserts
//but we still need it as a vector anyway
std::vector<thing> things_vec;
things_vec.reserve(things.size()); //allocate memory
//now move them into the vector
things_vec.insert(
things_vec.begin(),
std::make_move_iterator(things.begin()),
std::make_move_iterator(things.end())
);
//things_vec now has the same content and order as the list with very little overhead

Keeping vector of iterators of the data

I have a function :
void get_good_items(const std::vector<T>& data,std::vector<XXX>& good_items);
This function should check all the data, find the items that satisfy a condition, and return where they are in good_items.
What is the best choice for std::vector<XXX>?
std::vector<size_t> that contains all good indices.
std::vector<T*> that contain a pointers to the items.
std::vector<std::vector<T>::iterator> that contains iterators to the items.
other ??
EDIT:
What will I do with the good_items?
Many things... one of them is to delete them from the vector and save them in another place. Maybe something else later.
EDIT 2:
One of the most important things for me is how fast accessing the items in data will be, depending on the structure of good_items.
EDIT 3:
I have just realized that my thinking was wrong. Isn't it better to keep raw (or smart) pointers as the items of the vector, so that I keep the real values of the vector (which are pointers) and need not be afraid of heavy copies because they are just pointers?
If you remove items from the original vector, every one of the methods you listed will be a problem.
If you add items to the original vector, the second and third will be problematic. The first one won't be a problem if you use push_back to add items.
All of them will be fine if you don't modify the original vector.
Given that, I would recommend using std::vector<size_t>.
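A minimal sketch of the index-based variant follows. The extra `isGood` parameter is an assumption added for illustration, since the original signature does not show how the condition is supplied.

```cpp
#include <cstddef>
#include <vector>

// Fills good_items with the indices of all elements of data that satisfy isGood.
// Assumption: the condition is passed in as a callable.
template <typename T, typename Pred>
void get_good_items(const std::vector<T>& data,
                    std::vector<std::size_t>& good_items,
                    Pred isGood) {
    good_items.clear();
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (isGood(data[i])) {
            good_items.push_back(i);  // store the position, not the element
        }
    }
}
```

The stored indices stay meaningful when elements are appended to data with push_back, but not after elements are removed from it.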
I would go with std::vector<size_t> or std::vector<T*> because they are easier to type. Otherwise, those three vectors are pretty much equivalent, they all identify positions of elements.
std::vector<size_t> can be made to use a smaller type for indexes if you know the limits.
If you expect that there are going to be many elements in this vector, you may like to consider using boost::dynamic_bitset instead to save memory and increase CPU cache utilization. A bit per element, bit position being the index into the original vector.
If you intend to remove the elements that satisfy the predicate, then the erase-remove idiom is the simplest solution.
If you intend to copy such elements, then std::copy_if is the simplest solution.
If you intend to end up with two partitions of the container i.e. one container has the good ones and another the bad ones, then std::partition_copy is a good choice.
For generally allowing the iteration of such elements, an efficient solution is returning a range of such iterators that will check the predicate while iterating. I don't think there are such iterators in the standard library, so you'll need to implement them yourself. Luckily boost already has done that for you: http://www.boost.org/doc/libs/release/libs/iterator/doc/filter_iterator.html
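For the two-partition case mentioned above, a rough sketch with std::partition_copy might look like this. The helper name `splitGoodBad` and the choice to return a pair of vectors are illustrative assumptions.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

// Returns {good, bad}: elements for which isGood is true go to the first
// vector, all others to the second. The input is left untouched.
template <typename T, typename Pred>
std::pair<std::vector<T>, std::vector<T>>
splitGoodBad(const std::vector<T>& data, Pred isGood) {
    std::vector<T> good;
    std::vector<T> bad;
    std::partition_copy(data.begin(), data.end(),
                        std::back_inserter(good),
                        std::back_inserter(bad),
                        isGood);
    return {std::move(good), std::move(bad)};
}
```

Both output vectors keep the relative order the elements had in the input.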
The problem you are solving, from my understanding, is the intersection of two sets, and I would go for the solution from standard library: std::set_intersection

Inserting an element into vector in the middle

Is there a way of inserting/deleting an element from a vector other than the following?
The formal method of using 'push_back'
Using 'find()' in this way... find(v.begin(), v.end(), int)
I have read somewhere that inserting in the middle can be achieved by inclusive insertion/deletion.
So, is it really possible?
You can use std::vector::insert; however, note that this operation is linear in the size of the vector, i.e. O(size()). If your code needs to perform insertions in the middle frequently, you may want to switch to a linked-list structure.
Is there a way of inserting/deleting an element from a vector other than the following
Yes, you can use std::vector::insert() to insert element at a specified position.
Because vectors use an array as their underlying storage, inserting elements in positions other than the vector end causes the container to move all the elements that were after position to their new positions. This is generally an inefficient operation compared to the one performed for the same operation by other kinds of sequence containers (such as std::list).
std::vector is a standard container, so you can apply the standard STL algorithms to it.
vector::insert seems to be what you want.
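A small sketch of inserting and erasing in the middle, combining std::find with vector::insert and vector::erase. The helper names `insertBefore` and `eraseFirst` are made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Inserts value before the first occurrence of `before`.
// If `before` is not present, find returns end() and the value is appended.
void insertBefore(std::vector<int>& v, int before, int value) {
    auto pos = std::find(v.begin(), v.end(), before);
    v.insert(pos, value);  // shifts everything from pos onward by one
}

// Erases the first occurrence of value; returns false if it was not found.
bool eraseFirst(std::vector<int>& v, int value) {
    auto pos = std::find(v.begin(), v.end(), value);
    if (pos == v.end()) {
        return false;
    }
    v.erase(pos);  // shifts everything after pos down by one
    return true;
}
```

Both operations are linear in the number of elements after the affected position, which is the cost the answers above warn about.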

Removing Vector from 2D vector

I have a 2D vector containing 96 blocks of 600 values, which is what I want.
I need to remove (blocks) that do not contain sufficient energy. I have managed to calculate the energy but do not know which way would be better in removing the (blocks) that do not contain enough energy.
In your opinions would it be better to create a temporary 2D vector, that pushed back the blocks that do contain enough energy and then delete the original vector from memory or...
Should I remove the blocks from the vector at that particular position?
I'm assuming you have this:
typedef std::vector<value> Block;
typedef std::vector< Block > my2dVector;
and you have a function like this:
bool BlockHasInsufficientEnergy( Block const& vec );
and you want to remove the Blocks that do not have sufficient energy.
By remove, do you mean you want there to be fewer than 96 Blocks afterwards? I will assume so.
Then the right way to do this is:
void RemoveLowEnergyBlocks( my2dVector& vec )
{
my2dVector::iterator erase_after = std::remove_if( vec.begin(), vec.end(), BlockHasInsufficientEnergy );
vec.erase( erase_after, vec.end() );
}
The above can be done in one line, but doing it in two should make it clearer what is going on.
remove_if finds everything that satisfies the condition given as the 3rd argument and filters it out of the range. It returns the point where the "trash" at the end of the vector begins. We then erase the trash. This is called the erase-remove idiom.
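For a self-contained version to experiment with, here is a sketch under two loud assumptions that are not in the question: the samples are doubles, and the energy of a block is the sum of its squared samples. The `minEnergy` threshold parameter is likewise hypothetical.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

typedef std::vector<double> Block;      // assumption: samples are doubles
typedef std::vector<Block> my2dVector;

// Assumption: energy is the sum of squared samples in the block.
double blockEnergy(const Block& b) {
    return std::inner_product(b.begin(), b.end(), b.begin(), 0.0);
}

// Drops every block whose energy is below minEnergy (hypothetical threshold).
void RemoveLowEnergyBlocks(my2dVector& vec, double minEnergy) {
    vec.erase(std::remove_if(vec.begin(), vec.end(),
                             [minEnergy](const Block& b) {
                                 return blockEnergy(b) < minEnergy;
                             }),
              vec.end());
}
```

The surviving blocks keep their original relative order, and no temporary 2D vector is allocated.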
Maybe you'd want to use linked list, or just set filtered-out items as NULL's, or mark them with bool member flag, or keep a separate vector of indexes of filtered items (if you have several filters at once this saves memory).
The solution varies depending on the constraints. Do you need random access? How expensive is copying an object? Etc.
Also you can take a look at the STL code (this is STL's vector, right?) and check if it does what you ask for - i.e. copying a vector data.
It depends, in part, on how you define better in this case. There may be advantages to either method, but it would be hard to know exactly what they are. Most likely it is somewhat "better", in terms of memory and processing performance, to erase the exact positions you don't want from the vector instead of allocating an entirely new one.
It may be better still to consider using a deque or list for that purpose, since they may avoid large reallocations that the vector is likely to make as it tries to keep a contiguous segment of memory.

How to erase duplicate element in vector, efficiently

I have
vector<string> data ; // I hold some usernames in it
In that vector I have duplicate elements, and I want to erase them. Is there an algorithm or library function to erase duplicate elements?
ex :
In data;
abba, abraham, edie, Abba, edie
After operation;
abba, abraham, edie, Abba
If you can sort the elements in the container, the straightforward and relatively efficient solution would be:
std::sort(data.begin(), data.end());
data.erase(std::unique(data.begin(), data.end()), data.end());
I'm not sure there is a really good way to do it.
What I would do is sort (in a different array, if you need the original intact) and then run through it.
"set" does not allow duplicates. You can use that to filter out duplicates.
Create a set
Add all usernames to set
Create a new vector
Add all elements from set to vector
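Note that copying a std::set back into a vector yields the names in sorted order, while the example in the question keeps them in their original order. A variant of the steps above that preserves first-appearance order could look like the sketch below, where a std::unordered_set is used only as a "seen before?" check. The function name `eraseDuplicatesKeepOrder` is made up for illustration.

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Keeps the first occurrence of each name and preserves the original order.
// Comparison is case-sensitive, so "abba" and "Abba" are different names.
void eraseDuplicatesKeepOrder(std::vector<std::string>& data) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> unique;
    unique.reserve(data.size());
    for (std::string& name : data) {
        // insert().second is true only for the first occurrence of name
        if (seen.insert(name).second) {
            unique.push_back(std::move(name));
        }
    }
    data.swap(unique);  // replace the old contents with the filtered ones
}
```

With the input from the question (abba, abraham, edie, Abba, edie), this leaves abba, abraham, edie, Abba.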
If you really need to do it efficiently, then you should do an in-place sort first, and then go through the container yourself instead of using std::unique: copy the unique items into a new vector, and at the end do a swap.
I just checked the source code of std::unique; it does a lot of moves after finding a duplicate, and those moves hurt the vector's performance.