Can I insert values into vector of pointers using insert function? - c++

Suppose Foo is any class.
Foo f[5];
std::vector<Foo*> v;
I can insert the elements into vector of pointers using a for loop statement:
for (size_t i = 0; i < 5; i++)
v.push_back(&f[i]);
Is it possible to insert them using the std::vector::insert() function, and if not, why not? I have tried several times, but it failed. For example:
v.insert(v.end(), &f[0], &f[5]); // error

If you mean, with a single call to insert, then no - that can copy a range, performing type conversions if needed, but can't apply arbitrary transformations like taking the address of each element.
You could use std::transform:
std::transform(std::begin(f), std::end(f),
std::back_inserter(v),
[](Foo & f) {return &f;});
although that's probably less clear than a simple loop, especially if you use new-style syntax
for (Foo & foo : f) {
v.push_back(&foo);
}
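As a minimal sketch of the std::transform approach wrapped in a reusable helper: the name append_addresses and the id member of Foo are assumptions made for illustration, not part of the question. Reserving first avoids repeated reallocation while the pointers are appended.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

struct Foo { int id; };  // hypothetical stand-in for "any class"

// Appends the address of every element of a built-in array to `out`.
template <std::size_t N>
void append_addresses(Foo (&arr)[N], std::vector<Foo*>& out) {
    out.reserve(out.size() + N);  // one allocation at most
    std::transform(std::begin(arr), std::end(arr), std::back_inserter(out),
                   [](Foo& foo) { return &foo; });
}
```

Calling append_addresses(f, v) with the array f and vector v from the question should leave v holding &f[0] through &f[4].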

Yes, you can also use insert, but there are a few differences between the two operations.
push_back puts a new element at the end of the vector, whereas insert lets you choose the position. This affects performance: insert has to shift every element after the chosen position to make room for the new one, which is why insert is often less efficient than push_back.

Related

Insert to beginning of copied vector

I have a std::vector<double> that's the output from a simulation code. The size can be anywhere from O(10^1) to O(10^4). I need to create a new vector that's a copy of this vector with an additional element at the beginning, so I can either write:
// old_vec is some std::vector<double> from a simulation code
auto new_vec = old_vec;
double val = 1.0;
new_vec.insert(new_vec.begin(), val);
or
std::vector<double> new_vec{val};
new_vec.insert(new_vec.end(), old_vec.begin(), old_vec.end());
I believe the first approach will cause a reallocation due to the insertion at the beginning of a vector, whereas the second one will just append everything to the end, so the latter seems better? Is there any guarantee that the compiler may optimize the first code into the second code?
I wouldn't trust directly using the "=" operator to copy the vector, but more of a combination between your two methods. List-initialization may be safer first, then use insert() to add the first element:
vector <double> new_vec = {old_vec.begin(), old_vec.end()};
new_vec.insert(new_vec.begin(), val);
Your suspicions of problems may vary across different compilers, so you may or may not get an error. However, if you would like a foolproof way, that would be outright inserting and copying:
vector <double> new_vec; new_vec.push_back(val);
for (double i : old_vec) { new_vec.push_back(i); }
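A variation on the question's second approach, offered as a sketch rather than a benchmarked recommendation: reserving the final size up front means the new vector allocates once and nothing has to be shifted afterwards. The function name prepend is an assumption made for illustration.

```cpp
#include <vector>

// Builds a copy of old_vec with val placed in front of it.
std::vector<double> prepend(double val, const std::vector<double>& old_vec) {
    std::vector<double> new_vec;
    new_vec.reserve(old_vec.size() + 1);  // single allocation for the final size
    new_vec.push_back(val);
    new_vec.insert(new_vec.end(), old_vec.begin(), old_vec.end());
    return new_vec;
}
```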

Copying vector elements to a vector pair

In my C++ code,
vector <string> strVector = GetStringVector();
vector <int> intVector = GetIntVector();
So I combined these two vectors into a single one,
void combineVectors(vector<string>& strVector, vector <int>& intVector, vector < pair <string, int>>& pairVector)
{
for (int i = 0; i < strVector.size() || i < intVector.size(); ++i )
{
pairVector.push_back(pair<string, int> (strVector.at(i), intVector.at(i)));
}
}
Now this function is called like this,
vector <string> strVector = GetStringVector();
vector <int> intVector = GetIntVector();
vector < pair <string, int>> pairVector;
combineVectors(strVector, intVector, pairVector);
//rest of the implementation
The combineVectors function uses a loop to add the elements of the other two vectors to the vector of pairs. I doubt this is an efficient way, as this function gets called hundreds of times with different data. This might cause a performance issue because it goes through the loop every time.
My goal is to copy both vectors in "one go" to the vector of pairs, i.e., without using a loop. I am not sure whether that's even possible.
Is there a better way of achieving this without compromising the performance?
You have clarified that the arrays will always be of equal size. That's a prerequisite condition.
So, your situation is as follows. You have vector A over here, and vector B over there. You have no guarantees whether the actual memory that vector A uses and the actual memory that vector B uses are next to each other. They could be anywhere.
Now you're combining the two vectors into a third vector, C. Again, no guarantees where vector C's memory is.
So, you have really very little to work with, in terms of optimizations. You have no additional guarantees whatsoever. This is pretty much fundamental: you have two chunks of bytes, and those two chunks need to be copied somewhere else. That's it. That's what has to be done, that's what it all comes down to, and there is no other way to get it done, other than doing exactly that.
But there is one thing that can be done to make things a little bit faster. A vector will typically allocate memory for its values in incremental steps, reserving some extra space, initially, and as values get added to the vector, one by one, and eventually reach the vector's reserved size, the vector has to now grab a new larger block of memory, copy everything in the vector to the larger memory block, then delete the older block, and only then add the next value to the vector. Then the cycle begins again.
But you know, in advance, how many values you are about to add to the vector, so you simply instruct the vector to reserve() enough size in advance, so it doesn't have to repeatedly grow itself, as you add values to it. Before your existing for loop, simply:
pairVector.reserve(pairVector.size()+strVector.size());
Now, the for loop will proceed and insert new values into pairVector which is guaranteed to have enough space.
A couple of other things are possible. Since you have stated that both vectors will always have the same size, you only need to check the size of one of them:
for (int i = 0; i < strVector.size(); ++i )
Next step: at() performs bounds checking. This loop ensures that i will never be out of bounds, so at()'s bound checking is also some overhead you can get rid of safely:
pairVector.push_back(pair<string, int> (strVector[i], intVector[i]));
Next: with a modern C++ compiler, the compiler should be able to optimize away, automatically, several redundant temporaries, and temporary copies here. It's possible you may need to help the compiler, a little bit, and use emplace_back() instead of push_back() (assuming C++11, or later):
pairVector.emplace_back(strVector[i], intVector[i]);
Going back to the loop condition, strVector.size() gets evaluated on each iteration of the loop. It's very likely that a modern C++ compiler will optimize it away, but just in case you can also help your compiler check the vector's size() only once:
int n=strVector.size();
for (int i = 0; i < n; ++i )
This is really a stretch, but it might eke out a few extra quantums of execution time. And that's pretty much all the obvious optimizations here. Realistically, the most to be gained here is by using reserve(). The other optimizations might help things a little bit more, but it all boils down to moving a certain number of bytes from one area in memory to another area. There aren't really special ways of doing that which are faster than other ways.
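Putting the individual suggestions together, the function might look roughly like this. It is a sketch that assumes, as stated in the question's clarification, that both input vectors have the same size; the const references and the std::size_t index are small additions of mine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

void combineVectors(const std::vector<std::string>& strVector,
                    const std::vector<int>& intVector,
                    std::vector<std::pair<std::string, int>>& pairVector) {
    assert(strVector.size() == intVector.size());  // stated precondition
    const std::size_t n = strVector.size();        // size evaluated once
    pairVector.reserve(pairVector.size() + n);     // avoid repeated regrowth
    for (std::size_t i = 0; i < n; ++i) {
        // Constructs each pair in place; operator[] skips the bounds check.
        pairVector.emplace_back(strVector[i], intVector[i]);
    }
}
```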
We can use std::generate() to achieve this:
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
using namespace std;
vector <string> strVector{ "hello", "world" };
vector <int> intVector{ 2, 3 };
// Returns the next (string, int) pair each time it is called.
pair<string, int> f()
{
static int i = -1; // persists between calls, so f() can only fill one vector
++i;
return make_pair(strVector[i], intVector[i]);
}
int main() {
size_t min_Size = min(strVector.size(), intVector.size());
vector< pair<string,int> > pairVector(min_Size);
generate(pairVector.begin(), pairVector.end(), f);
for( size_t i = 0 ; i < min_Size ; i++ )
cout << pairVector[i].first <<" " << pairVector[i].second << endl;
}
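If the aim is only to avoid writing the loop by hand, the two-range overload of std::transform is another option that needs no static counter. This is a sketch; the function name zip is hypothetical, and the algorithm still visits every element internally, so it is not expected to be faster than a plain loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Pairs up elements of the two vectors, stopping at the shorter one.
std::vector<std::pair<std::string, int>> zip(const std::vector<std::string>& strs,
                                             const std::vector<int>& ints) {
    const std::size_t n = std::min(strs.size(), ints.size());
    std::vector<std::pair<std::string, int>> out;
    out.reserve(n);
    std::transform(strs.begin(), strs.begin() + static_cast<std::ptrdiff_t>(n),
                   ints.begin(), std::back_inserter(out),
                   [](const std::string& s, int i) { return std::make_pair(s, i); });
    return out;
}
```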
I'll try and summarize what you want with some possible answers depending on your situation. You say you want a new vector that is essentially a zipped version of two other vectors which contain two heterogeneous types. Where you can access the two types as some sort of pair?
If you want to make this more efficient, you need to think about what you are using the new vector for? I can see three scenarios with what you are doing.
1. The new vector is a copy of your data so you can do stuff with it without affecting the original vectors (i.e. you still need the original two vectors).
2. The new vector is now the storage mechanism for your data (i.e. you no longer need the original two vectors).
3. You are simply coupling the vectors together to make use and representation easier (i.e. where they are stored doesn't actually matter).
1) Not much you can do aside from copying the data into your new vector. Explained more in Sam Varshavchik's answer.
3) You do something like Shakil's answer or here or some type of customized iterator.
2) Here you can make some optimisations where you do zero copying of the data with the use of a wrapper class. Note: A wrapper class works if you don't need to use the actual std::vector < std::pair > class. You can make a class where you move the data into it and create access operators for it. If you can do this, it also allows you to decompose the wrapper back into the original two vectors without copying. Something like this might suffice.
class StringIntContainer {
public:
// Moves from both arguments, so the caller's vectors should not be relied on afterwards.
StringIntContainer(std::vector<std::string>& _string_vec, std::vector<int>& _int_vec)
: string_vec_(std::move(_string_vec)), int_vec_(std::move(_int_vec))
{
assert(string_vec_.size() == int_vec_.size());
}
std::pair<std::string, int> operator[] (std::size_t _i) const
{
return std::make_pair(string_vec_[_i], int_vec_[_i]);
}
/* You may want methods that return reference to data so you can edit it*/
std::pair<std::vector<std::string>, std::vector<int>> Decompose()
{
return std::make_pair(std::move(string_vec_), std::move(int_vec_));
}
private:
std::vector<std::string> string_vec_;
std::vector<int> int_vec_;
};

How to avoid out of range exception when erasing vector in a loop?

My apologies for the lengthy explanation.
I am working on a C++ application that loads two files into two 2D string vectors, rearranges those vectors, builds another 2D string vector, and outputs it all in a report. The first element of the two vectors is a code that identifies the owner of the item and the item in the vector. I pass the owner's identification to the program on start and loop through the two vectors in a nested while loop to find those that have matching first elements. When I do, I build a third vector with components of the first two, and I then need to capture any that don't match.
I was using the syntax "vector.erase(vector.begin() + i)" to remove elements from the two original arrays when they matched. When the loop completed, I had my new third vector, and I was left with two vectors that only had elements, which didn't match and that is what I needed. This was working fine as I tried the various owners in the files (the program accepts one owner at a time). Then I tried one that generated an out of range error.
I could not figure out how to do the erase inside of the loop without throwing the error (it didn't seem that swap and pop or erase-remove were feasible solutions). I solved my problem for the program with two extra nested while loops after building my third vector in this one.
I'd like to know how to make the erase method work here (as it seems a simpler solution) or at least how to check for my out of range error (and avoid it). There were a lot of "rows" for this particular owner; so debugging was tedious. Before giving up and going on to the nested while solution, I determined that the second erase was throwing the error. How can I make this work, or are my nested whiles after the fact, the best I can do? Here is the code:
i = 0;
while (i < AIvector.size())
{
CHECK:
j = 0;
while (j < TRvector.size())
{
if (AIvector[i][0] == TRvector[j][0])
{
linevector.clear();
// Add the necessary data from both vectors to Combo_outputvector
for (x = 0; x < AIvector[i].size(); x++)
{
linevector.push_back(AIvector[i][x]); // add AI info
}
for (x = 3; x < TRvector[j].size(); x++) // Don't need the the first three elements; so start with x=3.
{
linevector.push_back(TRvector[j][x]); // add TR info
}
Combo_outputvector.push_back(linevector); // build the combo vector
// then erase these two current rows/elements from their respective vectors, this revises the AI and TR vectors
AIvector.erase(AIvector.begin() + i);
TRvector.erase(TRvector.begin() + j);
goto CHECK; // jump from here because the erase will have changed the two increments
}
j++;
}
i++;
}
As already discussed, your goto jumps to the wrong position. Simply moving it out of the first while loop should solve your problems. But can we do better?
Erasing from a vector can be done cleanly with std::remove and std::vector::erase for cheap-to-move objects, which vector and string both are. After some thought, however, I believe this isn't the best solution for you because you need a function that does more than just check if a certain row exists in both containers and that is not easily expressed with the erase-remove idiom.
Retaining the current structure, then, we can use iterators for the loop condition. We have a lot to gain from this, because std::vector::erase returns an iterator to the next valid element after the erased one. Not to mention that it takes an iterator anyway. Conditionally erasing elements in a vector becomes as simple as
auto it = vec.begin();
while (it != vec.end()) {
if (...)
it = vec.erase(it);
else
++it;
}
Because we assign erase's return value to it we don't have to worry about iterator invalidation. If we erase the last element, it returns vec.end() so that doesn't need special handling.
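To make the pattern concrete, here is a small sketch with the condition filled in. The function name and the "erase even numbers" condition are placeholders chosen for illustration; any condition on *it would work the same way.

```cpp
#include <vector>

// Erases every even value from vec using the iterator returned by erase().
void erase_even_values(std::vector<int>& vec) {
    auto it = vec.begin();
    while (it != vec.end()) {
        if (*it % 2 == 0)
            it = vec.erase(it);  // erase() returns the element after the erased one
        else
            ++it;                // advance only when nothing was erased
    }
}
```

Note that consecutive matches are handled correctly, because the iterator is not advanced after an erase.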
Your second loop can be removed altogether. The C++ standard library defines functions for searching inside containers. std::find_if searches a range for an element that satisfies a condition and returns an iterator to it, or end() if no such element exists. You haven't declared your types anywhere so I'm just going to assume the rows are std::vector<std::string>.
using row_t = std::vector<std::string>;
auto AI_it = AIVector.begin();
while (AI_it != AIVector.end()) {
// Find a row in TRVector with the same first element as *AI_it
auto TR_it = std::find_if (TRVector.begin(), TRVector.end(), [&AI_it](const row_t& row) {
return row[0] == (*AI_it)[0];
});
// If a matching row was found
if (TR_it != TRVector.end()) {
// Copy the line from AIVector
auto linevector = *AI_it;
// Do NOT do this unless you can guarantee size >= 3
assert(TR_it->size() >= 3);
std::copy(TR_it->begin() + 3, TR_it->end(),
std::back_inserter(linevector));
Combo_outputvector.emplace_back(std::move(linevector));
AI_it = AIVector.erase(AI_it);
TRVector.erase(TR_it);
}
else
++AI_it;
}
As you can see, switching to iterators completely sidesteps your initial problem of figuring out how not to access invalid indices. If you don't understand the syntax of the arguments for find_if, search for the term lambda. It is beyond the scope of this answer to explain what they are.
A few notable changes:
linevector is now encapsulated properly. There is no reason for it to be declared outside this scope and reused.
linevector simply copies the desired row from AIVector rather than push_back every element in it, as long as Combo_outputvector (and therefore linevector) contains the same type as AIVector and TRVector.
std::copy is used instead of a for loop. Apart from being slightly shorter, it is also more generic, meaning you could change your container type to anything that supports random access iterators and inserting at the back, and the copy would still work.
linevector is moved into Combo_outputvector. This can be a huge performance optimization if your vectors are large!
It is possible that you used a non-encapsulated linevector because you wanted to keep a copy of the last inserted row outside of the loop. That would prohibit moving it, however. For this reason it is faster and more descriptive to do it as I showed above and then simply do the following after the loop.
auto linevector = Combo_outputvector.back();

Is it at all possible to erase from a vector with C++11's for loops?

Alright. For the sake of other (more simple but not explanatory enough) questions that this might look like, I am not asking if this is possible or impossible (because I found that out already), I am asking if there is a lighter alternative to my question.
What I have is what would be considered a main class, and in that main class, there is a variable that references to a 'World Map' class. In essence, this 'WorldMap' class is a container of other class variables. The main class does all of the looping and updates all of the respective objects that are active. There are times in this loop that I need to delete an object of a vector that is deep inside a recursive set of containers (As shown in the code provided). It would be extremely tedious to repeatedly have to reference the necessary variable as a pointer to another pointer (and so on) to point to the specific object I need, and later erase it (this was the concept I used before switching to C++11) so instead I have a range for loop (also shown in the code). My example code shows the idea that I have in place, where I want to cut down on the tedium as well as make the code a lot more readable.
This is the example code:
struct item{
int stat;
};
struct character{
int otherStat;
std::vector<item> myItems;
};
struct charContainer{
std::map<int, character> myChars;
};
int main(){
//...
charContainer box;
//I want to do something closer to this
for(item targItem: box.myChars[iter].myItems){
//Then I only have to use targItem as the reference
if(targItem.isFinished)
box.myChars[iter].myItems.erase(targItem);
}
//Instead of doing this
for(int a=0;a<box.myChars[iter].myItems.size();a++){
//Then I have to repeatedly use box.myChars[iter].myItems[a]
if(box.myChars[iter].myItems[a].isFinished)
box.myChars[iter].myItems.erase(box.myChars[iter].myItems[a]);
}
}
TLDR: I want to remove the tedium of repeatedly calling the full reference by using the new range for loops shown in C++11.
EDIT: I am not trying to delete the elements all at once. I am asking how I would delete them in the matter of the first loop. I am deleting them when I am done with them externally (via an if statement). How would I delete specific elements, NOT all of them?
If you simply want to clear an std::vector, there is a very simple method you can use:
std::vector<item> v;
// Fill v with elements...
v.clear(); // Removes all elements from v.
In addition to this, I'd like to point out that [1] to erase an element in a vector requires the usage of iterators, and [2] even if your approach was allowed, erasing elements from a vector inside a for loop is a bad idea if you are not careful. Suppose your vector has 5 elements:
std::vector<int> v = { 1, 2, 3, 4, 5 };
Then your loop would have the following effect:
First iteration: a == 0, size() == 5. We remove the first element, then the vector will contain {2, 3, 4, 5}
Second iteration: a == 1, size() == 4. We then remove the second element, then the vector will contain {2,4,5}
Third iteration: a == 2, size() == 3. We remove the third element, and we are left with the final result {2,4}.
Since this does not actually empty the vector, I suppose it is not what you were looking for.
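The walkthrough can be reproduced with a few lines. This sketch assumes the loop erases the element at index a on every iteration, as in the three steps described; the function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Erases v[a] for a = 0, 1, 2, ... for as long as a < v.size().
// Every erase shifts the remaining elements left, so every other element is skipped.
std::vector<int> erase_by_increasing_index(std::vector<int> v) {
    for (std::size_t a = 0; a < v.size(); ++a) {
        v.erase(v.begin() + static_cast<std::ptrdiff_t>(a));
    }
    return v;
}
```

Called with {1, 2, 3, 4, 5}, it returns {2, 4}, matching the steps listed.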
If instead you have some particular condition that you want to apply to remove the elements, it is very easily applied in C++11 in the following way:
std::vector<MyType> v = { /* initialize vector */ };
// The following is a lambda, which is a function you can store in a variable.
// Here we use it to represent the condition that should be used to remove
// elements from the vector v.
auto isToRemove = [](const MyType & value){
return /* true if to remove, false if not */
};
// A vector can remove multiple elements at the same time using its method erase().
// Erase will remove all elements within a specified range. We use this method
// together with another method provided by the standard library: remove_if.
// What it does is shift every element for which the predicate returns false
// to the front of the range, and return an iterator to the leftover tail at the end.
v.erase( std::remove_if( std::begin(v), std::end(v), isToRemove ), std::end(v) );
// Done!
I am deleting them when I am done with them externally (via an if statement). How would I delete specific elements, NOT all of them?
In my opinion, you're looking at this the wrong way. Writing loops to delete items from a sequence container is always problematic and not recommended. Strive to stay away from removing items in this fashion.
When you work with containers, you should strategically set up your code so that you place the deleted or "about to be deleted" items in a part of the container that is easily accessed, away from the items in the container that you do not want to delete. At the time you actually do want to remove them, you know where they are and thus can call some function to expel them from the container.
One answer was already given, and that is to use the erase-remove(_if) idiom. When you call remove or remove_if, the items you want to keep are shifted to the front of the container, and whatever is left at the end is in a valid but unspecified state. The return value of remove(_if) is the iterator to the start of that leftover range. You then feed this iterator to the vector::erase method to delete those items permanently from the container.
The other solution (but probably less used) is the std::partition algorithm. std::partition can move the "bad" items to the end of the container and, unlike remove(_if), it leaves those items intact (i.e. you can leave them at the end of the container and still use them safely). Then later on, you can remove them as you wish in a separate step, since std::partition also returns an iterator.
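A minimal sketch of the std::partition approach, assuming the item struct carries an isFinished flag (the question's code tests that member, although its struct does not declare it). The function name is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

struct Item {
    int stat;
    bool isFinished;  // assumed member, used by the question's if statement
};

// Moves unfinished items to the front and finished ones to the back.
// Returns an iterator to the first finished item; all items stay usable.
std::vector<Item>::iterator move_finished_to_back(std::vector<Item>& items) {
    return std::partition(items.begin(), items.end(),
                          [](const Item& item) { return !item.isFinished; });
}
```

The caller can keep the returned iterator, work with the finished items for a while, and later call items.erase(mid, items.end()). std::partition does not preserve relative order; std::stable_partition does, at some extra cost.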
Why not have a standard iterator iterating over the vector? That way you can delete the element by passing the iterator to .erase(), which returns the next valid iterator. When that iterator equals end(), your loop exits.
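Applied to the nested containers from the question, the iterator loop could look like the sketch below. Binding a reference to the inner vector once removes the need to spell out the full access path on every line. The isFinished member and the function name are assumptions; the struct names mirror the question's.

```cpp
#include <map>
#include <vector>

struct Item { int stat; bool isFinished; };  // isFinished is an assumed member
struct Character { int otherStat; std::vector<Item> myItems; };
struct CharContainer { std::map<int, Character> myChars; };

// Erases the finished items of one character, leaving the others untouched.
void remove_finished(CharContainer& box, int key) {
    auto& items = box.myChars[key].myItems;  // alias for the long access path
    for (auto it = items.begin(); it != items.end(); ) {
        if (it->isFinished)
            it = items.erase(it);  // continue from the element after the erased one
        else
            ++it;
    }
}
```

As in the question's code, operator[] on the map creates an empty character if the key is missing; use find() instead if that is not wanted.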

How to remove almost duplicates from a vector in C++

I have an std::vector of floats that I want to not contain duplicates but the math that populates the vector isn't 100% precise. The vector has values that differ by a few hundredths but should be treated as the same point. For example here's some values in one of them:
...
X: -43.094505
X: -43.094501
X: -43.094498
...
What would be the best/most efficient way to remove duplicates from a vector like this.
First sort your vector using std::sort. Then use std::unique with a custom predicate to remove the duplicates, erasing the leftover tail that std::unique returns.
v.erase(std::unique(v.begin(), v.end(),
[](double l, double r) { return std::abs(l - r) < 0.01; }),
v.end());
// treats any numbers that differ by less than 0.01 as equal
Sorting is always a good first step. Use std::sort().
Remove not sufficiently unique elements: std::unique().
Last step, call resize() and maybe also shrink_to_fit().
If you want to preserve the order, do the previous 3 steps on a copy (omit shrinking though).
Then use std::remove_if with a lambda, checking for existence of the element in the copy (binary search) (don't forget to remove it if found), and only retain elements if found in the copy.
I say std::sort() it, then go through it one by one and remove the values within certain margin.
You can have a separate write iterator to the same vector and one resize operation at the end - instead of calling erase() for each removed element or having another destination copy for increased performance and smaller memory usage.
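A sketch of the sort-then-compact idea with a separate write position and a single resize at the end. The policy used here (compare each value with the last value that was kept) and the tolerance parameter are my assumptions; other policies, such as comparing with the previous neighbour, give different results on long runs of close values.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sorts v, then keeps a value only if it is at least `tolerance`
// above the last value that was kept.
void remove_near_duplicates(std::vector<double>& v, double tolerance) {
    std::sort(v.begin(), v.end());
    std::size_t write = 0;  // next slot to write a kept value into
    for (std::size_t read = 0; read < v.size(); ++read) {
        if (write == 0 || v[read] - v[write - 1] >= tolerance) {
            v[write++] = v[read];
        }
    }
    v.resize(write);  // one resize instead of one erase() per duplicate
}
```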
If your vector cannot contain duplicates, it may be more appropriate to use an std::set. You can then use a custom comparison object to consider small changes as being inconsequential.
You could compare like this:
bool isAlmostEquals(const double &f1, const double &f2)
{
double allowedDif = xxxx;
return (std::abs(f1 - f2) <= allowedDif);
}
but it depends on your comparison range, and double precision is not on your side.
If your vector is sorted, you could use std::unique with this function as the predicate.
I would do the following:
Create a set<double>
go through your vector in a loop or using a functor
Round each element and insert into the set
Then you can swap your vector with an empty vector
Copy all elements from the set to the empty vector
The complexity of this approach will be n * log(n) but it's simpler and can be done in a few lines of code. The memory consumption will double from just storing the vector. In addition set consumes slightly more memory per each element than vector. However, you will destroy it after using.
std::vector<double> v;
v.push_back(-43.094505);
v.push_back(-43.094501);
v.push_back(-43.094498);
v.push_back(-45.093435);
std::set<double> s;
std::vector<double>::const_iterator it = v.begin();
for(;it != v.end(); ++it)
s.insert(floor(*it));
std::vector<double>().swap(v);
v.resize(s.size());
std::copy(s.begin(), s.end(), v.begin());
The problem with most answers so far is that you have an unusual "equality". If A and B are similar but not identical, you want to treat them as equal. Basically, A and A+epsilon still compare as equal, but A+2*epsilon does not (for some unspecified epsilon). Or, depending on your algorithm, A*(1+epsilon) does and A*(1+2*epsilon) does not.
That does mean that A+epsilon compares equal to A+2*epsilon. Thus A = B and B = C does not imply A = C. This breaks common assumptions in <algorithm>.
You can still sort the values, that is a sane thing to do. But you have to consider what to do with a long range of similar values in the result. If the range is long enough, the difference between the first and last can still be large. There's no simple answer.
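The non-transitivity is easy to see with three values. This is a sketch; the function name almost_equal and the numbers are chosen only for illustration.

```cpp
#include <cmath>

// "Equal" means closer than eps. This relation is not transitive.
bool almost_equal(double a, double b, double eps) {
    return std::abs(a - b) < eps;
}
```

With eps = 0.01, the values 0.0 and 0.006 compare equal, and so do 0.006 and 0.012, yet 0.0 and 0.012 do not. A chain of such values can therefore drift arbitrarily far while every neighbouring pair still counts as "equal".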