I am working on a merge sort function. I got the sort down - I am trying to get my merge part finished. Assume that I am learning C++, have cursory knowledge of pointers, and don't understand all the rules of std::vector iterators (or of std::vector itself, for that matter).
Assume that num is the size of the original std::vector, whose values were copied (std::copy) from an array "int ar[num]." Assume that farray has the values of (0 to (num / 2)) and sarray has the values of ((num / 2) to num).
int num = original.size();
std::vector<int> final(num);
for (std::vector<int>::iterator it = farray.begin(); it != farray.end(); ++it) {
for (std::vector<int>::iterator iter = sarray.begin(); iter != sarray.end(); ++iter) {
if (*it > *iter) final.push_back(*it);
else
final.push_back(*iter);
}
}
This code compiles and my latest stable build of Bloodshed Dev-C++ does not throw any warnings or errors. I don't know if this is valid, I still need to try and cout all the values of final. I just want to know if this is common, prone to errors, or just bad style. And, if so, how you would
It's valid... but a for loop probably isn't what you want. When you use two for loops, your inner loop keeps going back to the start every time the outer loop loops. So if your vectors contain:
farray: 10 9 8 4 3
sarray: 7 6 4 3 1
Then your final array will contain something like:
10 10 10 10 10 9 9 9 9 9 8 8 8 8 8 7 6 4 4 4 7 6 4 3 3
because you are testing every single combination, and adding the larger one to the final list. A better solution might be to remember an iterator for each list, and just use one loop. Rather than looping over a list, just go through both of them together - if sarray has the larger number, then increment your sarray iterator, and compare that with the old farray iterator. Stop your loop when both sarray and farray are empty.
vector<int>::iterator fiter = farray.begin();
vector<int>::iterator siter = sarray.begin();
vector<int> final;
// Let's traverse both farray and sarray.
// We'll want to stop this loop once we've traversed both lists,
// so keep going while either list still has entries left.
while (fiter != farray.end() || siter != sarray.end())
{
if (fiter == farray.end())
{
// we must have gone right through farray -
// so use the value from sarray, and go to the next one
final.push_back(*siter);
siter++;
}
else if (siter == sarray.end())
{
// we must have gone right through sarray -
// so use the value from farray, and go to the next one
final.push_back(*fiter);
fiter++;
}
else if (*siter > *fiter)
{
// siter is the bigger of the two - add it to the final list, and
// go to the next sarray entry
final.push_back(*siter);
siter++;
}
else // *fiter >= *siter
{
// fiter is the bigger of the two - add it to the final list, and
// go to the next farray entry
final.push_back(*fiter);
fiter++;
}
}
I haven't tested it - and if this is for homework, then please try to understand what I've done, go away and write it yourself, rather than copy+paste.
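For comparison, here is a minimal self-contained sketch of the same single-loop idea. It assumes, as in the example above, that both halves are already sorted from largest to smallest; the function name merge_descending is made up for this illustration. Instead of testing for an exhausted list inside the loop, it stops as soon as one list runs out and then appends whatever is left of the other.

```cpp
#include <vector>

// Hypothetical helper: merges two vectors that are each sorted in
// descending order into one descending vector.
std::vector<int> merge_descending(const std::vector<int>& farray,
                                  const std::vector<int>& sarray) {
    std::vector<int> result;
    result.reserve(farray.size() + sarray.size());

    auto fiter = farray.begin();
    auto siter = sarray.begin();

    // Compare the current heads while both lists still have entries.
    while (fiter != farray.end() && siter != sarray.end()) {
        if (*siter > *fiter) {
            result.push_back(*siter);
            ++siter;
        } else {  // *fiter >= *siter
            result.push_back(*fiter);
            ++fiter;
        }
    }

    // At most one of these ranges is non-empty; append the leftovers.
    result.insert(result.end(), fiter, farray.end());
    result.insert(result.end(), siter, sarray.end());
    return result;
}
```

With farray = 10 9 8 4 3 and sarray = 7 6 4 3 1 this should give 10 9 8 7 6 4 4 3 3 1.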
Merge sort algorithm aside, a nested for loop with iterators is just as valid as nested for loops with two variables i and j.
You can nest loops of any kind (for, while, do while) as long as you don't reuse the loop variables. If you do reuse one, the code will compile but may fail miserably at runtime. Although modern C and C++ technically allow the same name for nested loop variables, doing so is confusing and should be avoided.
It's no more or less prone to errors than a single loop except for the already mentioned problem with the reuse of loop variables.
Read more about the limits of nested loops.
Nesting for loops is a totally legit way to do things. For example, it's the classic "old school" way to traverse a 2D array - one loop goes down the y axis, and the other loop goes down the x axis.
Nowadays, with those kids and their for each loops and iterators and mapping functions there are arguably "better" ways to do it, (for some definition of better) but nesting loops works just fine. Using C++ or pointers doesn't change that.
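As a small illustration of the 2D traversal described above, here is a sketch that assumes the grid is stored as a vector of rows. The name sum_grid is hypothetical and only exists for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: adds up every cell of a 2D grid stored as rows.
int sum_grid(const std::vector<std::vector<int>>& grid) {
    int total = 0;
    // Outer loop walks down the y axis (rows).
    for (std::size_t y = 0; y < grid.size(); ++y) {
        // Inner loop walks along the x axis (cells of the current row).
        for (std::size_t x = 0; x < grid[y].size(); ++x) {
            total += grid[y][x];
        }
    }
    return total;
}
```

The inner loop restarts for every row, which is exactly what you want here and exactly what you don't want in the merge step of a merge sort.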
Yes, you can do this. And yes, it is often prone to errors. In fact, writing loops is itself prone to errors, which is one argument to use the algorithms in the STL like for_each, copy and transform.
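To show what that looks like, here is a short sketch that replaces a hand-written loop with std::transform. The function name doubled is invented for this example; the point is only that the algorithm handles the iteration, so there is no loop variable to get wrong.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns a copy of the input with every value doubled.
std::vector<int> doubled(const std::vector<int>& in) {
    std::vector<int> out;
    out.reserve(in.size());
    // std::transform visits each element once and appends the result.
    std::transform(in.begin(), in.end(), std::back_inserter(out),
                   [](int value) { return value * 2; });
    return out;
}
```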
Yes, you can nest loops, or other statements, to pretty much whatever depth you want (within reason; there are limits, as mentioned in another answer, but they're way above what you should ever need).
Related
My apologies for the lengthy explanation.
I am working on a C++ application that loads two files into two 2D string vectors, rearranges those vectors, builds another 2D string vector, and outputs it all in a report. The first element of the two vectors is a code that identifies the owner of the item and the item in the vector. I pass the owner's identification to the program on start and loop through the two vectors in a nested while loop to find those that have matching first elements. When I do, I build a third vector with components of the first two, and I then need to capture any that don't match.
I was using the syntax "vector.erase(vector.begin() + i)" to remove elements from the two original arrays when they matched. When the loop completed, I had my new third vector, and I was left with two vectors containing only the elements that didn't match, which is what I needed. This was working fine as I tried the various owners in the files (the program accepts one owner at a time). Then I tried one that generated an out of range error.
I could not figure out how to do the erase inside of the loop without throwing the error (it didn't seem that swap and pop or erase-remove were feasible solutions). I solved my problem for the program with two extra nested while loops after building my third vector in this one.
I'd like to know how to make the erase method work here (as it seems a simpler solution) or at least how to check for my out of range error (and avoid it). There were a lot of "rows" for this particular owner; so debugging was tedious. Before giving up and going on to the nested while solution, I determined that the second erase was throwing the error. How can I make this work, or are my nested whiles after the fact, the best I can do? Here is the code:
i = 0;
while (i < AIvector.size())
{
CHECK:
j = 0;
while (j < TRvector.size())
{
if (AIvector[i][0] == TRvector[j][0])
{
linevector.clear();
// Add the necessary data from both vectors to Combo_outputvector
for (x = 0; x < AIvector[i].size(); x++)
{
linevector.push_back(AIvector[i][x]); // add AI info
}
for (x = 3; x < TRvector[j].size(); x++) // Don't need the first three elements; so start with x=3.
{
linevector.push_back(TRvector[j][x]); // add TR info
}
Combo_outputvector.push_back(linevector); // build the combo vector
// then erase these two current rows/elements from their respective vectors, this revises the AI and TR vectors
AIvector.erase(AIvector.begin() + i);
TRvector.erase(TRvector.begin() + j);
goto CHECK; // jump from here because the erase will have changed the two increments
}
j++;
}
i++;
}
As already discussed, your goto jumps to the wrong position. Simply moving it out of the first while loop should solve your problems. But can we do better?
Erasing from a vector can be done cleanly with std::remove and the vector's erase member function (the erase-remove idiom) for cheap-to-move objects, which vector and string both are. After some thought, however, I believe this isn't the best solution for you because you need a function that does more than just check if a certain row exists in both containers and that is not easily expressed with the erase-remove idiom.
Retaining the current structure, then, we can use iterators for the loop condition. We have a lot to gain from this, because std::vector::erase returns an iterator to the next valid element after the erased one. Not to mention that it takes an iterator anyway. Conditionally erasing elements in a vector becomes as simple as
auto it = vec.begin();
while (it != vec.end()) {
if (...)
it = vec.erase(it);
else
++it;
}
Because we assign erase's return value to it we don't have to worry about iterator invalidation. If we erase the last element, it returns vec.end() so that doesn't need special handling.
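Here is the same pattern as a complete sketch, with a concrete condition filled in. The function name erase_even and the "is even" test are placeholders chosen for this example.

```cpp
#include <vector>

// Hypothetical helper: removes every even number from vec in place.
void erase_even(std::vector<int>& vec) {
    auto it = vec.begin();
    while (it != vec.end()) {
        if (*it % 2 == 0) {
            // erase() returns an iterator to the element after the erased
            // one, so "it" stays valid and we must not increment it here.
            it = vec.erase(it);
        } else {
            ++it;
        }
    }
}
```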
Your second loop can be removed altogether. The C++ standard defines functions for searching inside STL containers. std::find_if searches for a value in a container that satisfies a condition and returns an iterator to it, or end() if it doesn't exist. You haven't declared your types anywhere so I'm just going to assume the rows are std::vector<std::string>.
using row_t = std::vector<std::string>;
auto AI_it = AIVector.begin();
while (AI_it != AIVector.end()) {
// Find a row in TRVector with the same first element as *AI_it
auto TR_it = std::find_if (TRVector.begin(), TRVector.end(), [&AI_it](const row_t& row) {
return row[0] == (*AI_it)[0];
});
// If a matching row was found
if (TR_it != TRVector.end()) {
// Copy the line from AIVector
auto linevector = *AI_it;
// Do NOT do this unless you can guarantee size >= 3
assert(TR_it->size() >= 3);
std::copy(TR_it->begin() + 3, TR_it->end(),
std::back_inserter(linevector));
Combo_outputvector.emplace_back(std::move(linevector));
AI_it = AIVector.erase(AI_it);
TRVector.erase(TR_it);
}
else
++AI_it;
}
As you can see, switching to iterators completely sidesteps your initial problem of figuring out how not to access invalid indices. If you don't understand the syntax of the arguments for find_if, search for the term lambda. It is beyond the scope of this answer to explain what they are.
A few notable changes:
linevector is now encapsulated properly. There is no reason for it to be declared outside this scope and reused.
linevector simply copies the desired row from AIVector rather than calling push_back for every element in it, as long as Combo_outputvector (and therefore linevector) contains the same type as AIVector and TRVector.
std::copy is used instead of a for loop. Apart from being slightly shorter, it is also more generic, meaning you could change your container type to anything that supports random access iterators and inserting at the back, and the copy would still work.
linevector is moved into Combo_outputvector. This can be a huge performance optimization if your vectors are large!
It is possible that you used a non-encapsulated linevector because you wanted to keep a copy of the last inserted row outside of the loop. That would prohibit moving it, however. For this reason it is faster and more descriptive to do it as I showed above and then simply do the following after the loop.
auto linevector = Combo_outputvector.back();
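If you want something you can compile and experiment with, here is a self-contained sketch of the approach. The function name combine_matching and the table_t alias are made up for this example, and it assumes that every row has at least one element. Rows in the second table with three or fewer elements simply contribute nothing extra.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using row_t = std::vector<std::string>;
using table_t = std::vector<row_t>;

// Hypothetical helper: pairs up rows of ai and tr that share the same first
// element, moves the combined rows into the result, and erases the matched
// rows from both inputs. Assumes every row is non-empty.
table_t combine_matching(table_t& ai, table_t& tr) {
    table_t combo;
    auto ai_it = ai.begin();
    while (ai_it != ai.end()) {
        const std::string& key = (*ai_it)[0];
        auto tr_it = std::find_if(tr.begin(), tr.end(),
            [&key](const row_t& row) { return row[0] == key; });

        if (tr_it == tr.end()) {
            ++ai_it;  // no partner: leave the row where it is
            continue;
        }

        row_t line = *ai_it;  // all of the ai row
        if (tr_it->size() > 3) {
            // everything after the first three elements of the tr row
            line.insert(line.end(), tr_it->begin() + 3, tr_it->end());
        }
        combo.push_back(std::move(line));

        // erase() hands back the next valid iterator, so no index math.
        ai_it = ai.erase(ai_it);
        tr.erase(tr_it);
    }
    return combo;
}
```

After the call, ai and tr hold only the rows that found no partner, which is what the original nested loops were trying to achieve.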
I am a bit curious about vector optimization and have a couple of questions about it. (I am still a beginner in programming.)
example:
struct GameInfo{
EnumType InfoType;
// Other info...
};
int _lastPosition;
// _gameInfoV is sorted beforehand
std::vector<GameInfo> _gameInfoV;
// The tick function is called every game frame (in "perfect" condition it's every 1.0/60 second)
void BaseClass::tick()
{
for (unsigned int i = _lastPosition; i < _gameInfoV.size(); i++){
auto & info = _gameInfoV[i];
if( !info.bhasbeenAdded ){
if( DoWeNeedNow() ){
_lastPosition++;
info.bhasbeenAdded = true;
_otherPointer->DoSomething(info.InfoType);
// Do something more with "info"....
}
else return; //Break the cycle since we don't need now other "info"
}
}
}
The _gameInfoV vector size can be between 2000 and 5000.
My main 2 questions are:
Is it better to leave it the way it is, or is it better to split it into smaller chunks, one for each different GameInfo.InfoType?
Is it worth the hassle of storing the last start position index of the vector instead of iterating from the beginning?
Note that if using smaller vectors there will be like 3 to 6 of them
The third thing is that I am not using vector iterators. Would it be safe to use them like this?
std::vector<GameInfo>::iterator it = _gameInfoV.begin() + _lastPosition;
for (; it != _gameInfoV.end(); ++it){ // start from the stored position, don't reset to begin()
//Do something
}
Note: It will be used in smartphones, so every optimization will be appreciated, when targeting weaker phones.
-Thank you
Don't; except if you frequently move memory around
It is no hassle if you do it correctly:
std::vector<GameInfo>::iterator _lastPosition(_gameInfoV.begin());
// ...
for (std::vector<GameInfo>::iterator info=_lastPosition; info!=_gameInfoV.end(); ++info)
{
if (!info->bhasbeenAdded)
{
if (DoWeNeedNow())
{
++_lastPosition;
_otherPointer->DoSomething(info->InfoType);
// Do something more with "info"....
}
else return; //Break the cycle since we don't need now other "i
}
}
Breaking one vector up into several smaller vectors in general doesn't improve performance. It could even slightly degrade performance because the compiler has to manage more variables, which take up more CPU registers etc.
I don't know about gaming so I don't understand the implication of GameInfo.InfoType. Your processing time and CPU resource requirements are going to increase if you do more total iterations through loops (where each loop iteration performs the same type of operation). So if separating the vectors causes you to avoid some loop iterations because you can skip entire vectors, that's going to increase performance of your app.
Iterators are the most secure way to iterate through containers. But for a vector I often just use the index operator [] and my own indexer (a plain old unsigned integer).
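To make the "remember where you stopped" idea concrete, here is a minimal sketch that keeps a plain index as the cursor. Everything in it is hypothetical: Event stands in for GameInfo, PendingQueue for the class that owns the vector, and the budget parameter for whatever DoWeNeedNow() decides. One reason to prefer an index over a stored iterator is that an index stays meaningful if the vector ever reallocates, whereas iterators are invalidated by that.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for GameInfo.
struct Event {
    int type;
    bool handled;
};

// Hypothetical owner of the sorted event list.
class PendingQueue {
public:
    explicit PendingQueue(std::vector<Event> events)
        : events_(std::move(events)) {}

    // Handles at most "budget" events, resuming where the last call stopped.
    // Returns how many events were handled in this call.
    std::size_t tick(std::size_t budget) {
        std::size_t done = 0;
        while (next_ < events_.size() && done < budget) {
            events_[next_].handled = true;
            ++next_;
            ++done;
        }
        return done;
    }

    std::size_t remaining() const { return events_.size() - next_; }

private:
    std::vector<Event> events_;
    std::size_t next_ = 0;  // index of the first event not yet handled
};
```

Each call to tick only touches the events it actually handles, so the cost per frame does not depend on how many events were already processed.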
I have 2 structs, one simply has 2 values:
struct combo {
int output;
int input;
};
And another that sorts the input element based on the index of the output element:
struct organize {
bool operator()(combo const &a, combo const &b)
{
return a.input < b.input;
}
};
Using this:
sort(myVector.begin(), myVector.end(), organize());
What I'm trying to do with this is iterate through the input variable and check whether each element is equal to another input, 'in'.
If it is equal, I want to take the value of output at the same index where the match was found in input and insert it into another temp vector.
I originally went with a simpler solution (when I wasn't using structs and simply had 2 vectors, one input and one output) and had this in a function called copy:
for(int i = 0; i < input.size(); ++i){
if(input[i] == in){
temp.push_back(output[i]);
}
}
Now this code did work exactly how I needed it, the only issue is it is simply too slow. It can handle 10 integer inputs, or 100 inputs but around 1000 it begins to slow down taking an extra 5 seconds or so, then at 10,000 it takes minutes, and you can forget about 100,000 or 1,000,000+ inputs.
So, I asked how to speed it up on here (just the function iterator) and somebody suggested sorting the input vector which I did, implemented their suggestion of using upper/lower bound, changing my iterator to this:
std::vector<int>::iterator it = input.begin();
auto lowerIt = std::lower_bound(input.begin(), input.end(), in);
auto upperIt = std::upper_bound(input.begin(), input.end(), in);
for (auto it = lowerIt; it != upperIt; ++it)
{
temp.push_back(output[it - input.begin()]);
}
And it worked, it made it much faster, I still would like it to be able to handle 1,000,000+ inputs in seconds but I'm not sure how to do that yet.
I then realized that I can't have the input vector sorted, what if the inputs are something like:
input.push_back(10);
input.push_back(-1);
output.push_back(1);
output.push_back(2);
Well then we have 10 in input corresponding to 1 in output, and -1 corresponding to 2. Obviously 10 doesn't come before -1 so sorting it smallest to largest doesn't really work here.
So I found a way to sort the input based on the output. So no matter how you organize input, the indexes match each other based on what order they were added.
My issue is, I have no clue how to iterate through just input with the same upper/lower bound iterator above. I can't seem to call upon just the input variable of myVector, I've tried something like:
std::vector<combo>::iterator it = myVector.input.begin();
But I get an error saying there is no member 'input'.
How can I iterate through just input so I can apply the upper/lower bound iterator to this new way with the structs?
Also I explained everything so everyone could get the best idea of what I have and what I'm trying to do, also maybe somebody could point me in a completely different direction that is fast enough to handle those millions of inputs. Keep in mind I'd prefer to stick with vectors because not doing so would involve me changing 2 other files to work with things that aren't vectors or lists.
Thank you!
I think that if you sort it in smallest to largest (x is an integer after all) that you should be able to use std::adjacent_find to find duplicates in the array, and process them properly. For the performance issues, you might consider using reserve to preallocate space for your large vector, so that your push back operations don't have to reallocate memory as often.
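On the question of how to search "just input" once the two values live in one struct: std::lower_bound and std::upper_bound accept a comparison function, so you can compare the input member against a plain int without needing a separate vector. The sketch below assumes the combo struct from the question; sort_by_input and outputs_for are names invented for this example. std::stable_sort is used so that entries with equal input keep the order in which they were added.

```cpp
#include <algorithm>
#include <vector>

struct combo {
    int output;
    int input;
};

// Hypothetical helper: sorts by input; equal inputs keep insertion order.
void sort_by_input(std::vector<combo>& v) {
    std::stable_sort(v.begin(), v.end(),
        [](const combo& a, const combo& b) { return a.input < b.input; });
}

// Hypothetical helper: collects the output of every entry whose input
// equals "in". Requires v to be sorted by input already.
std::vector<int> outputs_for(const std::vector<combo>& v, int in) {
    // lower_bound calls comp(element, value) ...
    auto lowerIt = std::lower_bound(v.begin(), v.end(), in,
        [](const combo& c, int value) { return c.input < value; });
    // ... while upper_bound calls comp(value, element).
    auto upperIt = std::upper_bound(lowerIt, v.end(), in,
        [](int value, const combo& c) { return value < c.input; });

    std::vector<int> temp;
    for (auto it = lowerIt; it != upperIt; ++it) {
        temp.push_back(it->output);
    }
    return temp;
}
```

Because each output travels with its input inside the struct, sorting no longer breaks the pairing that the two separate vectors relied on.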
As the title says, I have in my mind some methods to do it but I don't know which is fastest.
So let's say that we have a: vector<int> vals with some values
1
After my vals are added
sort(vals.begin(), vals.end());
auto last = unique(vals.begin(), vals.end());
vals.erase(last, vals.end());
2
Convert to set after my vals are added:
set<int> s( vals.begin(), vals.end() );
vals.assign( s.begin(), s.end() );
3
When I add my vals, I check whether the value is already in my vector:
if( find(vals.begin(), vals.end(), myVal)==vals.end() )
// add my val
4
Use a set from start
Ok, I've got these 4 methods, my questions are:
1 From 1, 2 and 3 which is the fastest?
2 Is 4 faster than the first 3?
3 At 2 after converting the vector to set, is it more convenient to use the set to do what I need to do, or should I do the vals.assign( .. ) and continue with my vector?
Question 1: Both 1 and 2 are O(n log n), 3 is O(n^2). Between 1 and 2, it depends on the data.
Question 2: 4 is also O(n log n) and can be better than 1 and 2 if you have lots of duplicates, because it only stores one copy of each. Imagine a million values that are all equal.
Question 3: Well, that really depends on what you need to do.
The only thing that can be said without knowing more is that your alternative number 3 is asymptotically worse than the others.
If you're using C++11 and don't need ordering, you can use std::unordered_set, which is a hash table and can be significantly faster than std::set.
Option 1 is going to beat all the others. The complexity is just O(N log N) and the contiguous memory of vector keeps the constant factors low.
std::set typically suffers a lot from non-contiguous allocations. It's not just slow to access those, just creating them takes significant time as well.
These methods all have their shortcomings although (1) is worth looking at.
But, take a look at this 5th option: Bear in mind that you can access the vector's data buffer using the data() function. Then, bearing in mind that no reallocation will take place since the vector will only ever get smaller, apply the algorithm that you learn at school:
unduplicate(vals.data(), vals.size());
void unduplicate(int* arr, std::size_t length) /*Reference: Gang of Four, I think*/
{
int *it, *end = arr + length - 1;
for (it = arr + 1; arr < end; arr++, it = arr + 1){
while (it <= end){
if (*it == *arr){
*it = *end--;
} else {
++it;
}
}
}
}
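And resize the vector at the end if that is what's required. This is never worse than O(N^2) and needs no extra memory, but note that O(N^2) is asymptotically slower than the O(N log N) of the sort then remove approach, so measure before preferring it for large inputs. It also does not preserve the order of the elements.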
Your 4th option might be an idea if you can adopt it. Profile the performance. Otherwise use my algorithm from the 1960s.
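Since the function above returns nothing, the caller has no way to know how many values survived. Here is a hedged variant of the same idea that reports the new length so the vector can be resized. The names unduplicate_in_place and the vector overload of unduplicate are introduced for this sketch and are not part of the original snippet.

```cpp
#include <cstddef>
#include <vector>

// Variant of the swap-with-last idea that returns the number of unique
// values left at the front of the buffer. Element order is not preserved.
std::size_t unduplicate_in_place(int* arr, std::size_t length) {
    int* end = arr + length;  // one past the last value still in use
    for (int* cur = arr; cur < end; ++cur) {
        int* it = cur + 1;
        while (it < end) {
            if (*it == *cur) {
                // Overwrite the duplicate with the last live value and
                // shrink the live range; re-check the same slot next.
                --end;
                *it = *end;
            } else {
                ++it;
            }
        }
    }
    return static_cast<std::size_t>(end - arr);
}

// Convenience wrapper: removes duplicates and shrinks the vector to fit.
void unduplicate(std::vector<int>& vals) {
    vals.resize(unduplicate_in_place(vals.data(), vals.size()));
}
```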
I had a similar problem recently, and experimented with 1, 2, and 4, as well as with an unordered_set version of 4. It turned out that the best performance came from the latter: 4 with unordered_set in place of set.
BTW, that empirical finding is not too surprising if one considers that both set and sort were a bit of overkill: they guaranteed relative order of unequal elements. For example inputs 4,3,5,2,4,3 would lead to sorted output of unique values 2,3,4,5. This is unnecessary if you can live with unique values in arbitrary order, i.e. 3,4,2,5. When you use unordered_set it doesn't guarantee the order, only uniqueness, and therefore it doesn't have to perform the additional work of ensuring the order of different elements.
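For reference, a minimal sketch of the unordered_set approach. This particular variant goes one small step further than the text above and keeps the values in the order they were first seen, which costs nothing extra; the function name unique_first_seen is made up for the example.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the distinct values of vals, in the order
// in which each value first appears.
std::vector<int> unique_first_seen(const std::vector<int>& vals) {
    std::unordered_set<int> seen;
    std::vector<int> out;
    for (int v : vals) {
        // insert() reports via .second whether the value was new.
        if (seen.insert(v).second) {
            out.push_back(v);
        }
    }
    return out;
}
```

For the inputs 4,3,5,2,4,3 this yields 4,3,5,2. Whether it actually beats sort plus unique on your data is something only a profile can tell you.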
I'm working with iterator arithmetic right now and I'm stuck on a small problem.
I need to compute the sum of the first and last elements of a vector<int>, followed by the sum of the second and last elements, the third and last elements, and so on.
Example:
Input numbers by user
1 2 3 4 5 6 7 8 9
Output should be
10 11 12 13 14 15 16 17
In general the code should do addition like that
1+9 2+9 3+9 4+9 5+9 6+9 7+9 ......
So basically I need the actual code for this arithmetic operation using iterators with the member functions .begin() and .end() only! I've tried many ways, but I can't figure out how to do this operation with only .begin() and .end(). I found other member functions, but they are explained in the standard library documentation, not at a basic knowledge level. So I need help writing the code with only the begin() and end() member functions, if possible.
Code i got so far
int main()
{
vector<int> numset;
int num_input;
auto beg=numset.begin(), end=numset.end();
while (cin>>num_input)
{
numset.push_back(num_input);
}
for (auto it = numset.begin()+1; it !=numset.end(); ++it)
{
// *it=*it+1+nuset.end(); -- Wrong X
// *it+=(end-beg)/2; -- Totally wrong(and totally stupid) X
// *it + numset.back() -- can't use other member functions X
//////// I'm stuck here; I don't know what code is needed //////
cout<<*it<<endl;
}
}
Thank you for your time.
The operation you perform is *it+*(it-1). (It might help to add more parentheses and spaces in your code.) That adds two adjacent elements from the sequence.
The last element in the sequence is numset.back(). So try *it + numset.back() instead. And there's no need to start with the second element, since you do want to print the sum of the first and last elements. If you don't want to print the sum of the last element with itself, you should stop at end() - 1, though.
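If you really want to stick to begin() and end() on the input, note that end() points one past the last element, so end() - 1 is an iterator to the last element (as long as the vector is not empty). A minimal sketch, with the function name sums_with_last chosen just for this example:

```cpp
#include <vector>

// Hypothetical helper: adds the last element to every element before it,
// reading the input only through begin() and end().
std::vector<int> sums_with_last(const std::vector<int>& numset) {
    std::vector<int> sums;
    if (numset.begin() == numset.end()) {
        return sums;  // empty input: end() - 1 would be invalid
    }
    auto last = numset.end() - 1;  // iterator to the last element
    for (auto it = numset.begin(); it != last; ++it) {
        sums.push_back(*it + *last);
    }
    return sums;
}
```

For the input 1 2 3 4 5 6 7 8 9 this produces 10 11 12 13 14 15 16 17. Take the iterators only after you have finished reading the input, because push_back can invalidate iterators obtained earlier.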