Here is a simple question I have been wondering about for a long time :
When I do a loop such as this one :
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
As the condition i < myVector.size() is checked each time, should I store the size of the array inside a variable before the loop to prevent the call to size() each iteration ? Or is the compiler smart enough to do it itself ?
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
And I would extend the question with a more complex condition such as i < myVector.front()/myVector.size()
Edit : I don't use myVector inside the loop, it is juste here to give the ending condition. And what about the more complex condition ?
The answer depends mainly on the contents of your loop–it may modify the vector during processing, thus modifying its size.
However if the vector is just scanned you can safely store its size in advance:
for (int i = 0, mySize = myVector.size(); i < mySize ; ++i) {
// my loop
}
although in most classes the functions like 'get current size' are just inline getters:
class XXX
{
public:
int size() const { return mSize; }
....
private:
int mSize;
....
};
so the compiler can easily reduce the call to just reading the int variable, consequently prefetching the length gives no gain.
If you are not changing anything in vector (adding/removing) during for-loop (which is normal case) I would use foreach loop
for (auto object : myVector)
{
//here some code
}
or if you cannot use c++11 I would use iterators
for (auto it = myVector.begin(); it != myVector.end(); ++it)
{
//here some code
}
I'd say that
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
is a bit safer than
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
because the value of myVector.size() may change (as result of , e.g. push_back(value) inside the loop) thus you might miss some of the elements.
If you are 100% sure that the value of myVector.size() is not going to change, then both are the same thing.
Yet, the first one is a bit more flexible than the second (other developer may be unaware that the loop iterates over fixed size and he might change the array size). Don't worry about the compiler, he's smarter than both of us combined.
The overhead is very small.
vector.size() does not recalculate anything, but simply returns the value of the private size variable..
it is safer than pre-buffering the value, as the vectors internal size variable is changed when an element is popped or pushed to/from the vector..
compilers can be written to optimize this out, if and only if, it can predict that the vector is not changed by ANYTHING while the for loop runs.
That is difficult to do if there are threads in there.
but if there isn't any threading going on, it's very easy to optimize it.
Any smart compiler will probably optimize this out. However just to be sure I usually lay out my for loops like this:
for (int i = myvector.size() -1; i >= 0; --i)
{
}
A couple of things are different:
The iteration is done the other way around. Although this shouldn't be a problem in most cases. If it is I prefer David Haim's method.
The --i is used rather than a i--. In theory the --i is faster, although on most compilers it won't make a difference.
If you don't care about the index this:
for (int i = myvector.size(); i > 0; --i)
{
}
Would also be an option. Altough in general I don't use it because it is a bit more confusing than the first. And will not gain you any performance.
For a type like a std::vector or std::list an iterator is the preffered method:
for (std::vector</*vectortype here*/>::iterator i = myVector.begin(); i != myVector.end(); ++i)
{
}
Related
I'm looking for the simplest way(algorithm?) to push an entire vector onto a queue and then delete the vector. I think there are a few ways to do this but I'm not sure which is best, or if all of them are correct. Option 1 is to use vector.pop_back(), but I'd have to go backwards through the for loop in this case, which isn't a problem since the order the objects go into the queue from the vector do not matter
for(unsigned i = vector.size() - 1; i >= 0; i--){
queue.push(vector[i]);
vector.pop_back();
}
Option 2 is to use vector.erase(). Also is it okay to do i < vector.size()? Because when I looked online for iterating through vectors I found a lot of i != vector.size() instead
for(unsigned i = 0; i < vector.size(); i++){
queue.push(vector[i]);
vector.erase[i];
}
My issue here is that if I erase vector[i], does vector [i+1] now become vector[i]? Or does vector[i] become a Null value?
My 3rd option would be to just erase it all at the end
for(unsigned i = 0; i < vector.size(); i++){
queue.push(vector[i]);
}
vector.erase(vector.begin(), vector.end());
Just for clarity, I don't want to get rid of the vector variable itself, just empty it after putting it into the queue, because it will eventually store a bunch of new things to dump into a queue again and again.
If you don't mind the objects being present in both the queue and the vector for a while, just do the simplest thing: your 3rd option, just with a clear() instead to be explicit what you're doing:
for(size_t i = 0; i < vector.size(); i++){
queue.push(vector[i]);
}
vector.clear();
Of course, in C++11, you could use a range-based for loop, and even move the items out of the vector to avoid needless copies:
for (auto &elem : vector) {
queue.push(std::move(elem));
}
vector.clear();
A)
for(unsigned i = vector.size() - 1; i >= 0; i--){
queue.push(vector[i]);
vector.pop_back();
}
This should be a bit less efficient then C)
B)
for(unsigned i = 0; i < vector.size(); i++){
queue.push(vector[i]);
vector.erase[i];
}
"My issue here is that if I erase vector[i], does vector [i+1] now become vector[i]? Or does vector[i] become a Null value?"
Erase[i] doesn't work at all. You could ruse erase(vector.begin()) pulling elements from the vectors head one by one, but its not very efficient and the whole loop should end up with O(N^2/2) since your deleting from the vectors head.
C)
for(unsigned i = 0; i < vector.size(); i++){
queue.push(vector[i]);
}
vector.clear();
Should be the most efficient way to go.
Note
The result of A and B since in A you're pulling elements from the tail while taking them from the head in C
Unless the element-type is large, you can't do much (except move which #Angew suggested). If element size is small, there is no benefit of move either - the memory layout of both vector and queue are different. If element-type is large, you may consider using pointers (in a list or vector) of elements.
If you were to use a deque instead of queue then you could do an insert and insert all the elements at once.
Something like
template <class T, template <typename, typename> class Container>
class BlockQueue : public Container<T, std::allocator<T>>
{
public:
BlockQueue() : Container<T, std::allocator<T>>()
{
}
void push( T val )
{
this->push_back( val );
}
void push( const std::vector<T>& newData )
{
this->insert( this->end(), newData.begin(), newData.end() );
}
};
I've written some perhaps naive code that is meant to remove elements from a vector that are too similar. The functionality is fine, but I think I may get unexpected results now and then because of the dynamic resizing of the vector.
for (size_t i = 0 ; i < vec.size(); i++) {
for(size_t j = i+1; j < vec.size(); j++) {
if(norm(vec[i]-vec[j]) <= 20 ) {
vec.erase(vec.begin()+j);
}
}
}
Is this safe to do? I'm concerned about i and j correctly adapting as I erase elements.
You need to pay better attention to where your elements are. It might be easier to express this directly in terms of iterators rather than compute iterators via indexes, like this:
for (auto it = vec.begin(); it != vec.end(); ++it)
{
for (auto jt = std::next(it); jt !=; vec.end(); )
{
if (/* condition */)
{
jt = vec.erase(jt);
}
else
{
++jt;
}
}
}
Yes, you are safe here. Since you are using indexes, not iterators, there is nothing to invalidate by erasing an item in the container except the size, and the size would be updated automatically, so we are good here.
One more thing to consider is what effect does erasing an element inside the inner loop has on the stopping condition of the outer loop. There is no problem there either, because j is guaranteed to be strictly greater than i, so j < vec.size() condition of the inner loop will be hit before the i < vec.size() condition of the outer loop, meaning that there would be no unsafe vec[i] access with an invalid index i.
Of course you should increment j after erasing an element to avoid the classic error. An even better approach would be to start walking the vector from the back, but you would need to do so in both loops to make sure that i a valid element is never erased from underneath the outer index of i.
I have a nested for-loop structure and right now I am re-declaring the vector at the start of each iteration:
void function (n1,n2,bound,etc){
for (int i=0; i<bound; i++){
vector< vector<long long> > vec(n1, vector<long long>(n2));
//about three more for-loops here
}
}
This allows me to "start fresh" each iteration, which works great because my internal operations are largely in the form of vec[a][b] += some value. But I worry that it's slow for large n1 or large n2. I don't know the underlying architecture of vectors/arrays/etc so I am not sure what the fastest way is to handle this situation. Should I use an array instead? Should I clear it differently? Should I handle the logic differently altogether?
EDIT: The vector's size technically does not change each iteration (but it may change based on function parameters). I'm simply trying to clear it/etc so the program is as fast as humanly possible given all other circumstances.
EDIT:
My results of different methods:
Timings (for a sample set of data):
reclaring vector method: 111623 ms
clearing/resizing method: 126451 ms
looping/setting to 0 method: 88686 ms
I have a clear preference for small scopes (i.e. declaring the variable in the innermost loop if it’s only used there) but for large sizes this could cause a lot of allocations.
So if this loop is a performance problem, try declaring the variable outside the loop and merely clearing it inside the loop – however, this is only advantageous if the (reserved) size of the vector stays identical. If you are resizing the vector, then you get reallocations anyway.
Don’t use a raw array – it doesn’t give you any advantage, and only trouble.
Here is some code that tests a few different methods.
#include <chrono>
#include <iostream>
#include <vector>
int main()
{
typedef std::chrono::high_resolution_clock clock;
unsigned n1 = 1000;
unsigned n2 = 1000;
// Original method
{
auto start = clock::now();
for (unsigned i = 0; i < 10000; ++i)
{
std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
// vec is initialized to zero already
// do stuff
}
auto elapsed_time = clock::now() - start;
std::cout << elapsed_time.count() << std::endl;
}
// reinitialize values to zero at every pass in the loop
{
auto start = clock::now();
std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
for (unsigned i = 0; i < 10000; ++i)
{
// initialize vec to zero at the start of every loop
for (unsigned j = 0; j < n1; ++j)
for (unsigned k = 0; k < n2; ++k)
vec[j][k] = 0;
// do stuff
}
auto elapsed_time = clock::now() - start;
std::cout << elapsed_time.count() << std::endl;
}
// clearing the vector this way is not optimal since it will destruct the
// inner vectors
{
auto start = clock::now();
std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
for (unsigned i = 0; i < 10000; ++i)
{
vec.clear();
vec.resize(n1, std::vector<long long>(n2));
// do stuff
}
auto elapsed_time = clock::now() - start;
std::cout << elapsed_time.count() << std::endl;
}
// equivalent to the second method from above
// no performace penalty
{
auto start = clock::now();
std::vector<std::vector<long long>> vec(n1, std::vector<long long>(n2));
for (unsigned i = 0; i < 10000; ++i)
{
for (unsigned j = 0; j < n1; ++j)
{
vec[j].clear();
vec[j].resize(n2);
}
// do stuff
}
auto elapsed_time = clock::now() - start;
std::cout << elapsed_time.count() << std::endl;
}
}
Edit: I've updated the code to make a fairer comparison between the methods.
Edit 2: Cleaned up the code a bit, methods 2 or 4 are the way to go.
Here are the timings of the above four methods on my computer:
16327389
15216024
16371469
15279471
The point is that you should try out different methods and profile your code.
When choosing a container i usually use this diagram to help me:
source
Other than that,
Like previously posted if this is causing performance problems declare the container outside of the for loop and just clear it at the start of each iteration
In addition to the previous comments :
if you use Robinson's swap method, you could go ever faster by handling that swap asynchronously.
Why not something like that :
{
vector< vector<long long> > vec(n1, vector<long long>(n2));
for (int i=0; i<bound; i++){
//about three more for-loops here
vec.clear();
}
}
Edit: added scope braces ;-)
Well if you are really concerned about performance (and you know the size of n1 and n2 beforehand) but don't want to use a C-style array, std::array may be your friend.
EDIT: Given your edit, it seems an std::array isn't an appropriate substitute since while the vector size does not change each iteration, it still isn't known before compilation.
Since you have to reset the vector values to 0 each iteration, in practical terms, this question boils down to "is the cost of allocating and deallocating the memory for the vector cheap or expensive compared to the computations inside the loops".
Assuming the computations are the expensive part of the algorithm, the way you've coded it is both clear, concise, shows the intended scope, and is probably just as fast as alternate approaches.
If however your computations and updates are extremely fast and the allocation/deallocation of the vector is relatively expensive, you could use std::fill to fill zeroes back into the array at the end/beginning of each iteration through the loop.
Of course the only way to know for sure is to measure with a profiler. I suspect you'll find that the approach you took won't show up as a hotspot of any sort and you should leave the obvious code in place.
The overhead of using a vector vs an array is minor, especially when you are getting a lot of useful functionality from the vector. Internally a vector allocates an array. So vector is the way to go.
I need to iterate over a vector from the end to the beginning. The "correct" way is
for(std::vector<SomeT>::reverse_iterator rit = v.rbegin(); rit != v.rend(); ++rit)
{
//do Something
}
When //do Something involves knowing the actual index, then some calculations need to be done with rit to obtain it, like index = v.size() - 1 - (rit - v.rbegin)
If the index is needed anyway, then I strongly believe it is better to iterate using that index
for(int i = v.size() - 1; i >= 0; --i)
{
//do something with v[i] and i;
}
This gives a warning that i is signed and v.size() is unsigned.
Changing to
for(unsigned i = v.size() - 1; i >= 0; --i) is just functionally wrong, because this is essentially an endless loop :)
What is an aesthetically good way to do what I want to do which
is warning-free
doesn't involve casts
is not overly verbose
As you've noted, the problem with a condition of i >= 0 when it's unsigned is that the condition is always true. Instead of subtracting 1 when you initialize i and then again after each iteration, subtract 1 after checking the loop condition:
for (unsigned i = v.size(); i-- > 0; )
I like this style for several reasons:
Although i will wrap around to UINT_MAX at the end of the loop, it doesn't rely on that behavior — it would work the same if the types were signed. Relying on unsigned wraparound feels like a bit of a hack to me.
It calls size() exactly once.
It doesn't use >=. Whenever I see that operator in a for loop, I have to re-read it to make sure there isn't an off-by-one error.
If you change the spacing in the conditional, you can make it use the "goes to" operator.
There's nothing to stop your reverse_iterator loop also using the index as described in multiple other answers. That way you can use the iterator or index as needed in the // do the work part, for minimal extra cost.
size_t index = v.size() - 1;
for(std::vector<SomeT>::reverse_iterator rit = v.rbegin();
rit != v.rend(); ++rit, --index)
{
// do the work
}
Though I'm curious to know what you need the index for. Accessing v[index] is the same as accessing *rit.
to be aesthetically pleasing! ;)
for(unsigned i = v.size() - 1; v.size() > i; --i)
I would prefer the reverse iterator variant, because it's still easy to interpret and allows to avoid index-related errors.
Sometimes you can simply use the BOOST_REVERSE_FOREACH, which would make your code look the following way:
reverse_foreach (int value, vector) {
do_something_with_the_value;
}
Actually speaking, you can always use foreach statements for these kinds of loops, but then they become a bit unobvious:
size_t i = 0;
foreach (int value, vector) {
do_something;
++i;
}
In C++20 one can use ranges (#include <ranges>)
//DATA
std::vector<int> vecOfInts = { 2,4,6,8 };
//REVERSE VECTOR (
for (int i : vecOfInts | std::views::reverse)
{
std::cout << i << " ";
}
or if it is required to save in a different variable.
//SAVE IN ANOTHER VARIABLE
auto reverseVecOfInts = std::views::reverse(vecOfInts);
//ITERATION
for (int i : reverseVecOfInts)
{
std::cout << i << " ";
}
Try out a do while :
std::vector<Type> v;
// Some code
if(v.size() > 0)
{
unsigned int i = v.size() - 1;
do
{
// Your stuff
}
while(i-- > 0);
}
Hi i think better way use iterator as you use in first sample and if you need get iterator index you can use
std::distance to calculate it, if i understand your question
loop condition i != std::numeric_limits<unsigned>::max() ... or use UINT_MAX if you think its to verbose.
or another way:
for(unsigned j=0, end=v.size(), i=end-1; j<end; --i, ++j)
or
for(unsigned end=v.size(), i=end-1; (end-i)<end; --i)
I think that:
for(unsigned i = v.size() - 1; i >= 0; --i)
is fine if you check
!v.empty()
earlier.
for (it = v.end()-1; it != v.begin()-1; --it)
{
}
The "goes to" operator definitely messes with my head.
I'm trying to optimize my C++ code. I've searched the internet on using dynamically allocated C++ arrays vs using std::vector and have generally seen a recommendation in favor of std::vector and that the difference in performance between the two is negligible. For instance here - Using arrays or std::vectors in C++, what's the performance gap?.
However, I wrote some code to test the performance of iterating through an array/vector and assigning values to the elements and I generally found that using dynamically allocated arrays was nearly 3 times faster than using vectors (I did specify a size for the vectors beforehand). I used g++-4.3.2.
However I feel that my test may have ignored issues I don't know about so I would appreciate any advice on this issue.
Thanks
Code used -
#include <time.h>
#include <iostream>
#include <vector>
using namespace std;
int main() {
clock_t start,end;
std::vector<int> vec(9999999);
std::vector<int>::iterator vecIt = vec.begin();
std::vector<int>::iterator vecEnd = vec.end();
start = clock();
for (int i = 0; vecIt != vecEnd; i++) {
*(vecIt++) = i;
}
end = clock();
cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
int* arr = new int[9999999];
start = clock();
for (int i = 0; i < 9999999; i++) {
arr[i] = i;
}
end = clock();
cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}
When benchmarking C++ comtainers, it's important to enable most compiler optimisations. Several of my own answers on SO have fallen foul of this - for example, the function call overhead when something like operator[] is not inlined can be very significant.
Just for fun, try iterating over the plain array using a pointer instead of an integer index (the code should look just like the vector iteration, since the point of STL iterators is to appear like pointer arithmetic for most operations). I bet the speed will be exactly equal in that case. Which of course means you should pick the vector, since it will save you a world of headaches from managing arrays by hand.
The thing about the standard library classes such as std::vector is that yes, naively, it is a lot more code than a raw array. But all of it can be trivially inlined by the compiler, which means that if optimizations are enabled, it becomes essentially the same code as if you'd used a raw array. The speed difference then is not negligible but non-existent. All the overhead is removed at compile-time.
But that requires compiler optimizations to be enabled.
I imagine the reason why you found iterating and adding to std::vector 3 times slower than a plain array is a combination of the cost of iterating the vector and doing the assigment.
Edit:
That was my initial assumption before the testcase; however running the testcase (compiled with -O3) shows the converse - std::vector is actually 3 times faster, which surprised me.
I can't see how std::vector could be faster (certainly not 3 times faster) than a vanilla array copy - I think there's some optimisation being applied to the std::vector compiled code which isn't happening for the array version.
Original benchmark results:
$ ./array
array: 0.059375
vector: 0.021209
std::vector is 3x faster. Same benchmark again, except add an additional outer loop to run the test iterater loop 1000 times:
$ ./array
array: 21.7129
vector: 21.6413
std::vector is now ~ the same speed as array.
Edit 2
Found it! So the problem with your test case is that in the vector case the memory holding the data appears to be already in the CPU cache - either by the way it is initialised, or due to the call to vec.end(). If I 'warm' up the CPU cache before each timing test, I get the same numbers for array and vector:
#include <time.h>
#include <iostream>
#include <vector>
int main() {
clock_t start,end;
std::vector<int> vec(9999999);
std::vector<int>::iterator vecIt = vec.begin();
std::vector<int>::iterator vecEnd = vec.end();
// get vec into CPU cache.
for (int i = 0; vecIt != vecEnd; i++) { *(vecIt++) = i; }
vecIt = vec.begin();
start = clock();
for (int i = 0; vecIt != vecEnd; i++) {
*(vecIt++) = i;
}
end = clock();
std::cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;
int* arr = new int[9999999];
// get arr into CPU cache.
for (int i = 0; i < 9999999; i++) { arr[i] = i; }
start = clock();
for (int i = 0; i < 9999999; i++) {
arr[i] = i;
}
end = clock();
std::cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;
}
This gives me the following result:
$ ./array
vector: 0.020875
array: 0.020695
I agree with rmeador,
for (int i = 0; vecIt != vecEnd; i++) {
*(vecIt++) = i; // <-- quick offset calculation
}
end = clock();
cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
int* arr = new int[9999999];
start = clock();
for (int i = 0; i < 9999999; i++) {
arr[i] = i; // <-- not fair play :) - offset = arr + i*size(int)
}
I think the answer here is obvious: it doesn't matter. Like jalf said the code will end up being about the same, but even if it wasn't, look at the numbers. The code you posted creates a huge array of 10 MILLION items, yet iterating over the entire array takes only a few hundredths of a second.
Even if your application really is working with that much data, whatever it is you're actually doing with that data is likely to take much more time than iterating over your array. Just use whichever data structure you prefer, and focus your time on the rest of your code.
To prove my point, here's the code with one change: the assignment of i to the array item is replaced with an assignment of sqrt(i). On my machine using -O2, the execution time triples from .02 to .06 seconds.
#include <time.h>
#include <iostream>
#include <vector>
#include <math.h>
using namespace std;
int main() {
clock_t start,end;
std::vector<int> vec(9999999);
std::vector<int>::iterator vecIt = vec.begin();
std::vector<int>::iterator vecEnd = vec.end();
start = clock();
for (int i = 0; vecIt != vecEnd; i++) {
*(vecIt++) = sqrt(i);
}
end = clock();
cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
int* arr = new int[9999999];
start = clock();
for (int i = 0; i < 9999999; i++) {
arr[i] = i;
}
end = clock();
cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}
The issue seems to be that you compiled your code with optimizations turned off. On my machine, OS X 10.5.7 with g++ 4.0.1 I actually see that the vector is faster than primitive arrays by a factor of 2.5.
With gcc try to pass -O2 to the compiler and see if there's any improvement.
The reason that your array iterating is faster is that the the number of iteration is constant, and compiler is able to unroll the loop. Try to use rand to generate a number, and multiple it to be a big number you wanted so that compiler wont be able to figure it out at compile time. Then try it again, you will see similar runtime results.
One reason you're code might not be performing quite the same is because on your std::vector version, you are incrimenting two values, the integer i and the std::vector::iterator vecIt. To really be equivalent, you could refactor to
start = clock();
for (int i = 0; i < vec.size(); i++) {
vec[i] = i;
}
end = clock();
cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
Your code provides an unfair comparison between the two cases since you're doing far more work in the vector test than the array test.
With the vector, you're incrementing both the iterator (vecIT) and a separate variable (i) for generating the assignment values.
With the array, you're only incrementing the variable i and using it for dual purpose.