So I am trying to code for this question:
Yes, I have to use arrays since it is a requirement.
Consider the problem of adding two n-bit binary integers, stored in two n-element arrays A and B. The sum of the two integers should be stored in binary form in an (n+1) element array C . State the problem formally and write pseudocode for adding the two integers.
I know that the ans array contains the correct output at the end of the addd function. However, I am not able to output that answer.
Below is my code. Please help me figure where in the code I'm going wrong, and what I can do to change it so it works. I will be very grateful.
#include <iostream>
using namespace std;
int * addd(int a[], int n1, int b[], int n2)
{
int s;
if(n1<n2) {s=n2+1;}
else {s=n1+1;}
int ans[s];
int i=n1-1, j=n2-1, k=s-1;
int carry=0;
while(i>=0 && j>=0 && k>0)
{
ans[k]=(a[i]+b[j]+carry)%2;
//cout<<k<<" "<<ans[k]<<endl;
carry=(a[i]+b[j]+carry)/2;
i--; j--; k--;
}
//cout<<"Carry "<<carry<<endl;
ans[0]=carry;
return ans;
}
int main(int argc, const char * argv[]) {
// insert code here...
int a[]={0,0,0,1,1,1};
int n1=sizeof(a)/sizeof(a[0]);
int b[]={1,0,1,1,0,1};
int n2=sizeof(b)/sizeof(b[0]);
int *p=addd(a,6,b,6);
// cout<<p[1]<<endl;
// cout<<p[0]<<" "<<p[1]<<" "<<p[2]<<" "<<p[3]<<" "<<p[4]<<" "<<p[5]<<" "<<p[6]<<endl;
return 0;
}
using namespace std;
Don't write using namespace std;. I have a summary I paste in from a file of common issues when I'm active in the Code Review Stack Exchange, but I don't have that here. Instead, you should just declare the symbols you need, like using std::cout;
int * addd(int a[], int n1, int b[], int n2)
The parameters of the form int a[] are very odd. This comes from C and is actually transformed into int* a and is not passing the array per-se.
The inputs should be const.
The names are not clear, but I'm guessing that n1 is the size of the array? In the C++ Core Guidelines, you'll see that passing a pointer plus length is strongly discouraged. The Guidelines Support Library (GSL) supplies a simple span type to use for this instead.
And the length should be size_t not int.
Based on the description, I think each element is only one bit, right? So why are the arrays of type int? I'd use bool or perhaps int8_t as being easier to work with.
What are you returning? If a and b and their lengths are the input, where is the output that you are returning a pointer to the beginning of? This is not giving value semantics, as you are returning a pointer to something that must exist elsewhere so what is its lifetime?
int s;
int ans[s];
return ans;
Well, there's your problem. First of all, declaring an array whose size is not a constant is not legal in standard C++. (GCC accepts it as an extension that implements C's VLA feature, but not without issues, as it breaks the C++ type system.)
Regardless of that, you are returning a pointer to the first element of the local array, so what happens to the memory when the function returns? Boom.
int s;
No. Initialize values when they are created.
if(n1<n2) {s=n2+1;}
else {s=n1+1;}
Learn the library.
How about:
const size_t s = 1+std::max(n1,n2);
and then the portable way to get your memory is:
std::vector<int> ans(s);
Your main logic will not work if one array is shorter than the other. The shorter input should behave as if it had leading zeros to match. Consider abstracting the problem of "getting the next bit" so you don't duplicate the code for handling each input and make an unreadable mess. You really should have learned to use collections and iterators first.
now:
return ans;
would work as intended since it is a value. You just need to declare the function to be the right type. So just use auto for the return type and it knows.
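Putting these suggestions together, a minimal sketch might look like the following. The name add_bits and the choice of std::vector<int> for the inputs are assumptions made for illustration, not part of the original code; bits are stored most-significant first, as in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement for addd: adds two binary numbers stored
// most-significant-bit first and returns the sum by value.
std::vector<int> add_bits(const std::vector<int>& a, const std::vector<int>& b)
{
    const std::size_t s = 1 + std::max(a.size(), b.size());
    std::vector<int> ans(s);                 // elements start out as 0
    std::size_t i = a.size(), j = b.size();
    int carry = 0;
    for (std::size_t k = s; k-- > 1; ) {     // k runs from s-1 down to 1
        // A shorter input behaves as if it had leading zeros.
        const int bit_a = (i > 0) ? a[--i] : 0;
        const int bit_b = (j > 0) ? b[--j] : 0;
        assert((bit_a == 0 || bit_a == 1) && (bit_b == 0 || bit_b == 1));
        const int sum = bit_a + bit_b + carry;
        ans[k] = sum % 2;
        carry = sum / 2;
    }
    ans[0] = carry;                          // the extra leading bit
    return ans;                              // a value, so nothing dangles
}
```

Because the result is a std::vector returned by value, the caller can print it with a range-based for loop and never has to think about who owns the memory.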
int n1=sizeof(a)/sizeof(a[0]);
Noooooooo.
There is a standard function to give the size of a built-in primitive array. But really, this should be done automatically as part of the passing, not as a separate thing, as noted earlier.
int *p=addd(a,6,b,6);
You wrote 6 instead of n1 etc.
Anyway, with the previous edits, it becomes:
using std::size;
const auto p = addd (a, size(a), b, size(b));
Finally, concerning:
cout<<p[0]<<" "<<p[1]<<" "<<p[2]<<" "<<p[3]<<" "<<p[4]<<" "<<p[5]<<" "<<p[6]<<endl;
How about using loops?
for (auto val : p) cout << val;
cout << '\n';
oh, don't use endl. It's not needed for cout which auto-flushes anyway, and it's slow. Modern best practice is to use '\n' and then flush explicitly if/when needed (like, never).
Let's look at:
int ans[s];
Apart from the fact that this is not even part of the standard, and the compiler is probably giving you some warnings (see link), that statement allocates temporary memory on the stack, which gets deallocated on function exit. That's why you are getting different results every time: you are reading garbage, i.e. memory that in the meantime might have been overwritten.
You can replace it for example with
int* ans = new int[s];
Don't forget though to deallocate the memory when you have finished using the buffer (outside the function), to avoid memory leakage.
Some other notes:
int s;
if(n1<n2) {s=n2+1;}
else {s=n1+1;}
This can be more elegantly written as:
const int s = (n1 < n2) ? n2 + 1 : n1 + 1;
Also, the actual computation code is imprecise as it leads to wrong results if n1 is not equal to n2: You need further code to finish processing the remaining bits of the longest array. By the way you don't need to check on k > 0 because of the way you have defined s.
The following should work:
int i=n1-1, j=n2-1, k=s-1;
int carry=0;
while(i>=0 && j>=0)
{
ans[k]=(a[i]+b[j]+carry)%2;
carry=(a[i]+b[j]+carry)/2;
i--; j--; k--;
}
while(i>=0) {
ans[k]=(a[i]+carry)%2;
carry=(a[i]+carry)/2;
i--; k--;
}
while(j>=0) {
ans[k]=(b[j]+carry)%2;
carry=(b[j]+carry)/2;
j--; k--;
}
ans[0]=carry;
return ans;
}
If You Must Only Use C Arrays
Returning ans returns a pointer to a local variable. The object the pointer refers to is no longer valid after the function has returned, so trying to read it leads to undefined behavior.
One way to fix this is to pass in the address to an array to hold your answer, and populate that, instead of using a VLA (which is a non-standard C++ extension).
A VLA (variable length array) is an array which takes its size from a run-time computed value. In your case:
int s;
//... code that initializes s
int ans[s];
ans is a VLA because you are not using a constant to determine the array size. However, that is not a standard feature of the C++ language (it is an optional one in the C language).
You can modify your function so that ans is actually provided by the caller.
int * addd(int a[], int n1, int b[], int n2, int ans[])
{
//...
And then the caller would be responsible for passing in a large enough array to hold the answer.
Your function also appears to be incomplete.
while(i>=0 && j>=0 && k>0)
{
ans[k]=(a[i]+b[j]+carry)%2;
//cout<<k<<" "<<ans[k]<<endl;
carry=(a[i]+b[j]+carry)/2;
i--; j--; k--;
}
If one array is shorter than the other, then the index for the shorter array will reach 0 first. Then, when that corresponding index goes negative, the loop will stop, without handling the remaining terms in the longer array. This essentially makes the corresponding entries in ans be uninitialized. Reading those values results in undefined behavior.
To address this, you should populate the remaining entries in ans with the correct calculation based on carry and the remaining entries in the longer array.
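As a rough sketch of both fixes combined (caller-provided output plus handling of the leftover bits), something like the following could work. The name addd_into and the void return type are assumptions for illustration; the caller must supply an ans array with max(n1, n2) + 1 elements.

```cpp
#include <cassert>

// Hypothetical variant of addd. The caller owns ans, which must have room
// for max(n1, n2) + 1 ints. Bits are stored most-significant first.
void addd_into(const int a[], int n1, const int b[], int n2, int ans[])
{
    assert(n1 >= 0 && n2 >= 0);
    int i = n1 - 1, j = n2 - 1;
    int k = (n1 < n2) ? n2 : n1;      // index of the last element of ans
    int carry = 0;
    while (i >= 0 || j >= 0) {        // keep going until BOTH inputs are used up
        int sum = carry;
        if (i >= 0) sum += a[i--];    // an exhausted array contributes nothing
        if (j >= 0) sum += b[j--];
        ans[k--] = sum % 2;
        carry = sum / 2;
    }
    ans[0] = carry;                   // k has reached 0 here
}
```

With the arrays from the question, the caller would declare int c[7]; and call addd_into(a, 6, b, 6, c);. Every element of c is then written, so nothing uninitialized is ever read.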
A More C++ Approach
The original answer above was provided assuming you were constrained to only using C style arrays for both input and output, and that you wanted an answer that would allow you to stay close to your original implementation.
Below is a more C++ oriented solution, assuming you still need to provide C arrays as input, but otherwise no other constraint.
C Array Wrapper
A C array does not provide the amenities that you may be accustomed to have when using C++ containers. To gain some of these nice to have features, you can write an adapter that allows a C array to behave like a C++ container.
template <typename T, std::size_t N>
struct c_array_ref {
typedef T ARR_TYPE[N];
ARR_TYPE &arr_;
typedef T * iterator;
typedef std::reverse_iterator<T *> reverse_iterator;
c_array_ref (T (&arr)[N]) : arr_(arr) {}
std::size_t size () { return N; }
T & operator [] (int i) { return arr_[i]; }
operator ARR_TYPE & () { return arr_; }
iterator begin () { return &arr_[0]; }
iterator end () { return begin() + N; }
reverse_iterator rbegin () { return reverse_iterator(end()); }
reverse_iterator rend () { return reverse_iterator(begin()); }
};
Use C Array References
Instead of passing in two arguments as information about the array, you can pass in the array by reference, and use template argument deduction to deduce the array size.
Return a std::array
Although you cannot return a local C array like you attempted in your question, you can return an array that is wrapped inside a struct or class. That is precisely what the convenience container std::array provides. When you use C array references and template argument deduction to obtain the array size, you can now compute at compile time the proper array size that std::array should have for the return value.
template <std::size_t N1, std::size_t N2>
std::array<int, ((N1 < N2) ? N2 : N1) + 1>
addd(int (&a)[N1], int (&b)[N2])
{
Normalize the Input
It is much easier to solve the problem if you assume the arguments have been arranged in a particular order. If you always want the second argument to be the larger array, you can do that with a simple recursive call. This is perfectly safe, since we know the recursion will happen at most once.
if (N2 < N1) return addd(b, a);
Use C++ Containers (or Look-Alike Adapters)
We can now convert our arguments to the adapter shown earlier, and also create a std::array to hold the output.
c_array_ref<int, N1> aa(a);
c_array_ref<int, N2> bb(b);
std::array<int, std::max(N1, N2)+1> ans;
Leverage Existing Algorithms if Possible
To deal with the shortcomings of your original program, you can adjust your implementation a bit to remove special cases. One way to do that is to first store the result of adding the longer array to 0 in the output. This can mostly be accomplished with a simple call to std::copy.
ans[0] = 0;
std::copy(bb.begin(), bb.end(), ans.begin() + 1);
Since we know the input consists of only 1s and 0s, we can compute straight addition from the shorter array into the longer array, without concern for carry (that will be addressed in the next step). To compute this addition, we apply std::transform with a lambda expression.
std::transform(aa.rbegin(), aa.rend(), ans.rbegin(),
ans.rbegin(),
[](int a, int b) -> int { return a + b; });
Lastly, we can make a pass over the output array to fix up the carry computation. After doing so, we are ready to return the result. The return is possible because we are using std::array to represent the answer.
for (auto i = ans.rbegin(); i != ans.rend()-1; ++i) {
*(i+1) += *i / 2;
*i %= 2;
}
return ans;
}
A Simpler main Function
We now only need to pass in the two arrays to the addd function, since template type deduction will discover the sizes of the arrays. In addition, the output generator can be handled more easily with an ostream_iterator.
int main(int, const char * []) {
int a[]={1,0,0,0,1,1,1};
int b[]={1,0,1,1,0,1};
auto p=addd(a,b);
std::copy(p.begin(), p.end(),
std::ostream_iterator<int>(std::cout, " "));
return 0;
}
If I may editorialize a bit... I think this is a deceptively difficult question for beginners, and as-stated should flag problems in the design review long before any attempt at coding. It's telling you to do things that are not good/typical/idiomatic/proper in C++, and distracting you with issues that get in the way of the actual logic to be developed.
Consider the core algorithm you wrote (and Antonio corrected): that can be understood and discussed without worrying about just how A and B are actually passed in for this code to use, or exactly what kind of collection it is. If they were std::vector, std::array, or primitive C array, the usage would be identical. Likewise, how does one return the result out of the code? You populate ans here, and how it is gotten into and/or out of the code and back to main is not relevant.
Primitive C arrays are not first-class objects in C++ and there are special rules (inherited from C) on how they are passed as arguments.
Returning is even worse, and returning dynamic-sized things was a major headache in C and memory management like this is a major source of bugs and security flaws. What we want is value semantics.
Second, using arrays and subscripts is not idiomatic in C++. You use iterators and abstract over the exact nature of the collection. If you were interested in writing super-efficient back-end code that doesn't itself deal with memory management (it's called by other code that deals with the actual collections involved) it would look like std::merge which is a venerable function that dates back to the early 90's.
template< class InputIt1, class InputIt2, class OutputIt >
OutputIt merge( InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2,
OutputIt d_first );
You can find others with similar signatures, that take two different ranges for input and outputs to a third area. If you write addp exactly like this, you could call it with primitive C arrays of hardcoded size:
int8_t A[] {0,0,0,1,1,1};
int8_t B[] {1,0,1,1,0,1};
int8_t C[ ??? ];
using std::begin; using std::end;
addp (begin(A),end(A), begin(B), end(B), begin(C));
Note that it's up to the caller to have prepared an output area large enough, and there's no error checking.
However, the same code can be used with vectors, or even any combination of different container types. This could populate a std::vector as the result by passing an insertion iterator. But in this particular algorithm that's difficult since you're computing it in reverse order.
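One way around the reverse-order problem is to follow the convention of std::copy_backward and have the caller pass the end of the output area. The sketch below is only an illustration under that assumption: the name addp_backward and its exact signature are hypothetical, and it requires bidirectional iterators.

```cpp
#include <cassert>
#include <iterator>

// Hypothetical iterator-based adder. Writes max(len1, len2) + 1 bits ending
// at d_last (like std::copy_backward) and returns an iterator to the first
// bit written. The caller must have prepared enough room before d_last.
template <class BidirIt1, class BidirIt2, class BidirOut>
BidirOut addp_backward(BidirIt1 first1, BidirIt1 last1,
                       BidirIt2 first2, BidirIt2 last2,
                       BidirOut d_last)
{
    using value_type = typename std::iterator_traits<BidirOut>::value_type;
    int carry = 0;
    while (last1 != first1 || last2 != first2) {
        int sum = carry;
        if (last1 != first1) sum += *--last1;   // walk each input from the back
        if (last2 != first2) sum += *--last2;
        assert(sum >= 0 && sum <= 3);           // inputs are expected to be bits
        *--d_last = static_cast<value_type>(sum % 2);
        carry = sum / 2;
    }
    *--d_last = static_cast<value_type>(carry); // the extra leading bit
    return d_last;
}
```

Called as addp_backward(begin(A), end(A), begin(B), end(B), end(C)), it works the same way for primitive arrays, std::array, std::vector or std::list, as long as the output area is large enough.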
std::array
Improving upon the situation with primitive C arrays, you could use the std::array class which is exactly the same array but without the strange passing/returning rules. It's actually just a primitive C array inside a wrapping struct. See this documentation: https://en.cppreference.com/w/cpp/container/array
So you could write it as:
using BBBNum1 = std::array<int8_t, 6>;
BBBNum1 addp (const BBBNum1& A, const BBBNum1& B) { ... }
The code inside can use A[i] etc. in the same way you are, but it also can get the size via A.size(). The issue here is that the inputs are the same length, and the output is the same as well (not 1 larger). Using templates, it could be written to make the lengths flexible but still only specified at compile time.
std::vector
The vector is like an array but with a run-time length. It's dynamic, and the go-to collection you should reach for in C++.
using BBBNum2 = std::vector<int8_t>;
BBBNum2 addp (const BBBNum2& A, const BBBNum2& B) { ... }
Again, the code inside this function can refer to B[j] etc. and use B.size() exactly the same as with the array collection. But now, the size is a run-time property, and can be different for each one.
You would create your result, as in my first post, by giving the size as a constructor argument, and then you can return the vector by-value. Note that the compiler will do this efficiently and not actually have to copy anything if you write:
auto C = addp (A, B);
now for the real work
OK, now that this distraction is at least out of the way, you can worry about actually writing the implementation. I hope you are convinced that using vector instead of a C primitive array does not affect your problem logic or even the (available) syntax of using subscripts. Especially since the problem referred to pseudocode, I interpret its use of "array" as "suitable indexable collection" and not specifically the primitive C array type.
The issue of going through 2 sequences together and dealing with differing lengths is actually a general purpose idea. In C++20, the Ranges library has things that make quick work of this. Older 3rd party libraries exist as well, and you might find it called zip or something like that.
But, let's look at writing it from scratch.
You want to read an item at a time from two inputs, but neatly make it look like they're the same length. You don't want to write the same code three times, or elaborate on the cases where A is shorter or where B may be shorter... just abstract out the idea that they are read together, and if one runs out it provides zeros.
This is its own piece of code that can be applied twice, to A and to B.
class backwards_bit_reader {
const BBBNum2& x;
size_t index;
public:
backwards_bit_reader(const BBBNum2& x) : x{x}, index{x.size()} {}
bool done() const { return index == 0; }
int8_t next()
{
if (done()) return 0; // keep reading infinite leading zeros
--index;
return x[index];
}
};
Now you can write something like:
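backwards_bit_reader A_in { A };
backwards_bit_reader B_in { B };
int carry = 0;
size_t k = C.size(); // C is one element longer than the longer input
while (!A_in.done() || !B_in.done()) { // stop only when both inputs are exhausted
    const auto a = A_in.next();
    const auto b = B_in.next();
    const auto c = a+b+carry;
    carry = c/2; // update
    C[--k]= c%2;
}
C[0]= carry; // the final bit, one longer than the input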
It can be written far more compactly, but this is clear.
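For reference, here is how the pieces might fit together as one compilable unit. This is a sketch under a few assumptions: the loop runs while either reader still has bits, the result vector is created inside addp, and the member is renamed bits purely to avoid shadowing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using BBBNum2 = std::vector<std::int8_t>;

// Reads a number from its last bit to its first, then yields zeros forever.
class backwards_bit_reader {
    const BBBNum2& bits;
    std::size_t index;
public:
    explicit backwards_bit_reader(const BBBNum2& x) : bits(x), index(x.size()) {}
    bool done() const { return index == 0; }
    std::int8_t next()
    {
        if (done()) return 0;   // keep reading infinite leading zeros
        --index;
        return bits[index];
    }
};

BBBNum2 addp(const BBBNum2& A, const BBBNum2& B)
{
    BBBNum2 C(1 + std::max(A.size(), B.size()));
    backwards_bit_reader A_in(A), B_in(B);
    std::size_t k = C.size();
    int carry = 0;
    while (!A_in.done() || !B_in.done()) {   // one loop covers both lengths
        const int c = A_in.next() + B_in.next() + carry;
        assert(c >= 0 && c <= 3);            // inputs are expected to be bits
        carry = c / 2;
        C[--k] = static_cast<std::int8_t>(c % 2);
    }
    C[0] = static_cast<std::int8_t>(carry);  // the final bit
    return C;
}
```

The reader object is what lets a single loop serve inputs of different lengths: whichever input runs out first simply keeps supplying zeros.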
another approach
The problem is, is writing backwards_bit_reader beyond what you've learned thus far? How else might you apply the same logic to both A and B without duplicating the statements?
You should be learning to recognize what's sometimes called "code smell". Repeating the same block of code multiple times, and repeating the same steps with nothing changed but which variable it's applying to, should be seen as ugly and unacceptable.
You can at least cut back the cases by ensuring that B is always the longer one, if they are of different length. Do this by swapping A and B if that's not the case, as a preliminary step. (Actually implementing that well is another digression)
But the logic is still nearly duplicated, since you have to deal with the possibility of the carry propagating all the way to the end. Just now you have 2 copies instead of 3.
Extending the shorter one, at least in façade, is the only way to write one loop.
how realistic is this problem?
It's simplified to the point of being silly, but if it's not done in base 2 but with larger values, this is actually implementing multi-precision arithmetic, which is a real thing people want to do. That's why I named the type above BBBNum for "Bad Binary Bignum".
Getting down to an actual range of memory and wanting the code to be fast and optimized is also something you want to do sometimes. The BigNum is one example; you often see this with string processing. But we'll want to make an efficient back-end that operates on memory without knowing how it was allocated, and higher-level wrappers that call it.
For example:
void addp (const int8_t* a_begin, const int8_t* a_end,
const int8_t* b_begin, const int8_t* b_end,
int8_t* result_begin, int8_t* result_end);
will use the provided range for output, not knowing or caring how it was allocated, and taking input that's any contiguous range without caring what type of container is used to manage it as long as it's contiguous. Note that as you saw with the std::merge example, it's more idiomatic to pass begin and end rather than begin and size.
But then you have helper functions like:
BBBNum2 addp (const BBBNum2& A, const BBBNum2& B)
{
    BBBNum2 result (1+std::max(A.size(),B.size()));
    addp (A.data(), A.data()+A.size(), B.data(), B.data()+B.size(), result.data(), result.data()+result.size());
    return result;
}
Now the casual user can call it using vectors and a dynamically-created result, but it's still available to call for arrays, pre-allocated result buffers, etc.
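A possible body for the pointer-based back end, together with the vector wrapper, is sketched below. The implementation details are assumptions made for illustration; in particular, the assert encodes the assumed precondition that the result range is exactly one element longer than the longer input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using BBBNum2 = std::vector<std::int8_t>;

// Back end: works on raw contiguous ranges and never allocates.
void addp(const std::int8_t* a_begin, const std::int8_t* a_end,
          const std::int8_t* b_begin, const std::int8_t* b_end,
          std::int8_t* result_begin, std::int8_t* result_end)
{
    // Assumed precondition: result holds max(len a, len b) + 1 elements.
    assert(result_end - result_begin ==
           1 + std::max(a_end - a_begin, b_end - b_begin));
    int carry = 0;
    while (result_end - result_begin > 1) {
        int sum = carry;
        if (a_end != a_begin) sum += *--a_end;   // exhausted input adds 0
        if (b_end != b_begin) sum += *--b_end;
        *--result_end = static_cast<std::int8_t>(sum % 2);
        carry = sum / 2;
    }
    *result_begin = static_cast<std::int8_t>(carry);
}

// Convenience wrapper: allocates the result and forwards to the back end.
BBBNum2 addp(const BBBNum2& A, const BBBNum2& B)
{
    BBBNum2 result(1 + std::max(A.size(), B.size()));
    addp(A.data(), A.data() + A.size(),
         B.data(), B.data() + B.size(),
         result.data(), result.data() + result.size());
    return result;
}
```

The same back end can be called directly on primitive arrays, for example addp(A, A + 6, B, B + 6, C, C + 7) with a caller-owned int8_t C[7].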
Take the following two lines of code:
for (int i = 0; i < some_vector.size(); i++)
{
//do stuff
}
And this:
for (some_iterator = some_vector.begin(); some_iterator != some_vector.end();
some_iterator++)
{
//do stuff
}
I'm told that the second way is preferred. Why exactly is this?
The first form is efficient only if vector.size() is a fast operation. This is true for vectors, but not for lists, for example. Also, what are you planning to do within the body of the loop? If you plan on accessing the elements as in
T elem = some_vector[i];
then you're making the assumption that the container has operator[](std::size_t) defined. Again, this is true for vector but not for other containers.
The use of iterators bring you closer to container independence. You're not making assumptions about random-access ability or fast size() operation, only that the container has iterator capabilities.
You could enhance your code further by using standard algorithms. Depending on what it is you're trying to achieve, you may elect to use std::for_each(), std::transform() and so on. By using a standard algorithm rather than an explicit loop you're avoiding re-inventing the wheel. Your code is likely to be more efficient (given the right algorithm is chosen), correct and reusable.
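To make the container-independence point concrete, here is a small sketch. The helper names sum_all, sum_all_algo and demo are made up for this example, and std::accumulate stands in for whichever standard algorithm fits the task at hand.

```cpp
#include <cassert>
#include <list>
#include <numeric>
#include <vector>

// Hypothetical helper: an explicit iterator loop that only assumes the
// container provides begin() and end().
template <class Container>
int sum_all(const Container& c)
{
    int total = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}

// The same computation expressed with a standard algorithm.
template <class Container>
int sum_all_algo(const Container& c)
{
    return std::accumulate(c.begin(), c.end(), 0);
}

// Usage: identical code serves a vector and a list, although std::list
// offers no operator[] at all.
void demo()
{
    const std::vector<int> v{1, 2, 3};
    const std::list<int> l{1, 2, 3};
    assert(sum_all(v) == 6 && sum_all(l) == 6);
    assert(sum_all_algo(v) == 6 && sum_all_algo(l) == 6);
}
```

An index-based version of sum_all would compile for the vector but not for the list.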
It's part of the modern C++ indoctrination process. Iterators are the only way to iterate most containers, so you use it even with vectors just to get yourself into the proper mindset. Seriously, that's the only reason I do it - I don't think I've ever replaced a vector with a different kind of container.
Wow, this is still getting downvoted after three weeks. I guess it doesn't pay to be a little tongue-in-cheek.
I think the array index is more readable. It matches the syntax used in other languages, and the syntax used for old-fashioned C arrays. It's also less verbose. Efficiency should be a wash if your compiler is any good, and there are hardly any cases where it matters anyway.
Even so, I still find myself using iterators frequently with vectors. I believe the iterator is an important concept, so I promote it whenever I can.
Because you are not tying your code to the particular implementation of the some_vector list. If you use array indices, it has to be some form of array; if you use iterators, you can use that code on any list implementation.
Imagine some_vector is implemented with a linked-list. Then requesting an item in the i-th place requires i operations to be done to traverse the list of nodes. Now, if you use iterator, generally speaking, it will make its best effort to be as efficient as possible (in the case of a linked list, it will maintain a pointer to the current node and advance it in each iteration, requiring just a single operation).
So it provides two things:
Abstraction of use: you just want to iterate some elements, you don't care about how to do it
Performance
I'm going to be the devil's advocate here and not recommend iterators. The main reason is that in all the source code I've worked on, from desktop application development to game development, I have never needed to use iterators. They have never been required, and the hidden assumptions, code mess and debugging nightmares you get with iterators make them a prime example of what not to use in any application that requires speed.
Even from a maintenance standpoint they're a mess. It's not because of them but because of all the aliasing that happens behind the scenes. How do I know that you haven't implemented your own virtual vector or array list that does something completely different from the standard ones? Do I know what the type currently is at runtime? Did you overload an operator? I didn't have time to check all your source code. Hell, do I even know what version of the STL you're using?
The next problem with iterators is leaky abstraction, though there are numerous websites that discuss this in detail.
Sorry, I have not seen and still do not see any point in iterators. They abstract the list or vector away from you, when in fact you should already know what vector or list you're dealing with; if you don't, then you're just setting yourself up for some great debugging sessions in the future.
You might want to use an iterator if you are going to add/remove items to the vector while you are iterating over it.
some_iterator = some_vector.begin();
while (some_iterator != some_vector.end())
{
if (/* some condition */)
{
some_iterator = some_vector.erase(some_iterator);
// some_iterator now positioned at the element after the deleted element
}
else
{
if (/* some other condition */)
{
some_iterator = some_vector.insert(some_iterator, some_new_value);
// some_iterator now positioned at new element
}
++some_iterator;
}
}
If you were using indices you would have to shuffle items up/down in the array to handle the insertions and deletions.
Separation of Concerns
It's very nice to separate the iteration code from the 'core' concern of the loop. It's almost a design decision.
Indeed, iterating by index ties you to the implementation of the container. Asking the container for a begin and end iterator, enables the loop code for use with other container types.
Also, in the std::for_each way, you TELL the collection what to do, instead of ASKing it something about its internals
The 0x standard is going to introduce closures, which will make this approach much more easy to use - have a look at the expressive power of e.g. Ruby's [1..6].each { |i| print i; }...
Performance
But maybe an often overlooked issue is that using the for_each approach yields an opportunity to have the iteration parallelized: the Intel Threading Building Blocks can distribute the code block over the number of processors in the system!
Note: after discovering the algorithms library, and especially foreach, I went through two or three months of writing ridiculously small 'helper' operator structs which will drive your fellow developers crazy. After this time, I went back to a pragmatic approach - small loop bodies deserve no foreach no more :)
A must read reference on iterators is the book "Extended STL".
The GoF have a tiny little paragraph in the end of the Iterator pattern, which talks about this brand of iteration; it's called an 'internal iterator'. Have a look here, too.
Because it is more object-oriented. if you are iterating with an index you are assuming:
a) that those objects are ordered
b) that those objects can be obtained by an index
c) that the index increment will hit every item
d) that that index starts at zero
With an iterator, you are saying "give me everything so I can work with it" without knowing what the underlying implementation is. (In Java, there are collections that cannot be accessed through an index)
Also, with an iterator, no need to worry about going out of bounds of the array.
Another nice thing about iterators is that they better allow you to express (and enforce) your const-preference. This example ensures that you will not be altering the vector in the midst of your loop:
for(std::vector<Foo>::const_iterator pos=foos.begin(); pos != foos.end(); ++pos)
{
// Foo & foo = *pos; // this won't compile
const Foo & foo = *pos; // this will compile
}
Aside from all of the other excellent answers... int may not be large enough for your vector. Instead, if you want to use indexing, use the size_type for your container:
for (std::vector<Foo>::size_type i = 0; i < myvector.size(); ++i)
{
Foo& this_foo = myvector[i];
// Do stuff with this_foo
}
I probably should point out you can also call
std::for_each(some_vector.begin(), some_vector.end(), &do_stuff);
STL iterators are mostly there so that the STL algorithms like sort can be container independent.
If you just want to loop over all the entries in a vector just use the index loop style.
It is less typing and easier to parse for most humans. It would be nice if C++ had a simple foreach loop without going overboard with template magic.
for( size_t i = 0; i < some_vector.size(); ++i )
{
T& rT = some_vector[i];
// now do something with rT
}
I don't think it makes much difference for a vector. I prefer to use an index myself as I consider it to be more readable and you can do random access like jumping forward 6 items or jumping backwards if needs be.
I also like to make a reference to the item inside the loop like this so there are not a lot of square brackets around the place:
for(size_t i = 0; i < myvector.size(); i++)
{
MyClass &item = myvector[i];
// Do stuff to "item".
}
Using an iterator can be good if you think you might need to replace the vector with a list at some point in the future and it also looks more stylish to the STL freaks but I can't think of any other reason.
The second form represents what you're doing more accurately. In your example, you don't care about the value of i, really - all you want is the next element in the iterator.
After having learned a little more on the subject of this answer, I realize it was a bit of an oversimplification. The difference between this loop:
for (some_iterator = some_vector.begin(); some_iterator != some_vector.end();
some_iterator++)
{
//do stuff
}
And this loop:
for (int i = 0; i < some_vector.size(); i++)
{
//do stuff
}
Is fairly minimal. In fact, the syntax of doing loops this way seems to be growing on me:
while (it != end){
//do stuff
++it;
}
Iterators do unlock some fairly powerful declarative features, and when combined with the STL algorithms library you can do some pretty cool things that are outside the scope of array index administrivia.
Indexing requires an extra mul operation. For example, for vector<int> v, the compiler converts v[i] into an access at the address of the first element plus sizeof(int) * i bytes.
During iteration you don't need to know the number of items to be processed. You just need the item, and iterators do such things very well.
No one has mentioned yet that one advantage of indices is that they do not become invalid when you append to a contiguous container like std::vector, so you can add items to the container during iteration.
This is also possible with iterators, but you must call reserve(), and therefore need to know how many items you'll append.
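A small sketch of both variants follows. The function names and the "append a doubled copy of every even element" task are invented for this example. The iterator version relies on reserve() being called with an upper bound on the final size, so that push_back never reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index-based: i stays meaningful even if push_back reallocates the buffer.
void append_doubled_evens(std::vector<int>& v)
{
    const std::size_t n = v.size();          // visit only the original elements
    for (std::size_t i = 0; i < n; ++i) {
        if (v[i] % 2 == 0) {
            const int value = v[i];          // copy before the vector may grow
            v.push_back(2 * value);
        }
    }
}

// Iterator-based: only safe because reserve() rules out reallocation.
void append_doubled_evens_iter(std::vector<int>& v)
{
    const std::size_t n = v.size();
    v.reserve(2 * n);                        // worst case: every element is even
    assert(v.capacity() >= 2 * n);
    auto it = v.begin();                     // obtained after reserve()
    for (std::size_t seen = 0; seen < n; ++seen, ++it) {
        if (*it % 2 == 0) {
            const int value = *it;
            v.push_back(2 * value);          // 'it' is before the insertion point
        }
    }
}
```

Without the reserve() call, the first push_back that triggers a reallocation would leave it dangling, whereas the index version needs no such precaution.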
If you have access to C++11 features, then you can also use a range-based for loop for iterating over your vector (or any other container) as follows:
for (auto &item : some_vector)
{
//do stuff
}
The benefit of this loop is that you can access elements of the vector directly via the item variable, without running the risk of messing up an index or making a mistake when dereferencing an iterator. In addition, the placeholder auto prevents you from having to repeat the type of the container elements, which brings you even closer to a container-independent solution.
Notes:
If you need the element index in your loop and the operator[] exists for your container (and is fast enough for you), then better go for your first way.
A range-based for loop cannot be used to add/delete elements into/from a container. If you want to do that, then better stick to the solution given by Brian Matthews.
If you don't want to change the elements in your container, then you should use the keyword const as follows: for (auto const &item : some_vector) { ... }.
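The two reference forms can be compared side by side in a short sketch. The helpers double_all, total and demo are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Modifying traversal: auto& binds to each element, so changes stick.
template <class Container>
void double_all(Container& c)
{
    for (auto& item : c)
        item *= 2;
}

// Read-only traversal: auto const& rules out accidental modification.
template <class Container>
long total(const Container& c)
{
    long sum = 0;
    for (auto const& item : c)
        sum += item;
    return sum;
}

// Usage: neither helper mentions the element type or the container type.
void demo()
{
    std::vector<int> v{1, 2, 3};
    std::list<int> l{1, 2, 3};
    double_all(v);
    double_all(l);
    assert(total(v) == 12);
    assert(total(l) == 12);
}
```

Writing for (auto item : c) without the & would iterate over copies, so double_all would silently leave the container unchanged.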
Several good points already. I have a few additional comments:
Assuming we are talking about the C++ standard library, "vector" implies a random access container that has the guarantees of a C array (random access, contiguous memory layout etc). If you had said 'some_container', many of the above answers would have been more accurate (container independence etc).
To eliminate any dependencies on compiler optimization, you could move some_vector.size() out of the loop in the indexed code, like so:
const size_t numElems = some_vector.size();
for (size_t i = 0; i < numElems; ++i) { /* do stuff */ }
Always pre-increment iterators and treat post-increments as exceptional cases.
for (some_iterator = some_vector.begin(); some_iterator != some_vector.end(); ++some_iterator){ //do stuff }
So assuming an indexable std::vector<>-like container, there is no good reason to prefer one over the other when sequentially going through the container. If you have to refer to older or newer element indexes frequently, then the indexed version is more appropriate.
In general, using the iterators is preferred because algorithms make use of them and behavior can be controlled (and implicitly documented) by changing the type of the iterator. Array locations can be used in place of iterators, but the syntactical difference will stick out.
I don't use iterators for the same reason I dislike foreach-statements. When having multiple inner-loops it's hard enough to keep track of global/member variables without having to remember all the local values and iterator-names as well. What I find useful is to use two sets of indices for different occasions:
for(int i=0;i<anims.size();i++)
for(int j=0;j<bones.size();j++)
{
int animIndex = i;
int boneIndex = j;
// in relatively short code I use indices i and j
... animation_matrices[i][j] ...
// in long and complicated code I use indices animIndex and boneIndex
... animation_matrices[animIndex][boneIndex] ...
}
I don't even want to abbreviate things like "animation_matrices[i]" to some random "anim_matrix"-named-iterator for example, because then you can't see clearly from which array this value is originated.
If you like being close to the metal / don't trust their implementation details, don't use iterators.
If you regularly switch out one collection type for another during development, use iterators.
If you find it difficult to remember how to iterate different sorts of collections (maybe you have several types from several different external sources in use), use iterators to unify the means by which you walk over elements. This applies to say switching a linked list with an array list.
Really, that's all there is to it. It's not as if you're going to gain more brevity either way on average, and if brevity really is your goal, you can always fall back on macros.
Even better than "telling the CPU what to do" (imperative) is "telling the libraries what you want" (functional).
So instead of using loops you should learn the algorithms present in stl.
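To make that concrete, here is a small sketch of three hand-written loops replaced by standard algorithms. The helper names total, count_even and contains are invented for this example; std::accumulate, std::count_if and std::find are the real library calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Sum of all elements, instead of a loop with a running total.
int total(const std::vector<int>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);
}

// Number of even elements, instead of a loop with a counter and an if.
std::ptrdiff_t count_even(const std::vector<int>& v)
{
    return std::count_if(v.begin(), v.end(),
                         [](int x) { return x % 2 == 0; });
}

// Membership test, instead of a loop with an early break.
bool contains(const std::vector<int>& v, int wanted)
{
    return std::find(v.begin(), v.end(), wanted) != v.end();
}
```

Each call states what is wanted (a sum, a count, a search) and leaves the stepping to the library.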
For container independence
I always use array index because many application of mine require something like "display thumbnail image". So I wrote something like this:
some_vector[0].left=0;
some_vector[0].top = 0;
for (int i = 1; i < some_vector.size(); i++)
{
some_vector[i].left = some_vector[i-1].width + some_vector[i-1].left;
if(i % 6 ==0)
{
some_vector[i].top = some_vector[i-1].height + some_vector[i-1].top;
some_vector[i].left = 0;
}
}
Both the implementations are correct, but I would prefer the 'for' loop. As we have decided to use a Vector and not any other container, using indexes would be the best option. Using iterators with Vectors would lose the very benefit of having the objects in continuous memory blocks which help ease in their access.
I felt that none of the answers here explain why I like iterators as a general concept over indexing into containers. Note that most of my experience using iterators doesn't actually come from C++ but from higher-level programming languages like Python.
The iterator interface imposes fewer requirements on consumers of your function, which allows consumers to do more with it.
If all you need is to be able to forward-iterate, the developer isn't limited to using indexable containers - they can use any class implementing operator++(T&), operator*(T) and operator!=(const T&, const T&).
#include <iostream>
template <class InputIterator>
void printAll(InputIterator begin, InputIterator end)
{
for (auto current = begin; current != end; ++current) {
std::cout << *current << "\n";
}
}
// elsewhere...
printAll(myVector.begin(), myVector.end());
Your algorithm works for the case you need it - iterating over a vector - but it can also be useful for applications you don't necessarily anticipate:
#include <cstdint>
#include <random>
class RandomIterator
{
private:
std::mt19937 random;
std::uint_fast32_t current;
std::uint_fast32_t floor;
std::uint_fast32_t ceil;
public:
RandomIterator(
std::uint_fast32_t floor = 0,
std::uint_fast32_t ceil = UINT_FAST32_MAX,
std::uint_fast32_t seed = std::mt19937::default_seed
) :
floor(floor),
ceil(ceil)
{
random.seed(seed);
++(*this);
}
RandomIterator& operator++()
{
current = floor + (random() % (ceil - floor));
return *this;
}
std::uint_fast32_t operator*() const
{
return current;
}
bool operator!=(const RandomIterator &that) const
{
return current != that.current;
}
};
int main()
{
// roll a 1d6 until we get a 6 and print the results
RandomIterator firstRandom(1, 7, std::random_device()());
RandomIterator secondRandom(6, 7);
printAll(firstRandom, secondRandom);
return 0;
}
Attempting to implement a square-brackets operator which does something similar to this iterator would be contrived, while the iterator implementation is relatively simple. The square-brackets operator also makes implications about the capabilities of your class - that you can index to any arbitrary point - which may be difficult or inefficient to implement.
Iterators also lend themselves to decoration. People can write iterators which take an iterator in their constructor and extend its functionality:
template<class InputIterator, typename T>
class FilterIterator
{
private:
InputIterator internalIterator;
public:
FilterIterator(const InputIterator &iterator):
internalIterator(iterator)
{
}
virtual bool condition(T) = 0;
FilterIterator<InputIterator, T>& operator++()
{
do {
++(internalIterator);
} while (!condition(*internalIterator));
return *this;
}
T operator*()
{
// Needed for the first result
if (!condition(*internalIterator))
++(*this);
return *internalIterator;
}
virtual bool operator!=(const FilterIterator& that) const
{
return internalIterator != that.internalIterator;
}
};
template <class InputIterator>
class EvenIterator : public FilterIterator<InputIterator, std::uint_fast32_t>
{
public:
EvenIterator(const InputIterator &internalIterator) :
FilterIterator<InputIterator, std::uint_fast32_t>(internalIterator)
{
}
bool condition(std::uint_fast32_t n)
{
return !(n % 2);
}
};
int main()
{
// Rolls a d20 until a 20 is rolled and discards odd rolls
EvenIterator<RandomIterator> firstRandom(RandomIterator(1, 21, std::random_device()()));
EvenIterator<RandomIterator> secondRandom(RandomIterator(20, 21));
printAll(firstRandom, secondRandom);
return 0;
}
While these toys might seem mundane, it's not difficult to imagine using iterators and iterator decorators to do powerful things with a simple interface - decorating a forward-only iterator of database results with an iterator which constructs a model object from a single result, for example. These patterns enable memory-efficient iteration of infinite sets and, with a filter like the one I wrote above, potentially lazy evaluation of results.
Part of the power of C++ templates is your iterator interface, when applied to the likes of fixed-length C arrays, decays to simple and efficient pointer arithmetic, making it a truly zero-cost abstraction.
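A minimal sketch of that last point: one function template written purely against the iterator interface, usable with raw pointers into a C array as well as with container iterators. The name sum_range is invented for this example.

```cpp
#include <iterator>
#include <list>    // used only by the usage described below
#include <vector>  // used only by the usage described below

// Works for anything that supports !=, ++ and unary *.
// For a C array, InputIt is deduced as int* and the loop is plain pointer arithmetic.
template <class InputIt>
long sum_range(InputIt first, InputIt last)
{
    long total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}
```

With int raw[] = {1, 2, 3, 4}, the call sum_range(std::begin(raw), std::end(raw)) yields 10; the same template also accepts vec.begin(), vec.end() for a std::vector<int> or lst.begin(), lst.end() for a std::list<int>.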
Possible Duplicate:
Why use iterators instead of array indices?
I'm reviewing my knowledge on C++ and I've stumbled upon iterators. One thing I want to know is what makes them so special and I want to know why this:
using namespace std;
vector<int> myIntVector;
vector<int>::iterator myIntVectorIterator;
// Add some elements to myIntVector
myIntVector.push_back(1);
myIntVector.push_back(4);
myIntVector.push_back(8);
for(myIntVectorIterator = myIntVector.begin();
myIntVectorIterator != myIntVector.end();
myIntVectorIterator++)
{
cout<<*myIntVectorIterator<<" ";
//Should output 1 4 8
}
is better than this:
using namespace std;
vector<int> myIntVector;
// Add some elements to myIntVector
myIntVector.push_back(1);
myIntVector.push_back(4);
myIntVector.push_back(8);
for(int y=0; y<myIntVector.size(); y++)
{
cout<<myIntVector[y]<<" ";
//Should output 1 4 8
}
And yes I know that I shouldn't be using the std namespace. I just took this example off of the cprogramming website. So can you please tell me why the latter is worse? What's the big difference?
The special thing about iterators is that they provide the glue between algorithms and containers. For generic code, the recommendation would be to use a combination of STL algorithms (e.g. find, sort, remove, copy) etc. that carries out the computation that you have in mind on your data structure (vector, list, map etc.), and to supply that algorithm with iterators into your container.
Your particular example could be written as a combination of the for_each algorithm and the vector container (see option 3) below), but it's only one out of four distinct ways to iterate over a std::vector:
1) index-based iteration
for (std::size_t i = 0; i != v.size(); ++i) {
// access element as v[i]
// any code including continue, break, return
}
Advantages: familiar to anyone familiar with C-style code, can loop using different strides (e.g. i += 2).
Disadvantages: only for sequential random access containers (vector, array, deque), doesn't work for list, forward_list or the associative containers. Also the loop control is a little verbose (init, check, increment). People need to be aware of the 0-based indexing in C++.
2) iterator-based iteration
for (auto it = v.begin(); it != v.end(); ++it) {
// if the current index is needed:
auto i = std::distance(v.begin(), it);
// access element as *it
// any code including continue, break, return
}
Advantages: more generic, works for all containers (even the new unordered associative containers), can also use different strides (e.g. std::advance(it, 2)).
Disadvantages: need extra work to get the index of the current element (could be O(N) for list or forward_list). Again, the loop control is a little verbose (init, check, increment).
3) STL for_each algorithm + lambda
std::for_each(v.begin(), v.end(), [&](T const& elem) { // [&] so the body can refer to v
// if the current index is needed:
auto i = &elem - &v[0];
// cannot continue, break or return out of the loop
});
Advantages: same as 2) plus small reduction in loop control (no check and increment), this can greatly reduce your bug rate (wrong init, check or increment, off-by-one errors).
Disadvantages: same as explicit iterator-loop plus restricted possibilities for flow control in the loop (cannot use continue, break or return) and no option for different strides (unless you use an iterator adapter that overloads operator++).
4) range-for loop
for (auto& elem: v) {
// if the current index is needed:
auto i = &elem - &v[0];
// any code including continue, break, return
}
Advantages: very compact loop control, direct access to the current element.
Disadvantages: extra statement to get the index. Cannot use different strides.
What to use?
For your particular example of iterating over std::vector: if you really need the index (e.g. access the previous or next element, printing/logging the index inside the loop etc.) or you need a stride different than 1, then I would go for the explicitly indexed-loop, otherwise I'd go for the range-for loop.
For generic algorithms on generic containers I'd go for the explicit iterator loop unless the code contained no flow control inside the loop and needed stride 1, in which case I'd go for the STL for_each + a lambda.
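As a sketch of those two recommendations, the first function below uses an indexed loop because it needs a stride of 2, and the second uses an explicit iterator loop because it is generic and returns early. Both names, sum_even_positions and first_negative_index, are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <list>    // used only by the usage described below
#include <vector>

// Stride of 2: the indexed loop expresses this directly.
int sum_even_positions(const std::vector<int>& v)
{
    int total = 0;
    for (std::size_t i = 0; i < v.size(); i += 2) {
        total += v[i];
    }
    return total;
}

// Generic iterator loop with early return; the index is recovered with std::distance.
// Returns the position of the first negative element, or -1 if there is none.
template <class Container>
std::ptrdiff_t first_negative_index(const Container& c)
{
    for (auto it = c.begin(); it != c.end(); ++it) {
        if (*it < 0) {
            return std::distance(c.begin(), it);
        }
    }
    return -1;
}
```

first_negative_index accepts a std::vector<int> as well as a std::list<int>; for the list, std::distance has to walk the nodes, which is the O(N) cost mentioned under option 2.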
With a vector, iterators do not offer any real advantage. The syntax is uglier, longer to type and harder to read.
Iterating over a vector using iterators is not faster and is not safer (actually, if the vector is possibly resized during the iteration, using iterators will get you into big trouble).
The idea of having a generic loop that keeps working when you later change the container type is also mostly nonsense in real cases. Unfortunately the dark side of a strictly typed language without serious type inference (a bit better now with C++11, however) is that you need to say what the type of everything is at each step. If you change your mind later you will still need to go around and change everything. Moreover, different containers have very different trade-offs, and changing container type is not something that happens that often.
The only case in which iteration should be kept generic, if possible, is when writing template code, but that (I hope for you) is not the most frequent case.
The only problem present in your explicit index loop is that size returns an unsigned value (a design bug of C++) and comparison between signed and unsigned is dangerous and surprising, so better avoided. If you use a decent compiler with warnings enabled there should be a diagnostic on that.
Note that the solution is not to use an unsigned as the index, because arithmetic between unsigned values is also apparently illogical (it's modulo arithmetic, and x-1 may be bigger than x). You should instead cast the size to an integer before using it.
It may make some sense to use unsigned sizes and indexes (paying a LOT of attention to every expression you write) only if you're working on a 16 bit C++ implementation (16 bit was the reason for having unsigned values in sizes).
As a typical mistake that unsigned size may introduce consider:
void drawPolyline(const std::vector<P2d>& points)
{
for (int i=0; i<points.size()-1; i++)
drawLine(points[i], points[i+1]);
}
Here the bug is present because if you pass an empty points vector, the value points.size()-1 will be a huge positive number, making the loop run into a segfault.
A working solution could be
for (int i=1; i<points.size(); i++)
drawLine(points[i - 1], points[i]);
but I personally prefer to always remove unsigned-ness with int(v.size()).
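To see the wrap-around in isolation, here is a small sketch that computes the loop bound both ways. The three function names are made up for this illustration, and plain int elements stand in for P2d.

```cpp
#include <cstddef>
#include <vector>

// What the buggy loop bound evaluates to: size() is unsigned, so 0 - 1 wraps around.
std::size_t last_index_unsigned(const std::vector<int>& v)
{
    return v.size() - 1;
}

// Converting the size to int first keeps the arithmetic signed: an empty vector gives -1.
int last_index_signed(const std::vector<int>& v)
{
    return static_cast<int>(v.size()) - 1;
}

// Counts adjacent pairs using the signed bound, so an empty vector yields 0 iterations.
int count_segments(const std::vector<int>& points)
{
    int segments = 0;
    const int n = static_cast<int>(points.size());
    for (int i = 0; i < n - 1; ++i) {
        ++segments;
    }
    return segments;
}
```

For an empty vector, last_index_unsigned returns the largest value std::size_t can hold, while last_index_signed returns -1 and count_segments returns 0.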
PS: If you really don't want to think through the implications yourself and simply want an expert to tell you, then consider that quite a few world-recognized C++ experts agree, and have expressed the opinion, that unsigned values are a bad idea except for bit manipulation.
Discovering the ugliness of using iterators in the case of iterating up to second-last is left as an exercise for the reader.
Iterators make your code more generic.
Every standard library container provides an iterator, hence if you change your container class in future the loop won't be affected.
Iterators are first choice over operator[]. C++11 provides std::begin(), std::end() functions.
As your code uses just std::vector, I can't say there is much difference between the two versions; however, operator[] may not behave as you intend. For example, if you use a map, operator[] will insert an element if the key is not found.
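The map behaviour is easy to check with a short sketch. Both helpers below are hypothetical; they take the map by value on purpose so the caller's map is left untouched, and they report the size after the lookup.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Looks the key up with operator[]; a missing key is inserted with a default value.
std::size_t size_after_bracket_lookup(std::map<std::string, int> m,
                                      const std::string& key)
{
    int value = m[key];
    (void)value;  // only the side effect on the map matters here
    return m.size();
}

// Looks the key up with find(); a missing key leaves the map unchanged.
std::size_t size_after_find_lookup(std::map<std::string, int> m,
                                   const std::string& key)
{
    auto it = m.find(key);
    (void)it;
    return m.size();
}
```

Starting from a map with one entry, looking up a missing key with operator[] reports a size of 2, whereas find() reports 1.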
Also, by using iterators your code becomes more portable between containers. If you use iterators, you can switch from std::vector to std::list or another container freely without changing much; no such rule applies to operator[].
It always depends on what you need.
You should use operator[] when you need direct access to elements in the vector (when you need to index a specific element in the vector). There is nothing wrong in using it over iterators. However, you must decide for yourself which (operator[] or iterators) suits best your needs.
Using iterators would enable you to switch to other container types without much change in your code. In other words, using iterators would make your code more generic, and does not depend on a particular type of container.
By writing your client code in terms of iterators you abstract away the container completely.
Consider this code:
class ExpressionParser // some generic arbitrary expression parser
{
public:
template<typename It>
void parse(It begin, const It end)
{
using namespace std;
using namespace std::placeholders;
for_each(begin, end,
bind(&ExpressionParser::process_next, this, _1));
}
// process next char in a stream (defined elsewhere)
void process_next(char c);
};
client code:
ExpressionParser p;
std::string expression("SUM(A) FOR A in [1, 2, 3, 4]");
p.parse(expression.begin(), expression.end());
std::ifstream file("expression.txt");
p.parse(std::istream_iterator<char>(file), std::istream_iterator<char>());
char expr[] = "[12a^2 + 13a - 5] with a=108";
p.parse(std::begin(expr), std::end(expr));
Edit: Consider your original code example, implemented with std::copy:
using namespace std;
vector<int> myIntVector;
// Add some elements to myIntVector
myIntVector.push_back(1);
myIntVector.push_back(4);
myIntVector.push_back(8);
copy(myIntVector.begin(), myIntVector.end(),
std::ostream_iterator<int>(cout, " "));
The nice thing about iterators is that if you later want to switch your vector to another standard container, the for loop will still work.
It's a matter of speed. Using the iterator accesses the elements faster. A similar question was answered here:
What's faster, iterating an STL vector with vector::iterator or with at()?
Edit:
Speed of access varies with each CPU and compiler.
What is the fastest way (if there is any other) to convert a std::vector from one datatype to another (with the idea to save space)? For example:
std::vector<unsigned short> ----> std::vector<bool>
we obviously assume that the first vector only contains 0s and 1s. Copying element by element is highly inefficient in case of a really large vector.
Conditional question:
If you think there is no way to do it faster, is there a complex datatype which actually allows fast conversion from one datatype to another?
std::vector<bool>
Stop.
A std::vector<bool> is... not. std::vector has a specialization for the use of the type bool, which causes certain changes in the vector. Namely, it stops acting like a std::vector.
There are certain things that the standard guarantees you can do with a std::vector. And vector<bool> violates those guarantees. So you should be very careful about using them.
Anyway, I'm going to pretend you said vector<int> instead of vector<bool>, as the latter really complicates things.
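One concrete way in which vector<bool> stops acting like a vector is that operator[] hands back a proxy object rather than a bool&. The compile-time check below is only a sketch, and brackets_yield_real_reference is a name invented for it.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// True when indexing a non-const std::vector<T> yields a genuine T&.
template <class T>
constexpr bool brackets_yield_real_reference()
{
    return std::is_same<decltype(std::declval<std::vector<T>&>()[0]), T&>::value;
}
```

For int the check is true; for bool it is false, because the specialization returns std::vector<bool>::reference, a small class that stands in for a single packed bit.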
Copying element by element is highly inefficient in case of a really large vector.
Only if you do it wrong.
Vector casting of the type you want needs to be done carefully to be efficient.
If the source T type is convertible to the destination T, then this works just fine:
vector<Tnew> vec_new(vec_old.begin(), vec_old.end());
Decent implementations should recognize when they've been given random-access iterators and optimize the memory allocation and loop appropriately.
For non-convertible types, the biggest pitfall with simple types is doing this:
std::vector<int> newVec(oldVec.size());
That's bad. That will allocate a buffer of the proper size, but it will also fill it with data. Namely, default-constructed ints (int()).
Instead, you should do this:
std::vector<int> newVec;
newVec.reserve(oldVec.size());
This reserves capacity equal to the original vector, but it also ensures that no default construction takes place. You can now push_back to your heart's content, knowing that you will never cause reallocation in your new vector.
From there, you can just loop over each entry in the old vector, doing the conversion as needed.
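Put together, the reserve-then-push_back approach might look like the sketch below. It assumes an unsigned short source and an int destination, matching the pretend vector<int> above; convert_with_reserve is a hypothetical name, and the static_cast stands in for whatever per-element conversion is actually needed.

```cpp
#include <vector>

std::vector<int> convert_with_reserve(const std::vector<unsigned short>& oldVec)
{
    std::vector<int> newVec;
    newVec.reserve(oldVec.size());  // one allocation, no default-constructed elements

    for (unsigned short value : oldVec) {
        // Replace this cast with the real conversion when the types are not convertible.
        newVec.push_back(static_cast<int>(value));
    }
    return newVec;
}
```

Because the capacity is reserved up front, none of the push_back calls in the loop triggers a reallocation.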
There's no way to avoid the copy, since a std::vector<T> is a distinct
type from std::vector<U>, and there's no way for them to share the
memory. Other than that, it depends on how the data is mapped. If the
mapping corresponds to an implicit conversion (e.g. unsigned short to
bool), then simply creating a new vector using the begin and end
iterators from the old will do the trick:
std::vector<bool> newV( oldV.begin(), oldV.end() );
If the mapping isn't just an implicit conversion (and this includes
cases where you want to verify things; e.g. that the unsigned short
does contain only 0 or 1), then it gets more complicated. The
obvious solution would be to use std::transform:
std::vector<TargetType> newV;
newV.reserve( oldV.size() ); // avoids unnecessary reallocations
std::transform( oldV.begin(), oldV.end(),
std::back_inserter( newV ),
TranformationObject() );
, where TranformationObject is a functional object which does the
transformation, e.g.:
struct ToBool : public std::unary_function<unsigned short, bool>
{
bool operator()( unsigned short original ) const
{
if ( original != 0 && original != 1 )
throw Something();
return original != 0;
}
};
(Note that I'm just using this transformation function as an example.
If the only thing which distinguishes the transformation function from
an implicit conversion is the verification, it might be faster to verify
all of the values in oldV first, using std::for_each, and then use
the two iterator constructor above.)
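A sketch of that verify-first variant follows. Note that it swaps in std::all_of (C++11) for the std::for_each mentioned above, and reports failure through a bool instead of throwing; the name try_to_bools is made up.

```cpp
#include <algorithm>
#include <vector>

// Verifies that every value is 0 or 1, then converts with the two-iterator constructor.
// Returns false and leaves `out` untouched if any value is out of range.
bool try_to_bools(const std::vector<unsigned short>& oldV, std::vector<bool>& out)
{
    const bool valid = std::all_of(oldV.begin(), oldV.end(),
                                   [](unsigned short x) { return x == 0 || x == 1; });
    if (!valid) {
        return false;
    }
    out = std::vector<bool>(oldV.begin(), oldV.end());
    return true;
}
```

This makes two passes over oldV, but each pass is a simple loop with no per-element branching on errors during the copy itself.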
Depending on the cost of default constructing the target type, it may be
faster to create the new vector with the correct size, then overwrite
it:
std::vector<TargetType> newV( oldV.size() );
std::transform( oldV.begin(), oldV.end(),
newV.begin(),
TranformationObject() );
Finally, another possibility would be to use a
boost::transform_iterator. Something like:
std::vector<TargetType> newV(
boost::make_transform_iterator( oldV.begin(), TranformationObject() ),
boost::make_transform_iterator( oldV.end(), TranformationObject() ) );
In many ways, this is the solution I prefer; depending on how
boost::transform_iterator has been implemented, it could also be the
fastest.
You should be able to use assign like this:
vector<unsigned short> v;
//...
vector<bool> u;
//...
u.assign(v.begin(), v.end());
class A{... }
class B{....}
B convert_A_to_B(const A& a){.......}
void convertVector_A_to_B(const vector<A>& va, vector<B>& vb)
{
vb.clear();
vb.reserve(va.size());
std::transform(va.begin(), va.end(), std::back_inserter(vb), convert_A_to_B);
}
The fastest way to do it is to not do it. For example, if you know in advance that your items only need a byte for storage, use a byte-size vector to begin with. You'll find it difficult to find a faster way than that :-)
If that's not possible, then just absorb the cost of the conversion. Even if it's a little slow (and that's by no means certain, see Nicol's excellent answer for details), it's still necessary. If it wasn't, you would just leave it in the larger-type vector.
First, a warning: Don't do what I'm about to suggest. It's dangerous and must never be done. That said, if you just have to squeeze out a tiny bit more performance No Matter What...
First, there are some caveats. If you don't meet these, you can't do this:
The vector must contain plain-old-data. If your type has pointers, or uses a destructor, or needs an operator = to copy correctly ... do not do this.
The sizeof() of both vectors' contained types must be the same. That is, vector< A > can copy from vector< B > only if sizeof(A) == sizeof(B).
Here is a fairly stable method:
vector< A > a;
vector< B > b;
a.resize( b.size() );
assert( sizeof(vector< A >::value_type) == sizeof(vector< B >::value_type) );
if( b.size() == 0 )
a.clear();
else
memcpy( &(*a.begin()), &(*b.begin()), b.size() * sizeof(B) );
This does a very fast, block copy of the memory contained in vector b, directly smashing whatever data you have in vector a. It doesn't call constructors, it doesn't do any safety checking, and it's much faster than any of the other methods given here. An optimizing compiler should be able to match the speed of this in theory, but unless you're using an unusually good one, it won't (I checked with Visual C++ a few years ago, and it wasn't even close).
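If you do go down this road, the two caveats can at least be turned into compile-time checks. The following is a sketch under those stated constraints, not a recommendation; bitwise_copy is a hypothetical helper, and it additionally assumes the destination type can be default-constructed.

```cpp
#include <cstring>
#include <type_traits>
#include <vector>

// Block-copies the raw bytes of src into a freshly sized vector<To>.
template <class To, class From>
std::vector<To> bitwise_copy(const std::vector<From>& src)
{
    static_assert(sizeof(To) == sizeof(From),
                  "element types must have the same size");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "element types must be plain old data");

    std::vector<To> dst(src.size());
    if (!src.empty()) {
        std::memcpy(dst.data(), src.data(), src.size() * sizeof(From));
    }
    return dst;
}
```

For example, bitwise_copy<int>(v) with a std::vector<unsigned> v compiles, while a pair of types with different sizes is rejected by the first static_assert.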
Also, given these constraints, you could forcibly (via void *) cast one vector type to the other and swap them -- I had a code sample for that, but it started oozing ectoplasm on my screen, so I deleted it.
Copying element by element is not highly inefficient. std::vector provides constant access time to any of its elements, hence the operation will be O(n) overall. You will not notice it.
#ifdef VECTOR_H_TYPE1
#ifdef VECTOR_H_TYPE2
#ifdef VECTOR_H_CLASS
/* Other methods can be added as needed, provided they likewise carry out the same operations on both */
#include <vector>
using namespace std;
class VECTOR_H_CLASS {
public:
vector<VECTOR_H_TYPE1> *firstVec;
vector<VECTOR_H_TYPE2> *secondVec;
VECTOR_H_CLASS(vector<VECTOR_H_TYPE1> &v1, vector<VECTOR_H_TYPE2> &v2) { firstVec = &v1; secondVec = &v2; }
~VECTOR_H_CLASS() {}
void init() { // Use this to copy a full vector into an empty (or garbage) vector to equalize them
secondVec->clear();
for(vector<VECTOR_H_TYPE1>::iterator it = firstVec->begin(); it != firstVec->end(); it++) secondVec->push_back((VECTOR_H_TYPE2)*it);
}
void push_back(void *value) {
firstVec->push_back((VECTOR_H_TYPE1)value);
secondVec->push_back((VECTOR_H_TYPE2)value);
}
void pop_back() {
firstVec->pop_back();
secondVec->pop_back();
}
void clear() {
firstVec->clear();
secondVec->clear();
}
};
#undef VECTOR_H_CLASS
#endif
#undef VECTOR_H_TYPE2
#endif
#undef VECTOR_H_TYPE1
#endif
Take the following two lines of code:
for (int i = 0; i < some_vector.size(); i++)
{
//do stuff
}
And this:
for (some_iterator = some_vector.begin(); some_iterator != some_vector.end();
some_iterator++)
{
//do stuff
}
I'm told that the second way is preferred. Why exactly is this?
The first form is efficient only if vector.size() is a fast operation. This is true for vectors, but not for lists, for example. Also, what are you planning to do within the body of the loop? If you plan on accessing the elements as in
T elem = some_vector[i];
then you're making the assumption that the container has operator[](std::size_t) defined. Again, this is true for vector but not for other containers.
The use of iterators brings you closer to container independence. You're not making assumptions about random-access ability or fast size() operation, only that the container has iterator capabilities.
You could enhance your code further by using standard algorithms. Depending on what it is you're trying to achieve, you may elect to use std::for_each(), std::transform() and so on. By using a standard algorithm rather than an explicit loop you're avoiding re-inventing the wheel. Your code is likely to be more efficient (given the right algorithm is chosen), correct and reusable.
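For instance, a loop that builds a second vector from the first can be written with std::transform. This is only a sketch; the function name squares and the choice of squaring are arbitrary.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Builds a new vector holding the square of each input element.
std::vector<int> squares(const std::vector<int>& input)
{
    std::vector<int> output;
    output.reserve(input.size());
    std::transform(input.begin(), input.end(),
                   std::back_inserter(output),
                   [](int x) { return x * x; });
    return output;
}
```

The algorithm only needs iterators, so the same call works unchanged if input becomes a std::list<int>.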
It's part of the modern C++ indoctrination process. Iterators are the only way to iterate most containers, so you use it even with vectors just to get yourself into the proper mindset. Seriously, that's the only reason I do it - I don't think I've ever replaced a vector with a different kind of container.
Wow, this is still getting downvoted after three weeks. I guess it doesn't pay to be a little tongue-in-cheek.
I think the array index is more readable. It matches the syntax used in other languages, and the syntax used for old-fashioned C arrays. It's also less verbose. Efficiency should be a wash if your compiler is any good, and there are hardly any cases where it matters anyway.
Even so, I still find myself using iterators frequently with vectors. I believe the iterator is an important concept, so I promote it whenever I can.
Because you are not tying your code to the particular implementation of the some_vector list. If you use array indices, it has to be some form of array; if you use iterators, you can use that code on any list implementation.
Imagine some_vector is implemented with a linked-list. Then requesting an item in the i-th place requires i operations to be done to traverse the list of nodes. Now, if you use iterator, generally speaking, it will make its best effort to be as efficient as possible (in the case of a linked list, it will maintain a pointer to the current node and advance it in each iteration, requiring just a single operation).
So it provides two things:
Abstraction of use: you just want to iterate some elements, you don't care about how to do it
Performance
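The linked-list case can be sketched with std::list, which has no operator[], so positional access is emulated here with std::next. Both function names are invented for the comparison.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Index-style access: every lookup walks i nodes from the front, O(n^2) overall.
long sum_by_position(const std::list<int>& items)
{
    long total = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        total += *std::next(items.begin(), static_cast<std::ptrdiff_t>(i));
    }
    return total;
}

// Iterator-style access: each step follows a single link, O(n) overall.
long sum_by_iterator(const std::list<int>& items)
{
    long total = 0;
    for (auto it = items.begin(); it != items.end(); ++it) {
        total += *it;
    }
    return total;
}
```

Both functions return the same sum; only the amount of pointer chasing differs.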
I'm going to be the devil's advocate here and not recommend iterators. The main reason is that in all the source code I've worked on, from desktop application development to game development, I have never needed to use iterators. They have simply not been required, and secondly, the hidden assumptions, code mess, and debugging nightmares you get with iterators make them a prime example of what not to use in any application that requires speed.
Even from a maintenance standpoint they're a mess. It's not because of them but because of all the aliasing that happens behind the scenes. How do I know that you haven't implemented your own virtual vector or array list that does something completely different from the standard? Do I know what the type currently is at runtime? Did you overload an operator? I didn't have time to check all your source code. Hell, do I even know what version of the STL you're using?
The next problem you get with iterators is leaky abstraction, though there are numerous websites that discuss this in detail.
Sorry, I have not seen, and still do not see, any point in iterators. They abstract the list or vector away from you, when in fact you should already know what vector or list you're dealing with; if you don't, then you're just setting yourself up for some great debugging sessions in the future.
You might want to use an iterator if you are going to add/remove items to the vector while you are iterating over it.
some_iterator = some_vector.begin();
while (some_iterator != some_vector.end())
{
if (/* some condition */)
{
some_iterator = some_vector.erase(some_iterator);
// some_iterator now positioned at the element after the deleted element
}
else
{
if (/* some other condition */)
{
some_iterator = some_vector.insert(some_iterator, some_new_value);
// some_iterator now positioned at new element
}
++some_iterator;
}
}
If you were using indices you would have to shuffle items up/down in the array to handle the insertions and deletions.
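Reduced to the deletion case only, the pattern above might look like this sketch. The function name erase_even and the even-number condition are placeholders for whatever condition you actually need.

```cpp
#include <vector>

// Removes every even value in place, using the iterator returned by erase().
void erase_even(std::vector<int>& values)
{
    for (auto it = values.begin(); it != values.end(); ) {
        if (*it % 2 == 0) {
            it = values.erase(it);  // erase() returns the element after the removed one
        } else {
            ++it;                   // advance only when nothing was erased
        }
    }
}
```

The key detail is that the old iterator is invalid after erase(), so the loop always continues from the returned one.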
Separation of Concerns
It's very nice to separate the iteration code from the 'core' concern of the loop. It's almost a design decision.
Indeed, iterating by index ties you to the implementation of the container. Asking the container for a begin and end iterator, enables the loop code for use with other container types.
Also, in the std::for_each way, you TELL the collection what to do, instead of ASKing it something about its internals
The C++0x standard is going to introduce closures, which will make this approach much easier to use - have a look at the expressive power of e.g. Ruby's [1..6].each { |i| print i; }...
Performance
But maybe a much-overlooked issue is that using the for_each approach yields an opportunity to have the iteration parallelized - the Intel Threading Building Blocks can distribute the code block over the number of processors in the system!
Note: after discovering the algorithms library, and especially foreach, I went through two or three months of writing ridiculously small 'helper' operator structs which will drive your fellow developers crazy. After this time, I went back to a pragmatic approach - small loop bodies deserve no foreach no more :)
A must read reference on iterators is the book "Extended STL".
The GoF have a tiny little paragraph in the end of the Iterator pattern, which talks about this brand of iteration; it's called an 'internal iterator'. Have a look here, too.
Because it is more object-oriented. If you are iterating with an index you are assuming:
a) that those objects are ordered
b) that those objects can be obtained by an index
c) that the index increment will hit every item
d) that that index starts at zero
With an iterator, you are saying "give me everything so I can work with it" without knowing what the underlying implementation is. (In Java, there are collections that cannot be accessed through an index)
Also, with an iterator, no need to worry about going out of bounds of the array.
Another nice thing about iterators is that they better allow you to express (and enforce) your const-preference. This example ensures that you will not be altering the vector in the midst of your loop:
for(std::vector<Foo>::const_iterator pos=foos.begin(); pos != foos.end(); ++pos)
{
// Foo & foo = *pos; // this won't compile
const Foo & foo = *pos; // this will compile
}
Aside from all of the other excellent answers... int may not be large enough for your vector. Instead, if you want to use indexing, use the size_type for your container:
for (std::vector<Foo>::size_type i = 0; i < myvector.size(); ++i)
{
Foo& this_foo = myvector[i];
// Do stuff with this_foo
}
I probably should point out you can also call
std::for_each(some_vector.begin(), some_vector.end(), &do_stuff);
STL iterators are mostly there so that the STL algorithms like sort can be container independent.
If you just want to loop over all the entries in a vector just use the index loop style.
It is less typing and easier to parse for most humans. It would be nice if C++ had a simple foreach loop without going overboard with template magic.
for( size_t i = 0; i < some_vector.size(); ++i )
{
T& rT = some_vector[i];
// now do something with rT
}
I don't think it makes much difference for a vector. I prefer to use an index myself, as I consider it more readable, and you can do random access like jumping forward 6 items or jumping backwards if need be.
I also like to make a reference to the item inside the loop like this so there are not a lot of square brackets around the place:
for(size_t i = 0; i < myvector.size(); i++)
{
MyClass &item = myvector[i];
// Do stuff to "item".
}
Using an iterator can be good if you think you might need to replace the vector with a list at some point in the future and it also looks more stylish to the STL freaks but I can't think of any other reason.
The second form represents what you're doing more accurately. In your example, you don't care about the value of i, really - all you want is the next element in the iterator.
After having learned a little more on the subject of this answer, I realize it was a bit of an oversimplification. The difference between this loop:
for (some_iterator = some_vector.begin(); some_iterator != some_vector.end();
some_iterator++)
{
//do stuff
}
And this loop:
for (int i = 0; i < some_vector.size(); i++)
{
//do stuff
}
Is fairly minimal. In fact, the syntax of doing loops this way seems to be growing on me:
while (it != end){
//do stuff
++it;
}
Iterators do unlock some fairly powerful declarative features, and when combined with the STL algorithms library you can do some pretty cool things that are outside the scope of array index administrivia.
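As a rough illustration of what "declarative" means here, the sketch below leans on std::count_if and std::accumulate instead of hand-written loops. The function names countEven and sumAfterFirst are assumptions made up for this example:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helpers for illustration only.

// Counts the even values; the algorithm owns the loop, we only state the test.
inline std::ptrdiff_t countEven(const std::vector<int>& values)
{
    return std::count_if(values.begin(), values.end(),
                         [](int x) { return x % 2 == 0; });
}

// Sums everything except the first element by shifting the start iterator,
// with no index bookkeeping.
inline int sumAfterFirst(const std::vector<int>& values)
{
    if (values.empty()) {
        return 0;
    }
    return std::accumulate(values.begin() + 1, values.end(), 0);
}
```

For a vector holding 1 through 6, countEven gives 3 and sumAfterFirst gives 20. The lambda needs C++11; with an older compiler a small functor would do the same job.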
Indexing requires an extra mul operation. For example, for vector<int> v, the compiler converts v[i] into the address of the first element plus sizeof(int) * i.
During iteration you don't need to know the number of items to be processed. You just need the item, and iterators do that very well.
No one has mentioned yet that one advantage of indices is that they do not become invalid when you append to a contiguous container like std::vector, so you can add items to the container during iteration.
This is also possible with iterators, but you must call reserve(), and therefore need to know how many items you'll append.
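A small sketch of the index-based variant, under the assumption that the loop should also visit the newly appended items. The function appendHalves is a made-up example, not something from the question:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: for every non-zero even value seen, append its half.
// push_back may reallocate the vector, which invalidates iterators and
// references into it, but an index is just a number and stays meaningful.
inline void appendHalves(std::vector<int>& values)
{
    // size() is re-read on every pass, so appended items are visited too.
    for (std::size_t i = 0; i < values.size(); ++i) {
        const int current = values[i];  // copy: a reference could dangle after push_back
        if (current != 0 && current % 2 == 0) {
            values.push_back(current / 2);
        }
    }
}
```

Starting from {8, 3}, the vector ends up as {8, 3, 4, 2, 1}. Note that the element is copied before push_back; holding an int& across the append would have the same invalidation problem as an iterator.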
If you have access to C++11 features, then you can also use a range-based for loop for iterating over your vector (or any other container) as follows:
for (auto &item : some_vector)
{
//do stuff
}
The benefit of this loop is that you can access elements of the vector directly via the item variable, without running the risk of messing up an index or making a mistake when dereferencing an iterator. In addition, the placeholder auto saves you from having to repeat the type of the container elements, which brings you even closer to a container-independent solution.
Notes:
If you need the element index in your loop and operator[] exists for your container (and is fast enough for you), then better go for your first way.
A range-based for loop cannot be used to add/delete elements into/from a container. If you want to do that, then better stick to the solution given by Brian Matthews.
If you don't want to change the elements in your container, then you should use the keyword const as follows: for (auto const &item : some_vector) { ... }.
Several good points already. I have a few additional comments:
Assuming we are talking about the C++ standard library, "vector" implies a random access container that has the guarantees of a C array (random access, contiguous memory layout etc). If you had said 'some_container', many of the above answers would have been more accurate (container independence etc).
To eliminate any dependencies on compiler optimization, you could move some_vector.size() out of the loop in the indexed code, like so:
const size_t numElems = some_vector.size();
for (size_t i = 0; i < numElems; ++i){ //do stuff }
Always pre-increment iterators and treat post-increments as exceptional cases.
for (some_iterator = some_vector.begin(); some_iterator != some_vector.end(); ++some_iterator){ //do stuff }
So assuming an indexable std::vector<>-like container, there is no good reason to prefer one over the other when going sequentially through the container. If you have to refer to older or newer element indexes frequently, then the indexed version is more appropriate.
In general, using the iterators is preferred because algorithms make use of them and behavior can be controlled (and implicitly documented) by changing the type of the iterator. Array locations can be used in place of iterators, but the syntactical difference will stick out.
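One way to read "changing the type of the iterator" is sketched below: the loop body stays the same, and the caller picks forward, reverse or const iteration purely through the iterators passed in. The name joinRange is an assumption for this example:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins the strings of a range with commas, in whatever
// order the supplied iterators deliver them.
template <class Iterator>
std::string joinRange(Iterator first, Iterator last)
{
    std::string out;
    bool isFirst = true;
    for (; first != last; ++first) {
        if (!isFirst) {
            out += ",";
        }
        out += *first;
        isFirst = false;
    }
    return out;
}
```

Given a std::vector<std::string> named words holding "a", "b", "c", joinRange(words.begin(), words.end()) yields "a,b,c", while joinRange(words.rbegin(), words.rend()) yields "c,b,a". Passing cbegin()/cend() (C++11) documents at the call site that nothing is modified.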
I don't use iterators for the same reason I dislike foreach-statements. When having multiple inner-loops it's hard enough to keep track of global/member variables without having to remember all the local values and iterator-names as well. What I find useful is to use two sets of indices for different occasions:
for(int i=0;i<anims.size();i++)
for(int j=0;j<bones.size();j++)
{
int animIndex = i;
int boneIndex = j;
// in relatively short code I use indices i and j
... animation_matrices[i][j] ...
// in long and complicated code I use indices animIndex and boneIndex
... animation_matrices[animIndex][boneIndex] ...
}
I don't even want to abbreviate things like "animation_matrices[i]" to some random "anim_matrix"-named-iterator for example, because then you can't see clearly which array this value originates from.
If you like being close to the metal / don't trust their implementation details, don't use iterators.
If you regularly switch out one collection type for another during development, use iterators.
If you find it difficult to remember how to iterate different sorts of collections (maybe you have several types from several different external sources in use), use iterators to unify the means by which you walk over elements. This applies, say, to switching a linked list for an array list.
Really, that's all there is to it. It's not as if you're going to gain more brevity either way on average, and if brevity really is your goal, you can always fall back on macros.
Even better than "telling the CPU what to do" (imperative) is "telling the libraries what you want" (functional).
So instead of using loops you should learn the algorithms present in stl.
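For instance, a "square everything, then sort descending" task can be phrased entirely with standard algorithms. This is only a sketch; squaresDescending is a hypothetical name chosen for the example:

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the squares of the input, largest first.
inline std::vector<int> squaresDescending(const std::vector<int>& values)
{
    std::vector<int> result;
    result.reserve(values.size());

    // "What": map each element to its square and append it to result.
    std::transform(values.begin(), values.end(), std::back_inserter(result),
                   [](int x) { return x * x; });

    // "What": order the results from largest to smallest.
    std::sort(result.begin(), result.end(), std::greater<int>());
    return result;
}
```

With the input {-3, 1, 2} this returns {9, 4, 1}. Neither step mentions an index or a loop counter; the iteration details live inside the library.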
For container independence
I always use an array index because many of my applications require something like "display thumbnail images". So I wrote something like this:
some_vector[0].left=0;
some_vector[0].top = 0;
for (int i = 1; i < some_vector.size(); i++)
{
some_vector[i].left = some_vector[i-1].width + some_vector[i-1].left;
if(i % 6 ==0)
{
some_vector[i].top = some_vector[i-1].height + some_vector[i-1].top;
some_vector[i].left = 0;
}
}
Both implementations are correct, but I would prefer the 'for' loop. As we have decided to use a vector and not any other container, using indexes would be the best option. Using iterators with vectors would lose the very benefit of having the objects in contiguous memory blocks, which eases access to them.
I felt that none of the answers here explain why I like iterators as a general concept over indexing into containers. Note that most of my experience using iterators doesn't actually come from C++ but from higher-level programming languages like Python.
The iterator interface imposes fewer requirements on consumers of your function, which allows consumers to do more with it.
If all you need is to be able to forward-iterate, the developer isn't limited to using indexable containers - they can use any class implementing operator++(T&), operator*(T) and operator!=(const T&, const T&).
#include <iostream>
template <class InputIterator>
// Take the iterators by value so temporaries such as myVector.begin() bind
void printAll(InputIterator begin, InputIterator end)
{
for (auto current = begin; current != end; ++current) {
std::cout << *current << "\n";
}
}
// elsewhere...
printAll(myVector.begin(), myVector.end());
Your algorithm works for the case you need it - iterating over a vector - but it can also be useful for applications you don't necessarily anticipate:
#include <cstdint>
#include <random>
class RandomIterator
{
private:
std::mt19937 random;
std::uint_fast32_t current;
std::uint_fast32_t floor;
std::uint_fast32_t ceil;
public:
RandomIterator(
std::uint_fast32_t floor = 0,
std::uint_fast32_t ceil = UINT_FAST32_MAX,
std::uint_fast32_t seed = std::mt19937::default_seed
) :
floor(floor),
ceil(ceil)
{
random.seed(seed);
++(*this);
}
RandomIterator& operator++()
{
current = floor + (random() % (ceil - floor));
return *this; // a non-void operator++ must return; falling off the end is undefined behavior
}
std::uint_fast32_t operator*() const
{
return current;
}
bool operator!=(const RandomIterator &that) const
{
return current != that.current;
}
};
int main()
{
// roll a 1d6 until we get a 6 and print the results
RandomIterator firstRandom(1, 7, std::random_device()());
RandomIterator secondRandom(6, 7);
printAll(firstRandom, secondRandom);
return 0;
}
Attempting to implement a square-brackets operator which does something similar to this iterator would be contrived, while the iterator implementation is relatively simple. The square-brackets operator also makes implications about the capabilities of your class - that you can index to any arbitrary point - which may be difficult or inefficient to implement.
Iterators also lend themselves to decoration. People can write iterators which take an iterator in their constructor and extend its functionality:
template<class InputIterator, typename T>
class FilterIterator
{
private:
InputIterator internalIterator;
public:
FilterIterator(const InputIterator &iterator):
internalIterator(iterator)
{
}
virtual bool condition(T) = 0;
FilterIterator<InputIterator, T>& operator++()
{
do {
++(internalIterator);
} while (!condition(*internalIterator));
return *this;
}
T operator*()
{
// Needed for the first result
if (!condition(*internalIterator))
++(*this);
return *internalIterator;
}
virtual bool operator!=(const FilterIterator& that) const
{
return internalIterator != that.internalIterator;
}
};
template <class InputIterator>
class EvenIterator : public FilterIterator<InputIterator, std::uint_fast32_t>
{
public:
EvenIterator(const InputIterator &internalIterator) :
FilterIterator<InputIterator, std::uint_fast32_t>(internalIterator)
{
}
bool condition(std::uint_fast32_t n)
{
return !(n % 2);
}
};
int main()
{
// Rolls a d20 until a 20 is rolled and discards odd rolls
EvenIterator<RandomIterator> firstRandom(RandomIterator(1, 21, std::random_device()()));
EvenIterator<RandomIterator> secondRandom(RandomIterator(20, 21));
printAll(firstRandom, secondRandom);
return 0;
}
While these toys might seem mundane, it's not difficult to imagine using iterators and iterator decorators to do powerful things with a simple interface - decorating a forward-only iterator of database results with an iterator which constructs a model object from a single result, for example. These patterns enable memory-efficient iteration of infinite sets and, with a filter like the one I wrote above, potentially lazy evaluation of results.
Part of the power of C++ templates is your iterator interface, when applied to the likes of fixed-length C arrays, decays to simple and efficient pointer arithmetic, making it a truly zero-cost abstraction.
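A brief sketch of that last point: the same template accepts raw pointers into a C array and vector iterators alike. The function largest is a made-up example and assumes a non-empty range of ints:

```cpp
#include <iterator>
#include <vector>

// Hypothetical helper: returns the largest int in a non-empty range.
// Iterator can be a plain int* or any iterator type with ++, * and !=.
template <class Iterator>
int largest(Iterator first, Iterator last)
{
    int best = *first;  // assumption: the range is not empty
    for (++first; first != last; ++first) {
        if (*first > best) {
            best = *first;
        }
    }
    return best;
}
```

For int raw[] = {3, 9, 4}, both largest(raw, raw + 3) and largest(std::begin(raw), std::end(raw)) instantiate the template with int*, so the loop is ordinary pointer arithmetic; largest(vec.begin(), vec.end()) reuses the same source for a std::vector<int>.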