C++: Vector bounds

C++: Vector bounds - c++

I am coming from Java and learning C++ in the moment. I am using Stroustrup's Progamming Principles and Practice of Using C++. I am working with vectors now. On page 117 he says that accessing a non-existant element of a vector will cause a runtime error (same in Java, index out of bounds). I am using the MinGW compiler and when I compile and run this code:
#include <iostream>
#include <cstdio>
#include <vector>
int main()
{
std::vector<int> v(6);
v[8] = 10;
std::cout << v[8];
return 0;
}
It gives me as output 10. Even more interesting is that if I do not modify the non-existent vector element (I just print it expecting a runtime error or at least a default value) it prints some large integers. So... is Stroustrup wrong, or does GCC have some strange ways of compiling C++?

The book is a bit vague. It's not as much a "runtime error" as it is undefined behaviour which manifests at runtime. This means that anything could happen. But the error is strictly with you, not with the program execution, and it is in fact impossible and non sensible to even talk about the execution of a program with undefined behaviour.
There is nothing in C++ that protects you against programming errors, quite unlike in Java.
As #sftrabbit says, std::vector has an alternative interface, .at(), which always gives a correct program (though it may throw exceptions), and consequently one which one can reason about.
Let me repeat the point with an example, because I believe this is an important fundamental aspect of C++. Suppose we're reading an integer from the user:
int read_int()
{
std::cout << "Please enter a number: ";
int n;
return (std::cin >> n) ? n : 18;
}
Now consider the following three programs:
The dangerous one: The correctness of this program depends on the user input! It is not necessarily incorrect, but it is unsafe (to the point where I would call it broken).
int main()
{
int n = read_int();
int k = read_int();
std::vector<int> v(n);
return v[k];
}
Unconditionally correct: No matter what the user enters, we know how this program behaves.
int main() try
{
int n = read_int();
int k = read_int();
std::vector<int> v(n);
return v.at(k);
}
catch (...)
{
return 0;
}
The sane one: The above version with .at() is awkward. Better to check and provide feedback. Because we perform dynamic checking, the unchecked vector access is actually guaranteed to be fine.
int main()
{
int n = read_int();
if (n <= 0) { std::cout << "Bad container size!\n"; return 0; }
int k = read_int();
if (k < 0 || k >= n) { std::cout << "Bad index!\n"; return 0; }
std::vector<int> v(n);
return v[k];
}
(We're ignoring the possibility that the vector construction might throw an exception of its own.)
The moral is that many operations in C++ are unsafe and only conditionally correct, but it is expected of the programmer that you make the necessary checks ahead of time. The language doesn't do it for you, and so you don't pay for it, but you have to remember to do it. The idea is that you need to handle the error conditions anyway, and so rather than enforcing an expensive, non-specific operation at the library or language level, the responsibility is left to the programmer, who is in a better position to integrate the checking into the code that needs to be written anyway.
If I wanted to be facetious, I would contrast this approach to Python, which allows you to write incredibly short and correct programs, without any user-written error handling at all. The flip side is that any attempt to use such a program that deviates only slightly from what the programmer intended leaves you with a non-specific, hard-to-read exception and stack trace and little guidance on what you should have done better. You're not forced to write any error handling, and often no error handling ends up being written. (I can't quite contrast C++ with Java, because while Java is generally safe, I have yet to see a short Java program.)</rantmode>

This is a valuable comment by #Evgeny Sergeev that I promote to the answer:
For GCC, you can -D_GLIBCXX_DEBUG to replace standard containers with safe implementations. More recently, this now also seems to work with std::array. More info here: gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode.html
I would add, it is also possible to bundle individual "safe" versions of vector and other utility classes by using gnu_debug:: namespace prefix rather than std::.
In other words, do not re-invent the wheel, array checks are available at least with GCC.

C and C++ does not always do bounds checks. It MAY cause a runtime error. And if you were to overdo your number by enough, say 10000 or so, it's almost certain to cause a problem.
You can also use vector.at(10), which definitely should give you an exception.
see:
http://www.cplusplus.com/reference/vector/vector/at/
compared with:
http://www.cplusplus.com/reference/vector/vector/operator%5B%5D/

I hoped that vector's "operator[]" would check boundary as "at()" does, because I'm not so careful. :-)
One way would inherit vector class and override operator[] to call at() so that one can use more readable "[]" and no need to replace all "[]" to "at()". You can also define the inherited vector (ex:safer_vector) as normal vector.
The code will be like this(in C++11, llvm3.5 of Xcode 5).
#include <vector>
using namespace std;
template <class _Tp, class _Allocator = allocator<_Tp> >
class safer_vector:public vector<_Tp, _Allocator>{
private:
typedef __vector_base<_Tp, _Allocator> __base;
public:
typedef _Tp value_type;
typedef _Allocator allocator_type;
typedef typename __base::reference reference;
typedef typename __base::const_reference const_reference;
typedef typename __base::size_type size_type;
public:
reference operator[](size_type __n){
return this->at(__n);
};
safer_vector(_Tp val):vector<_Tp, _Allocator>(val){;};
safer_vector(_Tp val, const_reference __x):vector<_Tp, _Allocator>(val,__x){;};
safer_vector(initializer_list<value_type> __il):vector<_Tp, _Allocator>(__il){;}
template <class _Iterator>
safer_vector(_Iterator __first, _Iterator __last):vector<_Tp,_Allocator>(__first, __last){;};
// If C++11 Constructor inheritence is supported
// using vector<_Tp, _Allocator>::vector;
};
#define safer_vector vector

Related

Extend std::vector with range checking and signed size_type

I would like to use a class with the same functionality as std::vector, but
Replace std::vector<T>::size_type by some signed integer (like int64_t or simply int), instead of usual size_t. It is very annoying to see warnings produced by a compiler in comparisons between signed and unsigned numbers when I use standard vector interface. I can't just disable such warnings, because they really help to catch programming errors.
put assert(0 <= i && i < size()); inside operator[](int i) to check out of range errors. As I understand it will be a better option over the call to .at() because I can disable assertions in release builds, so performance will be the same as in the standard implementation of the vector. It is almost impossible for me to use std::vector without manual checking of range before each operation because operator[] is the source of almost all weird errors related to memory access.
The possible options that come to my mind are to
Inherit from std::vector. It is not a good idea, as said in the following question: Extending the C++ Standard Library by inheritance?.
Use composition (put std::vector inside my class) and repeat all the interface of std::vector. This option forces me to think about the current C++ standard, because the interface of some methods, iterators is slightly different in C++ 98,11,14,17. I would like to be sure, that when c++ 20 became available, I can simply use it without reimplementation of all the interface of my vector.

An answer more to the underlying problem read from the comment:
For example, I don't know how to write in a ranged-based for way:
for (int i = a.size() - 2; i >= 0; i--) { a[i] = 2 * a[i+1]; }
You may change it to a generic one like this:
std::vector<int> vec1{ 1,2,3,4,5,6};
std::vector<int> vec2 = vec1;
int main()
{
// generic code
for ( auto it = vec1.rbegin()+1; it != vec1.rend(); it++ )
{
*it= 2* *(it-1);
}
// your code
for (int i = vec2.size() - 2; i >= 0; i--)
{
vec2[i] = 2 * vec2[i+1];
}
for ( auto& el: vec1) { std::cout << el << std::endl; }
for ( auto& el: vec2) { std::cout << el << std::endl; }
}
Not using range based for as it is not able to access relative to the position.

Regarding point 1: we hardly ever get those warnings here, because we use vectors' size_type where appropriate and/or cast to it if needed (with a 'checked' cast like boost::numeric_cast for safety). Is that not an option for you? Otherwise, write a function to do it for you, i.e. the non-const version would be something like
template<class T>
T& ati(std::vector<T>& v, std::int64_t i)
{
return v.at(checked_cast<decltype(v)::size_type>(i));
}
And yes, inheriting is still a problem. And even if it weren't you'd break the definition of vector (and the Liskov substitution principle I guess), because the size_type is defined as
an unsigned integral type that can represent any non-negative value of difference_type
So it's down to composition, or a bunch of free functions for accessing with a signed size_type and a range check. Personally I'd go for the latter: less work, as easy to use, and you can still pass your vector to functions teaking vectors without problems.

(This is more a comment than a real answer, but has some code, so...)
For the second part (range checking at runtime), a third option would be to use some macro trick:
#ifdef DEBUG
#define VECTOR_AT(v,i) v.at(i)
#else
#define VECTOR_AT(v,i) v[i]
#endif
This can be used this way:
std::vector<sometype> vect(somesize);
VECTOR_AT(vect,i) = somevalue;
Of course, this requires editing your code in a quite non-standard way, which may not be an option. But it does the job.

Compile time validation for iterator usage?

I have the following piece of careless C++ code, which compiles without a hitch under VC10, but fails miserably during runtime. I am wondering if there is a way to validate this kind of error at compile time?
#include "stdafx.h"
#include <set>
void minus(std::set<int>& lhs, const std::set<int>& rhs)
{
for ( auto i = rhs.cbegin(); i != rhs.cend(); ++i )
{
lhs.erase(i); // !!! while I meant "*i" !!!
}
}
int _tmain(int argc, _TCHAR* argv[])
{
int v_lhs[] = {0,1,2,3,4,5};
std::set<int> s_lhs(&v_lhs[0], &v_lhs[sizeof(v_lhs) / sizeof(int)]);
int v_rhs[] = {1,3,5};
std::set<int> s_rhs(&v_rhs[0], &v_rhs[sizeof(v_rhs) / sizeof(int)]);
minus(s_lhs, s_rhs);
return 0;
}
Note that I am fully aware that C++11 (as partially adopted early by VC10) has corrected the behavior that 'erase" actually takes "const_iterator".
Thanks in advance for any valuable inputs.

C++ is not a mind-reading language. All it knows are types. It knows that erase takes an iterator. And it knows that i is an iterator of the same type. Therefore, as far as the compiler's rules of C++ are concerned, it is legal to call erase(i).
There's no way for the compiler to know what you meant to do. Nor is there a way for the compiler to know that the contents of i are not appropriate for this particular use of erase. Your best bet is to just try to avoid mistakes. Range-based for (or the use of std::for_each) would help you here, as both of these hide the iterator.

Why use std::for_each over a for loop? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Advantages of std::for_each over for loop
So I was playing around with some C++11 features and I'm curious as to why std::for_each is beneficial. Wouldn't it be easier and look cleaner to do a for loop or is it because I'm so used to doing it this way?
#include <iostream>
#include <tuple>
#include <vector>
#include <algorithm>
typedef std::tuple<int, int> pow_tuple;
pow_tuple pow(int x)
{
return std::make_tuple(x, x*x);
}
void print_values(pow_tuple values)
{
std::cout << std::get<0>(values) << "\t" << std::get<1>(values) << std::endl;
}
int main(int argc, char** argv)
{
std::vector<int> numbers;
for (int i=1; i < 10; i++)
numbers.push_back(i);
std::for_each(numbers.begin(), numbers.end(),
[](int x) { print_values(pow(x)); }
);
std::cout << "Using auto keyword:" << std::endl;
auto values = pow(20);
print_values(values);
return 0;
}

the standard algorithms handle all of the looping issues correctly, reducing the chance of making one-off errors and such, allowing you to focus on the calculation and not the looping.

It depends somewhat on the local coding conventions, but there are two
potential advantages. The first is that it states clearly that the code
iterates over all of the elements in the sequence; unless the local
coding conventions say otherwise (and they are enforced), you have to
consider that some cowboy programmer might have inserted a break. The
second is that it names the operation you are performing on each
element; this once can easily be handled by calling a function in the
loop, and of course, really trivial operations may not need a name.
There's also the advantage, at least if you aren't yet using C++11, that
you don't have to spell out the iterator types; the spelled out iterator
types create a lot of verbiage, in which the important logic can get
lost or overlooked.

one could say that this form allows you write this piece of code without the unnecessary index to manipulate and make mistakes with.
It is also an idiom which exists in other languages, and since you are getting anonymous functions, this feature can be a good example of higher level functions (educational purpose?).
I agree that it does not feel like c++ ...

A proper way to create a matrix in c++

I want to create an adjacency matrix for a graph. Since I read it is not safe to use arrays of the form matrix[x][y] because they don't check for range, I decided to use the vector template class of the stl. All I need to store in the matrix are boolean values. So my question is, if using std::vector<std::vector<bool>* >* produces too much overhead or if there is a more simple way for a matrix and how I can properly initialize it.
EDIT: Thanks a lot for the quick answers. I just realized, that of course I don't need any pointers. The size of the matrix will be initialized right in the beginning and won't change until the end of the program. It is for a school project, so it would be good if I write "nice" code, although technically performance isn't too important. Using the STL is fine. Using something like boost, is probably not appreciated.

Note that also you can use boost.ublas for matrix creation and manipulation and also boost.graph to represent and manipulate graphs in a number of ways, as well as using algorithms on them, etc.
Edit: Anyway, doing a range-check version of a vector for your purposes is not a hard thing:
template <typename T>
class BoundsMatrix
{
std::vector<T> inner_;
unsigned int dimx_, dimy_;
public:
BoundsMatrix (unsigned int dimx, unsigned int dimy)
: dimx_ (dimx), dimy_ (dimy)
{
inner_.resize (dimx_*dimy_);
}
T& operator()(unsigned int x, unsigned int y)
{
if (x >= dimx_ || y>= dimy_)
throw std::out_of_range("matrix indices out of range"); // ouch
return inner_[dimx_*y + x];
}
};
Note that you would also need to add the const version of the operators, and/or iterators, and the strange use of exceptions, but you get the idea.

Best way:
Make your own matrix class, that way you control every last aspect of it, including range checking.
eg. If you like the "[x][y]" notation, do this:
class my_matrix {
std::vector<std::vector<bool> >m;
public:
my_matrix(unsigned int x, unsigned int y) {
m.resize(x, std::vector<bool>(y,false));
}
class matrix_row {
std::vector<bool>& row;
public:
matrix_row(std::vector<bool>& r) : row(r) {
}
bool& operator[](unsigned int y) {
return row.at(y);
}
};
matrix_row& operator[](unsigned int x) {
return matrix_row(m.at(x));
}
};
// Example usage
my_matrix mm(100,100);
mm[10][10] = true;
nb. If you program like this then C++ is just as safe as all those other "safe" languages.

The standard vector does NOT do range checking by default.
i.e. The operator[] does not do a range check.
The method at() is similar to [] but does do a range check.
It will throw an exception on out of range.
std::vector::at()
std::vector::operator[]()
Other notes:
Why a vector<Pointers> ?
You can quite easily have a vector<Object>. Now there is no need to worry about memory management (i.e. leaks).
std::vector<std::vector<bool> > m;
Note: vector<bool> is overloaded and not very efficient (i.e. this structure was optimized for size not speed) (It is something that is now recognized as probably a mistake by the standards committee).
If you know the size of the matrix at compile time you could use std::bitset?
std::vector<std::bitset<5> > m;
or if it is runtime defined use boost::dynamic_bitset
std::vector<boost::dynamic_bitset> m;
All of the above will allow you to do:
m[6][3] = true;

If you want 'C' array performance, but with added safety and STL-like semantics (iterators, begin() & end() etc), use boost::array.
Basically it's a templated wrapper for 'C'-arrays with some NDEBUG-disable-able range checking asserts (and also some std::range_error exception-throwing accessors).
I use stuff like
boost::array<boost::array<float,4>,4> m;
instead of
float m[4][4];
all the time and it works great (with appropriate typedefs to keep the verbosity down, anyway).
UPDATE: Following some discussion in the comments here of the relative performance of boost::array vs boost::multi_array, I'd point out that this code, compiled with g++ -O3 -DNDEBUG on Debian/Lenny amd64 on a Q9450 with 1333MHz DDR3 RAM takes 3.3s for boost::multi_array vs 0.6s for boost::array.
#include <iostream>
#include <time.h>
#include "boost/array.hpp"
#include "boost/multi_array.hpp"
using namespace boost;
enum {N=1024};
typedef multi_array<char,3> M;
typedef array<array<array<char,N>,N>,N> C;
// Forward declare to avoid being optimised away
static void clear(M& m);
static void clear(C& c);
int main(int,char**)
{
const clock_t t0=clock();
{
M m(extents[N][N][N]);
clear(m);
}
const clock_t t1=clock();
{
std::auto_ptr<C> c(new C);
clear(*c);
}
const clock_t t2=clock();
std::cout
<< "multi_array: " << (t1-t0)/static_cast<float>(CLOCKS_PER_SEC) << "s\n"
<< "array : " << (t2-t1)/static_cast<float>(CLOCKS_PER_SEC) << "s\n";
return 0;
}
void clear(M& m)
{
for (M::index i=0;i<N;i++)
for (M::index j=0;j<N;j++)
for (M::index k=0;k<N;k++)
m[i][j][k]=1;
}
void clear(C& c)
{
for (int i=0;i<N;i++)
for (int j=0;j<N;j++)
for (int k=0;k<N;k++)
c[i][j][k]=1;
}

What I would do is create my own class for dealing with matrices (probably as an array[x*y] because I'm more used to C (and I'd have my own bounds checking), but you could use vectors or any other sub-structure in that class).
Get your stuff functional first then worry about how fast it runs. If you design the class properly, you can pull out your array[x*y] implementation and replace it with vectors or bitmasks or whatever you want without changing the rest of the code.
I'm not totally sure, but I thing that's what classes were meant for, the ability to abstract the implementation well out of sight and provide only the interface :-)

In addition to all the answers that have been posted so far, you might do well to check out the C++ FAQ Lite. Questions 13.10 - 13.12 and 16.16 - 16.19 cover several topics related to rolling your own matrix class. You'll see a couple of different ways to store the data and suggestions on how to best write the subscript operators.
Also, if your graph is sufficiently sparse, you may not need a matrix at all. You could use std::multimap to map each vertex to those it connects.

my favourite way to store a graph is vector<set<int>>; n elements in vector (nodes 0..n-1), >=0 elements in each set (edges). Just do not forget adding a reverse copy of every bi-directional edge.

Consider also how big is your graph/matrix, does performance matter a lot? Is the graph static, or can it grow over time, e.g. by adding new edges?

Probably, not relevant as this is an old question, but you can use the Armadillo library, which provides many linear algebra oriented data types and functions.
Below is an example for your specific problem:
// In C++11
Mat<bool> matrix = {
{ true, true},
{ false, false},
};
// In C++98
Mat<bool> matrix;
matrix << true << true << endr
<< false << false << endr;

Mind you std::vector doesn't do range checking either.

acceptable fix for majority of signed/unsigned warnings?

I myself am convinced that in a project I'm working on signed integers are the best choice in the majority of cases, even though the value contained within can never be negative. (Simpler reverse for loops, less chance for bugs, etc., in particular for integers which can only hold values between 0 and, say, 20, anyway.)
The majority of the places where this goes wrong is a simple iteration of a std::vector, often this used to be an array in the past and has been changed to a std::vector later. So these loops generally look like this:
for (int i = 0; i < someVector.size(); ++i) { /* do stuff */ }
Because this pattern is used so often, the amount of compiler warning spam about this comparison between signed and unsigned type tends to hide more useful warnings. Note that we definitely do not have vectors with more then INT_MAX elements, and note that until now we used two ways to fix compiler warning:
for (unsigned i = 0; i < someVector.size(); ++i) { /*do stuff*/ }
This usually works but might silently break if the loop contains any code like 'if (i-1 >= 0) ...', etc.
for (int i = 0; i < static_cast<int>(someVector.size()); ++i) { /*do stuff*/ }
This change does not have any side effects, but it does make the loop a lot less readable. (And it's more typing.)
So I came up with the following idea:
template <typename T> struct vector : public std::vector<T>
{
typedef std::vector<T> base;
int size() const { return base::size(); }
int max_size() const { return base::max_size(); }
int capacity() const { return base::capacity(); }
vector() : base() {}
vector(int n) : base(n) {}
vector(int n, const T& t) : base(n, t) {}
vector(const base& other) : base(other) {}
};
template <typename Key, typename Data> struct map : public std::map<Key, Data>
{
typedef std::map<Key, Data> base;
typedef typename base::key_compare key_compare;
int size() const { return base::size(); }
int max_size() const { return base::max_size(); }
int erase(const Key& k) { return base::erase(k); }
int count(const Key& k) { return base::count(k); }
map() : base() {}
map(const key_compare& comp) : base(comp) {}
template <class InputIterator> map(InputIterator f, InputIterator l) : base(f, l) {}
template <class InputIterator> map(InputIterator f, InputIterator l, const key_compare& comp) : base(f, l, comp) {}
map(const base& other) : base(other) {}
};
// TODO: similar code for other container types
What you see is basically the STL classes with the methods which return size_type overridden to return just 'int'. The constructors are needed because these aren't inherited.
What would you think of this as a developer, if you'd see a solution like this in an existing codebase?
Would you think 'whaa, they're redefining the STL, what a huge WTF!', or would you think this is a nice simple solution to prevent bugs and increase readability. Or maybe you'd rather see we had spent (half) a day or so on changing all these loops to use std::vector<>::iterator?
(In particular if this solution was combined with banning the use of unsigned types for anything but raw data (e.g. unsigned char) and bit masks.)

Don't derive publicly from STL containers. They have nonvirtual destructors which invokes undefined behaviour if anyone deletes one of your objects through a pointer-to base. If you must derive e.g. from a vector, do it privately and expose the parts you need to expose with using declarations.
Here, I'd just use a size_t as the loop variable. It's simple and readable. The poster who commented that using an int index exposes you as a n00b is correct. However, using an iterator to loop over a vector exposes you as a slightly more experienced n00b - one who doesn't realize that the subscript operator for vector is constant time. (vector<T>::size_type is accurate, but needlessly verbose IMO).

While I don't think "use iterators, otherwise you look n00b" is a good solution to the problem, deriving from std::vector appears much worse than that.
First, developers do expect vector to be std:.vector, and map to be std::map. Second, your solution does not scale for other containers, or for other classes/libraries that interact with containers.
Yes, iterators are ugly, iterator loops are not very well readable, and typedefs only cover up the mess. But at least, they do scale, and they are the canonical solution.
My solution? an stl-for-each macro. That is not without problems (mainly, it is a macro, yuck), but it gets across the meaning. It is not as advanced as e.g. this one, but does the job.

I made this community wiki... Please edit it. I don't agree with the advice against "int" anymore. I now see it as not bad.
Yes, i agree with Richard. You should never use 'int' as the counting variable in a loop like those. The following is how you might want to do various loops using indices (althought there is little reason to, occasionally this can be useful).
Forward
for(std::vector<int>::size_type i = 0; i < someVector.size(); i++) {
/* ... */
}
Backward
You can do this, which is perfectly defined behaivor:
for(std::vector<int>::size_type i = someVector.size() - 1;
i != (std::vector<int>::size_type) -1; i--) {
/* ... */
}
Soon, with c++1x (next C++ version) coming along nicely, you can do it like this:
for(auto i = someVector.size() - 1; i != (decltype(i)) -1; i--) {
/* ... */
}
Decrementing below 0 will cause i to wrap around, because it is unsigned.
But unsigned will make bugs slurp in
That should never be an argument to make it the wrong way (using 'int').
Why not use std::size_t above?
The C++ Standard defines in 23.1 p5 Container Requirements, that T::size_type , for T being some Container, that this type is some implementation defined unsigned integral type. Now, using std::size_t for i above will let bugs slurp in silently. If T::size_type is less or greater than std::size_t, then it will overflow i, or not even get up to (std::size_t)-1 if someVector.size() == 0. Likewise, the condition of the loop would have been broken completely.

Definitely use an iterator. Soon you will be able to use the 'auto' type, for better readability (one of your concerns) like this:
for (auto i = someVector.begin();
i != someVector.end();
++i)

Skip the index
The easiest approach is to sidestep the problem by using iterators, range-based for loops, or algorithms:
for (auto it = begin(v); it != end(v); ++it) { ... }
for (const auto &x : v) { ... }
std::for_each(v.begin(), v.end(), ...);
This is a nice solution if you don't actually need the index value. It also handles reverse loops easily.
Use an appropriate unsigned type
Another approach is to use the container's size type.
for (std::vector<T>::size_type i = 0; i < v.size(); ++i) { ... }
You can also use std::size_t (from <cstddef>). There are those who (correctly) point out that std::size_t may not be the same type as std::vector<T>::size_type (though it usually is). You can, however, be assured that the container's size_type will fit in a std::size_t. So everything is fine, unless you use certain styles for reverse loops. My preferred style for a reverse loop is this:
for (std::size_t i = v.size(); i-- > 0; ) { ... }
With this style, you can safely use std::size_t, even if it's a larger type than std::vector<T>::size_type. The style of reverse loops shown in some of the other answers require casting a -1 to exactly the right type and thus cannot use the easier-to-type std::size_t.
Use a signed type (carefully!)
If you really want to use a signed type (or if your style guide practically demands one), like int, then you can use this tiny function template that checks the underlying assumption in debug builds and makes the conversion explicit so that you don't get the compiler warning message:
#include <cassert>
#include <cstddef>
#include <limits>
template <typename ContainerType>
constexpr int size_as_int(const ContainerType &c) {
const auto size = c.size(); // if no auto, use `typename ContainerType::size_type`
assert(size <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
return static_cast<int>(size);
}
Now you can write:
for (int i = 0; i < size_as_int(v); ++i) { ... }
Or reverse loops in the traditional manner:
for (int i = size_as_int(v) - 1; i >= 0; --i) { ... }
The size_as_int trick is only slightly more typing than the loops with the implicit conversions, you get the underlying assumption checked at runtime, you silence the compiler warning with the explicit cast, you get the same speed as non-debug builds because it will almost certainly be inlined, and the optimized object code shouldn't be any larger because the template doesn't do anything the compiler wasn't already doing implicitly.

You're overthinking the problem.
Using a size_t variable is preferable, but if you don't trust your programmers to use unsigned correctly, go with the cast and just deal with the ugliness. Get an intern to change them all and don't worry about it after that. Turn on warnings as errors and no new ones will creep in. Your loops may be "ugly" now, but you can understand that as the consequences of your religious stance on signed versus unsigned.

vector.size() returns a size_t var, so just change int to size_t and it should be fine.
Richard's answer is more correct, except that it's a lot of work for a simple loop.

I notice that people have very different opinions about this subject. I have also an opinion which does not convince others, so it makes sense to search for support by some guru’s, and I found the CPP core guidelines:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines
maintained by Bjarne Stroustrup and Herb Sutter, and their last update, upon which I base the information below, is of April 10, 2022.
Please take a look at the following code rules:
ES.100: Don’t mix signed and unsigned arithmetic
ES.101: Use unsigned types for bit manipulation
ES.102: Use signed types for arithmetic
ES.107: Don’t use unsigned for subscripts, prefer gsl::index
So, supposing that we want to index in a for loop and for some reason the range based for loop is not the appropriate solution, then using an unsigned type is also not the preferred solution. The suggested solution is using gsl::index.
But in case you don’t have gsl around and you don’t want to introduce it, what then?
In that case I would suggest to have a utility template function as suggested by Adrian McCarthy: size_as_int

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

C++: Vector bounds - c++

Related

Extend std::vector with range checking and signed size_type

Compile time validation for iterator usage?

Why use std::for_each over a for loop? [duplicate]

A proper way to create a matrix in c++

acceptable fix for majority of signed/unsigned warnings?

Categories

Resources