Like most programmers, I admire and try to follow the principles of literate programming, but in C++ I routinely find myself using std::pair for a gazillion common tasks. And yet std::pair is, IMHO, a vile enemy of literate programming...
My point is that when I come back to code I wrote a day or two ago and see manipulations of a std::pair (typically through an iterator), I wonder to myself: "what did iter->first and iter->second mean???".
I'm guessing others have the same doubts when looking at their std::pair code, so I was wondering, has anyone come up with some good solutions to recover literacy when using std::pair?
std::pair is a good way to make a "local" and essentially anonymous type with essentially anonymous columns; if you're using a certain pair over so large a lexical space that you need to name the type and columns, I'd use a plain struct instead.
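For illustration, replacing a pair such as std::pair<double, int> with a plain struct might look like this (the names here are invented):
struct PriceQuantity
{
    double price;
    int    quantity;
};
std::vector<PriceQuantity> orders;
// orders[i].price and orders[i].quantity read far better than .first and .second.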
How about this:
struct MyPair : public std::pair < int, std::string >
{
const int& keyInt() const { return first; }
void keyInt( const int& keyInt ) { first = keyInt; }
const std::string& valueString() const { return second; }
void valueString( const std::string& valueString ) { second = valueString; }
};
It's a bit verbose; however, using this in your code might make things a little easier to read, e.g.:
std::vector < MyPair > listPairs;
std::vector < MyPair >::iterator iterPair( listPairs.begin() );
if ( iterPair->keyInt() == 123 )
iterPair->valueString( "hello" );
Other than this, I can't see any silver bullet that's going to make things much clearer.
typedef std::pair<bool, int> IsPresent_Value;
typedef std::pair<double, int> Price_Quantity;
...you get the point.
You can create two pairs of getters (const and non) that will merely return a reference to first and second, but will be much more readable. For instance:
std::string& GetField(std::pair<std::string, int>& p) { return p.first; }
int& GetValue(std::pair<std::string, int>& p) { return p.second; }
These let you read the field and value members of a given pair without having to remember which member holds what.
If you expect to use this a lot, you could also create a macro that will generate those getters for you, given the names and types: MAKE_PAIR_GETTERS(Field, string, Value, int) or so. Making the getters straightforward will probably allow the compiler to optimize them away, so they'll add no overhead at runtime; and using the macro will make it a snap to create those getters for whatever use you make of pairs.
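One possible sketch of such a macro, generating const and non-const getters (this is an assumption about its shape, not a known library macro; the exact form is up to you):
#define MAKE_PAIR_GETTERS(NAME1, TYPE1, NAME2, TYPE2) \
    inline TYPE1& Get##NAME1(std::pair<TYPE1, TYPE2>& p) { return p.first; } \
    inline const TYPE1& Get##NAME1(const std::pair<TYPE1, TYPE2>& p) { return p.first; } \
    inline TYPE2& Get##NAME2(std::pair<TYPE1, TYPE2>& p) { return p.second; } \
    inline const TYPE2& Get##NAME2(const std::pair<TYPE1, TYPE2>& p) { return p.second; }
MAKE_PAIR_GETTERS(Field, std::string, Value, int)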
You could use boost tuples, but they don't really alter the underlying issue: do you really want to access each part of the pair/tuple with a small integral constant, or do you want more 'literate' code? See this question I posted a while back.
However, boost::optional is a useful tool which I've found replaces quite a few of the cases where pairs/tuples are touted as the answer.
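For example, a minimal sketch of replacing the std::pair<bool, int> "is-present + value" idiom with boost::optional (find_price is a hypothetical function):
#include <boost/optional.hpp>
#include <iostream>
#include <string>

// Returns a value when found, boost::none otherwise - no .first/.second to decode.
boost::optional<int> find_price(const std::string& name)
{
    if (name == "widget") return 42;
    return boost::none;
}

int main()
{
    if (boost::optional<int> price = find_price("widget"))
        std::cout << *price << '\n';
}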
Recently I've found myself using boost::tuple as a replacement for std::pair. You can define enumerators for each member and so it's obvious what each member is:
typedef boost::tuple<int, int> KeyValueTuple;
enum {
KEY
, VALUE
};
void foo (KeyValueTuple & p) {
p.get<KEY> () = 0;
p.get<VALUE> () = 0;
}
void bar (int key, int value)
{
foo (boost::tie (key, value));
}
BTW, comments are welcome on whether there is a hidden cost to using this approach.
EDIT: Remove names from global scope.
Just a quick comment regarding global namespace. In general I would use:
struct KeyValueTraits
{
typedef boost::tuple<int, int> Type;
enum {
KEY
, VALUE
};
};
void foo (KeyValueTraits::Type & p) {
p.get<KeyValueTraits::KEY> () = 0;
p.get<KeyValueTraits::VALUE> () = 0;
}
It does look like boost::fusion ties the identity and value closer together.
As Alex mentioned, std::pair is very convenient, but when it gets confusing, create a structure and use it in the same way; have a look at the std::pair code, it's not that complex.
I don't like std::pair as used in std::map either; map entries should have had members key and value.
I even used Boost.MultiIndex (boost::MIC) to avoid this. However, it also comes with a cost.
Also, returning a std::pair results in less than readable code:
if (cntnr.insert(newEntry).second) { ... }
???
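For comparison, if C++17 is available, structured bindings give the returned pair's members readable names (reusing the cntnr / newEntry names from the snippet above):
auto [position, inserted] = cntnr.insert(newEntry);
if (inserted) { /* the key was not present before */ }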
I have also found that std::pair is commonly used by lazy programmers who needed two values but didn't think about why those values were needed together.
I'm using the ranges library to help filter data in my classes, like this:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(v) {}
std::vector<int> getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
return std::vector<int>(evens.begin(), evens.end());
}
private:
std::vector<int> vec;
};
In this case, a new vector is constructed in the getEvens() function. To save on this overhead, I'm wondering if it is possible / advisable to return the range directly from the function?
class MyClass
{
public:
using RangeReturnType = ???;
MyClass(std::vector<int> v) : vec(v) {}
RangeReturnType getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
// ...
return evens;
}
private:
std::vector<int> vec;
};
If it is possible, are there any lifetime considerations that I need to take into account?
I am also interested to know if it is possible / advisable to pass a range in as an argument, or to store it as a member variable. Or is the ranges library more intended for use within the scope of a single function?
This was asked in the OP's comment section, but I think I will respond to it in the answer section:
The Ranges library seems promising, but I'm a little apprehensive about this returning auto.
Remember that even with the addition of auto, C++ is a strongly typed language. In your case, since you are returning evens, the return type will be the same as the type of evens (technically it will be the decayed type, but evens is already a value type anyway).
In fact, you probably really don't want to type out the return type manually: std::ranges::filter_view<std::ranges::ref_view<const std::vector<int>>, MyClass::getEvens() const::<decltype([](int i) {return ! (i % 2);})>> (141 characters)
As mentioned by @Caleth in the comments, spelling the type out wouldn't work anyway: the type involves a lambda defined inside the function, and two lambdas have distinct types even when their bodies are identical, so there is literally no way to write out the full return type here.
While there may be debates about whether to use auto in other situations, I believe most people would just use auto here. Besides, evens was declared with auto too; typing the type out would only make the code less readable.
So what are my options if I want to access a subset (for instance even numbers)? Are there any other approaches I should be considering, with or without the Ranges library?
Depending on how you would access the returned data and on its type, you might consider returning std::vector<T*>.
Views are really meant to be traversed from start to end. While you could use views::drop and views::take to narrow one down to a single element, a view doesn't provide a subscript operator (yet).
There are also computational differences. A vector has to be computed beforehand, whereas views are computed while iterating. So when you do:
for(auto i : myObject.getEven())
{
std::cout << i;
}
Under the hood, it is basically doing:
for(auto i : myObject.vec)
{
if(!(i % 2)) std::cout << i;
}
Depending on the amount of data and the complexity of the computations, views might be a lot faster than the vector method, or about the same. Plus, you can easily apply multiple filters to the same range without iterating through the data multiple times.
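For instance, a sketch of chaining two views on the same data, mirroring the pseudo-loop above (nothing is computed until the loop actually iterates, and the vector is traversed only once):
auto evensDoubled = myObject.vec
    | ranges::views::filter([](int i) { return !(i % 2); })
    | ranges::views::transform([](int i) { return i * 2; });

for (int i : evensDoubled)
    std::cout << i;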
In the end, you can always store the view in a vector:
std::vector<int> vec2(evens.begin(), evens.end());
So my suggestion is: if you have the ranges library, then you should use it.
If not, then vector<T>, vector<T*>, vector<index> depending on the size and copiability of T.
There are no restrictions on the usage of STL components in the standard. Of course, there are best practices (e.g., string_view instead of string const &).
In this case, I can foresee no problems with handling the view return type directly. That said, the best practices are yet to be decided on since the standard is so new and no compiler has a complete implementation yet.
You're fine to go with the following, in my opinion:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(std::move(v)) {}
auto getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
private:
std::vector<int> vec;
};
As you can see here, a range is just something on which you can call begin and end. Nothing more than that.
For instance, you can use the result of begin(range), which is an iterator, to traverse the range, using the ++ operator to advance it.
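For illustration, a small sketch of traversing the view returned by the getEvens shown above (obj is an assumed MyClass instance):
MyClass obj(std::vector<int>{1, 2, 3, 4, 5, 6});
auto evens = obj.getEvens();
for (auto it = evens.begin(); it != evens.end(); ++it)
    std::cout << *it << ' ';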
In general, looking back at the concept I linked above, you can use a range whenever the context code only requires being able to call begin and end on it.
Whether this is advisable or sufficient depends on what you need to do with it. If your intention is to pass evens to a function which expects a std::vector (for instance, a function you cannot change that calls .push_back on the entity we are talking about), then you clearly have to make a std::vector out of filter's output, which I'd do via
auto evens = vec | ranges::views::filter(whatever) | ranges::to_vector;
but if all the function which you pass evens to does is to loop on it, then
return vec | ranges::views::filter(whatever);
is just fine.
As regards lifetime considerations, a view is to a range of values what a pointer is to the pointed-to entity: if the latter is destroyed, the former is left dangling, and making improper use of it is undefined behavior. This is an erroneous program:
#include <iostream>
#include <range/v3/view/filter.hpp>
#include <string>
using namespace ranges;
using namespace ranges::views;
auto f() {
// a local vector here
std::vector<std::string> vec{"zero","one","two","three","four","five"};
// return a view on the local vector
return vec | filter([](auto){ return true; });
} // vec is gone ---> the view returned is dangling
int main()
{
// the following throws std::bad_alloc for me
for (auto i : f()) {
std::cout << i << std::endl;
}
}
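For contrast, a sketch of a safe counterpart (hypothetical name f_safe): copy the filtered elements into a new vector while the local vector is still alive, the same pattern the original getEvens used, so the returned vector owns its data.
#include <range/v3/view/filter.hpp>
#include <string>
#include <vector>

std::vector<std::string> f_safe() {
    std::vector<std::string> vec{"zero", "one", "two", "three", "four", "five"};
    auto view = vec | ranges::views::filter([](auto const&) { return true; });
    // copy while vec is still alive; nothing dangles after the return
    return std::vector<std::string>(view.begin(), view.end());
}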
You can use ranges::any_view as a type erasure mechanism for any range or combination of ranges.
ranges::any_view<int> getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
I cannot see any equivalent of this in the STL ranges library; please edit the answer if you can.
EDIT: The problem with ranges::any_view is that it is very slow and inefficient. See https://github.com/ericniebler/range-v3/issues/714.
It is desirable to declare a function returning a range in a header and define it in a cpp file
for compilation firewalls (compilation speed)
to stop the language server from going crazy
for better factoring of the code
However, there are complications that make it not advisable:
How do you get the type of a view?
If defining it in a header is fine, use auto
If performance is not an issue, I would recommend ranges::any_view
Otherwise I'd say it is not advisable.
I need to parse and store a somewhat (but not too) complex stream and need to store the parsed result somehow. The stream essentially contains name-value pairs with values possibly being of different type for different names. Basically, I end up with a map of key (always string) to a pair <type, value>.
I started with something like this:
typedef enum ValidType {STRING, INT, FLOAT, BINARY} ValidType;
map<string, pair<ValidType, void*>> Data;
However I really dislike void* and storing pointers. Of course, I can always store the value as binary data (vector<char> for example), in which case the map would end up being
map<string, pair<ValidType, vector<char>>> Data;
Yet, in this case I would have to parse the binary data every time I need the actual value, which would be quite expensive in terms of performance.
Considering that I am not too worried about memory footprint (the amount of data is not large), but I am concerned about performance, what would be the right way to store such data?
Ideally, I'd like to avoid using boost, as that would increase the size of the final app by a factor of 3 if not more and I need to minimise that.
You're looking for a discriminated (or tagged) union.
Boost.Variant is one example, and Boost.Any is another. Are you so sure Boost will increase your final app size by a factor of 3? I would have thought variant was header-only, in which case you don't need to link any libraries.
If you really can't use Boost, implementing a simple discriminated union isn't so hard (a general and fully-correct one is another matter), and at least you know what to search for now.
For completeness, a naive discriminated union might look like:
class DU
{
public:
enum TypeTag { None, Int, Double };
class DUTypeError {};
private:
TypeTag type_;
union {
int i;
double d;
} data_;
void typecheck(TypeTag tt) const { if(type_ != tt) throw DUTypeError(); }
public:
DU() : type_(None) {}
DU(DU const &other) : type_(other.type_), data_(other.data_) {}
DU& operator= (DU const &other) {
type_=other.type_; data_=other.data_; return *this;
}
TypeTag type() const { return type_; }
bool istype(TypeTag tt) const { return type_ == tt; }
#define CONVERSIONS(TYPE, ENUM, MEMBER) \
explicit DU(TYPE val) : type_(ENUM) { data_.MEMBER = val; } \
operator TYPE & () { typecheck(ENUM); return data_.MEMBER; } \
operator TYPE const & () const { typecheck(ENUM); return data_.MEMBER; } \
DU& operator=(TYPE val) { type_ = ENUM; data_.MEMBER = val; return *this; }
CONVERSIONS(int, Int, i)
CONVERSIONS(double, Double, d)
};
Now, there are several drawbacks:
you can't store non-POD types in the union
adding a type means modifying the enum, and the union, and remembering to add a new CONVERSIONS line (it would be even worse without the macro)
you can't use the visitor pattern with this (or, you'd have to write your own dispatcher for it), which means lots of switch statements in the client code
every one of these switches may also need updating if you add a type
if you did write a visitor dispatch, that needs updating if you add a type, and so may every visitor
you need to manually reproduce something like the built-in C++ type-conversion rules if you want to do anything like arithmetic with these (ie, operator double could promote an Int instead of only handling Double ... but only if you hand-roll every operator)
I haven't implemented operator== precisely because it needs a switch. You can't just memcmp the two unions when the types match, because identical 32-bit integers could still compare as unequal if the extra space required for the double holds a different bit pattern.
Some of these issues can be addressed if you care about them, but it's all more work. Hence my preference for not re-inventing this particular wheel if it can be avoided.
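For reference, if C++17 is available, std::variant (the standardized descendant of Boost.Variant) gives you this wheel ready-made; a minimal sketch applied to the question's types:
#include <iostream>
#include <map>
#include <string>
#include <variant>
#include <vector>

// The "type tag" is maintained by std::variant itself.
using Value = std::variant<std::string, int, float, std::vector<char>>;

int main()
{
    std::map<std::string, Value> data;
    data["name"]  = std::string("example");
    data["count"] = 3;

    // std::get_if returns nullptr when the stored alternative doesn't match.
    if (const int* count = std::get_if<int>(&data["count"]))
        std::cout << *count << '\n';
}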
Since your data types are fixed, what about something like this:
Have something like a std::vector for each type of value.
Your map would then hold, as the second member of the pair, the index into the corresponding vector.
std::vector<int> vInt;
std::vector<float> vFloat;
.
.
.
map<std::string, std::pair<ValidType, int>> Data;
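A hypothetical usage sketch of this scheme, reusing ValidType from the question and the vInt / Data declarations above (the helper names are invented):
// Store an int: append it to the int vector, remember type and index in the map.
void storeInt(const std::string& key, int value)
{
    vInt.push_back(value);
    Data[key] = std::make_pair(INT, static_cast<int>(vInt.size()) - 1);
}

// Read it back: the pair's second member is the index into the matching vector.
int getInt(const std::string& key)
{
    return vInt[Data[key].second];
}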
You can implement a multi-type map by leveraging the nifty features of std::tuple in C++11, which allows access by a type key. You can wrap this to create access by arbitrary keys. An in-depth explanation of this (and quite an interesting read) is available here:
https://jguegant.github.io/blogs/tech/thread-safe-multi-type-map.html
The modern C++ features provide great ways to solve old problems.
I recently hit a problem and the only way I can see to avoid it is to use const_cast - but I'm guessing there is a way I'm not thinking of to avoid this without otherwise changing the function of the code. The code snippet below distills my problem into a very simple example.
struct Nu
{
Nu() {v = rand();}
int v;
};
struct G
{
~G()
{
for(auto it = _m.begin(); it != _m.end(); it++) delete it->first;
}
void AddNewNu()
{
_m[new Nu] = 0.5f;
}
void ModifyAllNu()
{
for(auto it = _m.begin(); it != _m.end(); it++) it->first->v++;
}
float F(const Nu *n) const
{
auto it = _m.find(n);
// maybe do other stuff with it
return it->second;
}
map<Nu*, float> _m;
};
Here, suppose Nu is actually a very large struct whose layout is already fixed by the need to match an external library (and thus the float can't simply be folded into Nu, and for various other reasons it can't be map<Nu, float>). The G struct has a map that it uses to hold all the Nu's it creates (and ultimately to delete them all on destruction). As written, the function F will not compile: the const Nu * cannot be converted to the Nu * key type expected by std::map.
However, the map can't be switched to map<const Nu*, float> because some non-const functions still need to modify the Nu's inside _m. Of course, I could alternatively store all these Nu's in an additional std::vector and then switch the map's key type to const - but this introduces a vector that should be entirely unnecessary. So the only alternative I've thought of at the moment is to use const_cast inside the F function (which should be a safe const_cast), and I'm wondering if this is avoidable.
After a bit more hunting, I found that this exact problem has already been addressed here: Calling map::find with a const argument
This is because the map expects Nu* const, but you have given it a const Nu*. I also find it highly illogical and don't understand why, but this is how it is.
"find" in your case will return a const_iterator. putting:
map<Nu*,float>::const_iterator it = _m.find(n);
...
return it->second;
should work I think.
Since you are in a const method, you can of course only read your map, not write/modify it.
I myself am convinced that in a project I'm working on signed integers are the best choice in the majority of cases, even though the value contained within can never be negative. (Simpler reverse for loops, less chance for bugs, etc., in particular for integers which can only hold values between 0 and, say, 20, anyway.)
The majority of the places where this goes wrong is in a simple iteration over a std::vector; often this used to be an array in the past and was changed to a std::vector later. So these loops generally look like this:
for (int i = 0; i < someVector.size(); ++i) { /* do stuff */ }
Because this pattern is used so often, the amount of compiler warning spam about this comparison between signed and unsigned types tends to hide more useful warnings. Note that we definitely do not have vectors with more than INT_MAX elements, and note that until now we used two ways to fix the compiler warning:
for (unsigned i = 0; i < someVector.size(); ++i) { /*do stuff*/ }
This usually works but might silently break if the loop contains any code like 'if (i-1 >= 0) ...', etc.
for (int i = 0; i < static_cast<int>(someVector.size()); ++i) { /*do stuff*/ }
This change does not have any side effects, but it does make the loop a lot less readable. (And it's more typing.)
So I came up with the following idea:
template <typename T> struct vector : public std::vector<T>
{
typedef std::vector<T> base;
int size() const { return base::size(); }
int max_size() const { return base::max_size(); }
int capacity() const { return base::capacity(); }
vector() : base() {}
vector(int n) : base(n) {}
vector(int n, const T& t) : base(n, t) {}
vector(const base& other) : base(other) {}
};
template <typename Key, typename Data> struct map : public std::map<Key, Data>
{
typedef std::map<Key, Data> base;
typedef typename base::key_compare key_compare;
int size() const { return base::size(); }
int max_size() const { return base::max_size(); }
int erase(const Key& k) { return base::erase(k); }
int count(const Key& k) { return base::count(k); }
map() : base() {}
map(const key_compare& comp) : base(comp) {}
template <class InputIterator> map(InputIterator f, InputIterator l) : base(f, l) {}
template <class InputIterator> map(InputIterator f, InputIterator l, const key_compare& comp) : base(f, l, comp) {}
map(const base& other) : base(other) {}
};
// TODO: similar code for other container types
What you see is basically the STL classes with the methods that return size_type redefined to return plain 'int'. The constructors are needed because they aren't inherited.
What would you think of this as a developer, if you'd see a solution like this in an existing codebase?
Would you think 'whaa, they're redefining the STL, what a huge WTF!', or would you think this is a nice simple solution to prevent bugs and increase readability. Or maybe you'd rather see we had spent (half) a day or so on changing all these loops to use std::vector<>::iterator?
(In particular if this solution was combined with banning the use of unsigned types for anything but raw data (e.g. unsigned char) and bit masks.)
Don't derive publicly from STL containers. They have nonvirtual destructors, which invokes undefined behaviour if anyone deletes one of your objects through a pointer to base. If you must derive, e.g. from a vector, do it privately and expose the parts you need to expose with using declarations.
Here, I'd just use a size_t as the loop variable. It's simple and readable. The poster who commented that using an int index exposes you as a n00b is correct. However, using an iterator to loop over a vector exposes you as a slightly more experienced n00b - one who doesn't realize that the subscript operator for vector is constant time. (vector<T>::size_type is accurate, but needlessly verbose IMO).
While I don't think "use iterators, otherwise you look n00b" is a good solution to the problem, deriving from std::vector appears much worse than that.
First, developers do expect vector to be std::vector, and map to be std::map. Second, your solution does not scale to other containers, or to other classes/libraries that interact with containers.
Yes, iterators are ugly, iterator loops are not very well readable, and typedefs only cover up the mess. But at least, they do scale, and they are the canonical solution.
My solution? An STL for-each macro. That is not without problems (mainly, it is a macro, yuck), but it gets the meaning across. It is not as advanced as e.g. this one, but it does the job.
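One possible shape of such a macro, as a sketch only (not the poster's exact version; this variant assumes C++11 auto for brevity):
#define STL_FOR_EACH(it, container) \
    for (auto it = (container).begin(), it##End = (container).end(); \
         it != it##End; ++it)

// usage: STL_FOR_EACH(iter, someVector) { /* use *iter */ }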
I made this community wiki... Please edit it. I don't agree with the advice against "int" anymore. I now see it as not bad.
Yes, I agree with Richard. You should never use 'int' as the counting variable in a loop like those. The following is how you might want to do various loops using indices (although there is little reason to; occasionally this can be useful).
Forward
for(std::vector<int>::size_type i = 0; i < someVector.size(); i++) {
/* ... */
}
Backward
You can do this, which is perfectly defined behavior:
for(std::vector<int>::size_type i = someVector.size() - 1;
i != (std::vector<int>::size_type) -1; i--) {
/* ... */
}
Soon, with C++1x (the next C++ version) coming along nicely, you will be able to do it like this:
for(auto i = someVector.size() - 1; i != (decltype(i)) -1; i--) {
/* ... */
}
Decrementing below 0 will cause i to wrap around, because it is unsigned.
But unsigned will make bugs creep in
That should never be an argument to make it the wrong way (using 'int').
Why not use std::size_t above?
The C++ Standard, in 23.1 p5 (Container Requirements), defines T::size_type, for T being some Container, as some implementation-defined unsigned integral type. Now, using std::size_t for i above would let bugs creep in silently. If T::size_type is smaller or larger than std::size_t, then it will overflow i, or never even get up to (std::size_t)-1 if someVector.size() == 0. Likewise, the condition of the loop would be broken completely.
Definitely use an iterator. Soon you will be able to use the 'auto' type for better readability (one of your concerns), like this:
for (auto i = someVector.begin();
i != someVector.end();
++i)
Skip the index
The easiest approach is to sidestep the problem by using iterators, range-based for loops, or algorithms:
for (auto it = begin(v); it != end(v); ++it) { ... }
for (const auto &x : v) { ... }
std::for_each(v.begin(), v.end(), ...);
This is a nice solution if you don't actually need the index value. It also handles reverse loops easily.
Use an appropriate unsigned type
Another approach is to use the container's size type.
for (std::vector<T>::size_type i = 0; i < v.size(); ++i) { ... }
You can also use std::size_t (from <cstddef>). There are those who (correctly) point out that std::size_t may not be the same type as std::vector<T>::size_type (though it usually is). You can, however, be assured that the container's size_type will fit in a std::size_t. So everything is fine, unless you use certain styles for reverse loops. My preferred style for a reverse loop is this:
for (std::size_t i = v.size(); i-- > 0; ) { ... }
With this style, you can safely use std::size_t, even if it's a larger type than std::vector<T>::size_type. The style of reverse loops shown in some of the other answers require casting a -1 to exactly the right type and thus cannot use the easier-to-type std::size_t.
Use a signed type (carefully!)
If you really want to use a signed type (or if your style guide practically demands one), like int, then you can use this tiny function template that checks the underlying assumption in debug builds and makes the conversion explicit so that you don't get the compiler warning message:
#include <cassert>
#include <cstddef>
#include <limits>
template <typename ContainerType>
constexpr int size_as_int(const ContainerType &c) {
const auto size = c.size(); // if no auto, use `typename ContainerType::size_type`
assert(size <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
return static_cast<int>(size);
}
Now you can write:
for (int i = 0; i < size_as_int(v); ++i) { ... }
Or reverse loops in the traditional manner:
for (int i = size_as_int(v) - 1; i >= 0; --i) { ... }
The size_as_int trick is only slightly more typing than the loops with the implicit conversions, you get the underlying assumption checked at runtime, you silence the compiler warning with the explicit cast, you get the same speed as non-debug builds because it will almost certainly be inlined, and the optimized object code shouldn't be any larger because the template doesn't do anything the compiler wasn't already doing implicitly.
You're overthinking the problem.
Using a size_t variable is preferable, but if you don't trust your programmers to use unsigned correctly, go with the cast and just deal with the ugliness. Get an intern to change them all and don't worry about it after that. Turn on warnings as errors and no new ones will creep in. Your loops may be "ugly" now, but you can understand that as the consequences of your religious stance on signed versus unsigned.
vector.size() returns a size_t var, so just change int to size_t and it should be fine.
Richard's answer is more correct, except that it's a lot of work for a simple loop.
I notice that people have very different opinions about this subject. I also have an opinion which does not convince others, so it makes sense to look for support from some gurus, and I found the C++ Core Guidelines:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines
maintained by Bjarne Stroustrup and Herb Sutter; their last update, upon which I base the information below, is from April 10, 2022.
Please take a look at the following code rules:
ES.100: Don’t mix signed and unsigned arithmetic
ES.101: Use unsigned types for bit manipulation
ES.102: Use signed types for arithmetic
ES.107: Don’t use unsigned for subscripts, prefer gsl::index
So, supposing that we want to index in a for loop, and for some reason a range-based for loop is not the appropriate solution, then using an unsigned type is not the preferred solution either. The suggested solution is to use gsl::index.
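A minimal sketch of what that looks like, assuming the Microsoft GSL implementation is available (gsl::index is a signed type, std::ptrdiff_t, intended for subscripts):
#include <gsl/gsl>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> someVector{1, 2, 3, 4};

    // gsl::at takes a signed index, so no signed/unsigned mixing in the loop body.
    for (gsl::index i = 0; i < gsl::narrow_cast<gsl::index>(someVector.size()); ++i)
        std::cout << gsl::at(someVector, i) << '\n';
}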
But in case you don’t have gsl around and you don’t want to introduce it, what then?
In that case, I would suggest having a utility template function like the one suggested by Adrian McCarthy: size_as_int.