I am getting following warning always for following type of code.
std::vector v;
for ( int i = 0; i < v.size(); i++) {
}
warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
I understand that size() returns size_t, just wanted to know is this safe to ignore this warning or should I make all my loop variable of type size_t
If you might need to hold more than INT_MAX items in your vector, use size_t. In most cases, it doesn't really matter, but I use size_t just to make the warning go away.
Better yet, use iterators:
for( auto it = v.begin(); it != v.end(); ++it )
(If your compiler doesn't support C++11, use std::vector<whatever>::iterator in place of auto)
C++11 also makes choosing the best index type easier (in case you use the index in some computation, not just for subscripting v):
for( decltype(v.size()) i = 0; i < v.size(); ++i )
What is size_t?
size_t corresponds to the integral data type returned by the language operator sizeof and is defined in the header file (among others) as an unsigned integral type.
Is it okay to cast size_t to int?
You could use a cast if you are sure that size is never going to be > than INT_MAX.
If you are trying to write a portable code, it is not safe because,
size_t in 64 bit Unix is 64 bits
size_tin 64 bit Windows is 32 bits
So if you port your code from Unix to WIndows and if above are the enviornments you will lose data.
Suggested Answer
Given the caveat, the suggestion is to make i of unsigned integral type or even better use it as type size_t.
is this safe to ignore this warning or should I make all my loop variable of type size_t
No. You are opening yourself up to a class of integer overflow attacks. If the vector size is greater than MAX_INT (and an attacker has a way of making that happen), your loop will run forever, causing a denial of service possibility.
Technically, std::vector::size returns std::vector::size_type, though.
You should use the right signedness for your loop counter variables. (Really, for most uses, you want unsigned integers rather than signed integers for loops anyway)
The problem is that you're mixing two different data types. On some architectures, size_t is a 32-bit integer, on others it's 64-bit. Your code should properly handle both.
since size() returns a size_t (not int), then that should be the datatype you compare it against.
std::vector v;
for ( size_t i = 0; i < v.size(); i++) {
}
Here's an alternate view from Bjarne Stroustrup:
http://www.stroustrup.com/bs_faq2.html#simple-program
for (int i = 0; i<v.size(); ++i) cout << v[i] << '\n';
Yes, I know that I could declare i to be a vector::size_type
rather than plain int to quiet warnings from some hyper-suspicious
compilers, but in this case,I consider that too pedantic and
distracting.
It's a trade-off. If you're worried that v.size() could go above 2,147,483,647, use size_t. If you're using i inside your loop for more than just looking inside the vector, and you're concerned about subtle signed/unsigned related bugs, use int. In my experience, the latter issue is more prevalent than the former. Your experience may differ.
Also see Why is size_t unsigned?.
Related
I'm just wondering should I use std::size_t for loops and stuff instead of int?
For instance:
#include <cstdint>
int main()
{
for (std::size_t i = 0; i < 10; ++i) {
// std::size_t OK here? Or should I use, say, unsigned int instead?
}
}
In general, what is the best practice regarding when to use std::size_t?
A good rule of thumb is for anything that you need to compare in the loop condition against something that is naturally a std::size_t itself.
std::size_t is the type of any sizeof expression and as is guaranteed to be able to express the maximum size of any object (including any array) in C++. By extension it is also guaranteed to be big enough for any array index so it is a natural type for a loop by index over an array.
If you are just counting up to a number then it may be more natural to use either the type of the variable that holds that number or an int or unsigned int (if large enough) as these should be a natural size for the machine.
size_t is the result type of the sizeof operator.
Use size_t for variables that model size or index in an array. size_t conveys semantics: you immediately know it represents a size in bytes or an index, rather than just another integer.
Also, using size_t to represent a size in bytes helps making the code portable.
The size_t type is meant to specify the size of something so it's natural to use it, for example, getting the length of a string and then processing each character:
for (size_t i = 0, max = strlen (str); i < max; i++)
doSomethingWith (str[i]);
You do have to watch out for boundary conditions of course, since it's an unsigned type. The boundary at the top end is not usually that important since the maximum is usually large (though it is possible to get there). Most people just use an int for that sort of thing because they rarely have structures or arrays that get big enough to exceed the capacity of that int.
But watch out for things like:
for (size_t i = strlen (str) - 1; i >= 0; i--)
which will cause an infinite loop due to the wrapping behaviour of unsigned values (although I've seen compilers warn against this). This can also be alleviated by the (slightly harder to understand but at least immune to wrapping problems):
for (size_t i = strlen (str); i-- > 0; )
By shifting the decrement into a post-check side-effect of the continuation condition, this does the check for continuation on the value before decrement, but still uses the decremented value inside the loop (which is why the loop runs from len .. 1 rather than len-1 .. 0).
By definition, size_t is the result of the sizeof operator. size_t was created to refer to sizes.
The number of times you do something (10, in your example) is not about sizes, so why use size_t? int, or unsigned int, should be ok.
Of course it is also relevant what you do with i inside the loop. If you pass it to a function which takes an unsigned int, for example, pick unsigned int.
In any case, I recommend to avoid implicit type conversions. Make all type conversions explicit.
short answer:
Almost never. Use signed version ptrdiff_t or non-standard ssize_t. Use function std::ssize instead of std::size.
long answer:
Whenever you need to have a vector of char bigger that 2gb on a 32 bit system. In every other use case, using a signed type is much safer than using an unsigned type.
example:
std::vector<A> data;
[...]
// calculate the index that should be used;
size_t i = calc_index(param1, param2);
// doing calculations close to the underflow of an integer is already dangerous
// do some bounds checking
if( i - 1 < 0 ) {
// always false, because 0-1 on unsigned creates an underflow
return LEFT_BORDER;
} else if( i >= data.size() - 1 ) {
// if i already had an underflow, this becomes true
return RIGHT_BORDER;
}
// now you have a bug that is very hard to track, because you never
// get an exception or anything anymore, to detect that you actually
// return the false border case.
return calc_something(data[i-1], data[i], data[i+1]);
The signed equivalent of size_t is ptrdiff_t, not int. But using int is still much better in most cases than size_t. ptrdiff_t is long on 32 and 64 bit systems.
This means that you always have to convert to and from size_t whenever you interact with a std::containers, which not very beautiful. But on a going native conference the authors of c++ mentioned that designing std::vector with an unsigned size_t was a mistake.
If your compiler gives you warnings on implicit conversions from ptrdiff_t to size_t, you can make it explicit with constructor syntax:
calc_something(data[size_t(i-1)], data[size_t(i)], data[size_t(i+1)]);
if just want to iterate a collection, without bounds cheking, use range based for:
for(const auto& d : data) {
[...]
}
here some words from Bjarne Stroustrup (C++ author) at going native
For some people this signed/unsigned design error in the STL is reason enough, to not use the std::vector, but instead an own implementation.
size_t is a very readable way to specify the size dimension of an item - length of a string, amount of bytes a pointer takes, etc.
It's also portable across platforms - you'll find that 64bit and 32bit both behave nicely with system functions and size_t - something that unsigned int might not do (e.g. when should you use unsigned long
Use std::size_t for indexing/counting C-style arrays.
For STL containers, you'll have (for example) vector<int>::size_type, which should be used for indexing and counting vector elements.
In practice, they are usually both unsigned ints, but it isn't guaranteed, especially when using custom allocators.
Soon most computers will be 64-bit architectures with 64-bit OS:es running programs operating on containers of billions of elements. Then you must use size_t instead of int as loop index, otherwise your index will wrap around at the 2^32:th element, on both 32- and 64-bit systems.
Prepare for the future!
size_t is returned by various libraries to indicate that the size of that container is non-zero. You use it when you get once back :0
However, in the your example above looping on a size_t is a potential bug. Consider the following:
for (size_t i = thing.size(); i >= 0; --i) {
// this will never terminate because size_t is a typedef for
// unsigned int which can not be negative by definition
// therefore i will always be >= 0
printf("the never ending story. la la la la");
}
the use of unsigned integers has the potential to create these types of subtle issues. Therefore imho I prefer to use size_t only when I interact with containers/types that require it.
When using size_t be careful with the following expression
size_t i = containner.find("mytoken");
size_t x = 99;
if (i-x>-1 && i+x < containner.size()) {
cout << containner[i-x] << " " << containner[i+x] << endl;
}
You will get false in the if expression regardless of what value you have for x.
It took me several days to realize this (the code is so simple that I did not do unit test), although it only take a few minutes to figure the source of the problem. Not sure it is better to do a cast or use zero.
if ((int)(i-x) > -1 or (i-x) >= 0)
Both ways should work. Here is my test run
size_t i = 5;
cerr << "i-7=" << i-7 << " (int)(i-7)=" << (int)(i-7) << endl;
The output: i-7=18446744073709551614 (int)(i-7)=-2
I would like other's comments.
It is often better not to use size_t in a loop. For example,
vector<int> a = {1,2,3,4};
for (size_t i=0; i<a.size(); i++) {
std::cout << a[i] << std::endl;
}
size_t n = a.size();
for (size_t i=n-1; i>=0; i--) {
std::cout << a[i] << std::endl;
}
The first loop is ok. But for the second loop:
When i=0, the result of i-- will be ULLONG_MAX (assuming size_t = unsigned long long), which is not what you want in a loop.
Moreover, if a is empty then n=0 and n-1=ULLONG_MAX which is not good either.
size_t is an unsigned type that can hold maximum integer value for your architecture, so it is protected from integer overflows due to sign (signed int 0x7FFFFFFF incremented by 1 will give you -1) or short size (unsigned short int 0xFFFF incremented by 1 will give you 0).
It is mainly used in array indexing/loops/address arithmetic and so on. Functions like memset() and alike accept size_t only, because theoretically you may have a block of memory of size 2^32-1 (on 32bit platform).
For such simple loops don't bother and use just int.
I have been struggling myself with understanding what and when to use it. But size_t is just an unsigned integral data type which is defined in various header files such as <stddef.h>, <stdio.h>, <stdlib.h>, <string.h>, <time.h>, <wchar.h> etc.
It is used to represent the size of objects in bytes hence it's used as the return type by the sizeof operator. The maximum permissible size is dependent on the compiler; if the compiler is 32 bit then it is simply a typedef (alias) for unsigned int but if the compiler is 64 bit then it would be a typedef for unsigned long long. The size_t data type is never negative(excluding ssize_t)
Therefore many C library functions like malloc, memcpy and strlen declare their arguments and return type as size_t.
/ Declaration of various standard library functions.
// Here argument of 'n' refers to maximum blocks that can be
// allocated which is guaranteed to be non-negative.
void *malloc(size_t n);
// While copying 'n' bytes from 's2' to 's1'
// n must be non-negative integer.
void *memcpy(void *s1, void const *s2, size_t n);
// the size of any string or `std::vector<char> st;` will always be at least 0.
size_t strlen(char const *s);
size_t or any unsigned type might be seen used as loop variable as loop variables are typically greater than or equal to 0.
size_t is an unsigned integral type, that can represent the largest integer on you system.
Only use it if you need very large arrays,matrices etc.
Some functions return an size_t and your compiler will warn you if you try to do comparisons.
Avoid that by using a the appropriate signed/unsigned datatype or simply typecast for a fast hack.
size_t is unsigned int. so whenever you want unsigned int you can use it.
I use it when i want to specify size of the array , counter ect...
void * operator new (size_t size); is a good use of it.
The code below generates a compiler warning:
private void test()
{
byte buffer[100];
for (int i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
warning: comparison between signed and unsigned integer expressions
[-Wsign-compare]
This is because sizeof() returns a size_t, which is unsigned.
I have seen a number of suggestions for how to deal with this, but none with a preponderance of support and none with any convincing logic nor any references to support one approach as clearly "better." The most common suggestions seem to be:
ignore the warnings
turn off the warnings
use a loop variable of type size_t
use a loop variable of type size_t with tricks to avoid decrementing past zero
cast size_of(buffer) to an int
some extremely convoluted suggestions that I did not have the patience to follow because they involved unreadable code, generally involving vectors and/or iterators
libraries that I cannot load in the AVR / ARM embedded environments I often use.
free functions returning a valid int or long representing the byte count of T
Don't use loops (gotta love that advice)
Is there a "correct" way to approach this?
-- Begin Edit --
The example I gave is, of course, trivial, and meant only to demonstrate the type mismatch warning that can occur in an indexing situation.
#3 is not necessarily the obviously correct answer because size_t carries special risks in a decrementing loop such as
for (size_t i = myArray.size; i > 0; --i)
(the array may someday have a size of zero).
#4 is a suggestion to deal with decrementing size_t indexes by including appropriate and necessary checks to avoid ever decrementing past zero. Since that makes the code harder to read, there are some cute shortcuts that are not particularly readable, hence my referring to them as "tricks."
#7 is a suggestion to use libraries that are not generalizable in the sense that they may not be available or appropriate in every setting.
#8 is a suggestion to keep the checks readable, but to hide them in a non-member method, sometimes referred to as a "free function."
#9 is a suggestion to use algorithms rather than loops. This was offered many times as a solution to the size_t indexing problem, and there were a lot of upvotes. I include it even though I can't use the stl library in most of my environments and would have to write the code myself.
-- End Edit--
I am hoping for evidence-based guidance or references as to best practices for handling something like this. Is there a "standard text" or a style guide somewhere that addresses the question? A defined approach that has been adopted/endorsed internally by a major tech company? An emulatable solution forthcoming in a new language release? If necessary, I would be satisfied with an unsupported public recommendation from a single widely recognized expert.
None of the options on offer seem very appealing. The warnings drown out other things I want to see. I don't want to miss signed/unsigned comparisons in places where it might matter. Decrementing a loop variable of type size_t with comparison >=0 results in an infinite loop from unsigned integer wraparound, and even if we protect against that with something like for (size_t i = sizeof(buffer); i-->0 ;), there are other issues with incrementing/decrementing/comparing to size_t variables. Testing against size_t - 1 will yield a large positive 'oops' number when size_t is unexpectedly zero (e.g. strlen(myEmptyString)). Casting an unsigned size_t to an integer is a container size problem (not guaranteed a value) and of course size_t could potentially be bigger than an int.
Given that my arrays are of known sizes well below Int_Max, it seems to me that casting size_t to a signed integer is the best of the bunch, but it makes me cringe a little bit. Especially if it has to be static_cast<int>. Easier to take if it's hidden in a function call with some size testing, but still...
Or perhaps there's a way to turn off the warnings, but just for loop comparisons?
I find any of the three following approaches equally good.
Use a variable of type int to store the size and compare the loop variable to it.
byte buffer[100];
int size = sizeof(buffer);
for (int i = 0; i < size; ++i)
{
buffer[i] = 0;
}
Use size_t as the type of the loop variable.
byte buffer[100];
for (size_t i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
Use a pointer.
byte buffer[100];
byte* end = buffer + sizeof(buffer)
for (byte* p = buffer; p < end; ++p)
{
*p = 0;
}
If you are able to use a C++11 compiler, you can also use a range for loop.
byte buffer[100];
for (byte& b : buffer)
{
b = 0;
}
The most appropriate solution will depend entirely on context. In the context of the code fragment in your question the most appropriate action is perhaps to have type-agreement - the third option in your bullet list. This is appropriate in this case because the usage of i throughout the code is only to index the array - in this case the use of int is inappropriate - or at least unnecessary.
On the other hand if i were an arithmetic object involved in some arithmetic expression that was itself signed, the int might be appropriate and a cast would be in order.
I would suggest that as a guideline, a solution that involves the fewest number of necessary type casts (explicit of implicit) is appropriate, or to look at it another way, the maximum possible type agreement. There is not one "authoritative" rule because the purpose and usage of the variables involved is semantically rather then syntactically dependent. In this case also as has been pointed out in other answers, newer language features supporting iteration may avoid this specific issue altogether.
To discuss the advice you say you have been given specifically:
ignore the warnings
Never a good idea - some will be genuine semantic errors or maintenance issues, and by teh time you have several hundred warnings you are ignoring, how will you spot the one warning that is and issue?
turn off the warnings
An even worse idea; the compiler is helping you to improve your code quality and reliability. Why would you disable that?
use a loop variable of type size_t
In this precise example, that is exactly why you should do; exact type agreement should always be the aim.
use a loop variable of type size_t with tricks to avoid decrementing past zero
This advice is irrelevant for the trivial example given. Moreover I presume that by "tricks" the adviser in fact means checks or just correct code. There is no need for "tricks" and the term is entirely ambiguous - who knows what the adviser means? It suggests something unconventional and a bit "dirty", when there is not need for any solution with such attributes.
cast size_of(buffer) to an int
This may be necessary if the usage of i warrants the use of int for correct semantics elsewhere in the code. The example in the question does not, so this would not be an appropriate solution in this case. Essentially if making i a size_t here causes type agreement warnings elsewhere that cannot themselves be resolved by universal type agreement for all operands in an expression, then a cast may be appropriate. The aim should be to achieve zero warnings an minimum type casts.
some extremely convoluted suggestions that I did not have the patience to follow, generally involving vectors and/or iterators
If you are not prepared to elaborate or even consider such advice, you'd have better omitted the "advice" from your question. The use of STL containers in any case is not always appropriate to a large segment of embedded targets in any case, excessive code size increase and non-deterministic heap management are reasons to avoid on many platforms and applications.
libraries that I cannot load in an embedded environment.
Not all embedded environments have equal constraints. The restriction is on your embedded environment, not by any means all embedded environments. However the "loading of libraries" to resolve or avoid type agreement issues seems like a sledgehammer to crack a nut.
free functions returning a valid int or long representing the byte count of T
It is not clear what that means. What id a "free function"? Is that just a non-member function? Such a function would internally necessarily have a type case, so what have you achieved other than hiding a type cast?
Don't use loops (gotta love that advice).
I doubt you needed to include that advice in your list. The problem is not in any case limited to loops; it is not because you are using a loop that you have the warning, it is because you have used < with mismatched types.
My favorite solution is to use C++11 or newer and skip the whole manual size bounding entirely like so:
// assuming byte is defined by something like using byte = std::uint8_t;
void test()
{
byte buffer[100];
for (auto&& b: buffer)
{
b = 0;
}
}
Alternatively, if I can't use the ranged-based for loop (but still can use C++11 or newer), my favorite syntax becomes:
void test()
{
byte buffer[100];
for (auto i = decltype(sizeof(buffer)){0}; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
Or for iterating backwards:
void test()
{
byte buffer[100];
// relies on the defined modwrap semantics behavior for unsigned integers
for (auto i = sizeof(buffer) - 1; i < sizeof(buffer); --i)
{
buffer[i] = 0;
}
}
The correct generic way is to use a loop iterator of type size_t. Simply because the is the most correct type to use for describing an array size.
There is not much need for "tricks to avoid decrementing past zero", because the size of an object can never be negative.
If you find yourself needing negative numbers to describe a variable size, it is probably because you have some special case where you are iterating across an array backwards. If so, the "trick" to deal with it is this:
for(size_t i=0; i<sizeof(array); i++)
{
size_t index = sizeof(array)-1 - i;
array[index] = something;
}
However, size_t is often an inconvenient type to use in embedded systems, because it may end up as a larger type than what your MCU can handle with one instruction, resulting in needlessly inefficient code. It may then be better to use a fixed width integer such as uint16_t, if you know the maximum size of the array in advance.
Using plain int in an embedded system is almost certainly incorrect practice. Your variables must be of deterministic size and signedness - most variables in an embedded system are unsigned. Signed variables also lead to major problems whenever you need to use bitwise operators.
If you are able to use C++ 11, you could use decltype to obtain the actual type of what sizeof returns, for instance:
void test()
{
byte buffer[100];
// On macOS decltype(sizeof(buffer)) returns unsigned long, this passes
// the compiler without warnings.
for (decltype(sizeof(buffer)) i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
I'm using a C++ app on my s6 named CppDroid to make a quick program.
How do you use a size_t as limiter for a "for loop" on it's counter?
int c;
//... more codes here...
for (c=0; c < a.used; ++c)
//... more codes here...
The a.used is a number of used array that came from a solution to make a dynamic sized array
The error is: comparison of integer of different signs: 'int' and 'size_t' (aka unsigned int)
The for loop is one of an internal nested loops of the program so I want to maintain variable c as an "int" as much as possible.
I've seen examples about Comparing int with size_t but I'm not sure how it can help since it is for an "if" condition.
Just use std::size_t c instead of int c.
As long as a.used doesn't change during the iteration, a common idiom is:
for(int c=0, n=a.used; c<n; ++c) {
...
}
In this way, the cast happens implicitly, and you also have the "total number of elements" variable n handy in the loop body. Also, when n comes from a methods call (say, vec.size()) you evaluate it just once, which is slightly more efficient.1
1. In theory the compiler may do this optimization by itself, but with "complicated" stuff like std::vector and a nontrivial loop body it's surprisingly difficult to prove that it's a loop invariant, so often it's just recalculated at each iteration.
Regarding
” comparison of integer of different signs: 'int' and 'size_t' (aka unsigned int)
… this is a warning. It's not an error that prevents creation of an executable, unless you've asked the compiler to treat warnings as errors.
A very direct way to address it is to use a cast, int(a.size).
More generally I recommend defining a common function to do that, e.g. named n_items (C++17 will have a size function, unfortunately with unsigned result and conflating two or more logical functions, so that name's taken):
using My_array = ...; // Whatever
using Size = ptrdiff_t;
auto n_items( My_array const& a )
-> Size
{ return a.used; }
then for your loop:
for( int c = 0; c < n_items( a ); ++c )
By the way it's generally Not A Good Idea™ to reuse a variable, like c here. I'm assuming that that reuse was unintentional. The example above shows how to declare the loop variable in the for loop head.
Also, as Matteo Italia notes in his answer, it can sometimes be a good idea to manually optimize a loop like this, if measuring shows it to be a bottleneck. That's because the compiler can't easily prove that the result of the n_items call, or any other dynamic array size expression, is the same (is “invariant”) in all executions of the loop body.
Thus, if measuring tells you that the possibly repeated size expression evaluations are a bottleneck, you can do e.g.
for( int c = 0, n = n_items( a ); c < n; ++c )
It's worth noting that any manual optimization carries costs, which are not easy to measure, but which are severe enough that the usual advice to is to defer optimization until measurements tell you that it's really needed.
Is it better to cast the iterator condition right operand from size_t to int, or iterate potentially past the maximum value of int? Is the answer implementation specific?
int a;
for (size_t i = 0; i < vect.size(); i++)
{
if (some_func((int)i))
{
a = (int)i;
}
}
int a;
for (int i = 0; i < (int)vect.size(); i++)
{
if (some_func(i))
{
a = i;
}
}
I almost always use the first variation, because I find that about 80% of the time, I discover that some_func should probably also take a size_t.
If in fact some_func takes a signed int, you need to be aware of what happens when vect gets bigger than INT_MAX. If the solution isn't obvious in your situation (it usually isn't), you can at least replace some_func((int)i) with some_func(numeric_cast<int>(i)) (see Boost.org for one implementation of numeric_cast). This has the virtue of throwing an exception when vect grows bigger than you've planned on, rather than silently wrapping around to negative values.
I'd just leave it as a size_t, since there's not a good reason not to do so. What do you mean by "or iterate potentially up to the maximum value of type_t"? You're only iterating up to the value of vect.size().
For most compilers, it won't make any difference. On 32 bit systems, it's obvious, but even on 64 bit systems, both variables will probably be stored in a 64-bit register and pushed on the stack as a 64-bit value.
If the compiler stores int values as 32 bit values on the stack, the first function should be more efficient in terms of CPU-cycles.
But the difference is negligible (although the second function "looks" cleaner)
It seems safe to cast the result of my vector's size() function to an unsigned int. How can I tell for sure, though? My documentation isn't clear about how size_type is defined.
Do not assume the type of the container size (or anything else typed inside).
Today?
The best solution for now is to use:
std::vector<T>::size_type
Where T is your type. For example:
std::vector<std::string>::size_type i ;
std::vector<int>::size_type j ;
std::vector<std::vector<double> >::size_type k ;
(Using a typedef could help make this better to read)
The same goes for iterators, and all other types "inside" STL containers.
After C++0x?
When the compiler will be able to find the type of the variable, you'll be able to use the auto keyword. For example:
void doSomething(const std::vector<double> & p_aData)
{
std::vector<double>::size_type i = p_aData.size() ; // Old/Current way
auto j = p_aData.size() ; // New C++0x way, definition
decltype(p_aData.size()) k; // New C++0x way, declaration
}
Edit: Question from JF
What if he needs to pass the size of the container to some existing code that uses, say, an unsigned int? – JF
This is a problem common to the use of the STL: You cannot do it without some work.
The first solution is to design the code to always use the STL type. For example:
typedef std::vector<int>::size_type VIntSize ;
VIntSize getIndexOfSomeItem(const std::vector<int> p_aInt)
{
return /* the found value, or some kind of std::npos */
}
The second is to make the conversion yourself, using either a static_cast, using a function that will assert if the value goes out of bounds of the destination type (sometimes, I see code using "char" because, "you know, the index will never go beyond 256" [I quote from memory]).
I believe this could be a full question in itself.
According to the standard, you cannot be sure. The exact type depends on your machine. You can look at the definition in your compiler's header implementations, though.
I can't imagine that it wouldn't be safe on a 32-bit system, but 64-bit could be a problem (since ints remain 32 bit). To be safe, why not just declare your variable to be vector<MyType>::size_type instead of unsigned int?
It should always be safe to cast it to size_t. unsigned int isn't enough on most 64-bit systems, and even unsigned long isn't enough on Windows (which uses the LLP64 model instead of the LP64 model most Unix-like systems use).
The C++ standard only states that size_t is found in <cstddef>, which puts the identifiers in <stddef.h>. My copy of Harbison & Steele places the minimum and maximum values for size_t in <stdint.h>. That should give you a notion of how big your recipient variable needs to be for your platform.
Your best bet is to stick with integer types that are large enough to hold a pointer on your platform. In C99, that'd be intptr_t and uintptr_t, also officially located in <stdint.h>.
As long as you're sure that an unsigned int on your system will be large enough to hold the number of items you'll have in the vector you should be safe ;-)
I'm not sure how well this will work because I'm just thinking off the top of my head, but a compile-time assertion (such as BOOST_STATIC_ASSERT() or see Ways to ASSERT expressions at build time in C) might help. Something like:
BOOST_STATIC_ASSERT( sizeof( unsigned int) >= sizeof( size_type));