Sometimes I get warnings about a conversion from a larger type to a smaller type, e.g.:
void f( unsigned short i ) // f - accept any numeric type
// smaller than std::vector<>::size_type
{}
std::vector < some_type > v;
..
f ( v.size() );
I usually used one of the following solutions:
assert( v.size() <= std::numeric_limits< unsigned short >::max() );
f( static_cast< unsigned short >( v.size() ) );
or
f( boost::numeric_cast< unsigned short >( v.size() ) );
But at my current job Boost is not used, and as of last month asserts are disallowed.
What other safe ways do you know of to suppress this warning?
Are there any pitfalls in the approaches described above?
PS:
It is not always possible to change the signature of f, and sometimes it really should accept a small numeric type.
EDITED:
I want to make conversion as safe as possible.
Why cast in the first place? The vector's size is typically an unsigned integer. If possible, I'd say update the function signature. Warnings are not meant to be suppressed, rather addressed.
The only safe way to deal with this is to ensure that the conversion does not lose data at runtime. The assert code only works in debug builds and still allows a lossy conversion in retail builds. The conversion loss is bad because it passes around a completely incorrect size for the vector.
What you really need is a mechanism that prevents data loss. I recommend using a class like SafeInt, which prevents a conversion that overflows or underflows by throwing an exception.
SafeInt<size_t> size = v.size();
f((unsigned short)size); // Throws if size can't fit in an unsigned short
SafeInt: http://www.codeplex.com/SafeInt
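If neither Boost nor SafeInt is available, the same idea can be sketched with the standard library alone. This is only a minimal sketch: the helper name to_ushort_checked is made up for illustration, and throwing std::out_of_range is one possible error policy among several.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not part of SafeInt or Boost): converts a size to
// unsigned short and throws instead of silently truncating.
inline unsigned short to_ushort_checked(std::size_t value)
{
    if (value > std::numeric_limits<unsigned short>::max()) {
        throw std::out_of_range("value does not fit in unsigned short");
    }
    // The range check above guarantees this cast keeps the value.
    return static_cast<unsigned short>(value);
}
```

With that in place the call from the question would read f( to_ushort_checked( v.size() ) ), and unlike the assert version the check stays active in release builds.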
I will now repeat my mantra again: If your code contains casts, there is probably something wrong with the code or the design and you should examine both with a view to removing the cast.
BTW, you upvoted this the last time I posted it!
As size() usually returns an unsigned integer, it should be quite safe to typecast it to a signed one.
f(static_cast<expected-type>(v.size()));
Otherwise change the function signature, if it is possible.
Related
How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Different compilers will warn for something like:
int i = get_int();
size_t s = i;
for loss of signedness or
size_t s = get_size();
int i = s;
for narrowing.
Casting can remove the warnings but doesn't solve the safety issue.
Is there a proper way of doing this?
You can try boost::numeric_cast<>.
boost numeric_cast returns the result of converting a value of type Source to a value of type Target. If out-of-range is detected, an exception is thrown (see bad_numeric_cast, negative_overflow and positive_overflow ).
How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Control when conversion is needed. Where possible, only convert when the value does not change. Sometimes one must step back and look at the code at a higher level: was a lossy conversion really needed, or can the code be reworked to avoid conversion loss?
It is not hard to add an if(). The test just needs to be carefully formed.
Example where size_t n and int len need to be compared. Note that the largest int may exceed the largest size_t, or vice versa, or they may be equal. In this case the conversion of int to unsigned only happens for non-negative values, so the value does not change.
int len = snprintf(buf, n, ...);
if (len < 0 || (unsigned)len >= n) {
// Handle_error();
}
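A fuller version of that check might look like the sketch below. The function name format_id and the "id=%d" format string are placeholders chosen for illustration, and the cast goes to std::size_t rather than unsigned so that both sides of the comparison have the same type.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats "id=<value>" into buf and reports whether the
// whole string, including the terminating null, fit into n bytes.
inline bool format_id(char* buf, std::size_t n, int value)
{
    int len = std::snprintf(buf, n, "id=%d", value);
    // len < 0 signals an encoding error. Once len is known to be
    // non-negative, converting it to an unsigned type keeps its value.
    if (len < 0 || static_cast<std::size_t>(len) >= n) {
        return false; // error, or the output was truncated
    }
    return true;
}
```

The order of the two tests matters: the sign is tested before the cast, so a negative len never reaches the unsigned comparison.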
An unsigned to int example, for when it is known that the unsigned value at this point in the code is less than or equal to INT_MAX:
unsigned n = ...
int i = n & INT_MAX;
Good analysis tools see that n & INT_MAX always converts into int without loss.
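To make the behaviour of the mask concrete, here is a small sketch; the function name to_int_masked is invented for this example. Keep in mind that the mask only preserves the value when n really is at most INT_MAX. For larger inputs it silently drops the high bit instead of reporting a problem.

```cpp
#include <climits>

// Hypothetical helper illustrating the masking idiom.
inline int to_int_masked(unsigned n)
{
    // n & INT_MAX has type unsigned, but its value is never above INT_MAX,
    // so the conversion to int cannot change it.
    return static_cast<int>(n & INT_MAX);
}
```

So to_int_masked(42u) gives 42, while to_int_masked(UINT_MAX) gives INT_MAX rather than an error, which is why the precondition on n has to be established elsewhere.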
There is no built-in safe narrowing conversion between integer types in C++ or its standard library. You could implement one yourself, using the Microsoft GSL (gsl::narrow) as an example.
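A home-grown version could look roughly like this. It is a sketch in the spirit of the GSL's narrowing helper, not the GSL code itself; the name checked_narrow and the choice of std::range_error are assumptions made for the example. The idea is to cast, cast back, and reject the conversion if the round trip or the sign does not match.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper, only sketched for integer types.
template <typename To, typename From>
To checked_narrow(From value)
{
    static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                  "checked_narrow is only sketched for integer types");
    const To result = static_cast<To>(value);
    // The round trip must give back the original value...
    if (static_cast<From>(result) != value) {
        throw std::range_error("narrowing changed the value");
    }
    // ...and the sign must agree when the signedness differs.
    if (std::is_signed<To>::value != std::is_signed<From>::value &&
        ((result < To{}) != (value < From{}))) {
        throw std::range_error("narrowing changed the sign");
    }
    return result;
}
```

For example, checked_narrow<unsigned short>(std::size_t{1000}) returns 1000, while checked_narrow<unsigned short>(std::size_t{70000}) and checked_narrow<unsigned>(-1) both throw.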
Theoretically, if you want perfect safety, you shouldn't be mixing types like this at all. (And you definitely shouldn't be using explicit casts to silence warnings, as you know.) If you've got values of type size_t, it's best to always carry them around in variables of type size_t.
There is one case where I do sometimes decide I can accept less than 100.000% perfect type safety, and that is when I assign sizeof's return value, which is a size_t, to an int. For any machine I am ever going to use, the only time this conversion might lose information is when sizeof returns a value greater than 2147483647. But I am content to assume that no single object in any of my programs will ever be that big. (In particular, I will unhesitatingly write things like printf("sizeof(int) = %d\n", (int)sizeof(int)), explicit cast and all. There is no possible way that the size of a type like int will not fit in an int!)
[Footnote: Yes, it's true, on a 16-bit machine the assumption is the rather less satisfying threshold that sizeof won't return a value greater than 32767. It's more likely that a single object might have a size like that, but probably not in a program that's running on a 16-bitter.]
I don't know of any way to get the size of a vector other than .size(). It works very well, but it returns a value of type long long unsigned int. In most cases that is fine, but I'm sure my program will never have a vector so big that it needs a return type that large; short int is more than enough.
I know that for today's computers those few unused bytes are irrelevant, but I don't like to leave these "loose ends" even if they are small, and while programming I came across some details that bothered me.
Look at these examples:
for(short int X = 0 ; X < Vector.size() ; X++){
}
Compiling this, I get this warning:
warning: comparison of integer expressions of different signedness: 'short int' and 'std::vector<unsigned char>::size_type' {aka 'long long unsigned int'} [-Wsign-compare]
This is because the return type of .size() is different from the short int I'm comparing it with: "X" is a short int, and Vector.size() returns a long long unsigned int. That was expected, so if I do this:
for(size_t X = 0 ; X < Vector.size() ; X++){
}
The problem is gone. But by doing this I'm creating a long long unsigned int as the size_t variable, and .size() returns another long long unsigned int, so my computer allocates two long long unsigned int variables. So what do I do to get a simple short int back? I don't need anything more than that; long long unsigned int is overkill. So I did this:
for(short int X = 0 ; X < short(Vector.size()) ; X++){
}
But how does this work? short int X = 0 allocates a short int, nothing new. But what about short(Vector.size())? Is the computer allocating a long long unsigned int and converting it to a short int? Or is the compiler "changing" the return of .size() so that it naturally returns a short int and, in this case, no long long unsigned int is allocated? I know compilers are responsible for optimizing the code too. Is there any "problem" or "detail" to be aware of when using this method, since I rarely see anyone using it? What exactly does this short() do in terms of memory allocation? Where can I read more about it?
(thanks to everyone who responded)
Forget for a moment that this involves a for loop; that's important for the underlying code, but it's a distraction from what's going on with the conversion.
short X = Vector.size();
That line calls Vector.size(), which returns a value of type std::size_t. std::size_t is an unsigned type, large enough to hold the size of any object. So it could be unsigned long, or it could be unsigned long long. In any event, it's definitely not short. So the compiler has to convert that value to short, and that's what it does.
Most compilers these days don't trust you to understand what this actually does, so they warn you. (Yes, I'm rather opinionated about compilers that nag; that doesn't change the analysis here). So if you want to see that warning (i.e., you don't turn it off), you'll see it. If you want to write code that doesn't generate that warning, then you have to change the code to say "yes, I know, and I really mean it". You do that with a cast:
short X = short(Vector.size());
The cast tells the compiler to call Vector.size() and convert the resulting value to short. The code then assigns the result of that conversion to X. So, more briefly, in this case it tells the compiler that you want it to do exactly what it would have done without the cast. The difference is that because you wrote a cast, the compiler won't warn you that you might not know what you're doing.
Some folks prefer to write that cast with a static_cast:
short X = static_cast<short>(Vector.size());
That does the same thing: it tells the compiler to do the conversion to short and, again, the compiler won't warn you that you did it.
In the original for loop, a different conversion occurs:
X < Vector.size()
That bit of code calls Vector.size(), which still returns an unsigned type. In order to compare that value with X, the two sides of the < have to have the same type, and the rules for this kind of expression require that X gets promoted to std::size_t, i.e., that the value of X gets treated as an unsigned type. That's okay as long as the value isn't negative. If it's negative, the conversion to the unsigned type is okay, but it will produce results that probably aren't what was intended. Since we know that X is not negative here, the code works perfectly well.
But we're still in the territory of compiler nags: since X is signed, the compiler warns you that promoting it to an unsigned type might do something that you don't expect. Again, you know that that won't happen, but the compiler doesn't trust you. So you have to insist that you know what you're doing, and again, you do that with a cast:
X < short(Vector.size())
Just like before, that cast converts the result of calling Vector.size() to short. Now both sides of the < are the same type, so the < operation doesn't require a conversion from a signed to an unsigned type, so the compiler has nothing to complain about. There is still a conversion, because the rules say that values of type short get promoted to int in this expression, but don't worry about that for now.
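One thing the cast does not do is check that the size actually fits. The sketch below counts how often the loop body runs; the function name is invented for illustration, and the numbers in the note assume a 16-bit short with the usual wraparound conversion (guaranteed since C++20, implementation-defined but near-universal before that).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts loop iterations when the size is cast to
// short, as in the loop discussed above.
inline std::size_t iterations_with_short_cast(const std::vector<unsigned char>& v)
{
    std::size_t count = 0;
    for (short x = 0; x < short(v.size()); ++x) {
        ++count;
    }
    return count;
}
```

Under those assumptions a vector of 10 elements gives 10 iterations, but a vector of 40000 elements gives none, because short(40000) wraps to a negative value. The cast silences the warning without making the problem go away.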
Another possibility is to use an unsigned type for that loop index:
for (unsigned short X = 0; X < Vector.size(); ++X)
But the compiler might still insist on warning you that not all values of type std::size_t can fit in an unsigned short. So, again, you might need a cast. Or change the type of the index to match what the compiler thinks you need:
for (std::size_t X = 0; X < Vector.size(); ++X)
If I were to go this route, I would use unsigned int and if the compiler insisted on telling me that I don't know what I'm doing I'd yell at the compiler (which usually isn't helpful) and then I'd turn off that warning. There's really no point in using short here, because the loop index will always be converted to int (or unsigned int) wherever it's used. It will probably be in a register, so there is no space actually saved by storing it as a short.
Even better, as recommended in other answers, is to use a range-based for loop, which avoids managing that index:
for (auto& value: Vector) ...
In all cases, X has a storage duration of automatic, and the result of Vector.size() does not outlive the full expression where it is created.
I don't need anything more than this, long long unsigned int is overkill
Typically, automatic duration variables are "allocated" either on the stack, or as registers. In either case, there is no performance benefit to decreasing the allocation size, and there can be a performance penalty in narrowing and then widening values.
In the very common case where you are using X solely to index into Vector, you should strongly consider using a different kind of for:
for (auto & value : Vector) {
// replace Vector[X] with value in your loop body
}
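As a concrete, if trivial, illustration, here is a loop that needs every element but never the index. The function name sum_bytes is made up for the example.

```cpp
#include <vector>

// Hypothetical helper: adds up all bytes without any index variable, so
// there is no index type to choose and nothing to compare against size().
inline unsigned long sum_bytes(const std::vector<unsigned char>& bytes)
{
    unsigned long total = 0;
    for (const auto& value : bytes) {
        total += value;
    }
    return total;
}
```

Since no index is compared with Vector.size(), the sign-compare warning from the question has nothing to attach to.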
My program does the common task of writing binary data to a file, conforming to a certain non-text file format. Since the data I'm writing is not already in existing chunks but instead is put together byte by byte at runtime, I use std::ostream::put() instead of write(). I assume this is normal procedure.
The program works just fine. It uses both std::stringstream::put() and std::ofstream::put() with two-digit hex integers as the arguments. But I get compiler warning C4309: "truncation of constant value" (in VC++ 2010) whenever the argument to put() is greater than 0x7f. Obviously the compiler is expecting a signed char, and the constant is out of range. But I don't think any truncation is actually happening; the byte gets written just like it's supposed to.
Compiler warnings make me think I'm not doing things in the normal, accepted way. The situation I described has to be a common one. Is there a common way to avoid such a compiler warning? Or is this an example of a pointless compiler warning that should just be ignored?
I thought of two inelegant ways to avoid it. I could use syntax like mystream.put( char(0xa4) ) on every call. Or instead of using std::stringstream I could use std::basic_stringstream< unsigned char >, but I don't think that trick would work with std::ofstream, which is not a templated type. I feel like there should be a better solution here, especially since ofstream is meant for writing binary files.
Your thoughts?
--EDIT--
Ah, I was mistaken about std::ofstream not being a templated type. It is actually std::basic_ofstream<char>, but I tried that method and realized it won't work anyway for lack of defined methods and polymorphic incompatibility with std::ostream.
Here's a code sample:
stringstream ss;
int a, b;
/* Do stuff */
ss.put( 0 );
ss.put( 0x90 | a ); // oddly, no warning here...
ss.put( b ); // ...or here
ss.put( 0xa4 ); // C4309
I found a solution that I'm happy with. It's more elegant than explicitly casting every constant to unsigned char. This is what I had:
ss.put( 0xa4 ); // C4309
I thought that the "truncation" was happening in implicitly casting unsigned char to char, but Cong Xu pointed out that integer constants are assumed to be signed, and any one greater than 0x7f gets promoted from char to int. Then it has to actually be truncated (cut down to one byte) if passed to put(). By using the suffix "u", I can specify an unsigned integer constant, and if it's no greater than 0xff, it will be an unsigned char. This is what I have now, without compiler warnings:
ss.put( 0xa4u );
std::stringstream ss;
ss.put(0x7f);
ss.put(0x80); //C4309
As you've guessed, the problem is that ostream.put() expects a char, but 0x7F is the maximum value for char, and anything greater gets promoted to int. You should cast to unsigned char, which is as wide as char so it'll store anything char does and safely, but also make truncation warnings legitimate:
ss.put(static_cast<unsigned char>(0x80)); // OK
ss.put(static_cast<unsigned char>(0xFFFF)); //C4309
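To avoid repeating the cast at every call site, one option is a tiny wrapper that takes the byte as unsigned char. This is just a sketch: put_byte and demo_bytes are hypothetical names, and I have not checked it against VC++ 2010 specifically, although values in the range 0x00 to 0xFF fit the parameter type without truncation.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes one raw byte to any std::ostream, so the
// conversion to char happens in exactly one place.
inline void put_byte(std::ostream& out, unsigned char byte)
{
    out.put(static_cast<char>(byte));
}

// Small demonstration: builds a three-byte string in memory.
inline std::string demo_bytes()
{
    std::ostringstream ss;
    put_byte(ss, 0x7f);
    put_byte(ss, 0x80);
    put_byte(ss, 0xa4);
    return ss.str();
}
```

Because the parameter is a std::ostream reference, the same helper works for std::stringstream and std::ofstream alike.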
Often an object I use will have (signed) int parameters (e.g. int iSize) which eventually store how large something should be. At the same time, I will often initialize them to -1 to signify that the object (etc) hasn't been setup / hasn't been filled / isn't ready for use.
I often end up with the warning comparison between signed and unsigned integer, when I do something like if( iSize >= someVector.size() ) { ... }.
Thus, I nominally don't want to be using an unsigned int. Are there any situations where this will lead to an error or unexpected behavior?
If not: what is the best way to handle this? If I use the compiler flag -Wno-sign-compare I could (hypothetically) miss a situation in which I should be using an unsigned int (or something like that). So should I just use a cast when comparing with an unsigned int--e.g. if( iSize >= (int)someVector.size() ) { ... } ?
Yes, there are, and very subtle ones. If you are curious, you can check this interesting presentation by Stephan T. Lavavej about arithmetic conversion and a bug in Microsoft's implementation of STL which was caused just by signed vs unsigned comparison.
In general, the problem is that, because of two's complement arithmetic, a small negative integral value has the same bit representation as a very big unsigned integral value (e.g. for 16-bit integers, -1 = 0xFFFF = 65535).
In the specific case of checking size(), why not use type size_t for iSize in the first place? Unsigned values just give you greater expressivity, so use them.
And if you do not want to declare iSize as size_t, just make it clear by using an explicit cast that you are aware of the nature of this comparison. The compiler is trying to do you a favor with those warnings and, as you correctly wrote, there might be situations where ignoring them would cause you a very bad headache.
Thus, if iSize is sometimes negative (and should be evaluated as less than all unsigned int values of size()), use the idiom: if ((iSize < 0) || ((unsigned)iSize < somevector.size())) ...
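Wrapped in a function, that idiom and the comparison it replaces can be put side by side. Both function names are invented for this sketch, and naive_less only spells out what the implicit conversion in a mixed comparison effectively does.

```cpp
#include <cstddef>

// Hypothetical helper: treats any negative value as smaller than every
// size, and only converts to unsigned once the value is known to be >= 0.
inline bool signed_less(int value, std::size_t size)
{
    return value < 0 || static_cast<std::size_t>(value) < size;
}

// What an unguarded mixed comparison effectively computes: the int is
// converted to the unsigned type first, so -1 becomes a huge value.
inline bool naive_less(int value, std::size_t size)
{
    return static_cast<std::size_t>(value) < size;
}
```

With an iSize of -1 and a vector of three elements, signed_less(-1, 3) is true while naive_less(-1, 3) is false. If C++20 is available, std::cmp_less from <utility> performs the same kind of sign-aware comparison.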
It seems safe to cast the result of my vector's size() function to an unsigned int. How can I tell for sure, though? My documentation isn't clear about how size_type is defined.
Do not assume the type of the container size (or anything else typed inside).
Today?
The best solution for now is to use:
std::vector<T>::size_type
Where T is your type. For example:
std::vector<std::string>::size_type i ;
std::vector<int>::size_type j ;
std::vector<std::vector<double> >::size_type k ;
(Using a typedef could help make this better to read)
The same goes for iterators, and all other types "inside" STL containers.
After C++0x?
Once the compiler is able to deduce the type of the variable, you'll be able to use the auto keyword. For example:
void doSomething(const std::vector<double> & p_aData)
{
std::vector<double>::size_type i = p_aData.size() ; // Old/Current way
auto j = p_aData.size() ; // New C++0x way, definition
decltype(p_aData.size()) k; // New C++0x way, declaration
}
Edit: Question from JF
What if he needs to pass the size of the container to some existing code that uses, say, an unsigned int? – JF
This is a problem common to the use of the STL: You cannot do it without some work.
The first solution is to design the code to always use the STL type. For example:
typedef std::vector<int>::size_type VIntSize ;
VIntSize getIndexOfSomeItem(const std::vector<int> & p_aInt)
{
return /* the found value, or some kind of std::npos */
}
The second is to make the conversion yourself, using either a static_cast or a function that will assert if the value goes out of bounds of the destination type (sometimes, I see code using "char" because, "you know, the index will never go beyond 256" [I quote from memory]).
I believe this could be a full question in itself.
According to the standard, you cannot be sure. The exact type depends on your machine. You can look at the definition in your compiler's header implementations, though.
I can't imagine that it wouldn't be safe on a 32-bit system, but 64-bit could be a problem (since ints remain 32 bit). To be safe, why not just declare your variable to be vector<MyType>::size_type instead of unsigned int?
It should always be safe to cast it to size_t. unsigned int isn't enough on most 64-bit systems, and even unsigned long isn't enough on Windows (which uses the LLP64 model instead of the LP64 model most Unix-like systems use).
The C++ standard only states that size_t is found in <cstddef>, which puts the identifiers in <stddef.h>. My copy of Harbison & Steele places the minimum and maximum values for size_t in <stdint.h>. That should give you a notion of how big your recipient variable needs to be for your platform.
Your best bet is to stick with integer types that are large enough to hold a pointer on your platform. In C99, that'd be intptr_t and uintptr_t, also officially located in <stdint.h>.
As long as you're sure that an unsigned int on your system will be large enough to hold the number of items you'll have in the vector you should be safe ;-)
I'm not sure how well this will work because I'm just thinking off the top of my head, but a compile-time assertion (such as BOOST_STATIC_ASSERT() or see Ways to ASSERT expressions at build time in C) might help. Something like:
BOOST_STATIC_ASSERT( sizeof( unsigned int) >= sizeof( size_type));
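With C++11 or later the same kind of check can be written without Boost using static_assert. The sketch below compares maximum values rather than sizeof, and the trait name holds_all_values and the alias IntVectorSize are invented for the example; it is only meant for unsigned types.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical trait: true when every value of the unsigned type Source
// also fits in the unsigned type Target.
template <typename Target, typename Source>
constexpr bool holds_all_values()
{
    return std::numeric_limits<Target>::max() >= std::numeric_limits<Source>::max();
}

using IntVectorSize = std::vector<int>::size_type;

// Fails to compile on a platform where std::size_t is too narrow.
static_assert(holds_all_values<std::size_t, IntVectorSize>(),
              "std::size_t cannot hold every vector size");
```

Swapping std::size_t for unsigned int in the static_assert gives the check from the question. On a typical 64-bit platform that version would refuse to compile, which is exactly the early warning one wants.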