Have to cast value in assignment - c++

I've got some c++ code that's requiring me to cast an immediate in an assignment statement. The casting makes the code more difficult to read and I was hoping there was a way around this.
uint64_t shifted_val = (uint64_t)1 << 50;
If I write this code without the cast, shifted_val gets set to 0, I assume because it's treating the 1 immediate as a 32-bit value. Is there something I'm missing so that I can write this without casting?

You can do:
uint64_t shifted_val = 1ull << 50;
If you think that syntax is still too close to a cast, you can do:
uint64_t a = 1;
uint64_t shifted_val = a << 50;

One way to do it is to adopt the habit of performing the calculation within the recipient variable itself:
uint64_t shifted_val = 1;
shifted_val <<= 50;
This will solve the issue naturally, without requiring you to hardcode additional type references into the expression (like type casts or type-specific suffixes).

The constant needs to be treated as a 64 bit value, so there needs to be some way of specifying that.
You could specify it as 1ULL, which tells the compiler the constant is an unsigned long long. That type is guaranteed to be at least 64 bits wide, but not necessarily exactly 64, so the cast to uint64_t is the more portable way to get exactly that type.
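The variants discussed in these answers can be put side by side. The following is a minimal sketch, not taken from any of the answers: the function names are made up for illustration, and UINT64_C is the standard macro from <cstdint> that expands to a constant of type uint_least64_t.

```cpp
#include <cstdint>

// Hypothetical helper names, for illustration only.

// Suffix form: 1ull is unsigned long long, which is at least 64 bits wide.
inline std::uint64_t shift_with_suffix() { return 1ull << 50; }

// Cast form from the question, written with a C++-style cast.
inline std::uint64_t shift_with_cast() { return static_cast<std::uint64_t>(1) << 50; }

// Standard macro from <cstdint>: expands to a constant of type uint_least64_t.
inline std::uint64_t shift_with_macro() { return UINT64_C(1) << 50; }

// Compound assignment on the destination, so the shift is done on a uint64_t.
inline std::uint64_t shift_in_place() {
    std::uint64_t v = 1;
    v <<= 50;
    return v;
}
```

All four functions should return the same value, 2 to the power 50, because in each case the left operand of the shift is at least 64 bits wide before the shift happens.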

Related

Converting Integer Types

How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Different compilers will warn for something like:
int i = get_int();
size_t s = i;
for loss of signedness or
size_t s = get_size();
int i = s;
for narrowing.
Casting can remove the warnings but doesn't solve the safety issue.
Is there a proper way of doing this?
You can try boost::numeric_cast<>.
boost::numeric_cast returns the result of converting a value of type Source to a value of type Target. If an out-of-range value is detected, an exception is thrown (see bad_numeric_cast, negative_overflow and positive_overflow).
How does one convert from one integer type to another safely and with setting off alarm bells in compilers and static analysis tools?
Control when conversion is needed. Where possible, only convert when there is no value change. Sometimes one must step back and code at a higher level. In other words, was a lossy conversion needed, or can the code be reworked to avoid conversion loss?
It is not hard to add an if(). The test just needs to be carefully formed.
Example where size_t n and int len need to be compared. Note that the maximum value of int may exceed that of size_t, or vice versa, or be the same. Note that in this case the conversion of int to unsigned only happens with non-negative values, thus no value change.
int len = snprintf(buf, n, ...);
if (len < 0 || (unsigned)len >= n) {
// Handle_error();
}
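As a self-contained version of the check above, here is a minimal sketch. The helper name format_id and the "id=%d" format string are made up for illustration, and the cast targets std::size_t rather than unsigned so that both sides of the comparison have the same type.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats "id=<id>" into buf (capacity n) and reports
// whether the whole string, including the terminator, fit.
inline bool format_id(char* buf, std::size_t n, int id) {
    int len = std::snprintf(buf, n, "id=%d", id);
    // len < 0 signals an error. Only after that test is len converted to an
    // unsigned type, so the conversion never changes the value.
    if (len < 0 || static_cast<std::size_t>(len) >= n) {
        return false;  // error or truncated output
    }
    return true;
}
```

With an 8-byte buffer, format_id(buf, sizeof buf, 42) should succeed, while passing a capacity of 4 for the id 12345 should report truncation.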
unsigned to int example when it is known that the unsigned value at this point of code is less than or equal to INT_MAX.
unsigned n = ...
int i = n & INT_MAX;
Good analysis tools see that n & INT_MAX always converts into int without loss.
There is no built-in safe narrowing conversion between integer types in C++ or the STL. You could implement it yourself, using Microsoft GSL (gsl::narrow) as an example.
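A minimal sketch of such a do-it-yourself conversion follows. The name checked_narrow is hypothetical, and the code is only loosely modelled on the idea behind gsl::narrow (convert, convert back, compare); it is not the GSL implementation and throws a standard exception instead of a GSL one.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Hypothetical checked_narrow: convert, convert back, and reject the result
// if either the value or its sign changed along the way.
template <typename To, typename From>
To checked_narrow(From value) {
    static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                  "this sketch handles integer types only");
    const To result = static_cast<To>(value);
    // The round trip catches values that were truncated.
    const bool round_trips = static_cast<From>(result) == value;
    // The sign check catches signed/unsigned mixes, e.g. -1 to an unsigned type.
    const bool same_sign = (result < To{}) == (value < From{});
    if (!round_trips || !same_sign) {
        throw std::range_error("checked_narrow: value does not fit in target type");
    }
    return result;
}
```

For example, checked_narrow<int>(std::size_t{42}) returns 42, while checked_narrow<std::size_t>(-1) and checked_narrow<short>(100000) throw.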
Theoretically, if you want perfect safety, you shouldn't be mixing types like this at all. (And you definitely shouldn't be using explicit casts to silence warnings, as you know.) If you've got values of type size_t, it's best to always carry them around in variables of type size_t.
There is one case where I do sometimes decide I can accept less than 100.000% perfect type safety, and that is when I assign sizeof's return value, which is a size_t, to an int. For any machine I am ever going to use, the only time this conversion might lose information is when sizeof returns a value greater than 2147483647. But I am content to assume that no single object in any of my programs will ever be that big. (In particular, I will unhesitatingly write things like printf("sizeof(int) = %d\n", (int)sizeof(int)), explicit cast and all. There is no possible way that the size of a type like int will not fit in an int!)
[Footnote: Yes, it's true, on a 16-bit machine the assumption is the rather less satisfying threshold that sizeof won't return a value greater than 32767. It's more likely that a single object might have a size like that, but probably not in a program that's running on a 16-bitter.]

how does the short(vector.size()) command conversion work in C++?

I don't know any way to get the size of a vector other than .size(), and it works very well, but it returns a value of type long long unsigned int. In most cases that is fine, but I'm sure my program will never have a vector so big that it needs a return type that large; short int is more than enough.
I know that for today's computers those few unused bytes are irrelevant, but I don't like to leave these "loose ends" even if they are small, and while I was programming I came across some details that bothered me.
Look at these examples:
for(short int X = 0 ; X < Vector.size() ; X++){
}
compiling this, I receive this warning:
warning: comparison of integer expressions of different signedness: 'short int' and 'std::vector<unsigned char>::size_type' {aka 'long long unsigned int'} [-Wsign-compare]|
This is because the type returned by .size() is different from the short int I'm comparing it with: "X" is a short int, and Vector.size() returns a long long unsigned int. That was expected, so if I do this:
for(size_t X = 0 ; X < Vector.size() ; X++){
}
the problem is gone. But by doing this I'm creating a long long unsigned int in the size_t variable, and .size() returns another long long unsigned int, so my computer allocates two long long unsigned int variables. So what do I do to get back a simple short int? I don't need anything more than that; long long unsigned int is overkill. So I did this:
for(short int X = 0 ; X < short(Vector.size()) ; X++){
}
But... how does this work? short int X = 0 allocates a short int, nothing new, but what about short(Vector.size())? Is the computer allocating a long long unsigned int and converting it to a short int? Or is the compiler "changing" the return of the .size() function so that it naturally returns a short int and, in this case, never allocating a long long unsigned int? I know compilers are responsible for optimizing the code too. Is there any "problem" or "detail" when using this method, since I rarely see anyone using it? What exactly is this short() doing in terms of memory allocation? Where can I read more about it?
(thanks to everyone who responded)
Forget for a moment that this involves a for loop; that's important for the underlying code, but it's a distraction from what's going on with the conversion.
short X = Vector.size();
That line calls Vector.size(), which returns a value of type std::size_t. std::size_t is an unsigned type, large enough to hold the size of any object. So it could be unsigned long, or it could be unsigned long long. In any event, it's definitely not short. So the compiler has to convert that value to short, and that's what it does.
Most compilers these days don't trust you to understand what this actually does, so they warn you. (Yes, I'm rather opinionated about compilers that nag; that doesn't change the analysis here). So if you want to see that warning (i.e., you don't turn it off), you'll see it. If you want to write code that doesn't generate that warning, then you have to change the code to say "yes, I know, and I really mean it". You do that with a cast:
short X = short(Vector.size());
The cast tells the compiler to call Vector.size() and convert the resulting value to short. The code then assigns the result of that conversion to X. So, more briefly, in this case it tells the compiler that you want it to do exactly what it would have done without the cast. The difference is that because you wrote a cast, the compiler won't warn you that you might not know what you're doing.
Some folks prefer to write that cast with a static_cast:
short X = static_cast<short>(Vector.size());
That does the same thing: it tells the compiler to do the conversion to short and, again, the compiler won't warn you that you did it.
In the original for loop, a different conversion occurs:
X < Vector.size()
That bit of code calls Vector.size(), which still returns an unsigned type. In order to compare that value with X, the two sides of the < have to have the same type, and the rules for this kind of expression require that X gets promoted to std::size_t, i.e., that the value of X gets treated as an unsigned type. That's okay as long as the value isn't negative. If it's negative, the conversion to the unsigned type is okay, but it will produce results that probably aren't what was intended. Since we know that X is not negative here, the code works perfectly well.
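To see what that conversion does to a negative value, here is a minimal sketch. The helper name less_than_size is made up; it spells out with a cast the same conversion that is applied implicitly to the left operand of X < Vector.size().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the explicit cast mirrors the implicit conversion of
// the signed operand to the vector's unsigned size_type.
inline bool less_than_size(short x, const std::vector<unsigned char>& v) {
    return static_cast<std::vector<unsigned char>::size_type>(x) < v.size();
}
```

For a vector of three elements, the indices 0 to 2 compare as smaller than the size, as expected. The value -1, however, is converted to the largest value of the unsigned type, so the comparison is false even though -1 is mathematically smaller than 3.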
But we're still in the territory of compiler nags: since X is signed, the compiler warns you that promoting it to an unsigned type might do something that you don't expect. Again, you know that that won't happen, but the compiler doesn't trust you. So you have to insist that you know what you're doing, and again, you do that with a cast:
X < short(Vector.size())
Just like before, that cast converts the result of calling Vector.size() to short. Now both sides of the < are the same type, so the < operation doesn't require a conversion from a signed to an unsigned type, so the compiler has nothing to complain about. There is still a conversion, because the rules say that values of type short get promoted to int in this expression, but don't worry about that for now.
Another possibility is to use an unsigned type for that loop index:
for (unsigned short X = 0; X < Vector.size(); ++X)
But the compiler might still insist on warning you that not all values of type std::size_t can fit in an unsigned short. So, again, you might need a cast. Or change the type of the index to match what the compiler thinks you need:
for (std::size_t X = 0; X < Vector.size(); ++X)
If I were to go this route, I would use unsigned int and if the compiler insisted on telling me that I don't know what I'm doing I'd yell at the compiler (which usually isn't helpful) and then I'd turn off that warning. There's really no point in using short here, because the loop index will always be converted to int (or unsigned int) wherever it's used. It will probably be in a register, so there is no space actually saved by storing it as a short.
Even better, as recommended in other answers, is to use a range-based for loop, which avoids managing that index:
for (auto& value: Vector) ...
In all cases, X has a storage duration of automatic, and the result of Vector.size() does not outlive the full expression where it is created.
I don't need anything more than this, long long unsigned int is overkill
Typically, automatic duration variables are "allocated" either on the stack, or as registers. In either case, there is no performance benefit to decreasing the allocation size, and there can be a performance penalty in narrowing and then widening values.
In the very common case where you are using X solely to index into Vector, you should strongly consider using a different kind of for:
for (auto & value : Vector) {
// replace Vector[X] with value in your loop body
}
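As a complete illustration, the following sketch sums the elements of a vector both ways. The function names are hypothetical, and the element type unsigned char is taken from the warning message in the question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: index-based loop using the container's own size type.
inline unsigned sum_indexed(const std::vector<unsigned char>& v) {
    unsigned total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Same computation with a range-based for: there is no index, so there is
// no index type to choose and nothing to compare against size().
inline unsigned sum_ranged(const std::vector<unsigned char>& v) {
    unsigned total = 0;
    for (const auto& value : v) {
        total += value;
    }
    return total;
}
```

Both functions return the same total for the same input; the second simply has no index variable whose type could trigger a sign-compare warning.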

Should I use a bit mask when truncating uint64_t to uint8_t[i]?

If I have a large int, say a uint64_t, and an array of uint8_t, e.g.:
uint64_t large = 12345678901234567890ULL; // suffix needed: the value does not fit in a signed 64-bit type
uint8_t small[5];
and I want to copy the 8 least significant bits of the uint64_t into an element of the array of uint8_t, is it safe to just use:
small[3] = large;
or should I use a bit-mask:
small[3] = large & 255;
i.e. Is there any situation where the rest of the large int may somehow overflow into the other elements of the array?
It will most certainly not cause data to be processed incorrectly. However, some compilers may generate a warning message.
There are two options to avoid the warning.
You can cast your variable:
(uint8_t)large
Or you can disable the warning (MSVC syntax; C4244 is the "possible loss of data" conversion warning):
#pragma warning(disable:4244)
I would suggest casting the variable, because hiding compiler warnings will potentially keep you from spotting actual problems and is therefore not best practice.
This is perfectly safe:
small[3] = large;
and such a conversion is explicitly described in [conv.integral]:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
That is, these four statements all are guaranteed to end up with the same value in small[3]:
small[3] = large;
small[3] = large % 256;
small[3] = large & 255;
small[3] = static_cast<uint8_t>(large);
There's no functional reason to do the % or & or cast yourself, though if you want to anyway I would be surprised if the compiler didn't generate the same code for all four (gcc and clang do).
The one difference would be if you compile with something like -Wconversion, which would cause this to issue a warning (which can sometimes be beneficial). In that case, you'll want to do the cast.
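The equivalence of the four statements can be checked with a minimal sketch like the following. The helper names are made up for illustration; each returns the low 8 bits of its argument in one of the four ways listed above.

```cpp
#include <cstdint>

// Hypothetical helpers, one per variant from the answer.

// Implicit conversion on return; may warn under -Wconversion.
inline std::uint8_t low_byte_implicit(std::uint64_t large) { return large; }

// Explicit remainder modulo 2^8.
inline std::uint8_t low_byte_modulo(std::uint64_t large) { return large % 256; }

// Explicit bit mask keeping the 8 least significant bits.
inline std::uint8_t low_byte_mask(std::uint64_t large) { return large & 255; }

// Explicit cast; the usual way to silence a conversion warning.
inline std::uint8_t low_byte_cast(std::uint64_t large) {
    return static_cast<std::uint8_t>(large);
}
```

For the value from the question, all four helpers return the same byte, and assigning that byte to small[3] leaves small[2] and small[4] untouched, since only a single uint8_t is ever written.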

What is the number 0ui64?

I see the
#define NUM_MAX_VOLUME 0ui64
in other people's code
What is the number 0ui64? It seems it is not a hex number though.
I am surprised that there are many answers, but none has pointed to the official and authoritative documentation, which in my opinion should be noted, so here is the MSDN documentation:
unsigned-suffix: one of
u U
and
64-bit integer-suffix:
i64 LL ll
So it is indeed not a hex number; the macro is defined as zero with the type of an unsigned 64-bit integer. Please note that 0Ui64, 0ULL, 0ull, etc. would all be the same, too.
This is necessary when you want to make sure that the sign and size are fixed, so that the expression cannot run into unexpected or undefined behavior.
This is neither standard C++, nor C, but a Microsoft compiler feature. Try to avoid it.
Since your question is tagged as Qt, the recommendation is to use quint64 and Q_UINT64_C instead which will work cross-platform. So, you would write something like this:
#define NUM_MAX_VOLUME Q_UINT64_C(0)
"ui64" means unsigned 64-bit integer. It is a non-standard suffix in some cases.
"0ui64" is just 0, and i guess the reason to write like this is for compatibility.
It's basically used in an expression where the size of the operand (the constant here) matters. Take shifting for example:
auto i = 1 << 36;
On a machine where int is 32 bits wide this leads to undefined behaviour, since the 1 here is taken as an int and you're trying to shift it beyond the size of the resulting type, int. What you want is a 64-bit integral type, say unsigned long long; then you'd do
auto i = 1ULL << 36;
This isn't UB since the resulting type would also be an unsigned long long due to the operand (which is now an unsigned long long too).
Another example is type deduction of the C++11's auto keyword. Try this:
for (auto i = 0; i < v.size(); ++i)
With warnings enabled, GCC barks:
warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
However, changing this to
for (auto i = 0u; i < v.size(); ++i)
makes the warning disappear, again because the suffix in 0u leads the compiler to deduce the type of i as unsigned int and not simply int.
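The deduced types can be checked at compile time. The following is a minimal sketch using std::is_same from <type_traits>; the helper bit36 is a made-up name so that the shifted value can also be inspected at run time.

```cpp
#include <type_traits>

// An unsuffixed small literal is int; the u suffix makes it unsigned int.
static_assert(std::is_same<decltype(0), int>::value, "0 is int");
static_assert(std::is_same<decltype(0u), unsigned int>::value,
              "0u is unsigned int");

// A shift expression has the (promoted) type of its left operand.
static_assert(std::is_same<decltype(1ULL << 36), unsigned long long>::value,
              "1ULL << 36 is unsigned long long");

// Hypothetical helper returning the shifted value.
inline unsigned long long bit36() { return 1ULL << 36; }
```

If any of the static_assert lines were wrong, the code would fail to compile, which makes this a cheap way to confirm what type auto would deduce from a given literal.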
In your case, you've the suffix ui64 which isn't standard C++, so it should be an implementation-specific extension that denotes unsigned 64-bit integer.
0xFull, for example, is also a valid C++ constant. It is 0xF with type unsigned long long, which is what ui64 denotes in Microsoft's notation.

Why or why not should I use 'UL' to specify unsigned long?

ulong foo = 0;
ulong bar = 0UL;//this seems redundant and unnecessary. but I see it a lot.
I also see this in referencing the first element of arrays a good amount
blah = arr[0UL];//this seems silly since I don't expect the compiler to magically
//turn '0' into a signed value
Can someone provide some insight to why I need 'UL' throughout to specify specifically that this is an unsigned long?
void f(unsigned int x)
{
//
}
void f(int x)
{
//
}
...
f(3); // f(int x)
f(3u); // f(unsigned int x)
It is just another tool in C++; if you don't need it don't use it!
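To make the overload selection observable, here is a minimal sketch. The name which_overload and the returned strings are made up for illustration; the two parameter types match the overloads of f above.

```cpp
#include <string>

// Hypothetical overloads that report which one was selected.
inline std::string which_overload(unsigned int) { return "unsigned int"; }
inline std::string which_overload(int) { return "int"; }
```

Calling which_overload(3) selects the int overload, while which_overload(3u) selects the unsigned int overload, because the suffix changes the type of the literal and therefore which overload is an exact match.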
In the examples you provide it isn't needed. But suffixes are often used in expressions to prevent loss of precision. For example:
unsigned long x = 5UL * ...
You may get a different answer if you left off the UL suffix, say if your system had 16-bit ints and 32-bit longs.
Here is another example inspired by Richard Corden's comments:
unsigned long x = 1UL << 17;
Again, if you left the suffix off, the answer would depend on whether int is 16 or 32 bits wide.
The same type of problem will apply with 32 vs 64-bit ints and mixing long and long long in expressions.
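The 32-bit versus 64-bit case can be sketched as follows. This is an illustration under a stated assumption: unsigned int is 32 bits wide, which the static_assert checks. The function names and the operand 100000 are made up.

```cpp
#include <climits>
#include <cstdint>

static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
              "this sketch assumes a 32-bit unsigned int");

// The multiplication is carried out in unsigned int, so the result wraps
// modulo 2^32 before it is widened to the return type.
inline std::uint64_t product_narrow() { return 100000u * 100000u; }

// The ULL suffix makes both operands unsigned long long, so the full
// product is kept.
inline std::uint64_t product_wide() { return 100000ULL * 100000ULL; }
```

Under that assumption, product_wide() returns 10000000000, while product_narrow() returns 1410065408, which is 10000000000 reduced modulo 2^32. The destination type is the same in both cases; only the literal suffix differs.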
Some compilers may emit a warning, I suppose.
The author could be doing this to make sure the code has no warnings?
Sorry, I realize this is a rather old question, but I use this a lot in c++11 code...
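ul and f are useful for initialising auto variables to your intended type. Note that the f suffix needs a floating-point literal (0.0f, not 0f), and that there is no d suffix in standard C++: an unsuffixed floating-point literal is already a double. For example:
auto my_u_long = 0ul;
auto my_float = 0.0f;
auto my_double = 0.0;
Check out the reference on numeric literals: http://www.cplusplus.com/doc/tutorial/constants/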
You don't normally need it, and any tolerable editor will have enough assistance to keep things straight. However, the places I use it in C# are (and you'll see these in C++):
Calling a generic method (template in C++), where the parameter types are implied and you want to make sure and call the one with an unsigned long type. This happens reasonably often, including this one recently:
Tuple<ulong, ulong> = Tuple.Create(someUlongVariable, 0UL);
where without the UL it returns Tuple<ulong, int> and won't compile.
Implicit variable declarations using the var keyword in C# or the auto keyword coming to C++. This is less common for me because I only use var to shorten very long declarations, and ulong is the opposite.
When you make a habit of writing down the type of a constant (even when not absolutely necessary), you make sure that:
You always consider how the compiler will translate the constant into bits.
Whoever reads your code will always know how you thought the constant looks and that you have taken it into consideration (even you, when you rescan the code).
You don't spend time wondering whether you need to write the 'U'/'UL' or not.
Also, several software development standards such as MISRA require you to state the type of a constant no matter what (at least write 'U' if it is unsigned).
In other words, some consider it good practice to write the type of a constant: at worst you just ignore it, and at best you avoid bugs, avoid the chance that different compilers will treat your code differently, and improve code readability.