42 as an unsigned int is well defined as "42U".
unsigned int foo = 42U; // yeah!
How can I write "23" so that it is clear it is an unsigned short int?
unsigned short bar = 23; // booh! not clear!
EDIT, to make the meaning of the question clearer:
#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>

template <class T>
void doSomething(T) {
    std::cout << "unknown type" << std::endl;
}
template<>
void doSomething(unsigned int) {
    std::cout << "unsigned int" << std::endl;
}
template<>
void doSomething(unsigned short) {
    std::cout << "unsigned short" << std::endl;
}
int main(int argc, char* argv[])
{
    doSomething(42U);
    doSomething((unsigned short)23); // no other option than a cast?
    return EXIT_SUCCESS;
}
You can't. Numeric literals cannot have short or unsigned short type.
Of course in order to assign to bar, the value of the literal is implicitly converted to unsigned short. In your first sample code, you could make that conversion explicit with a cast, but I think it's pretty obvious already what conversion will take place. Casting is potentially worse, since with some compilers it will quell any warnings that would be issued if the literal value is outside the range of an unsigned short. Then again, if you want to use such a value for a good reason, then quelling the warnings is good.
In the example in your edit, where it happens to be a template function rather than an overloaded function, you do have an alternative to a cast: doSomething<unsigned short>(23). With an overloaded function, you could still avoid a cast with:
void (*f)(unsigned short) = &doSomething;
f(23);
... but I don't advise it. If nothing else, this only works if the unsigned short version actually exists, whereas a call with the cast performs the usual overload resolution to find the most compatible version available.
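Putting the pieces together, here is a minimal sketch of both cast-free alternatives, reusing the doSomething template from the question:
#include <cstdlib>
#include <iostream>

template <class T> void doSomething(T)              { std::cout << "unknown type\n"; }
template <>        void doSomething(unsigned int)   { std::cout << "unsigned int\n"; }
template <>        void doSomething(unsigned short) { std::cout << "unsigned short\n"; }

int main() {
    doSomething<unsigned short>(23);           // explicit template argument, no cast
    void (*f)(unsigned short) = &doSomething;  // target type selects the unsigned short specialization
    f(23);
    return EXIT_SUCCESS;
}
Both calls print "unsigned short"; the function-pointer version only compiles if that specialization actually exists.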
unsigned short bar = (unsigned short) 23;
or in new speak....
unsigned short bar = static_cast<unsigned short>(23);
At least in Visual Studio (2013 and newer, at least) you can write
23ui16
to get a constant of type unsigned short.
See the definitions of the INT8_MIN, INT8_MAX, INT16_MIN, INT16_MAX, etc. macros in stdint.h.
As far as I know this is not part of standard C/C++; the i16/ui16 suffixes are a Microsoft-specific extension.
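A minimal sketch, assuming an MSVC build (the ui16 suffix does not compile with GCC or Clang), showing how it could pick the unsigned short specialization from the question without a cast; the function name is illustrative:
#ifdef _MSC_VER
void demoMsvcSuffix() {
    doSomething(23ui16); // 23ui16 has type unsigned __int16, i.e. unsigned short
}
#endif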
There are no suffixes for unsigned short. Integer literals, which have type int by default, are usually implicitly converted to the target type with no problems. But if you really want to indicate the type explicitly, you could write the following:
unsigned short bar = static_cast<unsigned short>(23);
As far as I can see, the only reason to use such an indication is for proper template type deduction:
func( static_cast<unsigned short>(23) );
But in that case, a clearer call would be the following:
func<unsigned short>( 23 );
There are multiple answers here, none of which are terribly satisfying. So here is a compilation answer with some added info to help explain things a little more thoroughly.
First, avoid shorts as suggested elsewhere. But if you find yourself needing them, such as when working with indexed mesh data where simply switching to shorts for your index type cuts your index data size in half... then read on.
1. While it is technically true that there is no way to express an unsigned short literal in C or C++, you can easily sidestep this limitation by simply marking your literal as unsigned with a 'u'.
unsigned short myushort = 16u;
This works because it tells the compiler that 16 is an unsigned int; the compiler then looks for a way to convert it to unsigned short, finds one, checks for overflow (most compilers do), and performs the conversion with no complaints. The "narrowing conversion" error/warning you can get when the "u" is left out is the compiler complaining that the code is throwing away the sign. If the literal is negative, such as -1, the conversion to an unsigned type is well defined by modular arithmetic: you get a very large unsigned value, which is then truncated to fit the short.
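A minimal sketch of what the paragraph above describes: an in-range literal converts silently, and a negative value wraps around rather than producing garbage:
#include <iostream>

int main() {
    unsigned short from_literal = 16u;                        // in range: silent conversion
    unsigned short wrapped = static_cast<unsigned short>(-1); // wraps to 65535 with 16-bit shorts
    std::cout << from_literal << " " << wrapped << "\n";
}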
2. There are multiple suggestions on how to sidestep this limitation; most seasoned programmers will sum these up with a "don't do that".
unsigned short myshort = (unsigned short)16;
unsigned short myothershort = static_cast<unsigned short>(16);
While both of these work, they are undesirable for two major reasons. First, they are wordy; programmers get lazy, and typing all that just for a literal is easy to skip, which leads to basic errors that could have been avoided with a better solution. Second, they are not free: static_cast in particular may generate a little assembly code to do the conversion, and while an optimizer may (or may not) figure out that it can elide it, it's better to just write good quality code from the start.
unsigned short myshort = 16ui16;
This solution is undesirable because it limits who can read and understand your code. It also means you are starting down the slippery slope of compiler-specific code, which can lead to your code suddenly not working because of the whims of some compiler writer, or of some company that randomly decides to "make a right hand turn" or goes away and leaves you in the lurch.
unsigned short bar = L'\x17';
This is so unreadable that nobody has upvoted it. And unreadable code should be avoided for many good reasons.
unsigned short bar = 0xf;
This too is unreadable. While being able to read, understand, and convert hex is something serious programmers really need to learn, it is not readable at a glance. Quick, what number is this: 0xbad? Now convert it to binary... now octal.
3. Lastly, if you find all the above solutions undesirable, I offer up yet another solution, available via a user-defined literal operator.
constexpr unsigned short operator ""_ushort(unsigned long long x)
{
    return (unsigned short)x;
}
and to use it
unsigned short x = 16_ushort;
Admittedly this too isn't perfect. First, it takes an unsigned long long and whacks it all the way down to an unsigned short, suppressing potential compiler warnings along the way, and it uses a C-style cast. But it is constexpr, which guarantees it is free in an optimized program, yet can be stepped into during debug. It is also short and sweet, so programmers are more likely to use it, and it is expressive, so it is easy to read and understand. Unfortunately it requires a recent compiler, as what can legally be done with user-defined literal operators has changed over the various versions of C++.
So pick your trade off but be careful as you may regret them later. Happy Programming.
Unfortunately, the only method defined for this is
One or two characters in single quotes
('), preceded by the letter L
According to http://cpp.comsci.us/etymology/literals.html
Which means you would have to represent your number as an ASCII escape sequence:
unsigned short bar = L'\x17';
Unfortunately, they can't. But if people just look two words behind the number, they should clearly see it is a short... It's not THAT ambiguous. But it would be nice.
If you express the quantity as a 4-digit hex number, the unsigned shortness might be clearer.
unsigned short bar = 0x0017;
You probably shouldn't use short, unless you have a whole lot of them. It's intended to use less storage than an int, but that int will have the "natural size" for the architecture. Logically it follows that a short probably doesn't. Similar to bitfields, this means that shorts can be considered a space/time tradeoff. That's usually only worth it if it buys you a whole lot of space. You're unlikely to have very many literals in your application, though, so there was no need foreseen to have short literals. The usecases simply didn't overlap.
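A small sketch of that space/time tradeoff: the saving only shows up when you store many values, as in an array, not for a lone literal or loop counter:
#include <cstdio>

int main() {
    short packed[1000];   // typically 2000 bytes
    int   natural[1000];  // typically 4000 bytes
    std::printf("short: %zu bytes, int: %zu bytes, arrays: %zu vs %zu bytes\n",
                sizeof(short), sizeof(int), sizeof packed, sizeof natural);
}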
In C++11 and beyond, if you really want an unsigned short literal, it can be done with a user-defined literal (with the assert statement inside the constexpr body you need C++14's relaxed constexpr rules):
#include <cassert>  // assert
#include <climits>  // USHRT_MAX

using uint16 = unsigned short;
using uint64 = unsigned long long;

constexpr uint16 operator""_u16(uint64 to_short) {
    // use your favorite value validation
    assert(to_short <= USHRT_MAX); // USHRT_MAX from <climits>
    return static_cast<uint16>(to_short);
}

int main(void) {
    uint16 val = 26_u16;
    (void)val; // silence unused-variable warnings
}
Related
I had some code that when simplified is essentially this
unsigned char a=255;
unsigned char b=0;
while (a+1==b) {//do something}
Now since 255+1=0 with unsigned chars I was expecting it to do something but it didn't because it promoted everything to int. I can make it work by replacing a+1 with either (unsigned char)(a+1) or (a+1)%256.
What would be the best way to tell the compiler that I don't want the types to be changed? Or should I just be doing 1 of the ways I know works?
According to this cppreference.com page, "arithmetic operators don't accept types smaller than int as arguments." That said, your proposed solution of (unsigned char)(a + 1) is enough: the sum is still computed as int, but the cast converts it back to unsigned char, so the result rolls over to 0. Using the modulo operator is more explicit, but it introduces an additional operation in the machine code. It's a balance between clarity and performance.
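A minimal sketch of the two workarounds discussed above, showing that the promoted sum differs from b while both fixes compare equal:
#include <cassert>

int main() {
    unsigned char a = 255;
    unsigned char b = 0;
    assert(a + 1 != b);                  // operands promoted to int: 256 != 0
    assert((unsigned char)(a + 1) == b); // cast truncates back: wraps to 0
    assert((a + 1) % 256 == b);          // explicit modulo, same result
}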
I don't know any way to get the size of a vector other than the .size() member function, and it works very well, but it returns a value of type long long unsigned int. In many cases that's fine, but I'm sure my program will never have a vector so big that it needs a return type of that size; a short int is more than enough.
I know that for today's computers those few unused bytes are irrelevant, but I don't like to leave these "loose ends", even small ones, and while programming I came across some details that bothered me.
Look at these examples:
for(short int X = 0 ; X < Vector.size() ; X++){
}
compiling this, I receive this warning:
warning: comparison of integer expressions of different signedness: 'short int' and 'std::vector<unsigned char>::size_type' {aka 'long long unsigned int'} [-Wsign-compare]|
This is because the .size() return type is different from the short int I'm comparing with: "X" is a short int, and Vector.size() returns a long long unsigned int. That was expected, so if I do this:
for(size_t X = 0 ; X < Vector.size() ; X++){
}
the problem is gone, but by doing this I'm creating a long long unsigned int in the size_t variable and comparing it against another long long unsigned int returned by .size(), so my computer allocates two long long unsigned int variables. What do I do to get back a simple short int? I don't need anything more than this, long long unsigned int is overkill, so I did this:
for(short int X = 0 ; X < short(Vector.size()) ; X++){
}
but... how is this working? short int X = 0 is allocating a short int, nothing new, but what about short(Vector.size())? Is the computer allocating a long long unsigned int and converting it to a short int? Or is the compiler "changing" the return of the .size() function, making it naturally return a short int and, in this case, not allocating a long long unsigned int at all? I know the compilers are responsible for optimizing the code too. Is there any "problem" or "detail" when using this method? Since I rarely see anyone using it, what exactly is this short() doing in terms of memory allocation? Where can I read more about it?
(thanks to everyone who responded)
Forget for a moment that this involves a for loop; that's important for the underlying code, but it's a distraction from what's going on with the conversion.
short X = Vector.size();
That line calls Vector.size(), which returns a value of type std::size_t. std::size_t is an unsigned type, large enough to hold the size of any object. So it could be unsigned long, or it could be unsigned long long. In any event, it's definitely not short. So the compiler has to convert that value to short, and that's what it does.
Most compilers these days don't trust you to understand what this actually does, so they warn you. (Yes, I'm rather opinionated about compilers that nag; that doesn't change the analysis here). So if you want to see that warning (i.e., you don't turn it off), you'll see it. If you want to write code that doesn't generate that warning, then you have to change the code to say "yes, I know, and I really mean it". You do that with a cast:
short X = short(Vector.size());
The cast tells the compiler to call Vector.size() and convert the resulting value to short. The code then assigns the result of that conversion to X. So, more briefly, in this case it tells the compiler that you want it to do exactly what it would have done without the cast. The difference is that because you wrote a cast, the compiler won't warn you that you might not know what you're doing.
Some folks prefer to write that cast with a static_cast:
short X = static_cast<short>(Vector.size());
That does the same thing: it tells the compiler to do the conversion to short and, again, the compiler won't warn you that you did it.
In the original for loop, a different conversion occurs:
X < Vector.size()
That bit of code calls Vector.size(), which still returns an unsigned type. In order to compare that value with X, the two sides of the < have to have the same type, and the rules for this kind of expression require that X gets promoted to std::size_t, i.e., that the value of X gets treated as an unsigned type. That's okay as long as the value isn't negative. If it's negative, the conversion to the unsigned type is okay, but it will produce results that probably aren't what was intended. Since we know that X is not negative here, the code works perfectly well.
But we're still in the territory of compiler nags: since X is signed, the compiler warns you that promoting it to an unsigned type might do something that you don't expect. Again, you know that that won't happen, but the compiler doesn't trust you. So you have to insist that you know what you're doing, and again, you do that with a cast:
X < short(Vector.size())
Just like before, that cast converts the result of calling Vector.size() to short. Now both sides of the < are the same type, so the < operation doesn't require a conversion from a signed to an unsigned type, so the compiler has nothing to complain about. There is still a conversion, because the rules say that values of type short get promoted to int in this expression, but don't worry about that for now.
Another possibility is to use an unsigned type for that loop index:
for (unsigned short X = 0; X < Vector.size(); ++X)
But the compiler might still insist on warning you that not all values of type std::size_t can fit in an unsigned short. So, again, you might need a cast. Or change the type of the index to match what the compiler thinks you need:
for (std::size_t X = 0; X < Vector.size(); ++X)
If I were to go this route, I would use unsigned int and if the compiler insisted on telling me that I don't know what I'm doing I'd yell at the compiler (which usually isn't helpful) and then I'd turn off that warning. There's really no point in using short here, because the loop index will always be converted to int (or unsigned int) wherever it's used. It will probably be in a register, so there is no space actually saved by storing it as a short.
Even better, as recommended in other answers, is to use a range-based for loop, which avoids managing that index:
for (auto& value: Vector) ...
In all cases, X has a storage duration of automatic, and the result of Vector.size() does not outlive the full expression where it is created.
I don't need anything more than this, long long unsigned int is overkill
Typically, automatic duration variables are "allocated" either on the stack, or as registers. In either case, there is no performance benefit to decreasing the allocation size, and there can be a performance penalty in narrowing and then widening values.
In the very common case where you are using X solely to index into Vector, you should strongly consider using a different kind of for:
for (auto & value : Vector) {
// replace Vector[X] with value in your loop body
}
Since ints and longs and other integer types may be different sizes on different systems, why not have stouint8_t(), stoint64_t(), etc. so that portable string to int code could be written?
Because typing that would make me want to chop off my fingers.
Seriously, the basic integer types are int and long, and the std::stoX functions are just very simple wrappers around strtol etc. Note that C doesn't provide strtoi32 or strtoi64 or anything that a std::stouint32_t could wrap.
If you want something more complicated you can write it yourself.
I could just as well ask "why do people use int and long, instead of int32_t and int64_t everywhere, so the code is portable?" and the answer would be because it's not always necessary.
But the actual reason is probably that no one ever proposed it for the standard. Things don't just magically appear in the standard; someone has to write a proposal, justify adding it, and convince the rest of the committee to add it. So the answer to most "why isn't this thing I just thought of in the standard?" questions is that no one proposed it.
Because it's usually not necessary.
stoll and stoull return results of type long long and unsigned long long respectively. If you want to convert a string to int64_t, you can just call stoll() and store the result in your int64_t object; the value will be implicitly converted.
This assumes that long long is the widest signed integer type. Like C (starting with C99), C++ permits extended integer types, some of which might be wider than [unsigned] long long. C provides the conversion functions strtoimax and strtoumax (operating on intmax_t and uintmax_t, respectively) in <inttypes.h>. For whatever reason, C++ doesn't provide wrappers for these functions (the logical names would be stoimax and stoumax).
But that's not going to matter unless you're using a C++ compiler that provides an extended integer type wider than [unsigned] long long, and I'm not aware that any such compilers actually exist. For any types no wider than 64 bits, the existing functions are all you need.
For example:
#include <iostream>
#include <string>
#include <cstdint>
int main() {
    const char *s = "0xdeadbeeffeedface";
    uint64_t u = std::stoull(s, NULL, 0);
    std::cout << u << "\n";
}
The library I am working on needs to be used on both 32- and 64-bit machines; I have lots of compiler warnings because on 64-bit machines unsigned int != size_t.
Is there any downside to replacing all unsigned ints and size_ts with 'unsigned long'? I appreciate it does not look very elegant, but, in our case, memory is not too much of an issue... I am wondering if there is a possibility of any bugs/unwanted behaviour etc. created by such a replace-all operation (could you give examples)? Thanks.
What warnings? The most obvious one I can think of is for a "narrowing conversion", that is to say you're assigning size_t to unsigned int, and getting a warning that information might be lost.
The main downside of replacing size_t with unsigned long is that unsigned long is not guaranteed to be large enough to contain every possible value of size_t, and on Windows 64 it is not large enough. So you might find that you still have warnings.
The proper fix is that if you assign a size_t to a variable (or data member), you should make sure that variable has a type large enough to contain any value of size_t. That's what the warning is all about. So you should not switch to unsigned long, you should switch those variables to size_t.
Conversely, if you have a variable that doesn't need to be big enough to hold any size, just big enough for unsigned int, then don't use size_t for it in the first place.
Both types (size_t and unsigned int) have valid uses, so any approach that indiscriminately replaces all use of them by some other type must be wrong :-) Actually, you could replace everything with size_t or uintmax_t and for most programs that would be OK. The exceptions are where the code relies on using an unsigned type of the same size as int, or whatever, such that a larger type breaks the code.
The standard makes little guarantees about the sizes of types like int and long. size_t is guaranteed to be large enough to hold any object, and all std containers operate on size_t.
It's perfectly possible for a platform to define long as smaller than size_t, or have the size of long subject to compilation options, for example. To be safe, it's best to stick to size_t.
Another criterion to consider is that size_t carries a meaning - "this thing is used to store a size or an index." It makes the code slightly more self-documenting.
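A minimal sketch of that guideline: use size_t where a size is stored, and keep any narrower type behind an explicit, deliberate cast (names here are illustrative):
#include <cstddef>
#include <vector>

void example(const std::vector<int>& v) {
    std::size_t count = v.size();                             // matches the container: no warning
    unsigned int small = static_cast<unsigned int>(v.size()); // deliberate narrowing, made explicit
    (void)count;
    (void)small;
}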
If you replace size_t with unsigned long in places where you receive a size_t, you will introduce new warnings.
example:
size_t count = some_vector.size();
Replace size_t with unsigned long, and (to the degree they are different) you will have introduced a new warning (because some_vector.size() returns a size_t - actually a std::vector<something>::size_type, but in practice it should evaluate to the same).
It may also be a problem where the code assumes unsigned int and unsigned long are interchangeable. When long is 8 bytes, (unsigned int)-1 != (unsigned long)-1, and the following code may fail its assertion:
unsigned int i = string::npos;
assert(i == string::npos);
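A minimal sketch of that failure mode on an LP64 platform, where std::string::npos is a 64-bit size_t but unsigned int is 32 bits:
#include <cassert>
#include <cstddef>
#include <string>

int main() {
    std::size_t  keeps = std::string::npos;   // all 64 bits preserved
    unsigned int loses = std::string::npos;   // truncated to 0xFFFFFFFF
    assert(keeps == std::string::npos);       // holds
    assert(loses != std::string::npos);       // the surprise: loses is promoted back to size_t
}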
ulong foo = 0;
ulong bar = 0UL;//this seems redundant and unnecessary. but I see it a lot.
I also see this a good amount when referencing the first element of arrays
blah = arr[0UL];//this seems silly since I don't expect the compiler to magically
//turn '0' into a signed value
Can someone provide some insight to why I need 'UL' throughout to specify specifically that this is an unsigned long?
void f(unsigned int x)
{
    //
}

void f(int x)
{
    //
}

...

f(3);  // f(int x)
f(3u); // f(unsigned int x)
It is just another tool in C++; if you don't need it don't use it!
In the examples you provide it isn't needed. But suffixes are often used in expressions to prevent loss of precision. For example:
unsigned long x = 5UL * ...
You may get a different answer if you leave off the UL suffix, say if your system has 16-bit ints and 32-bit longs.
Here is another example inspired by Richard Corden's comments:
unsigned long x = 1UL << 17;
Again, you'd get a different answer on a system with 16-bit ints if you left the suffix off.
The same type of problem will apply with 32 vs 64-bit ints and mixing long and long long in expressions.
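A minimal sketch that makes the point checkable at compile time: the suffix determines the type of the whole expression, which is what protects the intermediate result on platforms where int is narrower than long:
#include <type_traits>

static_assert(std::is_same<decltype(1 << 17), int>::value,
              "without a suffix the shift is done in int");
static_assert(std::is_same<decltype(1UL << 17), unsigned long>::value,
              "with UL the shift is done in unsigned long");

int main() {}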
Some compilers may emit a warning, I suppose.
The author could be doing this to make sure the code has no warnings.
Sorry, I realize this is a rather old question, but I use this a lot in c++11 code...
Suffixes such as ul and f are useful for initialising auto variables to your intended type (note that there is no d suffix in C++; a plain 0.0 is already a double), e.g.
auto my_u_long = 0ul;
auto my_float = 0.0f;
auto my_double = 0.0;
Check out this reference on numeric literals: http://www.cplusplus.com/doc/tutorial/constants/
You don't normally need it, and any tolerable editor will have enough assistance to keep things straight. However, the places I use it in C# are (and you'll see these in C++):
Calling a generic method (template in C++), where the parameter types are implied and you want to make sure and call the one with an unsigned long type. This happens reasonably often, including this one recently:
Tuple<ulong, ulong> t = Tuple.Create(someUlongVariable, 0UL);
where without the UL it returns Tuple<ulong, int> and won't compile.
Implicit variable declarations using the var keyword in C# or the auto keyword coming to C++. This is less common for me because I only use var to shorten very long declarations, and ulong is the opposite.
When you feel obligated to write down the type of a constant (even when not absolutely necessary) you make sure:
That you always consider how the compiler will translate this constant into bits
That whoever reads your code will know how you intended the constant to be represented and that you took it into consideration (even you, when you rescan the code)
That you don't spend time wondering whether you need to write the 'U'/'UL' or not
Also, several software development standards such as MISRA require you to state the type of the constant no matter what (at least write 'U' if it is unsigned).
In other words, some consider it good practice to write the type of a constant: at worst it is redundant, and at best you avoid bugs, avoid the chance that different compilers will treat your code differently, and improve code readability.