C++ limits.h defines

I am studying beginner cryptography in C++ and was taking a look inside limits.h.
Would someone please explain to me what this code snippet does? Does it define the number of binary numbers these types can hold?
Specifically, what is 0xffu?
Sorry for the crap title.
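The snippet itself isn't quoted here, but the defines in question typically look something like the following (only an illustration; the exact values and spelling vary between implementations):
#define UCHAR_MAX  0xffU          /* maximum value of unsigned char     */
#define USHRT_MAX  0xffffU        /* maximum value of unsigned short    */
#define UINT_MAX   0xffffffffU    /* maximum value of unsigned int      */
#define ULONG_MAX  0xffffffffUL   /* maximum value of unsigned long int */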

0xffu is the number 255 written in hexadecimal notation; the u suffix tells the compiler to treat the literal as unsigned.
These *_MAX defines give the maximum value the corresponding datatype can hold; adding one more wraps around to zero.
i.e.
unsigned char myChar = 0xFFu;
myChar += 1;
printf("%i", myChar);
will print
0
Same for the other unsigned datatypes here.
Maybe your question can be understood as "What happens if I change this?". No, these defines do not determine the maximum values; they exist so that you, as a programmer, have the maximum values at hand. If you change this snippet, the datatypes themselves will not change; only algorithms that use these defines will change their behavior (working with a different "maximum" value).
I do not recommend changing these, if that was your intent.

These just define the largest values that can be stored in each of unsigned char, unsigned short, unsigned int, and unsigned long int. 0xffU means the hexadecimal value FF (255), with the U suffix denoting that the literal is explicitly unsigned.

The U is used to indicate unsigned constants. Without it, you may get warnings such as "value outside of range for int".

It defines the maximum numeric value the type can store.
For example, unsigned char can store up to 2^8 - 1 = 255.

Related

Is `-1` correct to use as the maximum value of an unsigned integer?

Is there any C++ standard paragraph which says that using -1 for this is a portable and correct way, or is the only correct way to use the predefined values?
I had a conversation with my colleague about what is better: using -1 for the maximum unsigned integer value, or using a value from limits.h or std::numeric_limits?
I told my colleague that using the predefined maximum values from limits.h or std::numeric_limits is the portable and clean way of doing this; however, the colleague objected that -1 is just as portable as the numeric limits, and moreover has one extra advantage:
unsigned short i = -1; // unsigned short max
can easily be changed to any other type, like
unsigned long i = -1; // unsigned long max
whereas using the predefined value from the limits.h header or std::numeric_limits also requires rewriting the limit itself along with the type on the left.
Regarding conversions of integers, C 2011 [draft N1570] 6.3.1.3 2 says
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Thus, converting -1 to an unsigned integer type necessarily produces the maximum value of that type.
There may be issues with using -1 in various contexts where it is not immediately converted to the desired type. If it is immediately converted to the desired unsigned integer type, as by assignment or explicit conversion, then the result is clear. However, if it is a part of an expression, its type is int, and it behaves like an int until converted. In contrast, UINT_MAX has the type unsigned int, so it behaves like an unsigned int.
As chux points out in a comment, USHRT_MAX effectively has a type of int, so even the named limits are not fully safe from type issues.
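To make the type difference concrete, here is a small sketch of my own (not part of the answer above) contrasting -1 used in an expression with UINT_MAX:
#include <climits>
#include <iostream>

int main() {
    unsigned int u = -1;                    // converted on initialization: u == UINT_MAX
    std::cout << (u == UINT_MAX) << '\n';   // prints 1
    std::cout << (-1 > 0) << '\n';          // prints 0: here -1 is still an int
    std::cout << (UINT_MAX > 0) << '\n';    // prints 1: UINT_MAX is an unsigned int
}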
Not using the standard way, or not clearly showing the intent, is often a bad idea that we pay for later.
I would suggest:
auto i = std::numeric_limits<unsigned int>::max();
or, as @jamesdin suggested, a certainly better one, closer to C habits:
unsigned int i = std::numeric_limits<decltype(i)>::max();
Your colleague's argument is not admissible. Changing int -> long int, as below:
auto i = std::numeric_limits<unsigned long int>::max();
does not require extra work compared to the -1 solution (thanks to the use of auto).
The '-1' solution does not directly reflect our intent, and hence may have harmful consequences. Consider this code snippet:
using index_t = unsigned int;
... now in another file (or far away from the previous line) ...
const index_t max_index = -1;
First, we do not understand why max_index is -1.
Worse, if someone wants to improve the code and defines
using index_t = ptrdiff_t;
then the statement max_index = -1 is no longer the maximum and you get buggy code. Again, this cannot happen with something like:
const index_t max_index = std::numeric_limits<index_t>::max();
CAVEAT: there is nevertheless a pitfall when using std::numeric_limits. It has nothing to do with integers, but is related to floating-point numbers.
std::cout << "\ndouble lowest: "
<< std::numeric_limits<double>::lowest()
<< "\ndouble min : "
<< std::numeric_limits<double>::min() << '\n';
prints:
double lowest: -1.79769e+308
double min : 2.22507e-308 <-- maybe you expected -1.79769e+308 here!
min returns the smallest positive normalized value for floating-point types (and the smallest finite value for integer types)
lowest returns the lowest, i.e. most negative, finite value of the given type
Always interesting to remember, as it can be a source of bugs if we do not pay attention (using min instead of lowest).
Is -1 correct to use as the maximum value of an unsigned integer?
Yes, it is functionally correct when used as a direct assignment/initialization. Yet it often looks questionable (@Ron).
Constants from limits.h or std::numeric_limits convey more code understanding, yet need maintenance should the type of i change.
[Note] OP later dropped the C tag.
To add an alternative (available in C11) for assigning a maximum value that helps reduce code maintenance:
Use the loved/hated _Generic
#define info_max(X) _Generic((X), \
long double: LDBL_MAX, \
double: DBL_MAX, \
float: FLT_MAX, \
unsigned long long: ULLONG_MAX, \
long long: LLONG_MAX, \
unsigned long: ULONG_MAX, \
long: LONG_MAX, \
unsigned: UINT_MAX, \
int: INT_MAX, \
unsigned short: USHRT_MAX, \
short: SHRT_MAX, \
unsigned char: UCHAR_MAX, \
signed char: SCHAR_MAX, \
char: CHAR_MAX, \
_Bool: 1, \
default: 1/0 \
)
int main() {
    ...
    some_basic_type i = info_max(i);
    ...
}
The above macro info_max() has limitations concerning types like size_t, intmax_t, etc. that may not be enumerated in the above list. There are more complex macros that can cope with that. The idea here is illustrative.
The technical side has been covered by other answers; and while you focus on technical correctness in your question, pointing out the cleanness aspect again is important, because imo that’s the much more important point.
The major reason why it is a bad idea to use that particular trickery is: The code is ambiguous. It is unclear whether someone used the unsigned trickery intentionally or made a mistake and actually wanted to initialize a signed variable to -1. Should your colleague mention a comment after you present this argument, tell him to stop being silly. :)
I’m actually slightly baffled that someone would even consider this trick in earnest. There’s an unambiguous, intuitive and idiomatic way to set a value to its max in C: the _MAX macros. And there’s an additional, equally unambiguous, intuitive and idiomatic way in C++ that provides some more type safety: numeric_limits. That -1 trick is a classic case of being too clever.
The C++ standard says this about signed to unsigned conversions ([conv.integral]/2):
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n, where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
So yes, converting -1 to an n-bit unsigned integer will always give you 2^n - 1, regardless of which signed integer type the -1 started as.
Whether or not unsigned x = -1; is more or less readable than unsigned x = UINT_MAX; though is another discussion (there's definitely the chance that it'll raise some eyebrows, maybe even your own when you look at your own code later;).
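A couple of compile-time checks illustrating that rule (my own sketch, using the fixed-width types from <cstdint>):
#include <cstdint>

// -1 converts to 2^n - 1 for an n-bit unsigned type, regardless of which
// signed type the -1 started as:
static_assert(static_cast<std::uint16_t>(-1)  == 0xFFFFu, "16-bit all-ones");
static_assert(static_cast<std::uint32_t>(-1L) == 0xFFFFFFFFu, "32-bit all-ones");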

Is static_cast<T>(-1) the right way to generate all-one-bits data without numeric_limits?

I'm writing C++ code in an environment in which I don't have access to the C++ standard library, specifically not to std::numeric_limits. Suppose I want to implement
template <typename T> constexpr T all_ones( /* ... */ )
Focusing on unsigned integral types, what do I put there? Specifically, is static_cast<T>(-1) good enough? (Other types I could treat as an array of unsigned chars based on their size I guess.)
Use the bitwise NOT operator ~ on 0.
T allOnes = ~(T)0;
A static_cast<T>(-1) assumes two's complement, which is not portable. If you are only concerned about unsigned types, hvd's answer is the way to go.
Working example: https://ideone.com/iV28u0
Focusing on unsigned integral types, what do I put there? Specifically, is static_cast<T>(-1) good enough?
If you're only concerned about unsigned types, yes, converting -1 is correct for all standard C++ implementations. Operations on unsigned types, including conversions of signed types to unsigned types, are guaranteed to work modulo (max+1).
There's this disarmingly direct way:
T allOnes;
memset(&allOnes, ~0, sizeof(T));
Focusing on unsigned integral types, what do I put there?
Specifically, is static_cast<T>(-1) good enough?
Yes, it is good enough.
But I prefer a hex value because my background is embedded systems, and I have always had to know the sizeof(T).
Even in desktop systems, we know the sizes of the following T:
uint8_t allones8 = 0xff;
uint16_t allones16 = 0xffff;
uint32_t allones32 = 0xffffffff;
uint64_t allones64 = 0xffffffffffffffff;
Another way is
static_cast<T>(-1ull)
which would be more correct and works in any signed integer format, regardless of 1's complement, 2's complement or sign-magnitude. You can also use static_cast<T>(-UINTMAX_C(1))
Because unary minus of an unsigned value is defined as:
"The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand."
Therefore -1u always yields an all-one-bits value in unsigned int. The ll suffix is there to make it work for any type no wider than unsigned long long. There are no extended integer types (yet) in C++, so this should be fine.
However a solution that expresses the intention clearer would be
static_cast<T>(~0ull)
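Putting these answers together, a minimal sketch of the all_ones() template from the question could look like this (assuming it is only instantiated with unsigned integral types; the first check also assumes CHAR_BIT == 8):
// No standard library headers are needed for the template itself.
template <typename T>
constexpr T all_ones()
{
    // Conversion to an unsigned type is defined modulo 2^n, so this yields
    // the all-one-bits value for any unsigned T.
    return static_cast<T>(~0ull);
}

static_assert(all_ones<unsigned char>() == 0xFF, "assumes CHAR_BIT == 8");
static_assert(all_ones<unsigned int>()  == ~0u,  "all bits set");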

The "unsigned" keyword [duplicate]

(This question was closed as a duplicate of "Difference between unsigned and unsigned int in C".)
I saw in some C++ code the keyword "unsigned" in the following form:
const int HASH_MASK = unsigned(-1) >> 1;
and later:
unsigned hash = HASH_SEED;
(it is taken from the CS106B/X reader - of Stanford - by Eric S. Roberts - on the topic of "implementation of the hash code function for strings").
Can someone please tell me what that keyword means and when I would use it?
Thanks!
Take a look: https://stackoverflow.com/a/7176690/1758762
unsigned is a modifier which can apply to any integral type (char, short, int, long, etc.) but on its own it is identical to unsigned int.
It's a short version of unsigned int. Syntactically, you can use it anywhere you would use any other datatype like float or short.
Unsigned types are types that can't represent negative numbers; only zero and positive numbers. In C++, they use modular arithmetic; the modulus for an N-bit type is 2^N. It's a good idea to use unsigned rather than signed types when messing around with bit patterns (for example, when calculating hash codes), since C++ allows several different representations of negative numbers which could lead to portability issues.
unsigned can be used as a qualifier for any integer type (e.g. unsigned int or unsigned long long); or on its own as shorthand for unsigned int.
So the first converts -1 into unsigned int. Due to modular arithmetic, this gives the largest representable value. This could also be written (more clearly, in my opinion) as std::numeric_limits<unsigned>::max().
The second declares and initialises a variable of type unsigned int.
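As a concrete illustration (my own sketch, assuming a typical platform where int and unsigned int are 32-bit two's complement):
#include <climits>
#include <iostream>

int main() {
    const int HASH_MASK = unsigned(-1) >> 1;      // all bits set, then shifted right once
    std::cout << HASH_MASK << '\n';               // prints 2147483647 on such platforms
    std::cout << (HASH_MASK == INT_MAX) << '\n';  // prints 1
}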
Values are signed by default, which means they can be positive or negative. The unsigned keyword is used to specify that a value cannot be negative.
Signed variables use 1 bit to indicate whether the value is negative. The unsigned keyword actually makes this bit part of the value (thus allowing bigger numbers to be stored).
Lastly, unsigned hash is interpreted by compilers as unsigned int hash (int being the default type in C programming).
To get a good idea what unsigned means, one has to understand signed and unsigned integers. For a full explanation of two's complement, search Wikipedia, but in a nutshell, a computer stores a negative number by subtracting its magnitude from 2^32 (for a 32-bit integer). In this way, -1 is stored as 2^32 - 1. This does mean that you only have 2^31 positive numbers, but that is by the by. These are known as signed integers (as they can have a positive or negative sign).
Unsigned tells the compiler that you don't want two's complement and are dealing only in positive numbers. When -1 is typecast (as it is in the code) to an unsigned int it becomes
2^32-1 = 0b111111111...
Thus that is an easy way of getting a whole lot of 1s in binary.
Use unsigned rarely: when you need to do bit operations, or when for some reason you need only positive integers bigger than 2^31. Otherwise, if you leave it out, C++ assumes signed integers.
C allows chars to be signed or unsigned, depending on which is more efficient for the host computer. If you want to be sure your char is unsigned, you can declare your variable as unsigned char. You can use signed char if you want to ensure signed interpretation.
Incidentally, C and C++ compilers treat char, signed char, and unsigned char as three distinct types, even though plain char is implemented as one of the other two.
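An illustrative sketch (my own, not from the answer) showing that the three character types really are distinct as far as overload resolution is concerned:
#include <iostream>

void which(char)          { std::cout << "char\n"; }
void which(signed char)   { std::cout << "signed char\n"; }
void which(unsigned char) { std::cout << "unsigned char\n"; }

int main() {
    which('a');                            // picks the char overload
    which(static_cast<signed char>(1));    // picks the signed char overload
    which(static_cast<unsigned char>(1));  // picks the unsigned char overload
}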

Differences in assignment of integer variable

I just asked this question and it got me thinking if there is any reason
1) why you would assign an int variable using hexadecimal or octal instead of decimal, and
2) what are the differences between the different ways of assignment
int a=0x28ff1c; // hexadecimal
int a=10; //decimal (the most commonly used way)
int a=012177434; // octal
You may have some constants that are more easily understood when written in hexadecimal.
Bitflags, for example, in hexadecimal are compact and easily (for some values of easily) understood, since there's a direct correspondence 4 binary digits => 1 hex digit - for this reason, in general the hexadecimal representation is useful when you are doing bitwise operations (e.g. masking).
In a similar fashion, in several cases integers may be internally divided in some fields, for example often colors are represented as a 32 bit integer that goes like this: 0xAARRGGBB (or 0xAABBGGRR); also, IP addresses: each piece of IP in the dotted notation is two hexadecimal digits in the "32-bit integer" notation (usually in such cases unsigned integers are used to avoid messing with the sign bit).
In some code I'm working on at the moment, for each pixel in an image I have a single byte to use to store "accessory information"; since I have to store some flags and a small number, I use the least significant 4 bits to store the flags, the 4 most significant ones to store the number. Using hexadecimal notations it's immediate to write the appropriate masks and shifts: byte & 0x0f gives me the 4 LS bits for the flags, (byte & 0xf0)>>4 gives me the 4 MS bits (re-shifted in place).
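Here is a sketch of that layout (the variable names are made up for the example; only the masks and shifts matter):
#include <cstdint>
#include <iostream>

int main() {
    std::uint8_t info = 0xA5;                 // high nibble = small number, low nibble = flags
    unsigned flags    =  info & 0x0f;         // the 4 least significant bits
    unsigned number   = (info & 0xf0) >> 4;   // the 4 most significant bits, re-shifted in place
    std::cout << std::hex << flags << ' ' << number << '\n';  // prints "5 a"
}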
I've never seen octal used for anything besides IOCCC and UNIX permissions masks (although in the last case they are actually useful, as you probably know if you ever used chmod); probably their inclusion in the language comes from the fact that C was initially developed as the language to write UNIX.
As for the types of the literals themselves: I originally got this slightly wrong; checking the standard, it goes like this:
decimal literals, without the u suffix, are always signed; their type is the smallest that can represent them between int, long int, long long int;
octal and hexadecimal literals without suffix, instead, may also be of unsigned type; their actual type is the smallest one that can represent the value between int, unsigned int, long int, unsigned long int, long long int, unsigned long long int.
(C++11, §2.14.2, ¶2 and Table 6)
The difference may be relevant for overload resolution (see the example below), but it's not particularly important when you are just assigning a literal to a variable. Still, keep in mind that you may have valid integer constants that are larger than an int, i.e. assignment to an int will result in signed integer overflow; anyhow, any decent compiler should be able to warn you in these cases.
Let's say that on our platform integers are in 2's complement representation, int is 16 bit wide and long is 32 bit wide; let's say we have an overloaded function like this:
void a(unsigned int i)
{
    std::cout << "unsigned";
}

void a(int i)
{
    std::cout << "signed";
}
Then, calling a(1) and a(0x1) will produce the same result (signed), but a(0x8000) will print unsigned, because the hexadecimal literal has type unsigned int, while a(32768), the same value written as a decimal literal, has type long, and the conversion to either overload is equally good, so the call is ambiguous.
It matters from a readability standpoint - which one you choose expresses your intention.
If you're treating the variable as an integral type, you know, like 2+2=4, you use the decimal representation. It's intuitive and straight-forward.
If you're using it as a bitmask, you can use hexa, octal or even binary. For example, you'll know
int a = 0xFF;
will have the last 8 bits set to 1. You'll know that
int a = 0xF0;
is (...)11110000, but you couldn't directly say the same thing about
int a = 240;
although they are equivalent. It just depends on what you use the numbers for.
Well, the truth is it doesn't matter whether you write it in decimal, octal or hexadecimal: it's just a representation. For your information, numbers in computers are stored in binary (so they are just 0's and 1's), which is yet another representation you could use. So it's just a matter of representation and readability.
NOTE:
Well, in some C++ debuggers (in my experience), I assigned a number in decimal representation, but the debugger shows it as hexadecimal.
It's similar to the assignment of an integer this way:
int a = int(5);
int b(6);
int c = 3;
It's all about preference; when it comes down to it, you're just doing the same thing. Some might choose octal or hex to go along with a program that manipulates that type of data.

C++ char definition from binary string and overflow

I have a datatype that's more or less a character array. Each space in the array holds a char, which, as per my understanding, is a single byte (8 bits) of information. I need to be able to specify the char value through a binary string... for instance
char someChar = char(0b00110011);
What I don't understand is why the max value I can specify is 0b0XXXXXXX, where I have to leave that MSB set to zero. If I try setting the char like so
char someChar = char(0b11111111);
I get a decimal value: -2147483648, which looks very much like overflow. So I don't really get what's going on here. If I call the sizeof() operator on char, I get an answer of 1 (one byte). Doesn't that mean that I either get 0-255 if the char is unsigned, or -128-127 if the char is signed? Any advice/input would be appreciated.
In response to most of the comments -- I converted it to an int before printing it out:
std::cerr << int(someChar)
Thanks to all for the thorough explanations :)
char is signed in this case, so setting the top bit will give a negative value. Use unsigned char if you don't want to worry about positive/negative values.
As for the negative integer value - please show how you're converting/displaying the char.
NB. You can use signed char or unsigned char to tell the compiler explicitly what you want.
When you write your char with a binary literal, the compiler interprets it as a signed char, which is the default for most compilers; the leftmost bit is interpreted as the sign bit, so 0b11111111 is -1.
Upon conversion to int, the value is sign-extended: the sign bit is replicated into the upper bits of the 32-bit value, so the result stays negative. (For reference, -2147483648 in binary is 10000000 00000000 00000000 00000000.)
You have two main problems here:
First, it seems that you expect someChar to be unsigned. If that's the case, you should tell your compiler: unsigned char someChar = 0b11111111;
Second, the way you output it to the console (which is unknown to us) apparently involves a conversion to int. If that's not needed, there is likely a way to print someChar for what it really is, i.e. a signed char.
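A small sketch (my own, assuming a platform where plain char is signed and 8 bits wide) comparing the two interpretations of the same bit pattern:
#include <iostream>

int main() {
    char          sc = char(0b11111111);             // -1 when plain char is signed
    unsigned char uc = 0b11111111;                    // 255
    std::cout << int(sc) << ' ' << int(uc) << '\n';  // prints "-1 255"
}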