The book says that writing:
static unsigned int foo;
and later
if( foo > 0)
{
is wrong and will lead to a hard-to-find bug.
Why is that?
In x86 assembly language there are signed arithmetic instructions and
also unsigned arithmetic instructions:
JG JL <-signed arithmetic
JB JA <- unsigned instructions.
So the compiler can just assemble that if (foo > 0) statement with unsigned instructions,
can't it? Can somebody explain how this works?
Is that statement wrong? Or is there a difference between C and C++, with C++ being stricter in
this case? Please explain.
Here we are comparing an unsigned variable with an immediate value. What actually happens inside
the compiler in this case?
And what happens when we compare a signed value with an unsigned value? Which instructions will the compiler choose then, signed or unsigned?
--thanks in advance--
This question should be answered not at the assembler level but at the C/C++ language level. Most architectures cannot directly compare a signed number with an unsigned one, and C/C++ does not provide such comparisons either. Instead, there are rules for converting one of the operands to the type of the other before comparing them; see, for example, the answers to this question.
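A small sketch of those conversion rules, in case it helps. The helper names is_positive and mixed_less are made up for illustration, and the checks are written as static_assert so the compiler itself confirms them. A compiler may warn about the mixed comparison, which is exactly the kind of hint worth paying attention to.

```cpp
#include <type_traits>

// Hypothetical helper: the int literal 0 is converted to unsigned int,
// so this is a plain unsigned comparison (equivalent to foo != 0).
constexpr bool is_positive(unsigned int foo) {
    return foo > 0;
}

// Hypothetical helper: a is converted to unsigned int before comparing,
// so a negative a turns into a very large unsigned value.
constexpr bool mixed_less(int a, unsigned int b) {
    return a < b;
}

// The common type of int and unsigned int is unsigned int.
static_assert(std::is_same<decltype(0 + 0u), unsigned int>::value,
              "int op unsigned int yields unsigned int");

static_assert(is_positive(5u), "5 > 0");
static_assert(!is_positive(0u), "0 is not > 0");

// Mathematically -1 < 1, but after conversion -1 is the largest unsigned value.
static_assert(!mixed_less(-1, 1u), "-1 converts to UINT_MAX");
```

So the comparison foo > 0 on an unsigned foo is well defined; the surprising cases are the ones where a negative signed value takes part.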
About comparing to literals: the typical way of doing it (as you did) is not wrong, but you can do better. According to the C++ standard:
2.13.1.1 An integer literal is a sequence of digits that has no period
or exponent part. An integer literal may have a prefix that specifies
its base and a suffix that specifies its type. The lexically first
digit of the sequence of digits is the most significant. A decimal
integer literal (base ten) begins with a digit other than 0 and
consists of a sequence of decimal digits. An octal integer literal (base
eight) begins with the digit 0 and consists of a sequence of octal
digits. A hexadecimal integer literal (base sixteen) begins with 0x
or 0X and consists of a sequence of hexadecimal digits, which include
the decimal digits and the letters a through f and A through F with
decimal values ten through fifteen. [Example: the number twelve can be
written 12, 014, or 0XC. ]
2.13.1.2 The type of an integer literal depends on its form, value,
and suffix. If it is decimal and has no suffix, it has the first of
these types in which its value can be represented: int, long int; if
the value cannot be represented as a long int, the behavior is
undefined. If it is octal or hexadecimal and has no suffix, it has the
first of these types in which its value can be represented: int,
unsigned int, long int, unsigned long int. If it is suffixed by u or
U, its type is the first of these types in which its value can be
represented: unsigned int, unsigned long int. If it is suffixed by l
or L, its type is the first of these types in which its value can be
represented: long int, unsigned long int. If it is suffixed by ul, lu,
uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.
If you want to be sure about the type of your literal (and therefore the type of the comparison), add one of the suffixes described above.
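As a rough illustration of the suffixes from the quoted text (the helper name greater_than_zero is invented here, not taken from the question):

```cpp
#include <type_traits>

// The suffix decides the literal's type when the value is small enough.
static_assert(std::is_same<decltype(0), int>::value, "no suffix: int");
static_assert(std::is_same<decltype(0u), unsigned int>::value, "u: unsigned int");
static_assert(std::is_same<decltype(0L), long>::value, "L: long");
static_assert(std::is_same<decltype(0UL), unsigned long>::value, "UL: unsigned long");

// Hypothetical helper: with 0u both operands are already unsigned int,
// so no implicit conversion is involved in the comparison.
constexpr bool greater_than_zero(unsigned int foo) {
    return foo > 0u;
}
```

The result is the same as with a plain 0; the suffix only makes the intended comparison type explicit.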
It is also worth noticing that the literal 0 is technically octal, not decimal (a decimal literal begins with a digit other than 0). It doesn't change anything, but it is quite unexpected.
To summarize: it is not wrong to write code like that, but you should remember that under certain conditions it might behave counter-intuitively (or at least counter-mathematically ;)
The decimal number 4294967295 is equal to hexadecimal 0xFFFFFFFF, so I would expect a literal to have the same type regardless of what base it is expressed in, yet
std::is_same<decltype(0xFFFFFFFF), decltype(4294967295)>::value; //evaluates false
It appears that on my compiler decltype(0xFFFFFFFF) is unsigned int, while decltype(4294967295) is signed long.
The types of hexadecimal and decimal literals are determined differently. From [lex.icon], Table 7:
The type of an integer literal is the first of the corresponding list in Table 7 in which its value can be represented.
When a decimal literal has no suffix, the types listed are, in order:
int
long int
long long int
For hexadecimal literals the list, in order, is:
int
unsigned int
long int
unsigned long int
long long int
unsigned long long int
Why does this difference exist? Considering we also have this in C, we can look at the C99 rationale document and it says:
Unlike decimal constants, octal and hexadecimal constants too large to be ints are typed as
unsigned int if within range of that type, since it is more likely that they represent bit
patterns or masks, which are generally best treated as unsigned, rather than “real” numbers.
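The two lists can be checked at compile time. This is only a sketch: the constant names are mine, and the hexadecimal check assumes the common case of a 32-bit int and 32-bit unsigned int, which the code tests for instead of taking for granted.

```cpp
#include <limits>
#include <type_traits>

// Assumption made explicit: int has 31 value bits, unsigned int has 32.
constexpr bool typical_32_bit_int =
    std::numeric_limits<int>::digits == 31 &&
    std::numeric_limits<unsigned int>::digits == 32;

// A decimal literal without suffix only ever gets a signed type.
constexpr bool dec_is_signed = std::is_signed<decltype(4294967295)>::value;

// A hexadecimal literal may land on an unsigned type.
constexpr bool hex_is_unsigned = std::is_unsigned<decltype(0xFFFFFFFF)>::value;

static_assert(dec_is_signed, "decimal literals are never unsigned without a suffix");
static_assert(!typical_32_bit_int || hex_is_unsigned,
              "with a 32-bit int, 0xFFFFFFFF is unsigned int");

// Different types, same value.
static_assert(0xFFFFFFFF == 4294967295, "both spell the same number");
```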
I'm currently working through C++ Primer (5th Edition), and I'm struggling trying to figure out what the author means in this part on literals (Chapter 2, section 2.1.3):
... By default, decimal literals are signed whereas octal and hexadecimal literals can be either signed or unsigned types. A decimal literal has the smallest type of int, long, or long long (i.e., the first type in this list) in which the literal’s value fits. Octal and hexadecimal literals have the smallest type of int, unsigned int, long, unsigned long, long long, or unsigned long long in which the literal’s value fits. It is an error to use a literal that is too large to fit in the largest related type...
In the first sentence, does the author mean that decimal literals are signed according to the C++ standard, and for octal and hexadecimal literals it depends on the compiler?
The next three sentences really confuse me though, so if someone could offer an alternative explanation, it would be greatly appreciated.
If you have an integer literal, for example a decimal integer literal, the compiler has to determine its type. A literal can be used in expressions, and the compiler needs to determine the type of an expression from the types of its operands.
So for decimal integer literals the compiler selects among the following types
int
long int
long long int
and chooses the first type that can accommodate the decimal literal.
It does not consider unsigned integer types such as unsigned int or unsigned long int, even though they could accommodate a given literal.
The situation is different when the compiler deals with octal or hexadecimal integer literals. In this case it considers the following types in the given order
int
unsigned int
long int
unsigned long int
long long int
unsigned long long int
To make this clearer, consider an artificial example that demonstrates the idea. Let's assume that you have a value equal to 127. This value can be stored in type signed char. Now what about the value 128? It cannot be stored in an object of type signed char, because the maximum positive value that can be stored in an object of type signed char is 127.
What to do? We could store 128 in an object of type unsigned char, because its maximum value is 255. However, the compiler prefers to store it in an object of type signed short.
But if this value were specified as 0x80, then the compiler would select an object of type unsigned char.
It is of course an imaginary process.
However, in reality a similar algorithm is used for decimal literals, except that the compiler only considers integer types starting from int when determining the type of a decimal literal.
Decimal (meaning base-10) literals are those that have no prefix. The author is saying that these are always signed.
5 // signed int (decimal)
12 // signed int (decimal)
They can also be made signed or unsigned by providing a suffix. Here's a full reference for integer literal syntax.
5 // signed int
7U // unsigned int
7UL // unsigned long
Hex (base-16) values will be prefixed with 0x.
0x05 // int (hex)
Similarly octal (base-8) values are prefixed with 0.
05 // int (octal)
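To see the prefixes and suffixes side by side, here is a small compile-time sketch (the constant names are just for illustration):

```cpp
#include <type_traits>

constexpr int decimal_value = 10;   // base 10, no prefix
constexpr int hex_value     = 0x10; // base 16, prefix 0x
constexpr int octal_value   = 010;  // base 8, prefix 0

static_assert(decimal_value == 10, "10 is ten");
static_assert(hex_value == 16, "0x10 is sixteen");
static_assert(octal_value == 8, "010 is eight");

// Small values without a suffix are int in every base.
static_assert(std::is_same<decltype(0x05), int>::value, "hex 5 is int");
static_assert(std::is_same<decltype(05), int>::value, "octal 5 is int");

// A suffix changes the type, not the value.
static_assert(std::is_same<decltype(7U), unsigned int>::value, "7U is unsigned int");
static_assert(std::is_same<decltype(7UL), unsigned long>::value, "7UL is unsigned long");
```

The leading 0 is the one that tends to bite: 010 looks like ten but is eight.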
To append to Cory's answer:
The relevant diagram in the link states
Types allowed for integer literals
No suffix, regular decimal
int, long int, long long int (since C++11)
So the decimal number
78625723
is represented by a signed type.
No suffix, hexadecimal or octal bases
int, unsigned int,
long int, unsigned long int,
long long int (since C++11),
unsigned long long int (since C++11)
So the 0x hex number
0x78625723
might be represented by a signed or an unsigned value.
The place this is relevant is when you have literal values that are just a little too big to fit in a signed type, but do fit in the corresponding unsigned type. For example, on a machine with 16-bit int and 32-bit long (rare these days, but the minimum allowed by the spec), the constant literal 0xffff will be an unsigned int, while the literal 65535 (same value) will be a long.
Of course, you can force the latter to be an unsigned by using a U suffix; this part of the spec is only relevant for literals with no suffix.
I am looking through some C++ code and I came across this:
if( (size & 0x03L) != 0 )
throw MalformedBundleException( "bundle size must be multiple of four" );
What does the L after the hexadecimal value stand for?
How does it alter the value 0x03?
It means Long, as in, the type of the literal 0x03L is long instead of the default int. On some platforms that will mean 64 bits instead of 32 bits, but that's entirely platform-dependent (the only guarantee is that long is not shorter than int).
This suffix sets the type of the numeric literal. L stands for long; LL stands for long long type. The number does not need to be hex - it works on decimals and octals as well.
3LL // A decimal constant 3 of type long long
03L // An octal constant 3 of type long
0x3L // A hex constant 3 of type long
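A short sketch of what the suffix changes and what it does not. The function is_multiple_of_four is a stand-in I made up for the check in the question, not the original code.

```cpp
#include <type_traits>

// The suffix changes the type of the literal...
static_assert(std::is_same<decltype(0x03), int>::value, "0x03 is int");
static_assert(std::is_same<decltype(0x03L), long>::value, "0x03L is long");
static_assert(std::is_same<decltype(3LL), long long>::value, "3LL is long long");

// ...but not its value.
static_assert(0x03L == 0x03, "same value, different type");

// Hypothetical helper mirroring the test in the question:
// the two lowest bits are zero exactly when size is a multiple of four
// (shown here for non-negative sizes).
constexpr bool is_multiple_of_four(long size) {
    return (size & 0x03L) == 0;
}
```

With a long operand on the left, a plain 0x03 would be converted to long anyway, so here the suffix mostly documents intent.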
This is the so-called long-suffix of integer literals; it denotes that the type of the literal is long int. The integer literal in your example is a hexadecimal integer literal of type long int.
You may also encounter LL (or ll), which denotes the type long long int.
I saw in some C++ code the keyword "unsigned" in the following form:
const int HASH_MASK = unsigned(-1) >> 1;
and later:
unsigned hash = HASH_SEED;
(it is taken from the CS106B/X reader - of Stanford - by Eric S. Roberts - on the topic of "implementation of the hash code function for strings").
Can someone please tell me what that keyword means and when I should use it?
Thanks!
Take a look: https://stackoverflow.com/a/7176690/1758762
unsigned is a modifier which can apply to any integral type (char,
short, int, long, etc.) but on its own it is identical to unsigned
int.
It's a short version of unsigned int. Syntactically, you can use it anywhere you would use any other datatype like float or short.
Unsigned types are types that can't represent negative numbers; only zero and positive numbers. In C++, they use modular arithmetic; the modulus for an N-bit type is 2^N. It's a good idea to use unsigned rather than signed types when messing around with bit patterns (for example, when calculating hash codes), since C++ allows several different representations of negative numbers which could lead to portability issues.
unsigned can be used as a qualifier for any integer type (e.g. unsigned int or unsigned long long); or on its own as shorthand for unsigned int.
So the first converts -1 into unsigned int. Due to modular arithmetic, this gives the largest representable value. This could also be written (more clearly, in my opinion) as std::numeric_limits<unsigned>::max().
The second declares and initialises a variable of type unsigned int.
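Both claims can be checked at compile time. This is just a sketch; all_ones and hash_mask are names chosen here, with hash_mask playing the role of HASH_MASK from the question.

```cpp
#include <limits>

// Converting -1 to unsigned wraps modulo 2^N, giving the largest value.
constexpr unsigned all_ones = unsigned(-1);
static_assert(all_ones == std::numeric_limits<unsigned>::max(),
              "unsigned(-1) is the maximum unsigned value");

// Shifting right by one clears the top bit and keeps all the others set.
constexpr unsigned hash_mask = unsigned(-1) >> 1;
static_assert(hash_mask == std::numeric_limits<unsigned>::max() / 2,
              "all bits set except the most significant one");

// Masking with hash_mask therefore drops the top bit of any value.
static_assert((all_ones & hash_mask) == hash_mask, "top bit removed");
```

On common platforms that mask equals the largest int value, which is presumably why the code stores it in an int and uses it to keep hash codes non-negative.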
Integer types such as int are signed by default, which means values can be positive or negative. The unsigned keyword is used to specify that a value cannot be negative (it is zero or positive).
Signed variables use 1 bit to specify whether the value is positive or not. The unsigned keyword actually makes this bit part of the value (thus allowing bigger numbers to be stored).
Lastly, unsigned hash is interpreted by compilers as unsigned int hash (int being the default type in C programming).
To get a good idea of what unsigned means, one has to understand signed and unsigned integers. For a full explanation of two's complement, search Wikipedia, but in a nutshell, a computer stores a negative number -x as 2^32 - x (for a 32-bit integer). In this way, -1 is stored as 2^32-1. This does mean that the largest positive value is only 2^31-1, but that is by the by. These are known as signed integers (as they can have a positive or negative sign).
Unsigned tells the compiler that you don't want two's complement and are dealing only with non-negative numbers. When -1 is typecast (as it is in the code) to an unsigned int it becomes
2^32-1 = 0b111111111...
Thus that is an easy way of getting a whole lot of 1s in binary.
Use unsigned rarely: when you need to do bit operations, or when for some reason you need positive integers bigger than 2^31. Otherwise, if you leave it out, C++ assumes signed integers.
C allows chars to be signed or unsigned, depending on which is more efficient for the host computer. If you want to be sure your char is unsigned, you can declare your variable as unsigned char. You can use signed char if you want to ensure a signed interpretation.
Incidentally, C and C++ compilers treat char, signed char, and unsigned char as three distinct types, even though char behaves like one of the other two.
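The "three distinct types" point is easy to verify. In this sketch the overload set which and its return codes are invented purely to show that all three overloads can coexist.

```cpp
#include <type_traits>

// char is a separate type from both signed char and unsigned char.
static_assert(!std::is_same<char, signed char>::value, "char != signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char != unsigned char");

// Whether plain char is signed depends on the platform and compiler options.
constexpr bool plain_char_is_signed = std::is_signed<char>::value;

// Hypothetical overloads: legal only because the three types are distinct.
constexpr int which(char)          { return 0; }
constexpr int which(signed char)   { return 1; }
constexpr int which(unsigned char) { return 2; }

static_assert(which('a') == 0, "a character literal has type char");
```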
I just asked this question and it got me thinking if there is any reason
1) why you would assign an int variable using hexadecimal or octal instead of decimal, and
2) what the difference is between the different ways of assignment
int a=0x28ff1c; // hexadecimal
int a=10; //decimal (the most commonly used way)
int a=012177434; // octal
You may have some constants that are more easily understood when written in hexadecimal.
Bitflags, for example, in hexadecimal are compact and easily (for some values of easily) understood, since there's a direct correspondence 4 binary digits => 1 hex digit - for this reason, in general the hexadecimal representation is useful when you are doing bitwise operations (e.g. masking).
In a similar fashion, in several cases integers may be internally divided in some fields, for example often colors are represented as a 32 bit integer that goes like this: 0xAARRGGBB (or 0xAABBGGRR); also, IP addresses: each piece of IP in the dotted notation is two hexadecimal digits in the "32-bit integer" notation (usually in such cases unsigned integers are used to avoid messing with the sign bit).
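For the color case, a minimal sketch of pulling the fields out of a 0xAARRGGBB value might look like this (the function names are mine, and the sample color is arbitrary):

```cpp
#include <cstdint>

// Each channel is one byte, i.e. exactly two hexadecimal digits.
constexpr std::uint32_t alpha(std::uint32_t argb) { return (argb >> 24) & 0xFFu; }
constexpr std::uint32_t red(std::uint32_t argb)   { return (argb >> 16) & 0xFFu; }
constexpr std::uint32_t green(std::uint32_t argb) { return (argb >> 8) & 0xFFu; }
constexpr std::uint32_t blue(std::uint32_t argb)  { return argb & 0xFFu; }

// Arbitrary sample value; the fields can be read straight off the literal.
constexpr std::uint32_t sample = 0x80336699u;

static_assert(alpha(sample) == 0x80, "AA");
static_assert(red(sample) == 0x33, "RR");
static_assert(green(sample) == 0x66, "GG");
static_assert(blue(sample) == 0x99, "BB");
```

Written in decimal, the same sample is 2150852249, which tells you nothing about the channels.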
In some code I'm working on at the moment, for each pixel in an image I have a single byte to use to store "accessory information"; since I have to store some flags and a small number, I use the least significant 4 bits to store the flags, the 4 most significant ones to store the number. Using hexadecimal notations it's immediate to write the appropriate masks and shifts: byte & 0x0f gives me the 4 LS bits for the flags, (byte & 0xf0)>>4 gives me the 4 MS bits (re-shifted in place).
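Roughly, the byte layout described above could be handled like this. The function names and the pack helper are assumptions for the sake of the example, not the actual code.

```cpp
#include <cstdint>

// Layout assumed from the description: high 4 bits = number, low 4 bits = flags.
constexpr std::uint8_t pack(unsigned number, unsigned flags) {
    return static_cast<std::uint8_t>(((number & 0x0Fu) << 4) | (flags & 0x0Fu));
}

// The 4 least significant bits hold the flags.
constexpr unsigned flags_of(std::uint8_t byte) {
    return byte & 0x0Fu;
}

// The 4 most significant bits hold the number, shifted back into place.
constexpr unsigned number_of(std::uint8_t byte) {
    return (byte & 0xF0u) >> 4;
}

static_assert(pack(9u, 5u) == 0x95, "number 9 in the high nibble, flags 5 in the low one");
```

Each hex digit of the byte is one of the two fields, which is what makes the masks so easy to write.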
I've never seen octal used for anything besides IOCCC and UNIX permissions masks (although in the last case they are actually useful, as you probably know if you ever used chmod); probably their inclusion in the language comes from the fact that C was initially developed as the language to write UNIX.
I originally wrote here that integer literals are of type int by default, while hexadecimal literals are of type unsigned int (or larger if unsigned int isn't large enough to hold the value). Sorry, brainfart: that was wrong. I checked the standard right now, and it goes like this:
decimal literals, without the u suffix, are always signed; their type is the smallest that can represent them between int, long int, long long int;
octal and hexadecimal literals without suffix, instead, may also be of unsigned type; their actual type is the smallest one that can represent the value between int, unsigned int, long int, unsigned long int, long long int, unsigned long long int.
(C++11, §2.14.2, ¶2 and Table 6)
The difference may be relevant for overload resolution (see the example below), but it's not particularly important when you are just assigning a literal to a variable. Still, keep in mind that you may have valid integer constants that are larger than an int, i.e. assigning one to an int converts a value that is out of range (the result is implementation-defined before C++20); anyhow, any decent compiler should be able to warn you in these cases.
Let's say that on our platform integers are in 2's complement representation, int is 16 bit wide and long is 32 bit wide; let's say we have an overloaded function like this:
#include <iostream>

void a(unsigned int i)
{
std::cout<<"unsigned";
}
void a(int i)
{
std::cout<<"signed";
}
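Then, calling a(1) and a(0x1) will produce the same result (signed), but a(0x8000) will print unsigned, because 0x8000 does not fit in a 16-bit int and therefore has type unsigned int. The decimal literal 32768 has the same value but type long, so a(32768) matches neither overload exactly and the call is ambiguous.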
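On a more common platform with a 32-bit int the same effect shows up at a different threshold. Here is a sketch of that, with the overloads returning a character instead of printing so the result can be checked at compile time; the name kind and the 32-bit assumption are mine.

```cpp
#include <limits>

// Assumption made explicit: int has 31 value bits, unsigned int has 32.
constexpr bool typical_32_bit_int =
    std::numeric_limits<int>::digits == 31 &&
    std::numeric_limits<unsigned int>::digits == 32;

// Stand-ins for the two overloads above: 'u' for unsigned, 's' for signed.
constexpr char kind(unsigned int) { return 'u'; }
constexpr char kind(int)          { return 's'; }

static_assert(kind(1) == 's', "small decimal literal is int");
static_assert(kind(0x1) == 's', "small hex literal is int");
static_assert(kind(0x7FFFFFFF) == 's', "still fits in a 32-bit int");

// 0x80000000 no longer fits in a 32-bit int, so it becomes unsigned int.
static_assert(!typical_32_bit_int || kind(0x80000000) == 'u',
              "hex literal just past INT_MAX is unsigned int");

// kind(2147483648) is left out on purpose: with a 32-bit int the decimal
// literal has a wider signed type, and the call would be ambiguous.
```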
It matters from a readability standpoint - which one you choose expresses your intention.
If you're treating the variable as an integral type, you know, like 2+2=4, you use the decimal representation. It's intuitive and straight-forward.
If you're using it as a bitmask, you can use hexadecimal, octal or even binary. For example, you'll know
int a = 0xFF;
will have the last 8 bits set to 1. You'll know that
int a = 0xF0;
is (...)11110000, but you couldn't directly say the same thing about
int a = 240;
although they are equivalent. It just depends on what you use the numbers for.
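A tiny sketch of that equivalence, for the record. The binary form needs C++14 or later, and high_nibble is just an illustrative helper.

```cpp
// Three spellings of the same mask.
constexpr int mask_hex = 0xF0;
constexpr int mask_dec = 240;
constexpr int mask_bin = 0b11110000; // binary literals: C++14 and later

static_assert(mask_hex == mask_dec, "0xF0 is 240");
static_assert(mask_dec == mask_bin, "240 is 11110000 in binary");

// Hypothetical helper: keep the upper four of the low eight bits.
constexpr int high_nibble(int value) {
    return (value & mask_hex) >> 4;
}

static_assert(high_nibble(0xAB) == 0xA, "the first hex digit is the high nibble");
```

All three compile to the same constant; only the hex and binary forms show the bit pattern at a glance.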
Well, the truth is that it doesn't matter whether you write it in decimal, octal or hexadecimal; it's just a representation. For your information, numbers in computers are stored in binary (just 0s and 1s), which you can also use to represent a number. So it's just a matter of representation and readability.
NOTE:
In some C++ debuggers (in my experience), a number I assigned in decimal is displayed in hexadecimal.
It's similar to assigning an integer in these ways:
int a = int(5);
int b(6);
int c = 3;
It's all about preference; when it comes down to it, you're doing the same thing. Some might choose octal or hex to match the kind of data their program manipulates.