I have the below simple program:
#include <iostream>
#include <stdio.h>
void SomeFunction(int a)
{
std::cout<<"Value in function: a = "<<a<<std::endl;
}
int main(){
size_t a(0);
std::cout<<"Value in main: "<<a-1<<std::endl;
SomeFunction(a-1);
return 0;
}
Upon executing this I get:
Value in main: 18446744073709551615
Value in function: a = -1
I think I roughly understand why the function gets the 'correct' value of -1: there is an implicit conversion from the unsigned type to the signed one i.e. 18446744073709551615(unsigned) = -1(signed).
Is there any situation where the function will not get the 'correct' value?
Since size_t type is unsigned, subtracting 1 is well defined:
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
However, the resultant value of 264-1 is out of ints range, so you get implementation-defined behavior:
[when] the new type is signed and the value cannot be represented in it, either the result is implementation-defined or an implementation-defined signal is raised.
Therefore, the answer to your question is "yes": there are platforms where the value of a would be different; there are also platforms where instead of calling SomeFunction the program will raise a signal.
Not on your computer... but technically yes, there is a situation where things can go wrong.
All modern PCs use the "two's complement" system for signed integer arithmetic (read Wikipedia for details). Two's complement has many advantages, but one of the biggest is this: unsaturated addition and subtraction of signed integers is identical to that of unsigned integers. As long as overflow/underflow causes the result to "wrap around" (i.e., 0-1 = UINT_MAX), the computer can add and subtract without even knowing whether you're interpreting the numbers as signed or unsigned.
BUT! C/C++ do not technically require two's complement for signed integers. There are two other permissible systems, known as "sign-magnitude" and "one's complement". These are unusual systems, never found outside antique architectures and embedded processors (and rarely even there). But in those systems, signed and unsigned arithmetic do not match up, and (signed)(a+b) will not necessarily equal (signed)a + (signed) b.
There's another, more mundane caveat when you're also narrowing types, as is the case between size_t and int on x64, because C/C++ don't require compilers to follow a particular rule when narrowing out-of-range values to signed types. This is likewise more a matter of language lawyering than actual unsafeness, though: VC++, GCC, Clang, and all other compilers I'm aware of narrow through truncation, leading to the expected behavior.
It's easier to compare and contrast signed and insigned of type same basic type, say signed int and unsigned int.
On a system that uses 32 bits for int, the range of unsigned int is [0 - 4294967295] and the range of signed int is [-2147483647 - 2147483647].
Say you have a variable of type unsigned int and its value is greater than 2147483647. If you pass such a variable to SomeFunction, you will see an incorrect value in the function.
Conversely, say you have a variable of type signed int and its value is less than zero. If you pass such a variable to a function that expects an unsigned int, you will see an incorrect value in the function.
Related
I know that in order to get the 4 least significant bytes of a number of type long I can cast it to int/unsigned int or use a bitwise AND (& 0xFFFFFFFF).
This code produces the following output:
#include <stdio.h>
int main()
{
long n = 0x8899AABBCCDDEEFF;
printf("0x%016lX\n", n);
printf("0x%016X\n", (int)n);
printf("0x%016X\n", (unsigned int)n);
printf("0x%016lX\n", n & 0xFFFFFFFF);
}
Output:
0x8899AABBCCDDEEFF
0x00000000CCDDEEFF
0x00000000CCDDEEFF
0x00000000CCDDEEFF
Does that mean that the two methods used are equivalent? If so, do they always produce the same output regardless of the platform/compiler?
Also, is there any catch or pitfall while casting to unsigned int rather than int for the purpose of this question?
Finally, why is the output the same if you change the number n to be an unsigned long instead?
The methods are definitely different.
According to integral conversion rules (cf, for example, this online c++11 standard), a conversion (e.g. through an explicit cast) from one integral type to another depends on whether the destination type is signed or unsigned. If the destination type is unsigned, one can rely on a "modulo 2n" truncation, whereas with signed destination types one could tap into implementation defined behaviour:
4.7 Integral conversions [conv.integral]
2 If the destination type is unsigned, the resulting value is the
least unsigned integer congruent to the source integer (modulo 2n
where n is the number of bits used to represent the unsigned type). [
Note: In a two's complement representation, this conversion is
conceptual and there is no change in the bit pattern (if there is no
truncation). — end note ]
3 If the destination type is signed, the value is unchanged if it can
be represented in the destination type (and bit-field width);
otherwise, the value is implementation-defined.
For your first question, as others have pointed out, the size of int and long is dependent on the platform, so the methods are not equivalent. In C data types, check that the types say "at least XX bits in size"
For the second question, it comes down to this: long and int are signed, meaning that one bit is reserved for sign (take a look also to two's complement). If you were the compiler, what can you do with negative values (especially the long ones)? As Stepahn Lechner mentioned, this is implementation defined (that is, is up to the compiler).
Finally, in the spirit of "your code must do what it says it does", the best thing to do if you need to do masks is to use masks (and, if you use masks, use unsigned types). Don't try to use cleaver answers. Believe me, they always bite you in the rear. I've dealt with a lot of legacy code to know that by heart.
What's the difference between casting a long to int versus using a bitwise AND in order to get the 4 least significant bytes?
Type. Casting makes the value an int. And'ing does not change the type.
Range. Depending on int,long range, a cast may not change the value at all.
IDB and UB. implementation defined behavior and undefined behavior are present with mixing signed-ness.
To "get" the 4 LSBytes, use & 0xFFFFFFFFu or cast to uint32_t.
OP's question is unnecessarily convoluted.
long n = 0x8899AABBCCDDEEFF; --> Converting a value outside the range of a signed integer type is implementation-defined.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
C11 §6.3.1.3 3
printf("0x%016lX\n", n); --> Printing a long with a "%lX" outside the the common range of long/unsigned long is undefined behavior.
Let's go forward with unsigned long:
unsigned long n = 0x8899AABBCCDDEEFF; // no problem,
printf("0x%016lX\n", n); // no problem,
printf("0x%016X\n", (int)n); // problem, C11 6.3.1.3 3
printf("0x%016X\n", (unsigned int)n); // no problem,
printf("0x%016lX\n", n & 0xFFFFFFFF); // no problem,
The "no problem" are OK even is unsigned long is 32-bit or 64-bit. The output will differ, yet is OK.
Recall that int,long are not always 32,64 bit. (16,32), (32,32), (32,64) are common.
int is at least 16 bit.
long is at least that of int and at least 32 bit.
unsigned int t = 10;
int d = 16;
float c = t - d;
int e = t - d;
Why is the value of c positive but e negative?
Let's start by analysing the result of t - d.
t is an unsigned int while d is an int, so to do arithmetic on them, the value of d is converted to an unsigned int (C++ rules say unsigned gets preference here). So we get 10u - 16u, which (assuming 32-bit int) wraps around to 4294967290u.
This value is then converted to float in the first declaration, and to int in the second one.
Assuming the typical implementation of float (32-bit single-precision IEEE), its highest representable value is roughly 1e38, so 4294967290u is well within that range. There will be rounding errors, but the conversion to float won't overflow.
For int, the situation's different. 4294967290u is too big to fit into an int, so wrap-around happens and we arrive back at the value -6. Note that such wrap-around is not guaranteed by the standard: the resulting value in this case is implementation-defined(1), which means it's up to the compiler what the result value is, but it must be documented.
(1) C++17 (N4659), [conv.integral] 7.8/3:
If the destination type is signed, the value is unchanged if it can be represented in the destination type;
otherwise, the value is implementation-defined.
First, you have to understand "usual arithmetic conversions" (that link is for C, but the rules are the same in C++). In C++, if you do arithmetic with mixed types (you should avoid that when possible, by the way), there's a set of rules that decides which type the calculation is done in.
In your case, you are subtracting a signed int from an unsigned int. The promotion rules say that the actual calculation is done using unsigned int.
So your calculation is 10 - 16 in unsigned int arithmetic. Unsigned arithmetic is modulo arithmetic, meaning that it wraps around. So, assuming your typical 32-bit int, the result of this calculation is 2^32 - 6.
This is the same for both lines. Note that the subtraction is completely independent from the assignment; the type on the left side has absolutely no influence on how the calculation happens. It is a common beginner mistake to think that the type on the left side somehow influences the calculation; but float f = 5 / 6 is zero, because the division still uses integer arithmetic.
The difference, then, is what happens during the assignment. The result of the subtraction is implicitly converted to float in one case, and int in the other.
The conversion to float tries to find the closest value to the actual one that the type can represent. This will be some very large value; not quite the one the original subtraction yielded though.
The conversion to int says that if the value fits into the range of int, the value will be unchanged. But 2^32 - 6 is far larger than the 2^31 - 1 that a 32-bit int can hold, so you get the other part of the conversion rule, which says that the resulting value is implementation-defined. This is a term in the standard that means "different compilers can do different things, but they have to document what they do".
For all practical purposes, all compilers that you'll likely encounter say that the bit pattern stays the same and is just interpreted as signed. Because of the way 2's complement arithmetic works (the way that almost all computers represent negative numbers), the result is the -6 you would expect from the calculation.
But all this is a very long way of repeating the first point, which is "don't do mixed type arithmetic". Cast the types first, explicitly, to types that you know will do the right thing.
The C and C++ standards both allow signed and unsigned variants of the same integer type to alias each other. For example, unsigned int* and int* may alias. But that's not the whole story because they clearly have a different range of representable values. I have the following assumptions:
If an unsigned int is read through an int*, the value must be within the range of int or an integer overflow occurs and the behaviour is undefined. Is this correct?
If an int is read through an unsigned int*, negative values wrap around as if they were casted to unsigned int. Is this correct?
If the value is within the range of both int and unsigned int, accessing it through a pointer of either type is fully defined and gives the same value. Is this correct?
Additionally, what about compatible but not equivalent integer types?
On systems where int and long have the same range, alignment, etc., can int* and long* alias? (I assume not.)
Can char16_t* and uint_least16_t* alias? I suspect this differs between C and C++. In C, char16_t is a typedef for uint_least16_t (correct?). In C++, char16_t is its own primitive type, which compatible with uint_least16_t. Unlike C, C++ seems to have no exception allowing compatible but distinct types to alias.
If an unsigned int is read through an int*, the value must be
within the range of int or an integer overflow occurs and the
behaviour is undefined. Is this correct?
Why would it be undefined? there is no integer overflow since no conversion or computation is done. We take an object representation of an unsigned int object and see it through an int. In what way the value of the unsigned int object transposes to the value of an int is completely implementation defined.
If an int is read through an unsigned int*, negative values wrap
around as if they were casted to unsigned int. Is this correct?
Depends on the representation. With two's complement and equivalent padding, yes. Not with signed magnitude though - a cast from int to unsigned is always defined through a congruence:
If the destination type is unsigned, the resulting value is the
least unsigned integer congruent to the source integer (modulo
2n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this
conversion is conceptual and there is no change in the bit pattern (if
there is no truncation). — end note ]
And now consider
10000000 00000001 // -1 in signed magnitude for 16-bit int
This would certainly be 215+1 if interpreted as an unsigned. A cast would yield 216-1 though.
If the value is within the range of both int and unsigned int,
accessing it through a pointer of either type is fully defined and
gives the same value. Is this correct?
Again, with two's complement and equivalent padding, yes. With signed magnitude we might have -0.
On systems where int and long have the same range, alignment,
etc., can int* and long* alias? (I assume not.)
No. They are independent types.
Can char16_t* and uint_least16_t* alias?
Technically not, but that seems to be an unneccessary restriction of the standard.
Types char16_t and char32_t denote distinct types with the same
size, signedness, and alignment as uint_least16_t and
uint_least32_t, respectively, in <cstdint>, called the underlying
types.
So it should be practically possible without any risks (since there shouldn't be any padding).
If an int is read through an unsigned int*, negative values wrap around as if they were casted to unsigned int. Is this correct?
For a system using two's complement, type-punning and signed-to-unsigned conversion are equivalent, for example:
int n = ...;
unsigned u1 = (unsigned)n;
unsigned u2 = *(unsigned *)&n;
Here, both u1 and u2 have the same value. This is by far the most common setup (e.g. Gcc documents this behaviour for all its targets). However, the C standard also addresses machines using ones' complement or sign-magnitude to represent signed integers. In such an implementation (assuming no padding bits and no trap representations), the result of a conversion of an integer value and type-punning can yield different results. As an example, assume sign-magnitude and n being initialized to -1:
int n = -1; /* 10000000 00000001 assuming 16-bit integers*/
unsigned u1 = (unsigned)n; /* 11111111 11111111
effectively 2's complement, UINT_MAX */
unsigned u2 = *(unsigned *)&n; /* 10000000 00000001
only reinterpreted, the value is now INT_MAX + 2u */
Conversion to an unsigned type means adding/subtracting one more than the maximum value of that type until the value is in range. Dereferencing a converted pointer simply reinterprets the bit pattern. In other words, the conversion in the initialization of u1 is a no-op on 2's complement machines, but requires some calculations on other machines.
If an unsigned int is read through an int*, the value must be within the range of int or an integer overflow occurs and the behaviour is undefined. Is this correct?
Not exactly. The bit pattern must represent a valid value in the new type, it doesn't matter if the old value is representable. From C11 (n1570) [omitted footnotes]:
6.2.6.2 Integer types
For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2N-1, so that objects of that type shall be capable of representing values from 0 to 2N-1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.
For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M≤N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, the value shall be modified in one of the following ways:
the corresponding value with sign bit 0 is negated (sign and magnitude);
the sign bit has the value -2M (two's complement);
the sign bit has the value -2M-1 (ones' complement).
Which of these applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones' complement), is a trap representation or a normal value. In the case of sign and magnitude and ones' complement, if this representation is a normal value it is called a negative zero.
E.g., an unsigned int could have value bits, where the corresponding signed type (int) has a padding bit, something like unsigned u = ...; int n = *(int *)&u; may result in a trap representation on such a system (reading of which is undefined behaviour), but not the other way round.
If the value is within the range of both int and unsigned int, accessing it through a pointer of either type is fully defined and gives the same value. Is this correct?
I think, the standard would allow for one of the types to have a padding bit, which is always ignored (thus, two different bit patterns can represent the same value and that bit may be set on initialization) but be an always-trap-if-set bit for the other type. This leeway, however, is limited at least by ibid. p5:
The values of any padding bits are unspecified. A valid (non-trap) object representation of a signed integer type where the sign bit is zero is a valid object representation of the corresponding unsigned type, and shall represent the same value. For any integer type, the object representation where all the bits are zero shall be a representation of the value zero in that type.
On systems where int and long have the same range, alignment, etc., can int* and long* alias? (I assume not.)
Sure they can, if you don't use them ;) But no, the following is invalid on such platforms:
int n = 42;
long l = *(long *)&n; // UB
Can char16_t* and uint_least16_t* alias? I suspect this differs between C and C++. In C, char16_t is a typedef for uint_least16_t (correct?). In C++, char16_t is its own primitive type, which compatible with uint_least16_t. Unlike C, C++ seems to have no exception allowing compatible but distinct types to alias.
I'm not sure about C++, but at least for C, char16_t is a typedef, but not necessarily for uint_least16_t, it could very well be a typedef of some implementation-specific __char16_t, some type incompatible with uint_least16_t (or any other type).
It is not defined that happens since the c standard does not exactly define how singed integers should be stored. so you can not rely on the internal representation. Also there does no overflow occur. if you just typecast a pointer nothing other happens then another interpretation of the binary data in the following calculations.
Edit
Oh, i misread the phrase "but not equivalent integer types", but i keep the paragraph for your interest:
Your second question has much more trouble in it. Many machines can only read from correctly aligned addresses there the data has to lie on multiples of the types width. If you read a int32 from a non-by-4-divisable address (because you casted a 2-byte int pointer) your CPU may crash.
You should not rely on the sizes of types. If you chose another compiler or platform your long and int may not match anymore.
Conclusion:
Do not do this. You wrote highly platform dependent (compiler, target machine, architecture) code that hides its errors behind casts that suppress any warnings.
Concerning your questions regarding unsigned int* and int*: if the
value in the actual type doesn't fit in the type you're reading, the
behavior is undefined, simply because the standard neglects to define
any behavior in this case, and any time the standard fails to define
behavior, the behavior is undefined. In practice, you'll almost always
obtain a value (no signals or anything), but the value will vary
depending on the machine: a machine with signed magnitude or 1's
complement, for example, will result in different values (both ways)
from the usual 2's complement.
For the rest, int and long are different types, regardless of their
representations, and int* and long* cannot alias. Similarly, as you
say, in C++, char16_t is a distinct type in C++, but a typedef in
C (so the rules concerning aliasing are different).
This question already has answers here:
sizeof() operator in if-statement
(5 answers)
Closed 4 years ago.
#include <stdio.h>
int arr[] = {1,2,3,4,5,6,7,8};
#define SIZE (sizeof(arr)/sizeof(int))
int main()
{
printf("SIZE = %d\n", SIZE);
if ((-1) < SIZE)
printf("less");
else
printf("more");
}
The output after compiling with gcc is "more". Why the if condition fails even when -1 < 8?
The problem is in your comparison:
if ((-1) < SIZE)
sizeof typically returns an unsigned long, so SIZE will be unsigned long, whereas -1 is just an int. The rules for promotion in C and related languages mean that -1 will be converted to size_t before the comparison, so -1 will become a very large positive value (the maximum value of an unsigned long).
One way to fix this is to change the comparison to:
if (-1 < (long long)SIZE)
although it's actually a pointless comparison, since an unsigned value will always be >= 0 by definition, and the compiler may well warn you about this.
As subsequently noted by #Nobilis, you should always enable compiler warnings and take notice of them: if you had compiled with e.g. gcc -Wall ... the compiler would have warned you of your bug.
TL;DR
Be careful with mixed signed/unsigned operations (use -Wall compiler warnings). The Standard has a long section about it. In particular, it is often but not always true that signed is value-converted to unsigned (although it does in your particular example). See this explanation below (taken from this Q&A)
Relevant quote from the C++ Standard:
5 Expressions [expr]
10 Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:
[2 clauses about equal types or types of equal sign omitted]
— Otherwise, if the operand that has unsigned integer type has rank
greater than or equal to the rank of the type of the other operand,
the operand with signed integer type shall be converted to the type of
the operand with unsigned integer type.
— Otherwise, if the type of
the operand with signed integer type can represent all of the values
of the type of the operand with unsigned integer type, the operand
with unsigned integer type shall be converted to the type of the
operand with signed integer type.
— Otherwise, both operands shall be
converted to the unsigned integer type corresponding to the type of
the operand with signed integer type.
Your actual example
To see into which of the 3 cases your program falls, modify it slightly to this
#include <stdio.h>
int arr[] = {1,2,3,4,5,6,7,8};
#define SIZE (sizeof(arr)/sizeof(int))
int main()
{
printf("SIZE = %zu, sizeof(-1) = %zu, sizeof(SIZE) = %zu \n", SIZE, sizeof(-1), sizeof(SIZE));
if ((-1) < SIZE)
printf("less");
else
printf("more");
}
On the Coliru online compiler, this prints 4 and 8 for the sizeof() of -1 and SIZE, respectively, and selects the "more" branch (live example).
The reason is that the unsigned type is of greater rank than the signed type. Hence, clause 1 applies and the signed type is value-converted to the unsigned type (on most implementation, typically by preserving the bit-representation, so wrapping around to a very large unsigned number), and the comparison then proceeds to select the "more" branch.
Variations on a theme
Rewriting the condition to if ((long long)(-1) < (unsigned)SIZE) would take the "less" branch (live example).
The reason is that the signed type is of greater rank than the unsigned type and can also accomodate all the unsigned values. Hence, clause 2 applies and the unsigned type is converted to the signed type, and the comparison then proceeds to select the "less" branch.
Of course, you would never write such a contrived if() statement with explicit casts, but the same effect could happen if you compare variables with types long long and unsigned. So it illustrates the point that mixed signed/unsigned arithmetic is very subtle and depends on the relative sizes ("ranking" in the words of the Standard). In particular, there is no fixed rules saying that signed will always be converted to unsigned.
When you do comparison between signed and unsigned where unsigned has at least an equal rank to that of the signed type (see TemplateRex's answer for the exact rules), the signed is converted to the type of the unsigned.
With regards to your case, on a 32bit machine the binary representation of -1 as unsigned is 4294967295. So in effect you are comparing if 4294967295 is smaller than 8 (it isn't).
If you had enabled warnings, you would have been warned by the compiler that something fishy is going on:
warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
Since the discussion has shifted a bit on how appropriate the use of unsigned is, let me put a quote by James Gosling with regards to the lack of unsigned types in Java (and I will shamelessly link to another post of mine on the subject):
Gosling: For me as a language designer, which I don't really count
myself as these days, what "simple" really ended up meaning was could
I expect J. Random Developer to hold the spec in his head. That
definition says that, for instance, Java isn't -- and in fact a lot of
these languages end up with a lot of corner cases, things that nobody
really understands. Quiz any C developer about unsigned, and pretty
soon you discover that almost no C developers actually understand what
goes on with unsigned, what unsigned arithmetic is. Things like that
made C complex. The language part of Java is, I think, pretty simple.
The libraries you have to look up.
This is an historical design bug of C that was also repeated in C++.
It dates back to 16-bit computers and the error was deciding to use all 16 bits to represent sizes up to 65536 giving up the possibility to represent negative sizes.
This in se wouldn't have been an error if unsigned meaning was "non-negative integer" (a size cannot logically be negative) but it's a problem with the conversion rules of the language.
Given the conversion rules of the language the unsigned type in C doesn't represent a non-negative number, but it's instead more like a bitmask (the mathematical term is actually "a member of the ℤ/n ring"). To see why consider that for the C and C++ language
unsigned - unsigned gives an unsigned result
signed + unsigned gives and unsigned result
both of them clearly make no sense at all if you read unsigned as "non-negative number".
Of course saying that the size of an object is a member of ℤ/n ring doesn't make any sense at all and here it's where the error resides.
Practical implications:
Every time you deal with the size of an object be careful because the value is unsigned and that type in C/C++ has a lot of properties that are illogical for a number. Please always remember that unsigned doesn't mean "non-negative integer" but "member of ℤ/n algebraic ring" and that, most dangerous, in case of a mixed operation an int is converted to unsigned int and not the opposite.
For example:
void drawPolyline(const std::vector<P2d>& pts) {
for (int i=0; i<pts.size()-1; i++) {
drawLine(pts[i], pts[i+1]);
}
}
is buggy, because if passed an empty vector of points it will do illegal (UB) operations. The reason is that pts.size() is an unsigned.
The rules of the language will convert 1 (an integer) to 1{mod n}, will perform the subtraction in ℤ/n resulting in (size-1){mod n}, will convert i also to a {mod n} representation and will do the comparison in ℤ/n.
C/C++ actually defines a < operator in ℤ/n (rarely done in math) and you will end up accessing pts[0], pts[1] ... and so on until huge numbers even if the input vector was empty.
A correct loop could be
void drawPolyline(const std::vector<P2d>& pts) {
for (int i=1; i<pts.size(); i++) {
drawLine(pts[i-1], pts[i]);
}
}
but I normally prefer
void drawPolyline(const std::vector<P2d>& pts) {
for (int i=0,n=pts.size(); i<n-1; i++) {
drawLine(pts[i], pts[i+1]);
}
}
in other words getting rid of unsigned as soon as possible, and just working with regular ints.
Never use unsigned to represent size of containers or counters because unsigned means "member of ℤ/n" and the size of a container is not one of those things. Unsigned types are useful, but NOT to represent size of objects.
The standard C/C++ library unfortunately made this wrong choice, and it's too late to fix it. You are not forced to do the same mistake however.
In the words of Bjarne Stroustrup:
Using an unsigned instead of an int to gain one more bit to represent
positive integers is almost never a good idea. Attempts to ensure that
some values are positive by declaring variables unsigned will
typically be defeated by the implicit conversion rules
well, i'm not going to repeat the strong words Paul R said, but when you are comparing unsigned and integers you are going to experience dome bad things.
do if ((-1) < (int)SIZE)
instead of your if condition
Convert the unsigned type returned from sizeof operator to signed
when you compare two unsigned and signed number compiler implicitly converts signed to unsigned.
-1 signed representation in 4 byte int is 11111111 11111111 11111111 11111111 when converted to unsigned this representation would refer to 2^16-1
So basically your are comparing that 2^16-1>SIZE, which would be true.
You have to override that by explicitly casting the unsigned value to signed.
Since sizeof operator returns unsigned long long you should cast it to signed long long
if((-1)<(signed long long)SIZE)
use this if condition in your code
Code:
typedef signed short SIGNED_SHORT; //16 bit
typedef signed int SIGNED_INT; //32 bit
SIGNED_SHORT x;
x = (SIGNED_SHORT)(SIGNED_INT) 45512; //or any value over 32,767
Here is what I know:
Signed 16 bits:
Signed: From −32,768 to 32,767
Unsigned: From 0 to 65,535
Don't expect 45512 to fit into x as x is declared a 16 bit signed integer.
How and what does the double casting above do?
Thank You!
typedef signed short SIGNED_SHORT; //16 bit
typedef signed int SIGNED_INT; //32 bit
These typedefs are not particularly useful. A typedef does nothing more than provide a new name for an existing type. Type signed short already has a perfectly good name: "signed short"; calling it SIGNED_SHORT as well doesn't buy you anything. (It would make sense if it abstracted away some information about the type, or if the type were likely to change -- but using the name SIGNED_SHORT for a type other than signed short would be extremely confusing.)
Note also that short and int are both guaranteed to be at least 16 bits wide, and int is at least as wide as short, but different sizes are possible. For example, a compiler could make both short and int 16 bits -- or 64 bits for that matter. But I'll assume the sizes for your compiler are as you state.
In addition, signed short and short are names for the same type, as are signed int and int.
SIGNED_SHORT x;
x = (SIGNED_SHORT)(SIGNED_INT) 45512; //or any value over 32,767
A cast specifies a conversion to a specified type. Two casts specify two such conversions. The value 45512 is converted to signed int, and then to signed short.
The constant 45512 is already of type int (another name for signed int), so the innermost cast is fairly pointless. (Note that if int is only 16 bits, then 45512 will be of type long.)
When you assign a value of one numeric type to an object of another numeric type, the value is implicitly converted to the object's type, so the outermost cast is also redundant.
So the above code snippet is exactly equivalent to:
short x = 45512;
Given the ranges of int and short on your system, the mathematical value 45512 cannot be represented in type short. The language rules state that the result of such a conversion is implementation-defined, which means that it's up to each implementation to determine what the result is, and it must document that choice, but different implementations can do it differently. (Actually that's not quite the whole story; the 1999 ISO C standard added permission for such a conversion to raise an implementation-defined signal. I don't know of any compiler that does this.)
The most common semantics for this kind of conversion is that the result gets the low-order bits of the source value. This will probably result in the value -20024 being assigned to x. But you shouldn't depend on that if you want your program to be maximally portable.
When you cast twice, the casts are applied in sequence.
int a = 45512;
int b = (int) a;
short x = (short) b;
Since 45512 does not fit in a short on most (but not all!) platforms, the cast overflows on those platforms. This will either raise an implementation-defined signal or result in an implementation-defined value.
In practice, many platforms define the result as the truncated value, which is -20024 in this case. However, there are platforms which raise a signal, which will probably terminate your program if uncaught.
Citation: n1525 §6.3.1.3
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
The double casting is equivalent to:
short x = static_cast<short>(static_cast<int>(45512));
which is equivalent to:
short x = 45512;
which will likely wrap around so x equals -20024, but technically it's implementation defined behavior if a short has a maximum value less than 45512 on your platform. The literal 45512 is of type int.
You can assume it does two type conversions (although signed int and int are only separated once in the C standard, IIRC).
If SIGNED_SHORT is too small to handle 45512, the result is either implementation-defined or an implementation-defined signal is raised. (In C++ only the former applies.)