So the following code
#include <stdio.h>
int main() {
int myInt;
myInt = 0xFFFFFFE2;
printf("%d\n",myInt);
return 0;
}
yields -30.
I get the idea of two's complement, but when I directly type -30 I get the same number too. So my question is: why would I write my number in hexadecimal form? And if I use hexadecimal form, how does the compiler distinguish whether I mean 0xFFFFFFE2 as two's complement for -30 or as the value 4294967266?
Edit: This has nothing to do with two's complement, as I later found out. Two's complement is just a design choice that lets CPUs operate on integers more efficiently. [For more information: Wiki]. See Lightness Races in Orbit's answer for an explanation of the above code.
There are some misconceptions here, all of which can be fixed by revisiting first principles:
A number is a number.
When you write the decimal literal 42, that is the number 42.
When you write the hexadecimal literal 0x2A, that is still the number 42.
When you assign either of these expressions to an int, the int contains the number 42.
A number is a number.
Which base you used does not matter. It changes nothing. Writing a literal in hex then assigning it to an int does not change what happens. It does not magically make the number be interpreted or handled or represented any differently.
A number is a number.
What you've done here is assign 0xFFFFFFE2, which is the number 4294967266, to myInt. That number is larger than the maximum value of a [signed] int on your platform, so the conversion is out of range and its result is implementation-defined. You happen to be seeing a "wrap around" to -30, due to how two's complement representation works and is implemented in your computer's chips and memory.
That's it.
It's got nothing to do with hexadecimal, so there's no "choice" to be made between hex and decimal literals. The same thing would happen if you used a decimal literal:
myInt = 4294967266;
Furthermore, if you were looking for a way to "trigger" this wrap-around behaviour, don't rely on it: the out-of-range conversion is implementation-defined, and signed arithmetic that overflows is outright undefined behaviour.
If you want to manipulate the raw bits and bytes that make up myInt, you can alias it via a char*, unsigned char* or std::byte*, and play around that way.
Hexadecimal vs. decimal has nothing at all to do with why it is displaying -30 or not.
For a 32-bit word that is being treated as signed (since you are using %d), any unsigned value greater than 2147483647 will be displayed as a negative (two's complement) number.
So myInt could be -30, 0xFFFFFFE2, 4294967266 or -0x1E; treated as a signed 32-bit integer, it will display as -30.
Related
I have some small problems regarding (implicit) type conversion in C++.
1. float to int
float f = 554344.76;
int x1 = f;
std::cout << x1 << std::endl;
Prints 554344 (rounding down, or cutting off the decimal places), but when replacing it with float f = 5543444.76; it prints 5543445 (rounding up). Why does it round down in the first case and round up in the second? On top of that, for larger numbers it produces completely weird results (e.g. 5543444675.76 turns into 5543444480). Why?
What is the difference between int x1 = f; and long int x2 = f;?
2. Long int to float
long int li;
float x3 = li;
std::cout << x3 << std::endl;
A solution to an exercise says that the value is rounded down and results in incorrect values for large numbers. If I try long int li = 5435; it is not rounded down. Or does it mean that long int li = 5435.56; is rounded down? Second, why does it result in incorrect values for large numbers? I think long int and float have the same number of bits.
3. char to double
char c = 130;
double x4 = c;
std::cout << x4 << std::endl;
Why does this result in -126, while char c = 100; gives the correct value?
4. int to char
int i = 200;
char x5 = i;
std::cout << x5 << std::endl;
This prints nothing (no output). Why? I think the result should be correct up to 255, because a char can store values up to 255.
Edit: One question per post, please. Here I answer #3 and #4
I think up to 255 the result should be correct because char can store values up to 255.
This is an incorrect assumption. (Demo)
You have a potential out-of-range conversion if char is signed and 8 bits (1 byte): it would then have a maximum value of only 127. The result of such a conversion is implementation-defined.
Whether char is signed is implementation-defined, but it usually is. A char is always 1 byte long, but the number of bits in a byte (CHAR_BIT) is itself implementation-defined, although it's almost universally 8.
In fact, if you reference any ASCII table, it only goes up to 127 before you get into "extended" ASCII, which on most platforms needs a wider character type to display.
So your code in #3 and #4 performs out-of-range conversions.
You should have even gotten a warning about it when you tried char c = 130:
warning: overflow in implicit constant conversion
A float usually does not have enough precision to fully represent 5543444.76. Your float is likely storing the value 5543445.0. The cast to int is not where the rounding occurs: casting from floating point to int always truncates the decimal. Try using a double instead of a float, or assign the value directly to an int, to illustrate the difference.
Many of a float's bits are used to represent the sign and exponent, so it cannot accurately represent all values of an int of the same size. Again, this is a problem of precision: the least significant digits have to be discarded, causing unexpected results that look like rounding errors. Consider scientific notation: you can represent a very large range of values using few digits, but only a few significant digits are tracked, and the less important ones are dropped.
char may be signed or unsigned; it depends on your platform. It appears that char is a signed 8-bit type on your platform, meaning it can only represent values from -128 to 127, and clearly 130 exceeds that limit. The result of this out-of-range conversion is implementation-defined; on your platform it wraps to -126.
char variables don't print their numeric value when passed to std::cout; they print the character associated with that value. See this table. Note that since the value 200 exceeds the maximum value a char can represent on your platform, the result is implementation-defined and may be a character with no visible representation.
A float has a 24-bit significand (23 bits stored explicitly). 5543444.76 needs more precision than that, so it gets rounded to the closest value that fits, 5543445.0.
You have an uninitialized li, so reading it is undefined behavior. Perhaps you should edit the question to show the real code you are wondering about.
char is quite often signed, and 130 can't be represented as a signed char. The result of this out-of-range conversion is implementation-defined rather than undefined, but in practice on a PC CPU the compiler takes the 8 lowest bits of 130 and shoves them into the 8-bit signed char; that sets the 8th bit, resulting in a negative value.
Same thing as #3 above: 200 does not fit in a signed 8-bit integer, which your char probably is. Instead it ends up setting the sign bit, resulting in a negative value.
The nearest number to 5,543,444.76 that an IEEE 754 32-bit float can represent is 5,543,445.0. See this IEEE 754 converter. So f is equal to 5543445.0f, which converts to the integer 5543445 (the truncation discards nothing here, since the fractional part is already zero).
Even though on your specific system a float and a long int may have the same size, not all values of one can be represented by the other. For instance, 0.5f cannot be represented as a long int. Similarly, 100000002 cannot be represented as a float: the nearest IEEE 754 32-bit floats are 100000000.0f and 100000008.0f.
In general, you need to read about floating point representation. Wikipedia is a good start.
char may be signed char or unsigned char depending on the system you're on. In the (8-bit) signed char case, 130 cannot be represented; the out-of-range conversion is implementation-defined (not undefined), and most probably it wraps to -126 (note that 130 + 126 = 256). On the other hand, 100 is a perfectly valid value for a signed char.
In the extended ASCII table, 200 maps to È. If your system does not handle extended ASCII (if it's configured for UTF-8, for instance), or if you have no font for this character, you'll see no output. If you're on a system where char is signed char, the conversion is implementation-defined anyway.
Using pow() from the <cmath> library, I get a negative number for some numbers.
2601*((int)pow(10.0,3*2)) = -1693967296
Why is this so? Is it due to the fact that int only has a range between -32767 and 32767? Or is it because of the casting?
The behaviour of your program is undefined as you are overflowing a signed integral type.
pow(10.0, 3 * 2) is not returning a negative number. That particular overload returns a double. You are casting that to an int, then multiplying it by a constant; the result is too big to fit into an int. Did you check INT_MAX (or the equivalent std::numeric_limits<int>::max()) on your platform?
It is because of integer overflow.
2601*1000000 > INT_MAX.
Hence overflow.
It's not the pow(), but you preempted the answer.
10 to the 6th power is a million, and a million times 2,601 is
2,601,000,000
For signed integers on your platform, the range is probably
–2,147,483,648 to 2,147,483,647
So you see, you've exceeded that range. The actual overflow difference may seem inconsistent, but that's because of two's complement form.
Yes, it is because of the int data type. int ranges from -2,147,483,648 to 2,147,483,647, because sizeof(int) is 4 bytes.
To overcome this, you can use unsigned int or long long int.
Your platform appears to have a 32-bit int, which supports a range of -2147483648 to 2147483647.
pow() returns a positive number in a double precision floating point number. This bit works fine.
Your cast converts this to an int, again this is fine.
Where things go wrong is the multiplication. The multiplication overflows, signed overflow is undefined behaviour.
In your case you seem to have ended up with two's complement wraparound, but you should be aware that this behaviour is not guaranteed, especially with modern optimising compilers.
I just asked this question, and it got me thinking whether there is any reason
1) why you would assign an int variable using hexadecimal or octal instead of decimal, and
2) what the differences are between the different ways of assignment:
int a=0x28ff1c; // hexadecimal
int a=10; //decimal (the most commonly used way)
int a=012177434; // octal
You may have some constants that are more easily understood when written in hexadecimal.
Bitflags, for example, in hexadecimal are compact and easily (for some values of easily) understood, since there's a direct correspondence 4 binary digits => 1 hex digit - for this reason, in general the hexadecimal representation is useful when you are doing bitwise operations (e.g. masking).
In a similar fashion, in several cases integers may be internally divided in some fields, for example often colors are represented as a 32 bit integer that goes like this: 0xAARRGGBB (or 0xAABBGGRR); also, IP addresses: each piece of IP in the dotted notation is two hexadecimal digits in the "32-bit integer" notation (usually in such cases unsigned integers are used to avoid messing with the sign bit).
In some code I'm working on at the moment, for each pixel in an image I have a single byte to use to store "accessory information"; since I have to store some flags and a small number, I use the least significant 4 bits to store the flags, the 4 most significant ones to store the number. Using hexadecimal notations it's immediate to write the appropriate masks and shifts: byte & 0x0f gives me the 4 LS bits for the flags, (byte & 0xf0)>>4 gives me the 4 MS bits (re-shifted in place).
I've never seen octal used for anything besides the IOCCC and UNIX permission masks (although in the latter case they are actually useful, as you probably know if you've ever used chmod); their inclusion in the language probably comes from the fact that C was initially developed as the language to write UNIX.
By default, integer literals are of type int, while hexadecimal literals are of type unsigned int or larger if unsigned int isn't large enough to hold the specified value. So, when assigning a hexadecimal literal to an int, there's an implicit conversion (although it won't impact performance; any decent compiler will perform the conversion at compile time). Sorry, brainfart. I checked the standard just now, and it goes like this:
decimal literals, without the u suffix, are always signed; their type is the smallest of int, long int, and long long int that can represent them;
octal and hexadecimal literals without a suffix, instead, may also be of unsigned type; their actual type is the smallest of int, unsigned int, long int, unsigned long int, long long int, and unsigned long long int that can represent the value.
(C++11, §2.14.2, ¶2 and Table 6)
The difference may be relevant for overload resolution¹, but it's not particularly important when you are just assigning a literal to a variable. Still, keep in mind that you may have valid integer constants that are larger than an int, i.e. where assignment to an int triggers an out-of-range conversion; anyhow, any decent compiler should be able to warn you in these cases.
Let's say that on our platform integers are in 2's complement representation, int is 16 bit wide and long is 32 bit wide; let's say we have an overloaded function like this:
void a(unsigned int i)
{
std::cout<<"unsigned";
}
void a(int i)
{
std::cout<<"signed";
}
Then calling a(1) and a(0x1) will produce the same result (signed), since both literals have type int. But 32768 does not fit in a 16-bit int, so the decimal literal 32768 has type long int, while the hexadecimal literal 0x8000 has type unsigned int; hence a(0x8000) will print unsigned.
It matters from a readability standpoint - which one you choose expresses your intention.
If you're treating the variable as an integral type, you know, like 2+2=4, you use the decimal representation. It's intuitive and straight-forward.
If you're using it as a bitmask, you can use hex, octal, or even binary. For example, you'll know
int a = 0xFF;
will have the last 8 bits set to 1. You'll know that
int a = 0xF0;
is (...)11110000, but you couldn't directly say the same thing about
int a = 240;
although they are equivalent. It just depends on what you use the numbers for.
Well, the truth is it doesn't matter whether you want it in decimal, octal, or hexadecimal: it's just a representation. For your information, numbers in computers are stored in binary (just 0's and 1's), which is also nothing more than a representation. So it's just a matter of representation and readability.
NOTE:
In some C++ debuggers (in my experience), I have assigned a number in decimal representation, but the debugger shows it as hexadecimal.
It's similar to the assignment of an integer this way:
int a = int(5);
int b(6);
int c = 3;
It's all about preference; when it comes down to it, you're just doing the same thing. Some might choose octal or hex to go along with a program that manipulates that type of data.
This is a very basic question. Please don't mind, but I need to ask this. Adding two integers:
#include <iostream>
using namespace std;

int main()
{
cout<<"Enter a string: ";
int a,b,c;
cout<<"Enter a";
cin>>a;
cout<<"\nEnter b";
cin>>b;
cout<<a<<"\n"<<b<<"\n";
c= a + b;
cout <<"\n"<<c ;
return 0;
}
If I give a = 2147483648, then
b automatically takes a value of 4046724 (note that cin never prompts for b),
and the result c is 7433860.
If int is 2^32 and if the first bit is MSB then it becomes 2^31
c= 2^31+2^31
c=2^(31+31)
is this correct?
So how do I implement c = a + b for a = 2147483648 and b = 2147483648, and should c be an integer or a double?
When you perform any sort of input operation, you must always include an error check! For the stream operator, this could look like this:
int n;
if (!(std::cin >> n)) { std::cerr << "Error!\n"; std::exit(-1); }
// ... rest of program
If you do this, you'll see that your initial extraction of a already fails, so whatever values are read afterwards are not well defined.
The reason the extraction fails is that the literal token "2147483648" does not represent a value of type int on your platform (it is too large), no different from, say, "1z" or "Hello".
The real danger in programming is to assume silently that an input operation succeeds when often it doesn't. Fail as early and as noisily as possible.
The int type is signed, and therefore its maximum value is 2^31 - 1 = 2147483648 - 1 = 2147483647.
Even if you used an unsigned integer, its maximum value is 2^32 - 1 = a + b - 1 for the values of a and b you give.
For the arithmetic you are doing, you would be better off using long long, which is signed with a maximum value of 2^63 - 1, or unsigned long long, which is unsigned with a maximum value of 2^64 - 1.
c= 2^31+2^31
c=2^(31+31)
is this correct?
No, but you're right that the result takes more than 31 bits. In this case the result takes 32 bits (whereas 2^(31+31) would take 62 bits). You're confusing multiplication with addition: 2^31 * 2^31 = 2^(31+31), but 2^31 + 2^31 = 2^32.
Anyway, the basic problem you're asking about is called overflow. There are a few options: you can detect it and report it as an error, detect it and redo the calculation in such a way as to get the answer, or just use data types that allow you to do the calculation correctly no matter what the input values are.
Signed overflow in C and C++ is technically undefined behavior, so detection consists of figuring out what input values will cause it (because if you do the operation and then look at the result to see if overflow occurred, you may have already triggered undefined behavior and you can't count on anything). Here's a question that goes into some detail on the issue: Detecting signed overflow in C/C++
Alternatively, you can just perform the operation using a data type that won't overflow for any of the input values. For example, if the inputs are ints then the correct result for any pair of ints can be stored in a wider type such as (depending on your implementation) long or long long.
int a, b;
...
long c = (long)a + (long)b;
If int is 32 bits then it can hold any value in the range [-2^31, 2^31-1]. So the smallest value obtainable would be -2^31 + -2^31 which is -2^32. And the largest value obtainable is 2^31 - 1 + 2^31 - 1 which is 2^32 - 2. So you need a type that can hold these values and every value in between. A single extra bit would be sufficient to hold any possible result of addition (a 33-bit integer would hold any integer from [-2^32,2^32-1]).
Or, since double can probably represent every integer you need (a 64-bit IEEE 754 floating point data type can represent integers up to 53 bits exactly) you could do the addition using doubles as well (though adding doubles may be slower than adding longs).
If you have a library that offers arbitrary precision arithmetic you could use that as well.
I have a strange problem. I have created a simple function to convert from decimal to binary. The argument is an int value that represents the number in decimal, and the function returns a bitset that represents the binary number.
The problem is that the conversion works perfectly for binary numbers smaller than 10000000000000000000000000000000 (2,147,483,648 in decimal), but when the number to be converted is larger, the conversion doesn't work properly. Where is the mistake?
Here I send you the function:
bitset<15000> Utilities::getDecToBin(int dec)
{
    bitset<15000> columnID;
    int x;
    for (x = 0; x < columnID.size(); x++)
    {
        columnID[x] = dec % 2;
        dec = dec / 2;
    }
    return columnID;
}
Thanks in advance for all your help! :D
The range of a 32-bit int is −2,147,483,648 to 2,147,483,647.
If by larger you mean 1073741825, then I can't see anything wrong.
If you mean adding an extra bit at the most significant position (i.e. 2147483648) then you might be running into signed/unsigned issues.
I see that you restrict your loop by the size of columnID. It would be a good idea to also limit it by the size of dec in bits, or to stop when dec reaches 0.
You should use a long int, an array of long int, or even a string as the input to your method (depending on the source of your data and how you will call your function).
I am surprised it only works up to 30 bits; you should be able to manage 31, but for 32 bits you would need unsigned ints. If you used unsigned 64-bit integers you could manage a lot more than that, but 15,000-bit integers can only be implemented through special classes.