I am currently attempting to learn C++. I usually learn by playing around with stuff, and since I was reading up on data types and the different ways you can write integer literals (decimal, binary, hex, etc.), I decided to test how "unsigned short"s work. I am now confused.
Here is my code:
#include <cstdio>
int main() {
    unsigned short a = 0b0101010101010101;
    unsigned short b = 0b01010101010101011;
    unsigned short c = 0b010101010101010101;
    printf("%hu\n%hu\n%hu\n", a, b, c);
}
Integers of type "unsigned short" should have a size of 2 bytes across all operating systems.
I have used binary to declare the values of these integers, because this is the easiest way to make the source of my confusion obvious.
Integer "a" has 16 digits in binary. 16 digits in a data type with a size of 16 bits (2 bytes). When I print it, I get the number 21845. Seems okay. Checks out.
Then it gets weird.
Integer "b" has 17 digits. When we print it, we get the decimal version of the whole 17 digit number, 43691. How does a binary number that takes up 17 digits fit into a variable that should only have 16 bits of memory allocated to it? Is someone lying about the size? Is this some sort of compiler magic?
And then it gets even weirder. Integer "c" has 18 digits, but here we hit the upper limit. When we build, we get the following warning:
/home/dimitrije/workarea/c++/helloworld.cpp: In function ‘int main()’:
/home/dimitrije/workarea/c++/helloworld.cpp:6:22: warning: large integer implicitly truncated to unsigned type [-Woverflow]
unsigned short c = 0b010101010101010101;
Okay, so we can put 17 digits in 16 bits, but we can't put in 18. Makes some kind of sense, I guess? Like we can magic away one digit, but two won't work. But the supposed "truncation", rather than truncating to the actual maximum value, 17 digits (or 43691 in this example), truncates to what the limit logically should be, 21845.
This is frying my brain and I'm too far down the rabbit hole to stop now. Does anyone understand why C++ behaves this way?
---EDIT---
So after someone pointed out to me that my binary numbers started with a 0, I realized I was stupid.
However, when I took the 0 from the left-hand side and carried it to the right (meaning that a, b and c were actually 16, 17 and 18 bits respectively), I realized that the truncating behaviour still doesn't make sense. Here is the output:
43690
21846
43690
43690 is the maximum value for 16 bits. I could've checked this before asking the original question and saved myself some time.
Why does 17 bits truncate to 15, and 18 (and also 19, 20, 21) truncate to 16?
--EDIT 2---
I've changed all the digits in my integers to 1, and my mistake makes sense now. I get back 65535. I took the time to type 2^16 into a calculator this time. The entirety of my question was a result of the fact that I didn't properly look at the binary value I was assigning.
Thanks to the guy who linked implicit conversions, I will read up on that.
On most systems an unsigned short is 16 bits. No matter what you assign to an unsigned short, it will be truncated to 16 bits. In your example the first binary digit is a 0, which is essentially being ignored, in the same way that int x = 05; will just equal 5 and not 05.
If you change the first bit from a 0 to a 1, you will see the expected behaviour of the assignment truncating the value to 16 bits.
The range for an unsigned short int (16 bits) is 0 to 65535
65535 = 1111 1111 1111 1111 in binary
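For what it's worth, here is a minimal sketch of that truncation, assuming a 16-bit unsigned short. It reproduces the numbers from the edited question, and the compiler will warn about b and c just like in the original post:

#include <cstdio>

int main() {
    // Only the low 16 bits of each literal survive the assignment.
    unsigned short a = 0b1010101010101010;   // 16 bits           -> 43690
    unsigned short b = 0b10101010101010110;  // 17 bits = 87382   -> 87382 % 65536 = 21846
    unsigned short c = 0b101010101010101010; // 18 bits = 174762  -> 174762 % 65536 = 43690
    printf("%hu\n%hu\n%hu\n", a, b, c);      // prints 43690, 21846, 43690
}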
Related
In Bjarne Stroustrup's "The C++ Programming Language", the following piece of code on chars is given:
signed char sc = -140;
unsigned char uc = sc;
cout << uc; // prints 't'
1Q) chars are 1 byte (8 bits) on my hardware. What is the binary representation of -140? Is it even possible to represent -140 using 8 bits? I would think the range is guaranteed to be at least [-127...127] for signed chars. How is it even possible to represent -140 in 8 bits?
2Q) Assume it's possible. Why do we subtract 140 from uc when sc is assigned to uc? What is the logic behind this?
EDIT: I wrote cout << sizeof(signed char) and it produced 1 (1 byte). I put this here to be exact about the byte-wise size of signed char.
EDIT 2: cout << int{sc} gives the output 116. I don't understand what happens here.
First of all: Unless you're writing something very low-level that requires bit-representation manipulation - avoid writing this kind of code like the plague. It's hard to read, easy to get wrong, confusing, and often exhibits implementation-defined/undefined behavior.
To answer your question though:
The code assumes you're on a platform in which the types signed char and unsigned char have 8 bits (although theoretically they could have more), and that the hardware has "two's complement" behavior: the bit representation of the result of an arithmetic operation on an integer type with N bits is always modulo 2^N. That also specifies how the same bit pattern is interpreted as signed or unsigned. Now, -140 modulo 2^8 is 116 (01110100), so that's the bit pattern sc will hold. Interpreted as a signed char (range -128 through 127), this is still 116.
An unsigned char can represent 116 as well, so the second assignment results in 116 as well.
116 is the ASCII code of the character t, and std::cout prints unsigned char values (under 128) as the characters with those ASCII codes. So, that's what gets printed.
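Here's a minimal sketch of the whole chain, assuming an 8-bit, two's-complement char (which is what virtually every current platform has):

#include <iostream>

int main() {
    // Assuming 8-bit chars with two's-complement wrap-around:
    signed char sc = -140;         // -140 mod 256 = 116, bit pattern 01110100
                                   // (most compilers warn about this narrowing)
    unsigned char uc = sc;         // 116 fits in an unsigned char, so nothing changes
    std::cout << int{sc} << '\n';  // 116, printed as a number
    std::cout << uc << '\n';       // 't', since cout prints unsigned char as a character
}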
The result of assigning -140 to a signed char is implementation-defined, just like its range is (i.e. see the manual). A very common choice is to use wrap-around math: if it doesn't fit, add or subtract 256 (or the relevant max range) until it fits.
Since sc will have the value 116, and uc can also hold that value, that conversion is trivial. The unusual thing already happened when we assigned -140 to sc.
I'm a bit confused about a sizeof result.
I have this :
unsigned long part1 = 0x0200000001;
cout << sizeof(part1); // this gives 8 bytes
If I count correctly, part 1 is 9 bytes long, right?
Can anybody clarify this for me?
thanks
Regards
if I count correctly, part 1 is 9 bytes long, right?
No, you are counting incorrectly. 0x0200000001 can fit into five bytes. One byte is represented by two hex digits. Hence, the bytes are 02 00 00 00 01.
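If you want the machine to do the counting for you, something like this hypothetical little helper (bytesNeeded is my own name, not a standard function) shows the idea:

#include <cstdint>
#include <iostream>

// Hypothetical helper: counts how many bytes are needed to represent a value,
// by shifting it right 8 bits at a time until nothing is left.
int bytesNeeded(std::uint64_t value) {
    int count = 1;                // even zero needs one byte
    while (value >>= 8) ++count;  // drop one byte per iteration
    return count;
}

int main() {
    std::cout << bytesNeeded(0x0200000001) << '\n'; // prints 5: 02 00 00 00 01
}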
I suppose you are misinterpreting the meaning of sizeof. sizeof(type) returns the number of bytes that the system reserves to hold any value of the respective type. So sizeof(int) will typically give you 4 bytes (on common 32-bit and 64-bit desktop systems alike), sizeof(char[20]) gives 20, and so on.
Note that one can also use identifiers of (typed) variables, e.g. int x; sizeof(x); the type is then deduced from the variable's declaration/definition, such that sizeof(x) is the same as sizeof(int) in this case.
But: sizeof never interprets or analyses the content/value of a variable at runtime, even though the name might suggest it does. So with char *x = "Hello, world";, sizeof(x) is not the length of the string literal "Hello, world" but the size of the type char*.
So your sizeof(part1) is the same as sizeof(long), which is 8 on your system regardless of what the actual content of part1 is at runtime.
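A short sketch of that point, assuming a 64-bit Linux-style system where unsigned long is 8 bytes:

#include <iostream>

int main() {
    unsigned long part1 = 0x0200000001; // the value itself needs 5 bytes...
    unsigned long part2 = 1;            // ...and this one only 1...
    // ...but sizeof looks only at the type, so both operands give the same number
    // (8 here; it would be 4 where unsigned long is 32 bits, e.g. on Windows).
    std::cout << sizeof(part1) << ' ' << sizeof(part2) << '\n';

    const char* x = "Hello, world";
    std::cout << sizeof(x) << '\n';     // size of a char* pointer, not the string length
}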
An unsigned long has a minimum range of 0 to 4294967295, which takes 4 bytes.
Assigning 0x0200000001 (8589934593) to an unsigned long that's not big enough triggers a conversion so that it fits in an unsigned long on your machine. For unsigned types this conversion is well-defined: the value is reduced modulo 2^N (where N is the number of bits), which effectively discards the higher bits.
sizeof will tell you the amount of bytes a type uses. It won't tell you how many bytes are occupied by your value.
sizeof(part1) (I'm assuming "part 1" in your question is a typo for part1) gives the size of an unsigned long (i.e. sizeof(unsigned long)). The size of part1 is therefore the same regardless of what value is stored in it.
On your compiler, sizeof(unsigned long) has a value of 8. The size is implementation-defined for all types (other than the char types, which are defined to have a size of 1), so it may vary between compilers.
The value of 9 you are expecting is the size of the output you would obtain by writing the value of part1 to a file or string as human-readable hex, with no leading zeros or prefix. That has no relationship to the sizeof operator whatsoever. And, when outputting a value, it is possible to format it in different ways (e.g. hex versus decimal versus octal, leading zeros or not), which affects the size of the output.
sizeof(part1) returns the size of the data type of variable part1, which you have defined as unsigned long.
For your compiler, unsigned long is always 64 bits, or 8 bytes, i.e. 8 groups of 8 bits. The hexadecimal representation is a human-readable form of the binary format, where each digit stands for 4 bits. We humans often omit leading zeroes for clarity; computers never do.
Let's consider a single byte of data (a char) holding the value zero:
- decimal: 0
- hexadecimal : 0x0 (often written as 0x00)
- binary: 0000 0000
For a list of C++ data types and their corresponding bit sizes, check out the documentation (the MSVC documentation is easier to read):
for msvc: https://msdn.microsoft.com/en-us/library/s3f49ktz(v=vs.71).aspx
for gcc: https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html#Data-Types
All compilers have documentation for their data sizes, since they depend on the hardware and the compiler itself. If you use a different compiler, a google search on "'your compiler name' data sizes" will help you find the correct sizes for your compiler.
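To see the three notations side by side, here is a tiny sketch, using the value 0xAA instead of zero so the three forms actually look different:

#include <bitset>
#include <iostream>

int main() {
    unsigned char c = 0xAA;                  // one byte
    std::cout << std::dec << +c << '\n';     // decimal:     170
    std::cout << std::hex << +c << '\n';     // hexadecimal: aa
    std::cout << std::bitset<8>(c) << '\n';  // binary:      10101010
}

The +c promotes the byte to int so the stream prints a number rather than a character.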
I compiled the following code with the Turbo C++ compiler:
#include <iostream.h>  // old pre-standard header used by Turbo C++

void main()
{
    int i = 400*400/400;
    if (i == 400)
        cout << "Filibusters";
    else
        cout << "Sea ghirkins";
}
I expected the value of i to be 400 and hence the output to be Filibusters. However, the output I got was Sea ghirkins. How is this possible?
You are overflowing your int type: the behaviour on doing that is undefined.
The range of an int can be as small as -32767 to +32767. Check the value of sizeof(int). If it's 2, then it will not be able to represent 400 * 400. You can also check the values of INT_MAX and INT_MIN.
Use a long instead, which must be at least 32 bits. And perhaps treat yourself to a new compiler this weekend?
Look at operator associativity: * and / are left-associative, which means your expression is evaluated in this order: (400*400)/400.
400*400 = 160000. Computer arithmetic is finite. You are using a 16-bit compiler where int fits into 16 bits and can only hold values in the range -32768 ... 32767 (why -32768?). 160000 obviously doesn't fit into that range and is trimmed (integer overflow occurs). Then you divide the trimmed value by 400 and get something you don't expect. Compilers of the past were quite straightforward, so I would expect something like 72 to be stored in i (160000 mod 65536 is 28928, and 28928 / 400 is 72).
The above means you can either use a bigger integer type, long, which is able to store 160000, or change the grouping manually using parentheses: 400*(400/400).
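You can reproduce the effect on a modern compiler by forcing the intermediate result through a 16-bit type. This is only a simulation, and it assumes the usual two's-complement wrap-around that the old 16-bit compiler exhibited:

#include <cstdint>
#include <iostream>

int main() {
    // Squeeze 400*400 through a 16-bit int, as the old compiler effectively did:
    std::int16_t overflowed = static_cast<std::int16_t>(400 * 400); // 160000 % 65536 = 28928
    std::cout << overflowed / 400 << '\n'; // 28928 / 400 = 72, the "Sea ghirkins" branch

    long ok = 400L * 400L / 400L;          // a 32-bit (or wider) type keeps the full 160000
    std::cout << ok << '\n';               // 400, the expected "Filibusters" branch
}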
I'm trying to convert a string to an int with stringstream. The code below works, but if I use a number bigger than 1234567890, like 12345678901, then it returns 0. I don't know how to fix that, please help me out.
std::string number = "1234567890";
int Result; // number which will contain the result
std::stringstream convert(number.c_str()); // stringstream used for the conversion, initialized with the contents of number
if (!(convert >> Result)) // give the value to Result using the characters in the string
    Result = 0;
printf("%d\n", Result);
The maximum number an int can contain is slightly more than 2 billion (assuming the ubiquitous 32-bit int).
It just doesn't fit in an int!
The largest unsigned int (on a 32-bit platform) is 2^32 - 1 (4294967295), and your input is larger than even that, so the extraction gives up. I'm guessing you can get an error code from it somehow. Maybe check failbit or badbit?
int Result;
std::stringstream convert(number.c_str());
convert >> Result;
if (convert.fail()) {
    std::cout << "Bad things happened";
}
If you're on a 32-bit or LP64 64-bit system then int is 32 bits, so the largest number you can store is approximately 2 billion. Try using a long or long long instead, and change "%d" to "%ld" or "%lld" accordingly.
The (usual) maximum value for a signed int is 2,147,483,647, as it is (usually) a 32-bit integer, so it fails for numbers that are bigger.
If you replace int Result; with long Result; it should work for bigger numbers, but there is still a limit. You can extend that limit by a factor of 2 by using unsigned integer types, but only if you don't need negative numbers.
Hm, lots of misinformation in the existing four or five answers.
An int is minimum 16 bits, and with common desktop system compilers it's usually 32 bits (in all Windows versions) or 64 bits. With 32 bits it has a maximum of 2^32 distinct values, which, setting K = 2^10 = 1024, is 4·K^3, i.e. roughly 4 billion. Your nearest calculator or Python prompt can tell you the exact value.
A long is minimum 32 bits, but that doesn’t help you for the current problem, because in all extant Windows variants, including 64-bit Windows, long is 32 bits…
So, for better range than int, use long long. It's minimum 64 bits, and in practice, as of 2012 it's 64 bits with all compilers. Or just use a double, which, although not an integer type, can with the most common implementation (IEEE 754 64-bit) represent integer values exactly with 53 bits of precision, i.e. integers up to 2^53 in magnitude.
Anyway, remember to check the stream for conversion failure, which you can do by s.fail() or simply !s (which is equivalent to fail(), more precisely, the stream’s explicit conversion to bool returns !fail()).
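Putting that together, here is a sketch of the conversion using long long plus a failure check (the variable names follow the question):

#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::string number = "12345678901"; // too big for a 32-bit int
    long long Result = 0;               // long long is at least 64 bits

    std::stringstream convert(number);
    if (!(convert >> Result)) {         // extraction failed (bad format or out of range)
        std::cout << "conversion failed\n";
        return 1;
    }
    std::cout << Result << '\n';        // prints 12345678901
}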
I have a strange problem. I have created a simple function to convert from decimal to binary. The argument is an int value that represents the number in decimal, and the function returns a bitset that represents the binary number.
The problem is that the conversion works perfectly for binary numbers smaller than 10000000000000000000000000000000 (2,147,483,648 in decimal), but when the number to be converted is larger, the conversion doesn't work properly. Where is the mistake?
Here is the function:
bitset<15000> Utilities::getDecToBin(int dec)
{
    bitset<15000> columnID;
    int x;
    for (x = 0; x < columnID.size(); x++)
    {
        columnID[x] = dec % 2;
        dec = dec / 2;
    }
    return columnID;
}
Thanks in advance for all your help! :D
The range for a 32-bit int is −2,147,483,648 to 2,147,483,647.
If by larger you mean 1073741825, then I can't see anything wrong.
If you mean adding an extra bit at the most significant position (i.e. 2147483648) then you might be running into signed/unsigned issues.
I see that you restrict your loop by the size of columnID. It would be a good idea to limit it by the size of dec in bits too, or stop when dec is 0.
You should use a long int, an array of long int or even a string as input to your method (depending on the source of your data and how you will call your function).
I am surprised it only works up to 30 bits; you should be able to manage 31, but for 32 bits you would need unsigned ints. If you used unsigned 64-bit integers you could manage a lot more than that, but 15,000-bit integers are something that can only be implemented through special classes.
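As a sketch of what the answers above suggest (changing the parameter type to an unsigned 64-bit value and stopping early are my additions, not part of the original Utilities class):

#include <bitset>
#include <cstddef>
#include <cstdint>
#include <iostream>

// Same idea as the question's getDecToBin, but the input is an unsigned 64-bit
// value so numbers at or above 2^31 no longer overflow, and the loop stops as
// soon as nothing is left to convert.
std::bitset<15000> getDecToBin(std::uint64_t dec) {
    std::bitset<15000> columnID;
    for (std::size_t x = 0; x < columnID.size() && dec != 0; ++x) {
        columnID[x] = dec % 2; // lowest remaining bit
        dec /= 2;              // shift the rest down
    }
    return columnID;
}

int main() {
    // 2147483648 (2^31) is the first value that broke the int version.
    std::cout << getDecToBin(2147483648ULL).to_ullong() << '\n'; // prints 2147483648
}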