Why main function cannot return a negative number? - c++

This might be a really simple question for some, but I'm new to C++ and hope someone can answer this for me.
I'm using this online C++ compiler. Here's the simple code I'm running in it:
int main()
{
int x = 1- 2;
std::cout << x << std::endl;
return x;
}
The output is:
-1
...Program finished with exit code 255
Press ENTER to exit console.
That really ponders me. Why would the main() function return 255 when the value of x is -1?
Doesn't main() return an int (not an unsigned int), so it should be able to return a negative number, right?
How does -1 get converted to 255? Something to do with an 8-bit variable? But isn't the int type 16-bit?

This is not related to C language really. The operating system, or possibly just the C runtime (the small piece of the code which sets up things for your C program and actually calls your main function) limits exit code of the program to unsigned 8 bit number.
Very nearly all systems today use two's complement representation for negative numbers, and then bit pattern for -1 is having all bits of the number to be 1. Doesn't matter how many bits, they are all set when value is -1.
The simplest way to convert an int to 8 bit number is to just take 8 lowest bits (which are now all 1 as per above), so you end up with binary number:
11111111
If interpreted as unsigned, then in decimal value of this happens to be 255 (as signed 8 bits it is still -1), which you can check with any calculator which supports binary (such as Windows 10 Calculator app when you switch it to Programmer mode).
Looking at this from the opposite direction: When trying to understand funny numbers related to computers or programming, it is often useful to convert them to binary. If you convert 255 to binary, you get 11111111, and then if you know binary numbers, you should realize this is -1 if interpreted as signed 8 bit number.

Related

why declare "score[11] = {};" and "grade" as "unsigned" instead of "int'

I'm new to C++ and is trying to learn the concept of array. I saw this code snippet online. For the sample code below, does it make any difference to declare:
unsigned scores[11] = {};
unsigned grade;
as:
int scores[11] = {};
int grade;
I guess there must be a reason why score[11] = {}; and grade is declared as unsigned, but what is the reason behind it?
int main() {
unsigned scores[11] = {};
unsigned grade;
while (cin >> grade) {
if (0 <= grade <= 100) {
++scores[grade / 10];
}
}
for (int i = 0; i < 11; i++) {
cout << scores[i] << endl;
}
}
unsigned means that the variable will not hold a negative values (or even more accurate - It will not care about the sign-). It seems obvious that scores and grades are signless values (no one scores -25). So, it is natural to use unsigned.
But note that: if (0 <= grade <= 100) is redundant. if (grade <= 100) is enough since no negative values are allowed.
As Blastfurnace commented, if (0 <= grade <= 100) is not right even. if you want it like this you should write it as:
if (0 <= grade && grade <= 100)
Unsigned variables
Declaring a variable as unsigned int instead of int has 2 consequences:
It can't be negative. It provides you a guarantee that it never will be and therefore you don't need to check for it and handle special cases when writing code that only works with positive integers
As you have a limited size, it allows you to represent bigger numbers. On 32 bits, the biggest unsigned int is 4294967295 (2^32-1) whereas the biggest int is 2147483647 (2^31-1)
One consequence of using unsigned int is that arithmetic will be done in the set of unsigned int. So 9 - 10 = 4294967295 instead of -1 as no negative number can be encoded on unsigned int type. You will also have issues if you compare them to negative int.
More info on how negative integer are encoded.
Array initialization
For the array definition, if you just write:
unsigned int scores[11];
Then you have 11 uninitialized unsigned int that have potentially values different than 0.
If you write:
unsigned int scores[11] = {};
Then all int are initialized with their default value that is 0.
Note that if you write:
unsigned int scores[11] = { 1, 2 };
You will have the first int intialized to 1, the second to 2 and all the others to 0.
You can easily play a little bit with all these syntax to gain a better understanding of it.
Comparison
About the code:
if(0 <= grade <= 100)
as stated in the comments, this does not do what you expect. In fact, this will always evaluate to true and therefore execute the code in the if. Which means if you enter a grade of, say, 20000, you should have a core dump. The reason is that this:
0 <= grade <= 100
is equivalent to:
(0 <= grade) <= 100
And the first part is either true (implicitly converted to 1) or false (implicitly converted to 0). As both values are lower than 100, the second comparison is always true.
unsigned integers have some strange properties and you should avoid them unless you have a good reason. Gaining 1 extra bit of positive size, or expressing a constraint that a value may not be negative, are not good reasons.
unsigned integers implement arithmetic modulo UINT_MAX+1. By contrast, operations on signed integers represent the natural arithmetic that we are familiar with from school.
Overflow semantics
unsigned has well defined overflow; signed does not:
unsigned u = UINT_MAX;
u++; // u becomes 0
int i = INT_MAX;
i++; // undefined behaviour
This has the consequence that signed integer overflow can be caught during testing, while an unsigned overflow may silently do the wrong thing. So use unsigned only if you are sure you want to legalize overflow.
If you have a constraint that a value may not be negative, then you need a way to detect and reject negative values; int is perfect for this. An unsigned will accept a negative value and silently overflow it into a positive value.
Bit shift semantics
Bit shift of unsigned by an amount not greater than the number of bits in the data type is always well defined. Until C++20, bit shift of signed was undefined if it would cause a 1 in the sign bit to be shifted left, or implementation-defined if it would cause a 1 in the sign bit to be shifted right. Since C++20, signed right shift always preserves the sign, but signed left shift does not. So use unsigned for some kinds of bit twiddling operations.
Mixed sign operations
The built-in arithmetic operations always operate on operands of the same type. If they are supplied operands of different types, the "usual arithmetic conversions" coerce them into the same type, sometimes with surprising results:
unsigned u = 42;
std::cout << (u * -1); // 4294967254
std::cout << std::boolalpha << (u >= -1); // false
What's the difference?
Subtracting an unsigned from another unsigned yields an unsigned result, which means that the difference between 2 and 1 is 4294967295.
Double the max value
int uses one bit to represent the sign of the value. unsigned uses this bit as just another numerical bit. So typically, int has 31 numerical bits and unsigned has 32. This extra bit is often cited as a reason to use unsigned. But if 31 bits are insufficient for a particular purpose, then most likely 32 bits will also be insufficient, and you should be considering 64 bits or more.
Function overloading
The implicit conversion from int to unsigned has the same rank as the conversion from int to double, so the following example is ill formed:
void f(unsigned);
void f(double);
f(42); // error: ambiguous call to overloaded function
Interoperability
Many APIs (including the standard library) use unsigned types, often for misguided reasons. It is sensible to use unsigned to avoid mixed-sign operations when interacting with these APIs.
Appendix
The quoted snippet includes the expression 0 <= grade <= 100. This will first evaluate 0 <= grade, which is always true, because grade can't be negative. Then it will evaluate true <= 100, which is always true, because true is converted to the integer 1, and 1 <= 100 is true.
Yes it does make a difference. In the first case you declare an array of 11 elements a variable of type "unsigned int". In the second case you declare them as ints.
When the int is on 32 bits you can have values from the following ranges
–2,147,483,648 to 2,147,483,647 for plain int
0 to 4,294,967,295 for unsigned int
You normally declare something unsigned when you don't need negative numbers and you need that extra range given by unsigned. In your case I assume that that by declaring the variables unsigned, the developer doesn't accept negative scores and grades. You basically do a statistic of how many grades between 0 and 10 were introduced at the command line. So it looks like something to simulate a school grading system, therefore you don't have negative grades. But this is my opinion after reading the code.
Take a look at this post which explains what unsigned is:
what is the unsigned datatype?
As the name suggests, signed integers can be negative and unsigned cannot be. If we represent an integer with N bits then for unsigned the minimum value is 0 and the maximum value is 2^(N-1). If it is a signed integer of N bits then it can take the values from -2^(N-2) to 2^(N-2)-1. This is because we need 1-bit to represent the sign +/-
Ex: signed 3-bit integer (yes there are such things)
000 = 0
001 = 1
010 = 2
011 = 3
100 = -4
101 = -3
110 = -2
111 = -1
But, for unsigned it just represents the values [0,7]. The most significant bit (MSB) in the example signifies a negative value. That is, all values where the MSB is set are negative. Hence the apparent loss of a bit in its absolute values.
It also behaves as one might expect. If you increment -1 (111) we get (1 000) but since we don't have a fourth bit it simply "falls off the end" and we are left with 000.
The same applies to subtracting 1 from 0. First take the two's complement
111 = twos_complement(001)
and add it to 000 which yields 111 = -1 (from the table) which is what one might expect. What happens when you increment 011(=3) yielding 100(=-4) is perhaps not what one might expect and is at odds with our normal expectations. These overflows are troublesome with fixed point arithmetic and have to be dealt with.
One other thing worth pointing out is the a signed integer can take one negative value more than it can positive which has a consequence for rounding (when using integer to represent fixed point numbers for example) but am sure that's better covered in the DSP or signal processing forums.

C++ : storing a 13 digit number always fails

I'm programming in C++ and I have to store big numbers in one of my exercices.
The biggest number i have to store is : 9 780 321 563 842.
Each time i try to print the number (contained in a variable) it gives me a wrong result (not that number).
A 32bit type isn't enough since 2^32 is a 10 digit number and I have to store a 13 digit number. But with 64 bits you can respresent a number that has 20digits. So I tried using the type "uint64_t" but that didn't work for me and I really don't understand why.
So I searched on the internet to find which type would be sufficient for my variable to fit in. I saw on this forum persons with the same problem but they solved it using long long int or long double as type. But none worked for me (neither did long float).
I really don't know which other type could store that number, as I tried a lot but nothing worked for me.
Thanks for your help! :)
--
EDIT : The code is a bit long and complex and would not matter for the question, so this is actually what I do with the variable containing that number :
string barcode_s = "9780321563842";
uint64_t barcode = atoi(barcode_s.c_str());
cout << "Barcode is : " << barcode << endl;
Off course I don't put that number in a variable (of type string) "barcode_s" to convert it directly to a number, but that's what happen in my program. I read text from an input file and put it in "barcode_s" (the text I read and put in that variable is always a number) and then I convert that string to a number (using atoi).
So i presume the problem comes from the "atoi" function?
Thanks for your help!
The problem is indeed atoi: it returns an int, which is on most platforms a 32-bits integer. Converting to uint64_t from int will not magically restore the information that has been lost.
There are several solutions, though. In C++03, you could use stringstream to handle the conversion:
std::istringstream stream(barcode_s);
unsigned long barcode = 0;
if (not (stream >> barcode)) { std::abort(); }
In C++11, you can simply use stoul or stoull:
unsigned long long const barcode = std::stoull(barcode_s);
Your number 9 780 321 563 842 is hex 8E52897B4C2, which fits into 44 bits (4 bits per hex digit), so any 64 bit integer, no matter if signed or unsigned, will have space to spare. 'uint64_t' will work, and it will even fit into a 'double' with no loss of precision.
It follows that the remaining issue is a mistake in your code, usually that is either an accidental conversion of the 64 bit number to another type somewhere, or you are calling the wrong fouction to print a 64 bit integer.
Edit: just saw your code. 'atoi' returns int. As in 'int32_t'. Converting that to 'unit64_t' will not reconstruct the 64 bit number. Have a look at this: http://msdn.microsoft.com/en-us/library/czcad93k.aspx
The atoll () function converts char* to a long long.
If you don't have the longer function available, write your own in the mean time.
uint64_t result = 0 ;
for (unsigned int ii = 0 ; str.c_str()[ii] != 0 ; ++ ii)
{
result *= 10 ;
result += str.c_str () [ii] - '0' ;
}

Undefined behavior when exceed 64 bits

I have written a function that converts a decimal number to a binary number. I enter my decimal number as a long long int. It works fine with small numbers, but my task is to determine how the computer handles overflow so when I enter (2^63) - 1 the function outputs that the decimal value is 9223372036854775808 and in binary it is equal to -954437177. When I input 2^63 which is a value a 64 bit machine can't hold, I get warnings that the integer constant is so large that it is unsigned and that the decimal constant is unsigned only in ISO C90 and the output of the decimal value is negative 2^63 and binary number is 0. I'm using gcc as a compiler. Is that outcome correct?
The code is provided below:
#include <iostream>
#include<sstream>
using namespace std;
int main()
{
long long int answer;
long long dec;
string binNum;
stringstream ss;
cout<<"Enter the decimal to be converted:"<< endl;;
cin>>dec;
cout<<"The dec number is: "<<dec<<endl;
while(dec>0)
{
answer = dec%2;
dec=dec/2;
ss<<answer;
binNum=ss.str();
}
cout<<"The binary of the given number is: ";
for (int i=sizeof(binNum);i>=0;i--){
cout<<binNum[i];}
return 0;
}
First, “on a 64-bit computer” is meaningless: long long is guaranteed at least 64 bits regardless of computer. If could press a modern C++ compiler onto a Commodore 64 or a Sinclair ZX80, or for that matter a KIM-1, a long long would still be at least 64 bits. This is a machine-independent guarantee given by the C++ standard.
Secondly, specifying a too large value is not the same as “overflow”.
The only thing that makes this question a little bit interesting is that there is a difference. And that the standard treats these two cases differently. For the case of initialization of a signed integer with an integer value a conversion is performed if necessary, with implementation-defined effect if the value cannot be represented, …
C++11 §4.7/3:
“If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined”
while for the case of e.g. a multiplication that produces a value that cannot be represented by the argument type, the effect is undefined (e.g., might even crash) …
C++11 §5/4:
“If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.”
Regarding the code I I only discovered it after writing the above, but it does look like it will necessarily produce overflow (i.e. Undefined Behavior) for sufficiently large number. Put your digits in a vector or string. Note that you can also just use a bitset to display the binary digits.
Oh, the KIM-1. Not many are familiar with it, so here’s a photo:
It was, reportedly, very nice, in spite of the somewhat restricted keyboard.
This adaptation of your code produces the answer you need. Your code is apt to produce the answer with the bits in the wrong order. Exhaustive testing of decimal values 123, 1234567890, 12345678901234567 show it working OK (G++ 4.7.1 on Mac OS X 10.7.4).
#include <iostream>
#include<sstream>
using namespace std;
int main()
{
long long int answer;
long long dec;
string binNum;
cout<<"Enter the decimal to be converted:"<< endl;;
cin>>dec;
cout<<"The dec number is: "<<dec<<endl;
while(dec>0)
{
stringstream ss;
answer = dec%2;
dec=dec/2;
ss<<answer;
binNum.insert(0, ss.str());
// cout << "ss<<" << ss.str() << ">> bn<<" << binNum.c_str() << ">>" << endl;
}
cout<<"The binary of the given number is: " << binNum.c_str() << endl;
return 0;
}
Test runs:
$ ./bd
Enter the decimal to be converted:
123
The dec number is: 123
The binary of the given number is: 1111011
$ ./bd
Enter the decimal to be converted:
1234567890
The dec number is: 1234567890
The binary of the given number is: 1001001100101100000001011010010
$ ./bd
Enter the decimal to be converted:
12345678901234567
The dec number is: 12345678901234567
The binary of the given number is: 101011110111000101010001011101011010110100101110000111
$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
obase=2
123
1111011
1234567890
1001001100101100000001011010010
12345678901234567
101011110111000101010001011101011010110100101110000111
$
When I compile this with the largest value possible for a 64 bit machine, nothing shows up for my binary value.
$ bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
2^63-1
9223372036854775807
quit
$ ./bd
Enter the decimal to be converted:
9223372036854775807
The dec number is: 9223372036854775807
The binary of the given number is: 111111111111111111111111111111111111111111111111111111111111111
$
If you choose a larger value for the largest value that can be represented, all bets are off; you may get a 0 back from cin >> dec; and the code does not handle 0 properly.
Prelude
The original code in the question was:
#include <iostream>
using namespace std;
int main()
{
int rem,i=1,sum=0;
long long int dec = 9223372036854775808; // = 2^63 9223372036854775807 = 2^63-1
cout<<"The dec number is"<<dec<<endl;
while(dec>0)
{
rem=dec%2;
sum=sum + (i*rem);
dec=dec/2;
i=i*10;
}
cout<<"The binary of the given number is:"<<sum<<endl;
return 0;
}
I gave this analysis of the earlier code:
You are multiplying the plain int variable i by 10 for every bit position in the 64-bit number. Given that i is probably a 32-bit quantity, you are running into signed integer overflow, which is undefined behaviour. Even if i was a 128-bit quantity, it would not be big enough to handle all possible 64-bit numbers (such as 263-1) accurately.

printing float, preserving precision

I am writing a program that prints floating point literals to be used inside another program.
How many digits do I need to print in order to preserve the precision of the original float?
Since a float has 24 * (log(2) / log(10)) = 7.2247199 decimal digits of precision, my initial thought was that printing 8 digits should be enough. But if I'm unlucky, those 0.2247199 get distributed to the left and to the right of the 7 significant digits, so I should probably print 9 decimal digits.
Is my analysis correct? Is 9 decimal digits enough for all cases? Like printf("%.9g", x);?
Is there a standard function that converts a float to a string with the minimum number of decimal digits required for that value, in the cases where 7 or 8 are enough, so I don't print unnecessary digits?
Note: I cannot use hexadecimal floating point literals, because standard C++ does not support them.
In order to guarantee that a binary->decimal->binary roundtrip recovers the original binary value, IEEE 754 requires
The original binary value will be preserved by converting to decimal and back again using:[10]
5 decimal digits for binary16
9 decimal digits for binary32
17 decimal digits for binary64
36 decimal digits for binary128
For other binary formats the required number of decimal digits is
1 + ceiling(p*log10(2))
where p is the number of significant bits in the binary format, e.g. 24 bits for binary32.
In C, the functions you can use for these conversions are snprintf() and strtof/strtod/strtold().
Of course, in some cases even more digits can be useful (no, they are not always "noise", depending on the implementation of the decimal conversion routines such as snprintf() ). Consider e.g. printing dyadic fractions.
24 * (log(2) / log(10)) = 7.2247199
That's pretty representative for the problem. It makes no sense whatsoever to express the number of significant digits with an accuracy of 0.0000001 digits. You are converting numbers to text for the benefit of a human, not a machine. A human couldn't care less, and would much prefer, if you wrote
24 * (log(2) / log(10)) = 7
Trying to display 8 significant digits just generates random noise digits. With non-zero odds that 7 is already too much because floating point error accumulates in calculations. Above all, print numbers using a reasonable unit of measure. People are interested in millimeters, grams, pounds, inches, etcetera. No architect will care about the size of a window expressed more accurately than 1 mm. No window manufacturing plant will promise a window sized as accurate as that.
Last but not least, you cannot ignore the accuracy of the numbers you feed into your program. Measuring the speed of an unladen European swallow down to 7 digits is not possible. It is roughly 11 meters per second, 2 digits at best. So performing calculations on that speed and printing a result that has more significant digits produces nonsensical results that promise accuracy that isn't there.
If you have a C library that is conforming to C99 (and if your float types have a base that is a power of 2 :) the printf format character %a can print floating point values without lack of precision in hexadecimal form, and utilities as scanf and strod will be able to read them.
If the program is meant to be read by a computer, I would do the simple trick of using char* aliasing.
alias float* to char*
copy into an unsigned (or whatever unsigned type is sufficiently large) via char* aliasing
print the unsigned value
Decoding is just reversing the process (and on most platform a direct reinterpret_cast can be used).
The floating-point-to-decimal conversion used in Java is guaranteed to be produce the least number of decimal digits beyond the decimal point needed to distinguish the number from its neighbors (more or less).
You can copy the algorithm from here: http://www.docjar.com/html/api/sun/misc/FloatingDecimal.java.html
Pay attention to the FloatingDecimal(float) constructor and the toJavaFormatString() method.
If you read these papers (see below), you'll find that there are some algorithm that print the minimum number of decimal digits such that the number can be re-interpreted unchanged (i.e. by scanf).
Since there might be several such numbers, the algorithm also pick the nearest decimal fraction to the original binary fraction (I named float value).
A pity that there's no such standard library in C.
http://www.cs.indiana.edu/~burger/FP-Printing-PLDI96.pdf
http://grouper.ieee.org/groups/754/email/pdfq3pavhBfih.pdf
You can use sprintf. I am not sure whether this answers your question exactly though, but anyways, here is the sample code
#include <stdio.h>
int main( void )
{
float d_n = 123.45;
char s_cp[13] = { '\0' };
char s_cnp[4] = { '\0' };
/*
* with sprintf you need to make sure there's enough space
* declared in the array
*/
sprintf( s_cp, "%.2f", d_n );
printf( "%s\n", s_cp );
/*
* snprinft allows to control how much is read into array.
* it might have portable issues if you are not using C99
*/
snprintf( s_cnp, sizeof s_cnp - 1 , "%f", d_n );
printf( "%s\n", s_cnp );
getchar();
return 0;
}
/* output :
* 123.45
* 123
*/
With something like
def f(a):
b=0
while a != int(a): a*=2; b+=1
return a, b
(which is Python) you should be able to get mantissa and exponent in a loss-free way.
In C, this would probably be
struct float_decomp {
float mantissa;
int exponent;
}
struct float_decomp decomp(float x)
{
struct float_decomp ret = { .mantissa = x, .exponent = 0};
while x != floor(x) {
ret.mantissa *= 2;
ret.exponent += 1;
}
return ret;
}
But be aware that still not all values can be represented in that way, it is just a quick shot which should give the idea, but probably needs improvement.

Using tilde to get MAX value for int

I tryed to get MAX value for int, using tilde.But output is not what I have expected.
When I run this:
#include <stdio.h>
#include <limits.h>
int main(){
int a=0;
a=~a;
printf("\nMax value: %d",-a);
printf("\nMax value: %d",INT_MAX);
return 0;
}
I get output:
Max value: 1
Max value: 2147483647
I thought,(for exemple) if i have 0000 in RAM (i know that first bit shows is number pozitiv or negativ).After ~ 0000 => 1111 and after -(1111) => 0111 ,that I would get MAX value.
You have a 32-bit two's complement system. So - a = 0 is straightforward. ~a is 0xffffffff. In a 32-bit two's complement representation, 0xffffffff is -1. Basic algebra explains that -(-1) is 1, so that's where your first printout comes from. INT_MAX is 0x7fffffff.
Your logical error is in this statement: "-(1111) => 0111", which is not true. The arithmetic negation operation for a two's complement number is equivalent to ~x+1 - for your example:
~x + 1 = ~(0xffffffff) + 1
= 0x00000000 + 1
= 0x00000001
Is there a reason you can't use std::numeric_limits<int>::max()? Much easier and impossible to make simple mistakes.
In your case, assuming 32 bit int:
int a = 0; // a = 0
a = ~a; // a = 0xffffffff = -1 in any twos-comp system
a = -a; // a = 1
So that math is an incorrect way of computer the max. I can't see a formulaic way to compute the max: Just use numeric_limits (or INT_MAX if you're in a C-only codebase).
Your trick of using '~' to get maximum value works with unsigned integers. As others have pointed out, it doesn't work for signed integers.
Your posting shows an int which is equivalent to signed int. Try changing the type to unsigned int and see what happens.
There is no formula to compute the max value of a signed integer type in C. You simply must use the INT_MAX, etc. macros from limits.h and stdint.h.
binary 1...1111 would always represent -1. Simple math says -1 * -1 = 1!
Always remember there's just one zero: 0...0000. If you'd now swap the MSB and you'd be right, then you'd have 10...0000 which would then be -0 which can't be true (as 0 = -0 in math, but your binary numbers would be different).
Getting the negative value of a number isn't just about swapping the MSB.
It's not quite as straightforward as the top-bit indicating the sign. If it were, you could have both +0 and -0. You should read up on two's complement.
The correct answer is
max = (~0) >> 1;
I'm not a C/C++ expert, so you might need >>> instead. You need the shift operator that does NOT do sign extension.
In 2's complement notation 111111... is -1; now, the unary minus operator does not simply change the sign bit (otherwise it would provide strange results in every normal context), but computes correctly the opposite of the number, i.e. +1.
If you want to change the MSB you could use bitwise operators to simply set it to zero. Notice that however this way of finding the maximum value for the int type is not portable, since you're making assumptions about how the number is represented that are not required by the standard.