Integer Arithmetic Going Wild - C++

Could somebody please explain what's happening under the hood here?
The example runs on an Intel machine. Would the behavior be the same on other architectures?
I have a hardware counter which overruns every now and then, and I have to make sure that the intervals are always computed correctly. I thought that integer arithmetic would always do the trick, but when there is a sign change, the binary subtraction yields an overflow bit which appears to be interpreted as the sign.
Do I really have to handle the sign myself, or is there a more elegant way to compute the interval regardless of the hardware or the implementation?
TIA
std::cout << "\nTest integer arithmetics\n";
int8_t iFirst = -2;
int8_t iSecond = 2;
int8_t iResult = iSecond - iFirst;
std::cout << "\n" << std::to_string(iSecond) << " - " << std::to_string(iFirst) << " = " << std::to_string(iResult);
iResult = iFirst - iSecond;
std::cout << "\n" << std::to_string(iFirst) << " - " << std::to_string(iSecond) << " = " << std::to_string(iResult);
iFirst = SCHAR_MIN + 1; iSecond = SCHAR_MAX - 2;
iResult = iSecond - iFirst;
std::cout << "\n" << std::to_string(iSecond) << " - " << std::to_string(iFirst) << " = " << std::to_string(iResult);
iResult = iFirst - iSecond;
std::cout << "\n" << std::to_string(iFirst) << " - " << std::to_string(iSecond) << " = " << std::to_string(iResult) << "\n\n";
And this is what I get:
Test integer arithmetics
2 - -2 = 4
-2 - 2 = -4
125 - -127 = -4
-127 - 125 = 4

What happens with iResult = iFirst - iSecond is that both iFirst and iSecond are first promoted to int by the usual arithmetic conversions. The result is an int. That int result is then truncated to int8_t for the assignment (in effect, the top 24 bits of the 32-bit int are cut away).
The int result of -127 - 125 is -252. In two's complement representation that is 0xFFFFFF04. Truncation keeps only the 0x04 part, so iResult ends up equal to 4.
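A minimal sketch that makes the intermediate int visible (the variable names are mine, not from the original post):
#include <cstdint>
#include <iostream>
int main()
{
    int8_t a = -127, b = 125;
    int wide = a - b;                          // operands promoted to int: wide == -252
    int8_t narrow = static_cast<int8_t>(wide); // keeps only the low 8 bits: 4
    std::cout << wide << " truncates to " << +narrow << "\n"; // unary + prints it as a number, not a char
    return 0;
}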

The problem is that your variable is 8 bits wide. 8 bits can represent 256 distinct values, so your variables can only hold numbers in the range -128 to 127. Any result outside that range gives wrong output. Both of your last calculations produce numbers beyond that range (252 and -252). There is no way to make such a result fit in the type as it is; you have to account for the wraparound yourself.
PS. This is not a hardware problem. Any processor would give the same results.
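That said, for the original hardware-counter problem there is a common idiom: read the counter into an unsigned type. Unsigned subtraction is defined modulo 2^N, so later - earlier yields the correct elapsed count even across a wraparound. A minimal sketch, assuming an 8-bit counter (the values are illustrative):
#include <cstdint>
#include <iostream>
int main()
{
    uint8_t earlier = 250;  // first reading, shortly before the counter wraps
    uint8_t later = 4;      // second reading, after the wrap
    uint8_t interval = static_cast<uint8_t>(later - earlier); // (4 - 250) mod 256 == 10
    std::cout << "elapsed ticks: " << +interval << "\n";
    return 0;
}
This stays correct as long as the counter wraps at most once between two readings.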

Related

Why is UINT32_MAX + 1 = 0?

Consider the following code snippet:
#include <cstdint>
#include <limits>
#include <iostream>
int main(void)
{
    uint64_t a = UINT32_MAX;
    std::cout << "a: " << a << std::endl;
    ++a;
    std::cout << "a: " << a << std::endl;
    uint64_t b = (UINT32_MAX) + 1;
    std::cout << "b: " << b << std::endl;
    uint64_t c = std::numeric_limits<uint32_t>::max();
    std::cout << "c: " << c << std::endl;
    uint64_t d = std::numeric_limits<uint32_t>::max() + 1;
    std::cout << "d: " << d << std::endl;
    return 0;
}
Which gives the following output:
a: 4294967295
a: 4294967296
b: 0
c: 4294967295
d: 0
Why are b and d both 0? I cannot seem to find an explanation for this.
This behaviour is referred to as overflow (for unsigned types it is, more precisely, well-defined wraparound). uint32_t takes up 4 bytes, or 32 bits, of memory. When you use UINT32_MAX you are setting each of those 32 bits to 1, which is the maximum value 4 bytes of memory can represent. The literal 1 is an int, which typically takes up 4 bytes of memory too. So you are effectively adding 1 to the maximum value 4 bytes can represent. This is what the maximum value looks like in memory:
1111 1111 1111 1111 1111 1111 1111 1111
When you add one to this, there is no more room to represent a value one greater than the maximum, so all the bits wrap around to 0, the minimum value.
Although you are assigning to a uint64_t, which has twice the capacity of uint32_t, the assignment only happens after the addition operation is complete.
The addition operator looks at the types of both its left and right operands, and that is what decides the type of the result. If at least one operand were of type uint64_t, the other would automatically be converted to uint64_t too.
If you do:
(UINT32_MAX) + (uint64_t)1;
or:
(uint64_t)(UINT32_MAX) + 1;
you'll get what you expect. In languages like C#, you can use a checked block to detect overflow and keep it from happening silently.
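For completeness, here is one way to rewrite the d line from the snippet above so that the addition happens in 64 bits (a sketch; one possible fix among several):
uint64_t d = static_cast<uint64_t>(std::numeric_limits<uint32_t>::max()) + 1;
std::cout << "d: " << d << std::endl; // now prints 4294967296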

Bit shift for 32 and 64 unsigned integers

I have a question with this snippet of code:
uint32_t c = 1 << 31;
uint64_t d = 1 << 31;
cout << "c: " << std::bitset<64>(c) << endl;
cout << "d: " << std::bitset<64>(d) << endl;
cout << (c == d ? "equal" : "not equal") << endl;
The result is:
c: 0000000000000000000000000000000010000000000000000000000000000000
d: 1111111111111111111111111111111110000000000000000000000000000000
not equal
Yes, I know that the solution for d is to use 1ULL. But I cannot understand why this happens when the shift is of 31 bits. I read somewhere that it is safe to shift by up to size-1 bits, so if I write the instruction without the ULL and the literal 1 is 32 bits wide, then it should be safe to shift it by 31 bits, right?
What am I missing here?
Regards
YotKay
The problem is that the expression you shift left, namely the constant 1, is treated as a signed integer. Shifting a 1 into the sign bit of a signed int is not even portable (it was undefined or implementation-defined behaviour before C++20); what you typically get is the negative value INT_MIN, which is then sign-extended when it is converted to uint64_t for the assignment to d, causing the result that you see.
Adding the suffix U to the 1 fixes the problem:
uint64_t d = 1U << 31;
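A small sketch contrasting the three variants (the comments describe what typical implementations produce):
uint64_t d1 = 1 << 31;    // int shift: sign bit set, then sign-extended to 64 bits (not portable before C++20)
uint64_t d2 = 1U << 31;   // unsigned int shift: well defined, zero-extended to 64 bits
uint64_t d3 = 1ULL << 31; // the shift itself is carried out in 64 bits
cout << std::bitset<64>(d1) << endl;
cout << std::bitset<64>(d2) << endl;
cout << std::bitset<64>(d3) << endl;
d2 and d3 both print a single 1 in bit 31; d1 prints the sign-extended pattern from the question.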

Appending bits in C/C++

I want to append two unsigned 32-bit integers into one 64-bit integer. I have tried this code, but it fails. However, it works for appending two 16-bit integers into one 32-bit integer.
Code:
char buffer[33];
char buffer2[33];
char buffer3[33];
/*
uint16 int1 = 6535;
uint16 int2 = 6532;
uint32 int3;
*/
uint32 int1 = 653545;
uint32 int2 = 562425;
uint64 int3;
int3 = int1;
int3 = (int3 << 32 /*(when I am doing 16 bit integers, this 32 turns into a 16)*/) | int2;
itoa(int1, buffer, 2);
itoa(int2, buffer2, 2);
itoa(int3, buffer3, 2);
std::cout << buffer << "|" << buffer2 << " = \n" << buffer3 << "\n";
Output when the 16-bit portion is enabled:
1100110000111|1100110000100 =
11001100001110001100110000100
Output when the 32-bit portion is enabled:
10011111100011101001|10001001010011111001 =
10001001010011111001
Why is it not working? Thanks
I see nothing wrong with this code. It works for me. If there's a bug, it's in the code that's not shown.
Here is a version of the given code, using standardized type declarations and iostream manipulators instead of platform-specific library calls. The bit operations are identical to those in the example given.
#include <iostream>
#include <iomanip>
#include <stdint.h>
int main()
{
    uint32_t int1 = 653545;
    uint32_t int2 = 562425;
    uint64_t int3;
    int3 = int1;
    int3 = (int3 << 32) | int2;
    std::cout << std::hex << std::setw(8) << std::setfill('0')
              << int1 << " "
              << std::setw(8) << std::setfill('0')
              << int2 << "="
              << std::setw(16) << std::setfill('0')
              << int3 << std::endl;
    return (0);
}
Resulting output:
0009f8e9 000894f9=0009f8e9000894f9
The bitwise operation looks correct to me. When working with bits, hexadecimal is more convenient. Any bug, if there is one, is in the code that was not shown in the question. As far as "appending bits in C++" goes, what you have in your code appears to be correct.
Try declaring buffer3 as buffer3[65]: a 64-bit value needs 64 characters in base 2, plus one for the terminating NUL.
Edit:
Sorry, but I don't understand what the complaint is about. The result is actually just as expected, and you can infer it from your own result for the 16-bit input. itoa takes a plain (32-bit) int as its value parameter, so the 64-bit int3 is truncated when it is passed in: only the low 32 bits survive, and those are exactly the bits of int2 that were ORed into the bottom of int3. That is why the third string prints as just int2.
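If you want the binary string of the full 64-bit value without relying on itoa, one portable alternative (a sketch, not from the original answers) is std::bitset, whose constructor takes an unsigned long long, so nothing is truncated:
#include <bitset>
#include <cstdint>
#include <iostream>
int main()
{
    uint32_t int1 = 653545;
    uint32_t int2 = 562425;
    uint64_t int3 = (static_cast<uint64_t>(int1) << 32) | int2; // same bit operations as the question
    std::cout << std::bitset<32>(int1) << "|" << std::bitset<32>(int2) << " = \n"
              << std::bitset<64>(int3) << "\n";
    return 0;
}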

C++ safeguards exceeding limits of integer

I am working on a chapter review of a book: at the end of the chapter there are some questions/tasks which you are to complete.
I decided to do them in the format of a program rather than a text file:
#include <iostream>
int main(int argc, char* argv[]) {
std::cout << "Chapter review\n"
<< "1. Why does C++ have more than one integer type?\n"
<< "\tTo be able to represent more accurate values & save memory by only allocating what is needed for the task at hand.\n"
<< "2. Declare variables matching the following descriptions:\n"
<< "a.\tA short integer with the value 80:\n";
short myVal1 = 80;
std::cout << "\t\t\"short myVal1 = 80;\": " << myVal1 << std::endl
<< "b.\tAn unsigned int integer with the value 42,110:\n";
unsigned int myVal2 = 42110;
std::cout << "\t\t\"unsigned int myVal2 = 42110;\": " << myVal2 << std::endl
<< "c.\tAn integer with the value 3,000,000,000:\n";
float myVal3 = 3E+9;
std::cout << "\t\t\"float myVal3 = 3E+9;\": " << static_cast<unsigned int>(myVal3) << std::endl
<< "3. What safeguards does C++ provide to keep you from exceeding the limits of an integer type?\n"
<< "\tWhen it reaches maximum number it starts from the begging again (lowest point).\n"
<< "4. What is the distinction between 33L and 33?\n"
<< "\t33L is of type long, 33 is of type int.\n"
<< "5. Consider the two C++ statements that follow:\n\tchar grade = 65;\n\tchar grade = 'A';\nAre they equivalent?\n"
<< "\tYes, the ASCII decimal number for 'A' is '65'.\n"
<< "6. How could you use C++ to find out which character the code 88 represents?\nCome up with at least two ways.\n"
<< "\t1: \"static_cast<char>(88);\": " << static_cast<char>(88) << std::endl; // 1.
char myChar = 88;
std::cout << "\t2: \"char myChar = 88;\": " << myChar << std::endl // 2.
<< "\t3: \"std::cout << (char) 88;\" " << (char) 88 << std::endl // 3.
<< "\t4: \"std::cout << char (88);\": " << char (88) << std::endl // 4.
<< "7. Assigning a long value to a float can result in a rounding error. What about assigning long to double? long long to double?\n"
<< "\tlong -> double: Rounding error.\n\tlong long -> double: Significantly incorrect number and/or rounding error.\n"
<< "8. Evaluate the following expressions as C++ would:\n"
<< "a.\t8 * 9 + 2\n"
<< "\t\tMultiplication (8 * 9 = 72) -> addition (72 + 2 = 74).\n"
<< "b.\t6 * 3 / 4\n"
<< "\t\tMultiplication (6 * 3 = 18 -> division (18 / 4 = 4).\n"
<< "c.\t3 / 4 * 6\n"
<< "\t\tDivision (3 / 4 = 0) -> multiplication (0 * 6 = 0).\n"
<< "d.\t6.0 * 3 / 4\n"
<< "\t\tMultiplication (6.0 * 3 -> 18.0) -> division (18.0 / 4 = 4.5).\n"
<< "e.\t 15 % 4\n"
<< "\t\tDivision (15 / 4 = 3.75) Then returns the reminder, basically how many times can 4 go into 15 in this case that is 3 (3*4 = 12).\n"
<< "9. Suppose x1 and x2 are two type of double variables that you want to add as integers and assign to an integer variable. Construct a C++ statement for doing so. What if you wanted to add them as type double and then convert to int?\n"
<< "\t1: \"int myInt = static_cast<double>(doubleVar);\"\n\t2: \"int myInt = int (doubleVar);\".\n"
<< "10. What is the variable type for each of the following declarations?\n"
<< "a.\t\"auto cars = 15;\"\n\t\tint\n"
<< "b.\t\"auto iou = 150.37f;\"\n\t\tfloat\n"
<< "c.\t\"auto level = 'B';\"\n\t\tchar\n"
<< "d.\t\"auto crat = U'/U00002155';\"\n\t\twchar_t ?\n"
<< "e.\t\"auto fract = 8.25f/.25;\"\n\t\tfloat" << std::endl;
return 0;
}
It's been a while since I read chapter 3, due to moving and some other real-life stuff.
What I am unsure about here is basically question number 3: it says safeguards, as in plural.
However, I am only aware of one: that the value starts from the beginning again after reaching the maximum. Am I missing something here?
Let me know if you see any other errors too - I am doing this to learn, after all :).
Since I can't accept a comment as an answer, to sum it up:
There are no safeguards; I misunderstood that question, which @n.m. clarified for me.
10.e was wrong, as pointed out by @Jarod42: 8.25f/.25 divides a float by a double, so fract is actually a double.
Thanks!
As for me, "Declare a variable of integer type with the value 3,000,000,000" would be:
unsigned anInteger = 3000000000;
because C++11 supplies 15 integer types, and unsigned int is the smallest of them that can store an integer as big as 3,000,000,000.
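If you want to verify that claim on your own platform, a quick compile-time check is possible (a sketch; on a typical platform unsigned int is 32 bits and its maximum is 4,294,967,295):
#include <limits>
static_assert(std::numeric_limits<unsigned int>::max() >= 3000000000u,
              "unsigned int cannot hold 3,000,000,000 on this platform");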
C++ classifies signed integer overflow as undefined behaviour - anything can happen as a result of it. This by itself may be called a "safeguard" (though it's a stretch), by the following reasoning:
gcc has the -ftrapv compilation switch, which makes your program crash when signed integer overflow happens. This lets you debug your overflows easily. The feature is only possible because C++ made it legal (by nature of undefined behaviour) for the program to crash in these circumstances. I think the C++ committee had this exact scenario in mind when writing that part of the C++ standard.
This is different from e.g. Java, where integer overflow causes wraparound and is probably harder to debug.
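If you want an explicit safeguard rather than a trap, one option is to check each operation yourself. A sketch using __builtin_add_overflow, which is a GCC/Clang extension rather than standard C++:
#include <climits>
#include <iostream>
int main()
{
    int a = INT_MAX, b = 1, sum;
    if (__builtin_add_overflow(a, b, &sum)) // returns true if the result does not fit in sum
        std::cout << "overflow detected\n";
    else
        std::cout << "sum: " << sum << "\n";
    return 0;
}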

Horrible number comes from nowhere

Out of nowhere I get quite a big result from this function... It should be very simple, but I can't see it right now.
double prob_calculator_t::pimpl_t::B_full_term() const
{
    double result = 0.0;
    for (uint32_t j = 0, j_end = U; j < j_end; j++)
    {
        uint32_t inhabited_columns = doc->row_sums[j];
        // DEBUG
        cout << "inhabited_columns: " << inhabited_columns << endl;
        cout << "log_of_sum[j]: " << log_of_sum[j] << endl;
        cout << "sum_of_log[j]: " << sum_of_log[j] << endl;
        // end DEBUG
        result += ( -inhabited_columns * log( log_of_sum[j] ) + sum_of_log[j] );
        cout << "result: " << result << endl;
    }
    return result;
}
and here is the trace:
inhabited_columns: 1
log_of_sum[j]: 110.56
sum_of_log[j]: -2.81341
result: 2.02102e+10
inhabited_columns: 42
log_of_sum[j]: 110.56
sum_of_log[j]: -143.064
result: 4.04204e+10
Thanks for the help!
inhabited_columns is unsigned and I see a unary - just before it: -inhabited_columns.
(Note that unary - has a really high operator precedence; higher than * etc).
That is where your problem is! To quote Mike Seymour's answer:
When you negate it, the result is still unsigned; the value is reduced modulo 2^32 to give a large positive value.
One fix would be to write
-(inhabited_columns * log(log_of_sum[j]))
as then the negation will be carried out in floating point.
inhabited_columns is an unsigned type. When you negate it, the result is still unsigned; the value is reduced modulo 2^32 to give a large positive value.
You should change it to a sufficiently large signed type (maybe int32_t, if you're not going to have more than a couple of billion columns), or perhaps to double, since you're about to use it in double-precision arithmetic anyway.
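Putting the two suggestions together, a minimal sketch of the corrected accumulation line (identifiers as in the original post; one possible fix, not necessarily the author's final code):
result += -static_cast<double>(inhabited_columns) * log( log_of_sum[j] ) + sum_of_log[j];
The cast forces the negation and the multiplication to happen in double, so no unsigned wraparound can occur.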