Unexpected behavior from unsigned __int64 - C++

unsigned __int64 difference;
difference = (64 * 33554432);
printf("size %I64u \n", difference);
difference = (63 * 33554432);
printf("size %I64u \n", difference);
The first number is ridiculously large; the second number is the correct answer. How does changing it from 62 to 63 cause such a change?
First value is 18446744071562067968
Second value is 2113929216
Sorry the values were 64 and 63, not 63 and 62.

Unless qualified otherwise, integer literals are of type int. I would assume that on the platform you're on, an int is 32 bits. So the calculation (64*33554432) overflows a 32-bit int (strictly speaking this is undefined behavior, but on this platform it wraps to INT_MIN and becomes negative). This negative value is then converted to an unsigned __int64, where sign extension turns it into a very, very large positive integer.
Voila:
#include <stdio.h>

int main()
{
    int a1 = (64 * 33554432);
    int a2 = (63 * 33554432);
    printf("%08x\n", a1);    // 80000000 (negative)
    printf("%08x\n", a2);    // 7e000000 (positive)
    unsigned __int64 b1 = a1;
    unsigned __int64 b2 = a2;
    printf("%016llx\n", b1); // ffffffff80000000
    printf("%016llx\n", b2); // 000000007e000000
}
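The fix is to force the multiplication itself to happen in 64 bits, for example by making one operand an unsigned 64-bit literal. A minimal sketch (ULL assumes a compiler where unsigned long long is 64 bits, as on MSVC and gcc):
unsigned __int64 difference = 64ULL * 33554432; // multiply in 64 bits: no overflow
printf("size %I64u \n", difference);            // prints 2147483648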

On gcc it works fine and gives the correct number in both cases:
size 2113929216
size 2080374784
Could it be a bug with printf?
Are you using MSVC or similar? Try stepping through it in the debugger and inspecting difference after each assignment. If the numbers look right there, then it might just be a printf problem. Under gcc on Linux, however, it's correct.

Related

C/C++ Printing bytes in hex, getting weird hex values

I am using the following to print out numbers from an array in hex:
char buff[1000];
// Populate array....
int i;
for (i = 0; i < 1000; ++i)
{
    printf("[%d] %02x\n", i, buff[i]);
}
but I sometimes print weird values:
byte[280] 00
byte[281] 00
byte[282] 0a
byte[283] fffffff4 // Why is this a different length?
byte[284] 4e
byte[285] 66
byte[286] 0a
Why does this print 'fffffff4'?
Use %02hhx as the format string.
From CppReference, %02x expects an unsigned int. When you pass the arguments to printf(), which is a variadic function, buff[i] is automatically converted to int. The format specifier %02x then makes printf() interpret the value as an int, so potential negative values like (char)-1 get interpreted and printed as (int)-1, which is the cause of what you observed.
It can also be inferred that your platform has a signed char type and a 32-bit int type.
The length modifier hh tells printf() to interpret the argument as a char-sized value, so %hhx is the correct format specifier for unsigned char.
Alternatively, you can cast the data to unsigned char before printing, like
printf("[%d] %02x\n", i, (unsigned char)buff[i]);
This also prevents negative values from printing too wide, as an int can (almost) always represent any unsigned char value.
See the following example:
#include <stdio.h>

int main(void) {
    signed char a = +1, b = -1;
    printf("%02x %02x %02hhx %02hhx\n", a, b, a, b);
    return 0;
}
The output of the above program is:
01 ffffffff 01 ff
Your platform apparently has signed char. On platforms where char is unsigned, the output would be f4.
When calling a variadic function, any integer argument smaller than int gets promoted to int.
A char value of f4 (-12 as a signed char) has the sign bit set, so when converted to int it becomes fffffff4 (still -12, but now as an int) in your case.
%02x causes printf to treat the argument as an unsigned int and print it using at least 2 hexadecimal digits.
The value doesn't fit in 2 digits, so as many as are required are used.
Hence the output fffffff4.
To fix it, either declare your array unsigned char buff[1000]; or cast the argument:
printf("[%d] %02x\n", i, (unsigned char)buff[i]);

C++ modulus requires cast of subtraction between two *un*signed bytes to work, why?

The following Arduino (C++) code
void setup()
{
    Serial.begin(115200);
    byte b1 = 12;
    byte b2 = 5;
    const byte RING_BUFFER_SIZE = 64;
    byte diff = b2 - b1;
    byte diff2 = (byte)(b2 - b1) % RING_BUFFER_SIZE; // <-- NOTE HOW THE (byte) CAST IS *REQUIRED* TO GET THE RIGHT RESULT!
    Serial.println(b1);
    Serial.println(b2);
    Serial.println(RING_BUFFER_SIZE);
    Serial.println(diff);
    Serial.println(diff2);
}

void loop()
{
}
produces the expected:
12
5
64
249
57 //<--correct answer
Whereas without the "(byte)" cast as shown here:
void setup()
{
    Serial.begin(115200);
    byte b1 = 12;
    byte b2 = 5;
    const byte RING_BUFFER_SIZE = 64;
    byte diff = b2 - b1;
    byte diff2 = (b2 - b1) % RING_BUFFER_SIZE; // <-- (byte) cast removed
    Serial.println(b1);
    Serial.println(b2);
    Serial.println(RING_BUFFER_SIZE);
    Serial.println(diff);
    Serial.println(diff2);
}

void loop()
{
}
it produces:
12
5
64
249
249 //<--wrong answer
Why the difference? Why does the modulo operator ONLY work with the explicit cast?
Note: "byte" = "uint8_t"
5 - 12 gives -7 (an int). So your code computes -7 % 64.
Mathematically we would expect this to give 57. However, in C and C++, % doesn't do what you might expect mathematically for negative operands. Instead it satisfies the following identity:
(a/b) * b + a%b == a
Now, (-7)/64 gives 0 because C and C++ truncate toward zero for integer division. Therefore -7 % 64 evaluates to -7.
Finally, converting -7 to uint8_t gives 249.
When you write (byte)-7 % 64 you are actually computing 249 % 64, giving the expected answer.
Regarding the behaviour of b2 - b1: all integer arithmetic is done in at least int precision. For each operand of -, if it has a narrower integer type than int, it is first promoted to int (leaving the value unchanged). Further conversions may occur if the types differ after this promotion (which they don't in this case).
In code, b2 - b1 means (int)b2 - (int)b1, yielding an int; there is no way to make the operation happen at lower precision.
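A minimal standalone sketch (plain C++ in place of the Arduino sketch) that makes the promotion visible:
#include <cstdint>
#include <cstdio>

int main()
{
    std::uint8_t b1 = 12, b2 = 5;
    // b2 - b1 is computed in int: 5 - 12 == -7
    std::printf("%d\n", b2 - b1);                      // -7
    std::printf("%d\n", (b2 - b1) % 64);               // -7 (sign follows the dividend)
    std::printf("%d\n", (std::uint8_t)(b2 - b1) % 64); // 57 (249 % 64)
}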
Arithmetic operations want to operate on int or larger, so your bytes are promoted to int before they are subtracted. You get actual ints, which is fine by C and C++, because int can hold the entire range of byte.
If the result of the subtraction is cast back down to byte, it gives you the expected wraparound behavior. However, if you omit the cast in the diff2 calculation, you're doing the modulus on a negative int. And because C/C++ signed division rounds toward zero, the signed modulus has the same sign as the dividend.
The first misstep here is to expect the subtraction to act directly on your byte type (or to promote your unsigned byte to an unsigned int). The cascading problem is overlooking the behavior of C++ signed division (which is understandable if you don't know to expect signed arithmetic to be an issue in the first place).
Note that if your RING_BUFFER_SIZE were not a power of two, the modulus wouldn't fix up cases like this anyway. And since it is a power of two, note that:
(b2 - b1) & (RING_BUFFER_SIZE - 1)
should work correctly.
And finally (as suggested in the comment), the right way to do a ring-buffer subtraction is to make sure b1 and b2 are both already less than RING_BUFFER_SIZE (which makes sense for ring-buffer indices), and use something like:
(b2 >= b1) ? b2 - b1 : RING_BUFFER_SIZE + b2 - b1

bit shift for unsigned int, why negative?

Code:
unsigned int i = 1<<31;
printf("%d\n", i);
Why the out put is -2147483648, a negative value?
Updated question:
#include <stdio.h>
int main(int argc, char *argv[]) {
    int i = 1 << 31;
    unsigned int j = 1 << 31;
    printf("%u, %u\n", i, j);
    printf("%d, %d\n", i, j);
    return 0;
}
The above print:
2147483648, 2147483648
-2147483648, -2147483648
So does this mean that a signed int and an unsigned int hold the same bit pattern, and the difference is just how bit 31 is treated when the bits are converted to a numeric value?
%d prints the int version of the unsigned int i. Try %u for unsigned int.
printf("%u\n", i);
#include <stdio.h>

int main() {
    printf("%d, %u", -1, -1);
    return 0;
}
Output: -1, 4294967295
That is, understanding how a signed integer is stored, and how it converts between signed and unsigned, will help you.
To answer your updated question: it is how the system represents them, i.e. two's complement. In the case above, the two's-complement bit pattern of -1, read as unsigned, is 4294967295.
Use %u for unsigned int:
printf("%u\n", i);
Response to the updated question: any sequence of bits can be interpreted as either a signed or an unsigned value.
printf("%d\n", i); invokes UB. i is unsigned int and you try to print it as signed int. Writing 1 << 31 instead of 1U << 31 is undefined too.
Print it as:
printf("%u\n", i);
or
printf("%X\n", i);
Your updated question also invokes undefined behavior for the very same reasons. (If you use 1U instead of 1, the problem becomes that the int i is initialized with 1U << 31, an out-of-range value. When an unsigned type is initialized with an out-of-range value, modular arithmetic kicks in and the remainder is assigned; for a signed type, the result is implementation-defined.)
Understanding the behavior on your platform: on your platform, int appears to be 4 bytes. When you write something like 1 << 31, it produces the bit pattern 0x80000000 on your machine.
Now, when you print this pattern as signed, you get the signed interpretation, which is -2^31 (a.k.a. INT_MIN) on a two's-complement system. When you print it as unsigned, you get the expected 2^31 as output.
Learnings:
1. Use 1U << 31 instead of 1 << 31.
2. Always use the correct print specifiers in printf.
3. Pass correct argument types to variadic functions.
4. Be careful when implicit conversions (unsigned -> signed, wider type -> narrower type) take place. If possible, avoid such conversions completely; a corrected sketch follows this list.
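Applying these points, a minimal corrected sketch of the updated program:
#include <stdio.h>

int main(void) {
    unsigned int j = 1U << 31; /* unsigned literal: the shift is well-defined */
    printf("%u\n", j);         /* matching specifier: prints 2147483648 */
    printf("%X\n", j);         /* hex view: 80000000 */
    return 0;
}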
Try
printf("%u\n", i);
With the %d specifier, printf expects an int argument and interprets the bits as a signed value.
So use %u for unsigned int.
You are not doing the shift operation on an unsigned int but on a signed one: the literal 1 is signed.
The shift then moves a bit into the sign bit of that signed int (assuming int is 32 bits wide), which is already undefined behavior.
Then you assign whatever value the compiler produced to an unsigned int.
Then you print that unsigned int as signed, which again has no defined behavior.
%u is for unsigned int.
%d is for signed int.
In your program's output:
2147483648, 2147483648 (output for unsigned int)
-2147483648, -2147483648 (output for signed int)

Is there a different way integer promotion will be handled in "uint8_t | (uint8_t << 8)"?

In the following program, it seems that a[1], shifted left by eight bits, should go to 0, making the printed value 1. But integer promotion actually happens, and the printed value of b is 257. I am running gcc 4.8.2 on x86-64.
Here is the question: could integer promotion be handled in a different way, so that the printed value is not 257, without changing the code but by changing processor and compiler (processor options limited to x86, x86-64, and ARM)?
#include<stdio.h>
#include<stdint.h>
#include<inttypes.h>
int main(){
uint8_t *a;
a = (uint8_t *)malloc(sizeof(uint8_t)*2);
uint16_t b;
a[0] = 1; a[1] = 1;
b = a[0] | (a[1] << 8);
printf("b = %d\n", b);
return 0;
}
Could integer promotion be handled in a different way, so that the printed value is not 257?
No. The integer promotions always take place, meaning both a[0] and a[1] are promoted to int before the shift or the bitwise OR happen. Since int is at least 32 bits on all the platforms you list, a[1] << 8 is 256, not 0, and b is 257.
From the spec:
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int...
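A minimal sketch making the promotion explicit (the variable names lo, hi, and shifted are introduced here for illustration):
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t lo = 1, hi = 1;
    int shifted = hi << 8;          /* hi is promoted to int, so this is 256, not 0 */
    uint16_t b = (uint16_t)(lo | shifted);
    printf("%d %d\n", shifted, b);  /* prints: 256 257 */
    return 0;
}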

C++ bitwise ops on double via long long pointer breaks output

Yes, I know, doing bitwise ops on double values seems like a bad idea, but I actually need it.
You don't need to read the next paragraph for my question; it's only for the curious:
I am actually trying a special mod to Mozilla Tamarin (the ActionScript Virtual Machine). In it, any object has its first 3 bits reserved for its type (double is 7, for example). These bits reduce precision for primitive data types (int gets only 29 bits, etc.). For my mod, I need to expand this area by 2 bits. This means that when you, for example, add 2 doubles, you need to set those last 5 bits to zero, do the math, then reset them on the result. So much for the why.
Now back to the code.
Here is a minimal example that shows a very similar problem:
double *d = new double;
*d = 15.25;
printf("float: %f\n", *d);

// forced hex output of double
printf("forced bitwise of double: ");
unsigned char *c = (unsigned char *)d;
int i;
for (i = sizeof(double) - 1; i >= 0; i--) {
    printf("%02X ", c[i]);
}
printf("\n");

// cast to long long pointer, so that bitops become possible
long long *l = (long long *)d;

// now the bitops:
printf("IntHex: %016X, float: %f\n", *l, *(double *)l); // this output is wrong!
*l = *l | 0x07;
printf("last 3 bits set to 1: %016X, float: %f\n", *l, *d); // this output is wrong!
*l = *l | 0x18;
printf("2 bits more set to 1: %016X, float: %f\n", *l, *d); // this output is wrong!
When running this in Visual Studio 2008, the first output is correct, and the second too. The 3rd yields 0 for both the hex and the float representation, which is obviously wrong. The 4th and 5th are also zero for both hex and float, but the modified bits do show up in the hex value. So I thought maybe the typecast messed things up here. So, 2 more outputs:
printf("float2: %f\n", *(double*)(long long*)d); //almost right
printf("float3: %f\n", *d); //almost right
Well, they show 15.25, but it should be 15.2500000000000550670620214078. So I thought, hey, it's just a precision issue in the output. Let's modify a bit further up:
*l = *l |= 0x10000000000;
printf("float4: %f\n", *d);
Again, the output is 15.25(0000), and not 15.2519531250000550670620214078. Weirdly enough, another forced hex output (see code above) shows no modification of d at all. So I tinkered a bit, and realized that bit 31 (0x80000000) is the last one I can set by hand. And holy moly, it actually has an effect on the output (15.250004)!
So, though I slightly strayed, there is still a lot of confusion. Is printf broken? Am I having a big/little-endian confusion here? Am I accidentally creating some kind of buffer overrun?
If anybody is interested: in the original problem (the Tamarin thing, see above) it's pretty much the inverse. There, the last three bits are already set (which represents a double). Setting them to zero works fine (which is the original implementation). Setting 2 more bits to zero has the same effect as above (the overall value gets floored to zero). Which, by the way, is not output-specific; math ops also work with those floored values (multiplying 2 values obtained like that results in 0).
Any help would be appreciated.
Greetings.
well, they show 15.25, but it should be 15.2500000000000550670620214078
By default, %f displays 6 digits of precision, so you won't see the difference. You also need to specify that the first argument is long long rather than int, using the ll modifier; otherwise, it might print garbage. If you fix that and use a higher precision, such as %.30f, you should see the expected result:
printf("last 3 bits set to 1: %016llX, float: %.30f\n", *l, *d);
printf("2 bits more set to 1: %016llX, float: %.30f\n", *l, *d);
last 3 bits set to 1: 0000000000000007, float: 15.250000000000012434497875801753
2 bits more set to 1: 000000000000001F, float: 15.250000000000055067062021407764
lets modify a bit further up:
*l = *l |= 0x10000000000;
printf("float4: %f\n", *d);
You have a rogue = giving undefined behaviour, so the value may or may not end up being modified (and the program may or may not crash, phone out for pizza, or destroy the universe). Also, if your compiler isn't C++11 compliant, the type of the integer literal might be no larger than long, which might only be 32 bits; in which case it will (probably) become zero.
Fixing those (and in my case, with your code as it is), I get the expected result:
*l = *l | 0x10000000000LL; // just one assignment, and "LL" to force "long long"
printf("float4: %f\n", *d);
float4: 15.251953
You have a mistake in the printf parameters. If you pass an 8-byte value, you have to use %llx instead of %x.
use
printf("last 3 bits set to 1: %llX, float: %f\n", *l, *d);
*l = *l | 0x18;
printf("2 bits more set to 1: %llX, float: %f\n", *l, *d);
and your code will work
On a 32-bit build with an older compiler, the integer literal cannot be wider than long (32 bits), so you cannot write:
*l |= 0x10000000000;
You have to create a long long variable and shift it:
long long ll = 1;
ll <<= 40; // 1 shifted left by 40 bits is 0x10000000000
*l |= ll;
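As an aside (a sketch, not from the original answers): a safer way to do these bit manipulations is to copy the bits through memcpy instead of casting pointers, which sidesteps the strict-aliasing trouble of the long long * cast:
#include <cstdio>
#include <cstring>
#include <cstdint>

int main() {
    double d = 15.25;
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits); // copy the raw bits out of the double
    bits |= 0x1F;                        // set the low 5 bits
    std::memcpy(&d, &bits, sizeof d);    // copy the modified bits back
    std::printf("%016llX, %.30f\n", (unsigned long long)bits, d);
    return 0;
}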