Flip bits using XOR 0xffffffff or ~ in C++?

If I want to flip some bits, I was wondering which way is better. Should I flip them using XOR 0xffffffff or by using ~?
I'm afraid that there will be some cases where I might need to pad bits onto the end in one of these ways and not the other, which would make the other way safer to use. I'm wondering if there are times when it's better to use one over the other.
Here is some code that uses both on the same input value, and the output values are always the same.
#include <iostream>
#include <iomanip>

void flipBits(unsigned long value)
{
    const unsigned long ORIGINAL_VALUE = value;
    std::cout << "Original value:" << std::setw(19) << std::hex << value << std::endl;

    value ^= 0xffffffff;
    std::cout << "Value after XOR:" << std::setw(18) << std::hex << value << std::endl;

    value = ORIGINAL_VALUE;
    value = ~value;
    std::cout << "Value after bit negation: " << std::setw(8) << std::hex << value << std::endl << std::endl;
}

int main()
{
    flipBits(0x12345678);
    flipBits(0x11223344);
    flipBits(0xabcdef12);
    flipBits(15);
    flipBits(0xffffffff);
    flipBits(0x0);
    return 0;
}
Output:
Original value: 12345678
Value after XOR: edcba987
Value after bit negation: edcba987
Original value: 11223344
Value after XOR: eeddccbb
Value after bit negation: eeddccbb
Original value: abcdef12
Value after XOR: 543210ed
Value after bit negation: 543210ed
Original value: f
Value after XOR: fffffff0
Value after bit negation: fffffff0
Original value: ffffffff
Value after XOR: 0
Value after bit negation: 0
Original value: 0
Value after XOR: ffffffff
Value after bit negation: ffffffff

Use ~:
You won't be relying on any specific width of the type; for example, int is not 32 bits on all platforms (see the sketch after this list).
It removes the risk of accidentally typing one f too few or too many.
It makes the intent clearer.
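A minimal sketch of the width pitfall behind the first point, assuming a platform where unsigned long is 64 bits wide:

#include <iostream>

int main()
{
    unsigned long v = 0; // assumed 64 bits wide here
    std::cout << std::hex
              << (v ^ 0xffffffff) << "\n" // ffffffff: only the low 32 bits flip
              << ~v << "\n";              // ffffffffffffffff: every bit flips
    return 0;
}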

As you're asking about C++ specifically, simply use std::bitset:
#include <iostream>
#include <iomanip>
#include <bitset>
#include <limits>
void flipBits(unsigned long value) {
    std::bitset<std::numeric_limits<unsigned long>::digits> bits(value);
    std::cout << "Original value : 0x" << std::hex << value;
    value = bits.flip().to_ulong();
    std::cout << ", Value after flip: 0x" << std::hex << value << std::endl;
}
As for your concern that applying the ~ operator to an unsigned long might flip more bits than you actually want: since std::bitset<NumberOfBits> specifies exactly the number of bits to operate on, it solves that problem correctly.
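For instance, a sketch with a width of 8 keeps the flip confined to exactly those bits, regardless of how wide the underlying integer type is:

#include <bitset>
#include <iostream>

int main()
{
    std::bitset<8> b(0x0f);        // 00001111
    std::cout << b.flip() << "\n"; // 11110000, no stray high bits
}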

Related

Bit masking with hexadecimal in C++

I need to mask my binary output with hexadecimal variables. Do I need to convert the binary output to hexadecimal (or the hexadecimal variables to binary)? Or is there any way in C++ to mask them directly and store the result in a new variable?
Edit: The binary output is stored in a std::bitset variable.
The use of bitset wasn't mentioned in your original question; please include such details next time.
You need to create a bitmask for the hex value as well. Then you can just & the bitmasks
#include <bitset>
#include <iostream>

int main()
{
    std::bitset<8> value{ 0x03 };
    std::bitset<8> mask{ 0x01 };
    std::bitset<8> masked_value = value & mask;
    std::cout << value.to_string() << " & " << mask.to_string() << " = " << masked_value.to_string() << "\n";
}
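If you need the numeric value back out of the masked bitset, to_ulong() recovers it; a minimal sketch reusing the values above:

#include <bitset>
#include <iostream>

int main()
{
    std::bitset<8> value{ 0x03 };
    std::bitset<8> mask{ 0x01 };
    // to_ulong() converts the masked bitset back to an integer:
    unsigned long n = (value & mask).to_ulong();
    std::cout << std::hex << "0x" << n << "\n"; // prints 0x1
}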

Output of strtoull() loses precision when converted to double and then back to uint64_t

Consider the following:
#include <iostream>
#include <cstdint>
#include <cstdlib>

int main() {
    std::cout << std::hex
              << "0x" << std::strtoull("0xFFFFFFFFFFFFFFFF", 0, 16) << std::endl
              << "0x" << uint64_t(double(std::strtoull("0xFFFFFFFFFFFFFFFF", 0, 16))) << std::endl
              << "0x" << uint64_t(double(uint64_t(0xFFFFFFFFFFFFFFFF))) << std::endl;
    return 0;
}
Which prints:
0xffffffffffffffff
0x0
0xffffffffffffffff
The first number is just the result of converting ULLONG_MAX from a string to a uint64_t, which works as expected.
However, if I cast the result to double and then back to uint64_t, it prints 0, the second number.
Normally I would attribute this to floating-point precision loss, but what further puzzles me is that if I cast ULLONG_MAX from uint64_t to double and then back to uint64_t, the result is correct (third number).
Why the discrepancy between the second and the third result?
EDIT (by #Radoslaw Cybulski)
For another what-is-going-on-here try this code:
#include <iostream>
#include <cstdint>
#include <cstdlib>
using namespace std;

int main() {
    uint64_t z1 = std::strtoull("0xFFFFFFFFFFFFFFFF", 0, 16);
    uint64_t z2 = 0xFFFFFFFFFFFFFFFFull;
    std::cout << z1 << " " << uint64_t(double(z1)) << "\n";
    std::cout << z2 << " " << uint64_t(double(z2)) << "\n";
    return 0;
}
which happily prints:
18446744073709551615 0
18446744073709551615 18446744073709551615
The number closest to 0xFFFFFFFFFFFFFFFF that is representable by a double (assuming 64-bit IEEE 754) is 18446744073709551616, i.e. 2^64. You'll find that this is a bigger number than 0xFFFFFFFFFFFFFFFF. As such, the number is outside the representable range of uint64_t.
Regarding the conversion back to integer, the standard says (quoting the latest draft):
[conv.fpint]
A prvalue of a floating-point type can be converted to a prvalue of an integer type.
The conversion truncates; that is, the fractional part is discarded.
The behavior is undefined if the truncated value cannot be represented in the destination type.
Why the discrepancy between the second and the third result?
Because the behaviour of the program is undefined.
Although it is mostly pointless to analyse the reasons for differences in UB, because the scope of variation is limitless, my guess at the reason for the discrepancy in this case is that in one case the value is a compile-time constant, while in the other a library function is invoked at runtime.
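One way to sidestep the undefined conversion is to range-check the double before casting back. A minimal sketch, relying only on the fact that 2^64 is exactly representable as a double:

#include <cstdint>
#include <iostream>

int main() {
    const double two64 = 18446744073709551616.0;  // 2^64, exactly representable
    double d = static_cast<double>(UINT64_MAX);   // rounds up to 2^64
    if (d >= two64 || d < 0.0)
        std::cout << "out of uint64_t range\n";   // converting would be UB
    else
        std::cout << static_cast<uint64_t>(d) << "\n";
    return 0;
}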

How does std::string length() function work?

I can't understand why this loop keeps printing "INFINITE". If the string length is 1, how can length()-2 result in a big integer?
for (int i = 0; i < s.length() - 2; i++)
{
    cout << "INFINITE" << endl;
}
std::string::length() returns a size_t, which is an unsigned integer type. You are experiencing unsigned wraparound. In pseudocode:
0 - 1 = SIZE_MAX
In your case specifically it is:
(size_t)1 - 2 = SIZE_MAX
where SIZE_MAX equals 2^32 - 1 or 2^64 - 1, depending on the platform.
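A minimal sketch of two common ways to rewrite the loop so the subtraction never wraps:

#include <cstddef>
#include <iostream>
#include <string>

int main() {
    std::string s = "x"; // length 1, the problematic case
    // Move the arithmetic to the side that cannot go negative:
    for (std::size_t i = 0; i + 2 < s.length(); i++)
        std::cout << "INFINITE" << std::endl; // never executes here
    // Or guard the subtraction explicitly:
    if (s.length() >= 2)
        for (std::size_t i = 0; i < s.length() - 2; i++)
            std::cout << "INFINITE" << std::endl;
}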
std::string::length() returns a std::string::size_type.
std::string::size_type is specified to be the same type as allocator_traits<>::size_type (of the string's allocator).
This is specified to be an unsigned type.
Hence, the number will wrap (defined behaviour) and become huge. Precisely how huge will depend on the architecture.
You can test it on your architecture with this little program:
#include <limits>
#include <iostream>
#include <string>
#include <type_traits>
#include <iomanip>

int main() {
    using size_type = std::string::size_type;
    std::cout << "unsigned : " << std::boolalpha << std::is_unsigned<size_type>::value << std::endl;
    std::cout << "size : " << std::numeric_limits<size_type>::digits << " bits" << std::endl;
    std::cout << "npos : " << std::hex << std::string::npos << std::endl;
}
in the case of apple x64:
unsigned : true
size : 64 bits
npos : ffffffffffffffff

Appending bits in C/C++

I want to append two unsigned 32-bit integers into one 64-bit integer. I have tried this code, but it fails. However, it works for appending two 16-bit integers into one 32-bit integer.
Code:
char buffer[33];
char buffer2[33];
char buffer3[33];
/*
uint16 int1 = 6535;
uint16 int2 = 6532;
uint32 int3;
*/
uint32 int1 = 653545;
uint32 int2 = 562425;
uint64 int3;
int3 = int1;
int3 = (int3 << 32 /*(when I am doing 16 bit integers, this 32 turns into a 16)*/) | int2;
itoa(int1, buffer, 2);
itoa(int2, buffer2, 2);
itoa(int3, buffer3, 2);
std::cout << buffer << "|" << buffer2 << " = \n" << buffer3 << "\n";
Output when the 16-bit portion is enabled:
1100110000111|1100110000100 =
11001100001110001100110000100
Output when the 32-bit portion is enabled:
10011111100011101001|10001001010011111001 =
10001001010011111001
Why is it not working? Thanks
I see nothing wrong with this code. It works for me. If there's a bug, it's in the code that's not shown.
Here is a version of the given code using standardized type declarations and iostream manipulators instead of platform-specific library calls. The bit operations are identical to those in the example given.
#include <iostream>
#include <iomanip>
#include <stdint.h>

int main()
{
    uint32_t int1 = 653545;
    uint32_t int2 = 562425;
    uint64_t int3;

    int3 = int1;
    int3 = (int3 << 32) | int2;

    std::cout << std::hex << std::setw(8) << std::setfill('0')
              << int1 << " "
              << std::setw(8) << std::setfill('0')
              << int2 << "="
              << std::setw(16) << std::setfill('0')
              << int3 << std::endl;
    return 0;
}
Resulting output:
0009f8e9 000894f9=0009f8e9000894f9
The bitwise operations look correct to me. When working with bits, hexadecimal is more convenient than decimal. Any bug, if there is one, is in the code that was not shown in the question. As far as "appending bits in C++" goes, what you have in your code appears to be correct.
Try declaring buffer3 as buffer3[65]; a 64-bit value written in base 2 needs up to 64 characters plus the terminating NUL, so 33 is too small.
Edit:
Sorry, but I don't understand what the complaint is about. The result is actually just as expected, and you can infer it from your own result for the 16-bit input.
itoa takes an int (32 bits on your platform) as its value parameter, so when you pass the 64-bit int3 it is truncated to its low 32 bits before conversion. The upper half, which holds the bits of int1, is discarded, and the leading zeroes of what remains are not printed. That leaves exactly the bits of int2, which is the output you observed.
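If the goal is just a portable binary dump, a sketch using std::bitset avoids itoa entirely; std::bitset's constructor takes an unsigned long long, so the full 64-bit value survives:

#include <bitset>
#include <cstdint>
#include <iostream>

int main()
{
    std::uint32_t int1 = 653545;
    std::uint32_t int2 = 562425;
    std::uint64_t int3 = (static_cast<std::uint64_t>(int1) << 32) | int2;
    // Each bitset prints at its declared width; nothing is narrowed to int:
    std::cout << std::bitset<32>(int1) << "|" << std::bitset<32>(int2)
              << " =\n" << std::bitset<64>(int3) << "\n";
    return 0;
}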

Why is this value printed although being NaN?

The following code assumes that we are on an x86-compatible system and that long double maps to the x87 FPU's 80-bit format.
#include <cmath>
#include <array>
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <iostream>

int main()
{
    std::array<uint8_t,10> data1{0x52,0x23,0x6f,0x24,0x8f,0xac,0xd1,0x43,0x30,0x02};
    std::array<uint8_t,10> data2{0x52,0x23,0x6f,0x24,0x8f,0xac,0xd1,0xc3,0x30,0x02};
    std::array<uint8_t,10> data3{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x80,0x30,0x02};
    long double value1, value2, value3;
    static_assert(sizeof value1 >= 10, "Expected float80");
    // Copy only the 10 bytes of the float80 representation; copying
    // sizeof value1 bytes would read past the end of the arrays.
    std::memcpy(&value1, data1.data(), data1.size());
    std::memcpy(&value2, data2.data(), data2.size());
    std::memcpy(&value3, data3.data(), data3.size());
    std::cout << "isnan(value1): " << std::boolalpha << std::isnan(value1) << "\n";
    std::cout << "isnan(value2): " << std::boolalpha << std::isnan(value2) << "\n";
    std::cout << "isnan(value3): " << std::boolalpha << std::isnan(value3) << "\n";
    std::cout << "value1: " << std::setprecision(20) << value1 << "\n";
    std::cout << "value2: " << std::setprecision(20) << value2 << "\n";
    std::cout << "value3: " << std::setprecision(20) << value3 << "\n";
}
Output:
isnan(value1): true
isnan(value2): false
isnan(value3): false
value1: 3.3614005946481929011e-4764
value2: 9.7056260598879139386e-4764
value3: 6.3442254652397210376e-4764
Here value1 is classified as "unsupported" by the 80387 and later x87 FPUs, because it has a nonzero, not-all-ones exponent combined with a cleared integer bit; it is in fact an "unnormal". And isnan works as expected with it: the value is indeed not a valid number (although not exactly a NaN). The second value, value2, has the integer bit set, and also works as expected: it's not a NaN. The third one, value3, is the value of the missing integer bit alone.
But somehow both value1 and value2 get printed, and the values differ exactly by the missing integer bit! Why is that? All other methods I tried, like printf and to_string, give just 0.00000.
Even stranger, if I do any arithmetic with value1, subsequent prints do give nan. Taking this into account, how does operator<<(long double) even manage to print anything but nan? Does it explicitly set the integer bit, or does it perhaps parse the number instead of doing any FPU arithmetic on it? (Assuming g++ 4.8 on 32-bit Linux.)
All other methods I tried, like printf and to_string, give just 0.00000.
What operator<<(long double) actually does is use the num_put<> facet from the locale library to perform the numeric formatting, which in turn uses one of the printf-family conversions (see sections 27.7.3.6 and 22.4.2.2 of the C++ standard).
Depending on the settings, the printf conversion specifier the locale uses for long double might be any of %Lf, %Le, %LE, %La, %LA, %Lg or %LG.
In your (and my) case it seems to be %Lg:
printf("value1: %.20Lf\n", value1);
printf("value1: %.20Le\n", value1);
printf("value1: %.20La\n", value1);
printf("value1: %.20Lg\n", value1);
std::cout << "value1: " << std::setprecision(20) << value1 << "\n";
value1: 0.00000000000000000000
value1: 3.36140059464819290106e-4764
value1: 0x4.3d1ac8f246f235200000p-15826
value1: 3.3614005946481929011e-4764
value1: 3.3614005946481929011e-4764
Taking this into account, how does operator<<(long double) even manage to actually print anything but nan? Does it explicitly set the integer bit, or maybe it parses the number instead of doing any FPU arithmetic on it?
It prints the unnormalized value.
Conversion from the binary to the decimal floating-point representation used by printf() may be performed without any FPU arithmetic. You can find the glibc implementation in the stdio-common/printf_fp.c source file.
I was trying this:
long double value = std::numeric_limits<long double>::quiet_NaN();
std::cout << "isnan(value): " << std::boolalpha << std::isnan(value) << "\n";
std::cout << "value: " << std::setprecision(20) << value << "\n";
So my assumption is that, as stated here: http://en.cppreference.com/w/cpp/numeric/math/isnan, the value is being cast to double rather than long double when evaluated by std::isnan, and strictly:
std::numeric_limits<long double>::quiet_NaN() != std::numeric_limits<double>::quiet_NaN()