Question
What is the best way to convert binary data to its integral representation?
Context
Let's imagine that we have a buffer containing binary data obtained from an external source such as a socket connection or a binary file. The data is organised in a well defined format, and we know that the first four octets represent a single unsigned 32-bit integer (which could be the size of the following data). What would be the most efficient way to convert those octets to a usable format (such as std::uint32_t)?
Example
Here is what I have tried so far:
#include <algorithm>
#include <array>
#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    std::array<char, 4> buffer = { 0x01, 0x02, 0x03, 0x04 };
    std::uint32_t n = 0;

    n |= static_cast<std::uint32_t>(buffer[0]);
    n |= static_cast<std::uint32_t>(buffer[1]) << 8;
    n |= static_cast<std::uint32_t>(buffer[2]) << 16;
    n |= static_cast<std::uint32_t>(buffer[3]) << 24;
    std::cout << "Bit shifting: " << n << "\n";

    n = 0;
    std::memcpy(&n, buffer.data(), buffer.size());
    std::cout << "std::memcpy(): " << n << "\n";

    n = 0;
    std::copy(buffer.begin(), buffer.end(), reinterpret_cast<char*>(&n));
    std::cout << "std::copy(): " << n << "\n";
}
On my system, the output of the above program is
Bit shifting: 67305985
std::memcpy(): 67305985
std::copy(): 67305985
Are they all standard compliant or are they relying on implementation-defined behaviour?
Which one is the most efficient?
Is there a better way to perform that conversion?
You are essentially asking about endianness. While your program might work on one computer, it might not on another. If the "well defined format" is network order, there is a standard set of macros/functions to convert between network order and the natural order for your specific machine.
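For illustration, here is a minimal sketch of reading four network-order octets with ntohl(), assuming a POSIX system; the header and example buffer are my own, not from the question:

#include <arpa/inet.h> // ntohl() on POSIX; on Windows use <winsock2.h> instead
#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    // Assume these four octets arrived in network (big-endian) order.
    unsigned char buffer[4] = { 0x01, 0x02, 0x03, 0x04 };

    std::uint32_t n;
    std::memcpy(&n, buffer, sizeof(n)); // copy the raw octets into the integer
    n = ntohl(n);                       // convert from network order to host order

    std::cout << n << "\n";             // prints 16909060, i.e. 0x01020304
}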
Related
Using MS Visual Studio 2022, I am trying to pack two items into a union of size 16 bits, but I am having problems with the correct syntax.
The first item is an unsigned short int so no problems there. The other is an array of 5 items, all two bits long. So imagine:
enum States {unused, on, off};
// Should be able to store this in a 2 bit field
then I want
States myArray[5];
// Should be able to fit in 10 bits and
// be unioned with my unsigned short
Unfortunately, I am completely failing to work out the correct syntax that would make my array fit into 16 bits. Any ideas?
You can't do that. An array is an array, not some packed bits.
What you can do is use manual bit manipulation:
#include <iostream>
#include <cstdint>
#include <bitset>
#include <climits>

enum status {
    on     = 0x03,
    off    = 0x01,
    unused = 0x00
};

constexpr std::uint8_t status_bit_width = 2;

// Stores status s in the 2-bit slot number id of status_vec and returns the result.
std::uint16_t encode(status s, std::uint8_t id, std::uint16_t status_vec) {
    if (id >= (CHAR_BIT * sizeof(std::uint16_t)) / status_bit_width) {
        std::cout << "illegal id" << std::endl;
        return status_vec;
    }
    std::uint8_t bit_value = s;
    status_vec |= bit_value << (id * status_bit_width);
    return status_vec;
}

int main(void) {
    std::uint16_t bits = 0;
    bits = encode(on, 1, bits);
    std::cout << std::bitset<16>(bits) << std::endl;
    bits = encode(off, 2, bits);
    std::cout << std::bitset<16>(bits) << std::endl;
    bits = encode(unused, 3, bits);
    std::cout << std::bitset<16>(bits) << std::endl;
    bits = encode(off, 4, bits);
    std::cout << std::bitset<16>(bits) << std::endl;
    bits = encode(off, 7, bits);
    std::cout << std::bitset<16>(bits) << std::endl;
    bits = encode(on, 8, bits); // id 8 is out of range, so this prints "illegal id"
}
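For completeness, a possible decode counterpart (my own sketch, not part of the answer above; it reuses the status enum and status_bit_width constant defined there):

// Extracts the 2-bit status stored in slot id of status_vec.
status decode(std::uint8_t id, std::uint16_t status_vec) {
    if (id >= (CHAR_BIT * sizeof(std::uint16_t)) / status_bit_width) {
        std::cout << "illegal id" << std::endl;
        return unused;
    }
    return static_cast<status>((status_vec >> (id * status_bit_width)) & 0x03);
}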
I need to mask my binary output with hexadecimal variables. Do I need to convert the binary output to hexadecimal (or the hexadecimal variables to binary)? Or is there any way in C++ to mask them directly and store the result in a new variable?
Edit: The binary output is stored in a std::bitset variable.
The use of std::bitset wasn't mentioned in your question; please include that next time.
You need to create a bitmask for the hex value as well. Then you can just & the bitmasks:
#include <bitset>
#include <iostream>

int main()
{
    std::bitset<8> value{ 0x03 };
    std::bitset<8> mask{ 0x01 };
    std::bitset<8> masked_value = value & mask;
    std::cout << value.to_string() << " & " << mask.to_string() << " = " << masked_value.to_string() << "\n";
}
Consider the following code for integral types:
#include <bitset>
#include <iostream>
#include <string>

template <class T>
std::string as_binary_string( T value ) {
    return std::bitset<sizeof( T ) * 8>( value ).to_string();
}

int main() {
    unsigned char a(2);
    char b(4);
    unsigned short c(2);
    short d(4);
    unsigned int e(2);
    int f(4);
    unsigned long long g(2);
    long long h(4);

    std::cout << "a = " << +a << " " << as_binary_string( a ) << std::endl;
    std::cout << "b = " << +b << " " << as_binary_string( b ) << std::endl;
    std::cout << "c = " << c << " " << as_binary_string( c ) << std::endl;
    std::cout << "d = " << d << " " << as_binary_string( d ) << std::endl;
    std::cout << "e = " << e << " " << as_binary_string( e ) << std::endl;
    std::cout << "f = " << f << " " << as_binary_string( f ) << std::endl;
    std::cout << "g = " << g << " " << as_binary_string( g ) << std::endl;
    std::cout << "h = " << h << " " << as_binary_string( h ) << std::endl;

    std::cout << "\nPress any key and enter to quit.\n";
    char q;
    std::cin >> q;
    return 0;
}
Pretty straightforward; it works well and is quite simple.
EDIT
How would one go about writing a function to extract the binary or bit pattern of arbitrary floating point types at compile time?
When it comes to floats, I have not found anything similar in any existing libraries that I know of. I've searched Google for days looking for one, and then resorted to trying to write my own function, without any success. I no longer have the attempted code available since I originally asked this question, so I cannot show you all of the different attempted implementations along with their compiler/build errors. I was interested in generating the bit pattern for floats in a generic way at compile time and wanted to integrate that into my existing class that seamlessly does the same for any integral type. As for the floating-point types themselves, I have taken into consideration the different formats as well as architecture endianness. For my general purposes, the standard IEEE versions of the floating-point types are all that I should need to be concerned with.
iBug suggested that I write my own function when I originally asked this question, which I was already attempting to do. I understand binary numbers, memory sizes, and the mathematics, but putting it all together with how floating-point types are stored in memory, with their different parts {sign bit, exponent & mantissa}, is where I was having the most trouble.
Since then, with the suggestions of those who gave a great answer and example, I was able to write a function that fits nicely into my already existing class template, and it now works for my intended purposes.
What about writing one yourself?
#include <bitset>
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(float) == sizeof(std::uint32_t));
static_assert(sizeof(double) == sizeof(std::uint64_t));

std::string as_binary_string( float value ) {
    std::uint32_t t;
    std::memcpy(&t, &value, sizeof(value));
    return std::bitset<sizeof(float) * 8>(t).to_string();
}

std::string as_binary_string( double value ) {
    std::uint64_t t;
    std::memcpy(&t, &value, sizeof(value));
    return std::bitset<sizeof(double) * 8>(t).to_string();
}
You may need to change the helper variable t in case the sizes of the floating-point types are different on your platform.
You can alternatively copy them bit by bit. This is slower but works for any type.
#include <bitset>
#include <climits>
#include <cstdint>
#include <cstring>
#include <string>

template <typename T>
std::string as_binary_string( T value )
{
    const std::size_t nbytes = sizeof(T), nbits = nbytes * CHAR_BIT;
    std::bitset<nbits> b;
    std::uint8_t buf[nbytes];
    std::memcpy(buf, &value, nbytes);

    for(std::size_t i = 0; i < nbytes; ++i)
    {
        std::uint8_t cur = buf[i];
        std::size_t offset = i * CHAR_BIT;

        for(int bit = 0; bit < CHAR_BIT; ++bit)
        {
            b[offset] = cur & 1;
            ++offset;  // Move to the next bit in b
            cur >>= 1; // Move to the next bit in the current byte
        }
    }
    return b.to_string();
}
You said it doesn't need to be standard. So, here is what works in clang on my computer:
#include <iostream>
#include <algorithm>
using namespace std;

int main()
{
    char *result;
    result = new char[33];
    fill(result, result + 32, '0');
    float input;
    cin >> input;
    asm(
        "mov %0,%%eax\n"
        "mov %1,%%rbx\n"
        ".intel_syntax\n"
        "mov rcx,20h\n"
        "loop_begin:\n"
        "shr eax\n"
        "jnc loop_end\n"
        "inc byte ptr [rbx+rcx-1]\n"
        "loop_end:\n"
        "loop loop_begin\n"
        ".att_syntax\n"
        :
        : "m" (input), "m" (result)
        : "eax", "rbx", "rcx", "cc", "memory" // tell the compiler which registers and state the asm clobbers
    );
    cout << result << endl;
    delete[] result;
    return 0;
}
This code makes a bunch of assumptions about the computer architecture, and I am not sure how many computers it would work on.
EDIT:
My computer is a 64-bit Mac-Air. This program basically works by allocating a 33-byte string and filling the first 32 bytes with '0' (the 33rd byte will automatically be '\0').
Then it uses inline assembly to store the float into a 32-bit register and then it repeatedly shifts it to the right by one bit.
If the last bit in the register was 1 before the shift, it gets stored into the carry flag.
The assembly code then checks the carry flag and, if it contains 1, it increases the corresponding byte in the string by 1.
Since it was previously initialized to '0', it will turn to '1'.
So, effectively, when the loop in the assembly is finished, the binary representation of a float is stored into a string.
This code only works for x64 (it uses 64-bit registers "rbx" and "rcx" to store the pointer and the counter for the loop), but I think it's easy to tweak it to work on other processors.
An IEEE double-precision floating point number looks like the following:

sign    exponent    mantissa
1 bit   11 bits     52 bits

Note that there's a hidden 1 before the mantissa, and the exponent is biased so 1023 = 0, not two's complement.
By memcpy()ing to a 64-bit unsigned integer you can then apply AND and OR masks to get the bit pattern. The arrangement could be big endian or little endian.
You can easily work out which arrangement you have by passing easy numbers such as 1 or 2.
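As a rough illustration of the masking approach (my own sketch, assuming IEEE-754 doubles where the integer and floating-point byte orders match, which is the usual case):

#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    double value = 2.0;                        // an easy number, as suggested above
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));  // copy the object representation

    std::uint64_t sign     = bits >> 63;                 // 1 bit
    std::uint64_t exponent = (bits >> 52) & 0x7FF;       // 11 bits, biased by 1023
    std::uint64_t mantissa = bits & 0xFFFFFFFFFFFFFull;  // 52 bits, hidden leading 1

    std::cout << "sign = " << sign
              << ", exponent = " << exponent              // 1024, i.e. 1023 + 1
              << ", mantissa = " << mantissa << "\n";     // 0 (the hidden 1 is implicit)
}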
Generally people either use std::hexfloat or cast a pointer to the floating-point value to a pointer to an unsigned integer of the same size and print the indirected value in hex format. Both methods facilitate bit-level analysis of floating-point in a productive fashion.
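For example, the std::hexfloat route (a quick sketch, value chosen arbitrarily):

#include <iostream>

int main()
{
    double d = 0.1;
    // hexfloat shows the exact significand and binary exponent
    std::cout << std::hexfloat << d << "\n"; // e.g. 0x1.999999999999ap-4
}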
You could roll your own by casting the address of the float/double to a char pointer and iterating over it that way:
#include <memory>
#include <iostream>
#include <limits>
#include <iomanip>
#include <string>

template <typename T>
std::string getBits(T t) {
    std::string returnString{""};
    char *base{reinterpret_cast<char *>(std::addressof(t))};
    char *tail{base + sizeof(t) - 1};
    do {
        for (int bits = std::numeric_limits<unsigned char>::digits - 1; bits >= 0; bits--) {
            returnString += ( ((*tail) & (1 << bits)) ? '1' : '0');
        }
    } while (--tail >= base);
    return returnString;
}

int main() {
    float f{10.0};
    double d{100.0};
    double nd{-100.0};
    std::cout << std::setprecision(1);
    std::cout << getBits(f) << std::endl;
    std::cout << getBits(d) << std::endl;
    std::cout << getBits(nd) << std::endl;
}
Output on my machine (note the sign flip in the third output):
01000001001000000000000000000000
0100000001011001000000000000000000000000000000000000000000000000
1100000001011001000000000000000000000000000000000000000000000000
I want to append two unsigned 32-bit integers into one 64-bit integer. I have tried this code, but it fails. However, it works for appending two 16-bit integers into one 32-bit integer.
Code:
char buffer[33];
char buffer2[33];
char buffer3[33];
/*
uint16 int1 = 6535;
uint16 int2 = 6532;
uint32 int3;
*/
uint32 int1 = 653545;
uint32 int2 = 562425;
uint64 int3;
int3 = int1;
int3 = (int3 << 32 /*(when I am doing 16 bit integers, this 32 turns into a 16)*/) | int2;
itoa(int1, buffer, 2);
itoa(int2, buffer2, 2);
itoa(int3, buffer3, 2);
std::cout << buffer << "|" << buffer2 << " = \n" << buffer3 << "\n";
Output when the 16-bit portion is enabled:
1100110000111|1100110000100 =
11001100001110001100110000100
Output when the 32-bit portion is enabled:
10011111100011101001|10001001010011111001 =
10001001010011111001
Why is it not working? Thanks
I see nothing wrong with this code. It works for me. If there's a bug, it's in the code that's not shown.
Here is a version of the given code, using standard type declarations and iostream manipulators instead of platform-specific library calls. The bit operations are identical to the example given.
#include <iostream>
#include <iomanip>
#include <stdint.h>

int main()
{
    uint32_t int1 = 653545;
    uint32_t int2 = 562425;
    uint64_t int3;

    int3 = int1;
    int3 = (int3 << 32) | int2;

    std::cout << std::hex << std::setw(8) << std::setfill('0')
              << int1 << " "
              << std::setw(8) << std::setfill('0')
              << int2 << "="
              << std::setw(16) << std::setfill('0')
              << int3 << std::endl;
    return 0;
}
Resulting output:
0009f8e9 000894f9=0009f8e9000894f9
The bitwise operation looks correct to me. When working with bits, hexadecimal is more convenient. Any bug, if there is one, is in the code that was not shown in the question. As far as "appending bits in C++" goes, what you have in your code appears to be correct.
Try declaring buffer3 as buffer3[65]: a 64-bit value needs up to 64 binary digits plus the terminating null character.
Edit:
Sorry, but I don't understand what the complaint is about. In fact the result is exactly as expected, and you can infer it from your own output for the 16-bit input.
The bit operations themselves are fine. The problem is that itoa takes a plain int (32 bits here), so when you pass the 64-bit int3 it is truncated to its low 32 bits, which hold only int2. That is why the third string printed is just the bit pattern of int2.
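If you just want the binary strings without itoa's limitations, a portable alternative (my own sketch, not from the answers above) is std::bitset, which keeps the full 64-bit width:

#include <bitset>
#include <cstdint>
#include <iostream>

int main()
{
    std::uint32_t int1 = 653545;
    std::uint32_t int2 = 562425;
    std::uint64_t int3 = (static_cast<std::uint64_t>(int1) << 32) | int2;

    // std::bitset prints all 64 bits, so nothing is truncated to 32 bits
    std::cout << std::bitset<32>(int1) << "|" << std::bitset<32>(int2) << " =\n"
              << std::bitset<64>(int3) << "\n";
}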
I am trying to write a C++ application to send a 64-bit word to an Arduino.
I set up the serial port with termios using the method described here.
The problem I am having is that the bytes are arriving at the Arduino least significant byte first.
i.e.
if I use (where serialWord is a uint64_t)
write(fp, (const void*)&serialWord, 8);
the least significant bytes arrive at the Arduino first.
This is not the behaviour I wanted. Is there a way to get the most significant bytes to arrive first? Or is it best to break serialWord into bytes and send it byte by byte?
Thanks
Since the endianness of the CPUs involved differs, you will need to reverse the order of the bytes either before you send them or after you receive them. In this case I would recommend reversing them before you send them, just to save CPU cycles on the Arduino. The simplest way using the C++ Standard Library is with std::reverse, as shown in the following example:
#include <cstdint>   // uint64_t (example only)
#include <iostream>  // cout (example only)
#include <algorithm> // std::reverse

int main()
{
    uint64_t value = 0x1122334455667788;

    std::cout << "Before: " << std::hex << value << std::endl;

    // swap the bytes
    std::reverse(
        reinterpret_cast<char*>(&value),
        reinterpret_cast<char*>(&value) + sizeof(value));

    std::cout << "After: " << std::hex << value << std::endl;
}
This outputs the following:
Before: 1122334455667788
After: 8877665544332211
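If you prefer the byte-by-byte approach mentioned in the question, here is a sketch that works regardless of the host's byte order (my own illustration; it assumes a POSIX write() and an already-open descriptor fd):

#include <cstdint>
#include <unistd.h> // write() on POSIX systems

// Hypothetical helper: sends the eight bytes of a 64-bit word
// most significant byte first, independent of host byte order.
void send_big_endian(int fd, std::uint64_t word)
{
    unsigned char bytes[8];
    for (int i = 0; i < 8; ++i)
        bytes[i] = static_cast<unsigned char>(word >> (8 * (7 - i))); // MSB goes out first
    write(fd, bytes, sizeof(bytes));
}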