Conversion of integers into binary in C++

As we know, each value is stored in binary form in memory. So, in C++, will these two values have different binary representations when stored in memory?
unsigned int a = 90;
signed int b = 90;

So, in C++, will these two values have different binary representations when stored in memory?
The C++ language doesn't specify whether they do. Ultimately, the binary representation is dictated by the hardware, so the answer technically depends on that.
That said, I haven't encountered a hardware and C++ implementation where identically valued signed and unsigned variants of an integer didn't have identical binary representations. As such, I would find it surprising if the binary representations were different.
Sidenote: Since "byte" is the smallest addressable unit of memory in C++, there isn't a way in the language to observe a directional order of individual bits in memory.

Consider the value 63. In binary it is 111111 and in hex it is 3f.
Because char is special in C++, and any object can be viewed as a sequence of bytes, you can directly look at the binary representation:
#include <iostream>
#include <iomanip>

int main()
{
    unsigned int a = 63;
    signed int b = 63;
    std::cout << std::hex;

    char* a_bin = reinterpret_cast<char*>(&a);
    for (int i = 0; i < sizeof(unsigned int); ++i)
        std::cout << std::setw(4) << std::setfill('0') << static_cast<unsigned>(*(a_bin + i)) << " ";
    std::cout << "\n";

    char* b_bin = reinterpret_cast<char*>(&b);
    for (int i = 0; i < sizeof(signed int); ++i)
        std::cout << std::setw(4) << std::setfill('0') << static_cast<unsigned>(*(b_bin + i)) << " ";
}
Unfortunately, there is no std::bin io-manipulator, so I used std::hex (it is sticky). The reinterpret_cast is ok because of the aforementioned special rules for char. Because std::cout has a special overload that prints char as a character, and we want to see numerical values, another cast to unsigned is needed. The output of the above is:
003f 0000 0000 0000
003f 0000 0000 0000
As already mentioned in a comment, the byte order is implementation-defined. Moreover, I have to admit that I am not aware of the exact details of what the standard has to say about this. Be careful with assumptions about byte representation, especially when transferring objects between two programs or over a wire. You would typically use some form of de-/serialization, so that you are in control of the byte representations to be transferred.
TL;DR: Typically the representations are the same, but in general you need to carefully consider what the C++ standard mandates, and I am not aware of signed and unsigned being guaranteed to have the same byte representation.
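For illustration, here is a minimal sketch of the kind of serialization meant above: the byte order is fixed by the code (big-endian here), so it no longer depends on how the machine happens to store the value. The helper name is mine, not anything the standard provides:

#include <array>
#include <cstdint>

// Hypothetical helper: encode a 32-bit value most-significant byte first,
// so the wire format does not depend on the host's byte order.
std::array<unsigned char, 4> to_big_endian(std::uint32_t value)
{
    return {
        static_cast<unsigned char>(value >> 24),
        static_cast<unsigned char>(value >> 16),
        static_cast<unsigned char>(value >> 8),
        static_cast<unsigned char>(value)
    };
}

For example, to_big_endian(90) yields the bytes 00 00 00 5a on every platform, regardless of how an unsigned int holding 90 is laid out in memory.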

Related

Is there any char data type alternative ( 1-byte value ) to represent numeric values?

I've this issue:
#include <iostream>

int main() {
    unsigned char little_number_from_0_to_255 = 0;
    std::cout << "How old are you (for example)? _";
    std::cin >> little_number_from_0_to_255;
    std::cout << "You are " << little_number_from_0_to_255 << " year/s old.";
}
In brief, even if in this case it wouldn't matter much, I'd like to avoid wasting 1 byte (compared to a short int type) for each variable I need to store when its value is so tiny, but in all my attempts the un/signed char variable is always interpreted as the ASCII representation of the first digit of the number I type in.
Is there in C++ any good way to have a 1-byte data type whose value can only be numeric (with all the arithmetic etc., like un/signed char) but without the automatic ASCII interpretation (unlike un/signed char)?
If not, how could I "circumvent" the problem?
Is there a way to avoid that, even if I say that I'm 61, the variable ends up holding 54 (decimal) and I realize that I'm a liar because in reality I'm only 6?
A way to tell the computer that it has to grab the whole number rather than look only at the first input character?
-- I've already read about putting '+' just before the char variable, but that only works for output, which isn't my case.
Is there in C++ any good way to have a 1-byte data type whose value can only be numeric (with all the arithmetic etc., like un/signed char) but without the automatic ASCII interpretation (unlike un/signed char)?
You could define a wrapper class that stores a single byte integer type such as std::int8_t as a member, and implicitly converts to a non-character integer type such as int. Whether this is "good" is subjective and depends on use case.
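For illustration, a minimal sketch of such a wrapper (the name tiny_int and its interface are mine, not a standard facility): it stores a std::int8_t but converts to int, so streams and arithmetic treat it as a number rather than a character:

#include <cstdint>
#include <iostream>

// Hypothetical 1-byte numeric type: stored as std::int8_t,
// printed and computed with as an int.
struct tiny_int {
    std::int8_t value = 0;

    tiny_int() = default;
    tiny_int(int v) : value(static_cast<std::int8_t>(v)) {}

    operator int() const { return value; }  // implicit widening for arithmetic and output
};

int main() {
    tiny_int age = 61;
    std::cout << age << '\n';          // prints 61, not a character
    std::cout << sizeof(age) << '\n';  // 1
}

Note that reading from std::cin into it would still need an intermediate non-character integer, as shown below.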
If not, how could I "circumvent" the problem?
You can use an intermediate variable of non-character integer type when dealing with character streams:
#include <cstdint>
#include <iostream>

int main() {
    std::uint8_t little_number_from_0_to_255 = 0;

    // input: read into a non-character type, then narrow
    unsigned input;
    std::cout << "How old are you (for example)? _";
    std::cin >> input;
    little_number_from_0_to_255 = input;

    // output: widen back to a non-character type before printing
    unsigned output = little_number_from_0_to_255;
    std::cout << "You are " << output << " year/s old.";
}

Bit representation of float using an int pointer

I have the following exercise:
Implement a function void float_to_bits(float x) which prints the bit representation of x. Hint: Casting a float to an int truncates the fractional part, but no information is lost casting a float pointer to an int pointer.
Now, I know that a float is represented by a sign-bit, some bits for its mantissa, some bits for the basis and some bits for the exponent. It depends on my system how many bits are used.
The problem we are facing here is that our number basically has two parts. Let's consider 8.7; the bit representation of this number would be (to my understanding) the following: 1000.0111
Now, floats are stored with a leading zero, so 8.8 would become 0.88*10^1.
So I somehow have to get all the information out of my memory. I don't really see how I should do that. What should that hint point me to? What's the difference between an integer pointer and a float pointer?
Currently I have this:
void float_to_bits() {
    float a = 4.2345678f;
    int* b;
    b = (int*)(&a);
    *b = a;
    std::cout << *(b) << "\n";
}
But I really don't get the bigger picture behind the hint here. How do I get the mantissa, the exponent, the sign and the basis? I also tried playing around with the bit-wise operators >> and <<, but I just don't see how they should help me here, since they won't change the pointer's position. They're useful to get e.g. the bit representation of an integer, but that's about it; I have no idea what use they'd be here.
The hint your teacher gave is misleading: casting pointers between different types is at best implementation-defined. However, memcpy()ing an object to a suitably sized array of unsigned char is defined. The content of the resulting array can then be decomposed into bits. Here is a quick hack to represent the bits using hexadecimal values:
#include <iostream>
#include <iomanip>
#include <cstring>

int main() {
    float f = 8.7;
    unsigned char bytes[sizeof(float)];
    std::memcpy(bytes, &f, sizeof(float));
    std::cout << std::hex << std::setfill('0');
    for (int b: bytes) {
        std::cout << std::setw(2) << b;
    }
    std::cout << '\n';
}
Note that IEEE 754 binary floating points do not store the full significand (the standard doesn't use mantissa as a term) except for denormalized values: the 32-bit floats store
1 bit for the sign
8 bits for the exponent
23 bits for the normalized significand with the non-zero high bit being implied
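Putting the two parts together, here is a sketch (my own, assuming the common 32-bit IEEE 754 float) that copies the object representation into a std::uint32_t with std::memcpy and masks out the three fields:

#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    float f = 8.7f;
    std::uint32_t bits;
    static_assert(sizeof(f) == sizeof(bits), "assumes a 32-bit float");
    std::memcpy(&bits, &f, sizeof(bits));        // well defined, unlike the pointer cast

    std::uint32_t sign        = bits >> 31;           // 1 bit
    std::uint32_t exponent    = (bits >> 23) & 0xffu; // 8 bits, biased by 127
    std::uint32_t significand = bits & 0x7fffffu;     // 23 stored bits, leading 1 implied

    std::cout << sign << ' ' << exponent << ' ' << std::hex << significand << '\n';
}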
The hint directs you to get the float into an integer without going through a value conversion.
When you assign a floating-point value to an integer, the processor removes the fractional part: int i = (int) 4.502f; will result in i = 4. But when you make an int pointer (int*) point to a float's location, no conversion is made, including when you read the value through the int*.
To show the representation, I like seeing hex numbers; that's why my first example was given in hex (each hexadecimal digit represents 4 binary digits). But it is also possible to print as binary, and there are many ways (I like this one best!).
An annotated example follows:
#include <iostream>
#include <bitset>
using namespace std;

int main()
{
    float a = 4.2345678f;    // allocate space for a float, call it 'a', and put the value 4.2345678f in it
    unsigned int* b;         // allocate space for a pointer (address), call it 'b'; hint to the compiler that it will point to an integer
    b = (unsigned int*)(&a); // GREAT, exactly what you needed! Take the float 'a' and get its address with '&'.
                             // By default that is an address pointing at a float (float*), so you correctly cast it to (unsigned int*).
                             // Bottom line: set 'b' to the address of 'a', but treat that address as the address of an int.
                             // The hint implied that this won't cause a value conversion:
                             //   int someInt = a;  // would cause someInt = 4, and the same goes for your line below:
                             //   *b = a;           // <<<< this was your error.
                             // 1st, it isn't required, as 'b' already points to a's address and hence sees its value.
                             // 2nd, it would set the value pointed to by 'b' to 'a' (including conversion to int = 4);
                             //      the value in 'a' would actually change too by this instruction.
    cout << a << " in binary " << bitset<32>(*b) << endl;
    cout << "Sign " << bitset<1>(*b >> 31) << endl;  // 1 bit  (31)
    cout << "Exp " << bitset<8>(*b >> 23) << endl;   // 8 bits (23-30)
    cout << "Mantissa " << bitset<23>(*b) << endl;   // 23 bits (0-22)
}

reinterpret_cast swaps bits?

I was testing a simple compiler when I noticed that its output was completely wrong. In fact, the output had its endianness swapped from little to big. Upon closer examination, the offending code turned out to be this:
const char *bp = reinterpret_cast<const char*>(&command._instruction);
for (int i = 0; i < 4; ++i)
    out << bp[i];
A four-byte instruction is reinterpreted as a set of one-byte characters and printed to stdout (it's clunky, yes, but that decision was not mine). It doesn't seem logical to me why the bytes would be swapped, since the char pointer should be pointing to the most-significant (on this x86 system) byte at first. For example, given 0x00...04, the char pointer should point to 0x00, not 0x04. The latter is what happens.
I have created a simple demonstration of code:
CODE
#include <bitset>
#include <iostream>
#include <stdint.h>

int main()
{
    int32_t foo = 4;
    int8_t* cursor = reinterpret_cast<int8_t*>(&foo);

    std::cout << "Using a moving 8-bit pointer:" << std::endl;
    for (int i = 0; i < 4; ++i)
        std::cout << std::bitset<8>(cursor[i]) << " "; // <-- why?

    std::cout << std::endl << "Using original 4-byte int:" << std::endl;
    std::cout << std::bitset<32>(foo) << std::endl;
    return 0;
}
Output:
Using a moving 8-bit pointer:
00000100 00000000 00000000 00000000
Using original 4-byte int:
00000000000000000000000000000100
It doesn't seem logical to me why the bytes would be swapped, since the char pointer should be pointing to the most-significant (on this x86 system) byte at first.
On an x86 system, a pointer to the base of a multi-byte object does not point at the most significant byte, but at the least-significant byte. This is called "little endian" byte order.
In C, if we take the address of an object that occupies multiple bytes and convert that to char *, it points to the base of the object: the byte at the lowest address, from which the pointer can be positively displaced (with + or ++ etc.) to get to the other bytes.
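If the goal is to see the bytes most-significant first on any machine, one option (a sketch, not the only way) is to derive each byte from the value with shifts instead of walking through memory, so the output no longer depends on the byte order:

#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    std::int32_t foo = 4;
    // Print the most significant byte first by shifting the value itself;
    // the result is the same on little-endian and big-endian machines.
    for (int i = 3; i >= 0; --i)
        std::cout << std::bitset<8>((foo >> (i * 8)) & 0xff) << ' ';
    std::cout << '\n';
}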

Binary representation of a double

I was bored and wanted to see what the binary representation of doubles looked like. However, I noticed something weird on Windows. The following lines of code demonstrate it:
double number = 1;
unsigned long num = *(unsigned long *) &number;
cout << num << endl;
On my MacBook, this gives me a nonzero number. On my Windows machine it gives me 0.
I was expecting that it would give me a nonzero number, since the binary representation of 1.0 as a double should not be all zeros. However, I am not really sure whether what I am trying to do is well-defined behavior.
My question is, is the code above just stupid and wrong? And, is there a way I can print out the binary representation of a double?
Thanks.
The double 1.0 is 3ff0 0000 0000 0000. On Windows, long is a 4-byte int, so on little-endian hardware you're reading the 0000 0000 part.
If your compiler supports it (GCC does), then use a union. This is undefined behavior according to the C++ standard (reading a union member other than the one last written):
#include <iostream>

int main() {
    union {
        unsigned long long num;
        double fp;
    } pun;
    pun.fp = 1.0;
    std::cout << std::hex << pun.num << std::endl;
}
The output is
3ff0000000000000
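A fully defined alternative (a sketch along the same lines) is to copy the object representation into a std::uint64_t with std::memcpy instead of punning through the union:

#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    double number = 1.0;
    std::uint64_t bits;
    static_assert(sizeof(number) == sizeof(bits), "assumes a 64-bit double");
    std::memcpy(&bits, &number, sizeof(bits));  // well defined in C++
    std::cout << std::hex << bits << std::endl; // 3ff0000000000000
}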

Is std::bitset bit-order portable?

Does C++ say anything on bit-ordering? I'm especially working on protocol packet layouts, and I'm doubting whether there is a portable way to specify that a certain number be written into bits 5,6,7, where bit 5 is the 'most significant'.
My questions:
is 0x01 always represented as a byte with bit 7 set?
is bitset<8>().set(7).to_ulong() always equal to 1?
From 20.5/3 (ISO/IEC 14882:2011)
When converting between an object
of class bitset and a value of some integral type, bit position pos corresponds to the bit value 1 << pos.
That is, bitset<8>().set(7).to_ulong() is guaranteed to be (1 << 7) == 128.
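A quick sketch of that correspondence, using only the standard bitset constructor and accessors:

#include <bitset>
#include <cassert>

int main() {
    std::bitset<8> b;
    b.set(7);
    assert(b.to_ulong() == (1u << 7));    // 128, per the quoted wording
    assert(std::bitset<8>(0x01).test(0)); // the value 0x01 corresponds to bit position 0, not 7
}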
bitset doesn't do serialization, so you don't (need to) know. Use serialization/deserialization.
is bitset<8>().set(7).to_ulong() always equal to 1
No, not on my machine (see below).
However, I'd certainly expect the iostream operators to behave portably:
#include <bitset>
#include <sstream>
#include <iostream>

int main()
{
    std::bitset<8> bits;
    std::cout << bits.set(7).to_ulong() << std::endl;

    std::stringstream ss;
    ss << bits;
    std::cout << ss.rdbuf() << std::endl;

    std::bitset<8> cloned;
    ss >> cloned;
    std::cout << cloned.set(7).to_ulong() << std::endl;
    std::cout << cloned << std::endl;
}
Prints
128
10000000
128
10000000
If the question is whether you can happily ignore the endianness of the platform while sending binary objects over the network, the answer is that you cannot. If the question is whether the same code compiled on two different platforms will yield the same results, then the answer is yes.
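As a sketch of what that means in practice, the two sides agree on a byte order and the receiver rebuilds the value with shifts, so the host's endianness never enters the picture (the helper name is mine):

#include <cstdint>

// Hypothetical helper: rebuild a 32-bit value from four bytes sent
// most-significant first, independent of the receiving host's byte order.
std::uint32_t from_big_endian(const unsigned char bytes[4])
{
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8)  |
            static_cast<std::uint32_t>(bytes[3]);
}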