I am in the horrible world of bit shifting. I have the following code:
I am shifting this number: 140638023551944 >> 5.
The binary representation for 140638023551944 according to http://www.binaryhexconverter.com/decimal-to-binary-converter is
1000011000011111011101000111
Right shifted 5, I expect: 0000010000110000111110111010
But instead, I get 4394938235998, which is 111111111101000110101110110111110001011110.
That number, to me, looks like it has almost nothing to do with the original number. I don't see any pattern in one that exists in the other. It is very bizarre.
The code is along the lines of:
uint64_t n, index, tag;
uint64_t one = 1;
uint64_t address = 140638023551944;
/*left shift to get index into the last index.length() number of slots*/
cout << "original address is " << address << " " << "\n";
n = (address >> 5);
cout << "after right shifting away offset bits " << n << "\n";
"address" is populated with the correct integer, 140638023551944. I have verified that.
What is this bizarre behavior? It is consistent with this simulator: http://www.miniwebtool.com/bitwise-calculator/bit-shift/?data_type=10&number=140638023551944&place=5&operator=Shift+Right! But I am pretty sure right shift is not supposed to work that way!
// EVERYTHING WORKS CORRECTLY!
#include <cassert>  // assert()
#include <iostream> // cout
#include <cstdint>  // UINT64_MAX
using namespace std;

int main() {
    uint64_t n, index, tag;
    uint64_t one = 1;
    uint64_t address = 140638023551944;
    /* left shift to get index into the last index.length() number of slots */
    cout << "original address is " << address << " " << "\n";
    n = (address >> 5);
    cout << "after right shifting away offset bits " << n << "\n";

    { // Everything works correctly!
        assert( 140638023551944 >> 5 == 140638023551944 / 32 );
        assert( 140638023551944 >> 5 == 4394938235998 );
        assert( 140638023551944 / 32 == 4394938235998 );
        assert( 140638023551944 < UINT64_MAX );
    }
}
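If you want to see the actual bit patterns instead of trusting an online converter (140638023551944 needs 47 bits, so the 28-bit string quoted in the question cannot be its full representation), here is a minimal sketch using std::bitset; this is an addition for illustration, not part of the original code:

#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    std::uint64_t address = 140638023551944;
    std::cout << std::bitset<64>(address)      << "\n";  // all 64 bits of the original value
    std::cout << std::bitset<64>(address >> 5) << "\n";  // same pattern with the low 5 bits dropped
}

The second line is simply the first shifted right by five places, which is exactly what the asserts above confirm.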
Consider the following code for integral types:
#include <bitset>
#include <iostream>
#include <string>

template <class T>
std::string as_binary_string( T value ) {
    return std::bitset<sizeof( T ) * 8>( value ).to_string();
}

int main() {
    unsigned char a(2);
    char b(4);
    unsigned short c(2);
    short d(4);
    unsigned int e(2);
    int f(4);
    unsigned long long g(2);
    long long h(4);

    std::cout << "a = " << +a << " " << as_binary_string( a ) << std::endl;
    std::cout << "b = " << +b << " " << as_binary_string( b ) << std::endl;
    std::cout << "c = " << c << " " << as_binary_string( c ) << std::endl;
    std::cout << "d = " << d << " " << as_binary_string( d ) << std::endl;
    std::cout << "e = " << e << " " << as_binary_string( e ) << std::endl;
    std::cout << "f = " << f << " " << as_binary_string( f ) << std::endl;
    std::cout << "g = " << g << " " << as_binary_string( g ) << std::endl;
    std::cout << "h = " << h << " " << as_binary_string( h ) << std::endl;

    std::cout << "\nPress any key and enter to quit.\n";
    char q;
    std::cin >> q;
    return 0;
}
Pretty straightforward; it works well and is quite simple.
EDIT
How would one go about writing a function to extract the binary or bit pattern of arbitrary floating point types at compile time?
When it comes to floats, I have not found anything similar in any existing library I know of. I searched Google for days looking for one, and then resorted to trying to write my own function, without success. I no longer have the attempted code available from when I originally asked this question, so I cannot show you all of the different attempted implementations along with their compiler and build errors. I was interested in generating the bit pattern for floats in a generic way at compile time and wanted to integrate that into my existing class that seamlessly does the same for any integral type. As for the floating types themselves, I have taken into consideration the different formats as well as architecture endianness. For my general purposes, the standard IEEE versions of the floating point types are all I should need to be concerned with.
iBug had suggested that I write my own function when I originally asked this question, which is what I was already attempting to do. I understand binary numbers, memory sizes, and the mathematics, but putting it all together with how floating point types are stored in memory, with their different parts {sign bit, exponent & mantissa}, is where I was having the most trouble.
Since then, with the suggestions from those who gave a great answer and example, I was able to write a function that fits nicely into my existing class template, and it now works for my intended purposes.
What about writing one yourself?
#include <bitset>
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(float) == sizeof(std::uint32_t), "unexpected float size");
static_assert(sizeof(double) == sizeof(std::uint64_t), "unexpected double size");

std::string as_binary_string( float value ) {
    std::uint32_t t;
    std::memcpy(&t, &value, sizeof(value));   // copy the raw bytes into an integer
    return std::bitset<sizeof(float) * 8>(t).to_string();
}

std::string as_binary_string( double value ) {
    std::uint64_t t;
    std::memcpy(&t, &value, sizeof(value));
    return std::bitset<sizeof(double) * 8>(t).to_string();
}
You may need to change the type of the helper variable t if the sizes of the floating point types differ on your platform.
You can alternatively copy them bit by bit. This is slower but works for arbitrary types.
#include <bitset>
#include <climits>
#include <cstdint>
#include <cstring>
#include <string>

template <typename T>
std::string as_binary_string( T value )
{
    const std::size_t nbytes = sizeof(T), nbits = nbytes * CHAR_BIT;
    std::bitset<nbits> b;
    std::uint8_t buf[nbytes];
    std::memcpy(buf, &value, nbytes);
    for(std::size_t i = 0; i < nbytes; ++i)
    {
        std::uint8_t cur = buf[i];
        std::size_t offset = i * CHAR_BIT;
        for(int bit = 0; bit < CHAR_BIT; ++bit)
        {
            b[offset] = cur & 1;
            ++offset; // Move to next bit in b
            cur >>= 1; // Move to next bit in the current byte
        }
    }
    return b.to_string();
}
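For instance, with the template above in scope and <iostream> included, a small hypothetical driver (mine, not part of the original answer) could print the pattern for both a float and an int:

int main() {
    std::cout << as_binary_string(3.14f) << "\n"; // the 32 IEEE-754 bits of the float
    std::cout << as_binary_string(42)    << "\n"; // the bits of an int
}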
You said it doesn't need to be standard. So, here is what works in clang on my computer:
#include <iostream>
#include <algorithm>
using namespace std;

int main()
{
    char *result;
    result = new char[33]();   // value-initialized so result[32] is already '\0'
    fill(result, result + 32, '0');
    float input;
    cin >> input;
    asm(
        "mov %0,%%eax\n"
        "mov %1,%%rbx\n"
        ".intel_syntax\n"
        "mov rcx,20h\n"
        "loop_begin:\n"
        "shr eax\n"
        "jnc loop_end\n"
        "inc byte ptr [rbx+rcx-1]\n"
        "loop_end:\n"
        "loop loop_begin\n"
        ".att_syntax\n"
        :
        : "m" (input), "m" (result)
        : "eax", "rbx", "rcx", "cc", "memory"   // registers, flags and memory the asm modifies
    );
    cout << result << endl;
    delete[] result;
    return 0;
}
This code makes a bunch of assumptions about the computer architecture and I am not sure on how many computers it would work.
EDIT:
My computer is a 64-bit MacBook Air. This program basically works by allocating a 33-byte string and filling the first 32 bytes with '0' (the 33rd byte will automatically be '\0').
Then it uses inline assembly to store the float into a 32-bit register and then it repeatedly shifts it to the right by one bit.
If the last bit in the register was 1 before the shift, it gets stored into the carry flag.
The assembly code then checks the carry flag and, if it contains 1, it increases the corresponding byte in the string by 1.
Since it was previously initialized to '0', it will turn to '1'.
So, effectively, when the loop in the assembly is finished, the binary representation of a float is stored into a string.
This code only works for x64 (it uses 64-bit registers "rbx" and "rcx" to store the pointer and the counter for the loop), but I think it's easy to tweak it to work on other processors.
An IEEE floating point number (a 64-bit double) is laid out as follows:
sign    exponent    mantissa
1 bit   11 bits     52 bits
Note that there's a hidden 1 before the mantissa, and the exponent is biased, so a stored value of 1023 represents 0; it is not two's complement.
By memcpy()ing the value to a 64-bit unsigned integer you can then apply AND masks and shifts to extract each field. The arrangement could be big endian or little endian.
You can easily work out which arrangement you have by passing easy numbers such as 1 or 2.
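As an illustration, here is a minimal sketch of that memcpy-and-mask approach for an IEEE-754 double (the field widths are the ones listed above; the variable names are my own):

#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    double d = -6.5;
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof d);                 // reinterpret the bytes as an integer

    std::uint64_t sign     = bits >> 63;              // 1 bit
    std::uint64_t exponent = (bits >> 52) & 0x7FF;    // 11 bits, biased by 1023
    std::uint64_t mantissa = bits & 0xFFFFFFFFFFFFF;  // 52 bits, hidden leading 1 not stored

    std::cout << "sign=" << sign
              << " exponent=" << exponent
              << " (unbiased " << static_cast<long long>(exponent) - 1023 << ")"
              << " mantissa=0x" << std::hex << mantissa << "\n";
}

For -6.5 this should print sign=1, exponent=1025 (unbiased 2) and mantissa=0xa000000000000, i.e. -1.625 * 2^2.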
Generally people either use std::hexfloat or cast a pointer to the floating-point value to a pointer to an unsigned integer of the same size and print the indirected value in hex format. Both methods facilitate bit-level analysis of floating-point in a productive fashion.
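For example, the std::hexfloat route (C++11) is a one-liner; this small snippet is mine, not from the answer:

#include <iostream>

int main() {
    std::cout << std::hexfloat << 0.1 << "\n"; // prints something like 0x1.999999999999ap-4
}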
You could roll your own by casting the address of the float/double to a char pointer and iterating over it that way:
#include <iomanip>
#include <iostream>
#include <limits>
#include <memory>
#include <string>

template <typename T>
std::string getBits(T t) {
    std::string returnString{""};
    char *base{reinterpret_cast<char *>(std::addressof(t))};
    char *tail{base + sizeof(t) - 1};
    do {
        for (int bits = std::numeric_limits<unsigned char>::digits - 1; bits >= 0; bits--) {
            returnString += ( ((*tail) & (1 << bits)) ? '1' : '0');
        }
    } while (--tail >= base);
    return returnString;
}

int main() {
    float f{10.0};
    double d{100.0};
    double nd{-100.0};
    std::cout << std::setprecision(1);
    std::cout << getBits(f) << std::endl;
    std::cout << getBits(d) << std::endl;
    std::cout << getBits(nd) << std::endl;
}
Output on my machine (note the sign flip in the third output):
01000001001000000000000000000000
0100000001011001000000000000000000000000000000000000000000000000
1100000001011001000000000000000000000000000000000000000000000000
I want to compare the memory address and pointer value of p, p + 1, q, and q + 1.
I want to understand what the following values actually mean. I can't quite wrap my head around what's going on.
When I run the code:
I get an answer of 00EFF680 every time I compare the address p with another pointer.
I get an answer of 00EFF670 every time I compare the address of q with another pointer.
I get an answer of 15726208 when I look at the pointer value of p.
And I get an answer of 15726212 when I look at the pointer value of p + 1.
I get an answer of 15726192 when I look at the pointer value of q.
And I get an answer of 15726200 when I look at the pointer value of q + 1.
Code
#include <iostream>
#include <string>
using namespace std;
int main()
{
int val = 20;
double valD = 20;
int *p = &val;
double *q;
q = &valD;
cout << "Memory Address" << endl;
cout << p == p + 1;
cout << endl;
cout << q == q + 1;
cout << endl;
cout << p == q;
cout << endl;
cout << q == p;
cout << endl;
cout << p == q + 1;
cout << endl;
cout << q == p + 1;
cout << endl;
cout << "Now Compare Pointer Value" << endl;
cout << (unsigned long)(p) << endl;
cout << (unsigned long) (p + 1) << endl;
cout << (unsigned long)(q) << endl;
cout << (unsigned long) (q + 1) << endl;
cout <<"--------" << endl;
return 0;
}
There are a few warnings and/or errors.
The first is that overloaded operator << has higher precedence than the comparison operator (on clang++ -Woverloaded-shift-op-parentheses is the flag).
The second is that there is a comparison of distinct pointer types ('int *' and 'double *').
For the former, parentheses must be placed around the comparison to allow for the comparison to take precedence. For the latter, the pointers should be cast to a type that allows for safe comparison (e.g., size_t).
For instance on line 20, the following would work nicely.
cout << ((size_t) p == (size_t) (q + 1));
As for lines 25-28, this is standard pointer arithmetic: adding 1 to a pointer advances it by the size of the pointed-to type.
As to your question:
I want to compare p, p + 1, q, and q + 1, and understand what the results mean.
If p is at address 0x80000000 then p+1 is at address 0x80000000 + sizeof(*p). If *p is an int (typically 4 bytes) then this is 0x80000000 + 0x4 = 0x80000004. And the same reasoning applies for q.
So if you do p == p + 1 then the compiler will first do the addition, p + 1, then the comparison, so you will have 0x80000000 == 0x80000004, which results in false.
Now to your code:
cout << p == p + 1;
is actually equivalent to:
(cout << p) == p + 1;
and that is because << has higher precedence than ==. Actually you should get a compilation error for this.
Another thing is the comparison of pointers of unrelated types, like double* with int*; without a cast it should not compile.
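A corrected sketch of the question's comparisons, with the parentheses and casts the answers call for (my illustration, not code from the question):

#include <cstddef>
#include <iostream>
using namespace std;

int main() {
    int val = 20;
    double valD = 20;
    int *p = &val;
    double *q = &valD;

    cout << (p == p + 1) << endl;                   // 0: the addresses differ by sizeof(int)
    cout << ((size_t)p == (size_t)(q + 1)) << endl; // unrelated pointer types compared via size_t
}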
In C and C++ pointer arithmetic is very closely tied with array manipulation. The goal is that
int array[3] = { 1, 10, 100 };
int *ptr = array;
std::cout << array[2] << '\n';
std::cout << *(ptr + 2) << '\n';
outputs two 100s. This allows the language to treat arrays and pointers as equivalent - that's not the same thing as "the same" or "equal", see the C FAQ for clarification.
This means that the language allows:
int array[3] = { 1, 10, 100 };
int *ptr = array;
And then
std::cout << (void*)array << ", " << (void*)&array[0] << '\n';
outputs the address of the first element twice, the first array behaves like a pointer.
std::cout << (void*)(array + 1) << ", " << (void*)&array[1] << '\n';
prints the address of the second element of array, again array behaving like a pointer in the first case.
std::cout << ptr[2] << ", " << *(ptr + 2) << '\n';
prints element #3 of ptr (100) twice, here ptr is behaving like an array in the first use,
std::cout << (void*)ptr << ", " << (void*)&ptr[0] << '\n';
prints the value of ptr twice, again ptr behaving like an array in the second use,
But this can catch people unaware.
const char* h = "hello"; // h points to the character 'h'.
std::cout << (void*)h << ", " << (void*)(h+1);
This prints the value of h and then a value one higher. But this is purely because the type of h is a pointer to a one-byte-sized data type.
In terms of addresses, h + 1 is h advanced by sizeof(*h) * 1 bytes, which for a char is just one byte.
If we write:
const char* hp = "hello";
short int s = 1;  short int* sip = &s;
int i = 1;        int* ip = &i;
std::cout << (void*)hp << ", " << (void*)(hp + 1) << "\n";
std::cout << (void*)sip << ", " << (void*)(sip + 1) << "\n";
std::cout << (void*)ip << ", " << (void*)(ip + 1) << "\n";
The first line of output will show two values 1 byte (sizeof char) apart, the second line two values 2 bytes (sizeof short int) apart, and the last line two values 4 bytes (sizeof int) apart.
The << operator invokes
template<typename T>
std::ostream& operator << (std::ostream& stream, const T& instance);
The operator itself has very high precedence, higher than == so what you are actually writing is:
(std::cout << p) == p + 1
what you need to write is
std::cout << (p == p + 1)
this is going to print 0 (the result of int(false)) if the values are different and 1 (the result of int(true)) if the values are the same.
Perhaps a picture will help (for a 64-bit machine):
p is a 64bit pointer to a 32bit (4byte) int. The green pointer p takes up 8 bytes. The data pointed to by p, the yellow int val takes up 4 bytes. Adding 1 to p goes to the address just after the 4th byte of val.
Similar for pointer q, which points to a 64bit (8byte) double. Adding 1 to q goes to the address just after the 8th byte of valD.
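A small sketch that prints those two distances directly (the names mirror the question's code; 4 and 8 are the sizes you would see on a typical platform):

#include <iostream>
using namespace std;

int main() {
    int val = 20;
    double valD = 20;
    int *p = &val;
    double *q = &valD;

    // byte distance covered by "+ 1" on each pointer type
    cout << reinterpret_cast<char*>(p + 1) - reinterpret_cast<char*>(p) << endl; // typically 4
    cout << reinterpret_cast<char*>(q + 1) - reinterpret_cast<char*>(q) << endl; // typically 8
}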
If you want to print the value of a pointer, you can cast it to void *, for example:
cout << static_cast<void*>(p) << endl;
A void* is a pointer of indefinite type. C code uses it often to point to arbitrary data whose type isn’t known at compile time; C++ normally uses a class hierarchy for that. Here, though, it means: treat this pointer as nothing but a memory location.
Adding an integer to a pointer gets you another pointer, so you want to use the same technique there:
cout << static_cast<void*>(p+1) << endl;
However, the difference between two pointers is a signed whole number (the precise type, if you ever need it, is defined as ptrdiff_t in <cstddef>, but fortunately you don’t need to worry about that with cout), so you just want to use that directly:
cout << (p+1) - p << endl;
cout << reinterpret_cast<char*>(p+1) - reinterpret_cast<char*>(p) << endl;
cout << reinterpret_cast<char*>(q) - reinterpret_cast<char*>(p) << endl;
That second line casts to char* because the size of a char is always 1. That’s a big hint what’s going on.
As for what’s going on under the hood: compare the numbers you get to sizeof(*p) and sizeof(*q), which are the sizes of the objects p and q point to.
The pointer values that are printed are likely to change on every execution (see why the addresses of local variables can be different every time and Address Space Layout Randomization)
I get an answer of 00EFF680 every time I compare the address p with another pointer.
int val = 20;
double valD = 20;
int *p = &val;
cout << p == p + 1;
It is translated into (cout << p) == p + 1; due to the higher precedence of operator << over operator ==.
It prints the hexadecimal value of &val, the first address in the stack frame of the main function.
Note that on the stack, addresses are decreasing (see why the stack grows towards decreasing memory addresses).
I get an answer of 00EFF670 every time I compare the address of q with another pointer.
double *q = &valD;
cout << q == q + 1;
It is translated into (cout << q) == q + 1; due to the higher precedence of operator << over operator ==.
It prints the hexadecimal value of &valD, the second address in the stack frame of the main function.
Note that &valD <= &val - sizeof(double), i.e. &valD is at least 8 bytes below &val, since valD is placed just below val on the stack. This is a compiler layout choice that respects alignment constraints.
I get an answer of 15726208 when I look at the pointer value of p.
cout << (unsigned long)(p) << endl;
It just prints the decimal value of &val.
And I get an answer of 15726212 when I look at the pointer value of p + 1.
int *p = &val;
cout << (unsigned long) (p + 1) << endl;
It prints the decimal value of &val + sizeof(int) = &val + 4, since on your machine an int is 32 bits.
Note that if p is a pointer to type t, then p + 1 is p plus sizeof(t) bytes, so consecutive array elements never overlap.
Note that if p is a pointer to void, p + 1 is ill-formed (see void pointer arithmetic).
I get an answer of 15726192 when I look at the pointer value of q
cout << (unsigned long)(q) << endl;
It prints the decimal value of &valD
And I get an answer of 15726200 when I look at the pointer value of q + 1.
cout << (unsigned long) (q + 1) << endl;
It prints the decimal value of &valD + sizeof(double) = &valD + 8.
I have a bitset in which I need to store a number of randomly generated integers (store their bit representation, of course). The thing is that I am confused about how to do that.
For example, suppose that I generate the integers (all unsigned int) 8, 15, 20, one at a time. How can I store the most recently generated integer in my existing bitset?
Say that I start by generating "8" and storing it in the bitset, then I generate "15" and store it in the bitset.
I don't know or don't understand how to store those values within the bitset.
Note: I know the size of the bitset in advance; the size is based on the number of integers that I am going to generate, which I also know. So, at the end, what I need is a bitset whose bits match the bits of all the generated integers.
I'd appreciate your help.
How can I store the most recently generated integer in my existing bitset?
You can generate a temporary bitset from the integer and then assign values between the two bitsets.
Example program:
#include <iostream>
#include <bitset>
#include <cstdlib>
int main()
{
const int size = sizeof(int)*8;
std::bitset<2*size> res;
std::bitset<size> res1(rand());
std::bitset<size> res2(rand());
for ( size_t i = 0; i < size; ++i )
{
res[i] = res1[i];
res[size+i] = res2[i];
}
std::cout << "res1: " << res1 << std::endl;
std::cout << "res2: " << res2 << std::endl;
std::cout << "res: " << res << std::endl;
return 0;
}
Output:
res1: 01101011100010110100010101100111
res2: 00110010011110110010001111000110
res: 0011001001111011001000111100011001101011100010110100010101100111
Update
A function to set the bitset values given an integer can be used to avoid the cost of creating temporary bitsets.
#include <iostream>
#include <bitset>
#include <cstdlib>
#include <climits>
const int size = sizeof(int)*8;
void setBitsetValue(std::bitset<2*size>& res,
int num,
size_t bitsetIndex,
size_t numIndex)
{
if ( numIndex < size )
{
res[bitsetIndex] = (num >> numIndex) & 0x1;
setBitsetValue(res, num, bitsetIndex+1, numIndex+1);
}
}
int main()
{
std::bitset<2*size> res;
int num1 = rand()%INT_MAX;
int num2 = rand()%INT_MAX;
std::bitset<size> res1(num1);
std::bitset<size> res2(num2);
std::cout << "res1: " << res1 << std::endl;
std::cout << "res2: " << res2 << std::endl;
setBitsetValue(res, num1, 0, 0);
setBitsetValue(res, num2, size, 0);
std::cout << "res: " << res << std::endl;
return 0;
}
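If the number of integers is known at compile time, the same per-bit copy generalizes to any count. Here is a sketch along those lines (packIntegers is a name I made up for illustration):

#include <bitset>
#include <climits>
#include <cstddef>
#include <iostream>

const std::size_t bitsPerInt = sizeof(unsigned int) * CHAR_BIT;

template <std::size_t Count>
std::bitset<Count * bitsPerInt> packIntegers(const unsigned int (&values)[Count])
{
    std::bitset<Count * bitsPerInt> result;
    for (std::size_t i = 0; i < Count; ++i)
        for (std::size_t bit = 0; bit < bitsPerInt; ++bit)
            result[i * bitsPerInt + bit] = (values[i] >> bit) & 1u;
    return result;
}

int main()
{
    unsigned int values[3] = { 8, 15, 20 }; // the integers from the question
    std::cout << packIntegers(values) << std::endl;
}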
I have this array: BYTE set[6] = { 0xA8,0x12,0x84,0x03,0x00,0x00 }; and I need to insert this value: int Value = 1200; into the last 4 bytes. Practically, I want to convert from int to hex and then write it inside the array...
Is this possible?
I already have the BitConverter::GetBytes function, but that's not enough.
Thank you,
To answer the original question: sure you can.
As long as sizeof(int) == 4 and sizeof(BYTE) == 1.
But I'm not sure what you mean by "converting int to hex". If you want a hex string representation, you'll be much better off just using one of the standard methods of doing it.
For example, on the last line I use std::hex to print numbers as hex.
Here is a solution to what you've been asking for, and a little more (live example: http://codepad.org/rsmzngUL):
#include <iostream>
using namespace std;
int main() {
const int value = 1200;
unsigned char set[] = { 0xA8,0x12,0x84,0x03,0x00,0x00 };
for (const unsigned char* c = set; c != set + sizeof(set); ++c) {
cout << static_cast<int>(*c) << endl;
}
cout << endl << "Putting value into array:" << endl;
*reinterpret_cast<int*>(&set[2]) = value;
for (const unsigned char* c = set; c != set + sizeof(set); ++c) {
cout << static_cast<int>(*c) << endl;
}
cout << endl << "Printing int's bytes one by one: " << endl;
for (int byteNumber = 0; byteNumber != sizeof(int); ++byteNumber) {
const unsigned char oneByte = reinterpret_cast<const unsigned char*>(&value)[byteNumber];
cout << static_cast<int>(oneByte) << endl;
}
cout << endl << "Printing value as hex: " << hex << value << std::endl;
}
UPD: From the comments on your question:
1. If you just need to get the separate digits of the number into separate bytes, that's a different story.
2. Little vs. big endianness matters as well; I did not account for that in my answer.
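If byte order does matter, one option (my sketch, not part of the answer above) is to write the int into the last four bytes one byte at a time with shifts, which makes the order explicit:

#include <iostream>

typedef unsigned char BYTE;

int main() {
    const int value = 1200;                                 // 0x000004B0
    BYTE set[6] = { 0xA8, 0x12, 0x84, 0x03, 0x00, 0x00 };

    // least-significant byte first; swap the indexing for big-endian order
    for (int i = 0; i < 4; ++i)
        set[2 + i] = static_cast<BYTE>((value >> (8 * i)) & 0xFF);

    for (int i = 0; i < 6; ++i)
        std::cout << static_cast<int>(set[i]) << " ";
    std::cout << std::endl;
}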
Did you mean this?
#include <stdio.h>
#include <stdlib.h>
#define BYTE unsigned char

int main ( void )
{
    /* one extra byte so sprintf's terminating '\0' stays inside the array */
    BYTE set[7] = { 0xA8, 0x12, 0x84, 0x03, 0x00, 0x00, 0x00 };
    sprintf ( (char *) &set[2], "%d", 1200 );
    printf ( "\n%c%c%c%c", set[2], set[3], set[4], set[5] );
    return 0;
}
output :
1200
In this program I want to increment an IP address. I see output like this:
125.23.45.67
126.23.45.67
127.23.45.67
128.23.45.67
129.23.45.67
130.23.45.67
131.23.45.67
132.23.45.67
133.23.45.67
134.23.45.67
But I want to see output like this:
124.23.45.67
124.23.45.68
124.23.45.69
124.23.45.70
124.23.45.71
124.23.45.72
124.23.45.73
124.23.45.74
124.23.45.75
124.23.45.76
Here is the program code:
#include <stdlib.h>
#include <stdio.h>
#include <iostream>
using namespace std;
#include "winsock2.h"
#pragma comment(lib,"wsock32.lib")
void main()
{
in_addr adr1;
in_addr adr2;
int i;
adr1.s_addr=inet_addr("124.23.45.67");
adr2.s_addr=inet_addr("as.34.34.56");
if (adr1.s_addr!=INADDR_NONE)
cout << " adr1 correct" << endl;
else
cout << " adr1 incorect " << endl;
if (adr2.s_addr!=INADDR_NONE)
cout << " adr2 correct" << endl;
else
cout << " adr2 incorect" << endl;
cout << inet_ntoa(adr1) << endl;
cout << inet_ntoa(adr2) << endl;
for (i=0;i<10;i++)
{
adr1.s_addr ++;
cout << inet_ntoa(adr1) << endl;
}
}
Big endian vs. little endian strikes again! Use htonl and ntohl to convert back and forth.
for (i=0;i<10;i++)
{
adr1.s_addr = htonl(ntohl(adr1.s_addr) + 1);
cout << inet_ntoa(adr1) << endl;
}
To increment an IP address you will need to break up the in_addr object into 4 ints (a short int will also do) and increment the 4th one until it hits 256, then reset it to 0 and increment the 3rd one, and so on. You shouldn't be using ++ on the in_addr object directly.
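A sketch of that octet-by-octet idea (my illustration, not code from the question): the bytes of s_addr are already in the printed order, because inet_addr returns the address in network byte order, so incrementing the last byte with carry gives the sequence the question asked for.

#include <iostream>
#include "winsock2.h"
#pragma comment(lib,"wsock32.lib")
using namespace std;

// Increment the last octet, carrying into the previous one when it wraps past 255.
void incrementLastOctet(in_addr &adr)
{
    unsigned char *octets = reinterpret_cast<unsigned char*>(&adr.s_addr);
    for (int i = 3; i >= 0; --i) {
        if (++octets[i] != 0) // no wrap-around, so stop carrying
            break;
    }
}

int main()
{
    in_addr adr1;
    adr1.s_addr = inet_addr("124.23.45.67");
    for (int i = 0; i < 10; ++i) {
        incrementLastOctet(adr1);
        cout << inet_ntoa(adr1) << endl;
    }
    return 0;
}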
EDIT: Okay, so you can properly increment it if you reverse the byte order. I personally wouldn't do it that way. Especially if all you're doing is outputting IP strings and not using them as an in_addr elsewhere in code.
Instead of using adr1.s_addr:
adr1.s_addr=inet_addr("124.23.45.67");
adr2.s_addr=inet_addr("as.34.34.56");
Use this:
u_long addr1=inet_addr("124.23.45.67");
And increment addr1, i.e. addr1++
the last octet gets incremented.
Or follow this formula:
if IP is A.B.C.D then u_long addr = A + 256*B + 256*256*C + 256*256*256*D