printing double in binary - c++

In 'Thinking in C++' by Bruce Eckel, there is a program given to print a double value
in binary. (Chapter 3, page no. 189)
int main(int argc, char* argv[])
{
if(argc != 2)
{
cout << "Must provide a number" << endl;
exit(1);
}
double d = atof(argv[1]);
unsigned char* cp = reinterpret_cast<unsigned char*>(&d);
for(int i = sizeof(double); i > 0 ; i -= 2)
{
printBinary(cp[i-1]);
printBinary(cp[i]);
}
}
Here while printing cp[i] when i=8 (assuming double is of 8 bytes), wouldn't it be undefined behaviour?
I mean this code doesn't work as it doesn't print cp[0].

A1: Yes, it would be undefined behaviour when it accesses cp[8].
A2: Yes, it also does not print cp[0].
As shown, it prints bytes 7, 8, 5, 6, 3, 4, 1, 2, whereas the valid indices are 0..7. So, if you have copied the code correctly from the book, there is a bug in the book's code. Check the errata page for the book, if there is one.
It is also odd that it unwinds the loop; a simpler formulation is:
for (int i = sizeof(double); i-- > 0; )
printBinary(cp[i]);
There is also, presumably, a good reason for printing the bytes in reverse order; it is not obvious what that would be.

It looks like a typo in the book's code. The second call should probably be printBinary(cp[i-2]).
This is a bit weird though, because they're reversing the byte order compared to what's actually in memory (IEEE 754 floating point numbers have no rules about endianness, so I guess it's valid on his platform), and because he's counting by 2 instead of just 1.
It would be simpler to write
for(int i = 0; i != sizeof(double) ; ++i) printBinary(cp[i]);
or (if reversing the bytes is important) use the standard idiom for a loop that counts down
for(int i = sizeof(double); (i--) > 0;) printBinary(cp[i]);

You can do this in an endian-independent way by reinterpreting the bits of the double as an unsigned long long (copying the bytes, not converting the value; a plain value cast would convert the number to an integer instead). Then you can use simple bit shifting on the integer to access and print the bits, from bit 0 to bit 63.
(I've written a C function called "print_raw_double_binary()" that does this; see my article Displaying the Raw Fields of a Floating-Point Number for details.)

Related

How to effectively store large number in integer array? C++

Hello everybody out there! I have a homework assignment where I need to build a high-precision calculator that will operate on very large numbers. The whole point of this assignment is that storing the values in arrays, with each digit in a separate array cell, is not allowed.
That is memory representation of number
335897294593872
like so
int number[] = {3, 3, 5, 8, 9, 7, 2, 9, 4, 5, 9, 3, 8, 7, 2};
is not legit,
nor
char number[] = {3, 3, 5, 8, 9, 7, 2, 9, 4, 5, 9, 3, 8, 7, 2};
nor
std::string number("335897294593872");
What I want to do is split the whole number into 32-bit chunks and store each chunk in a separate array cell of type uint32_t.
Since I get the input from keyboard I store all values in std::string initially and later put them in integer arrays to perform operations.
How do I put binary representation of a large number into an integer array filling in all bits properly?
Thank you in advance.
EDIT: Using standard C++ libraries only
EDIT2: I want to be able to add, subtract, multiply, and divide those arrays holding large numbers, so I don't mean merely cutting the string up and storing the decimal representation in an integer array, but rather preserving the bit order of the number itself so that carries can be calculated.
This is a rather naïve solution:
If last digit in string is odd store a 1 in result (otherwise leave it 0).
Divide digits in string by 2 (considering carries).
If 32 bits have been written, add another element to the result vector.
Repeat this until string contains 0s only.
Source Code:
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
std::vector<uint32_t> toBigInt(std::string text)
{
// convert string to BCD-like
for (char &c : text) c -= '0';
// build result vector
std::vector<uint32_t> value(1, 0);
uint32_t bit = 1;
for (;;) {
// set next bit if last digit is odd
if (text.back() & 1) value.back() |= bit;
// divide BCD-like by 2
bool notNull = false; int carry = 0;
for (char &c : text) {
const int carryNew = c & 1;
c /= 2; c += carry * 5;
carry = carryNew;
notNull |= c;
}
if (!notNull) break;
// shift bit
bit <<= 1;
if (!bit) {
value.push_back(0); bit = 1;
}
}
// done
return value;
}
std::ostream& operator<<(std::ostream &out, const std::vector<uint32_t> &value)
{
std::ios fmtOld(0); fmtOld.copyfmt(out);
for (size_t i = value.size(); i--;) {
out << std::hex << value[i] << std::setfill('0') << std::setw(sizeof (uint32_t) * 2);
}
out.copyfmt(fmtOld);
return out;
}
int main()
{
std::string tests[] = {
"0", "1",
"4294967295", // 0xffffffff
"4294967296", // 0x100000000
"18446744073709551615", // 0xffffffffffffffff
"18446744073709551616", // 0x10000000000000000
};
for (const std::string &test : tests) {
std::cout << test << ": " << toBigInt(test) << '\n';
}
return 0;
}
Output:
0: 0
1: 1
4294967295: ffffffff
4294967296: 100000000
18446744073709551615: ffffffffffffffff
18446744073709551616: 10000000000000000
Live Demo on coliru
Notes:
The resulting vector is little-endian (the least significant element comes first); the output operator prints the most significant element first.
For the tests, I used numbers whose hex representation is easy to check by eye.
Using an array to store the different parts of a big number is a common way to do this. Another thing to consider is how different architectures implement signed ints: you have to decide how to handle signed-to-unsigned conversions between the parts of your number (there are several ways to do this, and ordinary big-integer libraries make the same trade-off) and how you are going to implement the different arithmetic operations.
I don't generally recommend long long integers for the array cells, as they are not generally the native size of the architecture. To give the architecture a chance to do things efficiently, I would use a standard unsigned type with a reduced number of bits (reduced by at least one bit, to be able to see the carries out of one extended digit into the next). For example, GNU libgmp used 24-bit integers in each array cell, last time I checked. It's also common to reduce it to a multiple of the char size, so that shifting and reallocating numbers is easier than doing 31-bit shifts on a full array of bits.
When you work with money or similarly delicate quantities, it is common to use integers, because you can guarantee many things about them. So my recommendation is that whenever you work with big numbers like these, simulate fixed-point or floating-point arithmetic with two ints, so you can "watch" how everything executes; you could check the IEEE 754 standard for the floating-point part.
If you store the number in an array, make sure that every operation you perform while manipulating it takes a constant number of steps, which could be tricky.
I recommend that you trust the integers but fix the number of bits.
But if you really want to try something interesting, use the bitwise operators, and maybe you could get something interesting out of it.
You could check the details of the data types here, in particular, the signed short int, or the long long int, and to confirm sizes of the data types check this

Rounding error of binary32

As part of a homework, I'm writing a program that takes a float decimal number as input entered from terminal, and return IEEE754 binary32 of that number AND return 1 if the binary exactly represents the number, 0 otherwise. We are only allowed to use iostream and cmath.
I already wrote the part that returns binary32 format, but I don't understand how to see if there's rounding to that format.
My idea to see the rounding was to calculate the decimal number back from binary32 form and compare it with the original number. But I am having difficulty with saving the returned binary32 as some type of data, since I can't use the vector header. I've tried using for loops and pow, but I still get the indices wrong.
Also, I'm having trouble understanding what exactly is df or *df? I wrote the code myself, but I only know that I needed to convert address pointed to float to address pointed to char.
My other idea was to compare binary32 with binary64, which gives more precision. And again, I don't know how to do this without using vector.
int main(int argc, char* argv[]){
int i ,j;
float num;
num = atof(argv[1]);
char* numf = (char*)(&num);
for (i = sizeof(float) - 1; i >= 0; i--){
for (j = 7; j >= 0; j--)
if (numf[i] & (1 << j)) {
cout << "1";
}else{
cout << "0";
}
}
cout << endl;
}
//////
Update:
Since there's no way around it without using header files, I hard-coded for loops to convert binary32 back to decimal.
Since x = 1.b22b21...b0 * 2^p, I use one for loop to find the exponent and one for loop to find the significand.
Basic idea: Convert your number back to a string (e.g. with to_string) and compare it to the input. If the strings are different, there was some loss because of the limitations of float.
Of course, this means your input always has to be in the same string format that to_string uses. No additional unneeded 0's, no whitespaces, etc.
...
That said, doing the float conversion without a cast (but with manually parsing the input and calculating the IEEE754 bits) is more work initially, but in return it solves this problem automatically. And, as noted in the comments, your cast might not work the way you want.

Big File reading error in C++

I need to read a file in c++ that has this specific format:
10 5
1 2 3 4 1 5 1 5 2 1
All the values are separated by spaces. The first two values on the first line are the variables N and M respectively, and all N values from the second line need to go into an array called S of size N. The code I have written has no problem with files like these, but it does not work with the really big files, with millions of values, that I need it to handle. Here is the code:
int N,M;
FILE *read = fopen("file.in", "r");
fscanf(read, "%d %d ", &N, &M);
int S[N];
for( i =0; i < N; i++){
fscanf(read, "%d ", &S[i]);
}
What should I change?
There are multiple potential issues when getting in the range of millions of integers:
int is most often 32 bits; a 32-bit signed integer has a range of -2^31 to 2^31 - 1, and thus a maximum of 2,147,483,647. You should switch to a 64-bit integral type.
You are using int S[N], a Variable Length Array (VLA), which is not Standard C++ (it is Standard C99, but... there are discussions as to whether it was a good idea or not). The important detail, though, is that a VLA is stored on the stack: 1 million 32-bit ints is 4 MB, 2 million is 8 MB, etc. Check your default stack size, but it is likely 8 MB or less, and thus you have a stack overflow (you're on the right site for help!).
So, let's switch to C++ and do away with those issues:
#include <cstdint> // for int64_t
#include <fstream>
#include <vector>
int main(int argc, char* argv[]) {
std::ifstream stream("data.txt");
int64_t n = 0, m = 0;
stream >> n >> m;
std::vector<int> data;
for (int64_t c = 0; c != n; ++c) {
int i = 0;
stream >> i;
data.push_back(i);
}
// do your best :)
}
First of all, we use int64_t from <cstdint> to do away with the integer overflow issue. Second, we use a stream (input file stream: ifstream) to avoid having to learn what is the format associated with each and every integral type (it's a pain). Third, we use a vector to store the data we read, and do away with the stack overflow issue.
You are using variable-sized arrays. This is not standard and not supported by all compilers. If your compiler supports it and you go into the millions, you'll run out of stack space (stack overflow).
Alternatively, you could define S as being a vector with vector<int> S(N);

Bitwise operations with int

Excuse my English. I have an int variable that stores values from 0 to 255. To find out what is in the 7th bit of the number, I convert the number to a binary string and then check the string:
if (informationOctet_.substr(6, 1) == "0")
{
...
}
Two questions arose,
If I use int (which has 4 bytes here), and my number is an unsigned int in the range [0, 255], how do I determine which byte I need to consider? The high byte?
Once I have found the desired byte, how do I find out the contents of, for example, the 6th bit?
P.S.
I do not use spells, because do with unsigned int.
Thanks, all. I tested it with an int:
int k = 3;
for (int i = 0; i < 8; i++)
{
if (k & (1 << i))
{
std::cout << 1;
}
else
{
std::cout << 0;
}
}
print: 11000000
This is implementation-defined and depends on the endianness of the CPU. If you are smart though, you do the check like this: the_int & 0xFF, which will always give you the least significant byte regardless of endianness.
byte & (1 << 6). Maybe. Note that bit counting is zero-indexed (just like arrays), so you have to be careful with the terms. "Bit 6" and "the 6th bit" may have different meanings. Always enumerate bits as 7 6 5 4 3 2 1 0, then it will be consistent with the C code.
You can use the char data type for the check. It answers both of your questions, because char is 1 byte (8 bits) and can hold integer values as well, since char and int are both integer types that convert to each other.

Double precision in C++ (or pow(2, 1000))

I'm working on Project Euler to brush up on my C++ coding skills in preparation for the programming challenge(s) we'll be having this next semester (since they don't let us use Python, boo!).
I'm on #16, and I'm trying to find a way to keep real precision for 2^1000.
For instance:
int main(){
double num = pow(2, 1000);
printf("%.0f", num);
return 0;
}
prints
10715086071862673209484250490600018105614050000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Which is missing most of the numbers (from python):
>>> 2**1000
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376L
Granted, I can write the program with a Python 1 liner
sum(int(_) for _ in str(2**1000))
that gives me the result immediately, but I'm trying to find a way to do it in C++. Any pointers? (haha...)
Edit:
Something outside the standard libs is worthless to me - only dead-tree code is allowed in those contests, and I'm probably not going to print out 10,000 lines of external code...
If you just keep track of each digit in a char array, this is easy. Doubling a digit is trivial, and if the result is 10 or greater you just subtract 10 and add a carry to the next digit. Start with a value of 1, loop over the doubling function 1000 times, and you're done. You can predict the number of digits you'll need with ceil(1000*log(2)/log(10)), or just add them dynamically.
Spoiler alert: it appears I have to show the code before anyone will believe me. This is a simple implementation of a bignum with two functions, Double and Display. I didn't make it a class in the interest of simplicity. The digits are stored in a little-endian format, with the least significant digit first.
#include <iostream>
#include <vector>
typedef std::vector<char> bignum;
void Double(bignum & num)
{
int carry = 0;
for (bignum::iterator p = num.begin(); p != num.end(); ++p)
{
*p *= 2;
*p += carry;
carry = (*p >= 10);
*p -= carry * 10;
}
if (carry != 0)
num.push_back(carry);
}
void Display(bignum & num)
{
for (bignum::reverse_iterator p = num.rbegin(); p != num.rend(); ++p)
std::cout << static_cast<int>(*p);
}
int main(int argc, char* argv[])
{
bignum num;
num.push_back(1);
for (int i = 0; i < 1000; ++i)
Double(num);
Display(num);
std::cout << std::endl;
return 0;
}
You need a bignum library, such as this one.
You probably need a pointer here (pun intended)
In C++ you would need to create your own bigint lib in order to do the same as in python.
C/C++ operates on fundamental data types. You are using a double, which has only 64 bits, to store a number that is about 1000 bits long. A double uses 52 bits for the fraction (53 significant bits, counting the implicit leading 1), 11 bits for the exponent, and 1 bit for the sign.
The only solution for you is to either use a library like bignum mentioned elsewhere or to roll out your own.
UPDATE: I just browsed to the Euler Problem site and found that Problem 13 is about summing large integers. The iterated method can become very tricky after a short while, so I'd suggest using the code you should already have from Problem #13 to solve this, because 2**N = 2**(N-1) + 2**(N-1).
Using bignums is cheating and not a solution. Also, you don't need to compute 2**1000 or anything like that to get to the result. I'll give you a hint:
Take the first few values of 2**N:
1 2 4 8 16 32 64 128 256 ...
Now write down for each number the sum of its digits:
1 2 4 8 7 5 10 11 13 ...
You should notice that (x~=y means x and y have the same sum of digits)
1+1=2, 1+(1+2)=4, 1+(1+2+4)=8, 1+(1+2+4+8)=16~=7, 1+(1+2+4+8+7)=23~=5
Now write a loop.
Project Euler = Think before Compute!
If you want to do this sort of thing on a practical basis, you're looking for an arbitrary precision arithmetic package. There are a number around, including NTL, lip, GMP, and MIRACL.
If you're just after something for Project Euler, you can write your own code for raising to a power. The basic idea is to store your large number in quite a few small pieces, and implement your own carries, borrows, etc., between the pieces.
Isn't pow(2, 1000) just 1 left-shifted 1000 times, essentially? It should have an exact binary representation in a double float. It shouldn't require a bignum library.