Multiplying and Dividing two floating point numbers without using * and / operators

I am trying to work out how to multiply and divide two numbers without using the * and / operators.
I tried using for loops:
for(int a = 1; a<=secondnum; a++)
{
total = firstnum + total;
}
cout << "Total: " << total;
for(b = firstnum; b>=secondnum; b = b-secondnum)
{
total = total + 1;
}
cout << "Answer: " << total;
However, this only works for integers. Is there a way to make this work with floating point values?

In the old days (before pocket calculators and the like), logarithm tables were used to turn multiplication and division into a matter of addition and subtraction:
#include <cmath>
double Mult(double a, double b)
{
return exp(log(a)+log(b));
}
double Div(double a, double b)
{
return exp(log(a)-log(b));
}
Note this only works for positive numbers, but it is relatively easy to work with absolute values and then give the result the correct sign.
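The sign handling mentioned above could be sketched roughly as follows. This is only a minimal illustration: the names multiply and divide are made up here, zero is special-cased because log(0) is not finite, and the result carries the usual small rounding error of exp and log.

```cpp
#include <cassert>
#include <cmath>

// Multiplies via logarithms of the absolute values, then restores the sign.
// (Hypothetical helper name; not from the answer above.)
double multiply(double a, double b)
{
    if (a == 0.0 || b == 0.0)
        return 0.0;  // log(0) is -infinity, so handle zero separately
    double magnitude = std::exp(std::log(std::fabs(a)) + std::log(std::fabs(b)));
    bool negative = (a < 0.0) != (b < 0.0);  // exactly one operand negative
    return negative ? -magnitude : magnitude;
}

// Divides via logarithms of the absolute values, then restores the sign.
double divide(double a, double b)
{
    assert(b != 0.0);  // division by zero is left undefined in this sketch
    if (a == 0.0)
        return 0.0;
    double magnitude = std::exp(std::log(std::fabs(a)) - std::log(std::fabs(b)));
    bool negative = (a < 0.0) != (b < 0.0);
    return negative ? -magnitude : magnitude;
}
```

Because of the round trip through log and exp, results should be compared with a tolerance rather than with ==.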

I actually did this back in high school. I wrote a program that could multiply or divide two arbitrarily-long floating point numbers. Basically, I did it the exact same way I would have done it through long multiplication/division.
I kept both values in arrays of decimal digits.
char firstValue[1024];
char secondValue[1024];
I don't remember if I kept them as ASCII or converted them. It was 40 years ago, after all.
Then I worked it out on paper. Multiply isn't hard, although admittedly I used the * operator to multiply two one-digit values. But you could implement an integerMultiply method.
If you can do it in long hand on paper, you can write an algorithm for it. But it's way too long for an answer here.
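For what it's worth, a compact sketch of the long-hand idea might look like this. It is not the original program (which is long gone); it assumes non-negative inputs written as plain decimal strings such as "12.5", and the helper names are invented for illustration. Single-digit products are formed by repeated addition and carries by repeated subtraction, so neither * nor / appears.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Product of two single digits by repeated addition.
int digit_product(int x, int y)
{
    int result = 0;
    for (int i = 0; i < y; ++i)
        result += x;
    return result;
}

// Removes the decimal point and reports how many digits followed it.
std::string strip_point(const std::string& s, std::size_t& frac_digits)
{
    std::string digits;
    bool seen_point = false;
    frac_digits = 0;
    for (char c : s) {
        if (c == '.') { seen_point = true; continue; }
        assert(c >= '0' && c <= '9');  // only plain decimal digits are supported
        digits += c;
        if (seen_point) ++frac_digits;
    }
    return digits;
}

// Schoolbook multiplication of two non-negative decimal strings.
// Leading zeros are kept in the result to keep the sketch short.
std::string long_multiply(const std::string& a, const std::string& b)
{
    std::size_t frac_a = 0, frac_b = 0;
    const std::string da = strip_point(a, frac_a);
    const std::string db = strip_point(b, frac_b);
    std::vector<int> cells(da.size() + db.size(), 0);
    for (std::size_t i = da.size(); i-- > 0;)
        for (std::size_t j = db.size(); j-- > 0;)
            cells[i + j + 1] += digit_product(da[i] - '0', db[j] - '0');
    // Propagate carries from the least significant cell upward.
    for (std::size_t k = cells.size(); k-- > 1;)
        while (cells[k] >= 10) { cells[k] -= 10; ++cells[k - 1]; }
    const std::size_t frac = frac_a + frac_b;
    std::string out;
    for (std::size_t k = 0; k < cells.size(); ++k) {
        if (frac > 0 && k == cells.size() - frac) out += '.';
        out += static_cast<char>('0' + cells[k]);
    }
    return out;
}
```

With these definitions, long_multiply("12.5", "0.04") yields "000.500"; stripping the leading zeros and handling signs are left out.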

Related

How to set precision of a float?

For a number a = 1.263839, we can do:
float a = 1.263839;
cout << fixed << setprecision(2) << a << endl;
Output: 1.26
But what if I want to set the precision of a number and store it? For example,
convert 1.263839 to 1.26 without printing it.

But what if I want to set the precision of a number and store it
You can store the desired precision in a variable:
int precision = 2;
You can then later use this stored precision when converting the float to a string:
std::cout << std::setprecision(precision) << a;
I think OP wants to convert from 1.263839 to 1.26 without printing the number.
If this is your goal, then you first must realise that 1.26 is not representable in the most commonly used floating point representations. The closest representable 32 bit binary IEEE-754 value is 1.2599999904632568359375.
So, assuming such representation, the best that you can hope for is some value that is very close to 1.26. In best case the one I showed, but since we need to calculate the value, keep in mind that some tiny error may be involved beyond the inability to precisely represent the value (at least in theory; there is no error with your example input using the algorithm below, but the possibility of accuracy loss should always be considered with floating point math).
The calculation is as follows:
Let P be the number of digits after the decimal point that you want to round to (2 in this case).
Let D be 10^P (100 in this case).
Multiply input by D
std::round to nearest integer.
Divide by D.
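The steps above can be sketched as follows. This is a minimal illustration using double rather than float; the function name round_to_decimals is made up here, and D is built by repeated multiplication so that it is exact for small P.

```cpp
#include <cassert>
#include <cmath>

// Rounds value to the given number of digits after the decimal point.
// (Hypothetical helper; the result is the nearest representable double,
// which is usually not exactly the decimal value.)
double round_to_decimals(double value, int decimals)
{
    assert(decimals >= 0);
    double d = 1.0;  // D = 10^P
    for (int i = 0; i < decimals; ++i)
        d *= 10.0;
    return std::round(value * d) / d;
}
```

For example, round_to_decimals(1.263839, 2) gives the double closest to 1.26.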
P.S. Sometimes you might not want to round to the nearest, but instead want std::floor or std::ceil to the precision. This is slightly trickier. Simply std::floor(val * D) / D is wrong. For example 9.70 floored to two decimals that way would become 9.69, which would be undesirable.
What you can do in this case is multiply with one extra order of magnitude of precision, round to nearest, then divide the extra magnitude away and proceed:
Let P be the number of digits after the decimal point that you want to round to (2 in this case).
Let D be 10^P (100 in this case).
Multiply input by D * 10
std::round to nearest integer.
Divide by 10
std::floor or std::ceil
Divide by D.
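A rough sketch of the floor variant, again with double and a made-up function name. Note the trade-off of the extra rounding step: a value that sits within half a unit of the extra digit below a boundary is treated as being on the boundary.

```cpp
#include <cassert>
#include <cmath>

// Floors value to the given number of decimals, rounding at one extra
// digit first so that inputs like 9.70 are not pushed down to 9.69.
// (Hypothetical helper name.)
double floor_to_decimals(double value, int decimals)
{
    assert(decimals >= 0);
    double d = 1.0;  // D = 10^P
    for (int i = 0; i < decimals; ++i)
        d *= 10.0;
    // Multiply by D * 10, round to nearest, then drop the extra digit.
    const double guarded = std::round(value * (d * 10.0)) / 10.0;
    return std::floor(guarded) / d;
}
```

Swapping std::floor for std::ceil gives the ceiling variant.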

You would need to truncate it. Possibly the easiest way is to multiply it by a factor (in case of 2 decimal places, by a factor of 100), then truncate or round it, and lastly divide by the very same factor.
Now, mind you, that floating-point precision issues might occur, and that even after those operations your float might not be 1.26, but 1.26000000000003 instead.

If your goal is to store a number with a small, fixed number of digits of precision after the decimal point, you can do that by storing it as an integer with an implicit power-of-ten multiplier:
#include <stdio.h>
#include <math.h>
// Given a floating point value and the number of digits
// after the decimal-point that you want to preserve,
// returns an integer encoding of the value.
int ConvertFloatToFixedPrecision(float floatVal, int numDigitsAfterDecimalPoint)
{
return (int) roundf(floatVal*powf(10.0f, numDigitsAfterDecimalPoint));
}
// Given an integer encoding of your value (as returned
// by the above function), converts it back into a floating
// point value again.
float ConvertFixedPrecisionBackToFloat(int fixedPrecision, int numDigitsAfterDecimalPoint)
{
return ((float) fixedPrecision) / powf(10.0f, numDigitsAfterDecimalPoint);
}
int main(int argc, char ** arg)
{
const float val = 1.263839;
int fixedTwoDigits = ConvertFloatToFixedPrecision(val, 2);
printf("fixedTwoDigits=%i\n", fixedTwoDigits);
float backToFloat = ConvertFixedPrecisionBackToFloat(fixedTwoDigits, 2);
printf("backToFloat=%f\n", backToFloat);
return 0;
}
When run, the above program prints this output:
fixedTwoDigits=126
backToFloat=1.260000

If you're talking about storing exactly 1.26 in your variable, chances are you can't (there may be an off chance that exactly 1.26 works, but let's assume it doesn't for a moment) because floating point numbers don't work like that. There are always little inaccuracies because of the way computers handle floating point decimal numbers. Even if you could get 1.26 exactly, the inaccuracy could come back the moment you try to use it in a calculation.
That said, you can use some math and truncation tricks to get very close:
int main()
{
// our float
float a = 1.263839;
// the precision we're trying to accomplish
int precision = 100; // 2 decimal places
// because we're an int, this will keep the 126 but lose everything else
int truncated = a * precision; // multiplying by the precision ensures we keep that many digits
// convert it back to a float
// Of course, we need to ensure we're doing floating point division
float b = static_cast<float>(truncated) / precision;
cout << "a: " << a << "\n";
cout << "b: " << b << "\n";
return 0;
}
Output:
a: 1.26384
b: 1.26
Note that this is not really 1.26 here, but it is very close.
This can be demonstrated by using setprecision():
cout << "a: " << std::setprecision(10) << a << "\n";
cout << "b: " << std::setprecision(10) << b << "\n";
Output:
a: 1.263839006
b: 1.25999999
So again, it's not exactly 1.26, but very close, and slightly closer than you were before.

Using a stringstream would be an easy way to achieve that:
#include <iostream>
#include <iomanip>
#include <sstream>
using namespace std;
int main() {
stringstream s("");
s << fixed << setprecision(2) << 1.263839;
float a;
s >> a;
cout << a; //Outputs 1.26
return 0;
}

Calculate PI up to 42 decimal places

I'm trying to write a program that uses the series to compute the value of PI. The user will input how many terms of the series the program should compute, and then the program should output its calculated value of PI. I believe I've successfully written the code for this; however, it does not do well with large numbers and only gives me a few decimal places. When I tried to use cout << fixed << setprecision(42); it just gave me "nan" as the value of PI.
int main() {
long long seqNum; // sequence number users will input
long double val; // the series output
cout << "Welcome to the compute PI program." << endl; // welcome message
cout << "Please enter the sequence number in the form of an integer." << endl;
cin >> seqNum; // user input
while ( seqNum < 0) // validation, number must be positive
{
cout << "Please enter a positive number." << endl;
cin >> seqNum;
} // end while
if (seqNum > 0)
{
for ( long int i = 0; i < seqNum; i++ )
{
val = val + 4*(pow(-1.00,i)/(1 + 2*i)); // Gregory-Leibniz sum calculation
}// end for
cout << val;
} // end if
return 0;
}
Any help would be really appreciated. Thank you

Your problem involves an elementary, fundamental principle related to double values: a double, or any floating point type, can hold only a fixed, limited number of significant digits. Plain, garden-variety doubles do not offer unlimited digits of precision; there is a hard upper limit. The exact limit is implementation defined, but on modern C++ implementations the typical limit is just 16 or 17 digits of precision, not even close to your desired 42 digits of precision.
#include <limits>
#include <iostream>
int main()
{
std::cout << std::numeric_limits<double>::max_digits10 << std::endl;
return 0;
}
This gives you the maximum digits of precision with your platform/C++ compiler. This shows a maximum of 17 digits of precision with g++ 9.2 on Linux (max_digits10 is C++11 or later, use digits10 with old C++ compilers to show a closely-related metric).
Your desired 42 digits of precision likely far exceed what your modest doubles can handle. There are various special-purpose math libraries that can perform calculations with higher levels of precision; you can investigate those if you wish.

You did not initialize or assign any value to val, but you are reading it when you get to the first iteration of
val = val + 4*(pow(-1.00,i)/(1 + 2*i));
This causes your program to have undefined behavior. Initialize val, probably to zero:
long double val = 0; // the series output
That aside, as mentioned in the answer of @SamVarshavchik, there is a hard limit on the precision you can reach with the built-in floating point types, and 42 significant places is almost certainly beyond it. Similarly, the integer types that you are using are limited in size, probably to at most 2^64, which is approximately 10^19.
Even if these limits weren't the problem, the series requires summation of roughly 10^42 terms to get PI to a precision of 42 places. It would take you longer than the universe has been around to calculate to that precision with all of Earth's current computing power combined.
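To see the slow convergence for yourself, here is a small sketch of the sum with the accumulator initialized. The function name leibniz_pi is invented for this example, and the sign is flipped with a variable instead of calling pow on every iteration.

```cpp
#include <cassert>

// Sums the first `terms` terms of the Gregory-Leibniz series for pi.
// (Hypothetical helper name.)
long double leibniz_pi(long long terms)
{
    assert(terms >= 0);
    long double sum = 0.0L;   // initialized, unlike val in the question
    long double sign = 1.0L;  // alternates +1, -1, +1, ...
    for (long long i = 0; i < terms; ++i) {
        sum += sign * 4.0L / (2.0L * i + 1.0L);
        sign = -sign;
    }
    return sum;
}
```

The error after N terms is on the order of 1/N, so even a million terms only buy about six correct decimal places.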

Estimating Pi using leibniz in C++

I am using the Leibniz method for calculating pi. My program inputs an accuracy number for a calculation of π and then applies the Leibniz infinite sum to find an approximate value of π within that accuracy.
I am trying to write a C++ while loop that takes an accuracy number (a double) as input; initially the sum and n would both equal zero.
At n=0 the denominator is d = 2*n+1 = 1, so the first term would be 4.0/1 = 4.0.
This would then be added to the sum, and n would be incremented. If the previous term was greater than the accuracy, the loop would continue.
When the previous term is smaller than the accuracy, the number should be output and the loop should exit.
My accuracy is never more than ±0.0001.
My non-working code:
while(sum<accuracy)
{
int n = 0;
while(n>10) //this is here due to debugging
{
sum=-4/(2*n+1);
n++;
cout << sum << endl;
}
}
Question:
I am having a hard time coming up with a working while loop that can do what I described above. How do I create such a loop? Please provide an example with an explanation.

Make sure that your sum variable is a float/double. As our friend commented, you are a victim of integer division:
double sum;
while(sum<accuracy){
int n=0;
while(n>10)
{
sum=-4.0/(2.0*n+1.0);
n++;
cout << sum << endl;
}
}
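Beyond the integer division, the loop conditions also need to change for the loop to run at all. One possible shape for the loop described in the question is sketched below; the function name leibniz_until is made up, and the stopping rule is my reading of the description (stop once the next term drops below the accuracy).

```cpp
#include <cassert>

// Adds Leibniz terms 4/(2n+1) with alternating sign until the next
// term is smaller than `accuracy`. (Hypothetical helper name.)
double leibniz_until(double accuracy)
{
    assert(accuracy > 0.0);
    double sum = 0.0;
    double sign = 1.0;
    int n = 0;
    double term = 4.0;  // term for n = 0: 4.0 / (2*0 + 1)
    while (term >= accuracy) {
        sum += sign * term;   // add the current term
        sign = -sign;         // the series alternates in sign
        ++n;
        term = 4.0 / (2.0 * n + 1.0);  // floating point division
    }
    return sum;
}
```

Since the series is alternating with shrinking terms, the result is off from π by less than the first term that was left out, which is below the requested accuracy.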

Euler's number expansion

#include <iostream>
#include <iomanip>
using namespace std;
int a[8], e[8];
void term (int n)
{
a[0]=1;
for (int i=0; i<8; i++)
{
if (i<7)
{
a[i+1]+=(a[i]%n)*100000;
}
/* else
{
a[i+1]+=((a[i]/640)%(n/640))*100000;
}
*/
a[i]=a[i]/(n);
}
}
void sum ()
{
}
int factorial(int x, int result = 1)
{
if (x == 1)
return result;
else return factorial(x - 1, x * result);
}
int main()
{
int n=1;
for (int i=1; i<=30; i++)
{
term(n);
cout << a[0] << " "<< a[1] << " " << a[2] << " "
<< a[3] << " " << a[4] << " " << a[5]<< " "
<< " " << a[6] << " " << a[7] << endl;
n++;
for (int j=1; j<8; j++)
a[j]=0;
}
return 0;
}
What I have above is the code I have so far.
sum() and the rest are purposely left incomplete because they are still being built.
Now, I need to write an expansion of Euler's number.
The exercise is meant to make you use arrays like x[n] to split a result into multiple parts, and functions to calculate the results.
According to it,
I need to take the relevant terms of the Maclaurin expansion and calculate them.
The x in e^x = 1 + x + (1/2!)*x^2 + ... is always 1,
giving us e = 1 + 1 + 1/2! + 1/3! + ... + 1/n! to calculate.
The program should calculate it in order of N,
so if N is 1 it will calculate only the corresponding factorial division part,
meaning that one variable will hold the result of that calculation, which is x=1.00000000~, and the other will hold the running sum so far, which is e=2.000000~
For N=2
x=1/2!, e=previous e+x
for N=3
x=1/3!, e=previous e+x
The maximum value of N is 29.
Each time the result is calculated, it needs to hold all the digits after the decimal point in separate variables like x[1] x[2] x[3] until all 30~35 digits of precision are filled.
So when printing out, in the case of N=2,
x[0].x[1]x[2]x[3]~
should come out as
0.50000000000000000000
where x[0] holds the value before the decimal point and x[1~3] hold the rest, 5 digits each.
Sorry if my explanation is poor, but this is what the assignment asks for.
All the arrays must be int and I cannot use other types,
and I can't use bigint as it defeats the purpose.
The other problem I have is that the operations go well until the 7th term.
From the 8th onward, it gives me negative numbers.
for N=8
It should be 00002480158730158730158730.
Instead I get 00002 48015 -19220 -41904 30331 53015 -19220
That is obviously due to int's limit, since at that point it does
1936000000%40320
in order to get a[3]'s value, which is 35200, which is then multiplied by 100000,
giving us 3520000000/40320. The value of a[3] exceeds the limit of int; is there any way to fix this?
I cannot use doubles or bigints for this, so if anyone has a workaround, it would be appreciated.

You cannot use floating point or bigint, but what about other built-in integral types like long long, unsigned long long, etc.? To make it explicit you could use <stdint.h>'s int64_t and uint64_t (or <cstdint>'s std::int64_t and std::uint64_t; <cstdint> has been part of the standard since C++11).
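As a rough sketch of that idea, the chunked long division can keep its results in an int array while holding only the running remainder in a 64-bit integer. The function name reciprocal_chunks and the layout (index 0 is the integer part, then five digits per chunk) are assumptions modelled on the question, not the asker's actual code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kChunks = 8;               // one integer chunk + 7 fractional chunks
constexpr std::int64_t kBase = 100000;   // five decimal digits per chunk

// Long division of 1 by `divisor`, producing five digits per chunk.
// Only the remainder is 64-bit; the results fit in plain int.
std::array<int, kChunks> reciprocal_chunks(std::int64_t divisor)
{
    assert(divisor > 0);
    assert(divisor <= INT64_MAX / kBase);  // remainder * kBase must not overflow
    std::array<int, kChunks> chunks{};
    std::int64_t remainder = 1;
    for (int i = 0; i < kChunks; ++i) {
        chunks[i] = static_cast<int>(remainder / divisor);
        remainder = (remainder % divisor) * kBase;  // shift in the next 5 digits
    }
    return chunks;
}
```

For 8! = 40320 this gives the chunks 0, 00002, 48015, 87301, 58730, 15873 and so on, matching the expected value in the question. It only moves the limit, though: the shifted remainder stops fitting in 64 bits somewhere around 17!, so it does not reach N = 29 on its own.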

I don't know if this is of any use, but you can find the code I wrote to calculate Euler's number here: http://41j.com/blog/2011/10/program-for-calculating-e/

A 32-bit int overflows soon after 11! (13! no longer fits),
so you have to store the larger factorials divided by some number:
12!/10000
13!/10000
When that does not fit anymore, use 10000^2, and so on.
When you use the division, the result is simply shifted to the next four decimals (as I assume was originally intended).
Of course, you do not divide 1/n!;
on integers that will be zero. Divide 10000 instead.
That limits n! to only 9999, so if you want more, add zeros everywhere, and the result holds the decimals.
I also think there can be some overflow, so you should also carry into the upper digits.
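A different route, not taken in the answers above, is to never form n! at all: each term 1/n! is the previous term divided by n, and dividing a chunked number by a small n keeps every intermediate value well inside a 32-bit int. The sketch below assumes the same layout as the question (index 0 is the integer part, five digits per chunk); all names are invented for illustration.

```cpp
#include <array>
#include <cassert>

constexpr int kChunks = 8;      // integer part + 7 chunks of 5 digits
constexpr int kBase = 100000;
using Number = std::array<int, kChunks>;

// term = term / n, by short division from the most significant chunk down.
// Requires term[0] < kBase; the remainder is below n, so
// remainder * kBase + term[i] stays far below the int limit for n <= 29.
void divide_in_place(Number& term, int n)
{
    assert(n >= 1 && n <= 29);
    int remainder = 0;
    for (int i = 0; i < kChunks; ++i) {
        int current = remainder * kBase + term[i];
        term[i] = current / n;
        remainder = current % n;
    }
}

// sum = sum + term, carrying from the least significant chunk upward.
void add_in_place(Number& sum, const Number& term)
{
    int carry = 0;
    for (int i = kChunks - 1; i >= 1; --i) {
        int current = sum[i] + term[i] + carry;
        sum[i] = current % kBase;
        carry = current / kBase;
    }
    sum[0] += term[0] + carry;
}

// e = 1 + 1/1! + 1/2! + ... + 1/max_n!
Number compute_e(int max_n)
{
    Number term{};  // starts as 1 (the 0! term)
    term[0] = 1;
    Number sum{};
    sum[0] = 1;
    for (int n = 1; n <= max_n; ++n) {
        divide_in_place(term, n);  // term is now 1/n!
        add_in_place(sum, term);
    }
    return sum;
}
```

Each division truncates, so the last chunk or two of the result should not be trusted; the leading chunks for max_n = 29 come out as 2, 71828, 18284, 59045, 23536.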

avoid rounding error (floating specifically) c++

http://www.learncpp.com/cpp-tutorial/25-floating-point-numbers/
I have been reading this lately to review C++.
In general, computing class professors tend not to cover these small things, although we knew what rounding errors meant.
Can someone please help me with how to avoid rounding error?
The tutorial shows a sample code
#include <iomanip>
int main()
{
using namespace std;
cout << setprecision(17);
double dValue = 0.1;
cout << dValue << endl;
}
This outputs
0.10000000000000001
By default, floating point output uses 6 digits of precision. Therefore, when we override the default and ask for more (in this case, 17!), we may encounter truncation (as explained by the tutorial as well).
For double, the highest is 16.
In general, how do good C++ programmers avoid rounding error?
Do you guys always look at the binary representation of the number?
Thank you.

The canonical advice for this topic is to read "What Every Computer Scientist Should Know About Floating-Point Arithmetic", by David Goldberg.

In other words, to minimize rounding errors, it can be helpful to keep numbers in decimal fixed-point (and actually work with integers).
#include <iostream>
#include <iomanip>
int main() {
using namespace std;
cout << setprecision(17);
double v1=1, v1D=10;
cout << v1/v1D << endl; // 0.10000000000000001
double v2=3, v2D=1000;
cout << v2/v2D << endl; // 0.0030000000000000001
// v1/v1D + v2/v2D = (v1*v2D+v2*v1D)/(v1D*v2D)
cout << (v1*v2D+v2*v1D)/(v1D*v2D) << endl; // 0.10299999999999999
}
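To make the "actually work with integers" part concrete, here is a small sketch that keeps values as whole thousandths in a 64-bit integer and only turns them into text for display. The scale of 1000 and the function name format_thousandths are assumptions chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::int64_t kScale = 1000;  // three decimal places (assumed)

// Formats a non-negative count of thousandths as a decimal string.
// (Hypothetical helper name.)
std::string format_thousandths(std::int64_t value)
{
    assert(value >= 0);  // negative values are left out of this sketch
    std::string frac = std::to_string(value % kScale);
    while (frac.size() < 3)
        frac.insert(frac.begin(), '0');  // pad to exactly three digits
    return std::to_string(value / kScale) + "." + frac;
}
```

With this representation, 0.1 + 0.003 is the exact integer sum 100 + 3, and format_thousandths(103) gives "0.103" with no rounding error along the way.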

Short version - you can't really avoid rounding and other representation errors when you're trying to represent base 10 numbers in base 2 (ie, using a float or a double to represent a decimal number). You pretty much either have to work out how many significant digits you actually have or you have to switch to a (slower) arbitrary precision library.

Most floating point output routines look to see if the answer is very close to being even when represented in base 10 and round the answer to actually be even on output. By setting the precision in this way you are short-circuiting this process.
This rounding is done because almost no answer that comes out even in base 10 will be even (i.e. end in an infinite string of trailing 0s) in base 2, which is the base in which the number is represented internally. But, of course, the general goal of an output routine is to present the number in a fashion useful for a human being, and most human beings in the world today read numbers in base 10.

When you calculate a simple thing like variance you can have this kind of problem. Here is my solution:
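#include <cstdlib>
#include <sstream>
#include <string>
// Converts val to an integer scaled by 10^precision, going through its text form.
int getValue(double val, int precision){
std::stringstream ss;
ss << val;
std::string strVal = ss.str();
size_t start = strVal.find(".");
std::string major = strVal.substr(0, start);
// No decimal point (e.g. "5") means there is no fractional part.
std::string minor = (start == std::string::npos) ? "" : strVal.substr(start + 1);
// Fill with zeros...
while(minor.length() < static_cast<size_t>(precision)){
minor += "0";
}
// Trim to the requested precision...
if(minor.length() > static_cast<size_t>(precision)){
minor = minor.substr(0, precision);
}
strVal = major + minor;
int intVal = atoi(strVal.c_str());
return intVal;
}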
So you will do your calculation in the integer range.
For example, 2523.49 becomes 252349 with a precision of two digits, and 2523490 with a precision of three digits. If you calculate the mean, for example, you first convert all values to integers, do the summation, and convert the result back to double, so you do not accumulate error. Errors are amplified by operations like square root and power functions.

You want to use the manipulator std::fixed so that your digits are shown in fixed-point rather than scientific notation. After you use fixed, setprecision() sets the number of digits shown to the right of the decimal point. An example based on your original code follows.
#include <iostream>
#include <iomanip>
int main()
{
using namespace std;
double dValue = 0.19213;
cout << fixed << setprecision(2) << dValue << endl;
}
outputs as:
0.19
