Doing one of my first homeworks at uni, and I have run into this problem:
Task: Find the sum of all numbers with n digits (for n=1 that means 1, 2, 3... 8, 9, so the answer is 45)
Problem: The code I wrote gets all the test answers correct up to 10 to the power of 9, but once it reaches 10 to the power of 10 territory the answers start being wrong. They are really close to what I should be getting, but not quite there (for example, my output = 49499999995499995136, expected result = 49499999995500000000)
Would really appreciate some help/insights. I am guessing it's something to do with the variable types, but I'm not sure of a possible solution.
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;
int main()
{
int n;
double ats = 0, maxi, mini;
cin >> n;
maxi = pow(10, n) - 1;
mini = pow(10, n-1) - 1;
ats = (maxi * (maxi + 1)) / 2 - (mini * (mini + 1)) / 2;
cout << setprecision(0) << fixed << ats;
}
The main cause of the problem is the pow() function. It works with double, not int. Loss of accuracy is the price of representing huge numbers.
There are 3 ways to solve the problem:
For small n you can write your own long long int pow(int x, int pow) function. But there is the problem that we can overflow even long long int.
Use long-arithmetic (bignum) functions, as #rustyx said. You can write your own on top of a vector, or find and include a library.
There is a math solution specific to the topic's task. It avoids the big-numbers problem entirely.
You can write your formula like
(((10^n) - 1) * (10^n) - ((10^m) - 1) * (10^m)) / 2 , (here m = n-1)
Then multiply out the numbers in the numerator, regroup them, and extract the common factor 10^(n-1). Then you can see that the answer has a structure:
X9...9Y0...0 for big enough n, where the letters X and Y are constants.
So you can just print the answer "string" without calculating anything big.
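Here is a minimal sketch of that idea (my illustration, not the answerer's code). Working through the regrouping above, for n >= 3 the answer is the digits 494, then n-3 nines, then 55, then n-2 zeros:
#include <iostream>
#include <string>
using namespace std;

int main()
{
    int n;
    cin >> n;                 // assumes n >= 3; n = 1 and n = 2 are special cases
    string ats = "494" + string(n - 3, '9') + "55" + string(n - 2, '0');
    cout << ats << '\n';      // e.g. n = 10 -> 49499999995500000000
}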
I think you're stretching floating points beyond their precision. Let me explain:
The C pow() function takes doubles as arguments. You're passing ints; the compiler adds code to convert them to doubles before they reach pow(). (And anyway, you're storing the result as a double when you get the return value, since you declared the variable that way.)
Floating points are called that way precisely because the point "floats". Inside a double there's a sign bit, some bits for the mantissa and some bits for the exponent. In binary, multiplying by a power of two is equivalent to moving the binary point to the right (or to the left when multiplying by a negative power). So basically the exponent is saying where the binary point is. The great advantage of using this kind of in-memory representation for doubles is that you get a lot of precision for numbers close to 0, and gradually lose precision as numbers become bigger.
That last thing is exactly what's happening to you. Your number is too large to be stored exactly, so it's being rounded to the closest value a double can represent, which at this magnitude is a multiple of a fairly large power of two.
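A quick check of my own (assuming IEEE-754 doubles): printing the spacing between adjacent doubles at that magnitude shows why the exact answer cannot survive the trip through a double:
#include <cmath>
#include <cstdio>

int main()
{
    double x = 49499999995500000000.0;   // the expected answer
    printf("nearest double: %.0f\n", x); // whatever the literal rounded to
    printf("gap to the next double: %.0f\n",
           std::nextafter(x, 1e30) - x); // 8192 with IEEE-754 doubles
}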
Quick experiment: press F12 in your browser, open the JavaScript console and type 49499999995499995136. In my case, in Chrome, I reproduce the same problem.
If you really, really, really want precision with such big numbers then you can try some of these libraries, but that's too advanced for a student program; you don't need it. Just add an if block and print an error message if the number that the user typed is too big (professors love that, and it is actually quite correct).
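A sketch of that suggestion (mine, following the question's formula): do everything in exact 64-bit integers with a hand-rolled integer pow, and reject any n for which the intermediate product maxi * (maxi + 1) would overflow a long long, which happens for n > 9:
#include <iostream>
using namespace std;

int main()
{
    int n;
    cin >> n;
    if (n < 1 || n > 9) {              // the professor-pleasing range check
        cout << "n must be between 1 and 9 for exact 64-bit arithmetic\n";
        return 1;
    }
    long long p = 1;                   // hand-rolled integer pow: p = 10^(n-1)
    for (int i = 1; i < n; i++)
        p *= 10;
    long long maxi = p * 10 - 1;       // largest n-digit number
    long long mini = p - 1;            // largest (n-1)-digit number
    cout << maxi * (maxi + 1) / 2 - mini * (mini + 1) / 2 << '\n';
}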
I need to write a program that converts binary numbers to decimal.
I'm very new to C++ so I've tried looking at other people's examples, but they are way too advanced for me. I thought I had a clever idea of how to do it, but I'm not sure if my idea was way off or if I'm just missing something.
#include <iostream>
#include <string>
#include <cmath>
using namespace std;

int main(void)
{
//variables
string binary;
int pow2 = binary.length() - 1;
int length = binary.length();
int index = 0;
int decimal = 0;
cout << "Enter a binary number ";
cin >> binary; cout << endl;
while(index < length)
{
if (binary.substr(index, 1) == "1")
{
decimal = decimal + pow(2, pow2);
}
index++;
pow2--;
}
cout << binary << " converted to decimal is " << decimal;
}
Your computer is a logical beast. Your computer executes your program, one line at a time. From start to finish. So, let's take a trip, together, with your computer, and see what it ends up doing, starting at the very beginning of your main:
string binary;
Your computer begins by creating a new std::string object. Which is, of course, empty. There's nothing in it.
int pow2 = binary.length() - 1;
This is the very second thing that your computer does.
And because we've just discovered that binary is empty, binary.length() is obviously 0, so this sets pow2 to -1. When you take 0, and subtract 1 from it, that's what you get.
int length = binary.length();
Since binary is still empty, its length() is still 0, and this simply creates a new variable called length, whose value is 0.
int index = 0;
int decimal = 0;
This creates a bunch more variables, and sets them to 0. That's the next thing your computer does.
cout << "Enter a binary number ";
cin >> binary; cout << endl;
Here you print some stuff, and read some stuff.
while(index < length)
Now, we get into the thick of things. So, let's see what your computer did, before you got to this point. It set index to 0, and length to 0 also. So, both of these variables are 0, and, therefore, this condition is false, 0 is not less than 0. So nothing inside the while loop ever executes. We can skip to the end of your program, after the while loop.
cout << binary << " converted to decimal is " << decimal;
And that's how your computer always gives you the wrong result. Your actual question was:
sure if my idea was way off or if I'm just missing something.
Well, there are other problems with your idea, too. It was slightly off. For starters, nothing here really requires the use of the pow function. Using pow is like trying to kill a fly with a hammer. What pow does is: it converts the integer values to floating point, computes the natural logarithm of the first number, multiplies it by the second number, and then raises e to that power; then your code (which never runs) finally converts the result from floating point back to integer. Nothing of this sort is ever needed in order to simply convert binary to decimal. This never requires employing the services of natural logarithms. This is not what pow is for.
This task can be easily accomplished with just multiplication and addition. For example, if you already have the number 3, and your next digit is 7, you end up with 37 by multiplying 3 by 10 and then adding 7. You do the same exact thing with binary, base 2, with the only difference being that you multiply your number by 2, instead of 10.
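A minimal sketch of that idea (mine, not part of the original answer): read the string first, then fold its digits left to right:
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string binary;
    cout << "Enter a binary number ";
    cin >> binary;                          // read the input BEFORE using its length

    int decimal = 0;
    for (char c : binary)
        decimal = decimal * 2 + (c - '0');  // shift previous digits up, add the new one

    cout << binary << " converted to decimal is " << decimal << endl;
}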
But what you're really missing the most is the Golden Rule Of Computer Programming:
A computer always does exactly what you tell it to do, instead of what you want it to do.
You need to tell your computer exactly what your computer needs to do. One step at a time. And in the right order. Telling your computer to compute the length of the string before it's even read from std::cin does not accomplish anything useful. It does not automatically recompute the length after the string is actually read. Therefore, if you need the length of an entered string, you compute it after it's been read in, not before. And so on.
I'm on Manjaro 64 bit, latest edition. HP pavilion g6, Codeblocks
Release 13.12 rev 9501 (2013-12-25 18:25:45) gcc 5.2.0 Linux/unicode - 64 bit.
There was a discussion between students on why
sn = 1 + 1/2 + ... + 1/n diverges
sn2 = 1 + 1/4 + ... + 1/n^2 converges
So I decided to write a program about it, just to show them what kind of output they can expect:
#include <iostream>
#include <math.h>
#include <fstream>
using namespace std;
int main()
{
long double sn =0, sn2=0; // sn2 is 1/n^2
ofstream myfile;
myfile.open("/home/Projects/c++/test/test.csv");
for (double n =2; n<100000000;n++){
sn += 1/n;
sn2 += 1/pow(n,2);
myfile << "For n = " << n << " Sn = " << sn << " and Sn2 = " << sn2 << endl;
}
myfile.close();
return 0;
}
Starting from n=9944 I got sn2 = 0.644834, and kept getting it forever. I did expect that the compiler would round the number and ignore the 0s at some point, but this is just too early, no?
So at what theoretical point do the 0s start being ignored? And what to do if you care about all the 0s in a number? If long double doesn't do it, then what does?
I know it seems like a silly question, but I expected to see a longer number, since you can store a big part of pi in a long double. By the way, I get the same result for double too.
The code that you wrote suffers from a classic programming mistake: it sums a sequence of floating-point numbers by adding larger numbers to the sum first and smaller numbers later.
This will inevitably lead to precision loss during addition, since at some point in the sequence the sum will become relatively large, while the next member of the sequence will become relatively small. Adding a sufficiently small floating-point value to a sufficiently large floating-point sum does not affect the sum. Once you reach that point, it will look as if the addition operation is "ignored", even though the value you attempt to add is not zero.
You can observe the same effect if you try calculating 100000000.0f + 1 on a typical machine: it still evaluates to 100000000. This does not happen because 1 somehow gets rounded to zero. This happens because the mathematically-correct result 100000001 is rounded back to 100000000. In order to force 100000000.0f to change through addition, you need to add at least 5 (and the result will be "snapped" to 100000008).
So, the issue here is not that the compiler "rounds the number when it gets so small", as you seem to believe. Your 1/pow(n,2) number is probably fine and sufficiently precise (not rounded to 0). The issue here is that at some iteration of your cycle the small non-zero value of 1/pow(n,2) just cannot affect the sum anymore.
While it is true that adjusting output precision will help you to see better what is going on (as stated in the comments), the real issue is what is described above.
When calculating sums of floating-point sequences with large differences in member magnitudes, you should do it by adding smaller members of the sequence first. Using my 100000000.0f example again, you can easily see that 4.0f + 4.0f + 100000000.0f correctly produces 100000008, while 100000000.0f + 4.0f + 4.0f is still 100000000.
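Here is that example as a complete program (a small sketch of mine, not from the original answer):
#include <iostream>

int main()
{
    float big_first   = 100000000.0f + 4.0f + 4.0f;   // stays 100000000
    float small_first = 4.0f + 4.0f + 100000000.0f;   // becomes 100000008
    std::cout.precision(10);
    std::cout << big_first << '\n' << small_first << '\n';
}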
You're not running into precision issues here. The sum doesn't stop at 0.644834; it keeps going to roughly the correct value:
#include <iostream>
#include <math.h>
using namespace std;
int main() {
long double d = 0;
for (double n = 2; n < 100000000; n++) {
d += 1/pow(n, 2);
}
std::cout << d << endl;
return 0;
}
Result:
0.644934
Note the 9! That's not 0.644834 any more.
If you were expecting 1.644934 (that is, π²/6), you should have started the sum at n=1. If you were expecting visible changes between successive partial sums, you didn't see those because C++ displays only 6 significant digits of a floating-point value by default. You can configure your output stream to display more digits with std::setprecision from the iomanip header:
myfile << std::setprecision(9);
My goal is the following:
Generate successive values, such that each new one was never generated before, until all possible values are generated. At this point, the counter starts the same sequence again. The main point here is that all possible values are generated without repetition (until the period is exhausted). It does not matter if the sequence is simply 0, 1, 2, 3, ..., or in some other order.
For example, if the range can be represented simply by an unsigned, then
void increment (unsigned &n) {++n;}
is enough. However, the integer range is larger than 64 bits. For example, in one place I need to generate a 256-bit sequence. A simple implementation is like the following, just to illustrate what I am trying to do:
#include <array>
#include <cstdint>
#include <cstring>

typedef std::array<uint64_t, 4> ctr_type;
static constexpr uint64_t max = ~((uint64_t) 0);
void increment (ctr_type &ctr)
{
if (ctr[0] < max) {++ctr[0]; return;}
if (ctr[1] < max) {++ctr[1]; return;}
if (ctr[2] < max) {++ctr[2]; return;}
if (ctr[3] < max) {++ctr[3]; return;}
ctr[0] = ctr[1] = ctr[2] = ctr[3] = 0;
}
So if ctr starts with all zeros, then first ctr[0] is increased one by one until it reaches max, and then ctr[1], and so on. If all 256 bits are set, then we reset the counter to all zeros and start again.
The problem is that such an implementation is surprisingly slow. My current improved version is sort of equivalent to the following:
void increment (ctr_type &ctr)
{
std::size_t k = (!(~ctr[0])) + (!(~ctr[1])) + (!(~ctr[2])) + (!(~ctr[3]));
if (k < 4)
++ctr[k];
else
memset(ctr.data(), 0, 32);
}
If the counter is only manipulated with the above increment function, and always starts at zero, then ctr[k] == 0 whenever ctr[k - 1] == 0. And thus the value k will be the index of the first element that is less than the maximum.
I expected the first to be faster, since branch mis-prediction should happen only once in every 2^64 iterations. For the second, though mis-prediction only happens once every 2^256 iterations, that should not make much of a difference. And apart from the branching, it needs four bitwise negations, four boolean negations, and three additions, which might cost much more than the first.
However, clang, gcc, and Intel icpc all generate binaries in which the second is much faster.
My main question is: does anyone know of a faster way to implement such a counter? It does not matter if the counter starts by increasing the first integer, or whether it is implemented as an array of integers at all, as long as the algorithm generates all 2^256 combinations of 256 bits.
What makes things more complicated is that I also need non-uniform increments. For example, each time the counter is incremented by K where K > 1, but K almost always remains a constant. My current implementation is similar to the above.
To provide some more context: one place I am using the counters is as input to the AES-NI aesenc instruction. So a distinct 128-bit integer (loaded into an __m128i), after going through 10 (or 12 or 14, depending on the key size) rounds of the instruction, produces a distinct 128-bit integer. If I generated one __m128i integer at a time, then the cost of the increment would matter little. However, since aesenc has quite a bit of latency, I generate integers in blocks. For example, I might have 4 blocks, ctr_type block[4], initialized equivalently to the following:
block[0]; // initialized to zero
block[1] = block[0]; increment(block[1]);
block[2] = block[1]; increment(block[2]);
block[3] = block[2]; increment(block[3]);
And each time I need new output, I increment each block[i] by 4 and generate 4 __m128i outputs at once. By interleaving instructions, overall I was able to increase the throughput and reduce the cycles per byte of output (cpB) from 6 to 0.9 when using 2 64-bit integers as the counter and 8 blocks. However, if I instead use 4 32-bit integers as the counter, the throughput, measured in bytes per second, is cut in half. I know for a fact that on x86-64, 64-bit integers can be faster than 32-bit ones in some situations, but I did not expect such a simple increment operation to make such a big difference. I have carefully benchmarked the application, and the increment is indeed what slows the program down. Since loading into __m128i and storing the __m128i output into usable 32-bit or 64-bit integers are both done through aligned pointers, the only difference between the 32-bit and 64-bit versions is how the counter is incremented. I expected that the AES-NI instructions, after the integers are loaded into __m128i, would dominate the performance, but when using 4 or 8 blocks, that was clearly not the case.
So to summarize, my main question is: does anyone know a way to improve the above counter implementation?
It's not only slow, but impossible. The total energy of the universe is insufficient for 2^256 bit changes, and that is already assuming a Gray-code counter, which flips only one bit per increment.
The next thing, before any optimization, is to fix the original implementation:
void increment (ctr_type &ctr)
{
if (++ctr[0] != 0) return;
if (++ctr[1] != 0) return;
if (++ctr[2] != 0) return;
++ctr[3];
}
If each ctr[i] were not allowed to overflow to zero, the period would be just 4*(2^64), as in 0-9, 19, 29, 39, 49, ..., 99, 199, 299, ..., 999, 1999, 2999, 3999, ..., 9999.
As a reply to the comment: it takes 2^64 iterations to get the first overflow. Being generous, up to 2^32 iterations could take place in a second, meaning that the program would have to run for 2^32 seconds to see the first carry out. That's about 136 years.
EDIT
If the original implementation with 2^66 states is really what is wanted, then I'd suggest changing the interface and the functionality to something like:
(*counter) += 1;
if (*counter == 0)                     // current word wrapped around
{
    counter++;                         // Move to next word
    if (counter > tail_of_array) {
        counter = head_of_array;       // the whole period is exhausted
        memset(counter, 0, 4 * sizeof *counter);  // restart from all zeros
    }
}
The point being that the overflow is still very infrequent; almost always there's just one word to be incremented.
If you're using GCC, or compilers with __int128 like Clang or ICC:
unsigned __int128 H = 0, L = 0;
L++;
if (L == 0) H++;
On systems where __int128 isn't available:
std::array<uint64_t, 4> c{};
c[0]++;
if (c[0] == 0)
{
c[1]++;
if (c[1] == 0)
{
c[2]++;
if (c[2] == 0)
{
c[3]++;
}
}
}
In inline assembly it's much easier to do this using the carry flag. Unfortunately, most high-level languages don't have a means to access it directly. Some compilers do have intrinsics for adding with carry, like __builtin_uaddll_overflow in GCC and __builtin_addcll in Clang.
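For instance, a minimal sketch (mine, assuming GCC or Clang) of the 256-bit increment built on __builtin_uaddll_overflow, whose boolean result is exactly the carry-out:
#include <array>

using ctr_type = std::array<unsigned long long, 4>;

// Increment a 256-bit counter. The builtin returns true on unsigned
// overflow, which is precisely the carry into the next word.
void increment(ctr_type &ctr)
{
    unsigned long long carry = 1;           // add 1 to the whole 256-bit value
    for (auto &word : ctr) {
        carry = __builtin_uaddll_overflow(word, carry, &word);
        if (!carry) break;                  // no carry left to propagate
    }
}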
Anyway, this is rather a waste of time, since the total number of particles in the universe is only about 10^80, and you cannot even run a 64-bit counter to completion in your lifetime.
Neither of your counter versions increments correctly. Instead of counting up to UINT256_MAX, you are actually just counting up to UINT64_MAX 4 times and then starting back at 0 again. This is apparent from the fact that you do not bother to clear any of the indices that have reached the max value until all of them have reached it. If you are measuring performance based on how often the counter reaches all-bits-zero, then this is why. Thus your algorithms do not generate all combinations of 256 bits, which is a stated requirement.
You mention "Generate successive values, such that each new one was never generated before"
To generate a set of such values, look at linear congruential generators:
the sequence x = (x*1 + 1) % power_of_2; you thought about it, these are simply the sequential numbers.
the sequence x = (x*13 + 137) % power_of_2; this generates unique numbers with a predictable full period of power_of_2 (the odd increment 137 and the multiplier 13, which is 1 mod 4, satisfy the Hull-Dobell conditions), and the numbers look more "random", kind of pseudo-random. You need to resort to arbitrary-precision arithmetic to get it working at 256 bits, and also all the trickeries of multiplication by constants. This will get you a nice way to start; a minimal 64-bit sketch follows below.
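For illustration, a sketch of mine using the answer's own constants at 64 bits, where unsigned wraparound supplies the % power_of_2 for free:
#include <cstdint>

// Full-period LCG modulo 2^64: every uint64_t value appears exactly once
// before the sequence repeats (137 is odd and 13 - 1 is a multiple of 4).
uint64_t next(uint64_t x)
{
    return x * 13u + 137u;   // unsigned overflow is the reduction mod 2^64
}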
You also complain that your simple code is "slow".
At a 4.2 GHz clock, running 4 instructions per cycle and using AVX-512 vectorization, on a 64-core computer with a multithreaded version of your program doing nothing else than increments, you get only 64 * 8 * 4 * 2^32 = 8796093022208 increments per second, that is, 2^64 increments reached in 25 days. This post is old; you might have reached 841632698362998292480 by now, running such a program on such a machine, and you will gloriously reach 1683265396725996584960 in 2 years' time.
You also require "until all possible values are generated".
You can only generate a finite number of values, depending on how much you are willing to pay for the energy to power your computers. As mentioned in the other responses, with 128- or 256-bit numbers, even being the richest man in the world, you will never wrap around before the first of these conditions occurs:
getting out of money
end of humankind (nobody will get the outcome of your software)
burning the energy from the last particles of the universe
Multi-word addition can easily be accomplished in a portable fashion by using three macros that mimic three types of addition instructions found on many processors:
ADDcc adds two words, and sets the carry if there was unsigned overflow
ADDC adds two words plus carry (from a previous addition)
ADDCcc adds two words plus carry, and sets the carry if there was unsigned overflow
A multi-word addition of two words uses ADDcc on the least significant words followed by ADDC on the most significant words. A multi-word addition of more than two words forms the sequence ADDcc, ADDCcc, ..., ADDC. The MIPS architecture is a processor architecture without condition codes and therefore without a carry flag. The macro implementations shown below basically follow the techniques used on MIPS processors for multi-word additions.
The ISO-C99 code below shows the operation of a 32-bit counter and a 64-bit counter based on 16-bit "words". I chose arrays as the underlying data structure, but one might also use a struct, for example. Use of a struct will be significantly faster if each operand only comprises a few words, as the overhead of array indexing is eliminated. One would want to use the widest available integer type for each "word" for best performance. In the example from the question that would likely be a 256-bit counter comprising four uint64_t components.
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#define ADDCcc(a,b,cy,t0,t1) \
(t0=(b)+cy, t1=(a), cy=t0<cy, t0=t0+t1, t1=t0<t1, cy=cy+t1, t0=t0)
#define ADDcc(a,b,cy,t0,t1) \
(t0=(b), t1=(a), t0=t0+t1, cy=t0<t1, t0=t0)
#define ADDC(a,b,cy,t0,t1) \
(t0=(b)+cy, t1=(a), t0+t1)
typedef uint16_t T;
/* increment a multi-word counter comprising n words */
void inc_array (T *counter, const T *increment, int n)
{
T cy, t0, t1;
counter [0] = ADDcc (counter [0], increment [0], cy, t0, t1);
for (int i = 1; i < (n - 1); i++) {
counter [i] = ADDCcc (counter [i], increment [i], cy, t0, t1);
}
counter [n-1] = ADDC (counter [n-1], increment [n-1], cy, t0, t1);
}
#define INCREMENT (10)
#define UINT32_ARRAY_LEN (2)
#define UINT64_ARRAY_LEN (4)
int main (void)
{
uint32_t count32 = 0, incr32 = INCREMENT;
T count_arr2 [UINT32_ARRAY_LEN] = {0};
T incr_arr2 [UINT32_ARRAY_LEN] = {INCREMENT};
do {
count32 = count32 + incr32;
inc_array (count_arr2, incr_arr2, UINT32_ARRAY_LEN);
} while (count32 < (0U - INCREMENT - 1));
printf ("count32 = %08x arr_count = %08x\n",
count32, (((uint32_t)count_arr2 [1] << 16) +
((uint32_t)count_arr2 [0] << 0)));
uint64_t count64 = 0, incr64 = INCREMENT;
T count_arr4 [UINT64_ARRAY_LEN] = {0};
T incr_arr4 [UINT64_ARRAY_LEN] = {INCREMENT};
do {
count64 = count64 + incr64;
inc_array (count_arr4, incr_arr4, UINT64_ARRAY_LEN);
} while (count64 < 0xa987654321ULL);
printf ("count64 = %016llx arr_count = %016llx\n",
count64, (((uint64_t)count_arr4 [3] << 48) +
((uint64_t)count_arr4 [2] << 32) +
((uint64_t)count_arr4 [1] << 16) +
((uint64_t)count_arr4 [0] << 0)));
return EXIT_SUCCESS;
}
Compiled with full optimization, the 32-bit example executes in about a second, while the 64-bit example runs for about a minute on a modern PC. The output of the program should look like so:
count32 = fffffffa arr_count = fffffffa
count64 = 000000a987654326 arr_count = 000000a987654326
Non-portable code that is based on inline assembly or proprietary extensions for wide integer types may execute about two to three times as fast as the portable solution presented here.
Consider the problem:
It can be shown that some powers of two, written in decimal, like:
2^9 = 512
2^89 = 618,970,019,642,690,137,449,562,112
end in a string consisting of 1s and 2s. In fact, it can be proven that for every integer R, there exists a power of 2, say 2^K with K > 0, that has a string of only 1s and 2s in its last R digits.
This can be seen clearly in the table below:
R    Smallest K    2^K
1    1             2
2    9             512
3    89            ...112
4    89            ...2112
Using this technique, what then is the sum of all the smallest K values for 1 <= R <= 10?
Proposed sol:
Now this problem ain't that difficult to solve. You can simply do
int temp = power(2, k)
and then if you can get the length of the temp then multiply it with
(100^len)-i or (10^len)-i
// where i would determine how many last digits you want.
Now this temp = power(2, k) grows so fast with increasing k that you can't even store it in the int type, or even in long int....
So what can be done? And is there any other solution, based on bit strings? I guess that might make this problem easy.
Thanks in advance.
No, I doubt there are any solutions based on "strings of bits"; that would be quite inefficient. But there are bignum libraries like GMP, which provide types that are either fixed-size but much bigger than the built-in int types, or of arbitrary size limited only by memory capacity, plus matching sets of math operations, working similarly to software FPU emulation.
Quoting from the reference documentation, with a minor paraphrase:
#include <gmpxx.h>
#include <iostream>
using namespace std;

int
main (void)
{
mpz_class a, b, c;
a = 1234;
b = "-5676739826856836954375492356569366529629568926519085610160816539856926459237598";
c = a+b;
cout << "sum is " << c << "\n";
cout << "absolute value is " << abs(c) << "\n";
return 0;
}
Thanks to C++ operator overloading, it is much easier to use than the ANSI C version.
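For comparison, a rough sketch of the same computation through the C interface, put together from GMP's documented mpz_* functions (my reconstruction, not a quote from the manual):
#include <gmp.h>
#include <stdio.h>

int main (void)
{
    mpz_t a, b, c;
    mpz_init_set_ui (a, 1234);
    mpz_init_set_str (b, "-5676739826856836954375492356569366529629568926519085610160816539856926459237598", 10);
    mpz_init (c);
    mpz_add (c, a, b);                       /* c = a + b */
    gmp_printf ("sum is %Zd\n", c);
    mpz_abs (a, c);                          /* reuse a for |c| */
    gmp_printf ("absolute value is %Zd\n", a);
    mpz_clears (a, b, c, NULL);
    return 0;
}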
Since you are only interested in the n least significant digits of your result, you could try to devise an algorithm that only calculates those. Based on the standard algorithm for written multiplication, you can see that the n least significant digits of a product are entirely determined by the n least significant digits of the multiplicands. Based on this, it should be possible to create an algorithm that calculates only as many digits of 2^K as fit into a long int.
The only problem you might run into is that there may be numbers that end in a matching sequence longer than a long int can hold. In that case you can still resort to calculating additional digits using your own algorithm or a library.
Note that this is basically the same thing that big-number libraries do, only your approach might be more efficient, because you avoid calculating digits that you are unlikely to need.
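A minimal sketch of this idea (mine, not the answerer's code): work modulo 10^R so only the last R digits are ever computed, and scan for the smallest K whose tail is all 1s and 2s. For R <= 10 everything fits comfortably in a uint64_t:
#include <cstdint>
#include <cstdio>

// Smallest K such that the last R digits of 2^K consist only of 1s and 2s.
// Works modulo 10^R, so only the digits we care about are ever computed.
// Assumes R <= 10 so that (tail * 2) cannot overflow a uint64_t.
uint64_t smallest_k(int R)
{
    uint64_t mod = 1;
    for (int i = 0; i < R; i++)
        mod *= 10;                          // mod = 10^R

    uint64_t tail = 1;                      // 2^0 mod 10^R
    for (uint64_t k = 1; ; k++) {           // the problem guarantees a solution exists
        tail = tail * 2 % mod;              // last R digits of 2^k
        uint64_t t = tail;
        bool ok = true;
        for (int i = 0; i < R; i++) {       // every one of the R digits must be 1 or 2
            unsigned d = t % 10;
            if (d != 1 && d != 2) { ok = false; break; }
            t /= 10;
        }
        if (ok)
            return k;
    }
}

int main(void)
{
    uint64_t sum = 0;
    for (int R = 1; R <= 10; R++)
        sum += smallest_k(R);               // e.g. R=2 contributes K=9 (2^9 = 512)
    printf("%llu\n", (unsigned long long)sum);
    return 0;
}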
Try GMP, http://gmplib.org/
It can store a number of any size, as long as it fits in memory.
Although you might be better off with a less brute-force approach.
You can store binary strings in std::bitset or in std::vector<bool>.
www.cplusplus.com/reference/bitset/bitset/
I think bitset is your choice.
Using big arithmetic for operations on powers of 2 is not, though.