How to subtract two big numbers - C++

I am trying to subtract two very large ints / bignums, but I have run into an issue. My code works for subtractions like 123 - 94 and 5 - 29, but I can't seem to get around the edge cases. For example, 13 - 15 should result in -2, but if I do num1 - num2 - borrow + 10 on the first digit I get 8 and borrow becomes 1. Moving on to the last digit I end up with 1 - 1 - borrow(=1), which leaves me with -1, therefore my end result is -18 instead of -2.
Here is my code for the subtraction:
//Infint is the class for the very large number
Infint Infint::sub(Infint other)
{
    string result;
    Infint i1 = *this;
    Infint i2 = other;
    if (int(i1._numberstr.length() - i2._numberstr.length()) < 0)
    {
        Infint(result) = i2 - i1;
        result._numberstr.insert(result._numberstr.begin(), '-');
        return result;
    }
    else if (i1._numberstr.length() - i2._numberstr.length() > 0)
    {
        int diff = i1._numberstr.length() - i2._numberstr.length();
        for (int i = diff; i > 0; --i)
        {
            i2._numberstr.insert(i2._numberstr.begin(), '0');
        }
    }
    int borrow = 0;
    int i = i2._numberstr.length() - 1;
    for (; i >= 0; --i)
    {
        int sub = (i1._numberstr[i] - '0') - (i2._numberstr[i] - '0') - borrow;
        if (sub < 0)
        {
            sub += 10;
            borrow = 1;
        }
        else
            borrow = 0;
        result.insert(0, to_string(sub));
    }
    while (i > 0)
    {
        result.insert(result.begin(), i1._numberstr[i1._numberstr.length() - i]);
        --i;
    }
    int j = 0;
    while (result[j] == '0')
        j++;
    result.erase(0, j);
    if (borrow == 1)
        result.insert(result.begin(), '-');
    return Infint(result);
}
Would you kindly help me understand the errors or mistakes in logic I have made?

Since you got 8 at the 1s position and -1 at the 10s position, the sum of these two is -10 + 8 = -2, the correct answer (instead of -10 - 8 = -18, which is wrong).
EDIT: To systematically derive the correct answer, if you find the highest-digit difference to be negative, distribute the minus sign to all digits. Suppose the per-digit differences of two n-digit values are
a_(n-1), ..., a_0
with a_j being the difference at the 10^j digit, and you find that a_(n-1) < 0. Then the total difference of the two numbers can be calculated as
-1 * (-a_(n-1) * 10^(n-1) + ... + (-a_0))
It should be fairly straightforward to derive the correct (negative) answer by going through the sum from the 10^(n-1) digit down to the 1s digit.
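For completeness, here is one way to code that idea without per-digit sign juggling: compare magnitudes first, always subtract the smaller magnitude from the larger, and prepend the sign at the end. This is only a sketch using plain std::string digit strings; the helper names lessThan, subMagnitude and subtract are made up here and are not part of the Infint class:

#include <iostream>
#include <string>

// Returns true if the number in a is smaller than the number in b
// (both are plain digit strings without signs or leading zeros).
static bool lessThan(const std::string& a, const std::string& b)
{
    if (a.length() != b.length())
        return a.length() < b.length();
    return a < b;              // same length: lexicographic == numeric order
}

// Subtract b from a assuming a >= b; classic schoolbook borrow.
static std::string subMagnitude(const std::string& a, std::string b)
{
    b.insert(b.begin(), a.length() - b.length(), '0');   // pad to same length
    std::string result(a.length(), '0');
    int borrow = 0;
    for (int i = (int)a.length() - 1; i >= 0; --i)
    {
        int d = (a[i] - '0') - (b[i] - '0') - borrow;
        if (d < 0) { d += 10; borrow = 1; } else { borrow = 0; }
        result[i] = char('0' + d);
    }
    size_t first = result.find_first_not_of('0');         // strip leading zeros
    return first == std::string::npos ? "0" : result.substr(first);
}

std::string subtract(const std::string& a, const std::string& b)
{
    if (lessThan(a, b))                    // result would be negative:
        return "-" + subMagnitude(b, a);   // compute |b| - |a| and negate
    return subMagnitude(a, b);
}

int main()
{
    std::cout << subtract("13", "15") << '\n';   // -2
    std::cout << subtract("123", "94") << '\n';  // 29
}

Applied to the asker's example, subtract("13", "15") computes 15 - 13 = 2 and negates it, giving -2.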

Related

Optimization. How to speed up the given C++ code?

This is my code. Constraints: 1 <= i <= j <= n, j - i <= a, 1 <= n <= 1000000, 0 <= a <= 1000000.
#include <iostream>
using namespace std;
int main(){
    int n, a, r = 0;
    cin >> n >> a;
    for(int i = 1; i <= n; i++){
        int j = i;
        for(j; j <= n; j++){
            if(j - i <= a){
                r++;
            }
        }
    }
    cout << r;
}
Instead of loops, I changed it to a simple check of variables, which greatly accelerated the code. There is no need to calculate thousands of options.
My final, optimized code is:
#include <iostream>
using namespace std;
int main(){
    unsigned long long n, a, r = 0;
    cin >> n >> a;
    if(a == 0){
        r = n;
    }
    if(n <= a){
        r = (n*(n+1))/2;
    }
    if(n > a){
        r += (n-a)*(a+1) + (a*(a+1))/2;
    }
    cout << r;
}
After accounting for positive numbers, negative numbers, and zeros, your double-nested for-loop can be simplified into this:
if (n < 1)
{
    r = 0;
}
else if (a == 0)
{
    r = n;
}
else if (a < 0)
{
    r = 0;
}
else if (n <= a)
{
    r = (n * (n + 1)) / 2;
}
else
{
    r = (n - a) * (a + 1) + (a * (a + 1)) / 2;
}
Recall that summing a sequence of integers from 1..N is:
N*(N+1)
-------
2
If n <= a (both positive), r is incremented n times in the inner loop on the first iteration of the outer loop, then n-1 times, then n-2 times... all the way down to 1.
For cases where n > a, there are n-a summations of a+1, followed by a decreasing summation from a down to 1.
This strikes me as something to speed up by doing a bit of math, not by massaging the code.
Basically, we can think of the loops as defining a square matrix of the values of i and j. So let's assume n = 9 and a = 3. I'll draw a + for each place we increment r, a blank for the values we don't generate, and a 0 for the places we generate values but don't increment r.
i\j  1 2 3 4 5 6 7 8 9
 1   + + + + 0 0 0 0 0
 2     + + + + 0 0 0 0
 3       + + + + 0 0 0
 4         + + + + 0 0
 5           + + + + 0
 6             + + + +
 7               + + +
 8                 + +
 9                   +
So, ignoring the last a rows (i.e., for the first n-a rows), in each row we have a band a+1 elements wide where we do an increment. Then at the end, we have a triangle where we're basically summing a + (a-1) + (a-2) + ... + 1.
So, the first piece is (a+1) * (n-a) and the second piece is a * (a+1) / 2. Add those together, and we get the final answer.
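As a sanity check, the closed form can be compared against the brute-force double loop for small inputs; for n = 9 and a = 3 both give 30, which matches the 24 + 6 cells marked + in the matrix above. A minimal test sketch:

#include <iostream>

int main()
{
    for (unsigned long long n = 1; n <= 50; ++n)
    {
        for (unsigned long long a = 0; a <= 50; ++a)
        {
            unsigned long long brute = 0;              // the original double loop
            for (unsigned long long i = 1; i <= n; ++i)
                for (unsigned long long j = i; j <= n; ++j)
                    if (j - i <= a) ++brute;

            unsigned long long formula =               // band + triangle
                (n <= a) ? n * (n + 1) / 2
                         : (n - a) * (a + 1) + a * (a + 1) / 2;

            if (brute != formula)
                std::cout << "mismatch at n=" << n << " a=" << a << '\n';
        }
    }
    std::cout << "done\n";
}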
Seems like
for(j; j <= n; j++){
    if(j - i <= a){
        r++;
    }
}
could be replaced by
r += f(i,n,a);
where f() is some simple expression involving those three values, probably including the equivalent of min(..., ...).
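One candidate for such an f, counting the j values in [i, min(n, i+a)] directly (a sketch, assuming 1 <= i <= n and a >= 0):

#include <algorithm>

// Number of j with i <= j <= n and j - i <= a,
// i.e. the size of the range [i, min(n, i + a)].
long long f(long long i, long long n, long long a)
{
    return std::min(n, i + a) - i + 1;
}

Summing f(i, n, a) over i = 1..n gives the same (n-a)*(a+1) + a*(a+1)/2 result derived above.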
If you want to speed up your code, instead of just tuning your algorithm, you can also try to use a parallel API.
Parallel computing APIs such as OpenMP enable you to take advantage of your CPU resources.
If you use OpenMP, you can try to use it to parallelize your loop.
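For example, the original double loop could be annotated with a reduction clause so that each thread accumulates its own partial count (a sketch only; compile with -fopenmp):

#include <iostream>

int main()
{
    long long n, a, r = 0;
    std::cin >> n >> a;
    // Each thread gets a chunk of the outer loop and its own copy of r;
    // the per-thread copies are summed at the end of the parallel region.
    #pragma omp parallel for reduction(+:r)
    for (long long i = 1; i <= n; i++)
        for (long long j = i; j <= n; j++)
            if (j - i <= a)
                r++;
    std::cout << r << '\n';
}

Note that this still does O(n*a) work in total; the pragma only spreads it across cores, so the closed-form answers above remain the better fix.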

How to find the difference of all consecutive sub-sequences?

I need an efficient algorithm that can find the sum of the differences of all consecutive sub-sequences, but I don't know how to do it.
For example, all consecutive sub-sequences for 12345:
12 (Dif = 1)
23 (Dif = 1)
34 (Dif = 1)
45 (Dif = 1)
123 (Dif = 2)
234 (Dif = 2)
345 (Dif = 2)
1234 (Dif = 3)
2345 (Dif = 3)
12345 (Dif = 4)
Sum of the difference = 20
Count of sequence elements: at least 2, at most 300000.
Each element: at least 1, at most 10^7.
Time limit: 1 s.
I wrote the code, but it's too slow:
#include <bits/stdc++.h>
using namespace std;

int main() {
    cin.tie(0);
    iostream::sync_with_stdio(false);
    int count;
    cin >> count;
    int elem;
    vector<int> vec;
    int sum = 0;
    for (int i = 0; i < count; i++) {
        cin >> elem;
        if (vec.size() > 0) {
            sum += abs(vec.back() - elem);
        }
        vec.push_back(elem);
        if (vec.size() > 2) {
            sum += abs(*max_element(vec.begin(), vec.end()) - *min_element(vec.begin(), vec.end()));
        }
        for (int z = 3; z < count; z++) {
            if (vec.size() > z) {
                sum += abs(*max_element(vec.begin() + i - z + 1, vec.end()) - *min_element(vec.begin() + i - z + 1, vec.end()));
            }
        }
    }
    cout << sum;
    return 0;
}
I found that the count of sub-sequences can be found by the triangle numbers formula (where n is the length of the sequence):
count = 1/2 * n * (n - 1);
For n = 300000, the count of sub-sequences is about 45 billion.
How can I do it faster? I need an algorithm.
My first thought was to build a tree in order to remember sub-answers (i.e. dynamic programming) and just combine the answers together. However, each higher branch isn't strictly speaking the sum of the nodes below it. Consider for example:
I noticed, however, that the nodes were predictable. Namely:
And when expanded to 6 consecutive nodes:
Which, expressed as a summation, is
SUM( i * (n - i) ) for i in [1 .. n), where n >= 2
This of course runs in O(N) time and doesn't require anything other than add + multiply.
However, it bothered me that perhaps this summation formula could be reduced to a simple equation. So I looked up the properties of summation formulas and worked through to a simple equation:
Which means that (n^3 - n) / 6 should execute in O(1) time. I tested it for the first 6 values of n and it gave the right answers...
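A small sketch that checks the summation form against the closed form; both give 20 for n = 5, matching the 12345 example from the question (keeping in mind that, as derived here, this covers the sequence 1..n):

#include <iostream>

int main()
{
    for (long long n = 2; n <= 1000; ++n)
    {
        long long sum = 0;
        for (long long i = 1; i < n; ++i)         // SUM( i * (n - i) ), i in [1 .. n)
            sum += i * (n - i);
        long long closed = (n * n * n - n) / 6;   // (n^3 - n) / 6
        if (sum != closed)
            std::cout << "mismatch at n = " << n << '\n';
    }
    std::cout << "checked; n = 5 gives " << (5 * 5 * 5 - 5) / 6 << '\n';   // 20
}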

add 1 to number represented by array of digits

The question goes as follows:
Given a non-negative number represented as an array of digits,
add 1 to the number (increment the number represented by the digits).
The digits are stored such that the most significant digit is at the head of the list.
Solution:
class Solution {
public:
    vector<int> plusOne(vector<int> &digits) {
        reverse(digits.begin(), digits.end());
        vector<int> ans;
        int carry = 1;
        for (int i = 0; i < digits.size(); i++) {
            int sum = digits[i] + carry;
            ans.push_back(sum % 10);
            carry = sum / 10;
        }
        while (carry) {
            ans.push_back(carry % 10);
            carry /= 10;
        }
        while (ans[ans.size() - 1] == 0 && ans.size() > 1) {
            ans.pop_back();
        }
        reverse(ans.begin(), ans.end());
        reverse(digits.begin(), digits.end());
        return ans;
    }
};
This is the solution I encountered while solving on a portal.
I cannot understand this:
while (ans[ans.size() - 1] == 0 && ans.size() > 1) {
    ans.pop_back();
}
Why do we need this while loop? I tried evaluating the code by hand for the example 9999 and I couldn't understand the logic behind popping the integers from the end!
Please help.
The logic
while (ans[ans.size() - 1] == 0 && ans.size() > 1) {
    ans.pop_back();
}
removes any 0's at the end of ans after incrementing the value by 1 (these would become leading zeros once ans is reversed).
The check is effectively dead code, though: ans (which holds the digits least-significant first) can only end in 0 if the most significant digit of the result is 0, and that can only happen if the input itself had leading zeros.
For the example the author may have had in mind: 9999 produces ans = 0,0,0,0,1 after the carry loop; its last element is 1, so nothing is popped, and reversing gives 10000. As long as the input has no leading zeros this scenario never occurs, so the loop could be removed.
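For reference, a small driver (a hypothetical main, not part of the original post) that runs the 9999 case:

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// (class Solution as posted above goes here)

int main()
{
    vector<int> digits = {9, 9, 9, 9};
    vector<int> ans = Solution().plusOne(digits);
    // Inside plusOne, ans is built least-significant digit first as {0,0,0,0,1};
    // the pop_back loop never fires because the last element (the future leading
    // digit) is 1, and the final reverse turns it into {1,0,0,0,0}.
    for (int d : ans) cout << d;
    cout << '\n';          // prints 10000
}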

C++ - Code Optimization

I have a problem:
You are given a sequence, in the form of a string with characters ‘0’, ‘1’, and ‘?’ only. Suppose there are k ‘?’s. Then there are 2^k ways to replace each ‘?’ by a ‘0’ or a ‘1’, giving 2^k different 0-1 sequences (0-1 sequences are sequences with only zeroes and ones).
For each 0-1 sequence, define its number of inversions as the minimum number of adjacent swaps required to sort the sequence in non-decreasing order. In this problem, the sequence is sorted in non-decreasing order precisely when all the zeroes occur before all the ones. For example, the sequence 11010 has 5 inversions. We can sort it by the following moves: 11010 → 11001 → 10101 → 01101 → 01011 → 00111.
Find the sum of the number of inversions of the 2^k sequences, modulo 1000000007 (10^9+7).
For example:
Input: ??01
-> Output: 5
Input: ?0?
-> Output: 3
Here's my code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <math.h>
using namespace std;
void ProcessSequences(char *input)
{
    int c = 0;
    /* Count the number of '?' in input sequence
     * 1??0 -> 2
     */
    for(int i = 0; i < strlen(input); i++)
    {
        if(*(input+i) == '?')
        {
            c++;
        }
    }
    /* Get all possible combination of '?'
     * 1??0
     * -> ??
     * -> 00, 01, 10, 11
     */
    int seqLength = pow(2, c);
    // Initialize 2D array of integer
    int **sequencelist, **allSequences;
    sequencelist = new int*[seqLength];
    allSequences = new int*[seqLength];
    for(int i = 0; i < seqLength; i++){
        sequencelist[i] = new int[c];
        allSequences[i] = new int[500000];
    }
    //end initialize
    for(int count = 0; count < seqLength; count++)
    {
        int n = 0;
        for(int offset = c-1; offset >= 0; offset--)
        {
            sequencelist[count][n] = ((count & (1 << offset)) >> offset);
            // cout << sequencelist[count][n];
            n++;
        }
        // cout << std::endl;
    }
    /* Change '?' in former sequence into all possible bits
     * 1??0
     * ?? -> 00, 01, 10, 11
     * -> 1000, 1010, 1100, 1110
     */
    for(int d = 0; d < seqLength; d++)
    {
        int seqCount = 0;
        for(int e = 0; e < strlen(input); e++)
        {
            if(*(input+e) == '1')
            {
                allSequences[d][e] = 1;
            }
            else if(*(input+e) == '0')
            {
                allSequences[d][e] = 0;
            }
            else
            {
                allSequences[d][e] = sequencelist[d][seqCount];
                seqCount++;
            }
        }
    }
    /*
     * Sort each sequences to increasing mode
     *
     */
    // cout<<endl;
    int totalNum[seqLength];
    for(int i = 0; i < seqLength; i++){
        int num = 0;
        for(int j = 0; j < strlen(input); j++){
            if(j == strlen(input)-1){
                break;
            }
            if(allSequences[i][j] > allSequences[i][j+1]){
                int temp = allSequences[i][j];
                allSequences[i][j] = allSequences[i][j+1];
                allSequences[i][j+1] = temp;
                num++;
                j = -1;
            }//endif
        }//endfor
        totalNum[i] = num;
    }//endfor
    /*
     * Sum of all Num of Inversions
     */
    int sum = 0;
    for(int i = 0; i < seqLength; i++){
        sum = sum + totalNum[i];
    }
    // cout<<"Output: "<<endl;
    int out = sum % 1000000007;
    cout << out << endl;
} //end of ProcessSequences method

int main()
{
    // Get Input
    char seq[500000];
    // cout << "Input: "<<endl;
    cin >> seq;
    char *p = &seq[0];
    ProcessSequences(p);
    return 0;
}
The results were right for small inputs, but for bigger inputs I exceeded the CPU time limit of 1 second. I also exceeded the memory limit. How can I make it faster and use memory optimally? What algorithm should I use, and what better data structure should I use? Thank you.
Dynamic programming is the way to go. Imagine you are adding the last character to all sequences.
If it is 1, then you get XXXXXX1. The number of swaps is obviously the same as it was for every sequence so far.
If it is 0, then you need to know the number of ones already in every sequence. The number of swaps increases by the number of ones, for every sequence.
If it is ?, you just add the two previous cases together.
You need to calculate how many sequences there are, for every length and for every number of ones (the number of ones in a sequence cannot be greater than the length of the sequence, naturally). You start with length 1, which is trivial, and continue with longer lengths. You can get really big numbers, so you should calculate modulo 1000000007 all the time. The program below is not in C++, but should be easy to rewrite (the array should be initialized to 0, int is 32-bit, long is 64-bit).
long Mod(long x)
{
    return x % 1000000007;
}

long Calc(string s)
{
    int len = s.Length;
    long[,] nums = new long[len + 1, len + 1];
    long sum = 0;
    nums[0, 0] = 1;
    for (int i = 0; i < len; ++i)
    {
        if (s[i] == '?')
        {
            sum = Mod(sum * 2);
        }
        for (int j = 0; j <= i; ++j)
        {
            if (s[i] == '0' || s[i] == '?')
            {
                nums[i + 1, j] = Mod(nums[i + 1, j] + nums[i, j]);
                sum = Mod(sum + j * nums[i, j]);
            }
            if (s[i] == '1' || s[i] == '?')
            {
                nums[i + 1, j + 1] = nums[i, j];
            }
        }
    }
    return sum;
}
Optimization
The code above is written to be as clear as possible and to show the dynamic programming approach. You do not actually need an array of size [len+1, len+1]. You calculate column i+1 from column i and never go back, so two columns are enough - old and new. If you dig into it more, you find out that row j of the new column depends only on rows j and j-1 of the old column. So you can get by with one column if you update the values in the right direction (and do not overwrite values you still need).
The code above uses 64-bit integers. You really need that only in j * nums[i, j]. The nums array contains numbers less than 1000000007, and a 32-bit integer is enough. Even 2*1000000007 can fit into a 32-bit signed int; we can make use of that.
We can optimize the code by nesting the loop inside the conditions instead of putting the conditions inside the loop. Maybe it is even the more natural approach; the only downside is repeating the code.
The % operator is, like every division, quite expensive. j * nums[i, j] is typically far smaller than the capacity of a 64-bit integer, so we do not have to take the modulo in every step. Just watch the actual value and apply it when needed. The Mod(nums[i + 1, j] + nums[i, j]) can also be optimized, as nums[i + 1, j] + nums[i, j] is always smaller than 2*1000000007.
And finally the optimized code. I switched to C++; since what int and long mean can differ between languages and platforms, I made the types explicit:
#include <limits>
#include <string>
#include <vector>
using namespace std;

long CalcOpt(string s)
{
    long len = s.length();
    vector<long> nums(len + 1);
    long long sum = 0;
    nums[0] = 1;
    const long mod = 1000000007;
    for (long i = 0; i < len; ++i)
    {
        if (s[i] == '1')
        {
            for (long j = i + 1; j > 0; --j)
            {
                nums[j] = nums[j - 1];
            }
            nums[0] = 0;
        }
        else if (s[i] == '0')
        {
            for (long j = 1; j <= i; ++j)
            {
                sum += (long long)j * nums[j];
                if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
            }
        }
        else
        {
            sum *= 2;
            if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
            for (long j = i + 1; j > 0; --j)
            {
                sum += (long long)j * nums[j];
                if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
                long add = nums[j] + nums[j - 1];
                if (add >= mod) { add -= mod; }
                nums[j] = add;
            }
        }
    }
    return (long)(sum % mod);
}
Simplification
Time limit still exceeded? There is probably a better way to do it. You can either
go back to the beginning and find a mathematically different way to calculate the result,
or simplify the actual solution using math.
I went the second way. What we are doing in the loop is in fact a convolution of two sequences, for example:
0, 0, 0, 1, 4, 6, 4, 1, 0, 0,... and 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,...
0*0 + 0*1 + 0*2 + 1*3 + 4*4 + 6*5 + 4*6 + 1*7 + 0*8 + ... = 80
The first sequence is symmetric and the second is linear. In this case, the sum of the convolution can be calculated from the sum of the first sequence, which is 16 (numSum), and the number from the second sequence corresponding to the center of the first sequence, which is 5 (numMult). numSum*numMult = 16*5 = 80. We can replace the whole loop with one multiplication if we are able to update those two numbers in each step, which fortunately seems to be the case.
If s[i] == '0', then numSum does not change and numMult does not change.
If s[i] == '1', then numSum does not change; only numMult increments by 1, as we shift the whole sequence by one position.
If s[i] == '?', we add the original and the shifted sequence together. numSum is multiplied by 2 and numMult increments by 0.5.
The 0.5 is a bit of a problem, as it is not a whole number. But we know that the result will be a whole number. Fortunately, in modular arithmetic the inverse of two (= 1/2) exists in this case as a whole number: it is h = (mod+1)/2. As a reminder, the inverse of 2 is a number h such that h*2 = 1 modulo mod. Implementation-wise it is easier to multiply numMult by 2 and divide numSum by 2, but that is just a detail; we would need the 0.5 anyway. The code:
#include <limits>
#include <string>
using namespace std;

long CalcOptSimpl(string s)
{
    long len = s.length();
    long long sum = 0;
    const long mod = 1000000007;
    long numSum = (mod + 1) / 2;
    long long numMult = 0;
    for (long i = 0; i < len; ++i)
    {
        if (s[i] == '1')
        {
            numMult += 2;
        }
        else if (s[i] == '0')
        {
            sum += numSum * numMult;
            if (sum > std::numeric_limits<long long>::max() / 4) { sum %= mod; }
        }
        else
        {
            sum = sum * 2 + numSum * numMult;
            if (sum > std::numeric_limits<long long>::max() / 4) { sum %= mod; }
            numSum = (numSum * 2) % mod;
            numMult++;
        }
    }
    return (long)(sum % mod);
}
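A quick check against the two examples from the question, using a hypothetical driver around the function above:

#include <iostream>
#include <string>
using namespace std;

// (CalcOptSimpl as defined above)

int main()
{
    cout << CalcOptSimpl("??01") << '\n';   // 5
    cout << CalcOptSimpl("?0?") << '\n';    // 3
}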
I am pretty sure there exists some simpler way to get to this code, yet I am still unable to see it. But sometimes the path is the goal :-)
If a sequence has N zeros with indexes zero[0], zero[1], ..., zero[N - 1], the number of inversions for it is (zero[0] + zero[1] + ... + zero[N - 1]) - (N - 1) * N / 2 (you should be able to prove it).
For example, 11010 has two zeros with indexes 2 and 4, so its number of inversions is 2 + 4 - 1 * 2 / 2 = 5.
For all 2^k sequences, you can calculate the sums of the two parts separately and then combine them.
1) The first part is zero[0] + zero[1] + ... + zero[N - 1]. Each 0 in the given sequence contributes index * 2^k and each ? contributes index * 2^(k-1).
2) The second part is the sum of (N - 1) * N / 2 over all sequences. You can calculate this using dynamic programming (maybe you should google and learn this first). In short, use f[i][j] to represent the number of sequences with j zeros among the first i characters of the given sequence; a rough sketch combining both parts is shown below.
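Below is a rough sketch of this two-part approach, with one substitution: instead of the f[i][j] table, the second part uses the fact that a completion turning b of the k '?' into zeros can be chosen in C(k, b) ways, so the number of zeros follows a shifted binomial distribution. The modpow helper and factorial tables are introduced here for the sketch. It reproduces the examples from the question ("??01" gives 5, "?0?" gives 3):

#include <iostream>
#include <string>
#include <vector>
using namespace std;

const long long MOD = 1000000007;

long long modpow(long long b, long long e)
{
    long long r = 1;
    b %= MOD;
    while (e > 0)
    {
        if (e & 1) r = r * b % MOD;
        b = b * b % MOD;
        e >>= 1;
    }
    return r;
}

int main()
{
    string s;
    cin >> s;                                    // e.g. "??01" -> 5, "?0?" -> 3
    long long n = s.size(), k = 0, zeros = 0;
    for (char c : s) { if (c == '?') ++k; else if (c == '0') ++zeros; }

    // Part 1: sum of zero indexes over all 2^k completions.  A fixed '0' at
    // index i is a zero in all 2^k sequences; a '?' at index i is a zero in
    // half of them, i.e. 2^(k-1) sequences.
    long long all = modpow(2, k);
    long long half = (k > 0) ? modpow(2, k - 1) : 0;
    long long part1 = 0;
    for (long long i = 0; i < n; ++i)
    {
        if (s[i] == '0')      part1 = (part1 + i * all) % MOD;
        else if (s[i] == '?') part1 = (part1 + i * half) % MOD;
    }

    // Part 2: sum of N*(N-1)/2 over all completions, where N is the number of
    // zeros.  A completion turning b of the k '?' into zeros has N = zeros + b,
    // and there are C(k, b) such completions.
    vector<long long> fact(k + 1, 1), inv(k + 1, 1);
    for (long long i = 1; i <= k; ++i) fact[i] = fact[i - 1] * i % MOD;
    inv[k] = modpow(fact[k], MOD - 2);           // Fermat's little theorem
    for (long long i = k; i >= 1; --i) inv[i - 1] = inv[i] * i % MOD;

    long long part2 = 0;
    for (long long b = 0; b <= k; ++b)
    {
        long long comb = fact[k] * inv[b] % MOD * inv[k - b] % MOD;   // C(k, b)
        long long N = zeros + b;
        long long tri = N * (N - 1) / 2 % MOD;   // exact integer, then reduced
        part2 = (part2 + comb * tri) % MOD;
    }

    cout << ((part1 - part2) % MOD + MOD) % MOD << endl;
}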

Large numbers multiplication in C++

I am looking for a fast large-number multiplication algorithm in C++.
I have tried something like this but I think I am creating too many string objects.
string sum(string v1, string v2)
{
    string r;
    int temp = 0, i, n, m;
    int size1 = v1.size(), size2 = v2.size();
    n = min(size1, size2);
    m = max(size1, size2);
    if ((v1 == "0" || v1 == "") && (v2 == "0" || v2 == "")) return "0";
    r.resize(m, '0');
    for (i = 0; i < n; i++)
    {
        temp += v1[size1 - 1 - i] + v2[size2 - 1 - i] - 96;
        r[m - 1 - i] = temp % 10 + 48;
        temp /= 10;
    }
    while (i < size1)
    {
        temp += v1[size1 - 1 - i] - 48;
        r[m - 1 - i] = (char)(temp % 10 + 48);
        temp /= 10;
        ++i;
    }
    while (i < size2)
    {
        temp += v2[size2 - 1 - i] - 48;
        r[m - 1 - i] = (char)(temp % 10 + 48);
        temp /= 10;
        ++i;
    }
    if (temp != 0)
        r = (char)(temp + 48) + r;
    return r;
}

string multSmall(string v1, int m)
{
    string ret = "0";
    while (m)
    {
        if (m & 1) ret = sum(ret, v1);
        m >>= 1;
        if (m) v1 = sum(v1, v1);
    }
    return ret;
}

string multAll(string v1, string v2)
{
    string ret = "0", z = "", pom;
    int i, size;
    if (v1.size() < v2.size())
        std::swap(v1, v2);
    size = v2.size();
    for (i = 0; i < size; i++)
    {
        pom = multSmall(v1, v2[size - 1 - i] - 48);
        pom.append(z);
        ret = sum(ret, pom);
        z.resize(i + 1, '0');
    }
    return ret;
}
I DON'T want to use any external libraries. How should I do it? Maybe I should use char arrays instead of strings? But I am not sure whether reallocating memory for an array would be faster than creating string objects.
Fast large number multiplication is a big project. A very big project depending upon just how large of numbers you want to multiply.
Probably the simplest important thing, however, is that you want to get as much mileage out of your CPU's native instructions as possible. Addition of 64-bit numbers is 8 times more powerful than addition of 8-bit numbers, and over 19 times more powerful than addition of decimal digits. But your computer can probably add 64-bit numbers just as quickly as it can add 8-bit numbers, and a lot faster than any code you write to do addition of decimal digits.
Multiplication is much more dramatic; an instruction that multiplies two 64-bit numbers to produce a 128-bit result is doing around 64 times more work than an instruction that multiplies two 8-bit numbers to produce a 16-bit result -- but your CPU can probably do them at the same speed, or maybe the latter is twice as fast as the former.
So, you really want to orient your data structures and base case algorithms around the idea of using these more powerful instructions as much as you can.
If you need to, you can think of it as doing arithmetic in base 2^64. (or maybe base 2^32, if you can't or don't want to use 64-bit arithmetic)
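A minimal sketch of that idea: store a number as base-2^32 limbs (least significant first) and let the hardware do the 32x32 -> 64 bit products, schoolbook style. BigNum and multiply are illustrative names chosen here, and conversion to and from decimal strings is left out:

#include <cstdint>
#include <iostream>
#include <vector>

// A number is stored as base-2^32 "limbs", least significant limb first.
using BigNum = std::vector<uint32_t>;

BigNum multiply(const BigNum& a, const BigNum& b)
{
    BigNum result(a.size() + b.size(), 0);
    for (size_t i = 0; i < a.size(); ++i)
    {
        uint64_t carry = 0;
        for (size_t j = 0; j < b.size(); ++j)
        {
            // One hardware multiply produces the full 64-bit product.
            uint64_t cur = result[i + j] + (uint64_t)a[i] * b[j] + carry;
            result[i + j] = (uint32_t)cur;   // keep the low 32 bits here
            carry = cur >> 32;               // high 32 bits ripple upward
        }
        result[i + b.size()] = (uint32_t)carry;
    }
    while (result.size() > 1 && result.back() == 0)
        result.pop_back();                   // trim leading zero limbs
    return result;
}

int main()
{
    BigNum a = {0xFFFFFFFFu};                // 2^32 - 1
    BigNum b = {0xFFFFFFFFu};
    BigNum p = multiply(a, b);               // (2^32 - 1)^2 = 0xFFFFFFFE00000001
    std::cout << std::hex << p[1] << ' ' << p[0] << '\n';
}

Each inner-loop step here replaces roughly ten digit-by-digit steps of the string version, which is exactly the kind of mileage from native instructions described above.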