BigInt Multiplication w/ int arrays - c++

I'm making a BigInt class in C++ as an exercise. I'm currently working on the multiplication functionality. My BigInt's are represented as a fixed length (that is very big) int[], with each entry being a digit of the number entered.
So, BigInt = 324, will result in [0,0,0,..,3,2,4].
I'm currently trying to multiply using this code:
// multiplication
BigInt BigInt::operator*(BigInt const& other) const {
BigInt a = *this;
BigInt b = other;
cout << a << b << endl;
BigInt product = 0;
for(int i = 0; i < arraySize; i++){
int carry = 0;
for(int j = 0; j < arraySize; j++){
product.digits[arraySize - (j + i)] += (carry + (a.digits[j] * b.digits[i]));
carry = (product.digits[arraySize - (j + i)] / 10);
product.digits[arraySize - (j + i)] = (product.digits[arraySize - (j + i)] % 10);
}
product.digits[arraySize - i] += carry;
}
return product;
}
My answer keeps returning 0. For example, 2 * 2 = 0.

It is not sure that this will fix your program, but you have Undefined Behavior because of this:
product.digits[arraySize - (j + i)]
This index arraySize - (j + i) becomes negative when i + j > arraySize, which will obviously occur in your loop.
Basically, when multiplying two numbers with n digits, the result may be as wide as 2n digits. Since you encode all your numbers on fixed length arraySize, you have to take measures to avoid out of bound.
A simple test if(i+j) <= arraySize could do, or by changing the second loop:
for(int j = 0; j < arraySize - i; j++)
Alternatively, it would be better to use std::vector as the internal representation of your BigInt. It can be sized dynamically to fit your result beforehand.
It is not completely sure that this will fix completely your code, but it has to be fixed, before proceeding with the debugging. It will be easier after removing the UB. Here I approve #Dúthomhas's note that your indexing through the arrays seems obviously irregular... You go from right to left with the result, while from left to right with the inputs...

Related

How to multiply strings without overflow by c++?

I tried to solve Multiply Strings by c++ by this approach, but I cannot avoid integer overflow by change type from int to long long int or double. Python won't overflow, so my code works like below.
Given two non-negative integers num1 and num2 represented as strings, return the product of num1 and num2, also represented as a string.
Python works:
class Solution:
def multiply(self, num1: str, num2: str) -> str:
n = len(num1) # assume n >= m
m = len(num2)
if n < m:
num1, num2 = num2, num1
n, m = m, n
product = 0
for i in range(1, m + 1):
multiplier = int(num2[m - i]) # current character of num2
sum_ = 0
for j in range(0, n): # multiply num1 by multiplier
multiplicand = int(num1[n - j - 1])
num = multiplicand * (10 ** j) * multiplier
sum_ += num
product += sum_ * (10 ** (i - 1))
return str(product)
C++ failed:
string multiply(string num1, string num2) {
int n = num1.size();
int m = num2.size();
if (n < m) {
std::swap(num1, num2);
std::swap(n, m);
}
long long int product = 0;
for (int i = 1; i <= m; ++i) {
int multiChar = num2[m - i] - '0';
long long int sum = 0;
for (int j = 0; j < n ; ++j) {
int charCand = num1[n - j - 1] - '0';
long long int num = charCand * ((pow(10, j))) * multiChar;
sum += num;
}
product += sum * ((pow(10, i - 1)));
}
return std::to_string(product);
}
As far as I have tested, some cases are OK, but overflow seems unavoidable if the number is too big. Is there any way to fix my code?
Testcase:
"12323247989"
"98549324321"
runtime error: 1.05355e+20 is outside the range of representable values of type 'long long' (solution.cpp)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior prog_joined.cpp:28:17
Expected:
"1214447762756072040469"
You are not on the right way. Imagine how you would do that by hand:
abc*def
-------
xxxx
xxxx0
xxxx00
-------
You just add single digits as well, don't you? Only those of same significance – possibly considering some carry.
You might rather reproduce the same in code, too. Producing overflow that way is much less likely (I assume that after multiplying single digits summing up the results in a single integer – recommending an unsigned type for – is acceptable; if not, you'd have to build up a std::string again). The sign you calculate independently, just as you'd do by hand as well.
One difference to multiplication by hand we'll have, though: By hand you would create rather large intermediate numbers by multiplying one number with each digit
of the other number. That would require to store these intermediate numbers as strings again, e. g. in a vector. More efficient, though, is identifying those digit pairs of which the multiplication results in the same significance.
These will be 0|0 -> 0; 0|1, 1|0 -> 1; 0|2, 1|1, 2|0 -> 2, and so on. You produce these pairs by:
for(size_t i = 0, max = std::max(num1.length(), num2.length); i < max; ++i)
{
for(size_t j = 0; j < i; ++j)
{
if(j < num1.length() && i - j < num2.length())
{
// iterate backwards for easy carry handling
size_t idx1 = num1.length() - j;
size_t idx2 = num2.length() - (i - j);
// multiply characters at num1[idx1] and num2[idx2] and add result to sum
}
}
// add carry
// calculate last digit and a p p e n d to a result string
// update carry
}
// append '-' sign, if result is negative
std::reverse(result.begin(), result.end());
Building up the string in reverse order is more efficient, as you do not need to move the subsequent characters all the time. (Untested code, if you find a bug, please fix it yourself).
The loops are in my preferred variant; if you feel better with another, feel free to change; just be aware that with signed types you can produce endless loops if you try e. g. for(unsigned i = n; n >= 0; --i /* overflows to UINT_MAX */).
Side note: You should accept the input strings by reference (std::string const& num1, std::string const& num2), that avoids the needless copies arising by accepting by value.

What is the Big-O Notation for this code?

I am having trouble deciding between N^2 and NlogN as the Big O? Whats throwing me off is the third nested for loop from k <=j. How do I reconcile this?
int Max_Subsequence_Sum( const int A[], const int N )
{
int This_Sum = 0, Max_Sum = 0;
for (int i=0; i<N; i++)
{
for (int j=i; j<N; j++)
{
This_Sum = 0;
for (int k=i; k<=j; k++)
{
This_Sum += A[k];
}
if (This_Sum > Max_Sum)
{
Max_Sum = This_Sum;
}
}
}
return Max_Sum;
}
This can be done with estimation or analysis. Looking at the inner most loop there are j-i operations inside the second loop. To get the total number of operations one would sum to get :
(1+N)(2 N + N^2) / 6
Making the algorithm O(N^3). To estimate one can see that there are three loops which at some point have O(N) calls thus it's O(N^3).
Let us analyze the most inner loop first:
for (int k=i; k <= j; k++) {
This_Sum += A[k];
}
Here the counter k iterates from i (inclusive) to j (inclusive), this thus means that the body of the for loop is performed j-i+1 times. If we assume that fetching the k-th number from an array is done in constant time, and the arithmetic operations (incrementing k, calculating the sum of This_Sum and A[k], and comparking k with j), then this thus runs in O(j-i).
The initialization of This_Sum and the if statement is not significant:
This_Sum = 0;
// ...
if (This_Sum > Max_Sum) {
Max_Sum = This_Sum;
}
indeed, if we can compare two numbers in constant time, and set one variable to the value hold by another value in constant time, then regardless whether the condition holds or not, the number of operations is fixed.
Now we can take a look at the loop in the middle, and abstract away the most inner loop:
for (int j=i; j < N; j++) {
// constant number of oprations
// j-i+1 operations
// constant number of operations
}
Here j ranges from i to N, so that means that the total number of operations is:
N
---
\
/ j - i + 1
---
j=i
This sum is equivalent to:
N
---
\
(N-j) * (1 - i) + / j
---
j=i
This is an arithmetic sum [wiki] and it is equivalent to:
(N - i + 1) × ((1 - i) + (i+N) / 2) = (N - i + 1) × ((N-i) / 2 + 1)
or when we expand this:
i2/2 + 3×N/2 - 3×i/2 + N2/2 - N×i + 1
So that means that we can now focus on the outer loop:
for (int i=0; i<N; i++) {
// i2/2 + 3×N/2 - 3×i/2 + N2/2 - N×i + 1
}
So now we can again calculate the number of operations with:
N
---
\
/ i2/2 + 3×N/2 - 3×i/2 + N2/2 - N×i + 1
---
i=0
We can use Faulhaber's formula [wiki] here to solve this sum, and obtain:
(N+1)×(N2+5×N+6)/6
or in expanded form:
N3/6 + N2 + 11×N/6 + 1
which is thus an O(n3) algorithm.

Algorithm on hexagonal grid

Hexagonal grid is represented by a two-dimensional array with R rows and C columns. First row always comes "before" second in hexagonal grid construction (see image below). Let k be the number of turns. Each turn, an element of the grid is 1 if and only if the number of neighbours of that element that were 1 the turn before is an odd number. Write C++ code that outputs the grid after k turns.
Limitations:
1 <= R <= 10, 1 <= C <= 10, 1 <= k <= 2^(63) - 1
An example with input (in the first row are R, C and k, then comes the starting grid):
4 4 3
0 0 0 0
0 0 0 0
0 0 1 0
0 0 0 0
Simulation: image, yellow elements represent '1' and blank represent '0'.
This problem is easy to solve if I simulate and produce a grid each turn, but with big enough k it becomes too slow. What is the faster solution?
EDIT: code (n and m are used instead R and C) :
#include <cstdio>
#include <cstring>
using namespace std;
int old[11][11];
int _new[11][11];
int n, m;
long long int k;
int main() {
scanf ("%d %d %lld", &n, &m, &k);
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) scanf ("%d", &old[i][j]);
}
printf ("\n");
while (k) {
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) {
int count = 0;
if (i % 2 == 0) {
if (i) {
if (j) count += old[i-1][j-1];
count += old[i-1][j];
}
if (j) count += (old[i][j-1]);
if (j < m-1) count += (old[i][j+1]);
if (i < n-1) {
if (j) count += old[i+1][j-1];
count += old[i+1][j];
}
}
else {
if (i) {
if (j < m-1) count += old[i-1][j+1];
count += old[i-1][j];
}
if (j) count += old[i][j-1];
if (j < m-1) count += old[i][j+1];
if (i < n-1) {
if (j < m-1) count += old[i+1][j+1];
count += old[i+1][j];
}
}
if (count % 2) _new[i][j] = 1;
else _new[i][j] = 0;
}
}
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) old[i][j] = _new[i][j];
}
k--;
}
for (int i = 0; i < n; i++) {
for (int j = 0; j < m; j++) {
printf ("%d", old[i][j]);
}
printf ("\n");
}
return 0;
}
For a given R and C, you have N=R*C cells.
If you represent those cells as a vector of elements in GF(2), i.e, 0s and 1s where arithmetic is performed mod 2 (addition is XOR and multiplication is AND), then the transformation from one turn to the next can be represented by an N*N matrix M, so that:
turn[i+1] = M*turn[i]
You can exponentiate the matrix to determine how the cells transform over k turns:
turn[i+k] = (M^k)*turn[i]
Even if k is very large, like 2^63-1, you can calculate M^k quickly using exponentiation by squaring: https://en.wikipedia.org/wiki/Exponentiation_by_squaring This only takes O(log(k)) matrix multiplications.
Then you can multiply your initial state by the matrix to get the output state.
From the limits on R, C, k, and time given in your question, it's clear that this is the solution you're supposed to come up with.
There are several ways to speed up your algorithm.
You do the neighbour-calculation with the out-of bounds checking in every turn. Do some preprocessing and calculate the neighbours of each cell once at the beginning. (Aziuth has already proposed that.)
Then you don't need to count the neighbours of all cells. Each cell is on if an odd number of neighbouring cells were on in the last turn and it is off otherwise.
You can think of this differently: Start with a clean board. For each active cell of the previous move, toggle the state of all surrounding cells. When an even number of neighbours cause a toggle, the cell is on, otherwise the toggles cancel each other out. Look at the first step of your example. It's like playing Lights Out, really.
This method is faster than counting the neighbours if the board has only few active cells and its worst case is a board whose cells are all on, in which case it is as good as neighbour-counting, because you have to touch each neighbours for each cell.
The next logical step is to represent the board as a sequence of bits, because bits already have a natural way of toggling, the exclusive or or xor oerator, ^. If you keep the list of neigbours for each cell as a bit mask m, you can then toggle the board b via b ^= m.
These are the improvements that can be made to the algorithm. The big improvement is to notice that the patterns will eventually repeat. (The toggling bears resemblance with Conway's Game of Life, where there are also repeating patterns.) Also, the given maximum number of possible iterations, 2⁶³ is suspiciously large.
The playing board is small. The example in your question will repeat at least after 2¹⁶ turns, because the 4×4 board can have at most 2¹⁶ layouts. In practice, turn 127 reaches the ring pattern of the first move after the original and it loops with a period of 126 from then.
The bigger boards may have up to 2¹⁰⁰ layouts, so they may not repeat within 2⁶³ turns. A 10×10 board with a single active cell near the middle has ar period of 2,162,622. This may indeed be a topic for a maths study, as Aziuth suggests, but we'll tacke it with profane means: Keep a hash map of all previous states and the turns where they occurred, then check whether the pattern has occurred before in each turn.
We now have:
a simple algorithm for toggling the cells' state and
a compact bitwise representation of the board, which allows us to create a hash map of the previous states.
Here's my attempt:
#include <iostream>
#include <map>
/*
* Bit representation of a playing board, at most 10 x 10
*/
struct Grid {
unsigned char data[16];
Grid() : data() {
}
void add(size_t i, size_t j) {
size_t k = 10 * i + j;
data[k / 8] |= 1u << (k % 8);
}
void flip(const Grid &mask) {
size_t n = 13;
while (n--) data[n] ^= mask.data[n];
}
bool ison(size_t i, size_t j) const {
size_t k = 10 * i + j;
return ((data[k / 8] & (1u << (k % 8))) != 0);
}
bool operator<(const Grid &other) const {
size_t n = 13;
while (n--) {
if (data[n] > other.data[n]) return true;
if (data[n] < other.data[n]) return false;
}
return false;
}
void dump(size_t n, size_t m) const {
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
std::cout << (ison(i, j) ? 1 : 0);
}
std::cout << '\n';
}
std::cout << '\n';
}
};
int main()
{
size_t n, m, k;
std::cin >> n >> m >> k;
Grid grid;
Grid mask[10][10];
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
int x;
std::cin >> x;
if (x) grid.add(i, j);
}
}
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
Grid &mm = mask[i][j];
if (i % 2 == 0) {
if (i) {
if (j) mm.add(i - 1, j - 1);
mm.add(i - 1, j);
}
if (j) mm.add(i, j - 1);
if (j < m - 1) mm.add(i, j + 1);
if (i < n - 1) {
if (j) mm.add(i + 1, j - 1);
mm.add(i + 1, j);
}
} else {
if (i) {
if (j < m - 1) mm.add(i - 1, j + 1);
mm.add(i - 1, j);
}
if (j) mm.add(i, j - 1);
if (j < m - 1) mm.add(i, j + 1);
if (i < n - 1) {
if (j < m - 1) mm.add(i + 1, j + 1);
mm.add(i + 1, j);
}
}
}
}
std::map<Grid, size_t> prev;
std::map<size_t, Grid> pattern;
for (size_t turn = 0; turn < k; turn++) {
Grid next;
std::map<Grid, size_t>::const_iterator it = prev.find(grid);
if (1 && it != prev.end()) {
size_t start = it->second;
size_t period = turn - start;
size_t index = (k - turn) % period;
grid = pattern[start + index];
break;
}
prev[grid] = turn;
pattern[turn] = grid;
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
if (grid.ison(i, j)) next.flip(mask[i][j]);
}
}
grid = next;
}
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < m; j++) {
std::cout << (grid.ison(i, j) ? 1 : 0);
}
std::cout << '\n';
}
return 0;
}
There is probably room for improvement. Especially, I'm not so sure how it fares for big boards. (The code above uses an ordered map. We don't need the order, so using an unordered map will yield faster code. The example above with a single active cell on a 10×10 board took significantly longer than a second with an ordered map.)
Not sure about how you did it - and you should really always post code here - but let's try to optimize things here.
First of all, there is not really a difference between that and a quadratic grid. Different neighbor relationships, but I mean, that is just a small translation function. If you have a problem there, we should treat this separately, maybe on CodeReview.
Now, the naive solution is:
for all fields
count neighbors
if odd: add a marker to update to one, else to zero
for all fields
update all fields by marker of former step
this is obviously in O(N). Iterating twice is somewhat twice the actual run time, but should not be that bad. Try not to allocate space every time that you do that but reuse existing structures.
I'd propose this solution:
at the start:
create a std::vector or std::list "activated" of pointers to all fields that are activated
each iteration:
create a vector "new_activated"
for all items in activated
count neighbors, if odd add to new_activated
for all items in activated
set to inactive
replace activated by new_activated*
for all items in activated
set to active
*this can be done efficiently by putting them in a smart pointer and use move semantics
This code only works on the activated fields. As long as they stay within some smaller area, this is far more efficient. However, I have no idea when this changes - if there are activated fields all over the place, this might be less efficient. In that case, the naive solution might be the best one.
EDIT: after you now posted your code... your code is quite procedural. This is C++, use classes and use representation of things. Probably you do the search for neighbors right, but you can easily make mistakes there and therefore should isolate that part in a function, or better method. Raw arrays are bad and variables like n or k are bad. But before I start tearing your code apart, I instead repeat my recommendation, put the code on CodeReview, having people tear it apart until it is perfect.
This started off as a comment, but I think it could be helpful as an answer in addition to what has already been stated.
You stated the following limitations:
1 <= R <= 10, 1 <= C <= 10
Given these restrictions, I'll take the liberty to can represent the grid/matrix M of R rows and C columns in constant space (i.e. O(1)), and also check its elements in O(1) instead of O(R*C) time, thus removing this part from our time-complexity analysis.
That is, the grid can simply be declared as bool grid[10][10];.
The key input is the large number of turns k, stated to be in the range:
1 <= k <= 2^(63) - 1
The problem is that, AFAIK, you're required to perform k turns. This makes the algorithm be in O(k). Thus, no proposed solution can do better than O(k)[1].
To improve the speed in a meaningful way, this upper-bound must be lowered in some way[1], but it looks like this cannot be done without altering the problem constraints.
Thus, no proposed solution can do better than O(k)[1].
The fact that k can be so large is the main issue. The most anyone can do is improve the rest of the implementation, but this will only improve by a constant factor; you'll have to go through k turns regardless of how you look at it.
Therefore, unless some clever fact and/or detail is found that allows this bound to be lowered, there's no other choice.
[1] For example, it's not like trying to determine if some number n is prime, where you can check all numbers in the range(2, n) to see if they divide n, making it a O(n) process, or notice that some improvements include only looking at odd numbers after checking n is not even (constant factor; still O(n)), and then checking odd numbers only up to √n, i.e., in the range(3, √n, 2), which meaningfully lowers the upper-bound down to O(√n).

C++ - Code Optimization

I have a problem:
You are given a sequence, in the form of a string with characters ‘0’, ‘1’, and ‘?’ only. Suppose there are k ‘?’s. Then there are 2^k ways to replace each ‘?’ by a ‘0’ or a ‘1’, giving 2^k different 0-1 sequences (0-1 sequences are sequences with only zeroes and ones).
For each 0-1 sequence, define its number of inversions as the minimum number of adjacent swaps required to sort the sequence in non-decreasing order. In this problem, the sequence is sorted in non-decreasing order precisely when all the zeroes occur before all the ones. For example, the sequence 11010 has 5 inversions. We can sort it by the following moves: 11010 →→ 11001 →→ 10101 →→ 01101 →→ 01011 →→ 00111.
Find the sum of the number of inversions of the 2^k sequences, modulo 1000000007 (10^9+7).
For example:
Input: ??01
-> Output: 5
Input: ?0?
-> Output: 3
Here's my code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <math.h>
using namespace std;
void ProcessSequences(char *input)
{
int c = 0;
/* Count the number of '?' in input sequence
* 1??0 -> 2
*/
for(int i=0;i<strlen(input);i++)
{
if(*(input+i) == '?')
{
c++;
}
}
/* Get all possible combination of '?'
* 1??0
* -> ??
* -> 00, 01, 10, 11
*/
int seqLength = pow(2,c);
// Initialize 2D array of integer
int **sequencelist, **allSequences;
sequencelist = new int*[seqLength];
allSequences = new int*[seqLength];
for(int i=0; i<seqLength; i++){
sequencelist[i] = new int[c];
allSequences[i] = new int[500000];
}
//end initialize
for(int count = 0; count < seqLength; count++)
{
int n = 0;
for(int offset = c-1; offset >= 0; offset--)
{
sequencelist[count][n] = ((count & (1 << offset)) >> offset);
// cout << sequencelist[count][n];
n++;
}
// cout << std::endl;
}
/* Change '?' in former sequence into all possible bits
* 1??0
* ?? -> 00, 01, 10, 11
* -> 1000, 1010, 1100, 1110
*/
for(int d = 0; d<seqLength; d++)
{
int seqCount = 0;
for(int e = 0; e<strlen(input); e++)
{
if(*(input+e) == '1')
{
allSequences[d][e] = 1;
}
else if(*(input+e) == '0')
{
allSequences[d][e] = 0;
}
else
{
allSequences[d][e] = sequencelist[d][seqCount];
seqCount++;
}
}
}
/*
* Sort each sequences to increasing mode
*
*/
// cout<<endl;
int totalNum[seqLength];
for(int i=0; i<seqLength; i++){
int num = 0;
for(int j=0; j<strlen(input); j++){
if(j==strlen(input)-1){
break;
}
if(allSequences[i][j] > allSequences[i][j+1]){
int temp = allSequences[i][j];
allSequences[i][j] = allSequences[i][j+1];
allSequences[i][j+1] = temp;
num++;
j = -1;
}//endif
}//endfor
totalNum[i] = num;
}//endfor
/*
* Sum of all Num of Inversions
*/
int sum = 0;
for(int i=0;i<seqLength;i++){
sum = sum + totalNum[i];
}
// cout<<"Output: "<<endl;
int out = sum%1000000007;
cout<< out <<endl;
} //end of ProcessSequences method
int main()
{
// Get Input
char seq[500000];
// cout << "Input: "<<endl;
cin >> seq;
char *p = &seq[0];
ProcessSequences(p);
return 0;
}
the results were right for small size input, but for bigger size input I got time CPU time limit > 1 second. I also got exceeded memory size. How to make it faster and optimal memory use? What algorithm should I use and what better data structure should I use?, Thank you.
Dynamic programming is the way to go. Imagine You are adding the last character to all sequences.
If it is 1 then You get XXXXXX1. Number of swaps is obviously the same as it was for every sequence so far.
If it is 0 then You need to know number of ones already in every sequence. Number of swaps would increase by the amount of ones for every sequence.
If it is ? You just add two previous cases together
You need to calculate how many sequences are there. For every length and for every number of ones (number of ones in the sequence can not be greater than length of the sequence, naturally). You start with length 1, which is trivial, and continue with longer. You can get really big numbers, so You should calculate modulo 1000000007 all the time. The program is not in C++, but should be easy to rewrite (array should be initialized to 0, int is 32bit, long in 64bit).
long Mod(long x)
{
return x % 1000000007;
}
long Calc(string s)
{
int len = s.Length;
long[,] nums = new long[len + 1, len + 1];
long sum = 0;
nums[0, 0] = 1;
for (int i = 0; i < len; ++i)
{
if(s[i] == '?')
{
sum = Mod(sum * 2);
}
for (int j = 0; j <= i; ++j)
{
if (s[i] == '0' || s[i] == '?')
{
nums[i + 1, j] = Mod(nums[i + 1, j] + nums[i, j]);
sum = Mod(sum + j * nums[i, j]);
}
if (s[i] == '1' || s[i] == '?')
{
nums[i + 1, j + 1] = nums[i, j];
}
}
}
return sum;
}
Optimalization
The code above is written to be as clear as possible and to show dynamic programming approach. You do not actually need array [len+1, len+1]. You calculate column i+1 from column i and never go back, so two columns are enough - old and new. If You dig more into it, You find out that row j of new column depends only on row j and j-1 of the old column. So You can go with one column if You actualize the values in the right direction (and do not overwrite values You would need).
The code above uses 64bit integers. You really need that only in j * nums[i, j]. The nums array contain numbers less than 1000000007 and 32bit integer is enough. Even 2*1000000007 can fit into 32bit signed int, we can make use of it.
We can optimize the code by nesting loop into conditions instead of conditions in the loop. Maybe it is even more natural approach, the only downside is repeating the code.
The % operator is, as every dividing, quite expensive. j * nums[i, j] is typically far smaller that capacity of 64bit integer, so we do not have to do modulo in every step. Just watch the actual value and apply when needed. The Mod(nums[i + 1, j] + nums[i, j]) can also be optimized, as nums[i + 1, j] + nums[i, j] would always be smaller than 2*1000000007.
And finally the optimized code. I switched to C++, I realized there are differences what int and long means, so rather make it clear:
long CalcOpt(string s)
{
long len = s.length();
vector<long> nums(len + 1);
long long sum = 0;
nums[0] = 1;
const long mod = 1000000007;
for (long i = 0; i < len; ++i)
{
if (s[i] == '1')
{
for (long j = i + 1; j > 0; --j)
{
nums[j] = nums[j - 1];
}
nums[0] = 0;
}
else if (s[i] == '0')
{
for (long j = 1; j <= i; ++j)
{
sum += (long long)j * nums[j];
if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
}
}
else
{
sum *= 2;
if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
for (long j = i + 1; j > 0; --j)
{
sum += (long long)j * nums[j];
if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
long add = nums[j] + nums[j - 1];
if (add >= mod) { add -= mod; }
nums[j] = add;
}
}
}
return (long)(sum % mod);
}
Simplification
Time limit still exceeded? There is probably better way to do it. You can either
get back to the beginning and find out mathematically different way to calculate the result
or simplify actual solution using math
I went the second way. What we are doing in the loop is in fact convolution of two sequences, for example:
0, 0, 0, 1, 4, 6, 4, 1, 0, 0,... and 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,...
0*0 + 0*1 + 0*2 + 1*3 + 4*4 + 6*5 + 4*6 + 1*7 + 0*8...= 80
The first sequence is symmetric and the second is linear. It this case, the sum of convolution can be calculated from sum of the first sequence which is = 16 (numSum) and number from second sequence corresponding to the center of the first sequence, which is 5 (numMult). numSum*numMult = 16*5 = 80. We replace the whole loop with one multiplication if we are able to update those numbers in each step, which fortulately seems the case.
If s[i] == '0' then numSum does not change and numMult does not change.
If s[i] == '1' then numSum does not change, only numMult increments by 1, as we shift the whole sequence by one position.
If s[i] == '?' we add original and shiftet sequence together. numSum is multiplied by 2 and numMult increments by 0.5.
The 0.5 means a bit problem, as it is not the whole number. But we know, that the result would be whole number. Fortunately in modular arithmetics in this case exists inversion of two (=1/2) as a whole number. It is h = (mod+1)/2. As a reminder, inversion of 2 is such a number, that h*2=1 modulo mod. Implementation wisely it is easier to multiply numMult by 2 and divide numSum by 2, but it is just a detail, we would need 0.5 anyway. The code:
long CalcOptSimpl(string s)
{
long len = s.length();
long long sum = 0;
const long mod = 1000000007;
long numSum = (mod + 1) / 2;
long long numMult = 0;
for (long i = 0; i < len; ++i)
{
if (s[i] == '1')
{
numMult += 2;
}
else if (s[i] == '0')
{
sum += numSum * numMult;
if (sum > std::numeric_limits<long long>::max() / 4) { sum %= mod; }
}
else
{
sum = sum * 2 + numSum * numMult;
if (sum > std::numeric_limits<long long>::max() / 4) { sum %= mod; }
numSum = (numSum * 2) % mod;
numMult++;
}
}
return (long)(sum % mod);
}
I am pretty sure there exists some simple way to get this code, yet I am still unable to see it. But sometimes path is the goal :-)
If a sequence has N zeros with indexes zero[0], zero[1], ... zero[N - 1], the number of inversions for it would be (zero[0] + zero[1] + ... + zero[N - 1]) - (N - 1) * N / 2. (you should be able to prove it)
For example, 11010 has two zeros with indexes 2 and 4, so the number of inversions would be 2 + 4 - 1 * 2 / 2 = 5.
For all 2^k sequences, you can calculate the sum of two parts separately and then add them up.
1) The first part is zero[0] + zero[1] + ... + zero[N - 1]. Each 0 in the the given sequence contributes index * 2^k and each ? contributes index * 2^(k-1)
2) The second part is (N - 1) * N / 2. You can calculate this using a dynamic programming (maybe you should google and learn this first). In short, use f[i][j] to present the number of sequence with j zeros using the first i characters of the given sequence.

Overloaded division operator for HugeInt [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Division with really big numbers
I need to overload the / operator to work on two HugeInt objects which are defined simply as an array of 30 shorts. This is homework, btw, but I have been wracking my brain for days on this problem.
I have overloaded the * operator already:
HugeInt HugeInt::operator*(const HugeInt &op2){
HugeInt temp;
short placeProducts[hugeIntSize + 1][hugeIntSize] = {0};
short product;
int carry = 0;
int k, leftSize, rightSize, numOfSumRows;
leftSize = getDigitLength();
rightSize = op2.getDigitLength();
if(leftSize <= rightSize) {
numOfSumRows = leftSize;
for(int i = (hugeIntSize - 1), k = 0; k < numOfSumRows; i--, k++) {
for(int j = (hugeIntSize - 1); j >= k; j--) {
product = integer[i] * op2.integer[j] + carry;
if (product > 9) {
carry = product / 10;
product %= 10;
} else {
carry = 0;
}
placeProducts[k][j - k] = product;
}
}
} else {
numOfSumRows = rightSize;
for(int i = (hugeIntSize - 1), k = 0; k < numOfSumRows; i--, k++) {
for(int j = (hugeIntSize - 1); j >= k; j--) {
product = integer[j] * op2.integer[i] + carry;
if (product > 9) {
carry = product / 10;
product %= 10;
} else {
carry = 0;
}
placeProducts[k][j - k] = product;
}
}
}
sumProductsArray(placeProducts, numOfSumRows);
for(int i = 0; i < hugeIntSize; i++)
{
temp.integer[i] = placeProducts[hugeIntSize][i];
}
return temp;}
But how do I overload the / op? My main problem isn't with the C++ code or syntax, but with my algorithm to divide. When I multiply I am able to do it digit by digit. I store each product (aka 1's digit of bottom times every digit above, then 10's digit time every num above using my carry algorithm) in my 2d array. Every time I get new product it is offset to the left by n + 1, which "multiplies" it by the required power of 10. Then I just sum up all the columns.
I can't for the life of me figure out how to code the long division method. Since I'm dealing with two arrays it has to be digit by digit, and I suspect it might be as easy as reversing the multiplication algorithm. Nested loops and subtraction? I would need a variable for the quotient and reminder? Is there a better method? I just need to be pointed in the right direction.
In computational division of integers, there are a few interesting results:
numerator < denominator implies quotient = 0
numerator == denominator implies quotient = 1
numerator > denominator, long division would be needed to determine quotient.
The first two conditions can be satisfied with a for loop. You could overload the less-than and equals relational operator to encapsulate this behavior.
For the long division, you will need your multiplication operator as well as overloaded less-than and subtraction operators, and an append digit member function to perform the operation.
It's brute force, but should get the job done.