Dividing large integer by large integer - c++

Guys, I'm working on a class called LINT (large int) for learning purposes, and everything went OK till now. I'm stuck on implementing operator/(const LINT&). The problem is that when I want to divide a LINT by a LINT I run into a recursive function invocation, i.e.:
//unfinished
LINT_rep LINT_rep::divide_(const LINT_rep& bottom) const
{
    typedef LINT_rep::Iterator iter;
    iter topBeg = begin();
    iter topEnd = end();
    iter bottomBeg = bottom.begin();
    iter bottomEnd = bottom.end();

    LINT_rep topTmp; // for storing the smallest number (dividend prefix) which can be divided by the divisor
    while (topBeg != topEnd)
    {
        topTmp.insert_(*topBeg); // number not large enough yet: bring down another digit
        if (topTmp >= bottom)
        { // ok, number >= divisor, we can divide
            LINT_rep topShelf = topTmp / bottom; // HERE I'M RUNNING INTO TROUBLE
        }
        else
        {
        }
        ++topBeg;
    }
    return LINT_rep("-1"); // DUMMY
}
What I'm trying to do is to implement this as if I were dividing those numbers by hand, so for example with 1589 as the dividend and 27 as the divisor I would go like so:
check if the leading digit is >= the divisor, and if so divide
if not, bring down another digit and check again whether a > b
At some point the prefix will be big enough (in the simplified scenario), and then I have to divide; but at that point I run into the recursive call and I have no idea how to break out of it.
One note: as a tmp I have to use a LINT rather than, say, an int, because those numbers may not fit into an int.
So generally what I'm asking is: is there any other way to do division? Or maybe there is false logic in my thinking (quite possible)?
Thank you.

When doing your step (1) you can't divide; you have to repeatedly subtract, or guess a multiple to subtract, just like when you do it by hand. You can 'guess' more effectively by setting upper and lower bounds for the multiple required and doing a binary chop through the range.
I've done a similar thing myself; it's a handy exercise to practice operator overloading. I can supply a snippet of code if you like, although it uses arrays and half-baked exceptions so I hesitate to offer it up before the expert readers of this site.
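For what it's worth, here is a minimal sketch of that binary chop, assuming a big-int type that supports comparison and multiplication by a small int (quotientDigit is my name, not part of your class):

// Find the largest decimal digit d with divisor * d <= chunk, by binary
// chop over the candidate range [0, 9]. No recursion: only comparison and
// multiplication by a small int are needed.
template <typename BigInt>
int quotientDigit(const BigInt& chunk, const BigInt& divisor) {
    int lo = 0, hi = 9;
    while (lo < hi) {
        int mid = (lo + hi + 1) / 2;           // round up so the loop makes progress
        if (divisor * mid <= chunk) lo = mid;  // mid still fits, keep it as a candidate
        else hi = mid - 1;                     // too big, discard mid and everything above
    }
    return lo; // the next quotient digit; the caller subtracts divisor * lo from chunk
}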

First, please don't work on such a class for real use. Use CGAL's big int; there was also a Boost bigint submission, I think, and there are three or four other popular implementations.
Second, the division algorithm is described here: http://en.wikipedia.org/wiki/Long_division
[EDIT] The correct way to do it, for digit k of the result C:
If the leading digit (from the left) of A, call it A[nA-1], is smaller than B[nB-1], write zero into C[k] and do k-- (move to the next digit).
Otherwise, you seek the maximum digit C[k] such that C[k]*B*10^k <= A. That is done in a loop; the previous sentence is actually just a special case of this one. But it is not yet finished: you then do A -= C[k]*B*10^k (the subtracted part was zero in the first case), and only then
k-- (next digit). Loop until k == 0.
No need for recursion, just nested loops:
one loop over k (the digits of the result), one loop for finding each digit, and one loop (inside it) for subtracting (the -= operator).
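To make the digit-by-digit process concrete, here is a self-contained sketch that works on decimal strings rather than the asker's LINT interface (illustrative names; each digit is found by repeated subtraction, at most nine per position):

#include <cassert>
#include <string>

// Compare two decimal strings numerically: is a < b?
bool lessThan(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return a.size() < b.size();
    return a < b; // same length: lexicographic order == numeric order
}

// a - b for decimal strings, assuming a >= b.
std::string subtract(std::string a, const std::string& b) {
    int borrow = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        int da = a[a.size() - 1 - i] - '0';
        int db = i < b.size() ? b[b.size() - 1 - i] - '0' : 0;
        int d = da - db - borrow;
        borrow = d < 0;
        if (borrow) d += 10;
        a[a.size() - 1 - i] = char('0' + d);
    }
    std::size_t nz = a.find_first_not_of('0');
    return nz == std::string::npos ? "0" : a.substr(nz);
}

// Quotient of a / b, digit by digit, exactly as done by hand.
std::string divide(const std::string& a, const std::string& b) {
    assert(b != "0");
    std::string quotient, chunk = "0";
    for (char c : a) {
        // Bring down the next digit of the dividend.
        chunk = (chunk == "0") ? std::string(1, c) : chunk + c;
        // Find the quotient digit by repeated subtraction.
        int digit = 0;
        while (!lessThan(chunk, b)) { chunk = subtract(chunk, b); ++digit; }
        quotient += char('0' + digit);
    }
    std::size_t nz = quotient.find_first_not_of('0');
    return nz == std::string::npos ? "0" : quotient.substr(nz);
}

For 1589 / 27 this yields "58" (the remainder, 23, is left in chunk), exactly as the by-hand procedure gives.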


How is binary search applicable here (since the values are not monotonic)?

I am solving a LeetCode problem Search in Rotated Sorted Array, in order to learn Binary Search better. The problem statement is:
There is an integer array nums sorted in ascending order (with distinct values). Prior to being passed to your function, nums is possibly rotated at an unknown pivot index. For example, [0,1,2,4,5,6,7] might be rotated at pivot index 3 and become [4,5,6,7,0,1,2]. Given the array nums after the possible rotation and an integer target, return the index of target if it is in nums, or -1 if it is not in nums.
With some online help, I came up with the solution below, which I mostly understand:
class Solution {
public:
    int search(vector<int>& nums, int target) {
        int l = 0, r = nums.size() - 1;
        while (l < r) { // 1st loop; how is BS applicable here, since array is NOT sorted?
            int m = l + (r - l) / 2;
            if (nums[m] > nums[r]) l = m + 1;
            else r = m;
        }
        // cout << "Lowest at: " << r << "\n";
        if (nums[r] == target) return r; // target == lowest number
        int start, end;
        if (target <= nums[nums.size() - 1]) {
            start = r;
            end = nums.size() - 1;
        } else {
            start = 0;
            end = r;
        }
        l = start, r = end;
        while (l < r) {
            int m = l + (r - l) / 2;
            if (nums[m] == target) return m;
            if (nums[m] > target) r = m;
            else l = m + 1;
        }
        return nums[l] == target ? l : -1;
    }
};
My question: Are we searching over a parabola in the first while loop, trying to find the lowest point of a parabola, unlike a linear array in traditional binary search? Are we finding the minimum of a convex function? I understand how the values of l, m and r change leading to the right answer - but I do not fully follow how we can be guaranteed that if(nums[m]>nums[r]), our lowest value would be on the right.
You actually skipped something important by “getting help”.
Once, when I was struggling to integrate something tricky for Calculus Ⅰ, I went for help and the advisor said, “Oh, I know how to do this” and solved it. I learned nothing from him. It took me another week of going over it (and other problems) myself to understand it well enough that I could do it myself.
The purpose of these assignments is to solve the problem yourself. Even if your solution is faulty, you have learned more than simply reading and understanding the basics of one example problem someone else has solved.
In this particular case...
Since you already have a solution, let’s take a look at it: Notice that it contains two binary search loops. Why?
As you observed at the beginning, the rotation makes the array discontinuous (not convex). However, the subarrays on either side of the discontinuity remain monotonic, and that is what answers your guarantee question: if nums[m] > nums[r], order is broken somewhere in (m, r], so the rotation point, and with it the lowest value, must lie to the right of m; otherwise nums[m..r] is already sorted and the lowest value in the window is at m or to its left.
Take a moment to convince yourself that this is true.
Knowing this, what would be a good way to find and determine which of the two subarrays to search?
Hints:
A binary search is O(log n) as n ⟶ ∞
O(log n) ≡ O(2 log n)
I should also observe that the prompt gives as an example an arithmetic progression with a common difference of 1, but the prompt itself imposes no such restriction. All it says is that you start with a strictly increasing sequence (no duplicate values). You could have as input [19, 74, 512, 513, 3, 7, 12].
Does the supplied solution handle this possibility?
Why or why not?

How do I calculate the time complexity of the following function?

Here is a recursive function which traverses a map of strings (multimap<string, string> graph). It checks whether itr->second (s_tmp) is equal to the desired string (Exp); if so, it prints itr->first, and the function is executed again for that itr->first.
string findOriginalExp(string Exp){
    cout << "*****findOriginalExp Function*****" << endl;
    string str;
    if (graph.empty()) {
        str = "map is empty";
    } else {
        for (auto itr = graph.begin(); itr != graph.end(); itr++) {
            string s_tmp = itr->second;
            string f_tmp = itr->first;
            string nll = "null";
            //s_tmp.compare(Exp) == 0
            if (s_tmp == Exp) {
                if (f_tmp.compare(nll) == 0) {
                    cout << Exp << " :is original experience.";
                    return Exp;
                } else {
                    return findOriginalExp(itr->first);
                }
            } else {
                str = "No element is equal to Exp.";
            }
        }
    }
    return str;
}
There are no rules for stopping and it seems to be completely random. How is the time complexity of this function calculated?
I am not going to analyse your function but will instead try to answer in a more general way. It seems like you are looking for a simple expression such as O(n) or O(n^2) for the complexity of your function. However, complexity is not always that simple to estimate.
In your case it strongly depends on the contents of graph and on what the user passes as the parameter.
As an analogy consider this function:
int foo(int x){
    if (x == 0) return x;
    if (x == 42) return foo(42);
    if (x > 0) return foo(x-1);
    return foo(x/2);
}
In the worst case it never returns to the caller. If we ignore x >= 42 then the worst-case complexity is O(n). That alone isn't very useful information for the user. What I really need to know as a user is:
Don't ever call it with x >= 42.
O(1) if x == 0
O(x) if x > 0
O(log(-x)) if x < 0
Now try to make similar considerations for your function. The easy case is when Exp is not in graph; in that case there is no recursion. I am almost sure that for the "right" input your function can be made to never return. Find out what those cases are and document them. In between you have cases that return after a finite number of steps. If you have no clue at all how to get your hands on them analytically, you can always set up a benchmark and measure. Measuring the runtime for input sizes 10, 50, 100, 1000, ... should be sufficient to distinguish between linear, quadratic and logarithmic dependence.
PS: Just a tip: Don't forget what the code is actually supposed to do and what time complexity is needed to solve that problem (often it is easier to discuss that in an abstract way rather than diving too deep into code). In the silly example above the whole function can be replaced by its equivalent int foo(int){ return 0; } which obviously has constant complexity and does not need to be any more complex than that.
This function takes a directed graph and a vertex in that graph and chases edges going into it backwards to find a vertex with no edge pointing into it. The operation of finding the vertex "behind" any given vertex takes O(n) string comparisons in n the number of k/v pairs in the graph (this is the for loop). It does this m times, where m is the length of the path it must follow (which it does through the recursion). Therefore, it has time complexity O(m * n) string comparisons in n the number of k/v pairs and m the length of the path.
Note that there's generally no such thing as "the" time complexity for just some function you see written in code. You have to define what variables you want to describe the time in terms of, and also the operations with which you want to measure the time. E.g. if we want to write this purely in terms of n the number of k/v pairs, you run into a problem, because if the graph contains a suitably placed cycle, the function doesn't terminate! If you further constrain the graph to be acyclic, then the maximum length of any path is constrained by m < n, and then you can also get that this function does O(n^2) string comparisons for an acyclic graph with n edges.
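If it helps to see that bound concretely, here is a sketch of the same backward walk written iteratively; the visited set (my addition, not in the original) forces termination even when the graph contains a cycle:

#include <map>
#include <set>
#include <string>

std::multimap<std::string, std::string> graph; // as in the question

// Each pass over the map costs O(n) string comparisons, and the visited
// set caps the number of passes at n, so the whole walk is O(n^2) even
// on cyclic inputs.
std::string findOriginalExp(std::string exp) {
    std::set<std::string> visited;
    while (visited.insert(exp).second) {            // stop if we ever see exp twice
        bool stepped = false;
        for (const auto& [parent, child] : graph) { // O(n) scan
            if (child == exp) {
                if (parent == "null") return exp;   // found the original experience
                exp = parent;                       // follow the edge backwards
                stepped = true;
                break;
            }
        }
        if (!stepped) return "No element is equal to Exp.";
    }
    return "cycle detected"; // the recursive version would never return here
}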
You should approximate the control flow of the recursive calls with a recurrence relation. It's been about 30 years since I took college classes in discrete math, but generally you write something like pseudocode, just enough to see how many calls there are. In some cases just counting the number of calls on the right-hand side of the recurrence is enough, but you generally need to plug one expansion back into itself and from that derive a polynomial or power relationship.
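To make that concrete for the function above (my sketch of the analysis): each call performs one O(n) scan of the map plus at most one recursive call, so T(m) = T(m-1) + c*n, where m is the remaining path length. Plugging the expansion back in gives T(m) = T(m-2) + 2*c*n, and after m expansions T(m) = O(m*n), the same O(n^2) bound as above once acyclicity gives m < n.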

Heap buffer overflow occurs randomly for a simple piece of code? (I'm new to C++)

ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000114 at pc 0x000000406d27 bp 0x7ffc88f07560 sp 0x7ffc88f07558
READ of size 4 at 0x602000000114 thread T0
This is LeetCode No. 1. I get the error above with the code below. It works for some other inputs, but for the input [3,2,4] with target 6 it shows the error.
vector<int> twoSum(vector<int>& nums, int target) {
    int first = 0, last = nums.size() - 1;
    vector<int> ref = nums;
    while (first < last) {
        if (ref[first] + ref[last] > target) last--;
        else if (ref[first] + ref[last] < target) first++;
        else break;
    }
    vector<int> result;
    for (int i = 0; i < nums.size(); i++) {
        if (ref[first] == nums[i]) result.push_back(i);
        else if (ref[last] == nums[i]) result.push_back(i);
    }
    if (result[0] > result[1])
        swap(result[0], result[1]);
    return result;
}
The expected output is [1,2], indexes of values in the array adding up to the value 6.
Consider this while loop.
while (first < last) {
    if (ref[first] + ref[last] > target) last--;
    else if (ref[first] + ref[last] < target) first++;
    else break;
}
It seems that the intent was to break and exit when the sum is exactly equal to the target number. However, it is not guaranteed that this will ever become true. You can also exit the loop when the while condition fails, which happens whenever you reach first == last without yet finding an exact match. That is exactly what happens in the case you mention; follow the logic through and you will find this yourself. The search misses the desired answer: it will not find [1,2]. It first considers [0,2], and when that fails as too big, it permanently decrements last and never again considers any combination that involves position 2.
(Likewise, if a sum fails for being too small, the loop increments first and never again considers combinations with that first value. So there are other inputs that fail similarly under that scenario.)
Since you exit with first == last and without finding a matching combination, only one number is pushed into result. Then, when you simply assume there are two numbers (false), things blow up as you try to reference the second element.
General Observation:
You need to plan for the case where no exact match is found and code with that possibility in mind. In that case, what would a correct return result look like to signify no solution was found?
Plus, you could think about how the algorithm could be better at not missing a solution when it is actually present. However, that doesn't change the first requirement. If the target cannot be matched by any sum, you need to be ready for that possibility.
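For instance, here is a sketch of one common fix (only one of several possibilities): replace the two-pointer scan, which assumes a sorted array, with a hash map from value to index, and return an empty vector when nothing matches so the caller can detect the no-solution case:

#include <unordered_map>
#include <vector>
using std::vector;

// One-pass hash-map approach: returns {} when no pair sums to target,
// so the caller must be prepared for an empty result.
vector<int> twoSum(const vector<int>& nums, int target) {
    std::unordered_map<int, int> seen; // value -> index of an earlier element
    for (int i = 0; i < (int)nums.size(); ++i) {
        auto it = seen.find(target - nums[i]);
        if (it != seen.end()) return {it->second, i}; // earlier index first
        seen[nums[i]] = i;
    }
    return {}; // no exact match: signal "no solution" instead of crashing
}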
Side Notes:
Rather than repeating the sum in the if statements, since the sum doesn't change within an iteration I would suggest computing it once into a local variable, e.g.
auto sum(ref[first]+ref[last]);
If you want to ensure that the argument vector nums is not changed, and communicate that clearly to anyone looking at the declaration of the function, a better choice would be to pass it as a const reference, e.g.
(const vector<int>& nums, ...)
Why does the code create a local copy called ref of the argument vector nums? What is the point of making the effort to make the copy?
Regarding...
last = nums.size() - 1
...notice that if the vector passed in is empty, the value of last goes negative. That might not cause a problem for some code, but it has a dangerous smell: it looks like code that simply assumes the vector passed in is never empty. Practice defensive coding that can be seen to guard against the possibility of unusual input values.
p.s. Part of what saves that initialization of last from being broken is the use of int. Since size() returns size_t (unsigned), a common problem is to keep the result as unsigned size_t. Then instead of going negative, the value wraps around to the maximum, and the loop may treat that as if it were a valid position in the vector. It's hazardous to get into habits that invite those kinds of bugs.

Given (a, b) compute the maximum value of k such that a^{1/k} and b^{1/k} are whole numbers

I'm writing a program that tries to find the minimum value of k > 1 such that the kth root of a and b (which are both given) equals a whole number.
Here's a snippet of my code, which I've commented for clarification.
int main()
{
    // Declare the variables a and b.
    double a;
    double b;
    // Read in variables a and b.
    while (cin >> a >> b) {
        int k = 2;
        // We require the kth root of a and b to both be whole numbers.
        // "while a^{1/k} and b^{1/k} are not both whole numbers..."
        while ((fmod(pow(a, 1.0/k), 1) != 1.0) || (fmod(pow(b, 1.0/k), 1) != 0)) {
            k++;
        }
        // ... (rest of the loop omitted)
    }
}
Pretty much, I read in (a, b), and I start from k = 2 and increment k until the kth roots of a and b are both congruent to 0 mod 1 (meaning they have no fractional part and are thus whole numbers).
But, the loop runs infinitely. I've tried researching, and I think it might have to do with precision error; however, I'm not too sure.
Another approach I've tried is changing the loop condition to check whether the floor of a^{1/k} equals a^{1/k} itself. But again, this runs infinitely, likely due to precision error.
Does anyone know how I can fix this issue?
EDIT: for example, when (a, b) = (216, 125), I want to get k = 3, because 216^(1/3) and 125^(1/3) are both integers (namely, 6 and 5).
That is not a programming problem but a mathematical one:
if a is a real number and k a positive integer, and a^(1/k) is an integer, then a is an integer (otherwise the aim is to toy with approximation error).
So the fastest approach may be to first check that a and b are integers, then do a prime decomposition such that a = p0^e0 * p1^e1 * ..., where the pi are distinct primes.
Notice that, for a^(1/k) to be an integer, each ei must be divisible by k. In other words, k must be a common divisor of the ei. The same must be true of the prime exponents of b if b^(1/k) is to be an integer.
Thus the largest k is the greatest common divisor of all the ei of both a and b.
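A minimal sketch of that approach for inputs that fit in 64 bits (trial division is enough at this size; maxRoot is an illustrative name, and a, b > 1 is assumed):

#include <cstdint>
#include <numeric> // std::gcd

// k = gcd of all prime exponents of a and b.
uint64_t maxRoot(uint64_t a, uint64_t b) {
    uint64_t k = 0; // gcd identity: gcd(0, e) == e
    for (uint64_t n : {a, b}) {
        for (uint64_t p = 2; p * p <= n; ++p) {
            if (n % p) continue;
            uint64_t e = 0;
            while (n % p == 0) { n /= p; ++e; } // strip out p^e
            k = std::gcd(k, e);
        }
        if (n > 1) k = std::gcd(k, uint64_t{1}); // leftover prime has exponent 1
    }
    return k; // e.g. maxRoot(216, 125) == 3
}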
With your approach you will have problems with large numbers. IEEE 754 binary64 floating point (the case of double on x86) has 53 significant bits. That means every double larger than 2^53 is an integer.
The function pow(x, 1./k) will return the same value for two different x, so with your approach you will necessarily get false answers. For example, the numbers 5^5 * 2^90 and 3^5 * 2^120 are exactly representable as doubles, and the result of the algorithm is k = 5. You will find that value of k for these numbers, but you will also find k = 5 for 5^5 * 2^90 - 2^49 and 3^5 * 2^120, because pow(5^5 * 2^90 - 2^49, 1./5) == pow(5^5 * 2^90, 1./5).
On the other hand, as there are only 53 significant bits, prime decomposition of a double is trivial.
Floating-point numbers are not mathematical real numbers. The computation is "approximate". See http://floating-point-gui.de/
You could replace the test fmod(pow(a, 1.0/k), 1) != 1.0 with something like fabs(fmod(pow(a, 1.0/k), 1) - 1.0) > 0.0000001 (and play with various such ε instead of 0.0000001; see also std::numeric_limits<double>::epsilon(), but use it carefully, since pow might give some error in its computations, and 1.0/k also injects imprecision - the details are very complex, dive into the IEEE 754 specifications).
Of course, you could (and probably should) define your own bool almost_equal(double x, double y) function (and use it instead of ==, and use its negation instead of !=).
As a rule of thumb, never test floating-point numbers for equality (i.e. ==); consider instead some small enough distance between them. That is, replace a test like x == y (respectively x != y) with something like fabs(x-y) < EPSILON (respectively fabs(x-y) > EPSILON), where EPSILON is a small positive number, hence testing for a small L1 distance (for equality; a large enough distance for inequality).
And avoid floating point in integer problems.
Actually, predicting or estimating floating-point accuracy is very difficult. You might want to consider tools like CADNA. My colleague Franck Védrine is an expert on static program analyzers that estimate numerical errors (see e.g. his TERATEC 2017 presentation on Fluctuat). It is a difficult research topic; see also D. Monniaux's paper The Pitfalls of Verifying Floating-Point Computations, etc.
And floating point errors did in some cases cost human lives (or loss of billions of dollars). Search the web for details. There are some cases where all the digits of a computed number are wrong (because the errors may accumulate, and the final result was obtained by combining thousands of operations)! There is some indirect relationship with chaos theory, because many programs might have some numerical instability.
As others have mentioned, comparing floating point values for equality is problematic. If you find a way to work directly with integers, you can avoid this problem. One way to do so is to raise integers to the k power instead of taking the kth root. The details are left as an exercise for the reader.
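In that spirit, one possible sketch (isPerfectKthPower is an illustrative name): use floating point only to get a candidate root, then verify it with exact integer multiplication; n and the intermediate products are assumed to fit in 64 bits:

#include <cmath>
#include <cstdint>

// Is n == r^k for some integer r? Round a floating-point estimate of the
// root, then check it (and its neighbours) exactly.
bool isPerfectKthPower(uint64_t n, unsigned k) {
    uint64_t r = (uint64_t)std::llround(std::pow((double)n, 1.0 / k));
    for (uint64_t c = (r > 0 ? r - 1 : 0); c <= r + 1; ++c) { // estimate may be off by one
        uint64_t p = 1;
        bool overflow = false;
        for (unsigned i = 0; i < k; ++i) {
            if (c != 0 && p > n / c) { overflow = true; break; } // p * c would exceed n
            p *= c;
        }
        if (!overflow && p == n) return true;
    }
    return false;
}

A driver can then test k = 2, 3, ... (up to about 63, since 2^64 already exceeds the 64-bit range) using exact arithmetic only; for instance, isPerfectKthPower(216, 3) && isPerfectKthPower(125, 3) holds.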

C++ algorithm to find 'maximal difference' in an array

I am asking for your ideas regarding this problem:
I have one array A, with N elements of type double (or alternatively integer). I would like to find an algorithm with complexity less than O(N^2) to find:
max A[i] - A[j]
for 1 < j <= i < n. Please notice that there is no abs(). I thought of:
dynamic programming
dichotomic method, divide and conquer
some treatment after a sort keeping track of indices
Would you have any comments or ideas? Could you point me at some good references for practicing and making progress on such algorithm questions?
Make three sweeps through the array. First, sweep from j = 2 up, filling an auxiliary array a with the minimal element so far. Then sweep from the top, i = n-1, down, filling (also from the top down) another auxiliary array b with the maximal element so far from the top. Now sweep both auxiliary arrays together, looking for the maximal difference b[i] - a[i].
That will be the answer. O(n) in total. You could say it's a dynamic programming algorithm.
edit: As an optimization, you can eliminate the third sweep and the second array, and find the answer in the second sweep by maintaining two loop variables, max-so-far-from-the-top and max-difference.
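A sketch of those sweeps in C++ (0-based indexing, assuming a non-empty array; since j == i is allowed by the constraint, the result is never negative):

#include <algorithm>
#include <vector>

// a[i] = min of A[0..i], b[i] = max of A[i..n-1]; the answer is the
// maximal b[i] - a[i], i.e. the best maximum to the right of a minimum.
double maxDifference(const std::vector<double>& A) {
    const std::size_t n = A.size();
    std::vector<double> a(n), b(n);
    a[0] = A[0];
    for (std::size_t i = 1; i < n; ++i)    // first sweep: minimal element so far
        a[i] = std::min(a[i - 1], A[i]);
    b[n - 1] = A[n - 1];
    for (std::size_t i = n - 1; i-- > 0; ) // second sweep: maximal element from the top
        b[i] = std::max(b[i + 1], A[i]);
    double best = b[0] - a[0];
    for (std::size_t i = 1; i < n; ++i)    // third sweep: maximal difference
        best = std::max(best, b[i] - a[i]);
    return best;
}

The edit's optimization folds the second and third sweeps into one downward pass by keeping max-so-far-from-the-top and max-difference in two scalars; the Java answer at the bottom does the equivalent in a single upward pass by tracking the minimum so far.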
As for "pointers" about how to solve such problems in general, you usually try some general methods just like you wrote - divide and conquer, memoization/dynamic programming, etc. First of all look closely at your problem and concepts involved. Here, it's maximum/minimum. Take these concepts apart and see how these parts combine in the context of the problem, possibly changing order in which they're calculated. Another one is looking for hidden order/symmetries in your problem.
Specifically, fixing an arbitrary inner point k along the list, this problem reduces to finding the difference between the minimal element among all j with 1 < j <= k and the maximal element among all i with k <= i < n. You see divide-and-conquer here, as well as taking apart the concepts of max/min (i.e. their progressive calculation) and the interaction between the parts. The hidden order is revealed (k goes along the array), and memoization helps save the interim results for max/min values.
The fixing of arbitrary point k could be seen as solving a smaller sub-problem first ("for a given k..."), and seeing whether there is anything special about it and it can be abolished - generalized - abstracted over.
There is a technique of trying to formulate and solve a bigger problem first, such that the original problem is a part of this bigger one. Here, we think of finding all the differences for each k, and then finding the maximal one among them.
The double use of interim results (used both in the comparison for a specific point k and in calculating the next interim result, each in its direction) usually means considerable savings. So,
divide-and-conquer
memoization / dynamic programming
hidden order / symmetries
taking concepts apart - seeing how the parts combine
double use - find parts with double use and memoize them
solving a bigger problem
trying arbitrary sub-problem and abstracting over it
This should be possible in a single iteration. max(a[i] - a[j]) for 1 < j <= i should be the same as max[i=2..n](a[i] - min[j=2..i](a[j])), right? So you'd have to keep track of the smallest a[j] while iterating over the array, looking for the largest a[i] - min(a[j]). That way you only have one iteration and j will be less than or equal to i.
You just need to go over the array, find the max and min, then get the difference, so the worst case is linear time. If the array is sorted, you can find the diff in constant time - or do I miss something?
Java implementation runs in linear time
public class MaxDiference {
    public static void main(String[] args) {
        System.out.println(betweenTwoElements(2, 3, 10, 6, 4, 8, 1));
    }

    private static int betweenTwoElements(int... nums) {
        int maxDifference = nums[1] - nums[0];
        int minElement = nums[0];
        for (int i = 1; i < nums.length; i++) {
            if (nums[i] - minElement > maxDifference) {
                maxDifference = nums[i] - minElement;
            }
            if (nums[i] < minElement) {
                minElement = nums[i];
            }
        }
        return maxDifference;
    }
}