edit distance solution with O(n) space issue - c++

I found a few different solutions while debugging, and I am especially interested in the solution below, which needs only O(n) space instead of storing a full matrix (M*N). However, I am confused about the logical meaning of cur[i]. Any comments would be highly appreciated.
I have posted the problem statement and the code below.
Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (Each operation counts as 1 step.)
You have the following 3 operations permitted on a word:
a) Insert a character
b) Delete a character
c) Replace a character
class Solution {
public:
    int minDistance(string word1, string word2) {
        int m = word1.length(), n = word2.length();
        vector<int> cur(m + 1, 0);
        for (int i = 1; i <= m; i++)
            cur[i] = i;
        for (int j = 1; j <= n; j++) {
            int pre = cur[0];
            cur[0] = j;
            for (int i = 1; i <= m; i++) {
                int temp = cur[i];
                if (word1[i - 1] == word2[j - 1])
                    cur[i] = pre;
                else cur[i] = min(pre + 1, min(cur[i] + 1, cur[i - 1] + 1));
                pre = temp;
            }
        }
        return cur[m];
    }
};

You can think of cur as a mix of the previous line and the current line in the edit distance matrix. For example, think of a 3x3 matrix in the original algorithm. I'll number each position like below:
1 2 3
4 5 6
7 8 9
In the loop, if you are computing the position 6, you only need the values from 2, 3 and 5. In that case, cur will be exactly the values from:
4 5 3
See the 3 at the end? That's because we haven't updated it yet, so it still holds a value from the first line. From the previous iteration, we have pre = 2, because it was saved before we computed the value at 5.
Then, the new value for the last cell is computed from pre = 2, cur[i-1] = 5 and cur[i] = 3, exactly the positions mentioned before: it is pre itself if the characters match, otherwise one plus the minimum of the three.
EDIT: completing the analogy, if in the O(n^2) version you compute min(M[i-1][j-1], M[i][j-1], M[i-1][j]), in this O(n) version you'll compute min(pre, cur[i-1], cur[i]), respectively.
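To make the correspondence concrete, here is a minimal sketch that puts a full-matrix version next to the rolling-array version. The function names editDistanceMatrix and editDistanceRolling are made up for this illustration, and the sketch indexes M[i][j] by prefix length i of the first word and prefix length j of the second, which is one possible orientation rather than the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Full-matrix version: M[i][j] is the edit distance between the first i
// characters of a and the first j characters of b.
int editDistanceMatrix(const std::string& a, const std::string& b) {
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    std::vector<std::vector<int>> M(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 0; i <= m; ++i) M[i][0] = i;
    for (int j = 0; j <= n; ++j) M[0][j] = j;
    for (int j = 1; j <= n; ++j) {
        for (int i = 1; i <= m; ++i) {
            M[i][j] = (a[i - 1] == b[j - 1])
                          ? M[i - 1][j - 1]
                          : 1 + std::min({M[i - 1][j - 1], M[i][j - 1], M[i - 1][j]});
        }
    }
    return M[m][n];
}

// Rolling version: cur holds column j for indices below i and column j-1
// for indices from i upwards; pre keeps the diagonal value M[i-1][j-1].
int editDistanceRolling(const std::string& a, const std::string& b) {
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    std::vector<int> cur(m + 1);
    for (int i = 0; i <= m; ++i) cur[i] = i;   // column j = 0
    for (int j = 1; j <= n; ++j) {
        int pre = cur[0];                      // M[0][j-1]
        cur[0] = j;                            // M[0][j]
        for (int i = 1; i <= m; ++i) {
            const int temp = cur[i];           // M[i][j-1], the next diagonal
            cur[i] = (a[i - 1] == b[j - 1])
                         ? pre
                         : 1 + std::min({pre, cur[i], cur[i - 1]});
            pre = temp;
        }
    }
    return cur[m];
}
```

Both functions should return the same value for any pair of inputs; the second simply overwrites each cell once it is no longer needed.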


Tell me the Input in which this code will give incorrect Output

I have to solve a problem in C++. I've written the code and it passes the given test cases, but when I submit it, the judge reports a wrong answer, and I can't see why.
Please give me an input for which the code below produces incorrect output, so I can fix my code.
Shrink The Array
You are given an array of positive integers A[] of length L. If A[i] and A[i+1] are equal, replace them with a single element of value A[i]+1. Find the minimum possible length of the array after performing this operation any number of times.
Note:
After each such operation, the length of the array decreases by one and the elements are renumbered accordingly.
Input format:
The first line contains a single integer L, denoting the initial length of the array A.
The second line contains L space-separated integers A[i] − the elements of array A[].
Output format:
Print an integer - the minimum possible length you can get after performing the operation described above any number of times.
Example:
Input
7
3 3 4 4 4 3 3
Output
2
Sample test case explanation
3 3 4 4 4 3 3 -> 4 4 4 4 3 3 -> 4 4 4 4 4 -> 5 4 4 4 -> 5 5 4 -> 6 4.
Thus the length of the array is 2.
My code:
#include <bits/stdc++.h>
using namespace std;
int main()
{
    bool end = false;
    int l;
    cin >> l;
    int arr[l];
    for(int i = 0; i < l; i++){
        cin >> arr[i];
    }
    int len = l, i = 0;
    while(i < len - 1){
        if(arr[i] == arr[i + 1]){
            arr[i] = arr[i] + 1;
            if((i + 1) <= (len - 1)){
                for(int j = i + 1; j < len - 1; j++){
                    arr[j] = arr[j + 1];
                }
            }
            len--;
            i = 0;
        }
        else{
            i++;
        }
    }
    cout << len;
    return 0;
}
THANK YOU
As noted in the comments: simply picking the first two neighbours that have the same value and combining them can lead to suboptimal results.
You need to investigate which two neighbours to combine. After combining two neighbours, you then need to investigate which neighbours to combine at the next level. The number of combinations can become large.
One way to solve this is through recursion.
If you've followed the advice in the comments, you now have all your input data in std::vector<unsigned> A(L).
You can now do std::cout << solve(A) << '\n'; where solve has the signature size_t solve(const std::vector<unsigned>& A) and is described below:
1. Find the indices of all neighbour pairs in A that have the same value and put the indices in a std::vector<size_t> neighbours. Example: if A contains 2 2 2 3, put 0 and 1 in neighbours.
2. If no neighbours are found (neighbours.empty() == true), return A.size().
3. Define a minimum variable and initialize it with A.size() - 1, which is the worst result you know you can get at this point. So, size_t minimum = A.size() - 1;
4. Loop over all indices stored in neighbours (for(size_t idx : neighbours)):
   - Copy A into a new std::vector<unsigned>. Let's call it cpy.
   - Increase cpy[idx] by one and remove cpy[idx+1].
   - Call size_t result = solve(cpy). This is where recursion comes in.
   - Is result less than minimum? If so, assign result to minimum.
5. Return minimum.
I don't think I ruined the programming exercise by providing one algorithm for solving it. There should still be plenty left to deal with: for example, plain recursion won't be feasible for large inputs.
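The steps described above can be sketched roughly as follows. This is only an illustration of the exhaustive recursion with the signature given in the answer; it takes exponential time and is meant for small inputs, not as a submission-ready solution.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tries every equal neighbour pair at every level and returns the
// smallest array length that can be reached.
std::size_t solve(const std::vector<unsigned>& A) {
    // Collect the left index of each pair of equal neighbours.
    std::vector<std::size_t> neighbours;
    for (std::size_t i = 0; i + 1 < A.size(); ++i) {
        if (A[i] == A[i + 1]) neighbours.push_back(i);
    }
    // Nothing can be merged any more.
    if (neighbours.empty()) return A.size();

    // One merge is always possible here, so size - 1 is the worst case.
    std::size_t minimum = A.size() - 1;
    for (std::size_t idx : neighbours) {
        std::vector<unsigned> cpy = A;   // work on a copy
        ++cpy[idx];                      // merged value
        cpy.erase(cpy.begin() + static_cast<std::ptrdiff_t>(idx) + 1);
        const std::size_t result = solve(cpy);
        if (result < minimum) minimum = result;
    }
    return minimum;
}
```

For the input 2 2 2 3 this returns 2 (merge the second and third element to get 2 3 3, then 2 4), whereas merging the first equal pair first leaves 3 2 3, which cannot be shrunk further.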

Array-Sum Operation

I have written this code using a vector. Some test cases pass, but others are terminated with a timeout error.
The problem statement is:
You have an identity permutation of N integers as an array initially. An identity permutation of N integers is [1,2,3,...N-1,N]. In this task, you have to perform M operations on the array and report the sum of the elements of the array after each operation.
The ith operation consists of an integer opi.
If the array contains opi, swap the first and last elements in the array.
Else, remove the last element of the array and push opi to the end of the array.
Input Format
The first line contains two space-separated integers N and M.
Then, M lines follow denoting the operations opi.
Constraints :
2<=N,M <= 10^5
1 <= op <= 5*10^5
Output Format
Print M lines, each containing a single integer denoting the answer to each of the M operations.
Sample Input 0
3 2
4
2
Sample Output 0
7
7
Explanation 0
Initially, the array is [1,2,3].
After the 1st operation, the array becomes [1,2,4]: opi = 4 is not present in the current array, so we remove 3 and push 4 to the end of the array; hence, sum = 7.
After the 2nd operation, the array becomes [4,2,1]: opi = 2 is present in the current array, so we swap 1 and 4; hence, sum = 7.
Here is my code:
#include <bits/stdc++.h>
using namespace std;
int main()
{
    long int N,M,op,i,t=0;
    vector<long int > g1;
    cin>>N>>M;
    if(N>=2 && M>=2) {
        g1.reserve(N);
        for(i = 1;i<=N;i++) {
            g1.push_back(i);
        }
        while(M--) {
            cin>>op;
            auto it = find(g1.begin(), g1.end(), op);
            if(it != (g1.end())) {
                t = g1.front();
                g1.front() = g1.back();
                g1.back() = t;
                cout<<accumulate(g1.begin(), g1.end(), 0);
                cout<<endl;
            }
            else {
                g1.back() = op;
                cout<<accumulate(g1.begin(), g1.end(), 0);
                cout<<endl;
            }
        }
    }
    return 0;
}
Please Suggest changes.
Looking carefully at the question, you will find that the operations only ever touch the first and last elements. So there is no need to involve the whole vector, much less to recompute the sum every time. The sum of all elements except the first and last is (n+1)(n-2)/2, and after that we only need to track the first and last elements. The search can also be shortened to a constant-time test: op is in the array if 1 < op < n, or op == first element, or op == last element.
P.S. I am not sure it will work completely, but it is certainly faster.
My guess: let's take N = 3, op = [4, 2].
The array is [1,2,3].
sum = ((N-2) * (N+1)) / 2 leaves out the first and last elements and gives the sum of the numbers between them.
We only need to play with the first and last elements. Each operation then takes O(1), so the whole run is linear in the number of operations.
function performOperations(N, op) {
    let out = [];
    let first = 1, last = N;
    let sum = Math.ceil( ((N-2) * (N+1)) / 2);
    for(let i =0;i<op.length;i++){
        let not_between = !(op[i] >= 2 && op[i] <= N-1);
        if( first!= op[i] && last != op[i] && not_between) {
            last = op[i];
        }else {
            let t = first;
            first = last;
            last = t;
        }
        out.push(sum + first +last)
    }
    return out;
}
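Since the question is in C++, the same idea might look like the sketch below. The function name performOperations mirrors the JavaScript above; taking the operations as a vector and using long long throughout are choices made for this illustration (the sums can exceed the range of a 32-bit int for the stated constraints).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns the array sum after each operation, tracking only the first and
// last elements; the elements 2..n-1 in the middle never change.
std::vector<long long> performOperations(long long n, const std::vector<long long>& ops) {
    long long first = 1;
    long long last = n;
    const long long middle = (n - 2) * (n + 1) / 2;  // 2 + 3 + ... + (n-1)

    std::vector<long long> sums;
    sums.reserve(ops.size());
    for (long long op : ops) {
        const bool inMiddle = op >= 2 && op <= n - 1;
        if (inMiddle || op == first || op == last) {
            std::swap(first, last);   // op is present: swap the two ends
        } else {
            last = op;                // op is absent: replace the last element
        }
        sums.push_back(middle + first + last);
    }
    return sums;
}
```

For the sample input (N = 3, operations 4 and 2) this yields 7 and 7. Reading the input and printing each sum with '\n' instead of endl would keep the I/O cost low as well.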

Construct mirror vector around the centre element in c++

I have a for-loop that is constructing a vector with 101 elements, using (let's call it equation 1) for the first half of the vector, with the centre element using equation 2, and the latter half being a mirror of the first half.
Like so,
double fc = 0.25;
const double PI = 3.1415926;
// initialise vectors
int M = 50;
int N = 101;
std::vector<double> fltr;
fltr.resize(N);
std::vector<int> mArr;
mArr.resize(N);
// Creating vector mArr of 101 elements, going from -50 to +50
int count;
for(count = 0; count < N; count++)
mArr[count] = count - M;
// using these elements, enter in to equations to form vector 'fltr'
int n;
for(n = 0; n < M+1; n++)
// for elements 0 to 50 --> use equation 1
fltr[n] = (sin((fc*mArr[n])-M))/((mArr[n]-M)*PI);
// for element 51 --> use equation 2
fltr[M] = fc/PI;
This part of the code works fine and does what I expect, but for elements 52 to 101, I would like to mirror the first half around element 51 (the value computed with equation 2).
For a basic example;
1 2 3 4 5 6 0.2 6 5 4 3 2 1
This is what I have so far, but it just outputs 0's as the elements:
for(n = N; n > M; n--){
for(i = 0; n < M+1; i++)
fltr[n] = fltr[i];
}
I feel like there is an easier way to mirror part of a vector but I'm not sure how.
I would expect the values to plot like this: (plot not included)
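After you have inserted the middle element, you can take reverse iterators over the vector, skip the middle element, and copy that range back into the vector through std::back_inserter. The vector is named vec in the example. Reserve the final capacity first; otherwise a reallocation during the copy would invalidate the iterators.
vec.reserve(2 * vec.size() - 1);
auto rbeg = vec.rbegin(), rend = vec.rend();
++rbeg;
copy(rbeg, rend, back_inserter(vec));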
Let's look at your code:
for(n = N; n > M; n--)
for(i = 0; n < M+1; i++)
fltr[n] = fltr[i];
And let's make things shorter: N = 5, M = 3,
array is 1 2 3 0 0 and should become 1 2 3 2 1
We start your first outer loop with n = 3, pointing us to the first zero. Then, in the inner loop, we set i to 0 and call fltr[3] = fltr[0], leaving us with the array as
1 2 3 1 0
We could now continue, but it should be obvious that this first assignment was useless.
With this I want to show you a simple way to walk through your code and see what it actually does. You clearly had something different in mind. What should be clear is that we need to assign every element of the second half exactly once.
What your code does (reading the inner condition as i < M+1) is change the value of fltr[n] repeatedly for each value of n, ending with setting it to fltr[M] in any case, regardless of what value n has. The result would be that all values in the second half of the array are the same as the center; in my example it ends with
1 2 3 3 3
As posted, though, the inner condition is n < M+1, which is never true while n > M, so the inner loop body never runs and the second half keeps its initial zeros.
Note that there is also a direct error: starting with n = N and then accessing fltr[n]. N is out of bounds for an array of size N.
To give you a very simple working solution:
for(int i=0; i<M; i++)
{
    fltr[N-i-1] = fltr[i];
}
N-i-1 is the mirrored address of i (i = 0 -> N-i-1 = 101-0-1 = 100, last valid address in an array with 101 entries).
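As a self-contained illustration of the mirrored index, here is a small sketch. The helper name mirrorFirstHalf is invented for this example, and it works on any vector of doubles rather than on fltr specifically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies the first half of v onto the second half in reverse order.
// For an odd size, the centre element at index v.size() / 2 is untouched.
void mirrorFirstHalf(std::vector<double>& v) {
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n / 2; ++i) {
        v[n - i - 1] = v[i];   // n - i - 1 is the mirrored index of i
    }
}
```

Applied to a vector holding 1 2 3 0.2 0 0 0, this produces 1 2 3 0.2 3 2 1, with each element of the second half written exactly once.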
Now, I saw several people answering with more elaborate code, but I thought that as a beginner, it might be beneficial for you to do this in a very simple manner.
Other than that, as @Pzc already said in the comments, you could do this assignment in the loop where the data is generated.
Another thing, with your code
for(n = 0; n < M+1; n++)
// for elements 0 to 50 --> use equation 1
fltr[n] = (sin((fc*mArr[n])-M))/((mArr[n]-M)*PI);
// for element 51 --> use equation 2
fltr[M] = fc/PI;
I have two issues:
First, the indentation makes it look like fltr[M]=.. is inside the loop. Don't do that, even if it was only a mistake made while writing the question and the real code looks different. This will lead to errors in the future. Indentation is important. Using the auto-indentation of your IDE is an easy way to get it right. And try to use braces, even if the body is only one statement.
Second, n < M+1 as a condition includes the center. The center is located at address 50, and 50 < 50+1. You haven't seen any problem because you overwrite it after the loop, but in a different situation this can easily produce errors.
There are other small things I'd change, and I recommend that, when your code works, you post it on CodeReview.
Let's use std::iota, std::transform, and std::copy instead of raw loops:
const double fc = 0.25;
constexpr double PI = 3.1415926;
const std::size_t M = 50;
const std::size_t N = 2 * M + 1;
std::vector<double> mArr(M);
std::iota(mArr.rbegin(), mArr.rend(), 1.); // = [M, M - 1, ..., 1]
const auto fn = [=](double m) { return std::sin((fc * m) + M) / ((m + M) * PI); };
std::vector<double> fltr(N);
std::transform(mArr.begin(), mArr.end(), fltr.begin(), fn);
fltr[M] = fc / PI;
std::copy(fltr.begin(), fltr.begin() + M, fltr.rbegin());

Divide array into smaller consecutive parts such that NEO value is maximal

In this year's Bubble Cup (now finished) there was the problem NEO (which I couldn't solve), which asks:
Given an array of n integers, we divide it into several parts (possibly just 1), where each part is a run of consecutive elements. The NEO value of such a division is the sum of the values of its parts, and the value of a part is the sum of its elements multiplied by its length.
Example: We have array: [ 2 3 -2 1 ]. If we divide it like: [2 3] [-2 1]. Then NEO = (2 + 3) * 2 + (-2 + 1) * 2 = 10 - 2 = 8.
The number of elements in the array is smaller than 10^5, and the numbers are integers between -10^6 and 10^6.
I've tried a divide-and-conquer approach that keeps splitting the array into two parts as long as that increases the NEO value, and otherwise returns the NEO of the whole array. Unfortunately, the algorithm has worst-case O(N^2) complexity (my implementation is below), so I'm wondering whether there is a better solution.
EDIT: My (greedy) algorithm doesn't work. Taking [1,2,-6,2,1] as an example, my algorithm returns the whole array, while the maximal NEO value is obtained by taking the parts [1,2],[-6],[2,1], which gives (1+2)*2+(-6)+(1+2)*2=6.
#include <iostream>
int maxInterval(long long int suma[],int first,int N)
{
    long long int max = -1000000000000000000LL;
    long long int curr;
    if(first==N) return 0;
    int k;
    for(int i=first;i<N;i++)
    {
        if(first>0) curr = (suma[i]-suma[first-1])*(i-first+1)+(suma[N-1]-suma[i])*(N-1-i); // Split the array into elements from [first..i] and [i+1..N-1] store the corresponding NEO value
        else curr = suma[i]*(i-first+1)+(suma[N-1]-suma[i])*(N-1-i); // Same except that here first = 0 so suma[first-1] doesn't exist
        if(curr > max) max = curr,k=i; // find the maximal NEO value for splitting into two parts
    }
    if(k==N-1) return max; // If the max is when we take the whole array then return the NEO value of the whole array
    else
    {
        return maxInterval(suma,first,k+1)+maxInterval(suma,k+1,N); // Split the 2 parts further if needed and return their sum
    }
}
int main() {
    int T;
    std::cin >> T;
    for(int j=0;j<T;j++) // Iterate over all the test cases
    {
        int N;
        long long int NEO[100010]; // Values, could be long int but just to be safe
        long long int suma[100010]; // sum[i] = sum of NEO values from NEO[0] to NEO[i]
        long long int sum=0;
        int k;
        std::cin >> N;
        for(int i=0;i<N;i++)
        {
            std::cin >> NEO[i];
            sum+=NEO[i];
            suma[i] = sum;
        }
        std::cout << maxInterval(suma,0,N) << std::endl;
    }
    return 0;
}
This is not a complete solution but should provide some helpful direction.
1. Combining two groups that each have a positive sum (or where one of the sums is non-negative) always yields a bigger NEO than leaving them separate:
m * a + n * b < (m + n) * (a + b) where a, b > 0 (or a > 0, b >= 0); m and n are subarray lengths
2. Combining a group with a negative sum with an entire group of non-negative numbers always yields a greater NEO than combining it with only part of the non-negative group. But excluding the group with the negative sum could yield an even greater NEO:
[1, 1, 1, 1] [-2] => m * a + 1 * (-b)
Now, imagine we gradually move the dividing line to the left, increasing the sum b is combined with. While the expression on the right is negative, the NEO for the left group keeps decreasing. But if the expression on the right becomes positive, then by our first assertion (see 1.), combining the two groups is always better than not.
3. Combining negative numbers alone in sequence always yields a smaller NEO than leaving them separate:
-a - b - c ... = -1 * (a + b + c ...)
l * (-a - b - c ...) = -l * (a + b + c ...)
-l * (a + b + c ...) < -1 * (a + b + c ...) where l > 1; a, b, c ... > 0
O(n^2) time, O(n) space JavaScript code:
function f(A){
    A.unshift(0);
    let negatives = [];
    let prefixes = new Array(A.length).fill(0);
    let m = new Array(A.length).fill(0);
    for (let i=1; i<A.length; i++){
        if (A[i] < 0)
            negatives.push(i);
        prefixes[i] = A[i] + prefixes[i - 1];
        m[i] = i * (A[i] + prefixes[i - 1]);
        for (let j=negatives.length-1; j>=0; j--){
            let negative = prefixes[negatives[j]] - prefixes[negatives[j] - 1];
            let prefix = (i - negatives[j]) * (prefixes[i] - prefixes[negatives[j]]);
            m[i] = Math.max(m[i], prefix + negative + m[negatives[j] - 1]);
        }
    }
    return m[m.length - 1];
}
console.log(f([1, 2, -5, 2, 1, 3, -4, 1, 2]));
console.log(f([1, 2, -4, 1]));
console.log(f([2, 3, -2, 1]));
console.log(f([-2, -3, -2, -1]));
Update
This blog shows that we can transform the dp recurrence from
dp_i = sum_i*i + max(for j < i) of ((dp_j + sum_j*j) + (-j*sum_i) + (-i*sum_j))
to
dp_i = sum_i*i + max(for j < i) of (dp_j + sum_j*j, -j, -sum_j) ⋅ (1, sum_i, i)
which means that at each iteration we could look for an already seen vector that generates the largest dot product with our current information. The math alluded to involves the convex hull and farthest-point queries, which are beyond my reach to implement at this point, but I will make a study of them.
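Before attempting the convex hull optimisation, it can help to have the untransformed recurrence as a plain O(n^2) reference to check faster versions against. The sketch below is such a reference; the name maxNeo and the 1-based prefix-sum layout (sum[0] = 0, dp[0] = 0) are choices made for this illustration, and it is too slow for n near 10^5.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// dp[i] = best NEO value for the first i elements, using
// dp[i] = max over j < i of dp[j] + (i - j) * (sum[i] - sum[j]),
// where the last part covers elements j+1..i (1-based).
long long maxNeo(const std::vector<long long>& a) {
    const std::size_t n = a.size();
    std::vector<long long> sum(n + 1, 0), dp(n + 1, 0);
    for (std::size_t i = 1; i <= n; ++i) sum[i] = sum[i - 1] + a[i - 1];

    for (std::size_t i = 1; i <= n; ++i) {
        // j = 0: the first i elements form a single part.
        long long best = static_cast<long long>(i) * sum[i];
        for (std::size_t j = 1; j < i; ++j) {
            const long long len = static_cast<long long>(i - j);
            best = std::max(best, dp[j] + len * (sum[i] - sum[j]));
        }
        dp[i] = best;
    }
    return dp[n];
}
```

Expanding (i - j) * (sum[i] - sum[j]) gives the four terms that appear in the recurrence quoted above, so a faster implementation should agree with this function on small random arrays.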

Number of Rs in a string

I have an assignment where I'm given a string S containing the letters 'R' and 'K', for example "RRRRKKKKRK".
I need to obtain the maximum number of 'R's that string could possibly hold by flipping characters i through j to their opposite. So:
for(int x = i; x < j; x++)
{
    if (S[x] == 'R')
    {
        S[x] = 'K';
    }
    else
    {
        S[x] = 'R';
    }
}
However, I can only make the above call once.
So for the above example: "RRRRKKKKRK".
You would have i = 4 and j = 8, which would result in "RRRRRRRRRK", and you would then output the number of R's in the resulting string: 9.
My code partially works, but it fails in some cases. Can anyone figure out what is missing?
Sample Input
2
RKKRK
RKKR
Sample Output
4
4
My Solution
My solution which works only for the first case, I don't know what I'm missing to complete the algorithm:
int max_R = INT_MIN;
for (int i = 0; i < s.size(); i++)
{
    for (int j = i + 1; j < s.size(); j++)
    {
        int cnt = 0;
        string t = s;
        if (t[j] == 'R')
        {
            t[j] = 'K';
        }
        else
        {
            t[j] = 'R';
        }
        for (int b = 0; b < s.size(); b++)
        {
            if (t[b] == 'R')
            {
                cnt++;
                if (cnt > max_R)
                {
                    max_R = cnt;
                }
            }
        }
    }
}
cout << max_R << endl;
How about turning this into the maximum subarray problem, which has an O(n) solution?
1. Run through the string once, giving 'K' a value of 1 and 'R' a value of -1.
E.g. for 'RKRRKKKKRKK' you produce the array [-1, 1, -1, -1, 1, 1, 1, 1, -1, 1, 1] -> [-1, 1, -2, 4, -1, 2] (I grouped consecutive -1s and 1s to make it clearer).
2. Apply Kadane's algorithm to the generated array. What you get is the maximum net gain in 'R's achievable by one flip: the 'K's turned into 'R's minus the 'R's turned into 'K's inside the flipped range.
Continuing with the example, you find that the maximum subarray is [4, -1, 2] with a sum of 5.
3. Now add the number of 'R's in the original string to the sum of your maximum subarray to obtain your answer.
In our case, the original string contains 4 'R's, so we get 4 + 5 = 9 (the flipped string is 'RKRRRRRRKRR').
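A compact C++ sketch of this approach is shown below. It computes the number of 'R's in the original string plus the best net gain found by Kadane's algorithm. The function name maxRsAfterOneFlip is invented here, and the sketch assumes that exactly one non-empty range must be flipped, which the problem statement does not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Largest number of 'R's reachable by flipping exactly one non-empty
// contiguous range of s (assumption: the flip is mandatory).
int maxRsAfterOneFlip(const std::string& s) {
    if (s.empty()) return 0;

    int originalRs = 0;
    int current = 0;    // best gain of a range ending at the current position
    int bestGain = -1;  // a non-empty range never gains less than -1
    for (char c : s) {
        const int value = (c == 'K') ? 1 : -1;  // flipping K gains an R, flipping R loses one
        if (c == 'R') ++originalRs;
        current = std::max(value, current + value);
        bestGain = std::max(bestGain, current);
    }
    return originalRs + bestGain;
}
```

If skipping the flip were allowed, clamping the gain at zero (starting bestGain at 0) would cover strings that already consist only of 'R's.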
Try to think carefully about your solution. Do you understand what it does?
First, let’s forget that the input file may contain multiple tests, so let’s get rid of the while loop. Now, we have just two for loops. The second one obviously just counts R’s in the processed string. But what does the first one do?
The answer is that the first loop flips all the letters from the second one (i.e. which has index 1) till the end of the string. We can see that in the first testcase:
RKKRK
it is indeed the optimal solution. The string turns into RRRKR and we get four R’s. But in the second case:
RKKR
the string turns into RRRK and we get three R’s. While if we flipped just the letters from 2 to 3 (i.e. indices 1 to 2) we could get RRRR which has four R’s.
So your algorithm always flips letters from index 1 to the end, but this is not always optimal. What can we do? How do we know which letters to flip? Well, there are some smart solutions, but the easiest is to just try all possible combinations!
You can flip all the letters from 0 to 1, count the number of R’s, remember it. Get back to the original string, flip letters from 0 to 2, count R’s, remember it and so on till you flip from 0 to n-1. Then you flip letters from 1 to 2, from 1 to 3, etc. And the answer is the largest value you remembered.
This is horribly inefficient, but it works. After you get more practice in solving algorithmic problems, come back to this task and try to figure out more efficient solutions. (Hint: if you consider building the optimal answer incrementally, that is, by going through the string char by char and transforming the optimal solution for the substring s[0..i] into the optimal solution for s[0..i+1], you can arrive at a pretty straightforward O(n^2) algorithm. This can be improved to O(n), but that step is slightly more involved.)
Here is the sketch of this solution:
def solve(s):
    n = len(s)
    answer = 0
    for i in range(n):
        for j in range(i, n):
            # flip letters from i to j (inclusive) in a copy; the original string is needed again later
            flipped = "".join("K" if c == "R" else "R" for c in s[i:j + 1])
            t = s[:i] + flipped + s[j + 1:]
            answer = max(answer, t.count("R"))  # count R's in t
    return answer