Converting C++ function to recursive function - c++

I'm looking over a function and need to convert it to a dynamic programming form, but I'm having difficulty understanding the logic used in it (what would be the base case?). The original author is no longer available for questions, I can't make heads or tails of his work, and there is no documentation available.
Description:
This function takes in a matrix of positive integers and finds the maximum sum by
selecting one element from every column in the matrix, moving left-to-right. As you move through
the matrix column-by-column, there is a penalty to your sum depending on how you
move relative to your previous two positions. If the next row you select is between the previous two
selected rows, there is no penalty; however, there is a penalty of 2 to your sum for every row above
the maximum of the previous two or below the minimum of the previous two.
int calSum(int row, int cols, vector<vector<int>> inputArray, vector<int> *outputArray){
int ans[row][cols][row];
int index[row][cols][row];
int firstCol[row];
for(int i=0;i<row;i++){
firstCol[i]= inputArray[i][0] - 2*(i);
}
for(int i=0;i<row;i++){
for(int j=0;j<row;j++){
int penalty;
if(i<=j){
penalty=0;
}else{
penalty= 2* (i-j);
}
ans[i][1][j]= inputArray[i][1] - penalty+ firstCol[j];
}
}
for(int j=2;j<cols;j++){
for(int i=0;i<row;i++){
int nextRow= i;
for(int k=0;k<row;k++){
int currRow= k;
int ind=-1;
int maxVal= INT_MIN;
for(int l=0;l<row;l++){
int prevRow=l;
int max1= max(prevRow, currRow);
int min1= min(prevRow, currRow);
int penalty;
if(nextRow<=max1&&nextRow>= min1){
penalty=0;
}else if(nextRow>max1){
penalty= 2*(nextRow-max1);
}else{
penalty= 2*(min1-nextRow);
}
int val= -penalty+ inputArray[i][j] + ans[k][j-1][l];
if(val>maxVal){
maxVal=val;
ind=l;
}
}
ans[i][j][k]=maxVal;
index[i][j][k]=ind;
}
}
}
int max=INT_MIN;
int x=-1;
int y=-1;
for(int i=0;i<row;i++){
for(int j=0;j<row;j++){
if(ans[i][cols-1][j]>max){
max= ans[i][cols-1][j];
x=i;
y=j; // row chosen in the previous column
}
}
}
for(int j=cols-1;j>=2;j--) {
outputArray->push_back(x);
int temp=x;
x= y;
y= index[temp][j][y];
}
outputArray->push_back(x);
outputArray->push_back(y);
return max;
}
I have tried tracing the code and keep getting lost in the logic. A basic explanation of what this function is doing would be greatly appreciated.

The core data structure ans works as follows: ans[i][j][k] is the best achievable sum of a path that ends at row i in column j and passed through row k in column j-1. (Note this uses row, col notation to match the notation in the program.) Both of the last two rows have to be part of the state because the penalty for the next step depends on both of them.
If we walk the code for-loop by for-loop:
The first for-loop calculates the score of values in the first column: firstCol[i] is the cell value minus a penalty of 2*i, so every row > 0 is penalized.
The second for-loop calculates ans[i][1][j], the best sum up to the second column for a path with starting row j and ending row i. For a recursive formulation, this column is the base case.
The third for-loop gradually expands ans to the right. For every column j > 1, it fills in ans[i][j][k] by finding the row l in column j-2 that maximizes the path through (l, j-2), (k, j-1) and (i, j). The part up to (k, j-1) can be read from ans[k][j-1][l]; the last step is calculated according to the rules given in the problem.
This loop also writes the optimal choice of l to the index array, so you can reconstruct the optimal path later.
The fourth for-loop simply finds the maximal path value in the last column and remembers where that path ends (x and y).
The final for-loop reconstructs the path by retracing steps through the index array. Because it walks from the last column back to the first, outputArray holds the rows in reverse column order.
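Since the question asks what the base case would be, here is a minimal top-down sketch of the same recurrence, assuming the state described above (row in the current column plus row in the previous column). The names Grid, stepPenalty, best and maxPathSum are made up for this illustration and do not appear in the original code; path reconstruction is left out to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Penalty for stepping to row `next` when the two previous rows were `a` and `b`.
int stepPenalty(int next, int a, int b) {
    int hi = std::max(a, b), lo = std::min(a, b);
    if (next > hi) return 2 * (next - hi);
    if (next < lo) return 2 * (lo - next);
    return 0;
}

// best(i, j, k): best sum of a path that ends at row i in column j
// and was at row k in column j - 1. INT_MIN in memo means "not computed yet".
int best(const Grid& g, std::vector<int>& memo, int i, int j, int k) {
    const int rows = static_cast<int>(g.size());
    int& slot = memo[(static_cast<std::size_t>(j) * rows + i) * rows + k];
    if (slot != INT_MIN) return slot;
    if (j == 1) {
        // Base case, mirroring the first two loops of the original:
        // column 0 costs 2 * k, and a step to a larger row index costs 2 * (i - k).
        int penalty = (i <= k) ? 0 : 2 * (i - k);
        return slot = g[i][1] - penalty + g[k][0] - 2 * k;
    }
    int result = INT_MIN;
    for (int l = 0; l < rows; ++l)  // l = row chosen in column j - 2
        result = std::max(result, best(g, memo, k, j - 1, l) - stepPenalty(i, k, l));
    return slot = result + g[i][j];
}

// Maximum over all (last row, previous row) pairs in the last column.
int maxPathSum(const Grid& g) {
    const int rows = static_cast<int>(g.size());
    assert(rows > 0 && g[0].size() >= 2);  // like the original, needs two columns
    const int cols = static_cast<int>(g[0].size());
    std::vector<int> memo(static_cast<std::size_t>(rows) * rows * cols, INT_MIN);
    int answer = INT_MIN;
    for (int i = 0; i < rows; ++i)
        for (int k = 0; k < rows; ++k)
            answer = std::max(answer, best(g, memo, i, cols - 1, k));
    return answer;
}
```

The memo table plays the role of the ans array; the only difference is that entries are filled on demand instead of column by column.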

Related

OpenCL 1D range loop without knowledge of global size

I was wondering how I can iterate over a loop with any number of work items (the number per group is irrelevant).
I have 3 arrays, and one of them is 2-dimensional (a matrix). The first array contains a set of integers. The matrix is filled with another set of (repeated and random) integers.
The third one is only to store the results.
For each number in the first array, I need to find the farthest pair of its occurrences in the matrix.
To summarize:
A: Matrix with random numbers
num: Array with numbers to search in A
d: Array with maximum distances of pairs of each number from num
The algorithm is simple (as I don't need to optimize it): I only compare calculated Manhattan distances and keep the maximum value.
To keep it simple, it does the following (C-like pseudo code):
for(number in num){
maxDistance = 0
for(row in A){
for(column in A){
//calculateDistance is a function containing another nested loop like this
//it returns the largest distance found, or 0 if there is none
currentDistance = calculateDistance(row, column, max)
if(currentDistance > maxDistance){
maxDistance = currentDistance
}
}
}
}
As you can see, there is no data dependency between iterations. I tried to assign each work item a slice of the matrix A, but I am still not convinced by that approach.
IMPORTANT: The kernel must be executed with only one dimension for the problem.
Any ideas? How can I use the global id to run multiple searches at once?
Edit:
I added the code to clear away any doubt.
Here is the kernel:
__kernel void maxDistances(int N, __constant int *A, int n, __constant int *numbers, __global int *distances)
{
//N is matrix row and col size
//A the matrix
//n the total count of numbers to be searched
//numbers is the array containing the numbers
//distances is the array containing the computed distances
size_t id = get_global_id(0);
int slice = (N*N)/get_global_size(0);
for(int idx_num = 0; idx_num < n; idx_num++)
{
int number = numbers[idx_num];
int currentDistance = 0;
int maxDistance = 0;
for(int c = id*slice; c < (id+1)*slice; c++)
{
int i = c/N;
int j = c%N;
if(*CELL(A,N,i,j) == number){
coord_t coords;
coords.i = i;
coords.j = j;
//bestDistance is a function with 2 nested loops iterating over
//rows and columns to retrieve the farthest pair of the number
currentDistance = bestDistance(N,A,coords,number, maxDistance);
if(currentDistance > maxDistance)
{
maxDistance = currentDistance;
}
}
}
distances[idx_num] = maxDistance;
}
}
This answer may be seen as incomplete, nevertheless, I am going to post it in order to close the question.
My problem was not the code, the kernel (or the algorithm); it was the machine. The above code is correct and works perfectly. After I tried my program on another machine, it executed and computed the solution with no problem at all.
So, in brief, the problem was the OpenCL device or most likely the host libraries.
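Independently of the machine issue, one detail of the slicing may be worth checking: slice = (N*N)/get_global_size(0) is an integer division, so when N*N is not a multiple of the global size, the trailing cells are not assigned to any work item. The sketch below shows one way to compute slice bounds that cover every cell, written as plain C++ rather than OpenCL C; sliceBounds is a hypothetical helper, not part of the original kernel.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Half-open range [begin, end) of linear cell indices for work item `id`,
// when `total` cells are split across `globalSize` work items.
// Ceiling division makes sure no cell is left out; the last ranges may be
// shorter or empty.
std::pair<std::size_t, std::size_t> sliceBounds(std::size_t id,
                                                std::size_t globalSize,
                                                std::size_t total) {
    std::size_t chunk = (total + globalSize - 1) / globalSize;
    std::size_t begin = std::min(id * chunk, total);
    std::size_t end = std::min(begin + chunk, total);
    return {begin, end};
}
```

Inside a kernel the same arithmetic would use get_global_id(0) for id and get_global_size(0) for globalSize, with total = N*N; each index c in the range maps to the cell (c/N, c%N) as in the loop above.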

Why does the longest prefix which is also suffix calculation part in the KMP have a time complexity of O(n) and not O(n^2)?

I was going through the code of KMP when I noticed the Longest Prefix which is also suffix calculation part of KMP. Here is how it goes,
void computeLPSArray(char* pat, int M, int* lps)
{
int len = 0;
lps[0] = 0;
int i = 1;
while (i < M) {
if (pat[i] == pat[len]) {
len++;
lps[i] = len;
i++;
}
else
{
if (len != 0) {
len = lps[len - 1]; //<----I am referring to this part
}
else
{
lps[i] = 0;
i++;
}
}
}
}
Now the part where I got confused was the one which I have shown in comments in the above code. Now we do know that when a code contains a loop like the following
int a[m];
memset(a, 0, sizeof(a));
for(int i = 0; i<m; i++){
for(int j = i; j>=0; j--){
a[j] = a[j]*2;//This inner loop is causing the same cells in the 1
//dimensional array to be visited more than once.
}
}
The complexity comes out to be O(m*m).
Similarly if we write the above LPS computation in the following format
while(i<M){
if{....}
else{
if(len != 0){
//doesn't this part cause the code to again go back a few elements
//in the LPS array the same way as the inner loop in my above
//written nested for loop does? Shouldn't that mean the same cell
//in the array is getting visited more than once and hence the
//complexity should increase to O(M^2)?
}
}
}
It might be that the way I think complexities are calculated is wrong. So please clarify.
If expressions do not take time that grows with len.
Len is an integer. Reading it takes O(1) time.
Array indexing is O(1).
Visiting something more than once does not by itself put you in a higher complexity class. The running time stays O(n) as long as the total number of visits is bounded by k*n for some constant k.
If you carefully analyze the algorithm that builds the prefix table, you may notice that the total number of rolled-back positions can be at most m, so the upper bound for the total number of iterations is 2*m, which yields O(m).
The value of len grows alongside the main iterator i, and whenever there is a mismatch, len drops back towards zero, but this "drop" cannot exceed the distance covered by the main iterator i since the start of the match.
For example, let's say, the main iterator i started matching with len at position 5 and mismatched at position 20.
So,
LPS[5]=1
LPS[6]=2
...
LPS[19]=15
At the moment of the mismatch, len has a value of 15. Hence it can roll back at most 15 positions, down to zero, which equals the distance covered by i while matching. In other words, on every mismatch, len travels back no more than i has traveled forward since the start of the match.
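To make the 2*m bound concrete, here is a small sketch that runs the same loop as computeLPSArray and counts how often the while body executes. LpsResult and computeLpsCounted are names invented for this illustration.

```cpp
#include <string>
#include <vector>

struct LpsResult {
    std::vector<int> lps;  // lps[i] = length of the longest proper prefix of
                           // pat[0..i] that is also a suffix of it
    int iterations;        // how many times the while body ran
};

LpsResult computeLpsCounted(const std::string& pat) {
    const int m = static_cast<int>(pat.size());
    LpsResult res{std::vector<int>(m, 0), 0};
    int len = 0;
    int i = 1;
    while (i < m) {
        ++res.iterations;
        if (pat[i] == pat[len]) {
            res.lps[i++] = ++len;    // i advances, len grows by one
        } else if (len != 0) {
            len = res.lps[len - 1];  // rollback: len strictly shrinks, i stays
        } else {
            res.lps[i++] = 0;        // i advances, len stays at zero
        }
    }
    return res;
}
```

Every iteration either advances i or shrinks len. Since len grows by at most one per advance of i, the number of rollbacks can never exceed the number of advances, so iterations stays at or below 2*m for any pattern.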

Void value not ignored as it ought to be (Trying to assign void to non-void variable?)

vector<vector<int>> matrixReshape(vector<vector<int>>& nums, int r, int c) {
int row = nums.size();
int col = nums[0].size();
vector<vector<int>> newNums;
if((row*col) < (r*c)){
return nums;
}
else{
deque<int> storage;
for(int i = 0; i < row; i++){
for(int k = 0; k < col; k++){
storage.push_back(nums[i][k]);
}
}
for(int j = 0; j < r; j++){
for(int l = 0; l < c; l++){
newNums[j][l] = storage.pop_front();
}
}
}
return newNums;
}
Hey guys, I am getting the error from the title, 'Void value not ignored as it ought to be'. When I looked up the error message, the tips stated "This is a GCC error message that means the return-value of a function is 'void', but that you are trying to assign it to a non-void variable. You aren't allowed to assign void to integers, or any other type." After reading this, I assumed my deque was not being populated; however, I cannot find out why that would be. The problem I am trying to solve is posted below. Also, I cannot run this through a debugger since it will not compile :(. Thanks in advance.
In MATLAB, there is a very useful function called 'reshape', which can reshape a matrix into a new one with different size but keep its original data.
You're given a matrix represented by a two-dimensional array, and two positive integers r and c representing the row number and column number of the wanted reshaped matrix, respectively.
The reshaped matrix needs to be filled with all the elements of the original matrix in the same row-traversing order as they were.
If the 'reshape' operation with given parameters is possible and legal, output the new reshaped matrix; Otherwise, output the original matrix.
Example 1:
Input:
nums =
[[1,2],
[3,4]]
r = 1, c = 4
Output:
[[1,2,3,4]]
Explanation:
The row-traversing of nums is [1,2,3,4]. The new reshaped matrix is a 1 * 4 matrix, fill it row by row by using the previous list.
This line has two problems:
newNums[j][l] = storage.pop_front();
First, pop_front() doesn't return the element that was popped; its return type is void. To get the first element of the deque, use storage[0] (or storage.front()). Then call pop_front() to remove it.
You also can't assign to newNums[j][l], because you haven't allocated those elements of the vectors. You can pre-allocate all the memory by declaring it like this:
vector<vector<int>> newNums(r, vector<int>(c));
So the assignment line should be replaced with:
newNums[j][l] = storage[0];
storage.pop_front();
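Putting both fixes together, a possible version of the whole function could look like the sketch below. It makes two changes beyond the answer that are assumptions on my part: it compares the element counts with != instead of <, since the problem statement only allows a reshape when all elements fit exactly, and it copies by index arithmetic instead of going through a deque.

```cpp
#include <vector>

std::vector<std::vector<int>> matrixReshape(
        const std::vector<std::vector<int>>& nums, int r, int c) {
    const int row = static_cast<int>(nums.size());
    const int col = row > 0 ? static_cast<int>(nums[0].size()) : 0;
    // Reshape is only possible when the element counts match exactly.
    if (r <= 0 || c <= 0 || row * col != r * c) {
        return nums;
    }
    // Pre-allocate r rows of c elements so that indexing is valid.
    std::vector<std::vector<int>> reshaped(r, std::vector<int>(c));
    for (int idx = 0; idx < row * col; ++idx) {
        // idx is the position in row-traversing order of both matrices.
        reshaped[idx / c][idx % c] = nums[idx / col][idx % col];
    }
    return reshaped;
}
```

For the example above, matrixReshape(nums, 1, 4) yields a single row holding 1, 2, 3, 4, while an impossible request such as r = 2, c = 4 returns the original matrix.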

Please tell me the efficient algorithm of Range Mex Query

I have a question about this problem.
Question
You are given a sequence a[0], a[1],..., a[N-1], and a set of ranges (l[i], r[i]) (0 <= i <= Q - 1).
Calculate mex(a[l[i]], a[l[i] + 1],..., a[r[i] - 1]) for all (l[i], r[i]).
The function mex is minimum excluded value.
Wikipedia Page of mex function
You can assume that N <= 100000, Q <= 100000, and a[i] <= 100000.
An O(Q * (r[i] - l[i]) log(r[i] - l[i])) algorithm is obvious, but it is not efficient.
My Current Approach
#include <bits/stdc++.h>
using namespace std;
int N, Q, a[100009], l, r;
int main() {
cin >> N >> Q;
for(int i = 0; i < N; i++) cin >> a[i];
for(int i = 0; i < Q; i++) {
cin >> l >> r;
set<int> s;
for(int j = l; j < r; j++) s.insert(a[j]);
int ret = 0;
while(s.count(ret)) ret++;
cout << ret << endl;
}
return 0;
}
Please tell me how to solve this.
EDIT: O(N^2) is too slow. Please tell me a faster algorithm.
Here's an O((Q + N) log N) solution:
Let's iterate over all positions in the array from left to right and store the last occurrences for each value in a segment tree (the segment tree should store the minimum in each node).
After adding the i-th number, we can answer all queries with the right border equal to i.
The answer is the smallest value x such that last[x] < l. We can find it by going down the segment tree starting from the root (if the minimum in the left child is smaller than l, we go there. Otherwise, we go to the right child).
That's it.
Here is some pseudocode:
tree = new SegmentTree() // A minimum segment tree with -1 in each position
for i = 0 .. n - 1
    tree.put(a[i], i)
    for all queries with r = i
        ans for this query = tree.findFirstSmaller(l)
The findFirstSmaller function goes like this:
int findFirstSmaller(node, value)
    if node.isLeaf()
        return node.position()
    if node.leftChild.minimum < value
        return findFirstSmaller(node.leftChild, value)
    return findFirstSmaller(node.rightChild, value)
This solution is rather easy to code (all you need is a point update and the findFirstSmaller function shown above), and I'm sure that it's fast enough for the given constraints.
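A minimal C++ sketch of this offline approach follows. It is one possible reading of the pseudocode, under a few assumptions: ranges are half-open [l, r) as in the question (so a query is answered once a[0..r-1] has been inserted), all values are non-negative, and values larger than N are skipped because the mex of at most N elements can never exceed N. Query and rangeMex are names chosen for this example.

```cpp
#include <algorithm>
#include <vector>

struct Query { int l, r; };  // half-open range [l, r)

std::vector<int> rangeMex(const std::vector<int>& a,
                          const std::vector<Query>& queries) {
    const int n = static_cast<int>(a.size());
    int size = 1;
    while (size < n + 1) size *= 2;  // one leaf per candidate value 0..n
    // tree[node] = minimum last-occurrence index over the values below node;
    // -1 means "has not occurred yet".
    std::vector<int> tree(2 * size, -1);

    // Group query indices by their right border.
    std::vector<std::vector<int>> byRight(n + 1);
    for (int q = 0; q < static_cast<int>(queries.size()); ++q)
        byRight[queries[q].r].push_back(q);

    std::vector<int> answer(queries.size());
    for (int i = 0; i <= n; ++i) {
        // At this point a[0..i-1] has been inserted: answer queries with r == i.
        for (int q : byRight[i]) {
            const int l = queries[q].l;
            int node = 1;
            while (node < size)  // descend to the leftmost leaf with last < l
                node = (tree[2 * node] < l) ? 2 * node : 2 * node + 1;
            answer[q] = node - size;
        }
        if (i < n && a[i] <= n) {  // point update: last[a[i]] = i
            int node = size + a[i];
            tree[node] = i;
            for (node /= 2; node >= 1; node /= 2)
                tree[node] = std::min(tree[2 * node], tree[2 * node + 1]);
        }
    }
    return answer;
}
```

Each update and each query walks one root-to-leaf path, which gives the O((Q + N) log N) total mentioned above.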
Let's process both our queries and our elements in a left-to-right manner, something like
for (int i = 0; i < N; ++i) {
// 1. Add a[i] to all internal data structures
// 2. Calculate answers for all queries q such that r[q] == i
}
Here we have O(N) iterations of this loop and we want to do both update of the data structure and query the answer for suffix of currently processed part in o(N) time.
Let's use the array contains[i][j] which has 1 if suffix starting at the position i contains number j and 0 otherwise. Consider also that we have calculated prefix sums for each contains[i] separately. In this case we could answer each particular suffix query in O(log N) time using binary search: we should just find the first zero in the corresponding contains[l[i]] array which is exactly the first position where the partial sum is equal to index, and not to index + 1. Unfortunately, such arrays would take O(N^2) space and need O(N^2) time for each update.
So, we have to optimize. Let's build a 2-dimensional range tree with "sum query" and "assignment" range operations. In such tree we can query sum on any sub-rectangle and assign the same value to all the elements of any sub-rectangle in O(log^2 N) time, which allows us to do the update in O(log^2 N) time and queries in O(log^3 N) time, giving the time complexity O(Nlog^2 N + Qlog^3 N). The space complexity O((N + Q)log^2 N) (and the same time for initialization of the arrays) is achieved using lazy initialization.
Update: Let's review how the query works in range trees with "sum". For a 1-dimensional tree (to not make this answer too long), it's something like this:
class Tree
{
int l, r; // begin and end of the interval represented by this vertex
int sum; // already calculated sum
int overriden; // value of override or special constant
Tree *left, *right; // pointers to children
};
// returns sum of the part of this subtree that lies between from and to
int Tree::get(int from, int to)
{
if (from > r || to < l) // no intersection
{
return 0;
}
if (from <= l && r <= to) // whole subtree lies within the interval
{
return sum;
}
if (overriden != NO_OVERRIDE) // should push override to children
{
left->overriden = right->overriden = overriden;
left->sum = right->sum = (r - l) / 2 * overriden;
overriden = NO_OVERRIDE;
}
return left->get(from, to) + right->get(from, to); // split to 2 queries
}
Given that in our particular case all queries to the tree are prefix sum queries, from is always equal to 0, so one of the calls to the children always returns a trivial answer (0 or the already computed sum). So, instead of doing O(log N) queries to the 2-dimensional tree in the binary search algorithm, we could implement an ad-hoc search procedure, very similar to this get query. It should first get the value of the left child (which takes O(1) since it's already calculated), then check if the node we're looking for is to the left (this sum is less than the number of leaves in the left subtree) and go to the left or to the right based on this information. This approach further optimizes the query to O(log^2 N) time (since it's one tree operation now), giving the resulting complexity of O((N + Q)log^2 N) in both time and space.
I'm not sure this solution is fast enough for both Q and N up to 10^5, but it can probably be optimized further.
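The descent described above can be illustrated in one dimension. The sketch below is only a simplified stand-in for the 2-dimensional tree, without lazy assignment: it builds a plain sum tree over a 0/1 array and finds the first zero by comparing the left child's sum with the number of leaves under it. firstZero is a name made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the first 0 in a 0/1 array, or bits.size() if none.
int firstZero(const std::vector<int>& bits) {
    int size = 1;
    while (size < static_cast<int>(bits.size())) size *= 2;
    // sum[node] = number of ones below node; padding leaves stay 0.
    std::vector<int> sum(2 * size, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) sum[size + i] = bits[i];
    for (int node = size - 1; node >= 1; --node)
        sum[node] = sum[2 * node] + sum[2 * node + 1];

    if (sum[1] == size) return static_cast<int>(bits.size());  // all ones

    int node = 1;
    int leaves = size;  // number of leaves below the current node
    while (node < size) {
        leaves /= 2;
        // A left subtree with fewer ones than leaves must contain a zero.
        node = (sum[2 * node] < leaves) ? 2 * node : 2 * node + 1;
    }
    return node - size;
}
```

The search touches one node per level, so it costs O(log N) instead of the O(log^2 N) that a binary search over separate prefix-sum queries would need.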

Number of paths in mXn grid

Is there a way to find the number of paths in an mXn grid, moving one cell at a time either downward, right or diagonally down-right, using permutations and combinations, starting from (1,1) and reaching (m,n)? I know there is a straightforward DP solution and also a P&C solution (i.e. C(m+n-2, n-1)) if the movement is only downward and right.
Look up Delannoy numbers. The combinatoric solution is expressed as a sum of multinomials.
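Let t be the number of diagonal moves. A path with t diagonal moves also needs m-1-t downward and n-1-t rightward moves, so the count becomes:
$$\sum_{t=0}^{\min(m-1,\,n-1)} \frac{(m+n-2-t)!}{t!\,(m-1-t)!\,(n-1-t)!}$$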
This just needs a slight extension to the existing DP solution that counts the paths when only downward and rightward movements are allowed.
The only change you need to make is to count the number of ways you can reach a point if you move diagonally as well.
The code I took from http://www.geeksforgeeks.org/count-possible-paths-top-left-bottom-right-nxm-matrix/ should help you understand it better.
// Returns count of possible paths to reach cell at row number m and column
// number n from the topmost leftmost cell (cell at 1, 1)
int numberOfPaths(int m, int n)
{
// Create a 2D table to store results of subproblems
int count[m][n];
// Count of paths to reach any cell in first column is 1
for (int i = 0; i < m; i++)
count[i][0] = 1;
// Count of paths to reach any cell in first row is 1
for (int j = 0; j < n; j++)
count[0][j] = 1;
// Calculate count of paths for other cells in bottom-up manner using
// the recursive solution
for (int i = 1; i < m; i++)
{
for (int j = 1; j < n; j++)
// Downwards + Rightwards + Diagonally down-right
count[i][j] = count[i-1][j] + count[i][j-1] + count[i-1][j-1];
}
return count[m-1][n-1];
}
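As a cross-check between the two answers, here is a hedged sketch that computes the count both ways: once with the multinomial sum over the number of diagonal moves t, and once with the DP. The function names binomial, countPathsFormula and countPathsDP are chosen for this example; the code assumes m, n >= 1 and uses std::vector and long long instead of the variable-length int array, which is an extension rather than standard C++.

```cpp
#include <algorithm>
#include <vector>

// C(n, k) via the multiplicative formula; every intermediate value is itself
// a binomial coefficient, so the division is exact.
long long binomial(int n, int k) {
    long long result = 1;
    for (int i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}

// Sum over t diagonal moves of (m+n-2-t)! / (t! (m-1-t)! (n-1-t)!),
// written as a product of two binomials.
long long countPathsFormula(int m, int n) {
    long long total = 0;
    for (int t = 0; t <= std::min(m, n) - 1; ++t) {
        int moves = m + n - 2 - t;  // total number of moves on such a path
        total += binomial(moves, t) * binomial(moves - t, m - 1 - t);
    }
    return total;
}

// Same DP as above: first row and first column have exactly one path.
long long countPathsDP(int m, int n) {
    std::vector<std::vector<long long>> count(m, std::vector<long long>(n, 1));
    for (int i = 1; i < m; ++i)
        for (int j = 1; j < n; ++j)
            count[i][j] = count[i - 1][j] + count[i][j - 1] + count[i - 1][j - 1];
    return count[m - 1][n - 1];
}
```

Both functions return 13 for a 3x3 grid and 63 for a 4x4 grid, which are the central Delannoy numbers D(2,2) and D(3,3).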