Given an integer K and a matrix of size t x t, construct a string s consisting of the first t lowercase English letters such that the total cost of s is K - C++

I'm solving this problem and got stuck halfway through. I'm looking for help and a better method to tackle such a problem:
Problem:
Given an integer K and a cost matrix of size t x t, we have to construct a string s consisting of the first t lowercase English letters such that the total cost of s is exactly K. It is guaranteed that at least one string satisfies the given conditions. Among all such strings, we want the lexicographically smallest one.
Specifically, the cost of having the ith character followed by the jth character of the English alphabet is equal to cost[i][j].
For example, the cost of having 'a' followed by 'a' is denoted by cost[0][0] and the cost of having 'b' followed by 'c' is denoted by cost[1][2].
The total cost of a string is the sum of the costs of every pair of consecutive characters in s. For example, if the cost matrix is
[1 2]
[3 4],
and the string is "abba", then we have:
the cost of having 'a' followed by 'b' is cost[0][1] = 2,
the cost of having 'b' followed by 'b' is cost[1][1] = 4,
the cost of having 'b' followed by 'a' is cost[1][0] = 3.
In total, the cost of the string "abba" is 2+4+3=9.
Example:
Consider K = 3 and t = 2, with the cost matrix
[2 1]
[3 4]
There are two strings whose total cost is 3. Those strings are:
"aab"
"ba"
Our answer will be "aab", as it is the lexicographically smallest.
My approach
I tried to find and store all combinations of cells (i, j) whose costs sum to the desired value K, or that individually equal K.
For the above example:
v={
{2,1},
{3,4}
}
k = 3
Here v[0][0] + v[0][1] = 3 and v[1][0] = 3. I tried to store the pairs in a container like std::vector<std::vector<std::pair<int, int>>>. Based on it, I would create all possible strings and store them in a set, which would give me the strings in lexicographical order.
I got stuck after writing this much code:
#include<iostream>
#include<vector>
int main(){
using namespace std;
vector<vector<int>>v={{2,1},{3,4}};
vector<pair<int,int>>k;
int size=v.size();
for(size_t i=0;i<size;i++){
for(size_t j=0;j<size;j++){
if(v[i][j]==3){
k.push_back(make_pair(i,j));
}
}
}
}
Please help me understand how such a problem can be tackled. My code can only find the individual [i,j] pairs that are equal to the desired K. I have no idea how to collect multiple [i,j] pairs that sum to the desired value, and my approach also appears to be naive and based on brute force. I'm looking for a better way to think about such problems and to implement the solution in code. Thank you.

This is a backtracking problem. The general approach is:
a) Start with the smallest letter, 'a', and then recurse on all the available letters in order. If you find a string whose cost sums to K, you have the answer: it is the lexicographically smallest because candidates are explored from the smallest letter to the largest.
b) If nothing is found starting with 'a', move on to the next letter.
The recursion/backtracking can be done as follows:
Start with a letter i and the original value of K.
For every j from 0 to t-1, append letter j and reduce K by cost[i][j].
If K == 0, you have found your string.
If K < 0, that path is not possible, so remove the last letter from the string and try other paths.
Pseudocode :
string find_smallest() {
    for (int i = 0; i < t; i++) {
        string s(1, (char)('a' + i));   // start with a single letter
        if (recurse(i, t, K, s)) return s;
    }
    return "";
}
// s is passed by reference so the caller sees the string that was built
bool recurse(int i, int t, int K, string& s) {
    if (K < 0) {
        return false;
    }
    if (K == 0) {
        return true;
    }
    for (int j = 0; j < t; j++) {
        s += (char)('a' + j);
        if (recurse(j, t, K - cost[i][j], s)) return true;
        s.pop_back();   // backtrack: remove the letter we just tried
    }
    return false;
}

In your implementation, you would probably need another vector of vectors of pairs to explore all your candidates, plus another vector for updating the current cost of each candidate as it builds up. Following this approach, things start to get a bit messy (IMO).
A cleaner and more understandable option (IMO again) could be to approach the problem with recursion:
#include <iostream>
#include <string>
#include <vector>
#define K 3
using namespace std;
string exploreCandidate(int currentCost, string currentString, vector<vector<int>> &v)
{
if (currentCost == K)
return currentString;
int size = v.size();
int lastChar = (int)currentString.back() - 97; // get ASCII code
for (size_t j = 0; j < size; j++)
{
int nextTotalCost = currentCost + v[lastChar][j];
if (nextTotalCost > K)
continue;
string nextString = currentString + (char)(97 + j); // get ASCII char
string exploredString = exploreCandidate(nextTotalCost, nextString, v);
if (exploredString != "00") // It is a valid path
return exploredString;
}
return "00";
}
int main()
{
vector<vector<int>> v = {{2, 1}, {3, 4}};
int size = v.size();
string initialString = "00"; // reserve first two positions
for (size_t i = 0; i < size; i++)
{
for (size_t j = 0; j < size; j++)
{
initialString[0] = (char)(97 + i);
initialString[1] = (char)(97 + j);
string exploredString = exploreCandidate(v[i][j], initialString, v);
if (exploredString != "00") { // It is a valid path
cout << exploredString << endl;
return 0;
}
}
}
}
Let us begin from the main function:
We define our matrix and iterate over it. For each position, we define the corresponding sequence. Notice that we can use indices to get the respective character of the English alphabet, knowing that in ASCII code a=97, b=98...
Having this initial sequence, we can explore candidates recursively, which leads us to the exploreCandidate recursive function.
First, we check whether the current cost already equals the value we are looking for. If it does, we return immediately without evaluating any further candidates. We can do this because we are looking for the lexicographically smallest string, and we are not asked to report all the candidates.
If the cost condition is not yet satisfied (cost < K), we need to continue exploring our candidate, not over the whole matrix but only over the row corresponding to the last character. Then we can encounter two scenarios:
The cost condition is met (cost = K): if at some point of the recursion the cost is equal to our value K, then the string is a valid one, and since it is the first one we encounter, we return it and finish the execution.
The cost is not valid (cost > K): if the current cost is greater than K, then we need to abort this branch and see if other branches are luckier. Returning a boolean would be nice, but since we want to output a string (or maybe not, depending on the statement), one option is to return a string and use "00" as our "false" value, which lets us know whether the cost condition has been met. Another option is to return a boolean and use an output parameter (passed by reference) to hold the output string.
EDIT:
The provided code assumes strictly positive costs. If some costs were zero, you could run into infinite recursion, so you would need to add more constraints to your recursive function.
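As a rough illustration of the alternative mentioned above (returning a boolean and filling an output string passed by reference), here is a minimal sketch. The names extend and smallestString are hypothetical and not part of the original answer, and the code keeps the same assumption of strictly positive costs. The search is still exponential in the worst case.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Depth-first search in lexicographic order.
// Assumption: every cost is strictly positive, so `remaining` shrinks on
// every step and the recursion always terminates.
bool extend(const std::vector<std::vector<int>>& cost, std::size_t last,
            int remaining, std::string& out) {
    if (remaining == 0) return true;           // exact cost reached
    for (std::size_t j = 0; j < cost.size(); ++j) {
        int c = cost[last][j];
        if (c > remaining) continue;           // would overshoot K
        out.push_back(static_cast<char>('a' + j));
        if (extend(cost, j, remaining - c, out)) return true;
        out.pop_back();                        // backtrack
    }
    return false;
}

// Returns the lexicographically smallest string with total cost k,
// or an empty string if no such string exists.
std::string smallestString(const std::vector<std::vector<int>>& cost, int k) {
    for (std::size_t i = 0; i < cost.size(); ++i) {
        std::string s(1, static_cast<char>('a' + i));
        if (extend(cost, i, k, s)) return s;
    }
    return "";
}
```

Calling smallestString({{2, 1}, {3, 4}}, 3) yields "aab" for the example in the question. An empty result plays the role of the "00" sentinel.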

Related

how to find distinct substrings?

Given a string, and a fixed length l, how can I count the number of distinct substrings whose length is l?
The size of character set is also known. (denote it as s)
For example, given a string "PccjcjcZ", s = 4, l = 3,
then there are 5 distinct substrings:
“Pcc”; “ccj”; “cjc”; “jcj”; “jcZ”
I tried to use a hash table, but it is still slow.
In fact, I don't know how to make use of the character set size.
I have done something like this:
int diffPatterns(const string& src, int len, int setSize) {
int cnt = 0;
node* table[1 << 15];
int tableSize = 1 << 15;
for (int i = 0; i < tableSize; ++i) {
table[i] = NULL;
}
unsigned int hashValue = 0;
int end = (int)src.size() - len;
for (int i = 0; i <= end; ++i) {
hashValue = hashF(src, i, len);
if (table[hashValue] == NULL) {
table[hashValue] = new node(i);
cnt ++;
} else {
if (!compList(src, i, table[hashValue], len)) {
cnt ++;
};
}
}
for (int i = 0; i < tableSize; ++i) {
deleteList(table[i]);
}
return cnt;
}
Hash tables are fine and practical, but keep in mind that if the length of the substrings is L and the length of the whole string is N, then the algorithm is Theta((N+1-L)*L), which is Theta(NL) for most L. Remember, just computing the hash takes Theta(L) time. There might also be collisions.
Suffix trees can be used, and provide a guaranteed O(N) time algorithm (count number of paths at depth L or greater), but the implementation is complicated. Saving grace is you can probably find off the shelf implementations in the language of your choice.
The idea of using a hashtable is good. It should work well.
The idea of implementing your own hashtable as an array of length 2^15 is bad. See Hashtable in C++? instead.
You can use an unordered_set: insert the substrings into the set and then take the size of the set. Since the values in a set are unique, it takes care of not counting substrings that are the same as ones previously found. This needs O(StringSize - SubstringSize) insertions; each one copies and hashes SubstringSize characters, so the total expected cost is about O((StringSize - SubstringSize) * SubstringSize).
#include <iostream>
#include <string>
#include <unordered_set>
int main()
{
std::string test = "PccjcjcZ";
std::unordered_set<std::string> counter;
size_t substringSize = 3;
for (size_t i = 0; i + substringSize <= test.size(); ++i) // also safe when the string is shorter than substringSize
{
counter.insert(test.substr(i, substringSize));
}
std::cout << counter.size();
std::cin.get();
return 0;
}
Veronica Kham's answer is good, but we can improve the method to expected O(n) while still using a simple hash table rather than a suffix tree or another advanced data structure.
Hash function
Let X and Y be two adjacent substrings of length L. More precisely:
X = A[i, i + L - 1]
Y = A[i + 1, i + 1 + L - 1]
Let us assign each letter of our alphabet a single non-negative integer, for example a := 1, b := 2 and so on.
Let's define a hash function h now:
h(A[i, j]) := (P^(L-1) * A[i] + P^(L-2) * A[i + 1] + ... + A[j]) % M
where P is a prime number ideally greater than the alphabet size and M is a very big number denoting the number of different possible hashes, for example you can set M to maximum available unsigned long long int in your system.
Algorithm
The crucial observation is the following:
If you have a hash computed for X, you can compute a hash for Y in
O(1) time.
Let us assume that we have computed h(X), which can obviously be done in O(L) time. We want to compute h(Y). Notice that X and Y differ by only two characters: Y drops the first character of X, A[i], and appends one new character, A[j + 1], where j = i + L - 1. So we can update the hash using only subtraction, multiplication and addition:
h(Y) = ((h(X) - P^(L-1) * A[i]) * P + A[j + 1]) % M
Basically, we subtract the letter A[i] multiplied by its coefficient in h(X), multiply the result by P in order to get the proper coefficients for the rest of the letters, and at the end add the last letter A[j + 1].
Notice that we can precompute powers of P at the beginning and we can do it modulo M.
Since our hash function returns integers, we can use any hash table to store them. Remember to do all computations modulo M and to avoid integer overflow.
Collisions
Of course, a collision might occur, but since P is prime and M is really huge, it is rare.
If you want to lower the probability of a collision, you can use two different hash functions, for example with a different modulus in each. If the probability of a collision is p with one such function, then with two (roughly independent) functions it is about p^2, and we can make it arbitrarily small with this trick.
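The rolling-hash idea described above could be sketched as follows. This is only an illustration under stated assumptions: the function name countDistinctByRollingHash is made up, the base P = 257 and the modulus M = 1000000007 are arbitrary choices (a smaller modulus than the one suggested above, picked so that every product fits in 64 bits), and collisions are not resolved, so the count can in principle be too low.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>

// Counts distinct substrings of length `len` by hashing each window in O(1).
// Assumption: hash collisions are ignored, so this is a probabilistic count.
std::size_t countDistinctByRollingHash(const std::string& s, std::size_t len) {
    if (len == 0 || len > s.size()) return 0;
    const std::uint64_t M = 1000000007ULL;  // assumed modulus
    const std::uint64_t P = 257;            // assumed base, larger than any byte value

    std::uint64_t top = 1;                  // P^(len-1) mod M
    for (std::size_t i = 1; i < len; ++i) top = top * P % M;

    std::uint64_t h = 0;                    // hash of the first window
    for (std::size_t i = 0; i < len; ++i)
        h = (h * P + static_cast<unsigned char>(s[i])) % M;

    std::unordered_set<std::uint64_t> seen{h};
    for (std::size_t i = len; i < s.size(); ++i) {
        // Remove the outgoing character, shift by P, add the incoming one.
        std::uint64_t out = static_cast<unsigned char>(s[i - len]) * top % M;
        h = (h + M - out) % M;
        h = (h * P + static_cast<unsigned char>(s[i])) % M;
        seen.insert(h);
    }
    return seen.size();
}
```

For the string "PccjcjcZ" with length 3 this returns 5, matching the example in the question. Storing a pair of hashes with two different moduli would be the two-function variant described above.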
Use Rolling hashes.
This will make the runtime expected O(n).
This might be repeating pkacprzak's answer, except, it gives a name for easier remembrance etc.
Suffix Automaton also can finish it in O(N).
It's easy to code, but hard to understand.
Here are papers about it http://dl.acm.org/citation.cfm?doid=375360.375365
http://www.sciencedirect.com/science/article/pii/S0304397509002370

Number of Rs in a string

I have an assignment where I'm given a string S containing the letters 'R' and 'K', for example "RRRRKKKKRK".
I need to obtain the maximum number of 'R's that string could possibly hold by flipping characters i through j to their opposite. So:
for (int x = i; x < j; x++)
{
    if (S[x] == 'R')
    {
        S[x] = 'K';
    }
    else
    {
        S[x] = 'R';
    }
}
However, I can only make the above call once.
So for the above example: "RRRRKKKKRK".
You would have i = 4 and j = 8, which would result in "RRRRRRRRRK", and you would then output the number of R's in the resulting string: 9.
My code partially works, but there are some cases that it doesn't. Can anyone figure out what is missing?
Sample Input
2
RKKRK
RKKR
Sample Output
4
4
My Solution
My solution, which works only for the first case, is below. I don't know what I'm missing to complete the algorithm:
int max_R = INT_MIN;
for (int i = 0; i < s.size(); i++)
{
for (int j = i + 1; j < s.size(); j++)
{
int cnt = 0;
string t = s;
if (t[j] == 'R')
{
t[j] = 'K';
}
else
{
t[j] = 'R';
}
for (int b = 0; b < s.size(); b++)
{
if (t[b] == 'R')
{
cnt++;
if (cnt > max_R)
{
max_R = cnt;
}
}
}
}
}
cout << max_R << endl;
How about turning this into the Maximum subarray problem which has O(n) solution?
Run through the string once, giving 'K' a value of 1 and 'R' a value of -1.
E.g. for 'RKRRKKKKRKK' you produce the array [-1, 1, -1, -1, 1, 1, 1, 1, -1, 1, 1] -> [-1, 1, -2, 4, -1, 2] (I grouped consecutive -1s and 1s to make it clearer).
Apply Kadane's algorithm to the generated array. What you get is the maximum net gain in the number of 'R's that a single flip can give: the 'K's turned into 'R's minus the 'R's turned into 'K's within the flipped range.
Continuing with the example, you find that the maximum subarray is [4, -1, 2] with a sum of 5.
Now add the number of 'R's in the original string to the sum of your maximum subarray to obtain your answer.
In our case, the original string contains 4 'R's, so we get 4 + 5 = 9.
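A minimal sketch of this reduction to the maximum subarray problem might look like the following. The function name maxRsAfterOneFlip is hypothetical, and the sketch assumes that flipping nothing is allowed (the best gain starts at 0); if exactly one non-empty flip is mandatory, the gain would have to start from the first element instead.

```cpp
#include <algorithm>
#include <string>

// Returns the maximum number of 'R's after flipping at most one contiguous range.
// 'K' counts as +1 (it becomes an 'R'), 'R' counts as -1 (it is lost).
int maxRsAfterOneFlip(const std::string& s) {
    int totalR = 0;   // 'R's in the original string
    int best = 0;     // best net gain so far; 0 means "flip nothing" (assumption)
    int current = 0;  // best gain of a range ending at the current character
    for (char c : s) {
        int v = (c == 'K') ? 1 : -1;
        if (c == 'R') ++totalR;
        current = std::max(v, current + v);  // Kadane's recurrence
        best = std::max(best, current);
    }
    return totalR + best;
}
```

For the sample inputs "RKKRK" and "RKKR" this returns 4 and 4, and the whole computation is a single O(n) pass.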
Try to think carefully about your solution. Do you understand what it does?
First, let’s forget that the input file may contain multiple tests, so let’s get rid of the while loop. Now, we have just two for loops. The second one obviously just counts R’s in the processed string. But what does the first one do?
The answer is that the first loop flips all the letters from the second one (i.e. which has index 1) till the end of the string. We can see that in the first testcase:
RKKRK
it is indeed the optimal solution. The string turns into RRRKR and we get four R’s. But in the second case:
RKKR
the string turns into RRRK and we get three R’s. While if we flipped just the letters from 2 to 3 (i.e. indices 1 to 2) we could get RRRR which has four R’s.
So your algorithm always flips letters from index 1 to the end, but this is not always optimal. What can we do? How do we know which letters to flip? Well, there are some smart solutions, but the easiest is to just try all possible combinations!
You can flip all the letters from 0 to 1, count the number of R’s, remember it. Get back to the original string, flip letters from 0 to 2, count R’s, remember it and so on till you flip from 0 to n-1. Then you flip letters from 1 to 2, from 1 to 3, etc. And the answer is the largest value you remembered.
This is horribly inefficient, but it works. After you get more practice in solving algorithmic problems, come back to this task and try to figure out more efficient solutions. (Hint: if you consider building the optimal answer incrementally, that is by going through the string char by char and transforming the optimal solution for the substring s[0..i] into the optimal solution for s[0..i+1], you can arrive at a pretty straightforward O(n^2) algorithm. This can be enhanced to O(n), but this step is slightly more involved.)
Here is the sketch of this solution:
def solve(s):
    n = len(s)
    answer = 0
    for i in range(n):
        for j in range(i, n):
            t = list(s)  # work on a copy; we will need the original string later
            for x in range(i, j + 1):  # flip letters from i to j in t
                t[x] = 'K' if t[x] == 'R' else 'R'
            c = t.count('R')  # count R's in t
            answer = max(answer, c)
    return answer

How to find all substrings that start and end with 1?

You are given a string of 0’s and 1’s you have to find all substrings in the string which starts and end with a 1.
For example, given 0010110010, output should be the six strings:
101
1011
1011001
11
11001
1001
Obviously there is an O(N^2) solution, but I'm looking for a solution with complexity on the order of O(N). Is it possible?
Let k be the number of 1s in our input string. Then there are O(k^2) such substrings. Enumerating them must take at least O(k^2) time. If k ~ N, then enumerating them must take O(N^2) time.
The only way to get an O(N) solution is if we add the requirement that k is o(sqrt(N)). There cannot be an O(N) solution in the general case with no restriction on k.
An actual O(k^2) solution is straightforward:
std::string input = ...;
std::vector<size_t> ones;
ones.reserve(input.size());
// O(N) find the 1s
for (size_t idx = 0; idx < input.size(); ++idx) {
if (input[idx] == '1') {
ones.push_back(idx);
}
}
// O(k^2) walk the indices
for (size_t i = 0; i < ones.size(); ++i) {
for (size_t j = i+1; j < ones.size(); ++j) {
std::cout << input.substr(ones[i], ones[j] - ones[i] + 1) << '\n'; // from the i-th '1' through the j-th '1'
}
}
Update: We have to account for the lengths of the substrings as well as the number of them. The total length of all the strings is O(k * N), which is strictly greater than the previously claimed bound of O(k^2). Thus, the o(sqrt(N)) bound on k is insufficient - we actually need k to be O(1) in order to yield an O(N) solution.
You can find the same in O(n) with the following steps :
1. Count the number of 1's.
2. Let # of 1's be x, we return x(x-1)/2.
This quite trivially counts the number of possible pairs of 1's.
The code itself is probably worth trying yourself!
EDIT:
If you want to return the substrings themselves, you must restrict the number of 1's in your substring in order to get some sort of O(N) solution (or really O(x) where x is your # of 1's) , as enumerating them in itself cannot be reduced in a general case from O(N^2) time complexity.
If you just need the number of substrings, and not the substrings themselves, you could probably pull it off by counting the number of pairs after doing an initial O(n) sum of the number of 1's you encounter
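For the counting variant, a short sketch of the x(x-1)/2 formula could look like this. The function name countOneToOneSubstrings is only an illustrative choice.

```cpp
#include <algorithm>
#include <string>

// Counts substrings that start and end with '1' (length >= 2):
// every pair of distinct '1' positions defines exactly one such substring.
long long countOneToOneSubstrings(const std::string& s) {
    long long ones = std::count(s.begin(), s.end(), '1');  // O(n) scan
    return ones * (ones - 1) / 2;
}
```

For "0010110010" there are four 1s, so the function returns 6, which agrees with the six substrings listed in the question.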
Assuming N is supposed to be the number of 1s in your string (or at least proportional to it, which is reasonable assuming a constant probability of 1 for each character):
If you need the substrings themselves, there's going to be N(N-1)/2, which is quadratic, so it's completely impossible to be less complex than quadratic.
import java.util.*;
public class DistictSubstrings {
public static void main(String args[]) {
// a hash set
Scanner in = new Scanner(System.in);
System.out.print("Enter The string");
String s = in.nextLine();
int L = s.length();
Set<String> hs = new HashSet<String>();
// add elements to the hash set
for (int i = 0; i < L; ++i) {
for (int j = 0; j < L-i ; ++j) {
if(s.charAt(j)=='1'&&s.charAt(j+i)=='1')
{
hs.add(s.substring(j, j+i + 1));
}
}
}
Iterator<String> it = hs.iterator();
System.out.println("the string starts and endswith 1");
System.out.println(hs.size());
while (it.hasNext())
{
System.out.println(it.next() + " ");
}
}
}
String s="1001010001";
for(int i=0;i<=s.length()-1;i++)
{
for(int j=0;j<=s.length()-1;j++)
{
if(s.charAt(j)=='1' && s.charAt(i)=='1' && i<j)
{
System.out.println(s.substring(i,j+1));
}
}
}
The following Python code will help you count all substrings that start and end with 1.
# -*- coding: utf-8 -*-
"""
Created on Tue Sep 26 14:25:14 2017
@author: Veeramani Natarajan
"""
# Python Implementation to find all substrings that start and end with 1
# Function to calculate the count of sub-strings
def calcCount(mystring):
    cnt=-1
    index=0
    while(index<len(mystring)):
        if(mystring[index]=='1'):
            cnt += 1
        index += 1
    return cnt

mystring="0010110010";
index=0;
overall_cnt=0
while(index<len(mystring)):
    if(mystring[index]=='1'):
        partcount=calcCount(mystring[index:len(mystring)])
        overall_cnt=overall_cnt+partcount
        # print("index is",index)
        # print("passed string",mystring[index:len(mystring)])
        # print("Count",partcount,"overall",overall_cnt)
    index=index+1
# print the overall sub strings count
print (overall_cnt)
Note:
Though this is not an O(N) solution, I believe it will help someone understand a Python implementation of the above problem statement.
O(n) solution is definitely possible using DP.
We take an array of pairs where the first element of each pair is the number of qualifying substrings that end at or before that index, and the second element is the number of 1s strictly before that index, i.e. the number of possible starting points for a substring ending there. (So, if the char at that index is 1, the second element does not count that 1 itself.)
We simply iterate through the array and build the solution incrementally as we do in DP and after the end of the loop, we have the final value in the pair's first element in the last index of our array. Here's the code:
int getoneonestrings(const string &str)
{
int n = str.length();
if (n <= 1) // also covers the empty string, where dp[n - 1] would be out of range
{
return 0;
}
vector< pair<int, int> > dp(n, make_pair(0, 0));
for (int i = 1; i < n; i++)
{
if (str[i] == '0')
{
dp[i].first = dp[i - 1].first;
}
else
{
dp[i].first = dp[i - 1].first + dp[i - 1].second +
(str[i - 1] == '1' ? 1 : 0);
}
dp[i].second = dp[i - 1].second + (str[i - 1] == '1' ? 1 : 0);
}
return dp[n - 1].first;
}

Comparisons of strings with c++

I used to have some code in C++ which stores strings as a series of characters in a character matrix (a string is a row). The classes CharacterMatrix and LogicalVector are provided by Rcpp.h:
LogicalVector unq_mat( CharacterMatrix x ){
int nc = x.ncol() ; // Get the number of columns in the matrix.
LogicalVector out(nc); // Make a logical (bool) vector of the same length.
// For every col in the matrix, assess whether the column contains more than one unique character.
for( int i=0; i < nc; i++ ) {
out[i] = unique( x(_,i) ).size() != 1 ;
}
return out;
}
The logical vector identifies which columns contain more than one unique character. This is then passed back to the R language and used to manipulate a matrix. This is a very R way of thinking about it. However, I'm interested in developing my thinking in C++, and I'd like to write something that achieves the above: it finds out which character positions in n strings are not all the same, preferably using STL classes like std::string. As a conceptual example, given three strings:
A = "Hello", B = "Heleo", C = "Hidey". The code would point out that positions/characters 2, 3, 4 and 5 are not one value, but position/character 1 (the 'H') is the same in all strings (i.e. there is only one unique value). I have something below that I thought worked:
std::vector<int> StringsCompare(std::vector<string>& stringVector) {
std::vector<int> informative;
for (int i = 0; i < stringVector[0].size()-1; i++) {
for (int n = 1; n < stringVector.size()-1; n++) {
if (stringVector[n][i] != stringVector[n-1][i]) {
informative.push_back(i);
break;
}
}
}
return informative;
}
It's supposed to go through every character position (0 to size of string-1) with the outer loop and, with the inner loop, see if the character in string n is not the same as the character in string n-1. In cases where the character is the same in all strings, for example the H in my hello example above, this will never be true. For cases where the characters in the strings are different, the inner loop's if statement will be satisfied, the character position recorded, and the inner loop broken out of. I then get a vector out containing the indices of the characters in the n strings where the characters are not all identical. However, these two functions give me different answers. How else can I go through n strings char by char and check that they are not all identical?
Thanks,
Ben.
I expected @doctorlove to provide an answer. I'll enter one here in case he does not.
To iterate through all of the elements of a string or vector by index, you want i to run from 0 to size()-1. The loop for (int i=0; i<str.size(); i++) already stops just short of size, i.e., at size()-1. So remove the -1's.
Second, C++ arrays are 0-based, so to report 1-based positions as in your example you must adjust (by adding 1 to the value that is pushed into the vector).
std::vector<int> StringsCompare(std::vector<std::string>& stringVector) {
std::vector<int> informative;
for (int i = 0; i < stringVector[0].size(); i++) {
for (int n = 1; n < stringVector.size(); n++) {
if (stringVector[n][i] != stringVector[n-1][i]) {
informative.push_back(i+1);
break;
}
}
}
return informative;
}
A few things to note about this code:
The function should take a const reference to vector, as the input vector is not modified. Not really a problem here, but for various reasons, it's a good idea to declare unmodified input references as const.
This assumes that all the strings are at least as long as the first. If that doesn't hold, the behavior of the code is undefined. For "production" code, you should include a check for the length prior to extracting the ith element of each string.
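The two notes above (a const reference and a length check) could be combined roughly as in the sketch below. The name differingPositions is hypothetical, and the handling of unequal lengths is an assumption: positions beyond the shortest string are simply ignored here, which may or may not be what you want.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns the 1-based positions at which the strings are not all identical.
// Assumption: only positions present in every string are compared.
std::vector<std::size_t> differingPositions(const std::vector<std::string>& strings) {
    std::vector<std::size_t> informative;
    if (strings.empty()) return informative;

    // Length check: find the shortest string so indexing never goes out of range.
    std::size_t common = strings[0].size();
    for (const std::string& s : strings) common = std::min(common, s.size());

    for (std::size_t i = 0; i < common; ++i) {
        for (std::size_t n = 1; n < strings.size(); ++n) {
            if (strings[n][i] != strings[0][i]) {
                informative.push_back(i + 1);  // 1-based, as in the answer above
                break;
            }
        }
    }
    return informative;
}
```

With the strings "Hello", "Heleo" and "Hidey" this yields the positions 2, 3, 4 and 5. Comparing every string against the first one is equivalent to comparing neighbours, since all characters are equal exactly when each equals the first.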

How to reduce the time complexity to find the longest zigzag sequence?

I was trying to solve the ZigZag sequences problem on TopCoder. The time complexity of my code is O(n*n). How can I reduce it to O(n) or O(n log n)?
Pseudocode or an explanation of the algorithm would be really helpful.
Here is the problem statement.
Problem Statement
A sequence of numbers is called a zig-zag sequence if the differences between successive numbers strictly alternate between positive and negative. The first difference (if one exists) may be either positive or negative. A sequence with fewer than two elements is trivially a zig-zag sequence.
For example, 1,7,4,9,2,5 is a zig-zag sequence because the differences (6,-3,5,-7,3) are alternately positive and negative. In contrast, 1,4,7,2,5 and 1,7,4,5,5 are not zig-zag sequences, the first because its first two differences are positive and the second because its last difference is zero.
Given a sequence of integers, sequence, return the length of the longest subsequence of sequence that is a zig-zag sequence. A subsequence is obtained by deleting some number of elements (possibly zero) from the original sequence, leaving the remaining elements in their original order.
And here is my code
#include <iostream>
#include<vector>
#include<cstring>
#include<cstdio>
using namespace std;
class ZigZag
{
public:
int dp[200][2];
void print(int n)
{
for(int i=0;i<n;i++)
{
cout<<dp[i][0]<<endl;
}
}
int longestZigZag(vector<int> a)
{
int n=a.size();
//int dp[n][2];
for(int i=0;i<n;i++)
{
cout<<a[i]<<" "<<"\t";
}
cout<<endl;
memset(dp,0,sizeof(dp)); // arguments are (pointer, value, byte count)
dp[0][1]=dp[0][0]=1;
for(int i=1;i<n;i++)
{
dp[i][1]=dp[i][0]=1;
for(int j=0;j<i;j++)
{
if(a[i]<a[j])
{
dp[i][0]=max(dp[j][1]+1,dp[i][0]);
}
if(a[j]<a[i])
{
dp[i][1]=max(dp[j][0]+1,dp[i][1]);
}
}
cout<<dp[i][1]<<"\t"<<dp[i][0]<<" "<<i<<endl;
//print(n);
}
cout<<dp[n-1][0]<<endl;
return max(dp[n-1][0],dp[n-1][1]);
}
};
You can do it in O(n) using a greedy approach. Take the first non-repeating number: this is the first number of your zigzag subsequence. Check whether the next number in the array is less than or greater than the first number.
Case 1: If it is less, keep moving forward until you find the local minimum, i.e. the element whose successor is greater than it. This is your second element.
Case 2: If it is greater, keep moving forward until you find the local maximum, i.e. the element whose successor is less than it. This is your second element.
If you used Case 1 to find the second element, use Case 2 to find the third element, or vice versa. Keep alternating between these two cases until you run out of elements in the original sequence. The numbers you picked form the longest zigzag subsequence.
Eg: { 1, 17, 5, 10, 13, 15, 10, 5, 16, 8 }
The resulting subsequence:
1 -> 1,17 (Case 2) -> 1,17,5 (Case 1) -> 1,17,5,15 (Case 2) -> 1,17,5,15,5 (Case 1) -> 1,17,5,15,5,16 (Case 2) -> 1,17,5,15,5,16,8 (Case 1)
Hence the length of the longest zigzag subsequence is 7.
You can refer to sjelkjd's solution for an implementation of this idea.
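As a rough sketch of the greedy idea (not sjelkjd's code, which is not reproduced here), one can count how often the direction of consecutive differences changes, skipping equal neighbours. The function name longestZigZagGreedy is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Greedy O(n) length of the longest zigzag subsequence.
// Every change of direction between consecutive elements adds one element
// (a local maximum or minimum) to the subsequence; equal neighbours are skipped.
int longestZigZagGreedy(const std::vector<int>& a) {
    if (a.empty()) return 0;
    int length = 1;
    int prevSign = 0;  // sign of the last counted difference; 0 = none yet
    for (std::size_t i = 1; i < a.size(); ++i) {
        int sign = (a[i] > a[i - 1]) - (a[i] < a[i - 1]);  // +1, 0 or -1
        if (sign != 0 && sign != prevSign) {
            ++length;
            prevSign = sign;
        }
    }
    return length;
}
```

For { 1, 17, 5, 10, 13, 15, 10, 5, 16, 8 } this returns 7, in line with the walk-through above.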
As the subsequence does not necessarily have to be contiguous, you can't make it O(n). In the worst case the complexity is O(2^n). However, I added some checks to cut off subtrees as soon as possible.
int maxLenght;
void test(vector<int>& a, int sign, int last, int pos, int currentLenght) {
if (maxLenght < currentLenght) maxLenght = currentLenght;
if (pos >= a.size() || pos >= a.size() + currentLenght - maxLenght) return;
if (last != a[pos] && (last - a[pos] >= 0) != sign)
test(a,!sign,a[pos],pos+1,currentLenght+1);
test(a,sign,last,pos+1,currentLenght);
}
int longestZigZag(vector<int>& a) {
maxLenght = 0;
test(a,0,a[0],1,1);
test(a,!0,a[0],1,1);
return maxLenght;
}
You can use RMQs to remove the inner for-loop. When you find the answer for dp[i][0] and dp[i][1], save it in two RMQ trees - say, RMQ0 and RMQ1 - just like you're doing now with the two rows of the dp array. So, when you calculate dp[i][0], you put the value dp[i][0] on position a[i] in RMQ0, meaning that there is a zig-zag sequence with length dp[i][0] ending increasingly with number a[i].
Then, in order to calculate dp[i + 1][0], you don't have to loop through all the numbers between 0 and i. Instead, you can query RMQ0 for the largest number on position > a[i + 1]. This will give you the longest zig-zag subsequence ending with a number larger than the current one - i.e. the longest one that can be continued decreasingly with the number a[i + 1]. Then you can do the same for RMQ1 for the other half of the zig-zag subsequences.
Since you can implement dynamic RMQ with query complexity of O(log N), this gives you an overall complexity of O(N log N).
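One possible way to realise this is sketched below. Note the substitutions: instead of a general RMQ tree it uses a Fenwick tree that answers prefix-maximum queries, and it compresses the values to ranks first so that positions stay small. The names MaxFenwick and longestZigZagRmq are made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Fenwick tree with "raise position to at least v" and prefix-maximum query.
struct MaxFenwick {
    std::vector<int> tree;
    explicit MaxFenwick(std::size_t n) : tree(n + 1, 0) {}
    void update(std::size_t pos, int v) {  // pos is 1-based
        for (; pos < tree.size(); pos += pos & (~pos + 1))
            tree[pos] = std::max(tree[pos], v);
    }
    int query(std::size_t pos) const {     // max over [1, pos]; 0 if pos == 0
        int best = 0;
        for (; pos > 0; pos -= pos & (~pos + 1))
            best = std::max(best, tree[pos]);
        return best;
    }
};

int longestZigZagRmq(const std::vector<int>& a) {
    if (a.empty()) return 0;
    std::vector<int> sorted(a);            // coordinate compression
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());
    const std::size_t m = sorted.size();

    MaxFenwick endsDown(m);  // by rank: best length whose last step goes down
    MaxFenwick endsUp(m);    // by reversed rank: best length whose last step goes up
    int answer = 1;
    for (int x : a) {
        std::size_t rank = std::lower_bound(sorted.begin(), sorted.end(), x)
                           - sorted.begin() + 1;
        std::size_t rev = m + 1 - rank;
        // Step up to x: best sequence ending in a strictly smaller value.
        int up = endsDown.query(rank - 1) + 1;
        // Step down to x: best sequence ending in a strictly larger value.
        int down = endsUp.query(rev - 1) + 1;
        endsUp.update(rev, up);
        endsDown.update(rank, down);
        answer = std::max(answer, std::max(up, down));
    }
    return answer;
}
```

Each element costs one binary search, two queries and two updates, all O(log N), which gives the O(N log N) total mentioned above. A single element counts as length 1 in both trees, so it can be extended in either direction.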
You can solve this problem in O(n) time and O(n) extra space.
The algorithm goes as follows.
Store the differences between consecutive terms in a new array of size n-1.
Now traverse the new array and check whether the product of each pair of adjacent differences is less than zero.
Increment the result accordingly. If, while traversing, you find a product that is not less than zero, store the result and start counting again for the rest of the elements in the difference array.
Find the maximum among these counts, store it in result, and return (result+1).
Here is its implementation in C++:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
ios_base::sync_with_stdio(false);
int n;
cin>>n;
vector<int> data(n);
for(int i = 0; i < n; i++)
cin>>data[i];
vector<int> diff(n-1);
for(int i = 1; i < n; i++)
diff[i-1] = data[i]-data[i-1];
int res = 1;
if( n < 2)
cout<<res<<"\n";
else
{
int temp_idx = 0;
for(int i = 1; i < n-1; i++)
{
if(diff[i]*diff[i-1] < 0)
{
temp_idx++;
res++;
}
else
{
res = max(res,temp_idx);
temp_idx = 1;
}
}
cout<<res+1<<"\n";
}
return 0;
}
This is a purely theoretical solution. This is how you would solve it if you would be asked for it in an academical environment, standing next to the chalkboard.
The solution to the problem can be created using dynamic programming:
The subproblem has the form of: if I have an element x of the sequence, what is the longest subsequence that is ending on that element?
Then you can work out your solution using recursive calls, which should look something like this (the directions of the relations might be wrong, I haven't checked them):
S - given sequence (array of integers)
P(i), Q(i) - length of the longest zigzag subsequence of the elements S[0 -> i] inclusive whose last element is S[i]; P covers subsequences whose last step goes down, Q those whose last step goes up
P(i) = {if i == 0 then 1
{1 + max(Q(j) if S[i] < S[j] for every 0 <= j < i)
Q(i) = {if i == 0 then 0 #yields 0 because we are pedantic about "is zig the first relation, or is it zag?". If we aren't, then this can be a 1.
{1 + max(P(j) if S[i] > S[j] for every 0 <= j < i)
This should be O(n^2) with the right memoization (storing each output of Q(i) and P(i)): each of the 2n subproblems is computed only once, but each one takes a maximum over up to n earlier elements.
These calls return the length of the solution - the actual result can be found by storing "parent pointer" whenever a max value is found, and then traversing backwards on these pointers.
You can avoid the recursion simply by substituting function calls with array lookups: P[i] and Q[i], and using a for loop.