LeetCode Word Break, fail on Online Judge but pass Online test - c++

I ran into a problem while doing LeetCode 139, Word Break.
Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words. (Each dictionary word can be used multiple times.)
For example, given
s = "leetcode",
dict = ["leet", "code"].
Return true because "leetcode" can be segmented as "leet code".
I use a basic dynamic programming algorithm, but I may have implemented it differently from the popular version on the internet.
Here is the code:
class Solution {
public:
    bool wordBreak(string s, unordered_set<string>& wordDict) {
        int strlen = s.length();
        if(0 == strlen) return true;
        vector<bool> sepable(false, strlen);
        for(int i = 0; i < strlen; ++i) {
            if(wordDict.count(s.substr(0,i+1)) > 0) {
                sepable[i] = true;
                continue;
            }
            for(int j = 0; j < i; ++j) {
                if(sepable[j] && wordDict.count(s.substr(j+1,i-j)) > 0) {
                    sepable[i] = true;
                    break;
                }
            }
        }
        return sepable[strlen-1];
    }
};
When I submit it to the online judge, it fails on the test "aaaaaaa", ["aaaa","aa"]: my code outputs true, but the expected answer is false. However, if I run the same input as a single online test, it gives the right output. It also works fine on my own virtual machine with clang++.
The difference between the online judge and the online test is that an online test runs only one test case, while the online judge runs many and fails if any one of them fails. So the problem with my code may be this: on some test before "aaaaaaa" it gives the right output but leaves behind a latent problem, and that is why it then fails on "aaaaaaa". If I run only this single test, it is fine.
The LeetCode website says this may be because my code has undefined behavior, so that an earlier test case can influence a later one. I don't know what the earlier test cases are, and I don't expect anyone here to know them. But I think that as long as there is a problem in my code, someone can find it.
I think the question is pretty clear this time.

The arguments on this line are in the wrong order:
vector<bool> sepable(false, strlen);
It should be:
vector<bool> sepable(strlen, false);
The length of the vector comes first, then the default value. In the original call, false is implicitly converted to the size 0, so the vector is empty, and every access to sepable[i] is out of bounds, which is undefined behavior.
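For reference, here is a minimal sketch of the same algorithm with the constructor arguments swapped. It is written as a free function with const references rather than as the LeetCode Solution method, and it uses std::size_t for the indices; those are choices made for this illustration, not part of the original code.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Same DP as in the question, with the vector constructor arguments in the
// right order: size first, then the initial value.
bool wordBreak(const std::string& s,
               const std::unordered_set<std::string>& wordDict) {
    const std::size_t n = s.length();
    if (n == 0) return true;

    // sepable[i] is true when the prefix s[0..i] can be segmented.
    std::vector<bool> sepable(n, false);

    for (std::size_t i = 0; i < n; ++i) {
        // Case 1: the whole prefix s[0..i] is a dictionary word.
        if (wordDict.count(s.substr(0, i + 1)) > 0) {
            sepable[i] = true;
            continue;
        }
        // Case 2: a shorter segmentable prefix s[0..j] followed by a word.
        for (std::size_t j = 0; j < i; ++j) {
            if (sepable[j] && wordDict.count(s.substr(j + 1, i - j)) > 0) {
                sepable[i] = true;
                break;
            }
        }
    }
    return sepable[n - 1];
}
```

With the vector sized correctly, the failing judge input "aaaaaaa" with ["aaaa","aa"] no longer depends on whatever happens to be in memory.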

Related

What is wrong with this string palindrome program?

THIS PROBLEM IS SOLVED
Given a string, determine if it is a palindrome, considering only alphanumeric characters and ignoring cases.
int Solution::isPalindrome(string A) {
    vector<int> v1,v2;
    int size = A.size(),a;
    for(int i=0; i<size; i++){
        if(isalpha(A[i]) || isdigit(A[i]))
        {
            a = (int)A[i];
            v1.push_back(a);
        }
        if(isalpha(A[size-1-i]) || isdigit(A[size-1-i]))
        {
            a = (int)A[size-1-i];
            v2.push_back(a);
        }
    }
    if(v1==v2)
        return 1;
    return 0;
}
It doesn't ignore cases, because the comparison v1 == v2 is done with the same case as the original string.
Change
v1.push_back(a);
to
v1.push_back(tolower(a));
And similarly for v2.
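As an alternative to building two vectors, the same check can be sketched with two indices moving toward each other. This is a different technique from the code in the question, and the free-function name and bool return type are assumptions made for this illustration.

```cpp
#include <cctype>
#include <string>

// Two-index variant: skip non-alphanumeric characters and compare the
// remaining characters case-insensitively.
bool isPalindrome(const std::string& a) {
    std::size_t left = 0;
    std::size_t right = a.size();  // one past the last unchecked character
    while (left < right) {
        // Casting to unsigned char avoids undefined behavior in the
        // <cctype> functions for negative char values.
        const unsigned char l = static_cast<unsigned char>(a[left]);
        const unsigned char r = static_cast<unsigned char>(a[right - 1]);
        if (!std::isalnum(l)) { ++left; continue; }
        if (!std::isalnum(r)) { --right; continue; }
        if (std::tolower(l) != std::tolower(r)) return false;
        ++left;
        --right;
    }
    return true;
}
```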
Honestly just five minutes testing should have told you this, it's not a hard bug to find.
EDIT
Since I answered this question the code has been changed, to something that no longer compiles. I really don't feel inclined to answer ever-changing questions.
EDIT
Well it's changed again. At least it compiles now.

SCIPgetSolVal from SCIP solver is not returning the solution values

I have been using SCIP for some time and have never had problems with these functions. However, I recently ran several input scenarios to pull different results from SCIP.
There was a case where the SCIPgetSolVal function did not work as it should.
If I call SCIPprintBestSol(scipProblem, NULL, TRUE), it shows me the results and they are within limits. When I then get the results with the SCIPgetSolVal function, it returns values of magnitude 1e+99.
Note that it works in all but a few cases. The most logical explanation would be a problem with the input data, but in that case I would not expect SCIPprintBestSol to show viable results.
It looks to me like a memory allocation problem, but I don't understand why it happens or how I can fix it.
I don't know if it's relevant, but this is C++ on Windows.
SCIPsolve(scipProblem);
SCIPprintBestSol(scipProblem, NULL, TRUE); //VALID RESULTS
/* SCIP Problem Results */
SCIP_Status status = SCIPgetStatus(scipProblem);
if(status == SCIP_STATUS_INFEASIBLE)
{
    result = false;
}
else
{
    for(int t=0; t<100; t++)
    {
        result(t,0) = SCIPgetSolVal(scipProblem, sol, probVars.at(t));
        cout<< "Solution :"<< result(t,0) << endl;//WRONG RESULTS
    }
}

Searching a function with the cctype library to find the number of characters that are digits in a range

I'm trying to solve one of the questions I was given by an instructor, and I'm having trouble understanding how to approach it properly.
I'm given a function that is linked to a test driver and my goal is to use the cstring library to find any numbers in the range of 0-9 in a randomly generated string object with this function.
int countDigits(char * const line) {return 0;}
So far this is what I have:
int countDigits(char * const line)
{
    int i, index;
    index = -1;
    found = false;
    i = 0;
    while (i < *line && !found)
    {
        if (*line > 0 && *line < 9)
            index++;
    }
    return 0;
}
My code is not great and at the moment only results in an infinite loop and failure. Any help would be very much appreciated.
Well, there are several problems with your function.
You want it to return the number of digits, but it returns 0 in every case.
found is never set to anything other than false, so it never stops the while loop.
The comparison i < *line does not make much sense to me. I guess you want to check for the end of the line, so you probably want to look for the null terminator '\0'. (Here again, i is never set to anything other than 0.)
If you want to compare single characters, you should look up the ASCII codes of the characters you are comparing against: the digits '0' to '9' have codes 48 to 57, not 0 to 9.
Hope that is a start to improve your function.
There's a ready-made algorithm for this called count_if:
count_if(begin, end, [](unsigned char c){ return isdigit(c) != 0; });
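Putting the points above together, here is a minimal sketch of both approaches: a hand-written loop that stops at the null terminator, and the count_if one-liner. The parameter is const char* here so that string literals can be passed; the assignment's char * const signature should work with the same bodies. The name countDigitsLoop is made up for this illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstring>

// Hand-written loop: walk the C string until the null terminator and
// count the characters between '0' and '9'.
int countDigitsLoop(const char* line) {
    int count = 0;
    for (const char* p = line; *p != '\0'; ++p) {
        if (*p >= '0' && *p <= '9') {
            ++count;
        }
    }
    return count;
}

// Library version: std::count_if over [line, line + strlen(line)).
int countDigits(const char* line) {
    const char* end = line + std::strlen(line);
    // The lambda takes unsigned char because passing a negative char
    // value to std::isdigit is undefined behavior.
    return static_cast<int>(std::count_if(line, end, [](unsigned char c) {
        return std::isdigit(c) != 0;
    }));
}
```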

Need suggestion to improve speed for word break (dynamic programming)

The problem is: Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "hithere",
dict = ["hi", "there"].
Return true because "hithere" can be segmented as "hi there".
My implementation is below. This code is OK for normal cases. However, it is very slow for input like:
s = "aaaaaaaaaaaaaaaaaaaaaaab", dict = {"aa", "aaaaaa", "aaaaaaaa"}.
I want to memoize the processed substrings, but I cannot get it right. Any suggestions on how to improve it? Thanks a lot!
class Solution {
public:
    bool wordBreak(string s, unordered_set<string>& wordDict) {
        int len = s.size();
        if(len<1) return true;
        for(int i(0); i<len; i++) {
            string tmp = s.substr(0, i+1);
            if((wordDict.find(tmp)!=wordDict.end())
               && (wordBreak(s.substr(i+1), wordDict)) )
                return true;
        }
        return false;
    }
};
It's logically a two-step process. Find all dictionary words within the input, consider the found positions (begin/end pairs), and then see if those words cover the whole input.
So you'd get for your example
aa: {0,2}, {1,3}, {2,4}, ... {20,22}
aaaaaa: {0,6}, {1,7}, ... {16,22}
aaaaaaaa: {0,8}, {1,9} ... {14,22}
This is a graph, with nodes 0-23 and a bunch of edges. But node 23 (b) is entirely unreachable: it has no incoming edge. This is now a simple graph theory problem.
Finding all places where dictionary words occur is pretty easy if your dictionary is organized as a trie. But even an std::map is usable, thanks to its equal_range method. You have what appears to be an O(N*N) nested loop for begin and end positions, with O(log N) lookup of each word. But you can quickly determine if s.substr(begin,end) is still a viable prefix, and what dictionary words remain with that prefix.
Also note that you can build the graph lazily. Starting at begin=0 you find edges {0,2}, {0,6} and {0,8}. (And no others). You can now search nodes 2, 6 and 8. You even have a good algorithm - A* - that suggests you try node 8 first (reachable in just 1 edge). Thus, you'll find nodes {8,10}, {8,14} and {8,16} etc. As you see, you'll never need to build the part of the graph that contains {1,3} as it's simply unreachable.
Using graph theory, it's easy to see why your brute-force method breaks down. You arrive at node 8 (aaaaaaaa.aaaaaaaaaaaaaab) repeatedly, and each time search the subgraph from there on.
A further optimization is to run bidirectional A*. This would give you a very fast solution. At the second half of the first step, you look for edges leading to 23, b. As none exist, you immediately know that node {23} is isolated.
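The lazy graph search described above can be sketched as follows. This is a rough illustration only: it uses a plain breadth-first search rather than A* or the bidirectional variant, it keeps the dictionary in an unordered_set instead of a trie, and the function name wordBreakGraph is made up for the example.

```cpp
#include <queue>
#include <string>
#include <unordered_set>
#include <vector>

// Nodes are positions 0..n in s. An edge begin -> end exists when
// s.substr(begin, end - begin) is a dictionary word. The string can be
// segmented exactly when node n is reachable from node 0.
bool wordBreakGraph(const std::string& s,
                    const std::unordered_set<std::string>& wordDict) {
    const std::size_t n = s.size();
    std::vector<bool> visited(n + 1, false);
    std::queue<std::size_t> frontier;
    frontier.push(0);
    visited[0] = true;

    while (!frontier.empty()) {
        const std::size_t begin = frontier.front();
        frontier.pop();
        if (begin == n) return true;
        // Edges are discovered lazily, only for nodes that are reachable.
        for (const std::string& w : wordDict) {
            const std::size_t end = begin + w.size();
            if (w.empty() || end > n || visited[end]) continue;
            if (s.compare(begin, w.size(), w) == 0) {
                visited[end] = true;  // each node is expanded at most once
                frontier.push(end);
            }
        }
    }
    return false;
}
```

Because every position is marked as visited the first time it is reached, the subgraph behind a node such as 8 is searched once instead of once per path leading to it.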
In your code, you are not using dynamic programming because you are not remembering the subproblems that you have already solved.
You can enable this remembering, for example, by storing the results based on the starting position of the string s within the original string, or even based on its length (because the strings you are working with are all suffixes of the original string, so the length uniquely identifies each of them). Then, at the beginning of your wordBreak function, just check whether that length has already been processed and, if it has, do not rerun the computations, just return the stored value. Otherwise, run the computations and store the result.
Note also that your approach with unordered_set will not allow you to obtain the fastest solution. The fastest solution that I can think of is O(N^2) by storing all the words in a trie (not in a map!) and following this trie as you walk along the given string. This achieves O(1) per loop iteration not counting the recursion call.
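A minimal sketch of the memoization described above, keyed by the starting position of the suffix. It keeps the unordered_set lookup rather than switching to a trie, so it is not the fastest variant mentioned; the names Memo, canBreakFrom and wordBreakMemo are made up for this illustration.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Three-state memo so that "not computed yet" differs from "computed: no".
enum class Memo : char { Unknown, No, Yes };

// Returns true if the suffix of s starting at `start` can be segmented.
bool canBreakFrom(const std::string& s, std::size_t start,
                  const std::unordered_set<std::string>& wordDict,
                  std::vector<Memo>& memo) {
    if (start == s.size()) return true;  // empty suffix
    if (memo[start] != Memo::Unknown) return memo[start] == Memo::Yes;

    bool ok = false;
    for (std::size_t len = 1; start + len <= s.size() && !ok; ++len) {
        ok = wordDict.count(s.substr(start, len)) > 0 &&
             canBreakFrom(s, start + len, wordDict, memo);
    }
    memo[start] = ok ? Memo::Yes : Memo::No;  // remember the subproblem
    return ok;
}

bool wordBreakMemo(const std::string& s,
                   const std::unordered_set<std::string>& wordDict) {
    std::vector<Memo> memo(s.size(), Memo::Unknown);
    return canBreakFrom(s, 0, wordDict, memo);
}
```

Each starting position is solved at most once, so the repeated work on inputs like the long run of a's followed by b goes away.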
Thanks for all the comments. I changed my previous solution to the implementation below. I have not yet explored optimizing the dictionary, but those insights are very valuable and very much appreciated.
For the current implementation, do you think it can be further improved? Thanks!
class Solution {
public:
    bool wordBreak(string s, unordered_set<string>& wordDict) {
        int len = s.size();
        if(len<1) return true;
        if(wordDict.size()==0) return false;
        vector<bool> dq (len+1,false);
        dq[0] = true;
        for(int i(0); i<len; i++) {// start point
            if(dq[i]) {
                for(int j(1); j<=len-i; j++) {// length of substring, 1:len
                    if(!dq[i+j]) {
                        auto pos = wordDict.find(s.substr(i, j));
                        dq[i+j] = dq[i+j] || (pos!=wordDict.end());
                    }
                }
            }
            if(dq[len]) return true;
        }
        return false;
    }
};
Try the following:
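class Solution {
public:
    bool wordBreak(string s, unordered_set<string>& wordDict)
    {
        // Base case: the empty string is trivially segmentable. Without
        // this check the recursion can never return true.
        if (s.empty()) return true;
        for (const auto& w : wordDict)
        {
            auto pos = s.find(w);
            if (pos != string::npos)
            {
                // Both the part before the match and the part after it
                // must be segmentable.
                if (wordBreak(s.substr(0, pos), wordDict) &&
                    wordBreak(s.substr(pos + w.size()), wordDict))
                    return true;
            }
        }
        return false;
    }
};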
Essentially, once you find a match, remove the matching part from the input string and continue testing on the smaller pieces on either side of it.

Split a even-numbered string in c++

I am very new to C++. I am trying to split a string that contains even-length substrings until there is no even-length substring left. For example, if I input AB ABCD ABC, the output should be A B A B C D ABC. I am trying to do it without tokens, because I don't know how to use them.
What I have so far only splits the first even-length substring, and it doesn't work if I only have one substring. Can someone please help me out?
Any advice will be much appreciated. Thank you!
string temp = "";
void check(string &str, int &i, int &flag)
{
    int count = 0;
    int reminder;
    do
    {
        count++;
        temp += str[i];
        i++;
    } while (str[i] != ' ');
    i = i - temp.size();
    reminder = count % 2;
    if (reminder == 0)
        flag = 1;
    else
        flag = 0;
}
void SplitEvenWord(string &str)
{
    int i = 0;
    int flag = 0;
    for (i = 0; i < str.size(); i++)
    {
        check(str, i, flag);
        if (flag == 1)
        {
            temp.insert(temp.size() / 2, " ");
            str.replace(i, temp.size() - 1, temp);
        }
    }
}
There are two skills that are absolutely vital in software engineering (Well, more than two, but two for now): developing new functions in isolation, and testing things in the simplest possible way.
You say that the code fails if there is only one substring. You don't say how it fails (I should have mentioned clear error reports in the list) so I don't know whether to test your code with an even-length string which it ought to split ("ABCD" => "A B C D") or an odd-length string which it ought to leave alone ("ABC" => "ABC"). Before I try to code these up, I look at your first function:
void check(string &str, int &i, int &flag)
{
    ...
    do
    {
        count++;
        temp += str[i];
        i++;
    } while (str[i] != ' ');
    ...
}
Trouble already. The strings I have in mind do not contain any spaces, so the loop cannot terminate. This code will run past the end of the string into whatever happens to be in that memory space, which will cause undefined behavior. (If you don't know that term, it means that there's no telling what will happen, but if you're lucky the program will just crash.)
Fix that, try running that code on "ABC" and "ABCD" and "A" and "" and "ABC DEF", and get it working perfectly. Once it does, take a look at your other function. Don't test it with random typing, test it with short, clearly defined strings. Once it works perfectly, try longer, more complicated ones. If you find a string which causes it to fail, hold onto it! That string will lead you to a bug.
That should be enough to get you started.
I'm writing this as an answer because it was too long to fit as a comment.
I have a couple of suggestions that may help you to figure out what the problem is.
Separate "check" into at least two functions, one to split the string into individual words and check them and one to check the length of the string.
Test the "check" and "tokenize" functions separately and see if they give you the expected answers. Work on them individually until they are correct.
Separate the formatting of the answers out of "SplitEvenWord" into a separate function.
"SplitEvenWord" should then be nothing more than calling the functions you created as a result of the steps above.
When I'm stuck, I always try to break the problem down into small, bite-sized pieces that I know I can get working. Eventually, the problem becomes assembling the already working pieces of the solution into a larger function that solves the original problem.
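To illustrate that kind of decomposition, here is a minimal sketch with one small function per job. It assumes, based on the example in the question (AB ABCD ABC becomes A B A B C D ABC), that an even-length word is halved repeatedly until every piece has odd length. The names splitWord and splitEvenWords are made up, and the tokenizing is done with std::istringstream rather than the asker's index arithmetic.

```cpp
#include <sstream>
#include <string>

// Recursively halves a word while its length is even; odd-length pieces
// are kept whole. "ABCD" -> "A B C D", "ABC" -> "ABC".
std::string splitWord(const std::string& word) {
    if (word.empty() || word.size() % 2 != 0) return word;
    const std::size_t half = word.size() / 2;
    return splitWord(word.substr(0, half)) + " " + splitWord(word.substr(half));
}

// Tokenizes the input on whitespace, splits each word, and joins the
// results with single spaces.
std::string splitEvenWords(const std::string& input) {
    std::istringstream in(input);
    std::string word;
    std::string result;
    while (in >> word) {
        if (!result.empty()) result += ' ';
        result += splitWord(word);
    }
    return result;
}
```

Each function can be tested on its own with the short strings suggested earlier ("ABC", "ABCD", "A", "" and "ABC DEF") before the two are combined.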