Matching sequences in string - c++

Given three string sequences: "abb", "ab" and "a".
I need an algorithm to check whether a string can be parsed using these sequences. For example:
The string "abbabab" can be parsed as the sequences "abb", "ab" and "ab".
The string "abbbaaa" cannot be parsed, because we don't have a "b" sequence.
I have written the following code, but I feel it is not the right algorithm. Any suggestions?
bool checkIsStringCanBeParsed(std::string S) {
std::string seqArray[6]={"abb","ab","a","abb","ab","a"};
int index=0;
int lastIndex=0;
for(int i=0;i<6;++i)
{
lastIndex =index;
for(int idx=0;idx<seqArray[i].length();++idx)
{
if(index >= S.length())
{
index =lastIndex;
break;
}
if(S[index] == seqArray[i][idx])
{
++index;
continue;
}
else
break;
}
if(index == S.length())
return true;
}
return false;
}

What you are trying to do is build a regular expression engine that accepts strings matching the expression (abb|ab|a)*. One option is to use a finite automaton to represent that regular expression. Using an automaton-generation tool I was able to produce a state graph for it (the image is not reproduced here).
The graph has 4 states, numbered 0 to 3. A string is accepted by your rules if it is accepted by this graph, meaning that when you read the string char by char you can navigate through the graph using only valid steps. Parsing always starts at state 0.
For example, the string "aaba" leads us through state 0, state 1, state 1, state 2, state 1, so the string is valid because we were able to parse it completely. The string "abbb" leads us through state 0, state 1, state 2, state 3, but there is no way to leave state 3 with another 'b', so this string is not valid.
Pseudocode to do this:
boolean accept(word, state)
{
    if(word.length == 0) // either the string is empty or the parsing has ended successfully
    {
        return true;
    }
    else
    {
        parsingChar = word[0] // first char of string
        switch state
        case 0: // nothing read yet
            if (parsingChar == 'a')
                return accept(substring(word,1),1); // recursive call, we remove the first char and move to state 1
            else
                return false; // the first char is not an 'a', so the word is not accepted
        case 1: // just read "a"
            if (parsingChar == 'a')
                return accept(substring(word,1),1); // a new sequence starts, stay in state 1
            else if (parsingChar == 'b')
                return accept(substring(word,1),2); // move to state 2
            else
                return false;
        case 2: // just read "ab"
            if (parsingChar == 'a')
                return accept(substring(word,1),1); // a new sequence starts, move to state 1
            else if (parsingChar == 'b')
                return accept(substring(word,1),3); // move to state 3
            else
                return false;
        case 3: // just read "abb"
            if (parsingChar == 'a')
                return accept(substring(word,1),1); // a new sequence starts, move to state 1
            else
                return false; // a third 'b' cannot be matched
    }
}
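The same automaton can be written as a loop instead of a recursion. The following is a minimal sketch, assuming the four states described above (0 = start, 1 = after "a", 2 = after "ab", 3 = after "abb"); the function name matchesSequences is made up for illustration.

```cpp
#include <string>

// Hypothetical helper: returns true if s matches (abb|ab|a)*.
// States: 0 = start, 1 = after "a", 2 = after "ab", 3 = after "abb".
bool matchesSequences(const std::string& s) {
    int state = 0;
    for (char c : s) {
        if (c == 'a') {
            state = 1;                      // 'a' starts a new sequence from any state
        } else if (c == 'b' && (state == 1 || state == 2)) {
            ++state;                        // "a" -> "ab" -> "abb"
        } else {
            return false;                   // 'b' from state 0 or 3, or any other char
        }
    }
    return true;                            // every state is accepting
}
```

With this sketch, matchesSequences("abbabab") should return true and matchesSequences("abbbaaa") should return false.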

You should use regex. A regex for your sequences is "^((abb)|(ab)|(a))*$". The regex library will optimize it for you.
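As a rough illustration of the regex suggestion, here is a sketch using std::regex from the C++11 standard library. The function name matchesWithRegex is an assumption. Because std::regex_match only succeeds when the whole string matches, the ^ and $ anchors can be dropped. How well the pattern is optimized depends on the library implementation, so this may be slower than a hand-written automaton on long inputs.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole string is a concatenation
// of "abb", "ab" and "a" (the empty string also matches).
bool matchesWithRegex(const std::string& s) {
    static const std::regex pattern("(abb|ab|a)*");
    return std::regex_match(s, pattern);   // regex_match requires a full match
}
```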

Dynamic programming should work fine. Let's say dp(i) = true if and only if it is possible to parse prefix with length i with given sequences. Initially, dp(0) = true. Then one can compute dp values for all i the following way: if dp(j) = true and substring from j+1 to i matches one of the sequences then dp(i) = true.
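The dynamic programming idea can be sketched as follows. This version uses 0-based indices, so dp[i] says whether the first i characters can be parsed, and it propagates forward from each reachable position. The name canParse and the choice to pass the sequences as a vector are assumptions for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// dp[i] is true when the prefix of length i can be split into the given sequences.
bool canParse(const std::string& s, const std::vector<std::string>& seqs) {
    std::vector<bool> dp(s.size() + 1, false);
    dp[0] = true;                                   // the empty prefix is always parseable
    for (std::size_t j = 0; j < s.size(); ++j) {
        if (!dp[j]) continue;                       // position j is not reachable
        for (const std::string& q : seqs) {
            // compare() clamps the length, so a too-short remainder simply fails to match
            if (!q.empty() && s.compare(j, q.size(), q) == 0) {
                dp[j + q.size()] = true;
            }
        }
    }
    return dp[s.size()];
}
```

For example, canParse("abbabab", {"abb", "ab", "a"}) should be true. Unlike the hard-coded automaton, this works for any set of sequences, at a cost proportional to the string length times the total length of the sequences.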

Related

Index of the condition that was satisfied inside if-statement

if(command[i]=='H' or command[i]=='h' or command[i]=='C' or command[i]=='c'){
do something;
}
Once the logic flow goes inside this if-statement, I want to know what exactly command[i] was. Surely I can make individual comparisons again in the inside block and find out, but is there a more elegant way of knowing, say, the index of the condition that was satisfied?

If you use
if((myC=command[i]) =='H' ||
(myC=command[i]) =='h' ||
(myC=command[i]) =='C' ||
(myC=command[i]) =='c')
then the value of the successful expression will end up in myC, because evaluation in a chain of "or"s stops at the first true subexpression.
If you go one step further you can get a number value identifying the subexpression by index.
if(((myC=1), command[i]) =='H' ||
((myC=2), command[i]) =='h' ||
((myC=3), command[i]) =='C' ||
((myC=4), command[i]) =='c')
Same concept: the first successful subexpression is the last to be evaluated, and the , operator ensures that only its second operand gets used for the comparison.

Another option is to assign a value. You could use switch, an if..else tower, or a function with return statements. Here is a version with a function:
int classify( char command )
{
switch( command )
{
case 'H': return 1;
case 'h': return 2;
case 'C': return 3;
case 'c': return 4;
default : return 0;
}
}
void func(void)
{
int result = classify( command[i] );
if ( result )
{
// use result value here as appropriate
}
}
It would also be possible, in fact preferable, to use an enumerator instead of magic numbers.
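A possible shape for the enumeration variant is sketched below. The type name Command and its enumerator names are invented for this example. A scoped enum (enum class, C++11) is used so the values cannot be mixed up with plain integers.

```cpp
// Hypothetical enumeration replacing the magic numbers 0..4.
enum class Command { None, UpperH, LowerH, UpperC, LowerC };

// Maps a command character to the enumerator that identifies it.
Command classify(char command) {
    switch (command) {
        case 'H': return Command::UpperH;
        case 'h': return Command::LowerH;
        case 'C': return Command::UpperC;
        case 'c': return Command::LowerC;
        default:  return Command::None;   // not one of the four commands
    }
}
```

The caller would then test `classify(command[i]) != Command::None` in place of the original if-condition, and branch on the returned value inside the block.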

Just do this -
if(command[i]=='H' or command[i]=='h' or command[i]=='C' or command[i]=='c'){
print command[i]; //use whatever command is appropriate for printing
do something;
}

C++ char-by-char comparison of a string

I'm trying to work on a string comparison check for an introductory C++ course; it's an online course and unfortunately the instructor is not very responsive. For a current lab, I need to perform a number of manipulations on string data.
Currently, I'm working on a step to check if a string has any repeated characters, and if a repetition is found, to delete the repeated characters at their present spot and move one copy of the letter to the beginning of the string. This is only to be done for the first double to be found.
I've set up a basic counter to move through the string looking for matches, checking a stored character (updated on each iteration) to the current position in the string.
I tried multiple string functions (comparing the current inputString[i] to the previous one, stored as a second string tempStore), but those always gave char conversion errors. I've tried the code below instead, but this now gives the error "invalid conversion from 'char' to 'const char*'".
inputString is given by the user, testA and testB are defined as type char
Any ideas?
while (opComplete == false) {
if (i == 0) {
i++;
}
else if (i == inputString.size()) {
//Not Found
opComplete = true;
}
else if (i > 0) {
testA = inputString[i-1];
testB = inputString[i];
if (strcmp(testA,testB) != 0) {
i++;
}
else {
inputString.insert(0,inputString[i]);
inputString.erase(i,1);
inputString.erase(i-1,1);
opComplete = true;
}
}
}

Your problem is in this line:
inputString.insert(0,inputString[i]);
The std::string::insert() overload that matches the way you call it here has the following signature:
string& insert ( size_t pos1, const char* s );
so it expects a pointer to a null-terminated string. You, however, are giving it inputString[i]. std::string::operator[] returns a reference to a single char, hence the error. To insert a single character, use the overload that takes a count and a char. By the time you reach your else, you already have the desired character in your testB variable, so you can just change the line to
inputString.insert(0, 1, testB);
Passing &testB would compile, but a lone char is not a null-terminated string, so insert would read past it.
You also can't pass plain chars into strcmp. You can use operator==, or, in your case, operator!= instead.

You are using the insert method incorrectly; check its reference for the possible arguments.
while (opComplete == false)
{
    if (i == 0)
        i++;
    else if (i == inputString.size())
        opComplete = true; // not found
    else if (i > 0) {
        char testA = inputString[i-1];
        char testB = inputString[i];
        if(testA!=testB)
            i++;
        else {
            inputString.erase(i-1,2);      // remove both copies first, so the indices are still valid
            inputString.insert(0,1,testB); // problem corrected here: insert one copy (count 1) at the front
            opComplete = true;
        }
    }
}
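For comparison, the whole step can be written as a small standalone function. This is only a sketch under the reading that "first double" means the first pair of equal adjacent characters; the name moveFirstDoubleToFront is made up. It erases the pair before inserting, which avoids the index shift that an insert at position 0 would cause.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: finds the first pair of equal adjacent characters,
// removes both, and puts one copy at the front. Returns s unchanged if none.
std::string moveFirstDoubleToFront(std::string s) {
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (s[i - 1] == s[i]) {
            const char doubled = s[i];
            s.erase(i - 1, 2);        // remove the pair at its present spot
            s.insert(0, 1, doubled);  // insert(pos, count, char): one copy at the front
            break;                    // only the first double is handled
        }
    }
    return s;
}
```

For instance, moveFirstDoubleToFront("abccd") should give "cabd".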

write trie parsing recursive function with node step over

The purpose: This function parses through a string trie following the path that matches an input string of characters. When all the char in the string are parsed, true is returned. I want to step over a char and return if there is still a valid path.
The application: the strings are a location hierarchy for a highway project. So, project 5 has an alignment C, which has an offset of N and a workzone 3: 5CN3. But sometimes I want to define a string that covers all child locations for a project task. So '0' means all locations: a half-day operation like grading dirt has no specific workzone, so the task is represented as all workzones in the north offset of alignment C: 5CN0. The same applies if an operation covers the whole project: 5000.
Approaches: I could have used a wildcard '?' function, but I want to keep this specific step-over for the purpose of abstracting the locations. Maybe '?' is the right approach, but it seems to lose some control. Also, this could be written without the for loop, using a position index parameter; maybe that is where this goes wrong, possibly on backtracking.
Code: nodeT is the trie nodes, word is the input string, this function is a bool and returns 1/0 if the string path exists.
bool Lexicon::containsWordHelper(nodeT *w, string word) //check if prefix can be combined
{
if(word == "") { //base case: all char found
return true;
} else {
for(int i = 0; i < w->alpha.size(); i++) { //Loop through all of the children of the current node
if (w->alpha[i].letter == word[0])
return containsWordHelper(w->alpha[i].next, word.substr(1));
else if (word[0] == '0') //if '0' then step over and continue searching for valid path
containsWordHelper(w->alpha[i].next, word.substr(1)); //removed return here to allow looping through all the possible paths
} //I think it is continuing through after the loop and triggering return false
}
return false; //if char is missing - meaning the exact code is not there
}
The problem is that this returns false when a '0' wildcard is used. What is going wrong here? My knowledge is limited.
I hacked on this problem for awhile and used the 'howboutthis howboutthat' approach, and found that placing the return at the end of the step over statement works.
bool Lexicon::containsWordHelper(nodeT *w, string word, int &time, int &wag, string compare) //check if prefix can be combined
{
if(word == "") { //base case: all letters found
if ((w->begin-wag) <= time && time <= (w->end+wag))
return w->isWord; // case 2: timecard check for high/low date range
else if (time == ConvertDateToEpoch(9999, 01, 01)) return w->isWord; //this is for single code lookup w/o date
} else {
for(int i = 0; i < w->alpha.size(); i++) { //Loop through all of the children of the current node
if (w->alpha[i].letter == word[0])
return containsWordHelper(w->alpha[i].next, word.substr(1), time, wag, compare);
else if (word[0] == 'ž')
if (containsWordHelper(w->alpha[i].next, word.substr(1), time, wag, compare)) return true;
}
}
return false; //if char is missing - meaning the exact code is not there
}
It seems logical that if I only want the path that ends in true to return, then I should place the return after the recursion is done and conditionally pass the result back only if it is true. It works and seems logical in retrospect, but my confidence in this is sketchy at best.
I still have the same question. What is/was going wrong?

You could test the result of the latter containsWordHelper call and return true if the result is true, else continue iterating.

Solved: place a return after an if statement containing the recursive call
bool Lexicon::containsWordHelper(nodeT *w, string word)
{
if(word == "") return w->isWord;
else {
for(int i = 0; i < w->alpha.size(); i++) {
if (w->alpha[i].letter == word[0])
return containsWordHelper(w->alpha[i].next, word.substr(1));
else if (word[0] == 'ž')
if (containsWordHelper(w->alpha[i].next, word.substr(1))) return true;
}
}
return false;
}
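As to what was going wrong: in the first version the step-over branch called containsWordHelper but threw its result away, so a successful path was never reported and control always fell through to the final return false. The sketch below shows the same idea on a small self-contained trie. TrieNode, insertWord and containsWord are illustrative stand-ins, not the Lexicon class from the question, and '0' is assumed to be the step-over character.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Illustrative trie node (not the nodeT from the question).
struct TrieNode {
    bool isWord = false;
    std::map<char, std::unique_ptr<TrieNode>> children;
};

void insertWord(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char c : word) {
        std::unique_ptr<TrieNode>& child = node->children[c];
        if (!child) child.reset(new TrieNode());
        node = child.get();
    }
    node->isWord = true;
}

// '0' in the query steps over one level: any child may match at that position.
bool containsWord(const TrieNode& node, const std::string& word, std::size_t pos = 0) {
    if (pos == word.size()) return node.isWord;       // base case: all chars consumed
    if (word[pos] == '0') {
        for (const auto& entry : node.children) {
            // Propagate success upward; keep looping on failure.
            if (containsWord(*entry.second, word, pos + 1)) return true;
        }
        return false;                                  // no child led to a valid path
    }
    auto it = node.children.find(word[pos]);
    return it != node.children.end() && containsWord(*it->second, word, pos + 1);
}
```

Passing a position index, not word.substr(1), avoids copying the string at every level, which is the alternative the question mentions.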

input string validation without external libraries for c++

I need to validate one input string from a user. Eventually it will need to break down into two coordinates. ie a4 c3. And once they are coordinates they need to be broken out into 4 separate ints. a=0 b=1, etc. They must also follow the following stipulations:
If an end-of-input signal is reached the program quits.
Otherwise, all non-alphanumeric characters are discarded from the input.
If what remains is the single letter 'Q'
Then the program quits.
If what remains consists of 4 characters, with one letter and one digit among the first two characters and one letter and one digit among the last two characters, and if each letter-digit pair is in the legal range for our grid
Then input is acceptable.
I have completely over-thought and ruined my function. Please let me know where I can make some corrections.
I am mainly having trouble going from one string, to four chars if and only if the data is valid. Everything else I can handle.
Here is what I have so far.
void Grid::playerMove()
{
string rawMove;
string pair1 = " ";
string pair2 = " ";
bool goodInput = false;
char maxChar = 'a';
char chary1, chary2;
int x11,x22,y11,y22;
for (int i =0; i<size; i++)
{
maxChar++;
}
while(!goodInput)
{
cout<<"What two dots would you like to connect? (Q to quit) ";
cin>>rawMove;
rawMove = reduceWords(rawMove);
if (rawMove == "Q")
{
cout<<"end game";
goodInput = false;
}
else if (rawMove.size() == 4)
{
for(int j=0;j<2;j++)
{
if (pair1[j] >='a' && pair1[j] <=maxChar)
{
chary1 = pair1[j];
}
else if(pair1[j] >=0 && pairl[j]<=size+1)
{
x1 = pair1[j];
}
}
for(int k=0;k<2;k++)
{
if (pair2[k] >='a' && pair2[k] <=maxChar)
{
chary2 = pair2[k];
}
else if(pair2[k] >=0 && pair2[k]<=size+1)
{
x2 = pair2[k];
}
}
}
if(char1 != NULL && char2 != NULL && x1 !=NULL && x2 != NULL)
{
for (int m = 0; m <= size m++)
{
if (char1 == m;)
{
x1 = m;
}
}
for (int n = 0; n <= size n++)
{
if (char2 == n)
{
x2 = n;
}
}
}
}
The end goal would be to have x1, x2, y1, and y2 with their respective values.
Keep in mind I am not allowed to have any external libraries.

It's not clear what exactly you want to achieve, but here are some pointers to get you started:
The while loop will never end because you're setting goodInput to false on quit which lets the loop continue.
The code probably does not even compile: a closing curly brace is missing.
You initialize pair1 and pair2 to placeholder strings but never change them again, so they will never contain any real information about your moves
maybe what you really want is to split up rawMove into the pair1 and pair2 substrings first?

Since this is homework - and you're supposed to learn from it (right?) - I'm not going to give you the complete answer, but rather something like a recipe:
Use std::istream::getline(char*, std::streamsize s) to read a whole line from std::cin. Make sure you allocate a buffer large enough to hold the expected input (including the terminating null character) plus some more for invalid characters. After the call, check the failbit (input was too long) and the eofbit (hit the end-of-input) of the std::cin stream and handle those cases. Construct a std::string from the buffer if there was no error or EOF has not been reached.
Write a character-classification function (e.g. call it isAlNum(char c)) that returns true if the char argument is alpha-numeric, and false otherwise.
Combine std::string::erase(), std::remove_if(), std::not1(), std::ptr_fun() and your function isAlNum() to sanitise the input string.
Write a function that validates and parses the coordinates from the sanitised input string and call it with the sanitised input string.
Wrap the whole thing in an appropriate while() loop.
This should get you started in the right direction. Of course, if you're allowed to use C++11 features and you know how to write good regular expressions, by all means, use the <regex> header instead of doing the parsing manually.
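The sanitising and parsing steps of the recipe might look like the sketch below. It uses only the standard library. A lambda replaces std::not1 and std::ptr_fun, which are deprecated or removed in newer standards. The names Move, keepAlnum, parsePair and parseMove are invented here, and several details are assumptions not stated in the question: lowercase letters only, ASCII, rows numbered from 1, and a grid of at most 9 by 9.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical result type: zero-based column/row for both dots.
struct Move { int col1, row1, col2, row2; };

// Discards every non-alphanumeric character (erase-remove idiom).
std::string keepAlnum(std::string s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char c) { return !std::isalnum(c); }),
            s.end());
    return s;
}

// Accepts a letter-digit pair in either order, e.g. "a4" or "4a".
// Assumption: 'a' maps to column 0 and digit '1' maps to row 0.
bool parsePair(char first, char second, int size, int& col, int& row) {
    if (std::isdigit(static_cast<unsigned char>(first))) std::swap(first, second);
    col = first - 'a';    // negative or too large if first is not a legal letter
    row = second - '1';   // negative or too large if second is not a legal digit
    return col >= 0 && col < size && row >= 0 && row < size;
}

// Returns true and fills out only when the sanitised input is a valid move.
bool parseMove(const std::string& raw, int size, Move& out) {
    const std::string s = keepAlnum(raw);
    if (s.size() != 4) return false;
    return parsePair(s[0], s[1], size, out.col1, out.row1) &&
           parsePair(s[2], s[3], size, out.col2, out.row2);
}
```

The caller would read a whole line (for example with std::getline, so that the space in "a4 c3" does not cut the input short), check for end-of-input and for the single letter Q after sanitising, and loop until parseMove returns true.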

C++ and GetAsyncKeyState() function

As it gives only upper case letters, any idea how to get lower case?
If the user simultaneously pressed SHIFT+K, or CAPSLOCK is on, etc., I want to get the right case.
Is it possible this way or another?
Thanks,

Suppose "c" is the variable you put into GetAsyncKeyState().
You may use the following method to detect whether you should print upper case letter or lower case letter.
string out = "";
bool isCapsLock() { // Check if CapsLock is toggled
if ((GetKeyState(VK_CAPITAL) & 0x0001) != 0) // If the low-order bit is 1, the key is toggled
return true;
else
return false;
}
bool isShift() { // Check if shift is pressed
if ((GetKeyState(VK_SHIFT) & 0x8000) != 0) // If the high-order bit is 1, the key is down; otherwise, it is up.
return true;
else
return false;
}
if (c >= 65 && c <= 90) { // A-Z
    if (!(isShift() ^ isCapsLock())) { // lower case unless exactly one of Shift and CapsLock is active
        c += 32; // in ascii table A=65, a=97. 97-65 = 32
    }
    out = c;
}
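The case decision itself does not depend on the Windows API, so it can be pulled out into a plain function that takes the two key states as booleans. This is a sketch with an invented name (applyCase). The caller would pass the results of checks such as isShift() and isCapsLock() above.

```cpp
// Hypothetical helper: given an upper-case letter from a virtual-key code,
// returns the character the user actually typed. Other characters pass through.
char applyCase(char upper, bool shiftDown, bool capsLockOn) {
    if (upper < 'A' || upper > 'Z') return upper;
    // Shift and CapsLock cancel each other out: upper case only if exactly one is active.
    const bool wantUpper = (shiftDown != capsLockOn);
    return wantUpper ? upper : static_cast<char>(upper - 'A' + 'a');
}
```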

As you rightly point out, it represents a key and not an upper- or lower-case character. Therefore, another call to ::GetAsyncKeyState(VK_SHIFT) can help you determine whether the shift key is down, and you will then be able to modify the result of your subsequent call appropriately.
