I'm looking for a clean C++ way to parse a string containing expressions wrapped in ${} and build a result string from the programmatically evaluated expressions.
Example: "Hi ${user} from ${host}" will be evaluated to "Hi foo from bar" if I implement the program to let "user" evaluate to "foo", etc.
The current approach I'm thinking of consists of a state machine that eats one character at a time from the string and evaluates the expression after reaching '}'. Any hints or other suggestions?
Note: boost:: is most welcome! :-)
Update: Thanks for the first three suggestions! Unfortunately I made the example too simple! I need to be able to examine the contents within ${}, so it's not a simple search and replace. Maybe it will say ${uppercase:foo}, and then I have to use "foo" as a key in a hashmap and convert the result to uppercase, but I tried to avoid the inner details of ${} when writing the original question above... :-)
#include <iostream>
#include <conio.h>
#include <string>
#include <map>
using namespace std;
struct Token
{
enum E
{
Replace,
Literal,
Eos
};
};
class ParseExp
{
private:
enum State
{
State_Begin,
State_Literal,
State_StartRep,
State_RepWord,
State_EndRep
};
string m_str;
int m_char;
unsigned int m_length;
string m_lexme;
Token::E m_token;
State m_state;
public:
void Parse(const string& str)
{
m_char = 0;
m_str = str;
m_length = str.size();
}
Token::E NextToken()
{
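if (m_char >= m_length)
{
// Nothing left to read: report end of string instead of falling into the loop
m_token = Token::Eos;
return m_token;
}
m_lexme = "";
m_state = State_Begin;
bool stop = false;
// Strictly less than: m_str[m_length] is past the last character
while (m_char < m_length && !stop)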
{
char ch = m_str[m_char++];
switch (m_state)
{
case State_Begin:
if (ch == '$')
{
m_state = State_StartRep;
m_token = Token::Replace;
continue;
}
else
{
m_state = State_Literal;
m_token = Token::Literal;
}
break;
case State_StartRep:
if (ch == '{')
{
m_state = State_RepWord;
continue;
}
else
continue;
break;
case State_RepWord:
if (ch == '}')
{
stop = true;
continue;
}
break;
case State_Literal:
if (ch == '$')
{
stop = true;
m_char--;
continue;
}
}
m_lexme += ch;
}
return m_token;
}
const string& Lexme() const
{
return m_lexme;
}
Token::E Token() const
{
return m_token;
}
};
string DoReplace(const string& str, const map<string, string>& dict)
{
ParseExp exp;
exp.Parse(str);
string ret = "";
while (exp.NextToken() != Token::Eos)
{
if (exp.Token() == Token::Literal)
ret += exp.Lexme();
else
{
map<string, string>::const_iterator iter = dict.find(exp.Lexme());
if (iter != dict.end())
ret += (*iter).second;
else
ret += "undefined(" + exp.Lexme() + ")";
}
}
return ret;
}
int main()
{
map<string, string> words;
words["hello"] = "hey";
words["test"] = "bla";
cout << DoReplace("${hello} world ${test} ${undef}", words);
_getch();
}
I will be happy to explain anything about this code :)
How many evaluation expressions do you intend to have? If the number is small enough, you might just want to use brute force.
For instance, if you have a std::map<string, string> that goes from your key to its value, for instance user to Matt Cruikshank, you might just want to iterate over your entire map and do a simple replace on your string of every "${" + key + "}" to its value.
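A minimal sketch of that brute-force idea, assuming the keys are plain identifiers and the values never contain placeholders themselves. The function name replace_all_keys is made up for illustration:

```cpp
#include <map>
#include <string>

// Replaces every "${key}" in text with dict[key], one key at a time.
// replace_all_keys is a hypothetical name, not part of any library.
std::string replace_all_keys(std::string text,
                             const std::map<std::string, std::string>& dict)
{
    for (const auto& entry : dict) {
        const std::string needle = "${" + entry.first + "}";
        std::string::size_type pos = 0;
        while ((pos = text.find(needle, pos)) != std::string::npos) {
            text.replace(pos, needle.size(), entry.second);
            pos += entry.second.size();  // continue after the inserted value
        }
    }
    return text;
}
```

Each key costs one scan of the string, so this is only reasonable for a small map.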
Boost::Regex would be the route I'd suggest. The regex_replace algorithm should do most of your heavy lifting.
If you don't like my first answer, then dig in to Boost Regex - probably boost::regex_replace.
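As a rough sketch of the regex route, here is a version that uses std::regex from the standard library rather than Boost, and walks the matches with std::sregex_iterator so each placeholder can be inspected before it is replaced. The name expand, the "modifier:key" syntax and the "undefined(...)" fallback are assumptions based on the question's ${uppercase:foo} example:

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <regex>
#include <string>

// Expands ${key} and ${uppercase:key} using values from dict.
// expand is a hypothetical helper; unknown keys become "undefined(key)".
std::string expand(const std::string& input,
                   const std::map<std::string, std::string>& dict)
{
    // Group 1: optional modifier before ':'; group 2: the key itself.
    static const std::regex placeholder(R"(\$\{(?:(\w+):)?(\w+)\})");
    std::string out;
    auto last = input.cbegin();
    for (std::sregex_iterator it(input.cbegin(), input.cend(), placeholder), end;
         it != end; ++it)
    {
        const std::smatch& m = *it;
        out.append(last, m[0].first);             // literal text before the match
        const std::string modifier = m[1].str();  // empty when no modifier given
        const std::string key = m[2].str();
        auto found = dict.find(key);
        std::string value = found != dict.end() ? found->second
                                                : "undefined(" + key + ")";
        if (modifier == "uppercase")
            std::transform(value.begin(), value.end(), value.begin(),
                           [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
        out += value;
        last = m[0].second;                       // resume after the closing '}'
    }
    out.append(last, input.cend());               // trailing literal text
    return out;
}
```

Further modifiers could be handled by adding branches next to the "uppercase" check.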
How complex can the expressions get? Are they just identifiers, or can they be actual expressions like "${numBad/(double)total*100.0}%"?
Do you have to use the ${ and } delimiters or can you use other delimiters?
You don't really care about parsing. You just want to generate and format strings with placeholder data in it. Right?
For a platform neutral approach, consider the humble sprintf function. It is the most ubiquitous and does what I am assuming that you need. It works on "char stars" so you are going to have to get into some memory management.
Are you using STL? Then consider the basic_string& replace function. It doesn't do exactly what you want but you could make it work.
If you are using ATL/MFC, then consider the CStringT::Format method.
If you are managing the variables separately, why not go the route of an embeddable interpreter. I have used tcl in the past, but you might try lua which is designed for embedding. Ruby and Python are two other embeddable interpreters that are easy to embed, but aren't quite as lightweight. The strategy is to instantiate an interpreter (a context), add variables to it, then evaluate strings within that context. An interpreter will properly handle malformed input that could lead to security or stability problems for your application.
Given an input string A, is there a concise way to generate a string B that is lexicographically larger than A, i.e. A < B == true?
My raw solution would be to say:
B = A;
++B.back();
but in general this won't work because:
A might be empty
The last character of A may be close to wraparound, in which case the resulting character will have a smaller value i.e. B < A.
Adding an extra character every time is wasteful and will quickly result in unreasonably large strings.
So I was wondering whether there's a standard library function that can help me here, or if there's a strategy that scales nicely when I want to start from an arbitrary string.
You can duplicate A into B then look at the final character. If the final character isn't the final character in your range, then you can simply increment it by one.
Otherwise you can look at last-1, last-2, last-3. If you get to the front of the list of chars, then append to the length.
Here is my dummy solution:
std::string make_greater_string(std::string const &input)
{
std::string ret{std::numeric_limits<
std::string::value_type>::min()};
if (!input.empty())
{
if (std::numeric_limits<std::string::value_type>::max()
== input.back())
{
ret = input + ret;
}
else
{
ret = input;
++ret.back();
}
}
return ret;
}
Ideally I'd hope to avoid the explicit handling of all special cases, and use some facility that can more naturally handle them. Already looking at the answer by @JosephLarson I see that I could increment more than the last character, which would improve the range achievable without adding more characters.
And here's the refinement after the suggestions in this post:
std::string make_greater_string(std::string const &input)
{
constexpr char minC = ' ', maxC = '~';
// Working with limits was a pain,
// using ASCII typical limit values instead.
std::string ret{minC};
auto rit = input.rbegin();
while (rit != input.rend())
{
if (maxC == *rit)
{
++rit;
if (rit == input.rend())
{
ret = input + ret;
break;
}
}
else
{
ret = input;
++(*(ret.rbegin() + std::distance(input.rbegin(), rit)));
break;
}
}
return ret;
}
You can copy the string and append some letters - this will produce a lexicographically larger result.
B = A + "a"
I'm trying to return the last word in a string but am having trouble with the for loops. When I try to test the function I am only getting empty strings. Not really sure what the problem is. Any help is much appreciated.
string getLastWord(string text)
{
string revLastWord = "";
string lastWord = "";
if(text == "")
{
return text;
}
for(size_t i = text.size()-1; i > -1; i--)
{
if((isalpha(text[i])))
{
revLastWord+=text[i];
}
if(revLastWord.size()>=1 && !isalpha(text[i-1]))
{
break;
}
}
for(size_t k = revLastWord.size()-1; k > -1; k--)
{
lastWord+=revLastWord[k];
}
return lastWord;
}
I was coding up another solution until I checked back and read the comments; they are extremely helpful. Moreover, the suggestion from @JustinRandall was incredibly helpful. I find that find_last_of() and substr() better state the intent of the function: easier to write and easier to read. Thanks! Hope this helps! It helped me.
std::string get_last_word(std::string s) {
auto index = s.find_last_of(' ');
std::string last_word = s.substr(++index);
return last_word;
}
/**
* Here I have edited the above function IAW
* the recommendations.
* @param s is a const reference to a std::string
* @return the substring directly
*/
std::string get_last_word(const std::string& s) {
auto index = s.find_last_of(' ');
return s.substr(++index);
}
The other answers tell you what's wrong, though you should also know why it's wrong.
In general, you should be very careful about using unsigned value types in loop conditions. Comparing an unsigned type like std::size_t and a signed type, like your constant -1, will cause the signed to get converted into an unsigned type, so -1 becomes the largest possible std::size_t value.
If you put some print statements throughout your code, you'll notice that your loops are never actually entered, because the condition is always false. Use a signed type such as int for the counter when it is compared against a negative number, or restructure the loop so that the unsigned index never has to drop below zero.
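One way to keep std::size_t and still walk backwards is to test `index > 0` and then read position `index - 1`, so the index never has to become negative. A minimal sketch, assuming a "word" is a run of alphabetic characters; last_word is a hypothetical name, not the asker's function:

```cpp
#include <cctype>
#include <string>

// Returns the last run of alphabetic characters in text, or "" if none.
std::string last_word(const std::string& text)
{
    std::string::size_type end = text.size();
    // Skip trailing non-letters; 'end' is one past the character examined.
    while (end > 0 && !std::isalpha(static_cast<unsigned char>(text[end - 1])))
        --end;
    std::string::size_type begin = end;
    // Walk back over the letters of the last word.
    while (begin > 0 && std::isalpha(static_cast<unsigned char>(text[begin - 1])))
        --begin;
    return text.substr(begin, end - begin);
}
```

Because both loops stop at 0, there is no comparison against -1 and no wraparound.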
I'm currently working on a project which includes a Win32 console program on my Windows 10 PC and an app for my Windows 10 Mobile Phone. It's about controlling the master and audio session volumes on my PC over the app on my Windows Phone.
The "little" problem I have right now is to get the "difference" between 2 strings.
Let's take these 2 strings for example:
std::string oldVolumes = "MASTER:50:SYSTEM:50:STEAM:100:UPLAY:100";
std::string newVolumes = "MASTER:30:SYSTEM:50:STEAM:100:ROCKETLEAGUE:80:CHROME:100";
Now I want to compare these 2 strings. Let's say I explode each string into a vector with ":" as the delimiter (I have a function named explode that cuts the given string at the delimiter and writes the pieces into a vector).
Good enough. But as you can see, in the old string there's UPLAY with the value 100, but it's missing in the new string. Also, there are 2 new values (RocketLeague and Chrome), which are missing in the old one. But not only the "audio sessions/names" are different, the values are different too.
What I want now is, for each session that is in both strings (like master and system), to compare the values, and if the new value is different from the old one, to append this change to another string, like:
std::string volumeChanges = "MASTER:30"; // Cause Master is changed, System not
If there's a session in the old string, but not in the new one, I want to append:
std::string volumeChanges = "MASTER:30:REMOVE:UPLAY";
If there's a session in the new one, which is missing in the old string, I want to append it like that:
std::string volumeChanges = "MASTER:30:REMOVE:UPLAY:ADD:ROCKETLEAGUE:ROCKETLEAGUE:80:ADD:CHROME:CHROME:100";
The volumeChanges string is just to show you, what I need. I'll try to make a better one afterwards.
Do you have any ideas of how to implement such a comparison? I don't need a specific code example or something, just some ideas of how I could do that in theory. It's like GIT at least. If you make changes in a text file, you see in red the deleted text and in green the added one. Something similar to this, just with strings or vectors of strings.
Let's say I explode each string to a vector with the ":" as delimiter (I have a function named explode to cut the given string by the delimiter and write the string before into a vector).
I'm going to advise you further extend that logic to separate them into property objects that discretely maintain a name + value:
struct property {
std::string name;
int32_t value; // requires <cstdint>
bool same_name(property const& o) const {
return name == o.name;
}
bool same_value(property const& o) const {
return value == o.value;
}
bool operator==(property const& o) const {
return same_name(o) && same_value(o);
}
bool operator<(property const& o) const {
if(!same_name(o)) return name < o.name;
else return value < o.value;
}
};
This will dramatically simplify the logic needed to work out which properties were changed/added/removed.
The logic for "tokenizing" this kind of string isn't too difficult:
std::set<property> tokenify(std::string input) {
bool finding_name = true;
property prop;
std::set<property> properties;
while (input.size() > 0) {
auto colon_index = input.find(':');
if (finding_name) {
prop.name = input.substr(0, colon_index);
finding_name = false;
}
else {
prop.value = std::stoi(input.substr(0, colon_index));
finding_name = true;
properties.insert(prop);
}
if(colon_index == std::string::npos)
break;
else
input = input.substr(colon_index + 1);
}
return properties;
}
Then, the function to get the difference:
std::string get_diff_string(std::string const& old_props, std::string const& new_props) {
std::set<property> old_properties = tokenify(old_props);
std::set<property> new_properties = tokenify(new_props);
std::string output;
//We first scan for properties that were either removed or changed
for (property const& old_property : old_properties) {
auto predicate = [&](property const& p) {
return old_property.same_name(p);
};
auto it = std::find_if(new_properties.begin(), new_properties.end(), predicate);
if (it == new_properties.end()) {
//We didn't find the property, so we need to indicate it was removed
output.append("REMOVE:" + old_property.name + ':');
}
else if (!it->same_value(old_property)) {
//Found the property, but the value changed.
output.append(it->name + ':' + std::to_string(it->value) + ':');
}
}
//Finally, we need to see which were added.
for (property const& new_property : new_properties) {
auto predicate = [&](property const& p) {
return new_property.same_name(p);
};
auto it = std::find_if(old_properties.begin(), old_properties.end(), predicate);
if (it == old_properties.end()) {
//We didn't find the property, so we need to indicate it was added
output.append("ADD:" + new_property.name + ':' + new_property.name + ':' + std::to_string(new_property.value) + ':');
}
//The previous loop detects changes, so we don't need to bother here.
}
if (output.size() > 0)
output = output.substr(0, output.size() - 1); //Trim off the last colon
return output;
}
And we can demonstrate that it's working with a simple main function:
int main() {
std::string diff_string = get_diff_string("MASTER:50:SYSTEM:50:STEAM:100:UPLAY:100", "MASTER:30:SYSTEM:50:STEAM:100:ROCKETLEAGUE:80:CHROME:100");
std::cout << "Diff String was \"" << diff_string << '\"' << std::endl;
}
Which yields an output (according to IDEONE.com):
Diff String was "MASTER:30:REMOVE:UPLAY:ADD:CHROME:CHROME:100:ADD:ROCKETLEAGUE:ROCKETLEAGUE:80"
Which, although the contents are in a slightly different order than your example, still contains all the correct information. The contents are in different order because std::set implicitly sorted the attributes by name when tokenizing the properties; if you want to disable that sorting, you'd need to use a different data structure which preserves entry order. I chose it because it eliminates duplicates, which could cause odd behavior otherwise.
In this particular instance, you could do it as follows:
Split the old and new strings by the delimiter, and store the results in a vector.
Loop over the vector with the old data. Look for each word in the vector with new data: e.g. find("MASTER").
If not found add "REMOVE:MASTER" to your results.
If found, compare the numbers and add it to the results if it has been changed.
The added string can be found by looping over the new string and searching for the words in the old string.
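The steps above could be sketched roughly as follows, using a std::map for lookups instead of searching a vector. The names parse_volumes and diff_volumes are made up for illustration, and the sketch assumes every name is followed by an integer value:

```cpp
#include <map>
#include <sstream>
#include <string>

// Parses "NAME:VALUE:NAME:VALUE..." into a name -> value map.
std::map<std::string, int> parse_volumes(const std::string& text)
{
    std::map<std::string, int> volumes;
    std::istringstream in(text);
    std::string name, value;
    while (std::getline(in, name, ':') && std::getline(in, value, ':'))
        volumes[name] = std::stoi(value);
    return volumes;
}

// Builds the change string: changed values, then removals, then additions,
// each group in alphabetical order because std::map is sorted by key.
std::string diff_volumes(const std::string& old_text, const std::string& new_text)
{
    const std::map<std::string, int> old_volumes = parse_volumes(old_text);
    const std::map<std::string, int> new_volumes = parse_volumes(new_text);
    std::string changes;
    auto append = [&changes](const std::string& piece) {
        if (!changes.empty()) changes += ':';
        changes += piece;
    };
    // Old entries: either gone (REMOVE) or possibly changed.
    for (const auto& entry : old_volumes) {
        auto found = new_volumes.find(entry.first);
        if (found == new_volumes.end())
            append("REMOVE:" + entry.first);
        else if (found->second != entry.second)
            append(entry.first + ':' + std::to_string(found->second));
    }
    // New entries that did not exist before (ADD).
    for (const auto& entry : new_volumes)
        if (old_volumes.find(entry.first) == old_volumes.end())
            append("ADD:" + entry.first + ':' + entry.first + ':' +
                   std::to_string(entry.second));
    return changes;
}
```

For the two strings in the question this yields "MASTER:30:REMOVE:UPLAY:ADD:CHROME:CHROME:100:ADD:ROCKETLEAGUE:ROCKETLEAGUE:80".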
I suggest that you enumerate some features (in your case, for example: UPLAY is present, REMOVE is present, ...).
For each of those, assign a weight that applies if the two strings differ on the given feature.
At the end, sum up the weights of the features present in one string and absent in the other to get a number.
This number should represent what you are looking for.
You can adjust weights until you are satisfied with the result.
Maybe my answer will give you some new thoughts. In fact, by tweaking the current code, you can find all the missing words.
std::vector<std::string> splitString(const std::string& str, const char delim)
{
std::vector<std::string> out;
std::stringstream ss(str);
std::string s;
while (std::getline(ss, s, delim)) {
out.push_back(s);
}
return out;
}
std::vector<std::string> missingWords(const std::string& first, const std::string& second)
{
std::vector<std::string> missing;
const auto firstWords = splitString(first, ' ');
const auto secWords = splitString(second, ' ');
size_t i = 0, j = 0;
for(; i < firstWords.size();){
auto findSameWord = std::find(secWords.begin() + j, secWords.end(), firstWords[i]);
if(findSameWord == secWords.end()) {
missing.push_back(firstWords[i]);
j++;
} else {
j = distance(secWords.begin(), findSameWord);
}
i++;
}
return missing;
}
Just to clarify, I also think the title is a bit silly. We all know that most built-in functions of the language are well written and fast (some are even written in assembly). Still, there may be some advice for my situation. I have a small project that demonstrates how a search engine works. In the indexing phase, I have a filter method that strips unnecessary things from the keywords. Here it is:
bool Indexer::filter(string &keyword)
{
// Remove all characters defined in isGarbage method
keyword.resize(std::remove_if(keyword.begin(), keyword.end(), isGarbage) - keyword.begin());
// Transform all characters to lower case
std::transform(keyword.begin(), keyword.end(), keyword.begin(), ::tolower);
// After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
if (keyword.size() == 0 || stopwords_.find(keyword) != stopwords_.end())
return false;
return true;
}
At first sight, these functions (all of them are STL container member functions or standard functions) should be fast and should not take much time in the indexing phase. But after profiling with Valgrind, the inclusive cost of this filter is ridiculously high: 33.4%. Three standard functions account for most of that percentage: std::remove_if takes 6.53%, std::set::find takes 15.07% and std::transform takes 7.71%.
So if there is anything I can do (or change) to reduce the instruction cost of this filter (such as parallelizing it), please give me your advice. Thanks in advance.
UPDATE: Thanks for all your suggestions. In brief, what I need to do is:
1) Merge tolower and remove_if into one pass by writing my own loop.
2) Use unordered_set instead of set for a faster find.
Thus I've chosen Mark_B's as the right answer.
First, are you certain that optimization and inlining are enabled when you compile?
Assuming that's the case, I would first try writing my own transformer that combines removing garbage and lower-casing into one step to prevent iterating over the keyword that second time.
There's not a lot you can do about the find without using a different container such as unordered_set as suggested in a comment.
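A rough sketch of both suggestions together: one pass that drops garbage and lower-cases in place, followed by a lookup in a std::unordered_set. The is_garbage predicate here is only a stand-in (it treats anything non-alphanumeric as garbage), and filter_keyword is a hypothetical free function rather than the asker's Indexer::filter:

```cpp
#include <cctype>
#include <string>
#include <unordered_set>

// Stand-in predicate: the real isGarbage from the question is not shown.
bool is_garbage(char c)
{
    return !std::isalnum(static_cast<unsigned char>(c));
}

// Filters keyword in place; returns false if it ends up empty or is a stop word.
bool filter_keyword(std::string& keyword,
                    const std::unordered_set<std::string>& stopwords)
{
    std::string::size_type out = 0;  // write position, never ahead of the read position
    for (char c : keyword) {
        if (!is_garbage(c))
            keyword[out++] = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    keyword.resize(out);             // single shrink at the end
    return !keyword.empty() && stopwords.count(keyword) == 0;
}
```

Whether this is actually faster depends on the data, so it is worth profiling again after the change.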
Is it possible for your application that doing the filtering really just is a really CPU-intensive part of the operation?
If you use a boost filter iterator you can merge the remove_if and transform into one, something like (untested):
keyword.erase(std::transform(boost::make_filter_iterator(!boost::bind(isGarbage, _1), keyword.begin(), keyword.end()),
boost::make_filter_iterator(!boost::bind(isGarbage, _1), keyword.end(), keyword.end()),
keyword.begin(),
::tolower), keyword.end());
This is assuming you want the side effect of modifying the string to still be visible externally, otherwise pass by const reference instead and just use count_if and a predicate to do all in one. You can build a hierarchical data structure (basically a tree) for the list of stop words that makes "in-place" matching possible, for example if your stop words are SELECT, SELECTION, SELECTED you might build a tree:
|- (other/empty accept)
\- S-E-L-E-C-T- (empty, fail)
|- (other, accept)
|- I-O-N (fail)
\- E-D (fail)
You can traverse a tree structure like that simultaneously whilst transforming and filtering without any modifications to the string itself. In reality you'd want to compact the multi-character runs into a single node in the tree (probably).
You can build such a data structure fairly trivially with something like:
#include <iostream>
#include <map>
#include <memory>
#include <string>
class keywords {
struct node {
node() : end(false) {}
std::map<char, std::unique_ptr<node>> children;
bool end;
} root;
void add(const std::string::const_iterator& stop, const std::string::const_iterator c, node& n) {
if (!n.children[*c])
n.children[*c] = std::unique_ptr<node>(new node);
if (stop == c+1) {
n.children[*c]->end = true;
return;
}
add(stop, c+1, *n.children[*c]);
}
public:
void add(const std::string& str) {
add(str.end(), str.begin(), root);
}
bool match(const std::string& str) const {
const node *current = &root;
std::string::size_type pos = 0;
while(current && pos < str.size()) {
const std::map<char,std::unique_ptr<node>>::const_iterator it = current->children.find(str[pos++]);
current = it != current->children.end() ? it->second.get() : nullptr;
}
if (!current) {
return false;
}
return current->end;
}
};
int main() {
keywords list;
list.add("SELECT");
list.add("SELECTION");
list.add("SELECTED");
std::cout << list.match("TEST") << std::endl;
std::cout << list.match("SELECT") << std::endl;
std::cout << list.match("SELECTOR") << std::endl;
std::cout << list.match("SELECTED") << std::endl;
std::cout << list.match("SELECTION") << std::endl;
}
This worked as you'd hope and gave:
0
1
0
1
1
Which then just needs to have match() modified to call the transformation and filtering functions appropriately e.g.:
const char c = str[pos++];
if (filter(c)) {
const std::map<char,std::unique_ptr<node>>::const_iterator it = current->children.find(transform(c));
}
You can optimise this a bit (compact long single string runs) and make it more generic, but it shows how doing everything in-place in one pass might be achieved and that's the most likely candidate for speeding up the function you showed.
(Benchmark changes of course)
If a call to isGarbage() does not require synchronization, then parallelization should be the first optimization to consider (given of course that filtering one keyword is a big enough task, otherwise parallelization should be done one level higher). Here's how it could be done - in one pass through the original data, multi-threaded using Threading Building Blocks:
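#include <cctype>
#include <iostream>
#include <string>
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>
bool isGarbage(char c) {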
return c == 'a';
}
struct RemoveGarbageAndLowerCase {
std::string result;
const std::string& keyword;
RemoveGarbageAndLowerCase(const std::string& keyword_) : keyword(keyword_) {}
RemoveGarbageAndLowerCase(RemoveGarbageAndLowerCase& r, tbb::split) : keyword(r.keyword) {}
void operator()(const tbb::blocked_range<size_t> &r) {
for(size_t i = r.begin(); i != r.end(); ++i) {
if(!isGarbage(keyword[i])) {
result.push_back(tolower(keyword[i]));
}
}
}
void join(RemoveGarbageAndLowerCase &rhs) {
result.insert(result.end(), rhs.result.begin(), rhs.result.end());
}
};
void filter_garbage(std::string &keyword) {
RemoveGarbageAndLowerCase res(keyword);
tbb::parallel_reduce(tbb::blocked_range<size_t>(0, keyword.size()), res);
keyword = res.result;
}
int main() {
std::string keyword = "ThIas_iS:saome-aTYpe_Ofa=MoDElaKEYwoRDastrang";
filter_garbage(keyword);
std::cout << keyword << std::endl;
return 0;
}
Of course, the final code could be improved further by avoiding data copying, but the goal of the sample is to demonstrate that it's an easily threadable problem.
You might make this faster by making a single pass through the string, ignoring the garbage characters. Something like this (pseudo-code):
std::string normalizedKeyword;
normalizedKeyword.reserve(keyword.size());
for (auto p = keyword.begin(); p != keyword.end(); ++p)
{
char ch = *p;
if (!isGarbage(ch))
normalizedKeyword.push_back(static_cast<char>(tolower(static_cast<unsigned char>(ch))));
}
// then search for normalizedKeyword in stopwords
This should eliminate the overhead of std::remove_if, although there is a memory allocation and some new overhead of copying characters to normalizedKeyword.
The problem here isn't the standard functions, it's your use of them. You are making multiple passes over your string when you obviously need to be doing only one.
What you need to do probably can't be done with the algorithms straight up, you'll need help from boost or rolling your own.
You should also carefully consider whether resizing the string is actually necessary. Yeah, you might save some space but it's going to cost you in speed. Removing this alone might account for quite a bit of your operation's expense.
Here's a way to combine the garbage removal and lower-casing into a single step. It won't work for multi-byte encoding such as UTF-8, but neither did your original code. I assume 0 and 1 are both garbage values.
bool Indexer::filter(string &keyword)
{
static char replacements[256] = {1}; // initialize with an invalid char
if (replacements[0] == 1)
{
for (int i = 0; i < 256; ++i)
replacements[i] = isGarbage(i) ? 0 : ::tolower(i);
}
string::iterator tail = keyword.begin();
for (string::iterator it = keyword.begin(); it != keyword.end(); ++it)
{
unsigned int index = (unsigned int) *it & 0xff;
if (replacements[index])
*tail++ = replacements[index];
}
keyword.resize(tail - keyword.begin());
// After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
if (keyword.size() == 0 || stopwords_.find(keyword) != stopwords_.end())
return false;
return true;
}
The largest part of your timing is the std::set::find so I'd also try std::unordered_set to see if it improves things.
I would implement it with lower level C functions, something like this maybe (not checking this compiles), doing the replacement in place and not resizing the keyword.
Instead of using a set for garbage characters, I'd add a static table of all 256 characters (yes, it will work for single-byte characters only), with 0 for all characters that are OK and 1 for those that should be filtered out. Something like:
static const char GARBAGE[256] = { 1, 1, 1, 1, 1, ...., 0, 0, 0, 0, 1, 1, ... };
Then for each character at offset pos in const char *str you can just check if (GARBAGE[(unsigned char)str[pos]] == 1); the cast keeps the index non-negative when char is signed.
This is more or less what an unordered set does, but with far fewer instructions. stopwords should be an unordered set if it isn't already.
now the filtering function (I'm assuming ascii/utf8 and null terminated strings here):
bool Indexer::filter(char *keyword)
{
char *head = keyword;
char *tail = keyword;
while (*head != '\0') {
//copy non garbage chars from head to tail, lowercasing them while at it
if (!GARBAGE[(unsigned char)*head]) {
*tail = tolower((unsigned char)*head);
++tail; //we only advance tail if no garbage
}
//head always advances
++head;
}
*tail = '\0';
// After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
if (tail == keyword || stopwords_.find(keyword) != stopwords_.end())
return false;
return true;
}
Given strings s and t compute recursively, if t is contained in s return true.
Example: bool find("Names Richard", "Richard") == true;
I have written the code below, but I'm not sure if it's the right way to use recursion in C++; I just learned recursion today in class.
#include <iostream>
using namespace std;
bool find(string s, string t)
{
if (s.empty() || t.empty())
return false;
int find = static_cast<int>(s.find(t));
if (find > 0)
return true;
}
int main()
{
bool b = find("Mississippi", "sip");
string s;
if (b == 1) s = "true";
else
s = "false";
cout << s;
}
If anyone finds an error in my code, please tell me so I can fix it, or point me to where I can learn/read more about this topic. I need to get ready for a test on recursion this Wednesday.
The question has changed since I wrote my answer.
My comments are on the code that looked like this (and could recurse)...
#include <iostream>
using namespace std;
bool find(string s, string t)
{
if (s.empty() || t.empty())
return false;
string start = s.substr(0, 2);
if (start == t && find(s.substr(3), t));
return true;
}
int main()
{
bool b = find("Mississippi", "sip");
string s;
if (b == 1) s = "true";
else
s = "false";
cout << s;
}
Watch out for this:
if (start == t && find(s.substr(3), t));
return true;
This does not do what you think it does.
The ; at the end of the if-statement leaves an empty body. Your find() function will return true regardless of the outcome of that test.
I recommend you turn up the warning levels on your compiler to catch this kind of issue before you have to debug it.
As an aside, I find using braces around every code-block, even one-line blocks, helps me avoid this kind of mistake.
There are other errors in your code, too. Removing the magic numbers 2 and 3 from find() will encourage you to think about what they represent and point you on the right path.
How would you expect start == t && find(s.substr(3), t) to work? If you can express an algorithm in plain English (or your native tongue), you have a much higher chance of being able to express it in C++.
Additionally, I recommend adding test cases that should return false (such as find("satsuma", "onion")) to ensure that your code works as well as calls that should return true.
The last piece of advice is stylistic, laying your code out like this will make the boolean expression that you are testing more obvious without resorting to a temporary and comparing to 1:
int main()
{
std::string s;
if (find("Mississippi", "sip"))
{
s = "true";
}
else
{
s = "false";
}
std::cout << s << std::endl;
}
Good luck with your class!
Your recursive function needs 2 things:
Definite conditions of failure and success (may be more than 1)
a call of itself to process a simpler version of the problem (getting closer to the answer).
Here's a quick analysis:
bool find(string s, string t)
{
if (s.empty() || t.empty()) //definite condition of failure. Good
return false;
string start = s.substr(0, 2);
if (start == t && find(s.substr(3), t)); //mixed up definition of success and recursive call
return true;
}
Try this instead:
bool find(string s, string t)
{
if (s.empty() || t.empty()) //definite condition of failure. Done!
return false;
string start = s.substr(0, t.size()); //prefix as long as t, instead of the magic number 2
if (start == t) //definite condition of success. Done!
return true;
else
return find(s.substr(1), t); //simplify the problem (drop one character) and return whatever it finds
}
You're on the right lines - so long as the function calls itself you can say that it's recursive - but even the most simple testing should tell you that your code doesn't work correctly. Change "sip" to "sipx", for example, and it still outputs true. Have you compiled and run this program? Have you tested it with various different inputs?
You are not using recursion. Using std::string::find in your function feels like cheating (this will most likely not earn points).
The only reasonable interpretation of the task is: Check if t is an infix of s without using loops or string functions.
Let's look at the trivial case: Epsilon (the empty word) is an infix of every word, so if t.empty() holds, you must return true.
Otherwise you have two choices to make:
t might be a prefix of s which is simple to check using recursion; simply check if the first character of t equals the first character of s and call isPrefix with the remainder of the strings. If this returns true, you return true.
Otherwise you pop the first character of s (and not of t) and proceed recursively (calling find this time).
If you follow this recipe (which btw. is easier to implement with char const* than with std::string if you ask me) you get a recursive function that only uses conditionals and no library support.
Note: this is not at all the most efficient implementation, but you didn't ask for efficiency but for a recursive function.
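A minimal sketch of that recipe with char const*, as suggested above. The name contains is used here instead of find purely to avoid clashing with the asker's function, and isPrefix follows the description; neither is taken from the original post:

```cpp
// True if t is a prefix of s; compares one character per call.
bool isPrefix(const char* s, const char* t)
{
    if (*t == '\0') return true;    // the empty word is a prefix of everything
    if (*s == '\0') return false;   // s ran out before t did
    return *s == *t && isPrefix(s + 1, t + 1);
}

// True if t occurs anywhere in s (t is an infix of s).
bool contains(const char* s, const char* t)
{
    if (isPrefix(s, t)) return true;  // t starts right here
    if (*s == '\0') return false;     // nothing left to pop
    return contains(s + 1, t);        // pop the first character of s and retry
}
```

With a std::string at hand it could be called as contains(s.c_str(), t.c_str()). Both functions use only conditionals and recursion, with no loops or library calls.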