Check if string contains other string elements - c++

I am trying to check if string contains elements from different string in specific order.
For example:
large string: thisisstring
small string: hssg
it should return true.
I only figured out how to check if string contains whole other string but not parts.
This is the code that I wrote for checking for now:
if ([largestring rangeOfString:smallstring].location != NSNotFound) {
printf("contains");
}

If there are no more characters to search for from the small string, return true.
Starting from the position after the most recently found character in the large string, do a linear search for the first character from the small string that has not yet been searched for.
If the character was not found, return false.
Start back at 1.

There's no easy way to do this, at least, no built in way that I know of. You would have to iterate through each letter of your small string and find the first letter that matches your large string.
Each time you find a matching letter, you loop to the next smallstring letter, but instead only begin searching at the index after you found the previous letter.
EDIT:
some pseudo code, untested, may have syntax errors:
int foundChar = 0;
for (int l = 0; l < strlen(smallstring); l++)
{
bool found = false;
for (; foundChar < strlen(largestring); foundChar++)
{
if (smallstring[l] == largestring[foundChar])
{
// We break here because we found a matching letter.
// Notice that foundChar is still in scope so we preserve
// its value for the next check.
found = true;
foundChar++; // Increment so the next search starts with the next letter.
break;
}
}
// If we get down here, that means we've searched all of the letters
// and found no match, we can result with a failure to find the match.
if (found == false)
{
return false;
}
}
// If we get here, it means every loop resulted in a valid match.
return true;

Related

Check string for recurrence of specific character

How can I search inside of a string for more than one occurrence of a specific character (in this case a period .)?
I have already tried adapting the answer from this question, but I think I am doing it wrong.
std::string periodCheck = i.convert_to<std::string>();
char subString = '.';
std::size_t pos = periodCheck.find(subString, 0);
int counter;
while(pos != std::string::npos){
counter++;
if(counter > 1){
std::cout << "\nError: Multiple periods\n";
return false;
}
}
The first line simply converts from a Boost multi-precision cpp_dec_float (named i) to a string. I know that this part of the code works, because I use it effectively elsewhere in the program.
I am trying to check if a string contains more than one period. If the string has more than one period in it, the function returns false.
How can I achieve this?
If you find a period, then your next logical step would also be to search again, starting with the next character position.
However, if you review your code, you will not be able to find any place where it is actually searching again. There's no call to find() inside the while loop.
A while loop is not required at all. All you need to do is to call find() a second time, specifying pos+1 as the starting position for the second search, and check the results again. If you find another period, you can call it a wrap. Nothing is to be gained by searching for any remaining periods in the string. You have your answer.
std::size_t pos = periodCheck.find(subString, 0);
if (pos != std::string::npos)
{
pos=periodCheck.find(subString, pos+1);
if (pos != std::string::npos)
return false;
}

regex for at least one character

How to check if string contains at least one character? I want to eliminate strings where are only special characters, so I've decided that the easiest way is to check if there is at least one character or digit, so I've created [a-zA-Z0-9]{1,} and [a-zA-Z0-9]+ but none of these work.
boost::regex noSpecialCharacters("[a-zA-Z0-9]+");
boost::regex noSpecialCharacters2("[a-zA-Z0-9]{1,}");
string tab[SIZE] = {"father", "apple is red"};
for (int i = 0; i < SIZE; i++) {
if (!boost::regex_match(tab[i], noSpecialCharacters)) {
puts("This is it!");
} else {
puts("or not");
}
if (!boost::regex_match(tab[i], noSpecialCharacters2)) {
puts("This is it!");
} else {
puts("or not");
}
}
for "apple is red" the answer is correct but for "father" it doesn't work.
apple is red won't match because, as per here (my bold):
Note that the result is true only if the expression matches the whole of the input sequence.
That means the spaces make it invalid. It then goes on to say (again, my bold):
If you want to search for an expression somewhere within the sequence then use regex_search.
If all you're looking for is one valid character somewhere in there, you can just use regex_match() with ".*[a-zA-Z0-9].*" or regex_search() with "[a-zA-Z0-9]".

Cannot get second while to loop properly

I'm making a function that removes elements from a string. However, I cant seem to get both of my loops to work together. The first while loop works flawlessly. I looked into it and I believe it might be because when "find_last_of" isn't found, it still returns a value (which is throwing off my loop). I haven't been able to figure out how I can fix it. Thank you.
#include <iostream>
#include <string>
using namespace std;
string foo(string word) {
string compare = "!##$";
string alphabet = "abcdefghijklmnopqrstuvxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
while(word.find_first_of(compare) < word.find_first_of(alphabet)) {
int position = word.find_first_of(compare);
word = word.substr(++position);
}
while(word.find_last_of(compare) > word.find_last_of(alphabet)){
int size = word.length();
word = word.substr(0, --size);
}
return word;
}
int main() {
cout << foo("!!hi!!");
return 0;
}
I wrote it like this so compound words would not be affected. Desired result: "hi"
It's not entirely clear what you're trying to do, but how about replacing the second loop with this:
string::size_type p = word.find_last_not_of(compare);
if(p != string::npos)
word = word.substr(0, ++p);
It's not clear if you just want to trim certain characters from the front and back of word or if you want to remove every one of a certain set of characters from word no matter where they are. Based on the first sentence of your question, I'll assume you want to do the latter: remove all characters in compare from word.
A better strategy would be to more directly examine each character to see if it needs to be removed, and if so, do so, all in one pass through word. Since compare is quite short, something like this is probably good enough:
// Rewrite word by removing all characters in compare (and then erasing the
// leftover space, if any, at the end). See std::remove_if() docs.
word.erase(std::remove_if(word.begin(),
word.end(),
// Returns true if a character is to be removed.
[&](const char ch) {
return compare.find(ch) != compare.npos;
}),
word.end());
BTW, I'm not sure why there is both a compare and alphabet string in your example. It seems you would only need to define one or the other, and not both. A character is either one to keep or one to remove.

CString::find... iterative use issue?

I'm using a CString to search a text block... here's my code:
// locate file name in dir listing
in = *buf;
i = in.Find("DOWNLD .DAT ");// find start of name, two spaces (0x20) as delim
// size of search text here is 14
if (i == -1) return 0;
j = in.Find(' ',i);// now find next space char *after* file size...
// why don't I have to add to i here? There are spaces in my search string.
if (j == -1) return 0;
fileSize = in.Mid((i+14),j-i);// extract file size string, note indexing past found string
return atoi(fileSize.GetBuffer());
Here's what MSDN has to say about the return value of find:
" Return Value
The zero-based index of the first character in this CString object that matches the requested substring or characters; -1 if the substring or character is not found."
Now the way I read this, I have to index past the string I found before doing another find... but the way it actually works, I use the 'i' returned before as the start position for a new search. I'm using this in other places in my program, and I definitely have to index past it (when using ::mid(), for instance)... I'd like to know why this is happening, if by design or bug. The original string can be large; I've seen it near 300chars... is this the problem?
Your second Find call finds the space after "DOWNLD", not the one after ".DAT ". You want to increment i before the second Find call, so that it refers to the first character past the string your first call searched for.
So I didn't find a bug in CString... my code was in error. Here's the code changed that works:
j = in.Find(' ',i+14);// index past searched string
if (j == -1) return 0;
fileSize = in.Mid((i+14),j-i-14);// note -14 added
return atoi(fileSize.GetBuffer());
It was the missing -14 in the mid that was confusing me... the resulting string was 14 past where it should have been, and was missing the portion of interest. Why my original fix worked? Just a coincidence I guess.
Thanks for helping!

Instead of having different size_t variables, can I use just one for searching a std::string multiple times?

I am wondering if it is possible to cut down how many size_t variables I use here. Here is what I have:
std::size_t found, found2, found3, found4 /* etc */;
Each has its own string to find:
found1 = msg.find("string1");
found2 = msg.find("string2");
found3 = msg.find("string3");
found4 = msg.find("string4");
// etc
If the word is found, then it will discard and prevent the message to be shown:
if (found1 != std::string::npos)
{
SendMsg("You cannot say that word!");
}
I have else if statements until found21. I'd like to cut everything down in size, so it would be clean, but I don't have a clue how to do it. I would also like it to lowercase the word. I have never used tolower at all either, so I would appreciated it if someone would help me.
To lowercase a string, you can do
std::transform(msg.begin(), msg.end(), msg.begin(), std::tolower);
Transform takes a begin and end iterator as the first and second arguments, and for each element in that range, applies the fourth argument (a function) and assigns it to what the third iterator is pointing to and increments it. By passing msg.begin() as both the first and third arguments, it will assign the result of the function to what it passed to it. So transform will basically do this:
for (auto src = begin(msg), dst = begin(msg); src != end(msg); ++src, ++dst)
*dst = tolower(*src);
but using transform is so much nicer.
To check whether a string contains any of a list of substrings, you can use a for loop with a vector:
vector<string> bad_strings { "bad word 1", "bad word 2", "etc" };
for (auto i = begin(bad_strings); i != end(bad_strings); ++i)
if (msg.find(*i)) {
SendMsg("You cannot say that word!");
break; // stop when you find it matches even one bad string
}