Finding all occurrences using rfind, flow challenges? - c++

While following a C++ tutorial that was teaching find(), I implemented the following code to count all the "cat" occurrences in a string:
std::string input;
std::size_t i = 0, x_appearances = 0;
std::getline(std::cin,input);
for(i = input.find("cat",0); i != std::string::npos; i=input.find("cat", i))
{
++x_appearances;
++i; //Move past the last discovered instance to avoid finding the same string
}
The tutorial then challenges the reader to replace find() with rfind(), and that's where the problems started. First I tried what seemed to be the obvious approach:
for(i = input.rfind("cat",input.length()); i != std::string::npos; i=input.rfind("cat", i))
{
++x_appearances;
--i; //Move past the last discovered instance to avoid finding the same string
}
But this solution fell into an infinite loop. I then discovered why: the update expression runs before the condition check, and once i has wrapped around to std::string::npos, rfind() searches from the end of the string again and always finds a match. This happens whenever a match is at the beginning of the string, for example "cats". My final solution was:
int n=input.length();
for(i = input.rfind("cat",input.length()); n>0 && i!=std::string::npos; i=input.rfind("cat", i))
{
++x_appearances;
n=i;
--i; //Move past the last discovered instance to avoid finding the same string
}
With n I can keep track of the position in the string and exit the for loop once the entire string has been searched.
So my question is: Is my approach correct? Did I need an extra variable or is there any other simpler way of doing this?

for(i = input.rfind("cat",input.length()); i != std::string::npos; i=input.rfind("cat", i))
{
++x_appearances;
--i; //Move past the last discovered instance to avoid finding the same string
}
The problem with the above is the --i inside the loop. Suppose the input string starts with "cat". Your algorithm will eventually find that "cat" with i being 0. Since you've declared i as a std::size_t, subtracting 1 from 0 results in the largest possible std::size_t. There's no warning, no overflow, no undefined behavior. This is exactly how unsigned integers must work, per the standard. That largest value is also the value of std::string::npos, so the next rfind() call searches from the end of the string again and the loop never ends.
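The wraparound can be seen in isolation. This is a minimal sketch; the helper name `decrement_from_zero` is hypothetical and exists only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: shows what --i does when an unsigned i is 0.
std::size_t decrement_from_zero()
{
    std::size_t i = 0;
    --i; // wraps around to the largest std::size_t value; well-defined for unsigned types
    return i;
}
```

For std::string, the returned value compares equal to std::string::npos, which is why passing it to rfind() as the starting position means "search from the end".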
Somehow you need to handle this special case. You could use an auxiliary variable and a more convoluted test in your loop. An alternative is to keep your code simple and at the same time make it blatantly obvious you are explicitly handling this special case:
for (i = input.rfind("cat"); i != std::string::npos; i=input.rfind("cat", i-1))
{
++x_appearances;
// Finding "cat" at the start means we're done.
if (i == 0) {
break;
}
}
Note also that I've changed the loop statement a bit. The default value for pos is std::string::npos, which means search from the end of the string. There's no need for that second argument with the initializer. I also moved the --i into the update part of the for loop, changing input.rfind("cat",i) to input.rfind("cat",i-1). Since i is always positive at this point, there's no danger in subtracting one.
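For reference, the loop above can be wrapped into a small function. This is only a sketch: the name `count_backwards` and its parameters are assumptions made for illustration, not part of the tutorial.

```cpp
#include <cstddef>
#include <string>

// Hypothetical wrapper around the rfind() loop: counts occurrences of
// needle in input, scanning from the end toward the start.
std::size_t count_backwards(const std::string& input, const std::string& needle)
{
    std::size_t count = 0;
    for (std::size_t i = input.rfind(needle);
         i != std::string::npos;
         i = input.rfind(needle, i - 1))
    {
        ++count;
        // A match at index 0 means there is nothing left to search,
        // and i - 1 would wrap around, so stop here.
        if (i == 0) {
            break;
        }
    }
    return count;
}
```

Calling `count_backwards("cats and cat", "cat")` should visit the match at index 9 first and then the one at index 0, where the explicit break ends the loop.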

Related

Splitting up a string from end to start into groups of two in C++

I was curious how I could write a program that takes a string, finds its end, and then splits it up from the end toward the start into groups of two.
For instance, the user enters mskkllkkk and the output has to be m sk kl lk kk.
I tried to search the net for the tools I needed, and got familiar with iterators, and tried to use them for this purpose. I did something like this:
#include "iostream"
#include "string"
#include "conio.h"
int main() {
int k=0,i=-1;
std::string str1;
std::string::iterator PlaceCounter;
std::cin >> str1;
PlaceCounter = str1.end();
for (PlaceCounter; PlaceCounter != str1.begin(); --PlaceCounter)
{
++k;
if (k % 2 == 0 && k-1 != 0) {
++i;
str1.insert(str1.end()-k-i,' ');
}
}
std::cout << str1;
_getch();
return 0;
}
At first it seemed to work just fine when I entered a couple of arbitrary cases (this kind of thing could be used in calculators to make numbers more readable by grouping every three digits, from the end toward the start). But when I entered jsfksdjfksdjfkdsjfskjdfkjsfn, I got the error message "String iterator not decrementable".
Presumably I need to study many more pages of my C++ book to be able to solve this myself, but for now I'm just super curious as a beginner. Why do I get that error message? Thanks in advance.
When you insert() into your string, the iterators to it may get invalidated. In particular, all iterators past the insertion point should be considered invalidated in all cases, but all iterators also get invalidated if the std::string needs to get more memory: the internal buffer will be replaced by a bigger one, causing all existing iterators (and references and pointers) to string elements to be invalidated.
The easiest fix to the problem is to make sure that the string doesn't need to allocate more memory by reserve()ing enough space ahead of time. Since you add one space for every two characters, making sure that there is space for str1.size() + str1.size() / 2u characters should be sufficient:
str1.reserve(str1.size() + str1.size() / 2u);
for (auto PlaceCounter = str1.end(); PlaceCounter != str1.begin(); --PlaceCounter) {
// ...
}
Note that your algorithm is rather inefficient: it is O(n^2), because each insert() has to shift the characters after the insertion point. The operation can be done with O(n) complexity instead. You'd resize the string to the appropriate size right from the start, filling the tail with some default characters, and then copy the content, moving from the end, directly to the appropriate location.
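The O(n) idea can be sketched as follows. This is one possible reading of the description, not the answerer's own code; the function name `group_from_end` and the `group` parameter are assumptions for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: inserts a space between groups of `group`
// characters, counting groups from the end of the string.
std::string group_from_end(std::string s, std::size_t group = 2)
{
    if (group == 0 || s.size() <= group) {
        return s; // nothing to separate
    }
    const std::size_t n = s.size();
    const std::size_t separators = (n - 1) / group;
    s.resize(n + separators, ' ');  // grow once, up front

    std::size_t src = n;            // one past the next character to move
    std::size_t dst = s.size();     // one past the next slot to fill
    std::size_t in_group = 0;       // characters placed in the current group
    while (src > 0) {
        if (in_group == group) {    // group is full: write a separator first
            s[--dst] = ' ';
            in_group = 0;
        }
        s[--dst] = s[--src];        // dst never drops below src, so no unread data is lost
        ++in_group;
    }
    return s;
}
```

Each character is moved exactly once and indexes are used instead of iterators, so the invalidation problem does not arise. With the example from the question, `group_from_end("mskkllkkk")` yields `m sk kl lk kk`.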
str1.insert(str1.end()-k-i,' ');
This modifies the string the loop is iterating over. Specifically, this inserts something into the string.
With a std::string, much like a std::vector, insertion into a string will (may) invalidate all existing iterators pointing to the string. The first insertion performed by the shown code results in undefined behavior, as soon as the existing, now invalidated, iterators are referenced afterwards.
You will need to either replace your iterators with indexes into the string, or instead of modifying the existing string construct a new string, leaving the original string untouched.
Here is a possible C++ approach to try. From my tool bag, here is how I insert commas into a decimal string (i.e. s is expected to contain digits):
Input: "123456789"
// insert commas from right (at implied decimal point) to left
std::string digiCommaL(std::string s)
{
    // Note: decrementing an unsigned type (such as size_t) wraps around
    // instead of going negative, so use a signed int for the index.
    int32_t sSize = static_cast<int32_t>(s.size()); // change to int
    if (sSize > 3)
        for (int32_t indx = (sSize - 3); indx > 0; indx -= 3)
            s.insert(static_cast<size_t>(indx), 1, ',');
    return s;
}
Returns: "123,456,789"

Searching a function with the cctype library to find the number of characters that are digits in a range

Trying to solve one of the questions I was given by an instructor and I'm having trouble understanding how to call this properly.
I'm given a function that is linked to a test driver and my goal is to use the cstring library to find any numbers in the range of 0-9 in a randomly generated string object with this function.
int countDigits(char * const line) {return 0;}
So far this is what I have:
int countDigits(char * const line)
{
int i, index;
index = -1;
found = false;
i = 0;
while (i < *line && !found)
{
if (*line > 0 && *line < 9)
index++;
}
return 0;
}
My code is not great and at the moment only results in an infinite loop and failure. Any help would be very much appreciated.
Well, there are several problems with your function.
You want it to return the number of digits, but it returns 0 in any case.
found is never set to anything other than false, so it never stops the while loop. Nothing else stops it either, because neither i nor line changes inside the loop.
The comparison i < *line does not make much sense to me. I guess you want to check for the end of the line, in which case you should look for the null terminator '\0' (and again, i is never set to anything other than 0).
If you want to compare single characters, you should compare against the character codes of the digits: the characters '0' to '9' are not equal to the integers 0 to 9.
Hope that is a start to improve your function.
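Putting those points together, one possible version looks like the sketch below. It keeps the signature from the question and assumes that line is a non-null, null-terminated string.

```cpp
#include <cctype>

// Counts the characters '0'..'9' in a null-terminated string.
// Assumes line is not a null pointer.
int countDigits(char * const line)
{
    int count = 0;
    for (const char *p = line; *p != '\0'; ++p) {
        // Cast to unsigned char: std::isdigit has undefined behavior
        // for negative char values.
        if (std::isdigit(static_cast<unsigned char>(*p))) {
            ++count;
        }
    }
    return count;
}
```

The loop advances a separate pointer p because line itself is declared const and cannot be incremented.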
There's a ready-made algorithm for this called std::count_if (declared in <algorithm>):
std::count_if(begin, end, [](unsigned char c){ return std::isdigit(c) != 0; });

member function erase() not working in a loop

I'm programming a little game, but stringname.erase() does not seem to work in a for loop. I want to understand why. I have other alternatives, but I don't understand what's going on in the following code.
More explanation of my situation (important!):
guess is a char.
'tmpgword' and 'word' are of type string, and tmpgword = word;
What I understand from my code:
The first time, the while loop checks whether 'guess' is in the string 'tmpgword'.
That is true and the for loop works fine: the character (guess) that satisfies the if condition is erased.
The second time, the while loop checks again whether 'guess' is in the string 'tmpgword'.
That is true, so we go into the for loop again and then into the if block (the right char is found), but here erase() doesn't work and we enter an infinite loop.
When the program finds the right index using the for loop, I break and start the search from the beginning in case there are more occurrences of guess.
The problem is: the program finds 'guess' again but erase() won't delete it!
Can someone explain, please? Here is my code:
while (tmpgword.find(guess,0) != string::npos )
{
for (i = 0; i < word.size(); i++) // verify the input;
{
if (word[i] == guess)
{
encword[i] = word[i];//I don't think this line is important
tmpgword.erase(tmpgword.begin() + i);
break;
}
}
}
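After you do the first erase, the character positions in tmpgword are not the same as in word. The for loop always stops at the first occurrence of guess in word, which never changes, so every later pass erases whatever character now sits at that same index in tmpgword instead of the next occurrence of guess.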
string::find() returns the position of the element when it's found, so you can use that instead of looping through word.
size_t pos = 0;
while ((pos = tmpgword.find(guess, pos)) != string::npos) {
tmpgword.erase(pos, 1);
}
I've used pos as the starting position for each call to find() so it starts from where it just erased, rather than searching from the beginning each time through (there can't be any occurrences before that, because they've all been erased).
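The same loop as a self-contained function, as a sketch; the name `erase_all` is hypothetical and the string is taken by value so the caller's copy is left untouched.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns text with every occurrence of guess removed.
std::string erase_all(std::string text, char guess)
{
    std::size_t pos = 0;
    // find() resumes at pos; after erase(pos, 1) the next character
    // has shifted into position pos, so pos must not be advanced.
    while ((pos = text.find(guess, pos)) != std::string::npos) {
        text.erase(pos, 1);
    }
    return text;
}
```

Because find() is called on the same string that is being erased from, the index it returns is always valid for that string, which is the property the original loop over word lacked.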

C++ std::string::find always returns npos?

I'm trying to get this function to cut up a string, and then return it without whitespace and all lowercase. And to do this I'm trying to find a " " to see if a string, "The Time Traveller (for so it will be convenient to speak of him)", contains a space.
The code is as follows, passing in the string above to this function. It always returns string::npos. Any idea about the problem?
string chopstring(string tocut){
string totoken = "";
int start = 0;
while(tocut[0] == ' ' || tocut[0] == 10 || tocut[0 == 13]){
tocut.erase(0);
}
int finish = 0;
finish = tocut.find(" ", start);
if (finish == string::npos){
cout << "NPOS!" << endl;
}
for (int i = start; i < finish; i++){
totoken += tocut[i];
}
tocut.erase(start, finish);
return tokenize(totoken);
}
tocut.erase(0) is erasing all of tocut. The argument is the first character to erase, and the default length is "everything".
tocut[0 == 13] should probably be tocut[0] == 13. Those are very different expressions. Also, please compare with character literals ('\n', '\r') instead of integers. Incidentally, this in conjunction with the previous point is your actual problem: tocut[0 == 13] becomes tocut[false], which is tocut[0], which is nonzero and therefore true. So the loop runs until tocut is empty, which happens immediately (since erase(0) removes everything on the first pass).
The net effect of the above two bugs is that when you reach the find statement, tocut is the empty string, which does not contain a space character. Moving on...
You can use the substr function instead of your loop to migrate from tocut to totoken.
Your last tocut.erase(start, finish) line isn't doing anything useful, since tocut was pass-by-value and you immediately return after that.
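To illustrate the points above, here is a minimal sketch that strips leading whitespace with a bounded erase() and extracts the first word with substr(). The function name `first_token` is hypothetical, and the call to tokenize() from the question is left out because its definition is not shown.

```cpp
#include <string>

// Hypothetical helper: returns the first space-delimited word of tocut,
// ignoring leading spaces, '\n' and '\r'.
std::string first_token(std::string tocut)
{
    const std::string::size_type start = tocut.find_first_not_of(" \n\r");
    if (start == std::string::npos) {
        return ""; // empty or whitespace only
    }
    // erase(0, start) removes only the first `start` characters;
    // erase(0) would remove everything from index 0 to the end.
    tocut.erase(0, start);

    // If no space is found, finish is npos and substr() returns the rest.
    const std::string::size_type finish = tocut.find(' ');
    return tocut.substr(0, finish);
}
```

Keeping the result of find() in std::string::size_type rather than int also avoids relying on a signed-to-unsigned conversion when comparing against npos.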
Actually, the majority of the code could be written much simpler (assuming my understanding that you want to remove all spaces is correct):
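string chopstring(string tocut) {
    // Skip leading whitespace. substr() returns a new string,
    // so its result has to be assigned back.
    std::string::size_type first(tocut.find_first_not_of(" \n\r"));
    if (first != tocut.npos) {
        tocut = tocut.substr(first);
    }
    // Erase-remove idiom; std::remove is declared in <algorithm>.
    tocut.erase(std::remove(tocut.begin(), tocut.end(), ' '), tocut.end());
    return tokenize(tocut);
}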
If you actually want to remove all whitespace, you probably want to use std::remove_if() with a suitable predicate.
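A minimal sketch of that std::remove_if() variant, using std::isspace as the predicate; the function name `strip_whitespace` is an assumption for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: removes every whitespace character
// (space, tab, '\n', '\r', ...) from s.
std::string strip_whitespace(std::string s)
{
    s.erase(std::remove_if(s.begin(), s.end(),
                           // unsigned char avoids undefined behavior in
                           // std::isspace for negative char values
                           [](unsigned char c) { return std::isspace(c) != 0; }),
            s.end());
    return s;
}
```

std::remove_if() only moves the kept characters to the front and returns the new logical end; the surrounding erase() call is what actually shortens the string.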

Calculating the big-O complexity of this string match function?

Can anyone help me calculate the complexity of the following?
I've written a strStr function for homework, and although it's not part of the assignment, I want to figure out its complexity.
Basically it takes a string, finds the first occurrence of a substring, and returns its index.
I believe it is O(n), because although it has a nested loop, it will run at most n times, where n is the length of s1. Am I correct?
int strStr( char s1[] , char s2[] ){
int haystackInd, needleInd;
bool found = false;
needleInd = haystackInd = 0;
while ((s1[haystackInd] != '\0') && (!found)){
while ( (s1[haystackInd] == s2[needleInd]) && (s2[needleInd] != '\0') ){
needleInd++;
haystackInd++;
}
if (s2[needleInd] == '\0'){
found = true;
}else{
if (needleInd != 0){
needleInd = 0;
}
else{
haystackInd++;
}
}
}
if (found){
return haystackInd - needleInd;
}
else{
return -1;
}
}
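It is indeed O(n), but it also does not work properly. Consider finding "nand" in "nanand": after the partial match "nan" fails at index 3, the search resumes from index 3 instead of index 1, so the real match starting at index 2 is missed.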
There is an O(n) solution to the problem though.
Actually, the outer loop could run 2n times (each iteration either increments haystackInd at least once or sets needleInd to 0, but it never sets needleInd to 0 in two successive iterations), but you end up with the same O(n) complexity.
Your algorithm isn't correct: the haystackInd indices in your solution are wrong. Your conclusion about the algorithm as written is right, though. It is O(n), but it can't reliably find the first occurrence of the substring. The most trivial correct solution is similar to yours: compare string S2 to the substrings starting at S1[0], S1[1], and so on. Its running time is O(n^2) in the worst case. If you want an O(n) one, you should check out the KMP algorithm, as templatetypedef mentioned above.
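The trivial approach described above can be sketched like this. It is the naive quadratic search, not KMP, and the name `strStrNaive` is hypothetical.

```cpp
// Hypothetical naive search: returns the index of the first occurrence
// of s2 in s1, or -1 if there is none. Worst case O(n * m) for
// lengths n and m.
int strStrNaive(const char s1[], const char s2[])
{
    for (int start = 0; ; ++start) {
        int k = 0;
        // Compare s2 against the substring of s1 beginning at start.
        while (s2[k] != '\0' && s1[start + k] == s2[k]) {
            ++k;
        }
        if (s2[k] == '\0') {
            return start;   // the whole needle matched
        }
        if (s1[start + k] == '\0') {
            return -1;      // haystack ran out; no later start can match
        }
        // Otherwise retry from start + 1, not from start + k.
    }
}
```

Restarting at start + 1 after every failed attempt is what makes the "nanand" / "nand" case work, at the cost of re-reading characters that the question's version skips.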