Replace 3 or more occurrences of character within string - c++

I would like to find runs of 3 or more consecutive occurrences of a character (here '\n') within a std::string and replace them.
For example:
std::string foo = "This is a\n\n\n test";
std::string bar = "This is a\n\n\n\n test";
std::string baz = "This is a\n\n\n\n\n test";
std::string boo = "This is a\n\n\n\n\n\n test";
// ... etc.
Should all be converted to:
std::string expectedResult = "This is a\n\n test";
Vanilla stl would be appreciated (no regexp libs or boost) if possible.

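This finds each run of three or more consecutive \n and replaces it with two:
std::string::size_type i = 0;
while ((i = foo.find("\n\n\n", i)) != std::string::npos) {
    // j is the first character after the run, or npos if the run ends the string
    std::string::size_type j = foo.find_first_not_of('\n', i);
    foo.replace(i, j - i, "\n\n"); // an oversized count is clamped to the end of the string
    i += 2; // continue searching after the two newlines that were kept
}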

Write a function to process each string you are interested in modifying:
Read the string one character at a time, keeping the two previous characters in char variables a and b. For each character c you read, do:
if ( a == b && b == c ) {
    // c is the third (or later) identical character in a row:
    // put code here to remove c from your string at this index
} else {
    a = b;
    b = c;
}
I am not 100% sure whether something in the STL does exactly what you are asking, but as you can see this logic takes little code to implement.
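The character-by-character idea above can be sketched as a single pass that builds a new string instead of erasing in place. This is only an illustration: the name collapseRuns and its parameters are hypothetical, and unlike the a/b/c logic it limits runs of one chosen character rather than of any character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: keeps at most maxRun consecutive copies of ch
// and copies every other character unchanged.
std::string collapseRuns(const std::string& in, char ch = '\n', std::size_t maxRun = 2)
{
    std::string out;
    out.reserve(in.size());
    std::size_t run = 0;                 // length of the current run of ch
    for (char c : in) {
        run = (c == ch) ? run + 1 : 0;   // any other character resets the run
        if (run <= maxRun)               // drop copies beyond the allowed run length
            out += c;
    }
    return out;
}
```

With the default arguments, collapseRuns("This is a\n\n\n\n test") returns "This is a\n\n test", and a string with only one or two consecutive newlines comes back unchanged.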

You can use find and replace. Passing the previous position to string::find means the search does not restart from the beginning of the string each time (an optimization). This replaces "\n\n\n..." with "\n\n":
std::string::size_type pos = 0; // size_type rather than int, so the npos comparison is reliable
while ((pos = s.find("\n\n\n", pos)) != std::string::npos)
    s.replace(pos, 3, "\n\n", 2);
And this replaces any run of two or more "\n" with a single "\n":
std::string::size_type pos = 0;
while ((pos = s.find("\n\n", pos)) != std::string::npos)
    s.replace(pos, 2, "\n", 1);

Related

C++ find word in string multiple times

I'm currently trying to find a word inside a string. I'm using string::find(). However, this finds only the first occurrence of the word.
string a = "Dog";
string b = "I have a dog, his name is dog";
if (b.find(a) != string::npos)
{
...
}
Is there a way to scan string b to see how many times the word "dog" appears?
Use a loop that keeps searching until no further match is found.
std::size_t count = 0, pos = 0;
while ((pos = b.find(a, pos)) != std::string::npos) {
pos += a.size(); // Make sure the current one is skipped
count++;
}
// Now you have count
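Note that std::string::find is case-sensitive, so with a = "Dog" the loop above finds nothing in b. A minimal sketch of a case-insensitive count follows; the helper names toLower and countOccurrences are hypothetical, and the lower-casing assumes plain ASCII text.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: ASCII-only lower-casing of a copy of s.
std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Counts non-overlapping occurrences of needle in haystack, ignoring case.
std::size_t countOccurrences(const std::string& haystack, const std::string& needle)
{
    if (needle.empty()) return 0;        // an empty needle would never advance
    const std::string h = toLower(haystack);
    const std::string n = toLower(needle);
    std::size_t count = 0;
    std::string::size_type pos = 0;
    while ((pos = h.find(n, pos)) != std::string::npos) {
        pos += n.size();                 // skip past the match just found
        ++count;
    }
    return count;
}
```

For the strings in the question, countOccurrences("I have a dog, his name is dog", "Dog") gives 2.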

Parse String Between Brackets

I have a string that goes like this:
Room -> Subdiv("X", 0.5, 0.5) { sleep | work } : 0.5
I need to somehow extract the 2 strings between {} , i.e. sleep and work. The format is strict, there can be just 2 words between the brackets, the words can change though. The text before and after the brackets can also change. My initial way of doing it was:
string split = line.substr(line.find("Subdiv(") + _count_of_fchars);
split = split.substr(4, axis.find(") { "));
split = split.erase(split.length() - _count_of_chars);
However, I realised that this is not going to work if the strings inside the brackets are changed to anything with a different length.
How can this be done? Thanks!
Without hard-coding any numbers:
Find A as the index of the first "{" from the end of the string, search backward.
Find B as the index of the first "|" from the position of "{", search forward.
Find C as the index of the first "}" from the position of "|", search forward.
The substring between A and B gives you the first string, while the substring between B and C gives you the second string. You can include the spaces in your substring search, or take them out later.
std::pair<std::string, std::string> SplitMyCustomString(const std::string& str){
auto first = str.find_last_of('{');
if(first == std::string::npos) return {};
auto mid = str.find_first_of('|', first);
if(mid == std::string::npos) return {};
auto last = str.find_first_of('}', mid);
if(last == std::string::npos) return {};
return { str.substr(first+1, mid-first-1), str.substr(mid+1, last-mid-1) };
}
For Trimming the spaces:
std::string Trim(const std::string& str){
    auto first = str.find_first_not_of(' ');
    if(first == std::string::npos) return {}; // empty or all spaces: nothing to keep
    auto last = str.find_last_not_of(' ');    // cannot fail once first was found
    return str.substr(first, last-first+1);
}
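As a rough illustration of how the splitting and trimming steps fit together, here is a self-contained sketch. The names trimSpaces and extractPair are hypothetical, and the code assumes the strict format from the question (exactly one '|' between the last '{' and the following '}').

```cpp
#include <string>
#include <utility>

// Hypothetical helper: strips leading and trailing spaces.
std::string trimSpaces(const std::string& s)
{
    const auto first = s.find_first_not_of(' ');
    if (first == std::string::npos) return "";   // empty or all spaces
    const auto last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Returns the two trimmed words between the last '{' and the following '}',
// or a pair of empty strings if an expected delimiter is missing.
std::pair<std::string, std::string> extractPair(const std::string& line)
{
    const auto open = line.rfind('{');
    if (open == std::string::npos) return {};
    const auto bar = line.find('|', open);
    if (bar == std::string::npos) return {};
    const auto close = line.find('}', bar);
    if (close == std::string::npos) return {};
    return { trimSpaces(line.substr(open + 1, bar - open - 1)),
             trimSpaces(line.substr(bar + 1, close - bar - 1)) };
}
```

For the line in the question, extractPair returns the pair ("sleep", "work").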
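Something like (assuming all three delimiters are present):
std::string::size_type open = str.find("{ ") + 2;   // first character of the first word
std::string::size_type separator = str.find(" | ");
std::string::size_type close = str.find(" }");      // one past the last character of the second word
string strNew1 = str.substr(open, separator - open);
string strNew2 = str.substr(separator + 3, close - (separator + 3));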
Even though you said that the number of words to find is fixed, I made a slightly more flexible example using a regular expression. However, you could still achieve the same result using Мотя's answer.
std::string s = "Room -> Subdiv(\"X\", 0.5, 0.5) { sleep | work } : 0.5";
std::regex rgx("\\{((?:\\s*\\w*\\s*\\|?)+)\\}");
std::smatch match;
if (std::regex_search(s, match, rgx) && match.size() == 2) {
// match[1] now contains "sleep | work"
std::istringstream iss(match[1]);
std::string token;
while (std::getline(iss, token, '|')) {
std::cout << trim(token) << std::endl;
}
}
trim removes leading and trailing spaces and the input string could easily be expanded to look like this: "...{ sleep | work | eat }...".
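Since trim is not shown above, here is a self-contained sketch of the same regex-then-split idea. It is not the answer's original code: trim and wordsInBraces are hypothetical helpers, and the pattern is a simpler one that just captures everything between the first pair of braces.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: strips leading and trailing spaces and tabs.
std::string trim(const std::string& s)
{
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Returns the '|'-separated words found inside the first {...} group of s.
std::vector<std::string> wordsInBraces(const std::string& s)
{
    static const std::regex rgx(R"(\{([^{}]*)\})");   // capture the text between the braces
    std::vector<std::string> words;
    std::smatch match;
    if (std::regex_search(s, match, rgx)) {
        std::istringstream iss(match[1].str());
        std::string token;
        while (std::getline(iss, token, '|'))
            words.push_back(trim(token));
    }
    return words;
}
```

With this sketch, an input containing "{ sleep | work | eat }" yields the three words sleep, work and eat, and a string without braces yields an empty vector.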

string::replace not working correctly 100% of the time?

I'm trying to replace every space character with '%20' in a string, and I'm thinking of using the built in replace function for the string class.
Currently, I have:
void replaceSpace(string& s)
{
int len = s.length();
string str = "%20";
for(int i = 0; i < len; i++) {
if(s[i] == ' ') {
s.replace(i, 1, str);
}
}
}
When I pass in the string "_a_b_c_e_f_g__", where the underscores represent space, my output is "%20a%20b%20c%20e_f_g__". Again, underscores represent space.
Why is it that the spaces near the beginning of the string are replaced, but the spaces towards the end aren't?
You are making s longer with each replacement, but you are not updating len which is used in the loop condition.
Modifying the string that you are just scanning is like cutting the branch under your feet. It may work if you are careful, but in this case you aren't.
Namely, you take the string len at the beginning but with each replacement your string gets longer and you are pushing the replacement places further away (so you never reach all of them).
The correct way to cut this branch is from its end (tip) towards the trunk - this way you always have a safe footing:
void replaceSpace(string& s)
{
int len = s.length();
string str = "%20";
for(int i = len - 1; i >= 0; i--) {
if(s[i] == ' ') {
s.replace(i, 1, str);
}
}
}
You're growing the string but only looping to its initial size.
Looping over a collection while modifying it is very prone to error.
Here's a solution that doesn't:
void replace(string& s)
{
string s1;
std::for_each(s.begin(),
s.end(),
[&](char c) {
if (c == ' ') s1 += "%20";
else s1 += c;
});
s.swap(s1);
}
As others have already mentioned, the problem is you're using the initial string length in your loop, but the string gets bigger along the way. Your loop never reaches the end of the string.
You have a number of ways to fix this. You can correct your solution and make sure you go to the end of the string as it is now, not as it was before you started looping.
Or you can use @molbdnilo's way, which builds a copy of the string along the way.
Or you can use something like this:
std::string input = " a b c e f g ";
std::string::size_type pos = 0;
while ((pos = input.find(' ', pos)) != std::string::npos)
{
input.replace(pos, 1, "%20");
}
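The first option mentioned above, fixing the original index-based loop, could look roughly like this. It is a sketch only: the name encodeSpaces is hypothetical, and it works on a copy and returns it rather than modifying the argument in place.

```cpp
#include <string>

// Hypothetical helper: returns a copy of s with every space replaced by "%20".
std::string encodeSpaces(std::string s)
{
    const std::string repl = "%20";
    // s.size() is re-read on every iteration, so the loop sees the grown string.
    for (std::string::size_type i = 0; i < s.size(); ) {
        if (s[i] == ' ') {
            s.replace(i, 1, repl);
            i += repl.size();   // skip the text that was just inserted
        } else {
            ++i;
        }
    }
    return s;
}
```

Advancing by repl.size() after a replacement is not strictly required here, since "%20" contains no space, but it avoids rescanning inserted text and stays correct if the replacement ever does contain the searched character.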
Here's a function that can make it easier for you:
string replace_char_str(string str, string find_str, string replace_str)
{
    if (find_str.empty()) return str; // nothing to search for
    size_t pos = 0;
    while ((pos = str.find(find_str, pos)) != std::string::npos)
    {
        str.replace(pos, find_str.size(), replace_str); // replace the whole match, not just one character
        pos += replace_str.size(); // resume after the inserted text
    }
    return str;
}
So when you want to replace the spaces, try it like this:
string new_str = replace_char_str(yourstring, " ", "%20");
Hope this helps you! :)

Finding all wanted words in a string

I have a very long string and I want to find the locations of all occurrences of certain words. For example, I want to find the locations of all "apple"s in the string. Can you tell me how to do that?
Thanks
Apply repeatedly std::string::find if you are using C++ strings, or std::strstr if you are using C strings; in both cases, at each iteration start to search n characters after the last match, where n is the length of your word.
std::string str="one apple two apples three apples";
std::string search="apple";
for(std::string::size_type pos=0; pos<str.size(); pos+=search.size())
{
pos=str.find(search, pos);
if(pos==std::string::npos)
break;
std::cout<<"Match found at: "<<pos<<std::endl;
}
Use a loop which repeatedly calls std::string::find; on each iteration, you start finding beyond your last hit:
std::vector<std::string::size_type> indicesOf( const std::string &s,
const std::string &needle )
{
std::vector<std::string::size_type> indices;
std::string::size_type p = 0;
while ( p < s.size() ) {
std::string::size_type q = s.find( needle, p );
if ( q == std::string::npos ) {
break;
}
indices.push_back( q );
p = q + needle.size(); // change needle.size() to 1 for overlapping matches
}
return indices;
}
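The comment about overlapping matches can be made concrete with a small variant. This is a sketch under assumptions: findAll and its overlapping flag are hypothetical names, not part of the answer above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the start index of every match of needle in s.
// With overlapping == true the search resumes one character after each hit,
// otherwise it resumes after the end of the hit.
std::vector<std::size_t> findAll(const std::string& s, const std::string& needle,
                                 bool overlapping = false)
{
    std::vector<std::size_t> hits;
    if (needle.empty()) return hits;     // an empty needle would match everywhere
    const std::size_t step = overlapping ? 1 : needle.size();
    for (std::size_t pos = s.find(needle); pos != std::string::npos;
         pos = s.find(needle, pos + step))
        hits.push_back(pos);
    return hits;
}
```

For s = "aaaa" and needle = "aa", the default call reports positions 0 and 2, while overlapping = true reports 0, 1 and 2.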
void findApples(const char* someString)
{
const char* loc = NULL;
while ((loc = strstr(someString, "apple")) != NULL) {
// do something
someString = loc + strlen("apple");
}
}

Split array of chars into two arrays of chars

I would like to split one array of char containing two "strings" separated by '|' into two arrays of char.
Here is my sample code.
void splitChar(const char *text, char *text1, char *text2)
{
for (;*text!='\0' && *text != '|';) *text1++ = *text++;
*text1 = '\0';
for (;*++text!='\0';) *text2++ = *text;
*text2 = '\0';
}
int main(int argc, char* argv[])
{
char *text = "monday|tuesday", text1[255], text2 [255];
splitChar (text, text1, text2);
return 0;
}
I have two questions:
A) How to further improve this code in C (for example rewrite it in 1 for cycle).
B) How to rewrite this code in C++?
If you want to write it in C++, use the STL:
string s = "monday|tuesday";
string::size_type pos = s.find('|'); // size_type rather than int, so the npos comparison is reliable
if(pos == string::npos)
    return 1;
string part1 = s.substr(0, pos);
string part2 = s.substr(pos+1); // everything after the separator
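Wrapped in a reusable function, the same find/substr idea might look like the sketch below. The name splitOnce is hypothetical, and returning an empty second part when no separator is found is an assumption made for the example.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits s at the first occurrence of sep.
// If sep is absent, the whole string is returned as the first part.
std::pair<std::string, std::string> splitOnce(const std::string& s, char sep = '|')
{
    const std::string::size_type pos = s.find(sep);
    if (pos == std::string::npos)
        return {s, std::string()};
    return {s.substr(0, pos), s.substr(pos + 1)};
}
```

For example, splitOnce("monday|tuesday") returns the pair ("monday", "tuesday").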
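For A, using the standard library (<string.h>):
void splitChar(const char *text, char *text1, char *text2)
{
    const char *sep = strchr(text, '|');
    size_t len = sep ? (size_t)(sep - text) : strlen(text); // no '|': copy everything
    strncpy(text1, text, len);
    text1[len] = '\0'; // strncpy does not add the terminator here
    strcpy(text2, sep ? sep + 1 : "");
}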
I don't know about A), but for B), Here's a method from a utility library I use in various projects, showing how to split any number of words into a vector. It's coded to split on space and tab, but you could pass that in as an additional parameter if you wanted. It returns the number of words split:
unsigned util::split_line(const string &line, vector<string> &parts)
{
const string delimiters = " \t";
unsigned count = 0;
parts.clear();
// skip delimiters at beginning.
string::size_type lastPos = line.find_first_not_of(delimiters, 0);
// find the first delimiter after the start of that token.
string::size_type pos = line.find_first_of(delimiters, lastPos);
while (string::npos != pos || string::npos != lastPos)
{
// found a token, add it to the vector.
parts.push_back(line.substr(lastPos, pos - lastPos));
count++;
// skip delimiters. Note the "not_of"
lastPos = line.find_first_not_of(delimiters, pos);
// find the next delimiter
pos = line.find_first_of(delimiters, lastPos);
}
return count;
}
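The suggestion to pass the delimiters in as an additional parameter could be sketched as follows. This is an illustrative variant, not the utility library's code: splitTokens is a hypothetical name, and it returns the vector instead of a count.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits line on any character in delimiters.
// Consecutive delimiters are treated as one, so empty fields are skipped.
std::vector<std::string> splitTokens(const std::string& line,
                                     const std::string& delimiters = " \t")
{
    std::vector<std::string> parts;
    std::string::size_type start = line.find_first_not_of(delimiters);
    while (start != std::string::npos) {
        const std::string::size_type end = line.find_first_of(delimiters, start);
        // When end is npos, substr clamps the count and takes the rest of the line.
        parts.push_back(line.substr(start, end - start));
        start = line.find_first_not_of(delimiters, end);
    }
    return parts;
}
```

Calling splitTokens("monday|tuesday", "|") gives the two strings monday and tuesday. Because runs of delimiters collapse, "a||b" also yields just a and b, which may or may not be what a given format needs.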
Probably one of these solutions will work: Split a string in C++?
Take a look at the example given here: strtok, wcstok, _mbstok
I've found a destructive split is the best balance of performance and flexibility.
void split_destr(std::string &str, char split_by, std::vector<char*> &fields) {
    fields.push_back(&str[0]);
    for (size_t i = 0; i < str.size(); i++) {
        if (str[i] == split_by) {
            str[i] = '\0';
            // Do not grow str here: a reallocation would invalidate the pointers
            // already stored. Since C++11, &str[str.size()] points at the terminator.
            fields.push_back(&str[i+1]);
        }
    }
}
Then a non-destructive version for lazies.
template<typename C>
void split_copy(const std::string &str_, char split_by, C &container) {
std::string str = str_;
std::vector<char*> tokens;
parse::split_destr(str, split_by, tokens);
for (size_t i = 0 ; i < tokens.size(); i++)
container.push_back(std::string( tokens[i] ));
}
I arrived at this when things like boost::Tokenizer fell flat on their face dealing with GB+ size files.
I apologize in advance for my answer :) No one should try this at home.
To answer the first part of your question.
A] How to further improve this code in C (for example rewrite it in 1 for cycle).
The complexity of this algorithm will depend on where the position of '|' is in the string but this example only works for 2 strings separated by a '|'. You can easily modify it later for more than that.
#include <stdio.h>
void splitChar(char *text, char **text1, char **text2)
{
char * temp = *text1 = text;
*text2 = NULL; /* stays NULL if the text contains no '|' */
while (*temp != '\0' && *temp != '|') temp++;
if (*temp == '|')
{
*temp ='\0';
*text2 = temp + 1;
}
}
int main(int argc, char* argv[])
{
char text[] = "monday|tuesday", *text1,*text2;
splitChar (text, &text1, &text2);
printf("%s\n%s\n%s", text,text1,text2);
return 0;
}
This works because C-style strings are terminated by the null character. Since initializing a character array from a string literal adds a null char at the end, all you have to do is replace the '|' with the null character and point the second char pointer at the next byte past it.
You have to declare your original string as an array with [] because that tells the compiler to allocate writable storage for the characters, whereas a char * would point at the string literal itself, which may live in a read-only area of memory that can't be changed.