I found this on another stack question:
//http://stackoverflow.com/questions/3418231/c-replace-part-of-a-string-with-another-string
//
void replaceAll(std::string& str, const std::string& from, const std::string& to) {
size_t start_pos = 0;
while((start_pos = str.find(from, start_pos)) != std::string::npos) {
size_t end_pos = start_pos + from.length();
str.replace(start_pos, end_pos, to);
start_pos += to.length(); // In case 'to' contains 'from', like replacing 'x' with 'yx'
}
}
and my method:
string convert_FANN_array_to_binary(string fann_array)
{
string result = fann_array;
cout << result << "\n";
replaceAll(result, "-1 ", "0");
cout << result << "\n";
replaceAll(result, "1 ", "1");
return result;
}
which, for this input:
cout << convert_FANN_array_to_binary("1 1 -1 -1 1 1 ");
now, the output should be "110011"
here is the output of the method:
1 1 -1 -1 1 1 // original
1 1 0 1 // replacing -1's with 0's
11 1 // result, as it was returned from convert_FANN_array_to_binary()
I've been looking at the replaceAll code, and, I'm really not sure why it is replacing consecutive -1's with one 0, and then not returning any 0's (and some 1's) in the final result. =\
A complete code:
std::string ReplaceString(std::string subject, const std::string& search,
const std::string& replace) {
size_t pos = 0;
while ((pos = subject.find(search, pos)) != std::string::npos) {
subject.replace(pos, search.length(), replace);
pos += replace.length();
}
return subject;
}
If you need performance, here is a more optimized function that modifies the input string, it does not create a copy of the string:
void ReplaceStringInPlace(std::string& subject, const std::string& search,
const std::string& replace) {
size_t pos = 0;
while ((pos = subject.find(search, pos)) != std::string::npos) {
subject.replace(pos, search.length(), replace);
pos += replace.length();
}
}
Tests:
std::string input = "abc abc def";
std::cout << "Input string: " << input << std::endl;
std::cout << "ReplaceString() return value: "
<< ReplaceString(input, "bc", "!!") << std::endl;
std::cout << "ReplaceString() input string not changed: "
<< input << std::endl;
ReplaceStringInPlace(input, "bc", "??");
std::cout << "ReplaceStringInPlace() input string modified: "
<< input << std::endl;
Output:
Input string: abc abc def
ReplaceString() return value: a!! a!! def
ReplaceString() input string not changed: abc abc def
ReplaceStringInPlace() input string modified: a?? a?? def
The bug is in str.replace(start_pos, end_pos, to);
From the std::string doc at http://www.cplusplus.com/reference/string/string/replace/
string& replace ( size_t pos1, size_t n1, const string& str );
You are using an end-position, while the function expects a length.
So change to:
while((start_pos = str.find(from, start_pos)) != std::string::npos) {
str.replace(start_pos, from.length(), to);
start_pos += to.length(); // ...
}
Note: untested.
This is going to go in my list of 'just use a Boost library' answers, but here it goes anyway:
Have you considered Boost.String? It has more features than the standard library, and where features overlap, Boost.String has a more much more natural syntax, in my opinion.
I found the replace functions given in previous answers, all using in-place str.replace() call internally, very slow when working with a string of about 2 MB length. Specifically, I called something like ReplaceAll(str, "\r", ""), and on my particular device, with the text file containing a lot of newlines, it took about 27 seconds. I then replaced it with function just concatenating sub-strings in a new copy, and it took only about 1 seconds. Here is my version of ReplaceAll():
void replaceAll(string& str, const string& from, const string& to) {
if(from.empty())
return;
string wsRet;
wsRet.reserve(str.length());
size_t start_pos = 0, pos;
while((pos = str.find(from, start_pos)) != string::npos) {
wsRet += str.substr(start_pos, pos - start_pos);
wsRet += to;
pos += from.length();
start_pos = pos;
}
wsRet += str.substr(start_pos);
str.swap(wsRet); // faster than str = wsRet;
}
Greg
C++11 now includes the header <regex> which has regular expression functionality. From the docs:
// regex_replace example
#include <iostream>
#include <string>
#include <regex>
#include <iterator>
int main ()
{
std::string s ("there is a subsequence in the string\n");
std::regex e ("\\b(sub)([^ ]*)"); // matches words beginning by "sub"
// using string/c-string (3) version:
std::cout << std::regex_replace (s,e,"sub-$2");
std::cout << std::endl;
return 0;
}
Of course, now you have two problems.
Try this:
#include <string>
string replace_str(string & str, const string & from, const string & to)
{
while(str.find(from) != string::npos)
str.replace(str.find(from), from.length(), to);
return str;
}
Related
I am trying to create a function that will take in a long string with " " and "," dividing different elements gained from a text file. "," represents a new line and " " represents the end of a number in the line. once split the strings should be part of an array.
string* splitstr(string text, string deli = " ")
{
int start = 0;
int end = text.find(deli);
string *numbers[end];
for (int i = 0; end != -1; i++)
{
numbers[i] = text.substr(start, end - start);
start = end + deli.size();
end = text.find(deli, start);
}
return numbers;
}
However I am facing a problem where I get this:
cannot convert 'std::string**' {aka 'std::__cxx11::basic_string<char>**'} to 'std::string*' {aka 'std::__cxx11::basic_string<char>*'} in return
How can I resolve this, or improve my function?
I was expecting to be able to store string elements in a string array and make the function return the array. I'm fairly new to programming and I started off with java so this logic might not work in c++.
You should avoid using arrays like this and instead use a vector
like as follows:
std::vector<std::string> splitstr(const std::string & text, const std::string & deli = " ")
{
int start = 0;
int end = text.find(deli);
std::vector<std::string> numbers;
for (int i = 0; end != -1; i++)
{
numbers.push_back(text.substr(start, end - start));
start = end + deli.size();
end = text.find(deli, start);
}
return numbers;
}
note this will not get the last item in the string if there is not a delimiter after it.
Use a combination of std::vector and std::string_view.
A string_view avoids the need to allocate new output strings, the caveat is that you can only use the split result if the original input string is still in memory.
#include <iostream>
#include <vector>
#include <string>
#include <string_view>
auto split(std::string_view string, std::string_view delimiters)
{
std::vector<std::string_view> substrings;
if (delimiters.size() == 0ul)
{
substrings.emplace_back(string);
return substrings;
}
auto start_pos = string.find_first_not_of(delimiters);
auto end_pos = start_pos;
auto max_length = string.length();
while (start_pos < max_length)
{
end_pos = std::min(max_length, string.find_first_of(delimiters, start_pos));
if (end_pos != start_pos)
{
substrings.emplace_back(&string[start_pos], end_pos - start_pos);
start_pos = string.find_first_not_of(delimiters, end_pos);
}
}
return substrings;
}
struct test_data_t
{
std::string input;
std::string delim;
};
int main()
{
using namespace std::string_view_literals;
std::vector<test_data_t> test_data
{
{ " ", " " },
{ "a", " " },
{ "aa", " " },
{ "a a", " " },
{ " a a", " " },
{ " a a", " " },
{ "a a ", " " },
{ "`The time has come,` the Walrus said, `To talk of many things : Of shoes -- and ships -- and sealing -- wax -- Of cabbages -- and kings -- And why the sea is boiling hot --And whether pigs have wings.`", "`, -.:" }
};
for (const auto& [input, delim] : test_data)
{
auto substrings = split(input, delim);
std::cout << substrings.size() << " : ";
bool comma = false;
for (const auto& substring : substrings)
{
if (comma) std::cout << ", ";
std::cout << substring;
comma = true;
}
std::cout << "\n";
}
return 0;
}
What would be easiest method to split a string using c++11?
I've seen the method used by this post, but I feel that there ought to be a less verbose way of doing it using the new standard.
Edit: I would like to have a vector<string> as a result and be able to delimitate on a single character.
std::regex_token_iterator performs generic tokenization based on a regex. It may or may not be overkill for doing simple splitting on a single character, but it works and is not too verbose:
std::vector<std::string> split(const string& input, const string& regex) {
// passing -1 as the submatch index parameter performs splitting
std::regex re(regex);
std::sregex_token_iterator
first{input.begin(), input.end(), re, -1},
last;
return {first, last};
}
Here is a (maybe less verbose) way to split string (based on the post you mentioned).
#include <string>
#include <sstream>
#include <vector>
std::vector<std::string> split(const std::string &s, char delim) {
std::stringstream ss(s);
std::string item;
std::vector<std::string> elems;
while (std::getline(ss, item, delim)) {
elems.push_back(item);
// elems.push_back(std::move(item)); // if C++11 (based on comment from #mchiasson)
}
return elems;
}
Here's an example of splitting a string and populating a vector with the extracted elements using boost.
#include <boost/algorithm/string.hpp>
std::string my_input("A,B,EE");
std::vector<std::string> results;
boost::algorithm::split(results, my_input, boost::is_any_of(","));
assert(results[0] == "A");
assert(results[1] == "B");
assert(results[2] == "EE");
Another regex solution inspired by other answers but hopefully shorter and easier to read:
std::string s{"String to split here, and here, and here,..."};
std::regex regex{R"([\s,]+)"}; // split on space and comma
std::sregex_token_iterator it{s.begin(), s.end(), regex, -1};
std::vector<std::string> words{it, {}};
I don't know if this is less verbose, but it might be easier to grok for those more seasoned in dynamic languages such as javascript. The only C++11 features it uses is auto and range-based for loop.
#include <string>
#include <cctype>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
string s = "hello how are you won't you tell me your name";
vector<string> tokens;
string token;
for (const auto& c: s) {
if (!isspace(c))
token += c;
else {
if (token.length()) tokens.push_back(token);
token.clear();
}
}
if (token.length()) tokens.push_back(token);
return 0;
}
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
using namespace std;
vector<string> split(const string& str, int delimiter(int) = ::isspace){
vector<string> result;
auto e=str.end();
auto i=str.begin();
while(i!=e){
i=find_if_not(i,e, delimiter);
if(i==e) break;
auto j=find_if(i,e, delimiter);
result.push_back(string(i,j));
i=j;
}
return result;
}
int main(){
string line;
getline(cin,line);
vector<string> result = split(line);
for(auto s: result){
cout<<s<<endl;
}
}
My choice is boost::tokenizer but I didn't have any heavy tasks and test with huge data.
Example from boost doc with lambda modification:
#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>
#include <vector>
int main()
{
using namespace std;
using namespace boost;
string s = "This is, a test";
vector<string> v;
tokenizer<> tok(s);
for_each (tok.begin(), tok.end(), [&v](const string & s) { v.push_back(s); } );
// result 4 items: 1)This 2)is 3)a 4)test
return 0;
}
This is my answer. Verbose, readable and efficient.
std::vector<std::string> tokenize(const std::string& s, char c) {
auto end = s.cend();
auto start = end;
std::vector<std::string> v;
for( auto it = s.cbegin(); it != end; ++it ) {
if( *it != c ) {
if( start == end )
start = it;
continue;
}
if( start != end ) {
v.emplace_back(start, it);
start = end;
}
}
if( start != end )
v.emplace_back(start, end);
return v;
}
#include <string>
#include <vector>
#include <sstream>
inline vector<string> split(const string& s) {
vector<string> result;
istringstream iss(s);
for (string w; iss >> w; )
result.push_back(w);
return result;
}
Here is a C++11 solution that uses only std::string::find(). The delimiter can be any number of characters long. Parsed tokens are output via an output iterator, which is typically a std::back_inserter in my code.
I have not tested this with UTF-8, but I expect it should work as long as the input and delimiter are both valid UTF-8 strings.
#include <string>
template<class Iter>
Iter splitStrings(const std::string &s, const std::string &delim, Iter out)
{
if (delim.empty()) {
*out++ = s;
return out;
}
size_t a = 0, b = s.find(delim);
for ( ; b != std::string::npos;
a = b + delim.length(), b = s.find(delim, a))
{
*out++ = std::move(s.substr(a, b - a));
}
*out++ = std::move(s.substr(a, s.length() - a));
return out;
}
Some test cases:
void test()
{
std::vector<std::string> out;
size_t counter;
std::cout << "Empty input:" << std::endl;
out.clear();
splitStrings("", ",", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, empty delimiter:" << std::endl;
out.clear();
splitStrings("Hello, world!", "", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", no delimiter in string:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xyz", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string:" << std::endl;
out.clear();
splitStrings("abxycdxy!!xydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", input contains blank token:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", nothing after last delimiter:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", only delimiter exists string:" << std::endl;
out.clear();
splitStrings("xy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
}
Expected output:
Empty input:
0:
Non-empty input, empty delimiter:
0: Hello, world!
Non-empty input, non-empty delimiter, no delimiter in string:
0: abxycdxyxydefxya
Non-empty input, non-empty delimiter, delimiter exists string:
0: ab
1: cd
2: !!
3: def
4: a
Non-empty input, non-empty delimiter, delimiter exists string, input contains blank token:
0: ab
1: cd
2:
3: def
4: a
Non-empty input, non-empty delimiter, delimiter exists string, nothing after last delimiter:
0: ab
1: cd
2:
3: def
4:
Non-empty input, non-empty delimiter, only delimiter exists string:
0:
1:
One possible way of doing this is finding all occurrences of the split string and storing locations to a list. Then count input string characters and when you get to a position where there is a 'search hit' in the position list then you jump forward by 'length of the split string'. This approach takes a split string of any length. Here is my tested and working solution.
#include <iostream>
#include <string>
#include <list>
#include <vector>
using namespace std;
vector<string> Split(string input_string, string search_string)
{
list<int> search_hit_list;
vector<string> word_list;
size_t search_position, search_start = 0;
// Find start positions of every substring occurence and store positions to a hit list.
while ( (search_position = input_string.find(search_string, search_start) ) != string::npos) {
search_hit_list.push_back(search_position);
search_start = search_position + search_string.size();
}
// Iterate through hit list and reconstruct substring start and length positions
int character_counter = 0;
int start, length;
for (auto hit_position : search_hit_list) {
// Skip over substrings we are splitting with. This also skips over repeating substrings.
if (character_counter == hit_position) {
character_counter = character_counter + search_string.size();
continue;
}
start = character_counter;
character_counter = hit_position;
length = character_counter - start;
word_list.push_back(input_string.substr(start, length));
character_counter = character_counter + search_string.size();
}
// If the search string is not found in the input string, then return the whole input_string.
if (word_list.size() == 0) {
word_list.push_back(input_string);
return word_list;
}
// The last substring might be still be unprocessed, get it.
if (character_counter < input_string.size()) {
word_list.push_back(input_string.substr(character_counter, input_string.size() - character_counter));
}
return word_list;
}
int main() {
vector<string> word_list;
string search_string = " ";
// search_string = "the";
string text = "thetheThis is some text to test with the split-thethe function.";
word_list = Split(text, search_string);
for (auto item : word_list) {
cout << "'" << item << "'" << endl;
}
cout << endl;
}
What would be easiest method to split a string using c++11?
I've seen the method used by this post, but I feel that there ought to be a less verbose way of doing it using the new standard.
Edit: I would like to have a vector<string> as a result and be able to delimitate on a single character.
std::regex_token_iterator performs generic tokenization based on a regex. It may or may not be overkill for doing simple splitting on a single character, but it works and is not too verbose:
std::vector<std::string> split(const string& input, const string& regex) {
// passing -1 as the submatch index parameter performs splitting
std::regex re(regex);
std::sregex_token_iterator
first{input.begin(), input.end(), re, -1},
last;
return {first, last};
}
Here is a (maybe less verbose) way to split string (based on the post you mentioned).
#include <string>
#include <sstream>
#include <vector>
std::vector<std::string> split(const std::string &s, char delim) {
std::stringstream ss(s);
std::string item;
std::vector<std::string> elems;
while (std::getline(ss, item, delim)) {
elems.push_back(item);
// elems.push_back(std::move(item)); // if C++11 (based on comment from #mchiasson)
}
return elems;
}
Here's an example of splitting a string and populating a vector with the extracted elements using boost.
#include <boost/algorithm/string.hpp>
std::string my_input("A,B,EE");
std::vector<std::string> results;
boost::algorithm::split(results, my_input, boost::is_any_of(","));
assert(results[0] == "A");
assert(results[1] == "B");
assert(results[2] == "EE");
Another regex solution inspired by other answers but hopefully shorter and easier to read:
std::string s{"String to split here, and here, and here,..."};
std::regex regex{R"([\s,]+)"}; // split on space and comma
std::sregex_token_iterator it{s.begin(), s.end(), regex, -1};
std::vector<std::string> words{it, {}};
I don't know if this is less verbose, but it might be easier to grok for those more seasoned in dynamic languages such as javascript. The only C++11 features it uses is auto and range-based for loop.
#include <string>
#include <cctype>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
string s = "hello how are you won't you tell me your name";
vector<string> tokens;
string token;
for (const auto& c: s) {
if (!isspace(c))
token += c;
else {
if (token.length()) tokens.push_back(token);
token.clear();
}
}
if (token.length()) tokens.push_back(token);
return 0;
}
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
using namespace std;
vector<string> split(const string& str, int delimiter(int) = ::isspace){
vector<string> result;
auto e=str.end();
auto i=str.begin();
while(i!=e){
i=find_if_not(i,e, delimiter);
if(i==e) break;
auto j=find_if(i,e, delimiter);
result.push_back(string(i,j));
i=j;
}
return result;
}
int main(){
string line;
getline(cin,line);
vector<string> result = split(line);
for(auto s: result){
cout<<s<<endl;
}
}
My choice is boost::tokenizer but I didn't have any heavy tasks and test with huge data.
Example from boost doc with lambda modification:
#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>
#include <vector>
int main()
{
using namespace std;
using namespace boost;
string s = "This is, a test";
vector<string> v;
tokenizer<> tok(s);
for_each (tok.begin(), tok.end(), [&v](const string & s) { v.push_back(s); } );
// result 4 items: 1)This 2)is 3)a 4)test
return 0;
}
This is my answer. Verbose, readable and efficient.
std::vector<std::string> tokenize(const std::string& s, char c) {
auto end = s.cend();
auto start = end;
std::vector<std::string> v;
for( auto it = s.cbegin(); it != end; ++it ) {
if( *it != c ) {
if( start == end )
start = it;
continue;
}
if( start != end ) {
v.emplace_back(start, it);
start = end;
}
}
if( start != end )
v.emplace_back(start, end);
return v;
}
#include <string>
#include <vector>
#include <sstream>
inline vector<string> split(const string& s) {
vector<string> result;
istringstream iss(s);
for (string w; iss >> w; )
result.push_back(w);
return result;
}
Here is a C++11 solution that uses only std::string::find(). The delimiter can be any number of characters long. Parsed tokens are output via an output iterator, which is typically a std::back_inserter in my code.
I have not tested this with UTF-8, but I expect it should work as long as the input and delimiter are both valid UTF-8 strings.
#include <string>
template<class Iter>
Iter splitStrings(const std::string &s, const std::string &delim, Iter out)
{
if (delim.empty()) {
*out++ = s;
return out;
}
size_t a = 0, b = s.find(delim);
for ( ; b != std::string::npos;
a = b + delim.length(), b = s.find(delim, a))
{
*out++ = std::move(s.substr(a, b - a));
}
*out++ = std::move(s.substr(a, s.length() - a));
return out;
}
Some test cases:
void test()
{
std::vector<std::string> out;
size_t counter;
std::cout << "Empty input:" << std::endl;
out.clear();
splitStrings("", ",", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, empty delimiter:" << std::endl;
out.clear();
splitStrings("Hello, world!", "", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", no delimiter in string:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xyz", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string:" << std::endl;
out.clear();
splitStrings("abxycdxy!!xydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", input contains blank token:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", nothing after last delimiter:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", only delimiter exists string:" << std::endl;
out.clear();
splitStrings("xy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
}
Expected output:
Empty input:
0:
Non-empty input, empty delimiter:
0: Hello, world!
Non-empty input, non-empty delimiter, no delimiter in string:
0: abxycdxyxydefxya
Non-empty input, non-empty delimiter, delimiter exists string:
0: ab
1: cd
2: !!
3: def
4: a
Non-empty input, non-empty delimiter, delimiter exists string, input contains blank token:
0: ab
1: cd
2:
3: def
4: a
Non-empty input, non-empty delimiter, delimiter exists string, nothing after last delimiter:
0: ab
1: cd
2:
3: def
4:
Non-empty input, non-empty delimiter, only delimiter exists string:
0:
1:
One possible way of doing this is finding all occurrences of the split string and storing locations to a list. Then count input string characters and when you get to a position where there is a 'search hit' in the position list then you jump forward by 'length of the split string'. This approach takes a split string of any length. Here is my tested and working solution.
#include <iostream>
#include <string>
#include <list>
#include <vector>
using namespace std;
vector<string> Split(string input_string, string search_string)
{
list<int> search_hit_list;
vector<string> word_list;
size_t search_position, search_start = 0;
// Find start positions of every substring occurence and store positions to a hit list.
while ( (search_position = input_string.find(search_string, search_start) ) != string::npos) {
search_hit_list.push_back(search_position);
search_start = search_position + search_string.size();
}
// Iterate through hit list and reconstruct substring start and length positions
int character_counter = 0;
int start, length;
for (auto hit_position : search_hit_list) {
// Skip over substrings we are splitting with. This also skips over repeating substrings.
if (character_counter == hit_position) {
character_counter = character_counter + search_string.size();
continue;
}
start = character_counter;
character_counter = hit_position;
length = character_counter - start;
word_list.push_back(input_string.substr(start, length));
character_counter = character_counter + search_string.size();
}
// If the search string is not found in the input string, then return the whole input_string.
if (word_list.size() == 0) {
word_list.push_back(input_string);
return word_list;
}
// The last substring might be still be unprocessed, get it.
if (character_counter < input_string.size()) {
word_list.push_back(input_string.substr(character_counter, input_string.size() - character_counter));
}
return word_list;
}
int main() {
vector<string> word_list;
string search_string = " ";
// search_string = "the";
string text = "thetheThis is some text to test with the split-thethe function.";
word_list = Split(text, search_string);
for (auto item : word_list) {
cout << "'" << item << "'" << endl;
}
cout << endl;
}
Last week I got an homework to write a function: the function gets a string and a char value and should divide the string in two parts, before and after the first occurrence of the existing char.
The code worked but my teacher told me to do it again, because it is not well written code. But I don't understand how to make it better. I understand so far that defining two strings with white spaces is not good, but i get out of bounds exceptions otherwise. Since the string input changes, the string size changes everytime.
#include <iostream>
#include <string>
using namespace std;
void divide(char search, string text, string& first_part, string& sec_part)
{
bool firstc = true;
int counter = 0;
for (int i = 0; i < text.size(); i++) {
if (text.at(i) != search && firstc) {
first_part.at(i) = text.at(i);
}
else if (text.at(i) == search&& firstc == true) {
firstc = false;
sec_part.at(counter) = text.at(i);
}
else {
sec_part.at(counter) = text.at(i);
counter++;
}
}
}
int main() {
string text;
string part1=" ";
string part2=" ";
char search_char;
cout << "Please enter text? ";
getline(cin, text);
cout << "Please enter a char: ? ";
cin >> search_char;
divide(search_char,text,aprt1,part2);
cout << "First string: " << part1 <<endl;
cout << "Second string: " << part2 << endl;
system("PAUSE");
return 0;
}
I would suggest you, learn to use c++ standard functions. there are plenty utility function that can help you in programming.
void divide(const std::string& text, char search, std::string& first_part, std::string& sec_part)
{
std::string::const_iterator pos = std::find(text.begin(), text.end(), search);
first_part.append(text, 0, pos - text.begin());
sec_part.append(text, pos - text.begin());
}
int main()
{
std::string text = "thisisfirst";
char search = 'f';
std::string first;
std::string second;
divide(text, search, first, second);
}
Here I used std::find that you can read about it from here and also Iterators.
You have some other mistakes. you are passing your text by value that will do a copy every time you call your function. pass it by reference but qualify it with const that will indicate it is an input parameter not an output.
Why is your teacher right ?
The fact that you need to initialize your destination strings with empty space is terrible:
If the input string is longer, you'll get out of bound errors.
If it's shorter, you got wrong answer, because in IT and programming, "It works " is not the same as "It works".
In addition, your code does not fit the specifications. It should work all the time, independently of the current value which is stored in your output strings.
Alternative 1: your code but working
Just clear the destination strings at the beginning. Then iterate as you did, but use += or push_back() to add chars at the end of the string.
void divide(char search, string text, string& first_part, string& sec_part)
{
bool firstc = true;
first_part.clear(); // make destinations strings empty
sec_part.clear();
for (int i = 0; i < text.size(); i++) {
char c = text.at(i);
if (firstc && c != search) {
first_part += c;
}
else if (firstc && c == search) {
firstc = false;
sec_part += c;
}
else {
sec_part += c;
}
}
}
I used a temporary c instead of text.at(i) or text\[i\], in order to avoid multiple indexing But this is not really required: nowadays, optimizing compilers should produce equivalent code, whatever variant you use here.
Alternative 2: use string member functions
This alternative uses the find() function, and then constructs a string from the start until that position, and another from that position. There is a special case when the character was not found.
void divide(char search, string text, string& first_part, string& sec_part)
{
auto pos = text.find(search);
first_part = string(text, 0, pos);
if (pos== string::npos)
sec_part.clear();
else sec_part = string(text, pos, string::npos);
}
As you understand yourself these declarations
string part1=" ";
string part2=" ";
do not make sense because the entered string in the object text can essentially exceed the both initialized strings. In this case using the string method at can result in throwing an exception or the strings will have trailing spaces.
From the description of the assignment it is not clear whether the searched character should be included in one of the strings. You suppose that the character should be included in the second string.
Take into account that the parameter text should be declared as a constant reference.
Also instead of using loops it is better to use methods of the class std::string such as for example find.
The function can look the following way
#include <iostream>
#include <string>
void divide(const std::string &text, char search, std::string &first_part, std::string &sec_part)
{
std::string::size_type pos = text.find(search);
first_part = text.substr(0, pos);
if (pos == std::string::npos)
{
sec_part.clear();
}
else
{
sec_part = text.substr(pos);
}
}
int main()
{
std::string text("Hello World");
std::string first_part;
std::string sec_part;
divide(text, ' ', first_part, sec_part);
std::cout << "\"" << text << "\"\n";
std::cout << "\"" << first_part << "\"\n";
std::cout << "\"" << sec_part << "\"\n";
}
The program output is
"Hello World"
"Hello"
" World"
As you can see the separating character is included in the second string though I think that maybe it would be better to exclude it from the both strings.
An alternative and in my opinion more clear approach can look the following way
#include <iostream>
#include <string>
#include <utility>
std::pair<std::string, std::string> divide(const std::string &s, char c)
{
std::string::size_type pos = s.find(c);
return { s.substr(0, pos), pos == std::string::npos ? "" : s.substr(pos) };
}
int main()
{
std::string text("Hello World");
auto p = divide(text, ' ');
std::cout << "\"" << text << "\"\n";
std::cout << "\"" << p.first << "\"\n";
std::cout << "\"" << p.second << "\"\n";
}
Your code will only work as long the character is found within part1.length(). You need something similar to this:
void string_split_once(const char s, const string & text, string & first, string & second) {
first.clear();
second.clear();
std::size_t pos = str.find(s);
if (pos != string::npos) {
first = text.substr(0, pos);
second = text.substr(pos);
}
}
The biggest problem I see is that you are using at where you should be using push_back. See std::basic_string::push_back. at is designed to access an existing character to read or modify it. push_back appends a new character to the string.
divide could look like this :
void divide(char search, string text, string& first_part,
string& sec_part)
{
bool firstc = true;
for (int i = 0; i < text.size(); i++) {
if (text.at(i) != search && firstc) {
first_part.push_back(text.at(i));
}
else if (text.at(i) == search&& firstc == true) {
firstc = false;
sec_part.push_back(text.at(i));
}
else {
sec_part.push_back(text.at(i));
}
}
}
Since you aren't handling exceptions, consider using text[i] rather than text.at(i).
I'm trying not to use any storage containers. I don't know if it's even possible. Here is what I have so far. (I'm getting a segmentation fault).
#include <iostream>
#include <string>
using namespace std;
void foo(string s)
{
size_t pos;
pos = s.find(' ');
if(pos == string::npos)
return;
foo(s.erase(0, pos));
cout << s.substr(0, pos) << " ";
}
int main()
{
foo("hello world");
return 0;
}
I know there's probably many things wrong with this code. So rip away. I'm eager to learn. I'm trying to imitate a post order print as you would do in a reverse print of a singly linked list. Thanks.
EDIT:
An example:
"You are amazing" becomes "amazing are You"
The segfault is a stack overflow.
foo( "hello world" ) erases everything up to the first space (" world") and recurses.
foo( " world" ) erases everything up to the first space (" world") and recurses.
foo( " world" )... you get the idea.
Also, once you called foo( s.erase( 0, pos ) ), trying to print s.substr( 0, pos ) after the recursion returns does not make sense. You need to save the substring somewhere before you erase it, so you still have it to print afterwards.
void foo(string s)
{
size_t pos = s.find(' '); // declare-and-use in one line
string out = s.substr( 0, pos ); // saving the substring
if ( pos != string::npos )
{
foo( s.erase( 0, pos + 1 ) ); // recurse, skipping the space...
cout << " "; // ...but *print* the space
}
cout << out; // print the saved substring
}
The problem is that your recursion continues until you run out of memory.
Pay attention to this line:
if(pos == string::npos)
when your erase the substring you don't erase the white space so in the next recursion s.find returns pos = 0 which means that your recursion never ends.
Here is a code that works. Also note that I added a level variable to be able to control the behaviour on the first level (in this case add a endl)
#include <iostream>
#include <string>
using namespace std;
void foo(string s, int l)
{
size_t pos;
pos = s.find(' ');
if(pos == string::npos){
cout << s << " ";
return;
}
string temp = s.substr(0, pos);
foo(s.erase(0, pos+1),l+1);
cout << temp << " ";
if(l == 0)
cout << endl;
}
int main()
{
foo("hello world", 0);
return 0;
}
An approach to recursion, which may allow your compiler to transform automatically to iteration, is to accumulate the result in the function arguments. This will be familiar if you've written recursive functions in any of the Lisp family of languages:
#include <iostream>
#include <string>
std::string reverse_words(const std::string& s, const std::string& o = {})
{
using std::string;
const auto npos = string::npos;
static const string whitespace(" \n\r\t");
// find start and end of the first whitespace block
auto start = s.find_first_of(whitespace);
if (start == npos)
return s + o;
auto end = s.find_first_not_of(whitespace, start);
if (end == npos)
return s + o;
auto word = s.substr(0, start);
auto space = s.substr(start, end-start);
auto rest = s.substr(end);
return reverse_words(rest, space + word + o);
}
int main()
{
std::cout << reverse_words("hello to all the world") << std::endl;
std::cout << reverse_words(" a more difficult\n testcase ") << std::endl;
return 0;
}
I tried to make a brief example by using standard algorithms. I also handles more kinds of spaces than just standard whitespace (tabs for instance).
#include <cctype>
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
void print_reverse(string words) {
// Termination condition
if(words.empty())
return;
auto predicate = (int(*)(int))isspace;
auto sit = begin(words);
auto wit = find_if_not(sit, end(words), predicate);
auto nit = find_if (wit, end(words), predicate);
print_reverse(string(nit, end(words)));
// word spaces
cout << string(wit, nit) << string(sit, wit);
}
int main() {
string line;
getline(cin, line);
print_reverse(line);
cout << endl;
}
Here is an example run:
$ ./print-out-the-words-of-a-line-in-reverse-order-through-recursion
You are amazing
amazing are You
The key is in adding 1 to pos in the erase statement.
So try:
#include <iostream>
#include <string>
using namespace std;
void foo(string s)
{
size_t pos;
pos = s.find(' ');
if(pos == string::npos)
{
cout << s << " ";
return;
}
string out = s.substr(0, pos);
foo(s.erase(0, pos+1));
cout << out << " ";
}
int main()
{
foo("hello world");
cout << endl;
return 0;
}
EDIT
Alternatively you could use a char* instead of a std::string, then you do not need to make a temp variable. Try it online.
#include <iostream>
#include <cstring>
void foo(char* s)
{
char* next = std::strchr(s, ' ');
if(next != nullptr)
{
foo(next + 1);
*next = 0;
}
std::cout << s << " ";
}
int main()
{
char s[] = "You are amazing";
foo(s);
std::cout << std::endl;
}
The problem is that you're not doing anything with the last word and you're not doing anything with the remaining chunk.
If you have a recursive reverse printer, you'll want something like this (pseudocode):
def recursive-reverse(string) {
pos = string.find-last(" ");
if pos doesn't exist {
print string;
return;
} else {
print string.create-substring(pos+1, string.end);
recursive-reverse(string.create-substring(0, pos));
}
}
To implement this in C++:
#include <iostream>
#include <string>
void recursive_reverse(std::string &s) {
// find the last space
size_t pos = s.find_last_of(" ");
// base case - there's no space
if(pos == std::string::npos) {
// print only word in the string
std::cout << s << std::endl;
// end of recursion
return;
} else {
// grab everything after the space
std::string substring = s.substr(pos+1);
// print it
std::cout << substring << std::endl;
// grab everything before the space
std::string rest = s.substr(0, pos);
// recursive call on everything before the space
recursive_reverse(rest);
}
}
int main() {
std::string s("Hello World!");
recursive_reverse(s);
return 0;
}
ideone