Having trouble cleaning out spaces in a string

Let's say I have three strings like these:
"RES-003 :"
"RES 007 :"
" RES-015 :"
I want them to look like this when I display them:
"RES-003:"
"RES 007:"
"RES-015:"
I was trying to solve this, but I can't get it right: when I remove the spaces between the numbers and the colon, I also delete the space inside the second string, so "RES 007 :" becomes "RES007:".
My trim function looks like this:
std::string& Reservation::trim(std::string& s) {
bool valid = true;
s.erase(0, s.find_first_not_of(' '));
s.erase(s.find_last_not_of(' ') + 1);
while (valid)
{
if (s.find(" ") != std::string::npos) {
s.erase(s.find(" "), 1);
valid = true;
if (s.find(" ") != std::string::npos) {
s.erase(s.find(" "), 1);
valid = true;
}
}
else
valid = false;
}
return s;
}
What can I do to improve it, or do I have to replace it completely?

Making the assumption, based on the examples you provided, that the format of your string is:
\s*[A-Z]{3}[\s\-][0-9]{3}\s*:\s*
#include <cctype>
#include <string>
#include <vector>
#include <algorithm>
#include <iostream>
int main()
{
auto trimStr = [](const auto& str) -> std::string
{
auto first = std::find_if(str.begin(), str.end(),
[](unsigned char c){ return !std::isspace(c); });
return std::string(first, first+7) + ":";
};
std::vector<std::string> examples =
{
"RES-003 :",
"RES 007 :",
" RES-015 :"
};
for(const auto& elem : examples)
{
std::cout << trimStr(elem) << "\n";
}
}
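If the fixed seven-character layout assumed above does not hold, a more general sketch is to trim both ends and then drop only the spaces that sit directly before the colon, leaving interior spaces alone. The helper name `tidyLabel` is made up for illustration, and the sketch assumes that only the first colon matters and that plain spaces are the only whitespace involved.

```cpp
#include <string>

// Hypothetical helper: trims leading/trailing spaces and removes the
// spaces immediately before the first ':'; other interior spaces are kept.
std::string tidyLabel(std::string s)
{
    // Trim leading spaces (erases everything if the string is all spaces).
    s.erase(0, s.find_first_not_of(' '));
    // Trim trailing spaces; if nothing is left, npos + 1 wraps to 0.
    s.erase(s.find_last_not_of(' ') + 1);

    const auto colon = s.find(':');
    if (colon != std::string::npos && colon > 0) {
        // Position of the last non-space character before the colon.
        const auto last = s.find_last_not_of(' ', colon - 1);
        const auto from = (last == std::string::npos) ? 0 : last + 1;
        s.erase(from, colon - from);
    }
    return s;
}
```

Calling `tidyLabel("RES 007 :")` keeps the space between "RES" and "007" because only the run of spaces touching the colon is erased.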

Related

Reverse Word Wise using basics of C++

The question goes like this (my code is at the end):
Reverse the given string word wise. That is, the last word in given string should come at 1st place, last second word at 2nd place and so on. Individual words should remain as it is.
Input format :
String in a single line
Output format :
Word wise reversed string in a single line
Constraints :
0 <= |S| <= 10^7
where |S| represents the length of string, S.
Sample Input 1:
Welcome to Coding Ninjas
Sample Output 1:
Ninjas Coding to Welcome
Sample Input 2:
Always indent your code
Sample Output 2:
code your indent Always
This code is in c++:
void reverseStringWordWise(char input[])
{
// Length
int count=0;
for(int i=0; input[i]!='\0'; i++)
{
count++;
}
int len=count;
//reversing the complete string
int i=0;
int j=len-1;
while(i<j)
{
char temp=input[i];
input[i]=input[j];
input[j]=temp;
i++;
j--;
}
//individual reverse
int k=0;
int a,b;
for(;k<len;)
{
for(;input[k]==' ';k++)
{
b=k-1;
break;
}
while(a<b)
{
char temp=input[a];
input[a]=input[b];
input[b]=temp;
}
}
}
Can someone help me with the logic for reversing the individual words? C or C++ works.

I would get rid of the char[]s and use std::string.
Example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <list>
void reverseStringWordWise(std::string input) {
std::list<std::string> words;
for(auto sit = input.begin();;) {
// find a space from `sit` and forward
auto eit = std::find(sit, input.end(), ' ');
// store the word first in the list
words.emplace_front(sit, eit);
if(eit == input.end()) break; // last word, break out
sit = std::next(eit); // start next search after the found space
}
// print result
for(auto& word : words) std::cout << word << ' ';
std::cout << '\n';
}
int main() {
reverseStringWordWise("Hello world");
}
Output
world Hello
If you don't want the trailing space after the last word:
void reverseStringWordWise(std::string inp) {
std::list<std::string> words;
for (auto sit = inp.begin(), eit = sit; eit != inp.end(); sit = eit + 1) {
eit = std::find(sit, inp.end(), ' ');
words.emplace_front(sit, eit);
}
if(auto it = words.begin(); it != words.end()) {
std::cout << *it;
for(++it; it != words.end(); ++it) std::cout << ' ' << *it;
}
std::cout << '\n';
}

You can use the standard library algorithms to shorten the code. If you have start and end iterators, you can use std::reverse; std::strlen gives you the end iterator, and std::find identifies the next word boundary. Assuming every word separator is a single space character, this could result in the following algorithm:
#include <algorithm>
#include <cstring>
#include <iostream>
void reverseStringWordWise(char input[])
{
if (input[0] == '\0')
{
return;
}
auto const end = input + std::strlen(input);
std::reverse(input, end);
auto wordEnd = input;
while(true)
{
auto wordStart = wordEnd;
wordEnd = std::find(wordStart, end, ' ');
std::reverse(wordStart, wordEnd);
if (wordEnd == end)
{
break;
}
++wordEnd;
}
}
int main() {
char input1[] = "Welcome to Coding Ninjas";
char input2[] = "Always indent your code";
reverseStringWordWise(input1);
reverseStringWordWise(input2);
std::cout << input1 << '\n'
<< input2 << '\n';
}

Here is another solution, using std::stack:
#include <stack>
#include <string>
#include <sstream>
#include <iostream>
void reverseStringWordWise(std::string input)
{
std::stack<std::string> wordStack;
std::istringstream strm(input);
std::string word;
// push each word on the stack
while (strm >> word)
wordStack.push(word);
// pop stack for each word
while (!wordStack.empty())
{
std::cout << wordStack.top() << ' ';
wordStack.pop();
}
}
int main()
{
reverseStringWordWise("Welcome to Coding Ninjas");
}
Output:
Ninjas Coding to Welcome

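With Boost, it's easier:
#include <algorithm>
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>
#include <boost/algorithm/string/join.hpp>
std::string reverse_words(const std::string& s)
{
boost::char_separator<char> const sep(" ");
boost::tokenizer<boost::char_separator<char>> const tokens(s, sep);
// std::reverse returns void and needs bidirectional iterators,
// so copy the tokens into a vector before reversing
std::vector<std::string> words(tokens.begin(), tokens.end());
std::reverse(words.begin(), words.end());
return boost::algorithm::join(words, " ");
}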
In C++, you are expected to use algorithms and not bother with rewriting everything from scratch (unless you are writing a library such as boost). Especially, as others mentioned, reading/writing to arrays directly is most often going to end up with errors (your code is missing several boundary checks).

Here is a modern C++ version. The algorithm is to reverse each word and then reverse the whole string. The code takes the string by value so it has a copy of the string and then modifies the string in-place and returns it. It uses no extra memory.
#include <iostream>
#include <string>
#include <algorithm>
std::string word_reverse(std::string s) {
auto it = s.begin();
while(it != s.end()) {
auto it2 = std::find(it, s.end(), ' ');
std::reverse(it, it2);
it = it2 + (it2 != s.end());
}
std::reverse(s.begin(), s.end());
return s;
}
int main() {
std::string s = "Always indent your code";
std::string t = word_reverse(s);
std::cout << s << std::endl;
std::cout << t << std::endl;
}

Use a regex:
#include <algorithm>
#include <regex>
#include <iterator>
#include <iostream>
#include <string>
using it = std::regex_iterator<std::string::const_reverse_iterator>;
int main() {
const std::string s = "Always indent your code.";
std::regex regex("[\\w]+");
for (auto i = it(s.rbegin(), s.rend(), regex); i != it(); ++i) {
auto w = i->str();
std::copy(std::rbegin(w), std::rend(w), std::ostream_iterator<char>(std::cout));
std::cout << ' ';
}
}

Why is my string extraction function using back referencing in regex not working as intended?

Extraction Function
string extractStr(string str, string regExpStr) {
regex regexp(regExpStr);
smatch m;
regex_search(str, m, regexp);
string result = "";
for (string x : m)
result = result + x;
return result;
}
The Main Code
#include <iostream>
#include <regex>
using namespace std;
string extractStr(string, string);
int main(void) {
string test = "(1+1)*(n+n)";
cout << extractStr(test, "n\\+n") << endl;
cout << extractStr(test, "(\\d)\\+\\1") << endl;
cout << extractStr(test, "([a-zA-Z])[+-/*]\\1") << endl;
cout << extractStr(test, "([a-zA-Z])[+-/*]([a-zA-Z])") << endl;
return 0;
}
The Output
String = (1+1)*(n+n)
n\+n = n+n
(\d)\+\1 = 1+11
([a-zA-Z])[+-/*]\1 = n+nn
([a-zA-Z])[+-/*]([a-zA-Z]) = n+nnn
If anyone could kindly point out the error I've made, or point me to a similar question on SO that I missed while searching, it would be greatly appreciated.

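std::regex_search finds only the first match, and the resulting std::smatch holds the whole match in m[0] followed by one entry per capture group. Your loop concatenates every entry, so the captured groups are appended to the match (which is why "1+1" becomes "1+11"). Use m[0] alone, and loop if you want every match. Also note that inside a character class, +-/ is a range (from '+' to '/'), so [+-/*] also accepts ',' and '.'. I have also included some C++ tips here (constness and references).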
#include <cassert>
#include <iostream>
#include <sstream>
#include <regex>
#include <string>
// using namespace std; don't do this!
// https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice
// pass strings by const reference
// 1. const, you promise not to change them in this function
// 2. by reference, you avoid making copies
std::string extractStr(const std::string& str, const std::string& regExpStr)
{
std::regex regexp(regExpStr);
std::smatch m;
std::ostringstream os; // streams are more efficient for building up strings
auto begin = str.cbegin();
bool comma = false;
// regex_search finds one match at a time, so loop to collect them all
while (std::regex_search(begin, str.end(), m, regexp))
{
if (comma) os << ", ";
os << m[0];
comma = true;
begin = m.suffix().first;
}
return os.str();
}
// small helper function to produce nicer output for your tests.
void test(const std::string& input, const std::string& regex, const std::string& expected)
{
auto output = extractStr(input, regex);
if (output == expected)
{
std::cout << "test succeeded : output = " << output << "\n";
}
else
{
std::cout << "test failed : output = " << output << ", expected : " << expected << "\n";
}
}
int main(void)
{
std::string input = "(1+1)*(n+n)";
test(input, "n\\+n", "n+n");
test(input, "(\\d)\\+\\1", "1+1");
test(input, "([a-zA-Z])[+-/*]\\1", "n+n");
return 0;
}
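To see why the original loop printed `1+11`, it may help to look at what a `std::smatch` actually holds after a successful search. The helper name `describeMatch` below is hypothetical and exists only for this sketch; it returns the elements of the match result for the first match, in order.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every element of the match result for the
// first match of `pattern` in `text` (empty vector if there is no match).
std::vector<std::string> describeMatch(const std::string& text,
                                       const std::string& pattern)
{
    const std::regex re(pattern);
    std::smatch m;
    std::vector<std::string> parts;
    if (std::regex_search(text, m, re)) {
        // m[0] is the full match; m[1], m[2], ... are the capture groups.
        for (const auto& sub : m)
            parts.push_back(sub.str());
    }
    return parts;
}
```

For the input `(1+1)*(n+n)` and the pattern `(\d)\+\1`, the result has two elements: the full match `1+1` and the captured group `1`. Concatenating both gives the `1+11` seen in the question.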

i.m trying to split string by whitespace using c++, where the data from database [duplicate]

What would be easiest method to split a string using c++11?
I've seen the method used by this post, but I feel that there ought to be a less verbose way of doing it using the new standard.
Edit: I would like to have a vector<string> as a result and be able to delimit on a single character.

std::regex_token_iterator performs generic tokenization based on a regex. It may or may not be overkill for doing simple splitting on a single character, but it works and is not too verbose:
std::vector<std::string> split(const std::string& input, const std::string& regex) {
// passing -1 as the submatch index parameter performs splitting
std::regex re(regex);
std::sregex_token_iterator
first{input.begin(), input.end(), re, -1},
last;
return {first, last};
}

Here is a (maybe less verbose) way to split string (based on the post you mentioned).
#include <string>
#include <sstream>
#include <vector>
std::vector<std::string> split(const std::string &s, char delim) {
std::stringstream ss(s);
std::string item;
std::vector<std::string> elems;
while (std::getline(ss, item, delim)) {
elems.push_back(item);
// elems.push_back(std::move(item)); // if C++11 (based on comment from @mchiasson)
}
return elems;
}
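One property of the std::getline approach is worth knowing before relying on it: an empty token between two adjacent delimiters is kept, but no empty token is produced after a trailing delimiter. The sketch below repeats the same loop under the name `splitOnChar`, a name chosen here only to avoid clashing with the other `split` functions on this page.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Same technique as above, under a hypothetical name for illustration.
std::vector<std::string> splitOnChar(const std::string& s, char delim)
{
    std::istringstream ss(s);
    std::string item;
    std::vector<std::string> elems;
    // getline fails once no characters remain, which ends the loop.
    while (std::getline(ss, item, delim))
        elems.push_back(item);
    return elems;
}
```

With this sketch, `splitOnChar("a,,b,", ',')` yields three elements ("a", an empty string, and "b"), and an empty input yields an empty vector. If a trailing empty token matters to you, one of the find-based versions may be a better fit.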

Here's an example of splitting a string and populating a vector with the extracted elements using boost.
#include <boost/algorithm/string.hpp>
std::string my_input("A,B,EE");
std::vector<std::string> results;
boost::algorithm::split(results, my_input, boost::is_any_of(","));
assert(results[0] == "A");
assert(results[1] == "B");
assert(results[2] == "EE");

Another regex solution inspired by other answers but hopefully shorter and easier to read:
std::string s{"String to split here, and here, and here,..."};
std::regex regex{R"([\s,]+)"}; // split on space and comma
std::sregex_token_iterator it{s.begin(), s.end(), regex, -1};
std::vector<std::string> words{it, {}};

I don't know if this is less verbose, but it might be easier to grok for those more seasoned in dynamic languages such as JavaScript. The only C++11 features it uses are auto and the range-based for loop.
#include <string>
#include <cctype>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
string s = "hello how are you won't you tell me your name";
vector<string> tokens;
string token;
for (const auto& c: s) {
if (!isspace(static_cast<unsigned char>(c))) // cast avoids undefined behaviour for negative char values
token += c;
else {
if (token.length()) tokens.push_back(token);
token.clear();
}
}
if (token.length()) tokens.push_back(token);
return 0;
}

#include <cctype>
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
using namespace std;
vector<string> split(const string& str, int delimiter(int) = ::isspace){
vector<string> result;
auto e=str.end();
auto i=str.begin();
while(i!=e){
i=find_if_not(i,e, delimiter);
if(i==e) break;
auto j=find_if(i,e, delimiter);
result.push_back(string(i,j));
i=j;
}
return result;
}
int main(){
string line;
getline(cin,line);
vector<string> result = split(line);
for(auto s: result){
cout<<s<<endl;
}
}

My choice is boost::tokenizer, though I have not used it for heavy tasks or tested it with huge data.
Example from the Boost documentation, modified to use a lambda:
#include <algorithm>
#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>
#include <vector>
int main()
{
using namespace std;
using namespace boost;
string s = "This is, a test";
vector<string> v;
tokenizer<> tok(s);
for_each (tok.begin(), tok.end(), [&v](const string & s) { v.push_back(s); } );
// result 4 items: 1)This 2)is 3)a 4)test
return 0;
}

This is my answer. Verbose, readable and efficient.
std::vector<std::string> tokenize(const std::string& s, char c) {
auto end = s.cend();
auto start = end;
std::vector<std::string> v;
for( auto it = s.cbegin(); it != end; ++it ) {
if( *it != c ) {
if( start == end )
start = it;
continue;
}
if( start != end ) {
v.emplace_back(start, it);
start = end;
}
}
if( start != end )
v.emplace_back(start, end);
return v;
}

#include <string>
#include <vector>
#include <sstream>
inline std::vector<std::string> split(const std::string& s) {
std::vector<std::string> result;
std::istringstream iss(s);
for (std::string w; iss >> w; )
result.push_back(w);
return result;
}

Here is a C++11 solution that uses only std::string::find(). The delimiter can be any number of characters long. Parsed tokens are output via an output iterator, which is typically a std::back_inserter in my code.
I have not tested this with UTF-8, but I expect it should work as long as the input and delimiter are both valid UTF-8 strings.
#include <string>
template<class Iter>
Iter splitStrings(const std::string &s, const std::string &delim, Iter out)
{
if (delim.empty()) {
*out++ = s;
return out;
}
size_t a = 0, b = s.find(delim);
for ( ; b != std::string::npos;
a = b + delim.length(), b = s.find(delim, a))
{
*out++ = std::move(s.substr(a, b - a));
}
*out++ = std::move(s.substr(a, s.length() - a));
return out;
}
Some test cases:
void test()
{
std::vector<std::string> out;
size_t counter;
std::cout << "Empty input:" << std::endl;
out.clear();
splitStrings("", ",", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, empty delimiter:" << std::endl;
out.clear();
splitStrings("Hello, world!", "", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", no delimiter in string:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xyz", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string:" << std::endl;
out.clear();
splitStrings("abxycdxy!!xydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", input contains blank token:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxya", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", delimiter exists string"
", nothing after last delimiter:" << std::endl;
out.clear();
splitStrings("abxycdxyxydefxy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
std::cout << "Non-empty input, non-empty delimiter"
", only delimiter exists string:" << std::endl;
out.clear();
splitStrings("xy", "xy", std::back_inserter(out));
counter = 0;
for (auto i = out.begin(); i != out.end(); ++i, ++counter) {
std::cout << counter << ": " << *i << std::endl;
}
}
Expected output:
Empty input:
0:
Non-empty input, empty delimiter:
0: Hello, world!
Non-empty input, non-empty delimiter, no delimiter in string:
0: abxycdxyxydefxya
Non-empty input, non-empty delimiter, delimiter exists string:
0: ab
1: cd
2: !!
3: def
4: a
Non-empty input, non-empty delimiter, delimiter exists string, input contains blank token:
0: ab
1: cd
2:
3: def
4: a
Non-empty input, non-empty delimiter, delimiter exists string, nothing after last delimiter:
0: ab
1: cd
2:
3: def
4:
Non-empty input, non-empty delimiter, only delimiter exists string:
0:
1:

One possible way of doing this is finding all occurrences of the split string and storing locations to a list. Then count input string characters and when you get to a position where there is a 'search hit' in the position list then you jump forward by 'length of the split string'. This approach takes a split string of any length. Here is my tested and working solution.
#include <iostream>
#include <string>
#include <list>
#include <vector>
using namespace std;
vector<string> Split(string input_string, string search_string)
{
list<int> search_hit_list;
vector<string> word_list;
size_t search_position, search_start = 0;
// Find the start position of every substring occurrence and store the positions in a hit list.
while ( (search_position = input_string.find(search_string, search_start) ) != string::npos) {
search_hit_list.push_back(search_position);
search_start = search_position + search_string.size();
}
// Iterate through hit list and reconstruct substring start and length positions
int character_counter = 0;
int start, length;
for (auto hit_position : search_hit_list) {
// Skip over substrings we are splitting with. This also skips over repeating substrings.
if (character_counter == hit_position) {
character_counter = character_counter + search_string.size();
continue;
}
start = character_counter;
character_counter = hit_position;
length = character_counter - start;
word_list.push_back(input_string.substr(start, length));
character_counter = character_counter + search_string.size();
}
// If the search string is not found in the input string, then return the whole input_string.
if (word_list.size() == 0) {
word_list.push_back(input_string);
return word_list;
}
// The last substring might still be unprocessed; get it.
if (character_counter < input_string.size()) {
word_list.push_back(input_string.substr(character_counter, input_string.size() - character_counter));
}
return word_list;
}
int main() {
vector<string> word_list;
string search_string = " ";
// search_string = "the";
string text = "thetheThis is some text to test with the split-thethe function.";
word_list = Split(text, search_string);
for (auto item : word_list) {
cout << "'" << item << "'" << endl;
}
cout << endl;
}

Print out the words of a line in reverse order through recursion

I'm trying not to use any storage containers. I don't know if it's even possible. Here is what I have so far. (I'm getting a segmentation fault).
#include <iostream>
#include <string>
using namespace std;
void foo(string s)
{
size_t pos;
pos = s.find(' ');
if(pos == string::npos)
return;
foo(s.erase(0, pos));
cout << s.substr(0, pos) << " ";
}
int main()
{
foo("hello world");
return 0;
}
I know there's probably many things wrong with this code. So rip away. I'm eager to learn. I'm trying to imitate a post order print as you would do in a reverse print of a singly linked list. Thanks.
EDIT:
An example:
"You are amazing" becomes "amazing are You"

The segfault is a stack overflow.
foo( "hello world" ) erases everything up to the first space, leaving " world", and recurses.
foo( " world" ) finds the space at position 0, erases nothing, and recurses with " world" again.
foo( " world" )... you get the idea.
Also, once you called foo( s.erase( 0, pos ) ), trying to print s.substr( 0, pos ) after the recursion returns does not make sense. You need to save the substring somewhere before you erase it, so you still have it to print afterwards.
void foo(string s)
{
size_t pos = s.find(' '); // declare-and-use in one line
string out = s.substr( 0, pos ); // saving the substring
if ( pos != string::npos )
{
foo( s.erase( 0, pos + 1 ) ); // recurse, skipping the space...
cout << " "; // ...but *print* the space
}
cout << out; // print the saved substring
}

The problem is that your recursion continues until you run out of memory.
Pay attention to this line:
if(pos == string::npos)
When you erase the substring you don't erase the whitespace, so in the next recursion s.find returns pos = 0, which means that your recursion never ends.
Here is code that works. Also note that I added a level variable to be able to control the behaviour on the first level (in this case, adding an endl).
#include <iostream>
#include <string>
using namespace std;
void foo(string s, int l)
{
size_t pos;
pos = s.find(' ');
if(pos == string::npos){
cout << s << " ";
return;
}
string temp = s.substr(0, pos);
foo(s.erase(0, pos+1),l+1);
cout << temp << " ";
if(l == 0)
cout << endl;
}
int main()
{
foo("hello world", 0);
return 0;
}

An approach to recursion, which may allow your compiler to transform automatically to iteration, is to accumulate the result in the function arguments. This will be familiar if you've written recursive functions in any of the Lisp family of languages:
#include <iostream>
#include <string>
std::string reverse_words(const std::string& s, const std::string& o = {})
{
using std::string;
const auto npos = string::npos;
static const string whitespace(" \n\r\t");
// find start and end of the first whitespace block
auto start = s.find_first_of(whitespace);
if (start == npos)
return s + o;
auto end = s.find_first_not_of(whitespace, start);
if (end == npos)
return s + o;
auto word = s.substr(0, start);
auto space = s.substr(start, end-start);
auto rest = s.substr(end);
return reverse_words(rest, space + word + o);
}
int main()
{
std::cout << reverse_words("hello to all the world") << std::endl;
std::cout << reverse_words(" a more difficult\n testcase ") << std::endl;
return 0;
}

I tried to make a brief example using standard algorithms. It also handles more kinds of whitespace than just the plain space character (tabs, for instance).
#include <cctype>
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
void print_reverse(string words) {
// Termination condition
if(words.empty())
return;
auto predicate = (int(*)(int))isspace;
auto sit = begin(words);
auto wit = find_if_not(sit, end(words), predicate);
auto nit = find_if (wit, end(words), predicate);
print_reverse(string(nit, end(words)));
// word spaces
cout << string(wit, nit) << string(sit, wit);
}
int main() {
string line;
getline(cin, line);
print_reverse(line);
cout << endl;
}
Here is an example run:
$ ./print-out-the-words-of-a-line-in-reverse-order-through-recursion
You are amazing
amazing are You

The key is in adding 1 to pos in the erase statement.
So try:
#include <iostream>
#include <string>
using namespace std;
void foo(string s)
{
size_t pos;
pos = s.find(' ');
if(pos == string::npos)
{
cout << s << " ";
return;
}
string out = s.substr(0, pos);
foo(s.erase(0, pos+1));
cout << out << " ";
}
int main()
{
foo("hello world");
cout << endl;
return 0;
}
EDIT
Alternatively, you could use a char* instead of a std::string; then you do not need a temporary variable.
#include <iostream>
#include <cstring>
void foo(char* s)
{
char* next = std::strchr(s, ' ');
if(next != nullptr)
{
foo(next + 1);
*next = 0;
}
std::cout << s << " ";
}
int main()
{
char s[] = "You are amazing";
foo(s);
std::cout << std::endl;
}

The problem is that you're not doing anything with the last word and you're not doing anything with the remaining chunk.
If you have a recursive reverse printer, you'll want something like this (pseudocode):
def recursive-reverse(string) {
pos = string.find-last(" ");
if pos doesn't exist {
print string;
return;
} else {
print string.create-substring(pos+1, string.end);
recursive-reverse(string.create-substring(0, pos));
}
}
To implement this in C++:
#include <iostream>
#include <string>
void recursive_reverse(std::string &s) {
// find the last space
size_t pos = s.find_last_of(" ");
// base case - there's no space
if(pos == std::string::npos) {
// print only word in the string
std::cout << s << std::endl;
// end of recursion
return;
} else {
// grab everything after the space
std::string substring = s.substr(pos+1);
// print it
std::cout << substring << std::endl;
// grab everything before the space
std::string rest = s.substr(0, pos);
// recursive call on everything before the space
recursive_reverse(rest);
}
}
int main() {
std::string s("Hello World!");
recursive_reverse(s);
return 0;
}