C++ separate string by selected commas - c++

I was reading the following question Parsing a comma-delimited std::string on how to split a string by a comma (Someone gave me the link from my previous question) and one of the answers was:
stringstream ss( "1,1,1,1, or something else ,1,1,1,0" );
vector<string> result;
while( ss.good() )
{
string substr;
getline( ss, substr, ',' );
result.push_back( substr );
}
But what if my string was like the following, and I wanted to separate values only by the bold commas and ignoring what appears inside <>?
<a,b>,<c,d>,,<d,l>,
I want to get:
<a,b>
<c,d>
"" //Empty string
<d,l>
""
Given:<a,b>,,<c,d> It should return: <a,b> and "" and <c,d>
Given:<a,b>,<c,d> It should return:<a,b> and <c,d>
Given:<a,b>, It should return:<a,b> and ""
Given:<a,b>,,,<c,d> It should return:<a,b> and "" and "" and <c,d>
In other words, my program should behave just like the given solution above separated by , (Supposing there is no other , except the bold ones)
Here are some suggested solution and their problems:
Delete all bold commas: This will result in treating the following 2 inputs the same way while they shouldn't
<a,b>,<c,d>
<a,b>,,<c,d>
Replace all bold commas with some char and use the above algorithm: I can't select some char to replace the commas with since any value could appear in the rest of my string

Adding to #Carlos' answer, apart from regex (take a look at my comment); you can implement the substitution like the following (Here, I actually build a new string):
#include <algorithm>
#include <iostream>
#include <string>
int main() {
std::string str;
getline(std::cin,str);
std::string str_builder;
for (auto it = str.begin(); it != str.end(); it++) {
static bool flag = false;
if (*it == '<') {
flag = true;
}
else if (*it == '>') {
flag = false;
str_builder += *it;
}
if (flag) {
str_builder += *it;
}
}
}

Why not replace one set of commas with some known-to-not-clash character, then split it by the other commas, then reverse the replacement?
So replace the commas that are inside the <> with something, do the string split, replace again.

I think what you want is something like this:
vector<string> result;
string s = "<a,b>,,<c,d>"
int in_string = 0;
int latest_comma = 0;
for (int i = 0; i < s.size(); i++) {
if(s[i] == '<'){
result.push_back(s[i]);
in_string = 1;
latest_comma = 0;
}
else if(s[i] == '>'){
result.push_back(s[i]);
in_string = 0;
}
else if(!in_string && s[i] == ','){
if(latest_comma == 1)
result.push_back('\n');
else
latest_comma = 1;
}
else
result.push_back(s[i]);
}

Here is a possible code that scans a string one char at a time and splits it on commas (',') unless they are masked between brackets ('<' and '>').
Algo:
assume starting outside brackets
loop for each character:
if not a comma, or if inside brackets
store the character in the current item
if a < bracket: note that we are inside brackets
if a > bracket: note that we are outside brackets
else (an unmasked comma)
store the current item as a string into the resulting vector
clear the current item
store the last item into the resulting vector
Only 10 lines and my rubber duck agreed that it should work...
C++ implementation: I will use a vector to handle the current item because it is easier to build it one character at a time
std::vector<std::string> parse(const std::string& str) {
std::vector<std::string> result;
bool masked = false;
std::vector<char> current; // stores chars of the current item
for (const char c : str) {
if (masked || (c != ',')) {
current.push_back(c);
switch (c) {
case '<': masked = true; break;
case '>': masked = false;
}
}
else { // unmasked comma: store item and prepare next
current.push_back('\0'); // a terminating null for the vector data
result.push_back(std::string(&current[0]));
current.clear();
}
}
// do not forget the last item...
current.push_back('\0');
result.push_back(std::string(&current[0]));
return result;
}
I tested it with all your example strings and it gives the expected results.

Seems quite straight forward to me.
vector<string> customSplit(string s)
{
vector<string> results;
int level = 0;
std::stringstream ss;
for (char c : s)
{
switch (c)
{
case ',':
if (level == 0)
{
results.push_back(ss.str());
stringstream temp;
ss.swap(temp); // Clear ss for the new string.
}
else
{
ss << c;
}
break;
case '<':
level += 2;
case '>':
level -= 1;
default:
ss << c;
}
}
results.push_back(ss.str());
return results;
}

Related

Word counter returning incorrect number of words

I've been trying to create a program that reads text from a file and stores it in a string. I feed the string to a function that counts every word in the string.
However its only accurate assuming the user leaves some whitespace at the end of a line and doesn't creates blank lines.... not a very good word counter.
Creating a blank line results in a false increment to the word count.
I'm not sure if my main problem is using a boolean to do this or checking for whitespace and '\n' characters.
bool countingLetters = false;
int wordCount = 0;
for (int i = 0; i < text.length(); i++)
{
if (text[i] == ' ' && countingLetters == true)
{
countingLetters = false;
wordCount++;
}
if (text[i] != ' ' && countingLetters == false)
{
countingLetters = true;
}
if (text[i] == '\n' && countingLetters == true)
{
countingLetters = false;
wordCount++;
}
}
Your code is basically a state machine. To complete your solution, just count in the string ending.
Add this to the end of your code:
if(countingLetters) { // word at the end of string, without any space charactor
wordCount++;
}
Or if you can be sure it's C-style string, like std::string, you can just index 1 pass the last charactor, and handle '\0'in same way of space and '\n' .
To improve your code, use isspace (and this covers more space charactor, including '\t', etc.). And better to use else if pattern. Also, it's not good pratice to ==true. Just use boolean as condition.
Or maybe, isalpha(c) fits more to your need.
bool countingLetters = false;
int wordCount = 0;
for (char c:text) {
if (!isalpha(c) && countingLetters) { // this also works for newline
countingLetters = false;
++wordCount;
} else if (isalpha(c) && !countingLetters) {
countingLetters = true;
} // otherwise just skip
}
if(countingLetters) { // word at the end of string, without any space charactor
++wordCount;
}
And it's not acceptable to insert extra charactor just for such a simple task. For example, text may be const.
An alternative is to count the beginning of a "word".
Let us say the beginning of a word is a letter after a non-letter. We can adjust this if desired.
int wordCount = 0;
int prior = '\n'; // some non-letter
for (int i = 0; i < text.length(); i++) {
if (isalpha(text[i]) && !isalpha(prior)) {
wordCount++;
}
prior = text[i];
}
C++ also provides some very high-level ways to do this.
One is by using a loop over a stringstream, which splits text on whitespace:
#include <sstream>
#include <string>
std::size_t count_words( const std::string& s )
{
std::size_t count = 0;
std::istringstream ss( s );
std::string t;
while (ss >> t) count += 1;
return count;
}
Another is using a stream iterator algorithm:
#include <iterator>
#include <sstream>
#include <string>
std::size_t count_words( const std::string& s )
{
std::istringstream ss( s );
return std::distance(
std::istream_iterator <std::string> ( ss ),
std::istream_iterator <std::string> ()
);
}
Yet another is using a regular expression:
#include <iterator>
#include <regex>
#include <string>
std::size_t count_words( const std::string& s )
{
std::regex re( "\\w+" );
return std::distance(
std::sregex_iterator( s.begin(), s.end(), re ),
std::sregex_iterator()
);
}
I’m sure there are many more, but those three are the ones that come off the top of my head.

How do I parse comma-delimited string in C++ with some elements being quoted with commas?

I have a comma-delimited string that I want to store in a string vector. The string and vectors are:
string s = "1, 10, 'abc', 'test, 1'";
vector<string> v;
Ideally I want the strings 'abc' and 'test, 1' to be stored without the single quotes as below, but I can live with storing them with single quotes:
v[0] = "1";
v[1] = "10";
v[2] = "abc";
v[3] = "test, 1";
bool nextToken(const string &s, string::size_type &start, string &token)
{
token.clear();
start = s.find_first_not_of(" \t", start);
if (start == string::npos)
return false;
string::size_type end;
if (s[start] == '\'')
{
++start;
end = s.find('\'', start);
}
else
end = s.find_first_of(" \t,", start);
if (end == string::npos)
{
token = s.substr(start);
start = s.size();
}
else
{
token = s.substr(start, end-start);
if ((s[end] != ',') && ((end = s.find(',', end + 1)) == string::npos))
start = s.size();
else
start = end + 1;
}
return true;
}
string s = "1, 10, 'abc', 'test, 1'", token;
vector<string> v;
string::size_type start = 0;
while (nextToken(s, start, token))
v.push_back(token);
Demo
What you need to do here, is make yourself a parser that parses as you want it to. Here I have made a parsing function for you:
#include <string>
#include <vector>
using namespace std;
vector<string> parse_string(string master) {
char temp; //the current character
bool encountered = false; //for checking if there is a single quote
string curr_parse; //the current string
vector<string>result; //the return vector
for (int i = 0; i < master.size(); ++i) { //while still in the string
temp = master[i]; //current character
switch (temp) { //switch depending on the character
case '\'': //if the character is a single quote
if (encountered) encountered = false; //if we already found a single quote, reset encountered
else encountered = true; //if we haven't found a single quote, set encountered to true
[[fallthrough]];
case ',': //if it is a comma
if (!encountered) { //if we have not found a single quote
result.push_back(curr_parse); //put our current string into our vector
curr_parse = ""; //reset the current string
break; //go to next character
}//if we did find a single quote, go to the default, and push_back the comma
[[fallthrough]];
default: //if it is a normal character
if (encountered && isspace(temp)) curr_parse.push_back(temp); //if we have found a single quote put the whitespace, we don't care
else if (isspace(temp)) break; //if we haven't found a single quote, trash the whitespace and go to the next character
else if (temp == '\'') break; //if the current character is a single quote, trash it and go to the next character.
else curr_parse.push_back(temp); //if all of the above failed, put the character into the current string
break; //go to the next character
}
}
for (int i = 0; i < result.size(); ++i) {
if (result[i] == "") result.erase(result.begin() + i);
//check that there are no empty strings in the vector
//if there are, delete them
}
return result;
}
This parses your string as you want it to, and returns a vector. Then, you can use it in your program:
#include <iostream>
int main() {
string s = "1, 10, 'abc', 'test, 1'";
vector<string> v = parse_string(s);
for (int i = 0; i < v.size(); ++i) {
cout << v[i] << endl;
}
}
and it properly prints out:
1
10
abc
test, 1
A proper solution would require a parser implementation. If you need a quick hack, just write a cell reading function (demo). The c++14's std::quoted manipulator is of great help here. The only problem is the manipulator requires a stream. This is easily solved with istringstream - see the second function. Note that the format of your string is CELL COMMA CELL COMMA... CELL.
istream& get_cell(istream& is, string& s)
{
char c;
is >> c; // skips ws
is.unget(); // puts back in the stream the last read character
if (c == '\'')
return is >> quoted(s, '\'', '\\'); // the first character of the cell is ' - read quoted
else
return getline(is, s, ','), is.unget(); // read unqoted, but put back comma - we need it later, in get function
}
vector<string> get(const string& s)
{
istringstream iss{ s };
string cell;
vector<string> r;
while (get_cell(iss, cell))
{
r.push_back( cell );
char comma;
iss >> comma; // expect a cell separator
if (comma != ',')
break; // cell separator not found; we are at the end of stream/string - break the loop
}
if (char c; iss >> c) // we reached the end of what we understand - probe the end of stream
throw "ill formed";
return r;
}
And this is how you use it:
int main()
{
string s = "1, 10, 'abc', 'test, 1'";
try
{
auto v = get(s);
}
catch (const char* e)
{
cout << e;
}
}

Split string path with space

I am writing a program that should receive 3 parameters by User: file_upload "local_path" "remote_path"
code example:
std::vector split(std::string str, char delimiter) {
std::vector<string> v;
std::stringstream src(str);
std::string buf;
while(getline(src, buf, delimiter)) {
v.push_back(buf);
}
return v;
}
void function() {
std::string input
getline(std::cin, input);
// user input like this: file_upload /home/Space Dir/file c:\dir\file
std::vector<std::string> v_input = split(input, ' ');
// the code will do something like this
if(v_input[0].compare("file_upload") == 0) {
FILE *file;
file = fopen(v_input[1].c_str(), "rb");
send_upload_dir(v_input[2].c_str());
// bla bla bla
}
}
My question is: the second and third parameter are directories, then they can contain spaces in name. How can i make the split function does not change the spaces of the second and third parameter?
I thought to put quotes in directories and make a function to recognize, but not work 100% because the program has other functions that take only 2 parameters not three. can anyone help?
EDIT: /home/user/Space Dir/file.out <-- path with space name.
If this happens the vector size is greater than expected, and the path to the directory will be broken.. this can not happen..
the vector will contain something like this:
vector[1] = /home/user/Space
vector[2] = Dir/file.out
and what I want is this:
vector[1] = /home/user/Space Dir/file.out
Since you need to accept three values from a single string input, this is a problem of encoding.
Encoding is sometimes done by imposing fixed-width requirements on some or all fields, but that's clearly not appropriate here, since we need to support variable-width file system paths, and the first value (which appears to be some kind of mode specifier) may be variable-width as well. So that's out.
This leaves 4 possible solutions for variable-width encoding:
1: Unambiguous delimiter.
If you can select a separator character that is guaranteed never to show up in the delimited values, then you can split on that. For example, if NUL is guaranteed never to be part of the mode value or the path values, then we can do this:
std::vector<std::string> v_input = split(input,'\0');
Or maybe the pipe character:
std::vector<std::string> v_input = split(input,'|');
Hence the input would have to be given like this (for the pipe character):
file_upload|/home/user/Space Dir/file.out|/home/user/Other Dir/blah
2: Escaping.
You can write the code to iterate through the input line and properly split it on unescaped instances of the separator character. Escaped instances will not be considered separators. You can parameterize the escape character. For example:
std::vector<std::string> escapedSplit(std::string str, char delimiter, char escaper ) {
std::vector<std::string> res;
std::string cur;
for (size_t i = 0; i < str.size(); ++i) {
if (str[i] == delimiter) {
res.push_back(cur);
cur.clear();
} else if (str[i] == escaper) {
++i;
if (i == str.size()) break;
cur.push_back(str[i]);
} else {
cur.push_back(str[i]);
} // end if
} // end for
if (!cur.empty()) res.push_back(cur);
return res;
} // end escapedSplit()
std::vector<std::string> v_input = escapedSplit(input,' ','\\');
With input as:
file_upload /home/user/Space\ Dir/file.out /home/user/Other\ Dir/blah
3: Quoting.
You can write the code to iterate through the input line and properly split it on unquoted instances of the separator character. Quoted instances will not be considered separators. You can parameterize the quote character.
A complication of this approach is that it is not possible to include the quote character itself inside a quoted extent unless you introduce an escaping mechanism, similar to solution #2. A common strategy is to allow repetition of the quote character to escape it. For example:
std::vector<std::string> quotedSplit(std::string str, char delimiter, char quoter ) {
std::vector<std::string> res;
std::string cur;
for (size_t i = 0; i < str.size(); ++i) {
if (str[i] == delimiter) {
res.push_back(cur);
cur.clear();
} else if (str[i] == quoter) {
++i;
for (; i < str.size(); ++i) {
if (str[i] == quoter) {
if (i+1 == str.size() || str[i+1] != quoter) break;
++i;
cur.push_back(quoter);
} else {
cur.push_back(str[i]);
} // end if
} // end for
} else {
cur.push_back(str[i]);
} // end if
} // end for
if (!cur.empty()) res.push_back(cur);
return res;
} // end quotedSplit()
std::vector<std::string> v_input = quotedSplit(input,' ','"');
With input as:
file_upload "/home/user/Space Dir/file.out" "/home/user/Other Dir/blah"
Or even just:
file_upload /home/user/Space" "Dir/file.out /home/user/Other" "Dir/blah
4: Length-value.
Finally, you can write the code to take a length before each value, and only grab that many characters. We could require a fixed-width length specifier, or skip a delimiting character following the length specifier. For example (note: light on error checking):
std::vector<std::string> lengthedSplit(std::string str) {
std::vector<std::string> res;
size_t i = 0;
while (i < str.size()) {
size_t len = std::atoi(str.c_str());
if (len == 0) break;
i += (size_t)std::log10(len)+2; // +1 to get base-10 digit count, +1 to skip delim
res.push_back(str.substr(i,len));
i += len;
} // end while
return res;
} // end lengthedSplit()
std::vector<std::string> v_input = lengthedSplit(input);
With input as:
11:file_upload29:/home/user/Space Dir/file.out25:/home/user/Other Dir/blah
I had similar problem few days ago and solve it like this:
First I've created a copy, Then replace the quoted strings in the copy with some padding to avoid white spaces, finally I split the original string according to the white space indexes from the copy.
Here is my full solution:
you may want to also remove the double quotes, trim the original string and so on:
#include <sstream>
#include<iostream>
#include<vector>
#include<string>
using namespace std;
string padString(size_t len, char pad)
{
ostringstream ostr;
ostr.fill(pad);
ostr.width(len);
ostr<<"";
return ostr.str();
}
void splitArgs(const string& s, vector<string>& result)
{
size_t pos1=0,pos2=0,len;
string res = s;
pos1 = res.find_first_of("\"");
while(pos1 != string::npos && pos2 != string::npos){
pos2 = res.find_first_of("\"",pos1+1);
if(pos2 != string::npos ){
len = pos2-pos1+1;
res.replace(pos1,len,padString(len,'X'));
pos1 = res.find_first_of("\"");
}
}
pos1=res.find_first_not_of(" \t\r\n",0);
while(pos1 < s.length() && pos2 < s.length()){
pos2 = res.find_first_of(" \t\r\n",pos1+1);
if(pos2 == string::npos ){
pos2 = res.length();
}
len = pos2-pos1;
result.push_back(s.substr(pos1,len));
pos1 = res.find_first_not_of(" \t\r\n",pos2+1);
}
}
int main()
{
string s = "234 \"5678 91\" 8989";
vector<string> args;
splitArgs(s,args);
cout<<"original string:"<<s<<endl;
for(size_t i=0;i<args.size();i++)
cout<<"arg "<<i<<": "<<args[i]<<endl;
return 0;
}
and this is the output:
original string:234 "5678 91" 8989
arg 0: 234
arg 1: "5678 91"
arg 2: 8989

Remove whitespace from string excluding parts between pairs of " and ' C++

So essentially what I want to do is erase all the whitespace from an std::string object, however excluding parts within speech marks and quote marks (so basically strings), eg:
Hello, World! I am a string
Would result in:
Hello,World!Iamastring
However things within speech marks/quote marks would be ignored:
"Hello, World!" I am a string
Would result in:
"Hello, World!"Iamastring
Or:
Hello,' World! I' am a string
Would be:
Hello,' World! I'amastring
Is there a simple routine to perform this to a string, either one build into the standard library or an example of how to write my own? It doesn't have to be the most efficient one possible, as it will only be run once or twice every time the program runs.
No, there is not such a routine ready.
You may build your own though.
You have to loop over the string and you want to use a flag. If the flag is true, then you delete the spaces, if it is false, you ignore them. The flag is true when you are not in a part of quotes, else it's false.
Here is a naive, not widely tested example:
#include <string>
#include <iostream>
using namespace std;
int main() {
// we will copy the result in new string for simplicity
// of course you can do it inplace. This takes into account only
// double quotes. Easy to extent do single ones though!
string str("\"Hello, World!\" I am a string");
string new_str = "";
// flags for when to delete spaces or not
// 'start' helps you find if you are in an area of double quotes
// If you are, then don't delete the spaces, otherwise, do delete
bool delete_spaces = true, start = false;
for(unsigned int i = 0; i < str.size(); ++i) {
if(str[i] == '\"') {
start ? start = false : start = true;
if(start) {
delete_spaces = false;
}
}
if(!start) {
delete_spaces = true;
}
if(delete_spaces) {
if(str[i] != ' ') {
new_str += str[i];
}
} else {
new_str += str[i];
}
}
cout << "new_str=|" << new_str << "|\n";
return 0;
}
Output:
new_str=|"Hello, World!"Iamastring|
Here we go. I ended up iterating through the string, and if it finds either a " or a ', it will flip the ignore flag. If the ignore flag is true and the current character is not a " or a ', the iterator just increments until it either reaches the end of the string or finds another "/'. If the ignore flag is false, it will remove the current character if it's whitespace (either space, newline or tab).
EDIT: this code now supports ignoring escaped characters (\", \') and making sure a string starting with a " ends with a ", and a string starting with a ' ends with a ', ignoring anything else in between.
#include <iostream>
#include <string>
int main() {
std::string str("I am some code, with \"A string here\", but not here\\\". 'This sentence \" should not end yet', now it should. There is also 'a string here' too.\n");
std::string::iterator endVal = str.end(); // a kind of NULL pointer
std::string::iterator type = endVal; // either " or '
bool ignore = false; // whether to ignore the current character or not
for (std::string::iterator it=str.begin(); it!=str.end();)
{
// ignore escaped characters
if ((*it) == '\\')
{
it += 2;
}
else
{
if ((*it) == '"' || (*it) == '\'')
{
if (ignore) // within a string
{
if (type != endVal && (*it) == (*type))
{
// end of the string
ignore = false;
type = endVal;
}
}
else // outside of a string, so one must be starting.
{
type = it;
ignore = true;
}
it++;
//ignore ? ignore = false : ignore = true;
//type = it;
}
else
{
if (!ignore)
{
if ((*it) == ' ' || (*it) == '\n' || (*it) == '\t')
{
it = str.erase(it);
}
else
{
it++;
}
}
else
{
it++;
}
}
}
}
std::cout << "string now is: " << str << std::endl;
return 0;
}
Argh, and here I spent time writing this (simple) version:
#include <cctype>
#include <ciso646>
#include <iostream>
#include <string>
template <typename Predicate>
std::string remove_unquoted_chars( const std::string& s, Predicate p )
{
bool skip = false;
char q = '\0';
std::string result;
for (char c : s)
if (skip)
{
result.append( 1, c );
skip = false;
}
else if (q)
{
result.append( 1, c );
skip = (c == '\\');
if (c == q) q = '\0';
}
else
{
if (!std::isspace( c ))
result.append( 1, c );
q = p( c ) ? c : '\0';
}
return result;
}
std::string remove_unquoted_whitespace( const std::string& s )
{
return remove_unquoted_chars( s, []( char c ) -> bool { return (c == '"') or (c == '\''); } );
}
int main()
{
std::string s;
std::cout << "s? ";
std::getline( std::cin, s );
std::cout << remove_unquoted_whitespace( s ) << "\n";
}
Removes all characters identified by the given predicate except stuff inside a single-quoted or double-quoted C-style string, taking care to respect escaped characters.
you may use erase-remove idiom like this
#include <string>
#include <iostream>
#include <algorithm>
int main()
{
std::string str("\"Hello, World!\" I am a string");
std::size_t x = str.find_last_of("\"");
std::string split1 = str.substr(0, ++x);
std::string split2 = str.substr(x, str.size());
split1.erase(std::remove(split1.begin(), split1.end(), '\\'), split1.end());
split2.erase(std::remove(split2.begin(), split2.end(), ' '), split2.end());
std::cout << split1 + split2;
}

Remove whitespace, convert case, in string except in quotes

I am using C++03 without Boost.
Suppose I have a string such as.. The day is "Mon day"
I want to process this to
THEDAYISMon day
That is, convert to upper case what is not in the quote, and remove whitespace that isn't in the quote.
The string may not contain quotes, but if it does, there will only be 2.
I tried using STL algorithms but I get stuck on how to remember if it's in a quote or not between elements.
Of course I can do it with good old for loops, but I was wondering if there is a fancy C++ way.
Thanks.
This is what I have using a for loop
while (getline(is, str))
{
// remove whitespace and convert case except in quotes
temp.clear();
bool bInQuote = false;
for (string::const_iterator it = str.begin(), end_it = str.end(); it != end_it; ++it)
{
char c = *it;
if (c == '\"')
{
bInQuote = (! bInQuote);
}
else
{
if (! ::isspace(c))
{
temp.push_back(bInQuote ? c : ::toupper(c));
}
}
}
swap(str, temp);
You can do something with STL algorithms like the following:
#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>
using namespace std;
struct convert {
void operator()(char& c) { c = toupper((unsigned char)c); }
};
bool isSpace(char c)
{
return std::isspace(c);
}
int main() {
string input = "The day is \"Mon Day\" You know";
cout << "original string: " << input <<endl;
unsigned int firstQuote = input.find("\"");
unsigned int secondQuote = input.find_last_of("\"");
string firstPart="";
string secondPart="";
string quotePart="";
if (firstQuote != string::npos)
{
firstPart = input.substr(0,firstQuote);
if (secondQuote != string::npos)
{
secondPart = input.substr(secondQuote+1);
quotePart = input.substr(firstQuote+1, secondQuote-firstQuote-1);
//drop those quotes
}
std::for_each(firstPart.begin(), firstPart.end(), convert());
firstPart.erase(remove_if(firstPart.begin(),
firstPart.end(), isSpace),firstPart.end());
std::for_each(secondPart.begin(), secondPart.end(), convert());
secondPart.erase(remove_if(secondPart.begin(),
secondPart.end(), isSpace),secondPart.end());
input = firstPart + quotePart + secondPart;
}
else //does not contains quote
{
std::for_each(input.begin(), input.end(), convert());
input.erase(remove_if(input.begin(),
input.end(), isSpace),input.end());
}
cout << "transformed string: " << input << endl;
return 0;
}
It gave the following output:
original string: The day is "Mon Day" You know
transformed string: THEDAYISMon DayYOUKNOW
With the test case you have shown:
original string: The day is "Mon Day"
transformed string: THEDAYISMon Day
Just for laughs, use a custom iterator, std::copy and a std::back_insert_iterator, and an operator++ that knows to skip whitespace and set a flag on a quote character:
CustomStringIt& CustomStringIt::operator++ ()
{
if(index_<originalString_.size())
++index_;
if(!inQuotes_ && isspace(originalString_[index_]))
return ++(*this);
if('\"'==originalString_[index_])
{
inQuotes_ = !inQuotes_;
return ++(*this);
}
return *this;
}
char CustomStringIt::operator* () const
{
char c = originalString_[index_];
return inQuotes_ ? c : std::toupper(c) ;
}
Full code here.
You can use stringstream and getline with the \" character as the delimiter instead of newline.
Split your string into 3 cases: the part of the string before the first quote, the part in quotes, and the part after the second quote.
You would process the first and third parts before adding to your output, but add the second part without processing.
If your string contains no quotes, the entire string will be contained in the first part. The second and third parts will just be empty.
while (getline (is, str)) {
string processed;
stringstream line(str);
string beforeFirstQuote;
string inQuotes;
getline(line, beforeFirstQuote, '\"');
Process(beforeFirstQuote, processed);
getline(line, inQuotes, '\"');
processed += inQuotes;
getline(line, afterSecondQuote, '\"');
Process(afterFirstQuote, processed);
}
void Process(const string& input, string& output) {
for (string::const_iterator it = input.begin(), end_it = input.end(); it != end_it; ++it)
{
char c = *it;
if (! ::isspace(c))
{
output.push_back(::toupper(c));
}
}
}