Remove whitespace from string excluding parts between pairs of " and ' C++

So essentially what I want to do is erase all the whitespace from an std::string object, however excluding parts within speech marks and quote marks (so basically strings), eg:
Hello, World! I am a string
Would result in:
Hello,World!Iamastring
However things within speech marks/quote marks would be ignored:
"Hello, World!" I am a string
Would result in:
"Hello, World!"Iamastring
Or:
Hello,' World! I' am a string
Would be:
Hello,' World! I'amastring
Is there a simple routine to perform this to a string, either one build into the standard library or an example of how to write my own? It doesn't have to be the most efficient one possible, as it will only be run once or twice every time the program runs.

No, there is no such ready-made routine.
You can build your own, though.
You have to loop over the string and use a flag. If the flag is true, you delete the spaces; if it is false, you keep them. The flag is true when you are outside a quoted part, and false inside one.
Here is a naive, not widely tested example:
#include <string>
#include <iostream>
using namespace std;
int main() {
// we will copy the result in new string for simplicity
// of course you can do it inplace. This takes into account only
// double quotes. Easy to extend to single ones though!
string str("\"Hello, World!\" I am a string");
string new_str = "";
// flags for when to delete spaces or not
// 'start' helps you find if you are in an area of double quotes
// If you are, then don't delete the spaces, otherwise, do delete
bool delete_spaces = true, start = false;
for(unsigned int i = 0; i < str.size(); ++i) {
if(str[i] == '\"') {
start = !start; // toggle: entering or leaving a quoted section
if(start) {
delete_spaces = false;
}
}
if(!start) {
delete_spaces = true;
}
if(delete_spaces) {
if(str[i] != ' ') {
new_str += str[i];
}
} else {
new_str += str[i];
}
}
cout << "new_str=|" << new_str << "|\n";
return 0;
}
Output:
new_str=|"Hello, World!"Iamastring|
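The comment in the code above says that extending it to single quotes is easy. A minimal sketch of that extension follows. It assumes that a quoted section ends at the next occurrence of the same quote character that opened it, and it does not handle escapes; the function name strip_unquoted_spaces is made up for illustration.

```cpp
#include <string>

// Hypothetical helper (not from the answer above): copies `in`, dropping
// spaces that are outside "..." and '...' sections. Escapes are not handled.
std::string strip_unquoted_spaces(const std::string& in) {
    std::string out;
    char open_quote = '\0';  // '\0' means "not inside a quoted section"
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        const char c = in[i];
        if (open_quote == '\0') {
            if (c == '"' || c == '\'') {
                open_quote = c;   // a quoted section starts here
            } else if (c == ' ') {
                continue;         // unquoted space: drop it
            }
        } else if (c == open_quote) {
            open_quote = '\0';    // the matching quote closes the section
        }
        out += c;
    }
    return out;
}
```

Remembering which quote character opened the section, rather than a plain bool, is what lets a " inside '...' pass through untouched.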

Here we go. I ended up iterating through the string, and if it finds either a " or a ', it will flip the ignore flag. If the ignore flag is true and the current character is not a " or a ', the iterator just increments until it either reaches the end of the string or finds another "/'. If the ignore flag is false, it will remove the current character if it's whitespace (either space, newline or tab).
EDIT: this code now supports ignoring escaped characters (\", \') and making sure a string starting with a " ends with a ", and a string starting with a ' ends with a ', ignoring anything else in between.
#include <iostream>
#include <string>
int main() {
std::string str("I am some code, with \"A string here\", but not here\\\". 'This sentence \" should not end yet', now it should. There is also 'a string here' too.\n");
std::string::iterator endVal = str.end(); // a kind of NULL pointer
std::string::iterator type = endVal; // either " or '
bool ignore = false; // whether to ignore the current character or not
for (std::string::iterator it=str.begin(); it!=str.end();)
{
// ignore escaped characters
if ((*it) == '\\')
{
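++it; // skip the backslash
if (it != str.end()) ++it; // skip the escaped character, unless the string ends here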
}
else
{
if ((*it) == '"' || (*it) == '\'')
{
if (ignore) // within a string
{
if (type != endVal && (*it) == (*type))
{
// end of the string
ignore = false;
type = endVal;
}
}
else // outside of a string, so one must be starting.
{
type = it;
ignore = true;
}
it++;
//ignore ? ignore = false : ignore = true;
//type = it;
}
else
{
if (!ignore)
{
if ((*it) == ' ' || (*it) == '\n' || (*it) == '\t')
{
it = str.erase(it);
}
else
{
it++;
}
}
else
{
it++;
}
}
}
}
std::cout << "string now is: " << str << std::endl;
return 0;
}

Argh, and here I spent time writing this (simple) version:
#include <cctype>
#include <ciso646>
#include <iostream>
#include <string>
template <typename Predicate>
std::string remove_unquoted_chars( const std::string& s, Predicate p )
{
bool skip = false;
char q = '\0';
std::string result;
for (char c : s)
if (skip)
{
result.append( 1, c );
skip = false;
}
else if (q)
{
result.append( 1, c );
skip = (c == '\\');
if (c == q) q = '\0';
}
else
{
if (!std::isspace( static_cast<unsigned char>( c ) )) // cast: negative char values are undefined behavior for isspace
result.append( 1, c );
q = p( c ) ? c : '\0';
}
return result;
}
std::string remove_unquoted_whitespace( const std::string& s )
{
return remove_unquoted_chars( s, []( char c ) -> bool { return (c == '"') or (c == '\''); } );
}
int main()
{
std::string s;
std::cout << "s? ";
std::getline( std::cin, s );
std::cout << remove_unquoted_whitespace( s ) << "\n";
}
Removes all whitespace except inside a single-quoted or double-quoted C-style string (the predicate identifies which characters open a quote), taking care to respect escaped characters.

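You may use the erase-remove idiom like this. Note that this version leaves everything up to the last double quote untouched (apart from dropping backslashes) and strips spaces only after it: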
#include <string>
#include <iostream>
#include <algorithm>
int main()
{
std::string str("\"Hello, World!\" I am a string");
std::size_t x = str.find_last_of("\"");
std::string split1 = str.substr(0, ++x);
std::string split2 = str.substr(x, str.size());
split1.erase(std::remove(split1.begin(), split1.end(), '\\'), split1.end());
split2.erase(std::remove(split2.begin(), split2.end(), ' '), split2.end());
std::cout << split1 + split2;
}

Related

Word counter returning incorrect number of words

I've been trying to create a program that reads text from a file and stores it in a string. I feed the string to a function that counts every word in the string.
However, it's only accurate if the user leaves some whitespace at the end of a line and doesn't create blank lines... not a very good word counter.
Creating a blank line results in a false increment to the word count.
I'm not sure if my main problem is using a boolean to do this or checking for whitespace and '\n' characters.
bool countingLetters = false;
int wordCount = 0;
for (int i = 0; i < text.length(); i++)
{
if (text[i] == ' ' && countingLetters == true)
{
countingLetters = false;
wordCount++;
}
if (text[i] != ' ' && countingLetters == false)
{
countingLetters = true;
}
if (text[i] == '\n' && countingLetters == true)
{
countingLetters = false;
wordCount++;
}
}
Your code is basically a state machine. To complete your solution, just account for the end of the string.
Add this to the end of your code:
if(countingLetters) { // word at the end of string, without any space character
wordCount++;
}
Or, since std::string guarantees (as of C++11) that text[text.length()] is '\0', you can index one past the last character and handle '\0' in the same way as space and '\n'.
To improve your code, use isspace (this covers more whitespace characters, including '\t', etc.). It is also better to use an else if pattern, and it's not good practice to compare with == true; just use the boolean as the condition.
Or maybe isalpha(c) fits your need better.
bool countingLetters = false;
int wordCount = 0;
for (char c:text) {
if (!isalpha(c) && countingLetters) { // this also works for newline
countingLetters = false;
++wordCount;
} else if (isalpha(c) && !countingLetters) {
countingLetters = true;
} // otherwise just skip
}
if(countingLetters) { // word at the end of string, without any space character
++wordCount;
}
Appending an extra character to the text just to make the loop work is not a good fix for such a simple task; for example, text may be const.
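The isspace suggestion above can be sketched as a small function. This assumes a word is any maximal run of non-whitespace characters, which may or may not match what you mean by a word; count_words_ws is a name invented here. The cast to unsigned char is there because passing a negative char value to std::isspace is undefined behavior.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: counts maximal runs of non-whitespace characters,
// using the same "count when a word ends" state machine as above.
std::size_t count_words_ws(const std::string& text) {
    std::size_t words = 0;
    bool in_word = false;
    for (char c : text) {
        const bool space = std::isspace(static_cast<unsigned char>(c)) != 0;
        if (space && in_word) {
            in_word = false;  // a word just ended
            ++words;
        } else if (!space && !in_word) {
            in_word = true;   // a word just started
        }
    }
    if (in_word) {
        ++words;              // word at the very end, with no whitespace after it
    }
    return words;
}
```

Blank lines and trailing whitespace do not change the count, because consecutive whitespace characters never toggle the state.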
An alternative is to count the beginning of a "word".
Let us say the beginning of a word is a letter after a non-letter. We can adjust this if desired.
int wordCount = 0;
int prior = '\n'; // some non-letter
for (int i = 0; i < text.length(); i++) {
if (isalpha(text[i]) && !isalpha(prior)) {
wordCount++;
}
prior = text[i];
}
C++ also provides some very high-level ways to do this.
One is by using a loop over a stringstream, which splits text on whitespace:
#include <sstream>
#include <string>
std::size_t count_words( const std::string& s )
{
std::size_t count = 0;
std::istringstream ss( s );
std::string t;
while (ss >> t) count += 1;
return count;
}
Another is using a stream iterator algorithm:
#include <iterator>
#include <sstream>
#include <string>
std::size_t count_words( const std::string& s )
{
std::istringstream ss( s );
return std::distance(
std::istream_iterator <std::string> ( ss ),
std::istream_iterator <std::string> ()
);
}
Yet another is using a regular expression:
#include <iterator>
#include <regex>
#include <string>
std::size_t count_words( const std::string& s )
{
std::regex re( "\\w+" );
return std::distance(
std::sregex_iterator( s.begin(), s.end(), re ),
std::sregex_iterator()
);
}
I’m sure there are many more, but those three are the ones that come off the top of my head.

Deleting spaces from the beginning of strings [duplicate]

How to remove spaces from a string object in C++.
For example, how to remove leading and trailing spaces from the below string object.
//Original string: " This is a sample string "
//Desired string: "This is a sample string"
The string class, as far as I know, doesn't provide any methods to remove leading and trailing spaces.
To add to the problem, how to extend this formatting to process extra spaces between words of the string. For example,
// Original string: " This is a sample string "
// Desired string: "This is a sample string"
Using the string methods mentioned in the solution, I can think of doing these operations in two steps.
Remove leading and trailing spaces.
Use find_first_of, find_last_of, find_first_not_of, find_last_not_of and substr, repeatedly at word boundaries to get desired formatting.
This is called trimming. If you can use Boost, I'd recommend it.
Otherwise, use find_first_not_of to get the index of the first non-whitespace character, then find_last_not_of to get the index from the end that isn't whitespace. With these, use substr to get the sub-string with no surrounding whitespace.
In response to your edit, I don't know the term but I'd guess something along the lines of "reduce", so that's what I called it. :) (Note, I've changed the white-space to be a parameter, for flexibility)
#include <iostream>
#include <string>
std::string trim(const std::string& str,
const std::string& whitespace = " \t")
{
const auto strBegin = str.find_first_not_of(whitespace);
if (strBegin == std::string::npos)
return ""; // no content
const auto strEnd = str.find_last_not_of(whitespace);
const auto strRange = strEnd - strBegin + 1;
return str.substr(strBegin, strRange);
}
std::string reduce(const std::string& str,
const std::string& fill = " ",
const std::string& whitespace = " \t")
{
// trim first
auto result = trim(str, whitespace);
// replace sub ranges
auto beginSpace = result.find_first_of(whitespace);
while (beginSpace != std::string::npos)
{
const auto endSpace = result.find_first_not_of(whitespace, beginSpace);
const auto range = endSpace - beginSpace;
result.replace(beginSpace, range, fill);
const auto newStart = beginSpace + fill.length();
beginSpace = result.find_first_of(whitespace, newStart);
}
return result;
}
int main(void)
{
const std::string foo = " too much\t \tspace\t\t\t ";
const std::string bar = "one\ntwo";
std::cout << "[" << trim(foo) << "]" << std::endl;
std::cout << "[" << reduce(foo) << "]" << std::endl;
std::cout << "[" << reduce(foo, "-") << "]" << std::endl;
std::cout << "[" << trim(bar) << "]" << std::endl;
}
Result:
[too much space]
[too much space]
[too-much-space]
[one
two]
Easy removing leading, trailing and extra spaces from a std::string in one line
value = std::regex_replace(value, std::regex("^ +| +$|( ) +"), "$1");
removing only leading spaces
value.erase(value.begin(), std::find_if(value.begin(), value.end(), [](char c) { return c != ' '; })); // lambda instead of std::bind1st, which was removed in C++17
or
value = std::regex_replace(value, std::regex("^ +"), "");
removing only trailing spaces
value.erase(std::find_if(value.rbegin(), value.rend(), [](char c) { return c != ' '; }).base(), value.end()); // lambda instead of std::bind1st, which was removed in C++17
or
value = std::regex_replace(value, std::regex(" +$"), "");
removing only extra spaces
value = std::regex_replace(value, std::regex(" +"), " ");
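As a sketch of how the first one-liner above could be packaged, here is a small wrapper. The name squeeze_spaces is an assumption made for illustration, and like the one-liner it handles only the space character, not tabs or newlines.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper around the one-line regex above: trims leading and
// trailing spaces and collapses inner runs of spaces to a single space.
std::string squeeze_spaces(const std::string& value) {
    // "^ +"    leading spaces      -> removed ($1 did not participate, so it is empty)
    // " +$"    trailing spaces     -> removed ($1 did not participate, so it is empty)
    // "( ) +"  two or more spaces  -> replaced by the captured single space
    static const std::regex spaces("^ +| +$|( ) +");
    return std::regex_replace(value, spaces, "$1");
}
```

Constructing the std::regex once (here as a function-local static) avoids recompiling the pattern on every call, which is usually the expensive part.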
I am currently using these functions:
// trim from left
inline std::string& ltrim(std::string& s, const char* t = " \t\n\r\f\v")
{
s.erase(0, s.find_first_not_of(t));
return s;
}
// trim from right
inline std::string& rtrim(std::string& s, const char* t = " \t\n\r\f\v")
{
s.erase(s.find_last_not_of(t) + 1);
return s;
}
// trim from left & right
inline std::string& trim(std::string& s, const char* t = " \t\n\r\f\v")
{
return ltrim(rtrim(s, t), t);
}
// copying versions
inline std::string ltrim_copy(std::string s, const char* t = " \t\n\r\f\v")
{
return ltrim(s, t);
}
inline std::string rtrim_copy(std::string s, const char* t = " \t\n\r\f\v")
{
return rtrim(s, t);
}
inline std::string trim_copy(std::string s, const char* t = " \t\n\r\f\v")
{
return trim(s, t);
}
Boost string trim algorithm
#include <boost/algorithm/string/trim.hpp>
[...]
std::string msg = " some text with spaces ";
boost::algorithm::trim(msg);
This is my solution for stripping the leading and trailing spaces ...
std::string stripString = " Plamen ";
while(!stripString.empty() && std::isspace(*stripString.begin()))
stripString.erase(stripString.begin());
while(!stripString.empty() && std::isspace(*stripString.rbegin()))
stripString.erase(stripString.length()-1);
The result is "Plamen"
Here is how you can do it:
std::string & trim(std::string & str)
{
return ltrim(rtrim(str));
}
And the supporting functions are implemented as:
std::string & ltrim(std::string & str)
{
auto it2 = std::find_if( str.begin() , str.end() , [](char ch){ return !std::isspace<char>(ch , std::locale::classic() ) ; } );
str.erase( str.begin() , it2);
return str;
}
std::string & rtrim(std::string & str)
{
auto it1 = std::find_if( str.rbegin() , str.rend() , [](char ch){ return !std::isspace<char>(ch , std::locale::classic() ) ; } );
str.erase( it1.base() , str.end() );
return str;
}
And once you've all these in place, you can write this as well:
std::string trim_copy(std::string const & str)
{
auto s = str;
return ltrim(rtrim(s));
}
C++17 introduced std::basic_string_view, a class template that refers to a constant contiguous sequence of char-like objects, i.e. a view of the string. Apart from having a very similar interface to std::basic_string, it has two additional functions: remove_prefix(), which shrinks the view by moving its start forward; and
remove_suffix(), which shrinks the view by moving its end backward. These can be used to trim leading and trailing space:
#include <algorithm> // std::min
#include <string_view>
#include <string>
std::string_view ltrim(std::string_view str)
{
const auto pos(str.find_first_not_of(" \t\n\r\f\v"));
str.remove_prefix(std::min(pos, str.length()));
return str;
}
std::string_view rtrim(std::string_view str)
{
const auto pos(str.find_last_not_of(" \t\n\r\f\v"));
str.remove_suffix(std::min(str.length() - pos - 1, str.length()));
return str;
}
std::string_view trim(std::string_view str)
{
str = ltrim(str);
str = rtrim(str);
return str;
}
int main()
{
std::string str = " hello world ";
auto sv1{ ltrim(str) }; // "hello world "
auto sv2{ rtrim(str) }; // " hello world"
auto sv3{ trim(str) }; // "hello world"
//If you want, you can create std::string objects from std::string_view objects
std::string s1{ sv1 };
std::string s2{ sv2 };
std::string s3{ sv3 };
}
Note: the use of std::min to ensure pos is not greater than size(), which happens when all characters in the string are whitespace and find_first_not_of returns npos. Also, std::string_view is a non-owning reference, so it's only valid as long as the original string still exists. Trimming the string view has no effect on the string it is based on.
Example of trimming leading and trailing spaces, following jon-hanson's suggestion to use Boost (it removes only leading and trailing spaces):
#include <boost/algorithm/string/trim.hpp>
std::string str = " t e s t ";
boost::algorithm::trim ( str );
Results in "t e s t"
There is also
trim_left results in "t e s t "
trim_right results in " t e s t"
/// strip a string, remove leading and trailing spaces
void strip(const string& in, string& out)
{
string::const_iterator b = in.begin(), e = in.end();
// skipping leading spaces
while (b != e && isSpace(*b)){ // stop at the end so an empty or all-space string is safe
++b;
}
if (b != e){
// skipping trailing spaces
while (isSpace(*(e-1))){
--e;
}
}
out.assign(b, e);
}
In the above code, the isSpace() function is a boolean function that tells whether a character is a white space, you can implement this function to reflect your needs, or just call the isspace() from "ctype.h" if you want.
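One way to write the isSpace() helper mentioned above, as a sketch. It assumes the classification of the current C locale is what you want, and it converts to unsigned char first because std::isspace has undefined behavior for negative values other than EOF.

```cpp
#include <cctype>

// One possible isSpace(): forwards to std::isspace, converting to
// unsigned char first so that negative char values are never passed through.
bool isSpace(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}
```

If you only care about a fixed set of characters, comparing against them directly (for example c == ' ' || c == '\t') is an equally valid implementation.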
Example for trimming leading and trailing spaces
std::string aString(" This is a string to be trimmed ");
auto start = aString.find_first_not_of(' ');
auto end = aString.find_last_not_of(' ');
std::string trimmedString;
trimmedString = aString.substr(start, (end - start) + 1);
OR
trimmedString = aString.substr(aString.find_first_not_of(' '), (aString.find_last_not_of(' ') - aString.find_first_not_of(' ')) + 1);
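Using the standard library has many benefits, but one must be aware of some special cases that cause trouble. For example, none of the answers covered the case where a C++ string contains non-ASCII characters (for example UTF-8 encoded Unicode). If char is signed, the bytes of such characters are negative, and passing a negative value other than EOF to isspace is undefined behavior; some debug runtimes abort with an assertion failure.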
I have been using the following code for trimming the strings and some other operations that might come in handy. The major benefits of this code are: it is really fast (faster than any code I have ever tested), it only uses the standard library, and it never causes an exception:
#include <string>
#include <algorithm>
#include <functional>
#include <locale>
#include <iostream>
typedef unsigned char BYTE;
std::string strTrim(std::string s, char option = 0)
{
// convert all whitespace characters to a standard space
std::replace_if(s.begin(), s.end(), (std::function<int(BYTE)>)::isspace, ' ');
// remove leading and trailing spaces
size_t f = s.find_first_not_of(' ');
if (f == std::string::npos) return "";
s = s.substr(f, s.find_last_not_of(' ') - f + 1);
// remove consecutive spaces
s = std::string(s.begin(), std::unique(s.begin(), s.end(),
[](BYTE l, BYTE r){ return l == ' ' && r == ' '; }));
switch (option)
{
case 'l': // convert to lowercase
std::transform(s.begin(), s.end(), s.begin(), ::tolower);
return s;
case 'U': // convert to uppercase
std::transform(s.begin(), s.end(), s.begin(), ::toupper);
return s;
case 'n': // remove all spaces
s.erase(std::remove(s.begin(), s.end(), ' '), s.end());
return s;
default: // just trim
return s;
}
}
This might be the simplest of all.
You can use string::find and string::rfind to find whitespace from both sides and reduce the string.
void TrimWord(std::string& word)
{
if (word.empty()) return;
// Trim spaces from left side
while (word.find(" ") == 0)
{
word.erase(0, 1);
}
// Trim spaces from right side
while (!word.empty() && word[word.size() - 1] == ' ') // the empty() check keeps an all-space input safe
{
word.erase(word.size() - 1);
}
}
To add to the problem, how to extend this formatting to process extra spaces between words of the string.
Actually, this is a simpler case than accounting for multiple leading and trailing white-space characters. All you need to do is remove duplicate adjacent white-space characters from the entire string.
The predicate for adjacent white space would simply be:
auto by_space = [](unsigned char a, unsigned char b) {
return std::isspace(a) and std::isspace(b);
};
and then you can get rid of those duplicate adjacent white-space characters with std::unique, and the erase-remove idiom:
// s = " This is a sample string "
s.erase(std::unique(std::begin(s), std::end(s), by_space),
std::end(s));
// s = " This is a sample string "
This does potentially leave an extra white-space character at the front and/or the back. This can be removed quite easily:
if (std::size(s) && std::isspace(s.back()))
s.pop_back();
if (std::size(s) && std::isspace(s.front()))
s.erase(0, 1);
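Put together, the steps above could look like the following sketch. The function name collapse_whitespace is an assumption, and it works on a copy so that the caller's string is left alone.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper combining the steps above on a copy of the input.
std::string collapse_whitespace(std::string s) {
    auto both_space = [](unsigned char a, unsigned char b) {
        return std::isspace(a) && std::isspace(b);
    };
    // Keep only the first character of each run of whitespace.
    s.erase(std::unique(s.begin(), s.end(), both_space), s.end());
    // After that, at most one whitespace character can remain at either end.
    if (!s.empty() && std::isspace(static_cast<unsigned char>(s.back()))) {
        s.pop_back();
    }
    if (!s.empty() && std::isspace(static_cast<unsigned char>(s.front()))) {
        s.erase(0, 1);
    }
    return s;
}
```

Note that std::unique keeps the first character of each run, so a run such as "\t\n " collapses to a tab rather than to a space.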
I've tested this, and it all works. The function processInput asks the user to type something in and returns a lowercased string that has no extra spaces internally, nor extra spaces at the beginning or the end. Hope this helps. (I also put in a heap of comments to make it simple to understand.)
You can see how to call it in main() at the bottom.
#include <string>
#include <iostream>
using namespace std; // the code below uses string, cin and cout unqualified
string processInput() {
char inputChar[256];
string output = "";
int outputLength = 0;
bool space = false;
// user inputs a string.. well a char array
cin.getline(inputChar,256);
output = inputChar;
string outputToLower = "";
// put characters to lower and reduce spaces
for(int i = 0; i < output.length(); i++){
// if it's caps put it to lowercase
output[i] = tolower(output[i]);
// make sure we do not include tabs or line returns or weird symbol for null entry array thingy
if (output[i] != '\t' && output[i] != '\n' && output[i] != 'Ì') {
if (space) {
// if the previous space was a space but this one is not, then space now is false and add char
if (output[i] != ' ') {
space = false;
// add the char
outputToLower+=output[i];
}
} else {
// if space is false, make it true if the char is a space
if (output[i] == ' ') {
space = true;
}
// add the char
outputToLower+=output[i];
}
}
}
// trim leading and trailing space
string trimmedOutput = "";
for(int i = 0; i < outputToLower.length(); i++){
// if it's the last character and it's not a space, then add it
// if it's the first character and it's not a space, then add it
// if it's not the first or the last then add it
if (i == outputToLower.length() - 1 && outputToLower[i] != ' ' ||
i == 0 && outputToLower[i] != ' ' ||
i > 0 && i < outputToLower.length() - 1) {
trimmedOutput += outputToLower[i];
}
}
// return
output = trimmedOutput;
return output;
}
int main() {
cout << "Username: ";
string userName = processInput();
cout << "\nModified Input = " << userName << endl;
}
Why complicate?
std::string removeSpaces(std::string x){
if(x.empty()) return x; // nothing left to trim; also avoids indexing x[length() - 1] on an empty string
if(x[0] == ' ') { x.erase(0, 1); return removeSpaces(x); }
if(x[x.length() - 1] == ' ') { x.erase(x.length() - 1, 1); return removeSpaces(x); }
else return x;
}
This works even if boost was to fail, no regex, no weird stuff nor libraries.
EDIT:
Fix for M.M.'s comment.
No boost, no regex, just the string library. It's that simple.
string trim(const string& s) { // removes whitespace characters from beginning and end of string s
const int l = (int)s.length();
int a=0, b=l-1;
char c;
while(a<l && ((c=s[a])==' '||c=='\t'||c=='\n'||c=='\v'||c=='\f'||c=='\r'||c=='\0')) a++;
while(b>a && ((c=s[b])==' '||c=='\t'||c=='\n'||c=='\v'||c=='\f'||c=='\r'||c=='\0')) b--;
return s.substr(a, 1+b-a);
}
Removing leading and trailing spaces with constant extra space can be achieved by using the pop_back() function of the string (the two reversals still take linear time). Code looks as follows:
void trimTrailingSpaces(string& s) {
while (s.size() > 0 && s.back() == ' ') {
s.pop_back();
}
}
void trimSpaces(string& s) {
//trim trailing spaces.
trimTrailingSpaces(s);
//trim leading spaces
//To reduce complexity, reversing and removing trailing spaces
//and again reversing back
reverse(s.begin(), s.end());
trimTrailingSpaces(s);
reverse(s.begin(), s.end());
}
char *buf = (char*) malloc(50 * sizeof(char));
strcpy(buf, " some random string (<50 chars) ");
char *str = buf; // keep buf so the allocation can still be freed
while(*str == ' ' || *str == '\t' || *str == '\n')
str++;
size_t len = strlen(str);
while(len > 0 &&
(str[len - 1] == ' ' || str[len - 1] == '\t' || str[len - 1] == '\n'))
{
str[len - 1] = '\0';
len--;
}
printf(":%s:\n", str);
free(buf);
void removeSpaces(string& str)
{
/* remove multiple spaces */
int k=0;
for (int j=0; j<str.size(); ++j)
{
if ( (str[j] != ' ') || (str[j] == ' ' && str[j+1] != ' ' ))
{
str [k] = str [j];
++k;
}
}
str.resize(k);
/* remove space at the end */
if (k > 0 && str [k-1] == ' ') // k is 0 for an empty input
str.erase(str.end()-1);
/* remove space at the begin */
if (str [0] == ' ')
str.erase(str.begin());
}
string trim(const string & sStr)
{
int nSize = sStr.size();
int nSPos = 0, nEPos = 1, i;
for(i = 0; i< nSize; ++i) {
if( !isspace( sStr[i] ) ) {
nSPos = i ;
break;
}
}
for(i = nSize -1 ; i >= 0 ; --i) {
if( !isspace( sStr[i] ) ) {
nEPos = i;
break;
}
}
return string(sStr, nSPos, nEPos - nSPos + 1);
}
For leading and trailing spaces around a single word (operator>> stops at the first whitespace after the word), how about:
string string_trim(const string& in) {
stringstream ss;
string out;
ss << in;
ss >> out;
return out;
}
Or for a sentence:
string trim_words(const string& sentence) {
stringstream ss;
ss << sentence;
string s;
string out;
while(ss >> s) {
out+=(s+' ');
}
return out.substr(0, out.length()-1);
}
neat and clean
void trimLeftTrailingSpaces(string &input) {
input.erase(input.begin(), find_if(input.begin(), input.end(), [](int ch) {
return !isspace(ch);
}));
}
void trimRightTrailingSpaces(string &input) {
input.erase(find_if(input.rbegin(), input.rend(), [](int ch) {
return !isspace(ch);
}).base(), input.end());
}
This was the most intuitive way for me to solve this problem:
/**
* @brief Reverses a string, a helper function to removeLeadingTrailingSpaces
*
* @param line
* @return std::string
*/
std::string reverseString (std::string line) {
std::string reverse_line = "";
for(int i = line.length() - 1; i > -1; i--) {
reverse_line += line[i];
}
return reverse_line;
}
/**
* @brief Removes leading and trailing whitespace
* as well as extra whitespace within the line
*
* @param line
* @return std::string
*/
std::string removeLeadingTrailingSpaces(std::string line) {
std::string filtered_line = "";
std::string curr_line = line;
for(int loop = 0; loop < 2; loop++) {
bool leading_spaces_exist = true;
filtered_line = "";
std::string prev_char = "";
for(int i = 0; i < curr_line.length(); i++) { // curr_line can be shorter than line on the second pass
// Ignores leading whitespace
if(leading_spaces_exist) {
if(curr_line[i] != ' ') {
leading_spaces_exist = false;
}
}
// Puts the rest of the line in a variable
// and ignore back-to-back whitespace
if(!leading_spaces_exist) {
if(!(curr_line[i] == ' ' && prev_char == " ")) {
filtered_line += curr_line[i];
}
prev_char = curr_line[i];
}
}
/*
Reverses the line so that after we remove the leading whitespace
the trailing whitespace becomes the leading whitespace.
After the second round, it needs to reverse the string back to
its regular order.
*/
curr_line = reverseString(filtered_line);
}
return curr_line;
}
Basically, I looped through the string and removed the leading whitespace, then flipped the string and repeated the same process, then flipped back to normal.
I also added the functionality of cleaning up the line if there were back-to-back spaces.
My solution for this problem, which uses only std::string's own methods and no STL algorithms, is as follows:
void processString(string &s) {
if ( s.empty() ) return;
//delete leading and trailing spaces of the input string
int notSpaceStartPos = 0, notSpaceEndPos = s.length() - 1;
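while ( notSpaceStartPos <= notSpaceEndPos && s[notSpaceStartPos] == ' ' ) ++notSpaceStartPos;
while ( notSpaceEndPos >= notSpaceStartPos && s[notSpaceEndPos] == ' ' ) --notSpaceEndPos; // bounds check: never index s[-1] on an all-space string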
if ( notSpaceStartPos > notSpaceEndPos ) { s = ""; return; }
s = s.substr(notSpaceStartPos, notSpaceEndPos - notSpaceStartPos + 1);
//reduce multiple spaces between two words to a single space
string temp;
for ( int i = 0; i < s.length(); i++ ) {
if ( i > 0 && s[i] == ' ' && s[i-1] == ' ' ) continue;
temp.push_back(s[i]);
}
s = temp;
}
I have used this method to pass a LeetCode problem Reverse Words in a String
void TrimWhitespaces(std::wstring& str)
{
if (str.empty())
return;
const std::wstring& whitespace = L" \t";
std::wstring::size_type strBegin = str.find_first_not_of(whitespace);
std::wstring::size_type strEnd = str.find_last_not_of(whitespace);
if (strBegin != std::wstring::npos || strEnd != std::wstring::npos)
{
// both positions are valid here: find_first_not_of and find_last_not_of
// return npos only together (when the string is all whitespace)
const auto strRange = strEnd - strBegin + 1;
str.substr(strBegin, strRange).swap(str);
}
else if (str[0] == ' ' || str[0] == '\t') // handles non-empty spaces-only or tabs-only
{
str = L"";
}
}
void TrimWhitespacesTest()
{
std::wstring EmptyStr = L"";
std::wstring SpacesOnlyStr = L" ";
std::wstring TabsOnlyStr = L"\t\t";
std::wstring RightSpacesStr = L"12345 ";
std::wstring LeftSpacesStr = L" 12345";
std::wstring NoSpacesStr = L"12345";
TrimWhitespaces(EmptyStr);
TrimWhitespaces(SpacesOnlyStr);
TrimWhitespaces(TabsOnlyStr);
TrimWhitespaces(RightSpacesStr);
TrimWhitespaces(LeftSpacesStr);
TrimWhitespaces(NoSpacesStr);
assert(EmptyStr == L"");
assert(SpacesOnlyStr == L"");
assert(TabsOnlyStr == L"");
assert(RightSpacesStr == L"12345");
assert(LeftSpacesStr == L"12345");
assert(NoSpacesStr == L"12345");
}
What about the erase-remove idiom?
std::string s("...");
s.erase( std::remove(s.begin(), s.end(), ' '), s.end() );
Sorry. I saw too late that you don't want to remove all whitespace.

ask for text to edit, text formatting

I would like to make a program that asks for text (a paragraph with several words) that would be separated by commas.
The program should then transform the text by wrapping each word in a tag, for example to format the text as HTML.
Example:
word1, word2, word3
to
<a> word1 </a>, <a> word2 </a>, <a> word3 </a>
So I started doing this code but I do not know how to continue. How can I test the text to find the front of the word? I imagine with ASCII tests?
Maybe with a table that will test every case ?
I do not necessarily ask the complete answer but maybe a direction to follow could help.
#include <iostream>
#include <iomanip>
#include <string> //For getline()
using namespace std;
// Creating class
class GetText
{
public:
string text;
string line; //Using this as a buffer
void userText()
{
cout << "Please type a message: ";
do
{
getline(cin, line);
text += line;
}
while(line != "");
}
void to_string()
{
cout << "\n" << "User's Text: " << "\n" << text << endl;
}
};
int main() {
GetText test;
test.userText();
test.to_string();
system("pause");
return 0;
}
The next thing you would need to do is to split your input by a delimiter (in your case ',') into a vector and later combine everything with prefixes and postfixes. The C++ standard library has no ready-made split function, so you would have to be creative or search for an existing solution.
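The split-then-combine approach described above could be sketched as follows. The helper names split, trim and wrap_words are invented for this illustration, and the sketch assumes that only spaces and tabs around each word should be dropped.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at each `delim`.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::istringstream in(text);
    std::string part;
    while (std::getline(in, part, delim)) {
        parts.push_back(part);
    }
    return parts;
}

// Hypothetical helper: strips spaces and tabs from both ends.
std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Wraps every comma-separated word in prefix/postfix and rejoins with ", ".
std::string wrap_words(const std::string& text,
                       const std::string& prefix,
                       const std::string& postfix) {
    const std::vector<std::string> parts = split(text, ',');
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) out += ", ";
        out += prefix + trim(parts[i]) + postfix;
    }
    return out;
}
```

With the question's example, wrap_words("word1, word2, word3", "<a> ", " </a>") yields the requested <a> word1 </a>, <a> word2 </a>, <a> word3 </a>.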
If you want to keep it really simple, you can detect word boundaries by checking two characters at a time. Here's a working example.
#include <iostream>
#include <string>
#include <cctype>
using namespace std; // must come after the headers that declare namespace std
typedef enum boundary_type_e {
E_BOUNDARY_TYPE_ERROR = -1,
E_BOUNDARY_TYPE_NONE,
E_BOUNDARY_TYPE_LEFT,
E_BOUNDARY_TYPE_RIGHT,
} boundary_type_t;
typedef struct boundary_s {
boundary_type_t type;
int pos;
} boundary_t;
bool is_word_char(int c) {
return ' ' <= c && c <= '~' && !isspace(c) && c != ',';
}
boundary_t maybe_word_boundary(string str, int pos) {
int len = str.length();
if (pos < 0 || pos >= len) {
return (boundary_t){.type = E_BOUNDARY_TYPE_ERROR};
} else {
if (pos == 0 && is_word_char(str[pos])) {
// if the first character is word-y, we have a left boundary at the beginning
return (boundary_t){.type = E_BOUNDARY_TYPE_LEFT, .pos = pos};
} else if (pos == len - 1 && is_word_char(str[pos])) {
// if the last character is word-y, we have a right boundary left of the null terminator
return (boundary_t){.type = E_BOUNDARY_TYPE_RIGHT, .pos = pos + 1};
} else if (!is_word_char(str[pos]) && is_word_char(str[pos + 1])) {
// if we have a delimiter followed by a word char, we have a left boundary left of the word char
return (boundary_t){.type = E_BOUNDARY_TYPE_LEFT, .pos = pos + 1};
} else if (is_word_char(str[pos]) && !is_word_char(str[pos + 1])) {
// if we have a word char followed by a delimiter, we have a right boundary right of the word char
return (boundary_t){.type = E_BOUNDARY_TYPE_RIGHT, .pos = pos + 1};
}
return (boundary_t){.type = E_BOUNDARY_TYPE_NONE};
}
}
int main() {
string str;
string ins_left("<tag>");
string ins_right("</tag>");
getline(cin, str);
// can't use length for the loop condition without recalculating it all the time
for (int i = 0; str[i] != '\0'; i++) {
boundary_t boundary = maybe_word_boundary(str, i);
if (boundary.type == E_BOUNDARY_TYPE_LEFT) {
str.insert(boundary.pos, ins_left);
i += ins_left.length();
} else if (boundary.type == E_BOUNDARY_TYPE_RIGHT) {
str.insert(boundary.pos, ins_right);
i += ins_right.length();
}
}
cout << str << endl; // print the tagged text
}
It would be better to use enum class but I forgot the notation. You can also copy to a buffer instead of generating the new string in-place, I was just trying to keep it simple. Feel free to expand it to a class based C++ style. To get your exact desired output, strip the spaces first and add spaces to ins_left and ins_right.
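For reference, the enum class notation mentioned above looks roughly like this. The type and enumerator names are illustrative replacements for the typedef'd C-style ones in the answer, not a drop-in patch for that code.

```cpp
#include <cstddef>

// Scoped enumeration replacing the typedef'd C-style enum
// (names are illustrative, not taken from the answer).
enum class BoundaryType { Error, None, Left, Right };

struct Boundary {
    BoundaryType type;
    std::size_t pos;  // only meaningful for Left and Right
};

// Enumerators must be qualified with the enum's name, and they do not
// convert implicitly to int.
inline bool is_left(const Boundary& b) {
    return b.type == BoundaryType::Left;
}
```

A value is then created with, for example, Boundary{BoundaryType::Left, 3}, which is also valid standard C++ where the (boundary_t){.type = ...} compound literals rely on compiler extensions.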

Remove whitespace, convert case, in string except in quotes

I am using C++03 without Boost.
Suppose I have a string such as: The day is "Mon day"
I want to process this to
THEDAYISMon day
That is, convert to upper case what is not in the quote, and remove whitespace that isn't in the quote.
The string may not contain quotes, but if it does, there will only be 2.
I tried using STL algorithms but I get stuck on how to remember if it's in a quote or not between elements.
Of course I can do it with good old for loops, but I was wondering if there is a fancy C++ way.
Thanks.
This is what I have using a for loop
while (getline(is, str))
{
// remove whitespace and convert case except in quotes
temp.clear();
bool bInQuote = false;
for (string::const_iterator it = str.begin(), end_it = str.end(); it != end_it; ++it)
{
char c = *it;
if (c == '\"')
{
bInQuote = (! bInQuote);
}
else
{
if (! ::isspace(c))
{
temp.push_back(bInQuote ? c : ::toupper(c));
}
}
}
swap(str, temp);
}
You can do something with STL algorithms like the following:
#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>
using namespace std;
struct convert {
void operator()(char& c) { c = toupper((unsigned char)c); }
};
bool isSpace(char c)
{
return std::isspace((unsigned char)c) != 0;
}
int main() {
string input = "The day is \"Mon Day\" You know";
cout << "original string: " << input <<endl;
// size_type, not unsigned int: a narrower type can never compare equal to string::npos
string::size_type firstQuote = input.find("\"");
string::size_type secondQuote = input.find_last_of("\"");
string firstPart="";
string secondPart="";
string quotePart="";
if (firstQuote != string::npos)
{
firstPart = input.substr(0,firstQuote);
if (secondQuote != string::npos)
{
secondPart = input.substr(secondQuote+1);
quotePart = input.substr(firstQuote+1, secondQuote-firstQuote-1);
//drop those quotes
}
std::for_each(firstPart.begin(), firstPart.end(), convert());
firstPart.erase(remove_if(firstPart.begin(),
firstPart.end(), isSpace),firstPart.end());
std::for_each(secondPart.begin(), secondPart.end(), convert());
secondPart.erase(remove_if(secondPart.begin(),
secondPart.end(), isSpace),secondPart.end());
input = firstPart + quotePart + secondPart;
}
else //does not contains quote
{
std::for_each(input.begin(), input.end(), convert());
input.erase(remove_if(input.begin(),
input.end(), isSpace),input.end());
}
cout << "transformed string: " << input << endl;
return 0;
}
It gave the following output:
original string: The day is "Mon Day" You know
transformed string: THEDAYISMon DayYOUKNOW
With the test case you have shown:
original string: The day is "Mon Day"
transformed string: THEDAYISMon Day
Just for laughs, use a custom iterator, std::copy and a std::back_insert_iterator, and an operator++ that knows to skip whitespace and set a flag on a quote character:
CustomStringIt& CustomStringIt::operator++ ()
{
if(index_<originalString_.size())
++index_;
if(!inQuotes_ && isspace(originalString_[index_]))
return ++(*this);
if('\"'==originalString_[index_])
{
inQuotes_ = !inQuotes_;
return ++(*this);
}
return *this;
}
char CustomStringIt::operator* () const
{
char c = originalString_[index_];
return inQuotes_ ? c : std::toupper(c) ;
}
You can use stringstream and getline with the \" character as the delimiter instead of newline.
Split your string into 3 cases: the part of the string before the first quote, the part in quotes, and the part after the second quote.
You would process the first and third parts before adding to your output, but add the second part without processing.
If your string contains no quotes, the entire string will be contained in the first part. The second and third parts will just be empty.
while (getline (is, str)) {
string processed;
stringstream line(str);
string beforeFirstQuote;
string inQuotes;
string afterSecondQuote;
getline(line, beforeFirstQuote, '\"');
Process(beforeFirstQuote, processed);
getline(line, inQuotes, '\"');
processed += inQuotes;
getline(line, afterSecondQuote, '\"');
Process(afterSecondQuote, processed);
}
void Process(const string& input, string& output) {
for (string::const_iterator it = input.begin(), end_it = input.end(); it != end_it; ++it)
{
char c = *it;
if (! ::isspace(c))
{
output.push_back(::toupper(c));
}
}
}
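The answers above rely on there being at most one quoted section. If that assumption ever stops holding, a single pass with a flag handles any number of quote pairs. The following is a minimal sketch that stays within C++03; squeeze_outside_quotes is a hypothetical name, and it drops the quote characters themselves, as in the desired output from the question.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: upper-cases and strips whitespace outside double
// quotes, copies quoted text verbatim, and drops the quote characters.
std::string squeeze_outside_quotes(const std::string& in)
{
    std::string out;
    bool in_quote = false;
    for (std::string::const_iterator it = in.begin(); it != in.end(); ++it)
    {
        const unsigned char c = static_cast<unsigned char>(*it);
        if (c == '"')
            in_quote = !in_quote;   // toggle the state, do not copy the quote
        else if (in_quote)
            out.push_back(*it);     // inside quotes: copy unchanged
        else if (!std::isspace(c))
            out.push_back(static_cast<char>(std::toupper(c)));
    }
    return out;
}
```

With the input from the question, The day is "Mon day", this returns THEDAYISMon day.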

Removing leading and trailing spaces from a string

How do I remove spaces from a string object in C++?
For example, how do I remove the leading and trailing spaces from the string object below?
//Original string: " This is a sample string "
//Desired string: "This is a sample string"
The string class, as far as I know, doesn't provide any methods to remove leading and trailing spaces.
To add to the problem, how to extend this formatting to process extra spaces between words of the string. For example,
// Original string: "   This   is  a sample    string   "
// Desired string: "This is a sample string"
Using the string methods mentioned in the solution, I can think of doing these operations in two steps:
1. Remove leading and trailing spaces.
2. Use find_first_of, find_last_of, find_first_not_of, find_last_not_of and substr repeatedly at word boundaries to get the desired formatting.
This is called trimming. If you can use Boost, I'd recommend it.
Otherwise, use find_first_not_of to get the index of the first non-whitespace character, then find_last_not_of to get the index from the end that isn't whitespace. With these, use substr to get the sub-string with no surrounding whitespace.
In response to your edit, I don't know the term but I'd guess something along the lines of "reduce", so that's what I called it. :) (Note, I've changed the white-space to be a parameter, for flexibility)
#include <iostream>
#include <string>
std::string trim(const std::string& str,
const std::string& whitespace = " \t")
{
const auto strBegin = str.find_first_not_of(whitespace);
if (strBegin == std::string::npos)
return ""; // no content
const auto strEnd = str.find_last_not_of(whitespace);
const auto strRange = strEnd - strBegin + 1;
return str.substr(strBegin, strRange);
}
std::string reduce(const std::string& str,
const std::string& fill = " ",
const std::string& whitespace = " \t")
{
// trim first
auto result = trim(str, whitespace);
// replace sub ranges
auto beginSpace = result.find_first_of(whitespace);
while (beginSpace != std::string::npos)
{
const auto endSpace = result.find_first_not_of(whitespace, beginSpace);
const auto range = endSpace - beginSpace;
result.replace(beginSpace, range, fill);
const auto newStart = beginSpace + fill.length();
beginSpace = result.find_first_of(whitespace, newStart);
}
return result;
}
int main(void)
{
const std::string foo = " too much\t \tspace\t\t\t ";
const std::string bar = "one\ntwo";
std::cout << "[" << trim(foo) << "]" << std::endl;
std::cout << "[" << reduce(foo) << "]" << std::endl;
std::cout << "[" << reduce(foo, "-") << "]" << std::endl;
std::cout << "[" << trim(bar) << "]" << std::endl;
}
Result:
[too much space]
[too much space]
[too-much-space]
[one
two]
Removing leading, trailing and extra spaces from a std::string in one line:
value = std::regex_replace(value, std::regex("^ +| +$|( ) +"), "$1");
Removing only leading spaces:
value.erase(value.begin(), std::find_if(value.begin(), value.end(), [](char c) { return c != ' '; }));
or
value = std::regex_replace(value, std::regex("^ +"), "");
Removing only trailing spaces:
value.erase(std::find_if(value.rbegin(), value.rend(), [](char c) { return c != ' '; }).base(), value.end());
or
value = std::regex_replace(value, std::regex(" +$"), "");
Removing only extra spaces:
value = std::regex_replace(value, std::regex(" +"), " ");
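To make the first regex easier to follow, here is a small sketch that wraps it in a function and comments each alternative. The function name squeeze_spaces is an assumption for illustration; the pattern is the one from the answer.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper around the one-line regex from the answer.
std::string squeeze_spaces(const std::string& value)
{
    // "^ +"    leading spaces: group 1 does not take part, so "$1" is empty
    // " +$"    trailing spaces: same, replaced by nothing
    // "( ) +"  two or more spaces: replaced by the single captured space
    static const std::regex pattern("^ +| +$|( ) +");
    return std::regex_replace(value, pattern, "$1");
}
```

For example, squeeze_spaces("  too   much  space  ") returns "too much space". Only the space character is handled; tabs and newlines are left alone.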
I am currently using these functions:
// trim from left
inline std::string& ltrim(std::string& s, const char* t = " \t\n\r\f\v")
{
s.erase(0, s.find_first_not_of(t));
return s;
}
// trim from right
inline std::string& rtrim(std::string& s, const char* t = " \t\n\r\f\v")
{
s.erase(s.find_last_not_of(t) + 1);
return s;
}
// trim from left & right
inline std::string& trim(std::string& s, const char* t = " \t\n\r\f\v")
{
return ltrim(rtrim(s, t), t);
}
// copying versions
inline std::string ltrim_copy(std::string s, const char* t = " \t\n\r\f\v")
{
return ltrim(s, t);
}
inline std::string rtrim_copy(std::string s, const char* t = " \t\n\r\f\v")
{
return rtrim(s, t);
}
inline std::string trim_copy(std::string s, const char* t = " \t\n\r\f\v")
{
return trim(s, t);
}
Boost string trim algorithm
#include <boost/algorithm/string/trim.hpp>
[...]
std::string msg = " some text with spaces ";
boost::algorithm::trim(msg);
This is my solution for stripping the leading and trailing spaces ...
std::string stripString = " Plamen ";
while(!stripString.empty() && std::isspace(*stripString.begin()))
stripString.erase(stripString.begin());
while(!stripString.empty() && std::isspace(*stripString.rbegin()))
stripString.erase(stripString.length()-1);
The result is "Plamen"
Here is how you can do it:
std::string & trim(std::string & str)
{
return ltrim(rtrim(str));
}
The supporting functions are implemented as:
std::string & ltrim(std::string & str)
{
auto it2 = std::find_if( str.begin() , str.end() , [](char ch){ return !std::isspace<char>(ch , std::locale::classic() ) ; } );
str.erase( str.begin() , it2);
return str;
}
std::string & rtrim(std::string & str)
{
auto it1 = std::find_if( str.rbegin() , str.rend() , [](char ch){ return !std::isspace<char>(ch , std::locale::classic() ) ; } );
str.erase( it1.base() , str.end() );
return str;
}
Once all of these are in place, you can also write:
std::string trim_copy(std::string const & str)
{
auto s = str;
return ltrim(rtrim(s));
}
C++17 introduced std::basic_string_view, a class template that refers to a constant contiguous sequence of char-like objects, i.e. a view of the string. Apart from having a very similar interface to std::basic_string, it has two additional functions: remove_prefix(), which shrinks the view by moving its start forward, and remove_suffix(), which shrinks the view by moving its end backward. These can be used to trim leading and trailing space:
#include <algorithm>
#include <string_view>
#include <string>
std::string_view ltrim(std::string_view str)
{
const auto pos(str.find_first_not_of(" \t\n\r\f\v"));
str.remove_prefix(std::min(pos, str.length()));
return str;
}
std::string_view rtrim(std::string_view str)
{
const auto pos(str.find_last_not_of(" \t\n\r\f\v"));
str.remove_suffix(std::min(str.length() - pos - 1, str.length()));
return str;
}
std::string_view trim(std::string_view str)
{
str = ltrim(str);
str = rtrim(str);
return str;
}
int main()
{
std::string str = " hello world ";
auto sv1{ ltrim(str) }; // "hello world "
auto sv2{ rtrim(str) }; // " hello world"
auto sv3{ trim(str) }; // "hello world"
//If you want, you can create std::string objects from std::string_view objects
std::string s1{ sv1 };
std::string s2{ sv2 };
std::string s3{ sv3 };
}
Note the use of std::min to ensure that pos is not greater than size(), which would otherwise happen when all characters in the string are whitespace and find_first_not_of returns npos. Also, std::string_view is a non-owning reference, so it is only valid as long as the original string still exists. Trimming the string view has no effect on the string it is based on.
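As an alternative to clamping with std::min, the npos case can be checked explicitly and the result built with substr. This is only a sketch under that assumption; trim_view is a hypothetical name, and the same lifetime caveat applies to the returned view.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: trims both ends of a view; the owning string is
// never modified.
std::string_view trim_view(std::string_view sv)
{
    const auto first = sv.find_first_not_of(" \t\n\r\f\v");
    if (first == std::string_view::npos)
        return {};  // empty or all-whitespace input
    const auto last = sv.find_last_not_of(" \t\n\r\f\v");
    return sv.substr(first, last - first + 1);
}
```

Calling trim_view on a std::string holding "  hi  " yields a view of "hi", while the string itself still contains the surrounding spaces.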
Example of trimming leading and trailing spaces, following jon-hanson's suggestion to use Boost (removes only leading and trailing spaces):
#include <boost/algorithm/string/trim.hpp>
std::string str = " t e s t ";
boost::algorithm::trim ( str );
Results in "t e s t"
There is also
trim_left results in "t e s t "
trim_right results in " t e s t"
/// strip a string, remove leading and trailing spaces
void strip(const string& in, string& out)
{
string::const_iterator b = in.begin(), e = in.end();
// skipping leading spaces
while (b != e && isSpace(*b)){
++b;
}
if (b != e){
// skipping trailing spaces
while (isSpace(*(e-1))){
--e;
}
}
out.assign(b, e);
}
In the above code, isSpace() is a boolean function that tells whether a character is white space. You can implement it to reflect your needs, or just call isspace() from "ctype.h" if you want.
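One possible isSpace(), as a sketch: it forwards to std::isspace but casts to unsigned char first, because passing a negative char value (any byte above 0x7F where char is signed) to std::isspace is undefined behavior.

```cpp
#include <cctype>

// One possible isSpace(): the cast keeps non-ASCII bytes inside the range
// that std::isspace accepts.
bool isSpace(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}
```

If only the plain space should count, the body can be reduced to return c == ' ';.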
Example of trimming leading and trailing spaces:
std::string aString(" This is a string to be trimmed ");
auto start = aString.find_first_not_of(' ');
auto end = aString.find_last_not_of(' ');
std::string trimmedString;
// start is npos when the string is empty or all spaces; substr(npos) would throw
if (start != std::string::npos)
    trimmedString = aString.substr(start, (end - start) + 1);
Or, as a single expression (with the same caveat for all-space input):
trimmedString = aString.substr(aString.find_first_not_of(' '), (aString.find_last_not_of(' ') - aString.find_first_not_of(' ')) + 1);
Using the standard library has many benefits, but one must be aware of some special cases. For example, none of the answers covers the case where a C++ string contains non-ASCII characters, such as UTF-8 encoded Unicode. Passing such a char directly to isspace is undefined behavior when its value is negative, and some debug runtimes abort with an assertion.
I have been using the following code for trimming the strings and some other operations that might come in handy. The major benefits of this code are: it is really fast (faster than any code I have ever tested), it only uses the standard library, and it never causes an exception:
#include <string>
#include <algorithm>
#include <functional>
#include <locale>
#include <iostream>
typedef unsigned char BYTE;
std::string strTrim(std::string s, char option = 0)
{
// convert all whitespace characters to a standard space
std::replace_if(s.begin(), s.end(), (std::function<int(BYTE)>)::isspace, ' ');
// remove leading and trailing spaces
size_t f = s.find_first_not_of(' ');
if (f == std::string::npos) return "";
s = s.substr(f, s.find_last_not_of(' ') - f + 1);
// remove consecutive spaces
s = std::string(s.begin(), std::unique(s.begin(), s.end(),
[](BYTE l, BYTE r){ return l == ' ' && r == ' '; }));
switch (option)
{
case 'l': // convert to lowercase
std::transform(s.begin(), s.end(), s.begin(), ::tolower);
return s;
case 'U': // convert to uppercase
std::transform(s.begin(), s.end(), s.begin(), ::toupper);
return s;
case 'n': // remove all spaces
s.erase(std::remove(s.begin(), s.end(), ' '), s.end());
return s;
default: // just trim
return s;
}
}
This might be the simplest of all.
You can use string::find and string::rfind to find whitespace from both sides and reduce the string.
void TrimWord(std::string& word)
{
if (word.empty()) return;
// Trim spaces from left side
while (word.find(" ") == 0)
{
word.erase(0, 1);
}
// Trim spaces from right side
// the empty() check stops the loop once an all-space word has been erased completely
while (!word.empty() && word.rfind(" ") == word.size() - 1)
{
word.erase(word.size() - 1);
}
}
To add to the problem, how to extend this formatting to process extra spaces between words of the string.
Actually, this is a simpler case than accounting for multiple leading and trailing white-space characters. All you need to do is remove duplicate adjacent white-space characters from the entire string.
The predicate for adjacent white space would simply be:
auto by_space = [](unsigned char a, unsigned char b) {
return std::isspace(a) and std::isspace(b);
};
and then you can get rid of those duplicate adjacent white-space characters with std::unique, and the erase-remove idiom:
// s = " This is a sample string "
s.erase(std::unique(std::begin(s), std::end(s), by_space),
std::end(s));
// s = " This is a sample string "
This does potentially leave an extra white-space character at the front and/or the back. This can be removed quite easily:
if (std::size(s) && std::isspace(s.back()))
s.pop_back();
if (std::size(s) && std::isspace(s.front()))
s.erase(0, 1);
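Put together, the two steps above fit into one small function. This is a sketch rather than the answerer's exact code: collapse_whitespace is a hypothetical name, and the isspace arguments are cast to unsigned char so that non-ASCII bytes are safe.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical function combining std::unique with the end clean-up.
std::string collapse_whitespace(std::string s)
{
    auto both_space = [](unsigned char a, unsigned char b) {
        return std::isspace(a) && std::isspace(b);
    };
    // Keep only the first character of every whitespace run.
    s.erase(std::unique(s.begin(), s.end(), both_space), s.end());
    // At most one whitespace character can remain at each end.
    if (!s.empty() && std::isspace(static_cast<unsigned char>(s.back())))
        s.pop_back();
    if (!s.empty() && std::isspace(static_cast<unsigned char>(s.front())))
        s.erase(0, 1);
    return s;
}
```

Note that a run keeps whichever whitespace character came first, so a tab followed by spaces collapses to a tab, not a space.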
I've tested this and it works. The function processInput asks the user to type something in and returns it in lower case, with no extra spaces internally and no spaces at the beginning or the end. I also added plenty of comments to make it simple to understand.
You can see how to use it in main() at the bottom.
#include <string>
#include <iostream>
using namespace std;
string processInput() {
char inputChar[256];
string output = "";
int outputLength = 0;
bool space = false;
// user inputs a string.. well a char array
cin.getline(inputChar,256);
output = inputChar;
string outputToLower = "";
// put characters to lower and reduce spaces
for(int i = 0; i < output.length(); i++){
// if it's caps put it to lowercase
output[i] = tolower(output[i]);
// make sure we do not include tabs or line returns or weird symbol for null entry array thingy
if (output[i] != '\t' && output[i] != '\n' && output[i] != 'Ì') {
if (space) {
// if the previous space was a space but this one is not, then space now is false and add char
if (output[i] != ' ') {
space = false;
// add the char
outputToLower+=output[i];
}
} else {
// if space is false, make it true if the char is a space
if (output[i] == ' ') {
space = true;
}
// add the char
outputToLower+=output[i];
}
}
}
// trim leading and tailing space
string trimmedOutput = "";
for(int i = 0; i < outputToLower.length(); i++){
// if it's the last character and it's not a space, then add it
// if it's the first character and it's not a space, then add it
// if it's not the first or the last then add it
if (i == outputToLower.length() - 1 && outputToLower[i] != ' ' ||
i == 0 && outputToLower[i] != ' ' ||
i > 0 && i < outputToLower.length() - 1) {
trimmedOutput += outputToLower[i];
}
}
// return
output = trimmedOutput;
return output;
}
int main() {
cout << "Username: ";
string userName = processInput();
cout << "\nModified Input = " << userName << endl;
}
Why complicate?
std::string removeSpaces(std::string x){
if(x.empty()) return x; // nothing left to trim, and x[x.length() - 1] would be out of range
if(x[0] == ' ') { x.erase(0, 1); return removeSpaces(x); }
if(x[x.length() - 1] == ' ') { x.erase(x.length() - 1, x.length()); return removeSpaces(x); }
else return x;
}
This works without Boost, regex, or any extra libraries.
EDIT:
Fix for M.M.'s comment.
No boost, no regex, just the string library. It's that simple.
string trim(const string& s) { // removes whitespace characters from beginning and end of string s
const int l = (int)s.length();
int a=0, b=l-1;
char c;
while(a<l && ((c=s[a])==' '||c=='\t'||c=='\n'||c=='\v'||c=='\f'||c=='\r'||c=='\0')) a++;
while(b>a && ((c=s[b])==' '||c=='\t'||c=='\n'||c=='\v'||c=='\f'||c=='\r'||c=='\0')) b--;
return s.substr(a, 1+b-a);
}
Each pop_back() call runs in constant time and the approach needs no extra memory, so leading and trailing spaces can be removed in linear time overall (the two reversals are linear). The code looks as follows:
void trimTrailingSpaces(string& s) {
while (s.size() > 0 && s.back() == ' ') {
s.pop_back();
}
}
void trimSpaces(string& s) {
//trim trailing spaces.
trimTrailingSpaces(s);
//trim leading spaces
//To reduce complexity, reversing and removing trailing spaces
//and again reversing back
reverse(s.begin(), s.end());
trimTrailingSpaces(s);
reverse(s.begin(), s.end());
}
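The double reversal can also be avoided. As a sketch (trim_spaces is a hypothetical name): trailing spaces are still removed with pop_back(), and the leading spaces are removed with a single erase, which shifts the remaining characters once.

```cpp
#include <string>

// Hypothetical alternative: same result as trimSpaces, without std::reverse.
void trim_spaces(std::string& s)
{
    while (!s.empty() && s.back() == ' ')
        s.pop_back();                    // constant time per removed character
    // npos (empty string) makes erase remove everything, which is a no-op here.
    const std::string::size_type first = s.find_first_not_of(' ');
    s.erase(0, first);                   // one linear shift of the remainder
}
```

Both versions are linear in the length of the string; this one simply touches each character fewer times.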
char *buf = (char*) malloc(50 * sizeof(char));
strcpy(buf, " some random string (<50 chars) ");
char *str = buf; /* keep buf so the allocation can be freed later */
while(*str == ' ' || *str == '\t' || *str == '\n')
str++;
int len = strlen(str);
while(len > 0 &&
(str[len - 1] == ' ' || str[len - 1] == '\t' || str[len - 1] == '\n'))
{
str[len - 1] = '\0';
len--;
}
printf(":%s:\n", str);
free(buf);
void removeSpaces(string& str)
{
if (str.empty()) return; // str[k-1] below would be out of range
/* remove multiple spaces */
int k=0;
for (int j=0; j<str.size(); ++j)
{
if ( (str[j] != ' ') || (str[j] == ' ' && str[j+1] != ' ' ))
{
str [k] = str [j];
++k;
}
}
str.resize(k);
/* remove space at the end */
if (str [k-1] == ' ')
str.erase(str.end()-1);
/* remove space at the begin */
if (str [0] == ' ')
str.erase(str.begin());
}
string trim(const string & sStr)
{
int nSize = sStr.size();
// defaults give an empty result when the string has no non-space character
int nSPos = nSize, nEPos = nSize - 1, i;
for(i = 0; i< nSize; ++i) {
if( !isspace( sStr[i] ) ) {
nSPos = i ;
break;
}
}
for(i = nSize -1 ; i >= 0 ; --i) {
if( !isspace( sStr[i] ) ) {
nEPos = i;
break;
}
}
return string(sStr, nSPos, nEPos - nSPos + 1);
}
For leading and trailing spaces around a single word (operator>> stops at the first whitespace after the word), how about:
string string_trim(const string& in) {
stringstream ss;
string out;
ss << in;
ss >> out;
return out;
}
Or for a sentence:
string trim_words(const string& sentence) {
stringstream ss;
ss << sentence;
string s;
string out;
while(ss >> s) {
out+=(s+' ');
}
return out.substr(0, out.length()-1);
}
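The difference between the two functions comes from how operator>> works on a stream: it skips leading whitespace and then reads exactly one whitespace-delimited token. The sketch below illustrates this with two hypothetical helpers, first_token and normalize_words; the second avoids the trailing-space clean-up by adding the separator before each word except the first.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns only the first whitespace-delimited token.
std::string first_token(const std::string& in)
{
    std::istringstream ss(in);
    std::string word;
    ss >> word;          // skips leading whitespace, stops at the next one
    return word;
}

// Hypothetical helper: joins all tokens with single spaces.
std::string normalize_words(const std::string& sentence)
{
    std::istringstream ss(sentence);
    std::string word, out;
    while (ss >> word) {
        if (!out.empty()) out += ' ';
        out += word;
    }
    return out;
}
```

For the input "  hello   world ", first_token returns "hello" and normalize_words returns "hello world".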
Neat and clean:
void trimLeftTrailingSpaces(string &input) {
input.erase(input.begin(), find_if(input.begin(), input.end(), [](int ch) {
return !isspace(ch);
}));
}
void trimRightTrailingSpaces(string &input) {
input.erase(find_if(input.rbegin(), input.rend(), [](int ch) {
return !isspace(ch);
}).base(), input.end());
}
This was the most intuitive way for me to solve this problem:
/**
* @brief Reverses a string, a helper function to removeLeadingTrailingSpaces
*
* @param line
* @return std::string
*/
std::string reverseString (std::string line) {
std::string reverse_line = "";
for(int i = line.length() - 1; i > -1; i--) {
reverse_line += line[i];
}
return reverse_line;
}
/**
* @brief Removes leading and trailing whitespace
* as well as extra whitespace within the line
*
* @param line
* @return std::string
*/
std::string removeLeadingTrailingSpaces(std::string line) {
std::string filtered_line = "";
std::string curr_line = line;
for(int loop = 0; loop < 2; loop++) {
bool leading_spaces_exist = true;
filtered_line = "";
std::string prev_char = "";
// iterate over curr_line: on the second pass it is shorter than line
for(int i = 0; i < curr_line.length(); i++) {
// Ignores leading whitespace
if(leading_spaces_exist) {
if(curr_line[i] != ' ') {
leading_spaces_exist = false;
}
}
// Puts the rest of the line in a variable
// and ignore back-to-back whitespace
if(!leading_spaces_exist) {
if(!(curr_line[i] == ' ' && prev_char == " ")) {
filtered_line += curr_line[i];
}
prev_char = curr_line[i];
}
}
/*
Reverses the line so that after we remove the leading whitespace
the trailing whitespace becomes the leading whitespace.
After the second round, it needs to reverse the string back to
its regular order.
*/
curr_line = reverseString(filtered_line);
}
return curr_line;
}
Basically, I looped through the string and removed the leading whitespace, then flipped the string and repeated the same process, then flipped back to normal.
I also added the functionality of cleaning up the line if there were back-to-back spaces.
My solution to this problem, using only std::string's own member functions and no STL algorithms, is as follows:
void processString(string &s) {
if ( s.empty() ) return;
//delete leading and trailing spaces of the input string
int notSpaceStartPos = 0, notSpaceEndPos = s.length() - 1;
while ( s[notSpaceStartPos] == ' ' ) ++notSpaceStartPos;
while ( notSpaceEndPos >= 0 && s[notSpaceEndPos] == ' ' ) --notSpaceEndPos;
if ( notSpaceStartPos > notSpaceEndPos ) { s = ""; return; }
s = s.substr(notSpaceStartPos, notSpaceEndPos - notSpaceStartPos + 1);
//reduce multiple spaces between two words to a single space
string temp;
for ( int i = 0; i < s.length(); i++ ) {
if ( i > 0 && s[i] == ' ' && s[i-1] == ' ' ) continue;
temp.push_back(s[i]);
}
s = temp;
}
I have used this method to pass a LeetCode problem Reverse Words in a String
void TrimWhitespaces(std::wstring& str)
{
const std::wstring whitespace = L" \t";
const std::wstring::size_type strBegin = str.find_first_not_of(whitespace);
if (strBegin == std::wstring::npos)
{
str.clear(); // empty, spaces-only or tabs-only
return;
}
const std::wstring::size_type strEnd = str.find_last_not_of(whitespace);
str = str.substr(strBegin, strEnd - strBegin + 1);
}
void TrimWhitespacesTest()
{
std::wstring EmptyStr = L"";
std::wstring SpacesOnlyStr = L" ";
std::wstring TabsOnlyStr = L"\t\t";
std::wstring RightSpacesStr = L"12345 ";
std::wstring LeftSpacesStr = L" 12345";
std::wstring NoSpacesStr = L"12345";
TrimWhitespaces(EmptyStr);
TrimWhitespaces(SpacesOnlyStr);
TrimWhitespaces(TabsOnlyStr);
TrimWhitespaces(RightSpacesStr);
TrimWhitespaces(LeftSpacesStr);
TrimWhitespaces(NoSpacesStr);
assert(EmptyStr == L"");
assert(SpacesOnlyStr == L"");
assert(TabsOnlyStr == L"");
assert(RightSpacesStr == L"12345");
assert(LeftSpacesStr == L"12345");
assert(NoSpacesStr == L"12345");
}
What about the erase-remove idiom?
std::string s("...");
s.erase( std::remove(s.begin(), s.end(), ' '), s.end() );
Sorry. I saw too late that you don't want to remove all whitespace.