Removing Characters from String until it becomes empty - c++

I can't get my head around this. I'm trying to remove all occurrences of a certain character within a string until the string becomes empty. I know we can remove all character occurrences from an std::string by using the combination of string::erase and std::remove like so:
s.erase(remove(s.begin(), s.end(), '.'), s.end());
where '.' is the character to be removed. It also works for other characters. Now consider the following string: 'abababababababa'. What I'm trying to achieve is to reduce this string to ashes by removing all 'a's for starters, which will leave me with a couple of 'b's, and then removing all those 'b's, which will leave me with an empty string. Of course this is just a part of my task, but I could narrow it down to this problem. Here's my naive approach based on the combination of functions above:
string s = "abababababababa";
while (!s.empty()) {
...
s.erase(remove(s.begin(), s.end(), s[0]), s.end());
...
}
Of course it doesn't work, I just can't seem to find out why. By debugging the application I can see how the "s" string is being modified. While the s.erase... works perfectly if I set a character constant for remove's third parameter it fails if I try to use char variables. Here's what the s string looks like after each iteration:
Removing[a] from [abababababababa] Result: baaaaaaa
Removing[b] from [baaaaaaa] Result: a
Removing[a] from [a] Result: -
While I expected 2 operations until the string becomes empty (which works if I hardcode the letters by hand and call s.erase twice), it actually takes 3 iterations. The most frustrating part, however, is that while I'm removing 'a' in the first iteration, only the first 'a' is removed, together with every 'b' except the first.
Why is this happening? Is it because of how erase / remove works internally?

You have undefined behavior.
You get these results because std::remove takes the value to remove by reference. Once s[0] has been overwritten during the removal, what does that reference refer to?
The simple solution is to create a temporary variable, assign e.g. s[0] to it, and pass the variable instead.
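A minimal sketch of that fix, assuming the goal is simply to count how many passes it takes to empty the string. The helper name strip_until_empty is hypothetical and not part of the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Repeatedly removes every occurrence of the current first character
// until the string is empty. Returns the number of passes needed.
// (strip_until_empty is a hypothetical helper name.)
std::size_t strip_until_empty(std::string& s) {
    std::size_t passes = 0;
    while (!s.empty()) {
        const char target = s[0];  // a copy, so it no longer aliases the string
        s.erase(std::remove(s.begin(), s.end(), target), s.end());
        ++passes;
    }
    return passes;
}
```

With the string from the question this takes two passes: one for 'a' and one for 'b'.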

The behavior of function remove() template is equivalent to:
template <class ForwardIterator, class T>
ForwardIterator remove (ForwardIterator first, ForwardIterator last, const T& val)
{
ForwardIterator result = first;
while (first!=last) {
if (!(*first == val)) {
*result = move(*first);
++result;
}
++first;
}
return result;
}
As you can see, the function moves each element that differs from val toward the front of the range.
So in your case, with "ababababab", if you call remove() the way you did, s[0] is 'a' at first, but it is overwritten with 'b' during remove(). From then on val refers to 'b', so the rest of the loop removes the 'b's, and the result is not what you expected.
As Joachim says, assign s[0] to a temporary variable.
The code is taken from http://www.cplusplus.com/reference/algorithm/remove/?kw=remove
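To make the aliasing visible, here is a sketch that reuses the reference implementation quoted above under a different name, so the outcome does not depend on how a particular standard library implements std::remove. The names remove_by_ref and remove_first_char_aliased are hypothetical.

```cpp
#include <string>
#include <utility>

// Hand-written copy of the reference implementation quoted above.
template <class ForwardIterator, class T>
ForwardIterator remove_by_ref(ForwardIterator first, ForwardIterator last, const T& val) {
    ForwardIterator result = first;
    while (first != last) {
        if (!(*first == val)) {
            *result = std::move(*first);  // may overwrite the object val refers to
            ++result;
        }
        ++first;
    }
    return result;
}

// Passes s[0] by reference, reproducing the first iteration from the question.
std::string remove_first_char_aliased(std::string s) {
    if (s.empty()) return s;
    s.erase(remove_by_ref(s.begin(), s.end(), s[0]), s.end());
    return s;
}
```

For "abababababababa" this yields "baaaaaaa", the same result as in the question: the first 'b' is written into s[0], and from that point on val compares equal to 'b' instead of 'a'.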

Related

How to search and delete characters in string

I'm trying to convert .fsp files to strings, but the new .fsp file is very abnormal. It contains some undesirable characters that I want to delete from the string. How can I do that?
I have tried to search for the characters in the string and delete them, but I don't know how to do it.
The string looks like this:
string s;
s = "144˙037˙412˙864";
and I need to make it look like this:
s = "144037412864";
So I expect a result like this:
string s = "144037412864";
Thank you for your help.
We can use the remove-erase idiom to remove unnecessary characters from the string! There's a function in <algorithm> called remove_if. What remove_if does is move the elements that do not match some predicate to the front of the range. remove_if returns an iterator pointing to the new end of the container, past the elements that were kept. I'll show you how to write a function that does the job!
#include <algorithm>
#include <string>
void erase_ticks(std::string& s) {
// Returns true for characters that should be removed
auto condition = [](char c) { return c == '`'; };
// Removes characters that match the condition,
// and returns the new endpoint of the string
auto new_end = std::remove_if(s.begin(), s.end(), condition);
// Erases characters from the new endpoint to the current endpoint
s.erase(new_end, s.end());
}
We can use this in main, and it works just as expected!
#include <iostream>
int main() {
std::string s("123`456`789");
std::cout << s << '\n'; // prints 123`456`789
erase_ticks(s);
std::cout << s << '\n'; // prints 123456789
}
This problem has two parts. First, we need to identify the characters in the string that we don't want. From your use case it seems that anything that is not numeric needs to go. This is simple enough, as the standard library provides std::isdigit (add "#include <cctype>" for the single-argument version used here), which takes a character and returns a nonzero value if the character is a decimal digit. The argument should be passed as unsigned char, because calling it with a negative char value is undefined behavior.
Second, we need a way to quickly and cleanly remove all occurrences of these from the string. For that we can use the 'Erase Remove' idiom to iterate through the string and do what we want.
string s = "123'4'5'";
s.erase(std::remove_if(s.begin(), s.end(), [](unsigned char x)->bool {return !std::isdigit(x);}), s.end());
In the snippet above we're calling erase on the string, which takes two iterators: the first refers to where we want to begin deleting and the second tells the call where we want to delete to. The magic in this trick is all in the call to remove_if (add "#include <algorithm>" for it). remove_if works by shifting the characters we want to keep toward the front of the string.
So "123'4'5'" becomes "12345" followed by three leftover characters whose values are unspecified. remove_if then returns an iterator to the first leftover character, which is passed to erase to tell it to remove everything from there on. In the end we're left with "12345" as expected.
Edit: Forgot to mention, remove_if also takes a predicate; here I'm using a lambda which takes a character and returns a bool.
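As a sketch, the digits-only filter can be wrapped in a small function. The name keep_digits is hypothetical. The lambda takes unsigned char on the assumption that the input may contain bytes outside the ASCII range, which is the case if the separator from the question is stored as UTF-8.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Removes every character that is not a decimal digit.
// (keep_digits is a hypothetical helper name.)
void keep_digits(std::string& s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           // unsigned char avoids passing negative values to std::isdigit
                           [](unsigned char c) { return !std::isdigit(c); }),
            s.end());
}
```

Because the filter works byte by byte, a multi-byte separator is dropped one byte at a time, which is fine here since none of its bytes is a digit.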

Copy a std::string using copy_if (into another string)

I'm wondering what's the best way to selectively copy_if characters from one string to another. I have something like
string buffer1("SomeUnknownwSizeAtCompileTime");
string buffer2; // WillBeAtMostSameSizeAsBuffer1ButMaybeLessAfterWeRemoveSpaces
buffer2.resize(buffer1.length());
std::copy_if(buffer1.begin(), buffer1.end(), buffer2.begin(), [](char c){
//don't copy spaces
return c != ' ';
});
buffer2 could potentially be a lot smaller than buffer1, yet we have to allocate the same amount of memory as buffer1's length. After copying however, buffer2's end iterator will point past the null termination character. I googled around and apparently this is by design, so now I'm wondering should I not be using copy_if with strings?
Thanks
You need to use std::back_inserter.
#include <iterator>
std::copy_if(buffer1.begin(), buffer1.end(), back_inserter(buffer2), [](char c){
//don't copy spaces
return c != ' ';
});
back_inserter(buffer2) returns a specialized iterator which appends to buffer2 instead of overwriting its elements.
For this to work correctly, you'll have to make sure that you start out with an empty buffer2, i.e. don't use:
buffer2.resize(buffer1.length());
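Put together, a minimal sketch might look like this. The function name copy_without_spaces is hypothetical. The call to reserve is optional and only an assumption about what the asker wanted from resize: it pre-allocates capacity without changing the size, so the destination still starts out empty.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Copies src into a new string, skipping spaces.
// (copy_without_spaces is a hypothetical helper name.)
std::string copy_without_spaces(const std::string& src) {
    std::string dst;
    dst.reserve(src.size());  // capacity only; dst.size() stays 0
    std::copy_if(src.begin(), src.end(), std::back_inserter(dst),
                 [](char c) { return c != ' '; });
    return dst;
}
```

The resulting string has exactly as many characters as were copied, so its end iterator is where you would expect it to be.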

How to erase vector elements that don't contain a certain character

I found many answers on erasing the elements of a vector that contain a certain character,
but when I tried to adapt some of these solutions to erase the elements that don't contain that character, it didn't work.
for(int k=0; k<temp.size();k++)
{
while(temp[k].find_first_of("xX")!= string::npos)
{
temp.erase(temp.begin()+k);
}
}
variables_print.push_back(temp);
Here is an example: this code erases the elements that contain 'x' or 'X'. I tried changing it to temp[k].find_first_not_of("xX") and it doesn't work.
I also tried temp[k].find_first_of("xX") == string::npos and that doesn't work either.
How do I erase the elements that don't contain x or X?
You could do it this way:
auto newEnd = std::remove_if(v.begin(), v.end(),
[](const auto& s) { return s.find_first_of("xX") == std::string::npos; });
v.erase(newEnd, v.end());
remove_if moves all elements that do not satisfy the condition to the front, overwriting those that do. The condition is satisfied when the lambda given as the third argument returns true. newEnd is the iterator pointing just past the last element that was kept.
For example, if the input is {"aaa", "bbx", "ccc"}, after the call to remove_if the vector looks like this: {"bbx", <unspecified>, <unspecified>}.
The second line erases all elements starting from newEnd. So in the example above, you end up with {"bbx"}.
The condition here is a lambda which returns true for each element that contains neither 'x' nor 'X'.
This is the same condition you tried, and it is the correct one. The problem with your original code is different.
Look at: while(temp[k].find_first_of("xX")!= string::npos). If the string does not contain x or X, the body of this nested loop is not executed and nothing gets removed. Also, you could replace the loop with a simple if statement.
There's another problem with the outer loop. You increment k each time, even if you've just removed an element. Consider this example: {"x", "x"}. When incrementing k each time, you will skip the second string and end up with {"x"}.
The corrected code would look like this:
for(size_t k=0; k<v.size(); )
{
if(v[k].find_first_of("xX") == std::string::npos)
{
v.erase(v.begin()+k);
}
else
{
++k;
}
}
Compare this with the first version. The first version is not only shorter, but also leaves much less room for bugs.
As @Bob_ points out in the comments, the first version is known as the erase-remove idiom. It's a common thing to see in modern C++ (i.e. C++11 and newer), so it's worth getting used to it and using it.
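For reference, here is the erase-remove version as a self-contained sketch. The function name drop_without_x is hypothetical, and the lambda parameter is spelled const std::string& instead of const auto& so that it also compiles as C++11.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Erases every string that contains neither 'x' nor 'X'.
// (drop_without_x is a hypothetical helper name.)
void drop_without_x(std::vector<std::string>& v) {
    auto new_end = std::remove_if(v.begin(), v.end(), [](const std::string& s) {
        return s.find_first_of("xX") == std::string::npos;  // true means "remove"
    });
    v.erase(new_end, v.end());
}
```

The relative order of the surviving elements is preserved, since std::remove_if is a stable operation.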

std::string::erase doesn't work as I expected

There are many questions here on splitting string by comma. I am trying to make another one.
#include<iostream>
#include<algorithm>
#include<string>
#include<cctype>
int main()
{
std::string str1 = "1.11, 2.11, 3.11, 4.11, 5.11, ";
str1.erase(std::remove_if(str1.begin(), str1.end(), [](unsigned char x){return std::isspace(x);}));
std::cout<<"New string = "<<str1<<std::endl;
return 0;
}
But I am getting the unexpected output below.
New string = 1.11,2.11,3.11,4.11,5.11, 4.11, 5.11,
Did I miss something?
std::remove_if moves the non-removed elements to the front of the string and returns an iterator to the first element to be erased. You use the single-iterator overload of erase, which only erases a single element. To erase all of the matching characters, you need to use the two-argument version, passing the end iterator as well:
str1.erase(
std::remove_if(
str1.begin(),
str1.end(),
[](unsigned char x){return std::isspace(x);}
),
str1.end() // this was missing
);
In case you were wondering why there are some non-space characters at the end: std::remove_if is not required to keep the eliminated elements intact, and some of them have been overwritten.
There are two iterator based versions of string::erase. One that erases a single character, and one that erases a range. You have to add the end of the range to get rid of all of it.
str1.erase(std::remove_if(str1.begin(), str1.end(),
[](unsigned char x){return std::isspace(x);}),
str1.end());
Your call to erase uses the single iterator argument overload, which removes 1 character. Add str1.end() as second argument to get the usual remove+erase idiom.
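The difference between the two overloads can be sketched side by side. The names is_space, strip_whitespace and strip_one_only are hypothetical helpers introduced only for this comparison.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns true for whitespace; takes unsigned char so std::isspace is
// never called with a negative value.
inline bool is_space(unsigned char c) { return std::isspace(c) != 0; }

// Range overload: erases everything from new_end to the end of the string.
std::string strip_whitespace(std::string s) {
    auto new_end = std::remove_if(s.begin(), s.end(), is_space);
    s.erase(new_end, s.end());
    return s;
}

// Single-iterator overload, as in the question: erases exactly one character.
std::string strip_one_only(std::string s) {
    auto new_end = std::remove_if(s.begin(), s.end(), is_space);
    if (new_end != s.end()) s.erase(new_end);  // guard: the iterator must be dereferenceable
    return s;
}
```

With the string from the question, strip_one_only returns a string that is just one character shorter than the input: the 25 kept characters come first, followed by leftover characters that were never erased.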

Replace multiple spaces with one space in a string

How would I do something in c++ similar to the following code:
//Lang: Java
string.replaceAll(" ", " ");
This code-snippet would replace all multiple spaces in a string with a single space.
bool BothAreSpaces(char lhs, char rhs) { return (lhs == rhs) && (lhs == ' '); }
std::string::iterator new_end = std::unique(str.begin(), str.end(), BothAreSpaces);
str.erase(new_end, str.end());
How this works: std::unique has two forms. The first form goes through a range and removes adjacent duplicates, so the string "abbaaabbbb" becomes "abab". The second form, which I used, takes a predicate which should take two elements and return true if they should be considered duplicates. The function I wrote, BothAreSpaces, serves this purpose. It determines exactly what its name implies: that both of its parameters are spaces. So when combined with std::unique, duplicate adjacent spaces are removed.
Just like std::remove and remove_if, std::unique doesn't actually make the container smaller; it just moves elements at the end closer to the beginning. It returns an iterator to the new end of the range, so you can use that to call the erase function, which is a member function of the string class.
Breaking it down, the erase function takes two parameters, a begin and an end iterator for the range to erase. For its first parameter I'm passing the return value of std::unique, because that's where I want to start erasing. For its second parameter, I'm passing the string's end iterator.
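A self-contained sketch of the same technique, with the predicate written as a lambda. The function name collapse_spaces is hypothetical. One caveat to be aware of: the standard formally asks the predicate given to std::unique to be an equivalence relation, which "both are spaces" strictly is not, so treat this as the widely used idiom from the answer, not as a guarantee.

```cpp
#include <algorithm>
#include <string>

// Collapses each run of ' ' characters into a single space.
// (collapse_spaces is a hypothetical helper name.)
std::string collapse_spaces(std::string s) {
    auto new_end = std::unique(s.begin(), s.end(),
                               // adjacent characters count as duplicates only if both are spaces
                               [](char lhs, char rhs) { return lhs == ' ' && rhs == ' '; });
    s.erase(new_end, s.end());
    return s;
}
```

Other adjacent duplicates, such as the "bb" in "abba", are left alone because the predicate only matches pairs of spaces.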
So, I tried an approach with std::remove_if and a lambda expression. Though to my eyes it is still easier to follow than the code above, it doesn't have that "wow neat, didn't realize you could do that" thing to it. Anyway, I'm still posting it, if only for learning purposes:
bool prev(false);
char rem(' ');
auto iter = std::remove_if(str.begin(), str.end(), [&] (char c) -> bool {
if (c == rem && prev) {
return true;
}
prev = (c == rem);
return false;
});
str.erase(iter, str.end());
EDIT realized that std::remove_if returns an iterator which can be used.. removed unnecessary code.
A variant of Benjamin Lindley's answer that uses a lambda expression to make things cleaner:
std::string::iterator new_end =
std::unique(str.begin(), str.end(),
[=](char lhs, char rhs){ return (lhs == rhs) && (lhs == ' '); }
);
str.erase(new_end, str.end());
Why not use a regular expression:
str = boost::regex_replace(str, boost::regex(" {2,}"), " ");
How about isspace(lhs) && isspace(rhs), to handle all types of whitespace?
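As an alternative sketch for handling all kinds of whitespace, here is a plain loop that does not go through std::unique at all. The function name collapse_whitespace is hypothetical, and replacing each run with a single ' ' (instead of keeping the first whitespace character of the run) is a design choice of this sketch, not something the thread specifies.

```cpp
#include <cctype>
#include <string>

// Replaces every run of whitespace characters with a single ' '.
// (collapse_whitespace is a hypothetical helper name.)
std::string collapse_whitespace(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    bool prev_space = false;
    for (unsigned char c : in) {  // unsigned char keeps std::isspace well-defined
        const bool is_space = std::isspace(c) != 0;
        if (is_space) {
            if (!prev_space) out.push_back(' ');  // first character of a run
        } else {
            out.push_back(static_cast<char>(c));
        }
        prev_space = is_space;
    }
    return out;
}
```

Leading and trailing whitespace is collapsed to one space as well, not trimmed.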