Copy a std::string using copy_if (into another string) - c++

I'm wondering what's the best way to selectively copy_if characters from one string to another. I have something like
string buffer1("SomeUnknownwSizeAtCompileTime");
string buffer2; // WillBeAtMostSameSizeAsBuffer1ButMaybeLessAfterWeRemoveSpaces
buffer2.resize(buffer1.length());
std::copy_if(buffer1.begin(), buffer1.end(), buffer2.begin(), [](char c){
//don't copy spaces
return c != ' ';
});
buffer2 could potentially be a lot smaller than buffer1, yet we have to allocate the same amount of memory as buffer1's length. After copying however, buffer2's end iterator will point past the null termination character. I googled around and apparently this is by design, so now I'm wondering should I not be using copy_if with strings?
Thanks

You need to use std::back_inserter.
#include <iterator>
std::copy_if(buffer1.begin(), buffer1.end(), back_inserter(buffer2), [](char c){
//don't copy spaces
return c != ' ';
});
back_inserter(buffer2) returns a specialized iterator which appends to instead of overwriting the elements of buffer2.
For this to work correctly, you'll have to make sure that you start out with an empty buffer2. i.e. don't use:
buffer2.resize(buffer1.length());

Related

C++ transform parameter initialization question

I was trying to transform a string into lowercase and store it in another variable using std::transform and std::tolower. I first tried:
string str1("Hello");
string lowerStr1;
transform(str1.begin(), str1.end(), lowerStr1.begin(), ::tolower);
cout << lowerStr1 << endl;
But, lowerStr1 contained nothing. After initializing lowerStr1 with str1, I got the desired result. I want to know the intuition behind this. Could someone explain why lowerStr1 should be initialized in this case?
lowerStr1 is empty, and std::transform won't insert elements into it.
std::transform applies the given function to a range and stores the result in another range, beginning at d_first.
You can use std::back_inserter, which constructs a std::back_insert_iterator, which would call push_back() on the container to insert elements.
transform(str1.begin(), str1.end(), back_inserter(lowerStr1), ::tolower);
Or make sure lowerStr1 already contains 5 elements in advance.
string lowerStr1(5, '\0');
transform(str1.begin(), str1.end(), lowerStr1.begin(), ::tolower);
or
string lowerStr1;
lowerStr1.resize(5);
transform(str1.begin(), str1.end(), lowerStr1.begin(), ::tolower);
Could someone explain why lowerStr1 should be initialized in this case?
That's because the alternatives above give lowerStr1 5 elements in advance, so std::transform has somewhere to write. The values of those initial elements don't matter, since they are overwritten.
This is because your call to std::transform is logically equivalent to the following code:
auto b=str1.begin();
auto e=str1.end();
auto p=lowerStr1.begin();
while (b != e)
{
*p=tolower(*b);
++b;
++p;
}
But lowerStr1 is a completely empty string. lowerStr1.begin() gives you, loosely speaking, a pointer to an empty string. So writing to that pointer and, adding insult to injury, incrementing it and continuing to write to it, results in undefined behavior, memory corruption, and a non-trivial possibility of a crash.
You do not add content to an empty string by grabbing a pointer to it, and then scribbling into that pointer. There are several ways of doing that correctly, with the push_back() or insert() methods. You can also use an iterator that does that, like a std::back_insert_iterator, which you can use with std::transform.
Generic algorithms won't change the size of the containers.
You need to use an iterator adapter which implements operator= in a special way so that it actually inserts elements.
Therefore you can use back_inserter(lowerStr1) to make sure that lowerStr1 gets extended as transform() does assignments.
#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>   // tolower
#include <iterator> // std::back_inserter
using namespace std;
int main() {
string str1("Hello");
string lowerStr1;
transform(str1.begin(), str1.end(), std::back_inserter(lowerStr1), ::tolower);
cout << lowerStr1 << endl;
}

How to search and delete characters in string

I'm trying to convert .fsp files to strings, but the new .fsp file is very abnormal. It contains some undesirable characters that I want to delete from the string. How can I do that?
I have tried to search for the characters in the string and delete them, but I don't know how to do it.
The string looks like this:
string s;
s = 144˙037˙412˙864;
and I need to make it just like that
s = 144037412864;
So I expect a result like this:
string s = 144037412864;
Thank you for help.
We can use the remove-erase idiom to remove unnecessary characters from the string! There's a function in <algorithm> called remove_if. What remove_if does is move the elements that don't match some predicate to the front of the range, overwriting the ones that do match. It doesn't change the size of the string. remove_if returns an iterator pointing to the new logical end of the range, and erase then drops everything from there to the old end. I'll show you how to write a function that does the job!
#include <algorithm>
#include <string>
void erase_ticks(std::string& s) {
// Returns true for characters that should be removed
auto condition = [](char c) { return c == '`'; };
// Removes characters that match the condition,
// and returns the new endpoint of the string
auto new_end = std::remove_if(s.begin(), s.end(), condition);
// Erases characters from the new endpoint to the current endpoint
s.erase(new_end, s.end());
}
We can use this in main, and it works just as expected!
#include <iostream>
int main() {
std::string s("123`456`789");
std::cout << s << '\n'; // prints 123`456`789
erase_ticks(s);
std::cout << s << '\n'; // prints 123456789
}
This problem has two parts. First, we need to identify any characters in the string which we don't want. From your use case it seems that anything that is not numeric needs to go. This is simple enough, as the standard library provides the function std::isdigit (simply add "#include <cctype>"), which takes a character and returns a nonzero value if the character is a decimal digit.
Second we need a way to quickly and cleanly remove all occurrences of these from the string. Thus we can use the 'Erase Remove' idiom to iterate through the string and do what we want.
string s = "123'4'5";
s.erase(std::remove_if(s.begin(), s.end(), [](char x)->bool {return !std::isdigit(x);}), s.end());
In the snippet above we're calling erase on the string, which takes two iterators: the first refers to where we want to begin deleting from and the second tells the call where we want to delete to. The magic in this trick is actually all in the call to remove_if (add "#include <algorithm>" for it). remove_if works by shifting the elements (or characters) that are kept toward the front of the string, overwriting the ones to be removed.
So for "123'4'5" the kept characters "12345" end up packed at the front, followed by two leftover characters whose values are unspecified. remove_if then returns an iterator to the first leftover position, which is passed to erase to tell it to remove the characters starting there. In the end we're left with "12345" as expected.
Edit: Forgot to mention, remove_if also takes a predicate; here I'm using a lambda which takes a character and returns a bool.

Removing Characters from String until it becomes empty

I can't get my head around this. I'm trying to remove all occurrences of a certain character within a string until the string becomes empty. I know we can remove all character occurrences from an std::string by using the combination of string::erase and std::remove like so:
s.erase(remove(s.begin(), s.end(), '.'), s.end());
where the '.' is the actual character to be removed. It even works if I try to remove certain characters. Now let's consider the following string: 'abababababababa'. What I'm trying to achieve is to reduce this string to ashes by removing all 'a's for starters, which will leave me with a couple of 'b's. Then remove all those 'b's, which will leave me with an empty string. Of course this is just a part of my task, but I could narrow it down to this problem. Here's my naive approach based on the above combination of functions:
string s = "abababababababa";
while (!s.empty()) {
...
s.erase(remove(s.begin(), s.end(), s[0]), s.end());
...
}
Of course it doesn't work, I just can't seem to find out why. By debugging the application I can see how the "s" string is being modified. While the s.erase... works perfectly if I set a character constant for remove's third parameter it fails if I try to use char variables. Here's what the s string looks like after each iteration:
Removing[a] from [abababababababa] Result: baaaaaaa
Removing[b] from [baaaaaaa] Result: a
Removing[a] from [a] Result: -
While I expected 2 operations until the string becomes empty (which works if I hardcode the letters by hand and use s.erase twice), it actually takes 3 iterations. The most frustrating part, however, is the fact that when I'm removing 'a' in the first iteration, only the first 'a' is removed, along with all the 'b's.
Why is this happening? Is it the cause of how erase / remove works internally?
You have undefined behavior.
You get the results you get because std::remove takes the value to remove by reference. Once the element at s[0] has been overwritten during the removal, the reference no longer refers to the value you meant to remove.
The simple solution is to create a temporary variable, assign e.g. s[0] to it, and pass the variable instead.
The behavior of function remove() template is equivalent to:
template <class ForwardIterator, class T>
ForwardIterator remove (ForwardIterator first, ForwardIterator last, const T& val)
{
ForwardIterator result = first;
while (first!=last) {
if (!(*first == val)) {
*result = move(*first);
++result;
}
++first;
}
return result;
}
As you see, the function moves the elements that differ from val to the front of the range.
So in your case, "ababababab",
if you call remove() like you did, the original s[0] is 'a', but it will be replaced by 'b' during the remove(). From then on the remaining iterations compare against 'b' and remove the 'b's, so the result is not right.
Like Joachim says, assign s[0] to a temporary variable.
The code is taken from http://www.cplusplus.com/reference/algorithm/remove/?kw=remove

c++ combine erase and remove string [duplicate]

I saw someone use this line to remove white spaces from string stored in a vector, but I fail to understand the reason for using erase and remove this way?
The second question: how can I, instead of only removing white spaces, remove anything that is not a 'num' or a '-' ?
This is not the full code, only a snippet, and it will not compile. The vector simply contains raw strings from a text file. The strings were comma delimited, so currently they could contain any possible char except the comma.
vector <string> vecS;
ifstream vecStream;
size_t i = 0;
while(vecStream.good()) {
vecS.resize(i+1);
getline(vecStream, vecS.at(i), ',');
vecS.at(i).erase(remove( vecS.at(i).begin(), vecS.at(i).end(), ' '), vecS.at(i).end());
i++;
}
EDIT; added more code, hope this is clearer now
but I fail to understand the reason for using erase and remove this way?
std::remove basically rearranges the sequence so that the elements which are not to be removed are all shifted to the beginning of the sequence - a past-the-end iterator for that part, and effectively the new end of the sequence, is then returned. erase then deletes everything from that iterator to the old end, which is what actually shortens the string.
There is absolutely no need for a file stream in that snippet though:
vector <string> vecS;
// Do something with vecS
for( auto& s : vecS )
// Use isspace instead of comparing with ' '; it recognizes all white space characters
s.erase( remove_if( std::begin(s), std::end(s),
[](unsigned char c){ return std::isspace(c) != 0; }),
std::end(s) );

copying an element from list<string>mylist to string mystr

list<string> mylist;
mylist.push_back("random stuff");
list<string>::iterator it;
it=mylist.begin();
string mystr;
//and this doesn't work:
mystr=*it;
Let's say I have a list<string> mylist and it has 3 items. Since I can't work on the characters of each element I must copy what item I want to a simple string or a char buffer. But I can't find a way at all, I've tried with pointers to arrays as well.
So is there a way to copy those items out of the list ?
Edit:
Yeah, sorry. I revisited my code (the project, that is) and found the error was somewhere else. I was copying from list<string> mylist to a string mystr with the help of an iterator, and I was using a for loop whose condition was to stop when it encountered the character '\0'. But the copy didn't include the '\0' in my string, so in the end I had to add it manually so the loop would not run past the end of the string.
Good code:
string temp;
list<string>::iterator it;
it=mylist.begin();//let's say myslist has "random stuff"
temp=*it;//this does not copy the '\0'
temp+='\0';//so i add it myself
for(int n(0);temp[n]!='\0';n++)//now the for loop stops properly
cout<<temp[n];
If you want characters from the string:
for (std::string::iterator it=mystr.begin(); it!=mystr.end(); it++)
{
char ch = *it;
// do something with the character?
}
If you want to pass the string as a C (zero-terminated) string, use
mystr.c_str()
The code works great and outputs the correct result. Also, you can work with characters of each element like this:
for (list<string>::iterator iter = mylist.begin(); iter != mylist.end(); ++iter)
{
char c = (*iter)[0]; //this is the first character.
}
You can also cycle through the string with an iterator, since strings have them as well :) Strings support random-access iterators, though, so you can just index into them like an array or a C string, as I showed in the for loop.
Are you perhaps looking for
list<string> mylist;
mylist.push_back("random stuff");
list<string>::iterator it = mylist.begin();
string& mystr = *it; // Note the &
The type string& is a reference to a string. It introduces a new name for the first element of the list. Well, at least what's currently the first element. You could of course do mylist.push_front("Hi There"); and mystr would still be "random stuff". But if you now say mystr = "Not so random stuff";, you will change the string inside the list.
Note that this is explicitly not a copy.
Small warning: mystr doesn't magically keep the string alive. If you remove the underlying string from the list, you must stop using mystr as well.