I'm following the book Accelerated C++, and to write a function to split a string into a vector of words (separated by space characters), find_if is utilized.
vector<string> split(const string& str) {
typedef string::const_iterator iter;
vector<string> ret;
iter i = str.begin();
while (i != str.end()) {
i = find_if(i, str.end(), not_space);
iter j = find_if(i, str.end(), space);
if (i != str.end())
ret.push_back(string(i, j));
i = j;
}
return ret;
}
and the definitions of space and not_space:
bool space(char c) {
return isspace(c);
}
bool not_space(char c) {
return !isspace(c);
}
Is it necessary to write two separate predicates here, or could one simply pass !space in place of not_space?
Just use std::not1(std::ptr_fun(space)). std::not1 is declared in <functional>.
(There is also a std::not2 for use with binary predicates; std::not1 is for unary predicates.)
You cannot simply use !space instead of not_space because all you'll be doing in that case is passing false to find_if. That happens because space will decay to a pointer to function, and function pointers are implicitly convertible to bool. Applying ! to the boolean value will always result in false (because the function pointer is never going to be nullptr).
You can reuse the function space by wrapping it in std::not1, which will negate the result of the predicate passed to it. Unfortunately, it's not as simple as writing std::not1(space), because not1 requires that the predicate define a nested type named argument_type, which your predicate doesn't satisfy.
To convert your function into a predicate usable with not1, you must first wrap it in std::ptr_fun. So the line in your split function becomes:
i = find_if(i, str.end(), std::not1(std::ptr_fun(space)));
With C++11, there's no need for the not1 and ptr_fun shenanigans (std::ptr_fun was removed in C++17 and std::not1 in C++20); just use a lambda expression:
i = find_if(i, str.end(), [](char c) {return !space(c);});
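Putting the lambda variant together, a minimal self-contained sketch (assuming C++11) might look like the following. The cast to unsigned char is an addition that is not in the original code; it is there because passing a negative char value to isspace is undefined behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Same predicate as in the question, with an unsigned char cast added
// (an assumption of this sketch) to keep isspace well-defined.
bool space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

std::vector<std::string> split(const std::string& str) {
    typedef std::string::const_iterator iter;
    std::vector<std::string> ret;
    iter i = str.begin();
    while (i != str.end()) {
        // Skip leading whitespace: the lambda negates space().
        i = std::find_if(i, str.end(), [](char c) { return !space(c); });
        // Find the end of the current word.
        iter j = std::find_if(i, str.end(), space);
        if (i != str.end())
            ret.push_back(std::string(i, j));
        i = j;
    }
    return ret;
}
```

With this, only one named predicate is needed; the negated form exists only at the call site.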
You can also declare
template <bool find_space> bool space(char c) {
return find_space ^ (!isspace(c));
}
and then refer to it as space<true> and space<false> in the argument to find_if(). Much more versatile than std::not1().
How should I define a lambda that takes a char from a string iterator? In the code below, the lambda detect_bracket has a problem with its input parameter x.
I don't want to delete ALL brackets from the string, just the ones at the beginning and at the end.
auto detect_bracket = [](char* x){ return(')' == x || '(' == x);};
this->str.erase(std::remove_if(str.begin(), str.begin(),
detect_bracket)
);
this->str.erase(std::remove_if(str.back(), str.back(),
detect_bracket)
);
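The lambda passed to std::remove_if should take a char parameter, because the predicate is called with each element of the range directly, not with a pointer to it. Also pass str.end() as the second argument of erase; the single-iterator overload erases only one character. Note that this version removes every bracket in the string:
auto detect_bracket = [](char x){ return(')' == x || '(' == x);};
this->str.erase(std::remove_if(str.begin(), str.end(),
detect_bracket),
str.end());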
Note that std::string::back() won't work with std::remove_if. It returns a char, whereas std::remove_if expects a range expressed by a pair of iterators.
Also, str.begin(), str.begin() is just an empty range. If you only want to remove the elements at the beginning and at the end, you could write:
auto detect_bracket = [](char x){ return(')' == x || '(' == x);};
if (!this->str.empty()) {
this->str.erase(std::remove_if(str.begin(), str.begin() + 1, detect_bracket), str.begin() + 1);
}
if (!this->str.empty()) {
this->str.erase(std::remove_if(str.end() - 1, str.end(), detect_bracket), str.end());
}
Note that we need to pass an explicit end iterator to std::string::erase: std::remove_if returns an iterator even when nothing matched, and the single-iterator overload of erase would then remove a character that should stay.
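Since only one character at each end is inspected, the same effect can probably be had without std::remove_if at all. A minimal sketch, assuming C++11 and a hypothetical free function strip_outer_brackets instead of the member str from the question:

```cpp
#include <string>

// Hypothetical helper: removes one bracket from each end of s, if present.
std::string strip_outer_brackets(std::string s) {
    auto is_bracket = [](char x) { return x == '(' || x == ')'; };
    // Each check is guarded by empty() because front()/back() on an
    // empty string is undefined behavior.
    if (!s.empty() && is_bracket(s.front())) s.erase(s.begin());
    if (!s.empty() && is_bracket(s.back())) s.pop_back();
    return s;
}
```

Brackets in the middle of the string are left alone, which is what the question asks for.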
std::remove_if is a function with the following signature:
template< class ForwardIt, class UnaryPredicate >
ForwardIt remove_if( ForwardIt first, ForwardIt last, UnaryPredicate p );
p - unary predicate which returns true if the element should be removed.
The signature of the predicate function should be equivalent to the following:
bool pred(const Type &a);
The type Type must be such that an object of type ForwardIt can be
dereferenced and then implicitly converted to Type.
All you need is to change your function parameter from char* to char.
Multiple remove_if and erase calls modify the string and invalidate its iterators anyway. Why not simply create a new string and assign the source to it, starting at position 0 or 1 depending on whether the first character is a bracket, and stopping at the last or second-to-last character in the same way?
string target;
target.assign(source.begin() + skip_if_bracket_at_begin,
source.end() - skip_if_bracket_at_end);
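The two flags are left undefined in the snippet above. One way they might be computed is sketched below; the function name without_outer_brackets is hypothetical, and the extra size check is an assumption added so that a one-character string such as "(" does not produce a reversed range.

```cpp
#include <string>

// Hypothetical helper: copies source without a leading/trailing bracket.
std::string without_outer_brackets(const std::string& source) {
    if (source.empty()) return source;
    const bool skip_if_bracket_at_begin =
        source[0] == '(' || source[0] == ')';
    // Only look at the last character if it is not the one already skipped.
    const char last = source[source.size() - 1];
    const bool skip_if_bracket_at_end =
        source.size() > (skip_if_bracket_at_begin ? 1u : 0u) &&
        (last == '(' || last == ')');
    std::string target;
    // bool converts to 0 or 1, so each flag shifts its iterator by one.
    target.assign(source.begin() + skip_if_bracket_at_begin,
                  source.end() - skip_if_bracket_at_end);
    return target;
}
```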
The code below comes from an answer to this question on string splitting. It uses pointers, and a comment on that answer suggested it could be adapted for std::string. How can I use the features of std::string to implement the same algorithm, for example using iterators?
#include <vector>
#include <string>
using namespace std;
vector<string> split(const char *str, char c = ',')
{
vector<string> result;
do
{
const char *begin = str;
while(*str != c && *str)
str++;
result.push_back(string(begin, str));
} while (0 != *str++);
return result;
}
Ok so I obviously replaced char by string but then I noticed he is using a pointer to the beginning of the character. Is that even possible for strings? How do the loop termination criteria change? Is there anything else I need to worry about when making this change?
You can use iterators instead of pointers. Iterators provide a way to traverse containers, and can usually be thought of as analogous to pointers.
In this case, you can use the begin() member function (or cbegin() if you don't need to modify the elements) of a std::string object to obtain an iterator that references the first character, and the end() (or cend()) member function to obtain an iterator for "one-past-the-end".
For the inner loop, your termination criterion is the same; you want to stop when you hit the delimiter on which you'll be splitting the string. For the outer loop, instead of comparing the character value against '\0', you can compare the iterator against the end iterator you already obtained from the end() member function. The rest of the algorithm is pretty similar; iterators work like pointers in terms of dereference and increment:
std::vector<std::string> split(const std::string& str, const char delim = ',') {
std::vector<std::string> result;
auto end = str.cend();
auto iter = str.cbegin();
while (iter != end) {
auto begin = iter;
while (iter != end && *iter != delim) ++iter;
result.push_back(std::string(begin, iter));
if (iter != end) ++iter; // See note (**) below.
}
return result;
}
Note the subtle difference in the inner loop condition: it now tests whether we've hit the end before trying to dereference. This is because we can't dereference an iterator that points to the end of a container, so we must check this before trying to dereference. The original algorithm assumes that a null character ends the string, so we're ok to dereference a pointer to that position.
(**) The validity of iter++ != end when iter is already end is under discussion in Are end+1 iterators for std::string allowed?
I've added this if statement to the original algorithm to break the loop when iter reaches end in the inner loop. This avoids adding one to an iterator which is already the end iterator, and avoids the potential problem.
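One behavioral difference is worth knowing: for an empty string or a string ending in the delimiter, the pointer version emits a final empty field, while the iterator loop above does not. If that matters, a sketch using the std::string members find and substr (the name split_find is hypothetical) keeps the empty fields:

```cpp
#include <string>
#include <vector>

// Hypothetical name; keeps empty fields, like the pointer version.
std::vector<std::string> split_find(const std::string& str, char delim = ',') {
    std::vector<std::string> result;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = str.find(delim, start);
        if (pos == std::string::npos) {
            result.push_back(str.substr(start));  // last field, possibly empty
            break;
        }
        result.push_back(str.substr(start, pos - start));
        start = pos + 1;  // continue after the delimiter
    }
    return result;
}
```

Working with indices also sidesteps the question of incrementing an end iterator, since start never exceeds str.size().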
typedef std::vector<std::string> TVector;
TVector a_list;
populate vector...
for_each(a_list.begin(),a_list.end(),std::toupper);
error
no matching function for call to 'for_each(std::vector<std::basic_string<char> >::iterator, std::vector<std::basic_string<char> >::iterator, <unresolved overloaded function type>)
Do I need to iterate over the elements using a standard for loop, or is there another way? I am not allowed to use C++11 features.
Thanks
The toupper function is used for characters, not strings. It also returns the uppercase character, so won't work with for_each, but will with std::transform. There is also the problem that std::toupper exists in two overloads, and the compiler can't decide which one to use. Include <cctype> and use plain toupper (or optionally ::toupper) to get the right function.
You need to iterate first over all strings in the vector, and then iterate again over each string to call toupper.
You can either do it manually, or use transform and use functor objects, something like
struct strtoupper
{
std::string operator()(const std::string& str) const
{
std::string upper;
std::transform(str.begin(), str.end(), std::back_inserter(upper), ::toupper);
return upper;
}
};
// ...
std::transform(a_list.begin(), a_list.end(), a_list.begin(), strtoupper());
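You have a vector of std::string, and std::toupper expects a single character as its parameter, so it cannot be applied to the strings directly. What you can do (this needs C++11 for the lambda) is transform each string in place, casting std::toupper to select one overload:
std::for_each(a_list.begin(), a_list.end(), [](std::string& s) { std::transform(s.begin(), s.end(), s.begin(), static_cast<int(*)(int)>(std::toupper)); });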
std::toupper is an overloaded function; that’s why you’re getting <unresolved overloaded function type> in the error message. To select a particular overload, you need to cast it:
static_cast<int(*)(int)>(std::toupper)
for_each is also not the right choice for this task—it will call toupper for each string in the list, then discard the result. std::transform would be the appropriate choice—it writes its output to an output iterator. However, toupper works on characters, not strings. You could still use transform to call toupper for each character in a string:
std::transform(
a_string.begin(),
a_string.end(),
a_string.begin(),
static_cast<int(*)(int)>(std::toupper)
);
It would probably be clearer in this simple case to use loops:
for (TVector::iterator i = a_list.begin(), end = a_list.end(); i != end; ++i) {
for (std::string::size_type j = 0; j < i->size(); ++j) {
(*i)[j] = toupper((*i)[j]);
}
}
But if you wanted to write it with <algorithm> and <iterator> tools only, you could make a functor:
struct string_to_upper {
std::string operator()(const std::string& input) const {
std::string output;
std::transform(
input.begin(),
input.end(),
std::back_inserter(output),
static_cast<int(*)(int)>(std::toupper)
);
return output;
}
};
// ...
std::transform(
a_list.begin(),
a_list.end(),
a_list.begin(),
string_to_upper()
);
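A middle ground between the loops and the functor, still without C++11 features, is a small non-overloaded wrapper function, which needs no cast. This is only a sketch; to_upper_char and upper_all are hypothetical names, and the unsigned char conversion is an addition to keep toupper well-defined for negative char values.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical wrapper: not overloaded, so it can be passed by name.
char to_upper_char(char c) {
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Hypothetical helper: upper-cases every string of the vector in place.
void upper_all(std::vector<std::string>& v) {
    for (std::vector<std::string>::iterator it = v.begin(); it != v.end(); ++it)
        std::transform(it->begin(), it->end(), it->begin(), to_upper_char);
}
```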
How would I do something in c++ similar to the following code:
//Lang: Java
string.replaceAll(" +", " ");
This code-snippet would replace all multiple spaces in a string with a single space.
bool BothAreSpaces(char lhs, char rhs) { return (lhs == rhs) && (lhs == ' '); }
std::string::iterator new_end = std::unique(str.begin(), str.end(), BothAreSpaces);
str.erase(new_end, str.end());
How this works: std::unique has two forms. The first form goes through a range and removes adjacent duplicates, so the string "abbaaabbbb" becomes "abab". The second form, which I used, takes a predicate that takes two elements and returns true if they should be considered duplicates. The function I wrote, BothAreSpaces, serves this purpose. It determines exactly what its name implies: that both of its parameters are spaces. So when combined with std::unique, duplicate adjacent spaces are removed.
Just like std::remove and remove_if, std::unique doesn't actually make the container smaller, it just moves elements at the end closer to the beginning. It returns an iterator to the new end of range so you can use that to call the erase function, which is a member function of the string class.
Breaking it down, the erase function takes two parameters, a begin and an end iterator for the range to erase. For its first parameter I'm passing the return value of std::unique, because that's where I want to start erasing. For its second parameter, I am passing the string's end iterator.
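As a self-contained sketch, the unique/erase pair can be wrapped in a function; the name collapse_spaces is hypothetical. One caveat: the standard formally asks the predicate given to std::unique to be an equivalence relation, which BothAreSpaces is not, so the behavior described here rests on how common implementations work rather than on a guarantee.

```cpp
#include <algorithm>
#include <string>

bool BothAreSpaces(char lhs, char rhs) { return lhs == rhs && lhs == ' '; }

// Hypothetical wrapper around the unique/erase pair.
std::string collapse_spaces(std::string str) {
    // unique shifts the kept characters forward and returns the new logical
    // end; erase then drops the leftover tail.
    str.erase(std::unique(str.begin(), str.end(), BothAreSpaces), str.end());
    return str;
}
```

Repeated non-space characters are untouched, so "aab  b" would be expected to become "aab b".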
So, I tried an approach with std::remove_if and a lambda expression. In my eyes it is still easier to follow than the code above, though it doesn't have that "wow neat, didn't realize you could do that" quality. I'm posting it anyway, if only for learning purposes:
bool prev(false);
char rem(' ');
auto iter = std::remove_if(str.begin(), str.end(), [&] (char c) -> bool {
if (c == rem && prev) {
return true;
}
prev = (c == rem);
return false;
});
str.erase(iter, str.end());
EDIT realized that std::remove_if returns an iterator which can be used.. removed unnecessary code.
A variant of Benjamin Lindley's answer that uses a lambda expression to make things cleaner:
std::string::iterator new_end =
std::unique(str.begin(), str.end(),
[=](char lhs, char rhs){ return (lhs == rhs) && (lhs == ' '); }
);
str.erase(new_end, str.end());
Why not use a regular expression:
str = boost::regex_replace(str, boost::regex(" {2,}"), " ");
How about isspace(lhs) && isspace(rhs) to handle all types of whitespace?
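That suggestion might be sketched as follows. The names both_are_whitespace and collapse_whitespace are hypothetical, and the unsigned char casts are an addition to keep isspace well-defined. Note that std::unique keeps the first character of each run, so a run starting with a tab stays a tab rather than becoming a space.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical predicate: true when both characters are whitespace.
bool both_are_whitespace(char lhs, char rhs) {
    return std::isspace(static_cast<unsigned char>(lhs)) &&
           std::isspace(static_cast<unsigned char>(rhs));
}

// Hypothetical wrapper: collapses each whitespace run to its first character.
std::string collapse_whitespace(std::string str) {
    str.erase(std::unique(str.begin(), str.end(), both_are_whitespace),
              str.end());
    return str;
}
```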
I'm having a beginner problem:
bool _isPalindrome(const string& str)
{
return _isPalindrome(str.begin(), str.end()); // won't compile
}
bool _isPalindrome(string::iterator begin, string::iterator end)
{
return begin == end || *begin == *end && _isPalindrome(++begin, --end);
}
What am I doing wrong here? Why doesn't str.begin() get type checked to be a string::iterator?
Update: Better version:
bool BrittlePalindrome::_isPalindrome(string::const_iterator begin, string::const_iterator end)
{
return begin >= end || *begin == *(end - 1) && _isPalindrome(++begin, --end);
}
Assuming that you have a declaration of the second function before the first function, the main issue is that you are passing the string by const reference.
This means that the only overloads of begin() and end() that you have access to are the const versions which return std::string::const_iterator and not std::string::iterator.
The convention for iterators is that the end iterator points one beyond the end of a range and is not dereferenceable, which is certainly the case if you pass str.end() as the end parameter. This means that *begin == *end is not valid; you need to decrement end once first. You are also going to have an issue with ranges with odd numbers of elements: by doing ++begin and --end with no further checking, your iterators may cross over in the recursion rather than triggering the begin == end condition.
Also note that for maximum portability, global identifiers shouldn't start with an underscore.
The iterator-accepting method takes non-const string::iterator parameters, while the argument str is const, so str.begin() returns a string::const_iterator.
You can either change the iterator-accepting method to accept const_iterators, or you can change the string-accepting method to accept a non-const string.
Or you could cast away str's const-ness, but that would be a patent Bad Idea TM.
(I would also parenthesize your return statement on the iterator-accepting method to make your intent more clear, but that's neither here nor there.)
As previously mentioned your iterators need to be constant iterators, but there's something else wrong with your algorithm. It works fine if you have a string of odd length, but do you see what happens when your string is even length? Consider the palindrome:
aa
Your algorithm will pass in an iterator pointing to the front and one pointing to the end. All's good; then it will go to the next level, and all will still be good, but it won't end, because your first condition will never be true. You need to check not only whether begin==end but also whether begin+1==end (or begin==end-1 if you prefer). Otherwise your iterators are going to be upset.
What error are you getting?
Have you tried this?
bool _isPalindrome(string::const_iterator begin, string::const_iterator end)
replace iterator by const_iterator
swap function definitions
decrement end
Code:
bool isPalindrome(string::const_iterator begin, string::const_iterator end)
{
return (begin == end || begin == --end ||
*begin == *end && isPalindrome(++begin, end));
}
bool isPalindrome(const string& str)
{
return isPalindrome(str.begin(), str.end());
}
You haven't declared the second function before calling it in the first function. The compiler can't find it and thus tries to convert str.begin() (a string::const_iterator here, since str is const) into a const string&. You can move the first function after the second function.
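If recursion is not a requirement, the crossing-iterator problem can be avoided altogether. The following is a different technique from the answers above, offered only as a sketch: it compares the first half of the string with the reversed second half using std::equal. The name is_palindrome is hypothetical.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: compares the first half with the reversed second half.
bool is_palindrome(const std::string& str) {
    // For odd lengths the middle character is skipped, since size() / 2
    // rounds down.
    return std::equal(str.begin(), str.begin() + str.size() / 2, str.rbegin());
}
```

Because it takes the string by const reference and uses only begin() and rbegin(), the const_iterator mismatch from the question does not arise.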