Simple customized iterator with lambdas in C++ - c++

Suppose I have a container which contains int, a function that works over containers containing Point, and that I have a function that given some int gives me the corresponding Point it represents (imagine that I have indexed all the points in my scene in some big std::vector<Point>). How do I create a simple (and efficient) wrapper to use my first container without copying its content?
The code I want to type is something like that:
template<typename InputIterator>
double compute_area(InputIterator first, InputIterator beyond) {
// Do stuff
}
template<typename InputIterator, typename OutputIterator>
void convex_hull(InputIterator first, InputIterator beyond, OutputIterator result) {
// Do stuff
}
struct Scene {
std::vector<Point> vertices;
void foo(const std::vector<int> &polygon) {
// Create a simple wrapper with limited amount of mumbo-jumbo
auto functor = [&](int i) -> Point& { return vertices[polygon[i]]; };
MagicIterator polyBegin(0, functor);
MagicIterator polyEnd(polygon.size(), functor);
// NOTE: I want them to act as random access iterator
// And then use it directly
double a = compute_area(polyBegin, polyEnd);
// Bonus: create custom inserter similar to std::back_inserter
std::vector<int> result;
convex_hull(polyBegin, polyEnd, MagicInserter(result));
}
};
So, as you've seen, I'm looking for something a bit generic. I thought about using lambdas as well, but I'm getting a bit mixed up on how to proceed to keep it simple and user-friendly.

I suggest Boost's Transform Iterator. Here's an example usage:
#include <boost/iterator/transform_iterator.hpp>
#include <vector>
#include <cassert>
#include <functional>
struct Point { int x, y; };
template<typename It>
void compute(It begin, It end)
{
while (begin != end) {
begin->x = 42;
begin->y = 42;
++begin;
}
}
int main()
{
std::vector<Point> vertices(5);
std::vector<int> polygon { 2, 3, 4 };
std::function<Point&(int)> functor = [&](int i) -> Point& { return vertices[i]; };
auto polyBegin = boost::make_transform_iterator(polygon.begin(), functor);
auto polyEnd = boost::make_transform_iterator(polygon.end(), functor);
compute(polyBegin, polyEnd);
assert(vertices[2].y == 42);
}
I didn't quite get the part about custom back_inserter. If the type stored in result vector is the same as what the functor returns, the one from standard library will do. Otherwise you can just wrap it in transform_iterator, too.
Note that the functor is stored in a std::function. Boost relies on the functor to have a typedef result_type defined and lambdas don't have it.

I see two methods. Either start with boost::iterator_facade and write the "functional iterator" type yourself.
Or use boost::counting_iterator (or write your own; they are pretty easy), then use boost::transform_iterator to map that index iterator over to your Point iterator.
All of the above can also be written directly. I'd write it as a random access iterator: which requires a number of typedefs, ++, --, a number of +=, -=, -, +s, the comparisons, and * and -> to be defined properly. It is a bit of boilerplate, the boost libraries above just make it a touch less boilerplate (by having the boilerplate within itself).
I've written myself a version of this that takes the function type as an argument, then stores the function alongside the index. It advances/compares/etc using the index, and dereferences using the function type. By making the function type std::function<blah()> I get the type-erased version of it, and by making it a decltype of a lambda argument, or the type of a functor, I get a more efficient version.
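A minimal sketch of the index-plus-function idea, not the answerer's actual code. The name IndexIterator and the demo function are made up for illustration. It assumes the callable returns an lvalue reference and outlives the iterators, and it stores a pointer to the callable so that the iterator stays assignable even when the callable is a lambda.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

struct Point { int x, y; };

// Hypothetical "functional iterator": an index plus a pointer to a callable.
// Assumption: F is callable as f(std::size_t) and returns an lvalue reference.
template <typename F>
class IndexIterator {
public:
    using difference_type = std::ptrdiff_t;
    using reference = decltype(std::declval<const F&>()(std::size_t{}));
    using value_type = typename std::decay<reference>::type;
    using pointer = typename std::add_pointer<reference>::type;
    using iterator_category = std::random_access_iterator_tag;

    IndexIterator() : i_(0), f_(nullptr) {}
    IndexIterator(difference_type i, const F& f) : i_(i), f_(&f) {}

    // Dereferencing goes through the callable; everything else uses the index.
    reference operator*() const { return (*f_)(static_cast<std::size_t>(i_)); }
    pointer operator->() const { return &**this; }
    reference operator[](difference_type n) const { return *(*this + n); }

    IndexIterator& operator++() { ++i_; return *this; }
    IndexIterator operator++(int) { IndexIterator t(*this); ++i_; return t; }
    IndexIterator& operator--() { --i_; return *this; }
    IndexIterator operator--(int) { IndexIterator t(*this); --i_; return t; }
    IndexIterator& operator+=(difference_type n) { i_ += n; return *this; }
    IndexIterator& operator-=(difference_type n) { i_ -= n; return *this; }

    friend IndexIterator operator+(IndexIterator a, difference_type n) { return a += n; }
    friend IndexIterator operator+(difference_type n, IndexIterator a) { return a += n; }
    friend IndexIterator operator-(IndexIterator a, difference_type n) { return a -= n; }
    friend difference_type operator-(const IndexIterator& a, const IndexIterator& b) { return a.i_ - b.i_; }

    friend bool operator==(const IndexIterator& a, const IndexIterator& b) { return a.i_ == b.i_; }
    friend bool operator!=(const IndexIterator& a, const IndexIterator& b) { return a.i_ != b.i_; }
    friend bool operator<(const IndexIterator& a, const IndexIterator& b) { return a.i_ < b.i_; }
    friend bool operator>(const IndexIterator& a, const IndexIterator& b) { return a.i_ > b.i_; }
    friend bool operator<=(const IndexIterator& a, const IndexIterator& b) { return a.i_ <= b.i_; }
    friend bool operator>=(const IndexIterator& a, const IndexIterator& b) { return a.i_ >= b.i_; }

private:
    difference_type i_;
    const F* f_;  // not owned; the callable must outlive the iterator
};

// Usage sketch: walk the points of a polygon given as indices into vertices.
int demo_sum_x() {
    std::vector<Point> vertices = {{10, 0}, {20, 0}, {30, 0}, {40, 0}};
    std::vector<int> polygon = {3, 1};
    auto functor = [&](std::size_t i) -> Point& { return vertices[polygon[i]]; };
    IndexIterator<decltype(functor)> first(0, functor);
    IndexIterator<decltype(functor)> last(static_cast<std::ptrdiff_t>(polygon.size()), functor);
    assert(last - first == 2);
    int total = 0;
    for (auto it = first; it != last; ++it) total += it->x;
    return total;
}
```

Using decltype of the lambda as F keeps the call inlinable; swapping in std::function<Point&(std::size_t)> would give the type-erased variant mentioned above.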

Related

Any way to trick std::transform into operating on the iterator themselves?

So I wrote this code which won't compile. I think the reason is because std::transform, when given an iterator range such as this will operate on the type pointed to by the iterator, not the iterator itself. Is there any simple wrapper, standard lib tool, etc. to make this code work i.e. to store all the iterators of the original map into a new vector, with minimum changes required? Thanks!
#include <map>
#include <iostream>
#include <vector>
using MT = std::multimap<char, int>;
using MTI = MT::iterator;
int main()
{
MT m;
m.emplace('a', 1); m.emplace('a', 2); m.emplace('a', 3);
m.emplace('b', 101);
std::vector<MTI> itrs;
std::transform(m.begin(), m.end(), std::back_inserter(itrs), [](MTI itr){
return itr;
});
}
EDIT 1: Failed to compile with gcc11 and clang13, C++17/20
EDIT 2: The purpose of the question is mostly out of curiosity. I want to see what's a good way to manipulate existing standard algorithm to work on the level that I want. The sample code and problem are entirely made up for demonstration but they are not related to any real problem that requires a solution
Is there such a wrapper? Not in the standard. But it doesn't mean you can't write one, even fairly simply.
template<typename It>
struct PassIt : It {
It& operator*() { return *this; }
It const& operator*() const { return *this; }
PassIt & operator++() { ++static_cast<It&>(*this); return *this; }
PassIt operator++(int) { return PassIt{static_cast<It&>(*this)++}; }
};
template<typename It>
PassIt(It) -> PassIt<It>;
That is just an example (1) of a wrapper that is an iterator of the specified template parameter type. It delegates to its base for the bookkeeping, while ensuring that dereferencing returns the wrapped iterator itself.
You can use it in your example to simply copy the iterators
std::copy(PassIt{m.begin()}, PassIt{m.end()}, std::back_inserter(itrs));
See it live
(1) - It relies on std::iterator_traits deducing the correct things. As written in this example, it may not conform to all the requirements of the prescribed iterator type (in this case, we aimed at a forward iterator). If that happens, more boiler-plate will be required.
The function you pass to std::transform, and to algorithms in general, is supposed to operate on elements, not iterators. You could use the key to find the iterator in the map, though that's neither efficient nor simple. Instead use a plain loop:
for (auto it = m.begin(); it != m.end(); ++it) itrs.push_back(it);
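If a standard algorithm is still wanted, std::iota happens to fit, since it only needs a value it can assign and pre-increment. The sketch below is one possible way, assuming the iterator type is default constructible (true for forward iterators such as those of std::multimap); collect_iterators and demo_last_value are names invented here.

```cpp
#include <cassert>
#include <map>
#include <numeric>
#include <vector>

using MT = std::multimap<char, int>;
using MTI = MT::iterator;

// Fill a vector with every iterator of the map: iota assigns the current
// iterator to each slot and then increments it.
std::vector<MTI> collect_iterators(MT& m) {
    std::vector<MTI> itrs(m.size());
    std::iota(itrs.begin(), itrs.end(), m.begin());
    return itrs;
}

// Usage sketch with the data from the question.
int demo_last_value() {
    MT m;
    m.emplace('a', 1); m.emplace('a', 2); m.emplace('a', 3);
    m.emplace('b', 101);
    std::vector<MTI> itrs = collect_iterators(m);
    assert(itrs.size() == 4);
    return itrs.back()->second;
}
```

The plain loop is arguably still clearer; this only shows that an existing algorithm can be bent to work at the iterator level.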

Why doesn’t std::map provide key_iterator and value_iterator?

I am working in a C++03 environment, and applying a function to every key of a map is a lot of code:
const std::map<X,Y>::const_iterator end = m_map.end();
for (std::map<X,Y>::const_iterator element = m_map.begin(); element != end; ++element)
{
func( element->first );
}
If a key_iterator existed, the same code could take advantage of std::for_each:
std::for_each( m_map.key_begin(), m_map.key_end(), &func );
So why isn’t it provided? And is there a way to adapt the first pattern to the second one?
Yes, it is a silly shortcoming. But it's easily rectified: you can write your own generic key_iterator class which can be constructed from the map (pair) iterator. I've done it, it's only a few lines of code, and it's then trivial to make value_iterator too.
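For illustration, a key iterator along those lines could look roughly like this. It is a sketch written to stay C++03-compatible, not the answerer's code; key_iterator, key_begin, key_end and demo_keys are hypothetical names, and only forward iteration is provided to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Wraps a map's const_iterator and exposes only the key on dereference.
template <typename Map>
class key_iterator {
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef typename Map::key_type value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const value_type* pointer;
    typedef const value_type& reference;

    key_iterator() : it_() {}
    explicit key_iterator(typename Map::const_iterator it) : it_(it) {}

    reference operator*() const { return it_->first; }
    pointer operator->() const { return &it_->first; }
    key_iterator& operator++() { ++it_; return *this; }
    key_iterator operator++(int) { key_iterator tmp(*this); ++it_; return tmp; }

    friend bool operator==(const key_iterator& a, const key_iterator& b) { return a.it_ == b.it_; }
    friend bool operator!=(const key_iterator& a, const key_iterator& b) { return !(a == b); }

private:
    typename Map::const_iterator it_;
};

// Free functions standing in for the key_begin()/key_end() members wished for.
template <typename Map>
key_iterator<Map> key_begin(const Map& m) { return key_iterator<Map>(m.begin()); }

template <typename Map>
key_iterator<Map> key_end(const Map& m) { return key_iterator<Map>(m.end()); }

// Usage sketch: copy the keys of a map into a vector.
std::vector<int> demo_keys() {
    std::map<int, char> m;
    m[3] = 'c'; m[1] = 'a'; m[2] = 'b';
    return std::vector<int>(key_begin(m), key_end(m));
}
```

With this in place, the call from the question would become std::for_each(key_begin(m_map), key_end(m_map), &func). A value_iterator would differ only in returning it_->second.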
There is no need for std::map<K, V> to provide iterators for the keys and/or the values: such an iterator can easily be built based on the existing iterator(s). Well, it isn't as easy as it should/could be but it is certainly doable. I know that Boost has a library of iterator adapters.
The real question could be: why doesn't the standard C++ library provide iterator adapters to project iterators? The short answer, in my opinion: because, in general, you don't want to modify the iterator to choose the property accessed! You would rather project or, more generally, transform the accessed value while keeping the same notion of position. Put differently, I think it is necessary to separate the notion of positioning (i.e., advancing iterators and testing whether their position is valid) from accessing properties at a given position. The approach I envision would look like this:
std::for_each(m_map.key_pm(), m_map.begin(), m_map.end(), &func);
or, if you know the underlying structure obtained from the map's iterator is a std::pair<K const, V> (as is the case for std::map<K, V> but not necessarily for other containers similar to associative containers; e.g., an associative container based on a b-tree would benefit from splitting the key and the value into separate entities):
std::for_each(_1st, m_map.begin(), m_map.end(), &func);
My STL 2.0 page is an [incomplete] write-up with a bit more details on how I think the standard C++ library algorithms should be improved, including the above separation of iterators into positioning (cursors) and property access (property maps).
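To make the cursor/property-map split a bit more concrete, here is a rough sketch of what such a for_each could look like. Nothing here is a real standard or Boost API: for_each_pm, first_pm, KeySum and demo_key_sum are all hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <map>

// Hypothetical property map: projects a pair-like element onto its first member.
struct first_pm {
    template <typename Pair>
    const typename Pair::first_type& operator()(const Pair& p) const { return p.first; }
};

// Hypothetical algorithm: the iterators only provide positions, the property
// map decides which property of the element at each position is passed to f.
template <typename PM, typename It, typename F>
F for_each_pm(PM pm, It first, It last, F f) {
    for (; first != last; ++first) f(pm(*first));
    return f;
}

// A small stateful function object to observe what gets passed in.
struct KeySum {
    int total;
    KeySum() : total(0) {}
    void operator()(int key) { total += key; }
};

// Usage sketch: visit only the keys of a map.
int demo_key_sum() {
    std::map<int, char> m;
    m[5] = 'x'; m[10] = 'y'; m[43] = 'z';
    return for_each_pm(first_pm(), m.begin(), m.end(), KeySum()).total;
}
```

The iterators are left untouched; only the access step changes, which is the separation argued for above.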
So why isn’t it provided?
I don't know.
And is there a way to adapt the first pattern to the second one?
Alternatively to making a “key iterator” (cf. my comment and other answers), you can write a small wrapper around func, e.g.:
class FuncOnFirst { // (maybe find a better name)
public:
void operator()(std::map<X,Y>::value_type const& e) const { func(e.first); }
};
then use:
std::for_each( m_map.begin(), m_map.end(), FuncOnFirst() );
Slightly more generic wrapper:
class FuncOnFirst { // (maybe find a better name)
public:
template<typename T, typename U>
void operator()(std::pair<T, U> const& p) const { func(p.first); }
};
There is no need for key_iterator or value_iterator as value_type of a std::map is a std::pair<const X, Y>, and this is what function (or functor) called by for_each() will operate on. There is no performance gain to be had from individual iterators as the pair is aggregated in the underlying node in the binary tree used by the map.
Accessing the key and value through a std::pair is hardly strenuous.
#include <algorithm>
#include <iostream>
#include <map>
typedef std::map<unsigned, unsigned> Map;
void F(const Map::value_type &v)
{
std::cout << "Key: " << v.first << " Value: " << v.second << std::endl;
}
int main(int argc, const char * argv[])
{
Map map;
map.insert(std::make_pair(10, 20));
map.insert(std::make_pair(43, 10));
map.insert(std::make_pair(5, 55));
std::for_each(map.begin(), map.end(), F);
return 0;
}
Which gives the output:
Key: 5 Value: 55
Key: 10 Value: 20
Key: 43 Value: 10
Program ended with exit code: 0

One line std::vector ctor from mapping another vector?

C++11
There should be a one-line version of the last two lines.
typedef std::pair<T1, T2> impl_node;
std::vector<impl_node> impl;
/* do stuff with impl */
std::vector<T1> retval(impl.size());
std::transform(impl.cbegin(), impl.cend(), retval.begin(),
[](const impl_node& in) { return *in.first; });
I tried writing some sort of custom iterator adapter, and the types are getting hairy. What's the "right" solution? (And it probably generalizes to all sorts of other adapters.)
This is still two lines, but less typing (in both senses):
std::vector<T1> retval;
for (const auto& p : impl) retval.push_back(p.first);
Actually, now that I look at it, I'd prefer three lines:
std::vector<T1> retval;
retval.reserve(impl.size());
for (const auto& p : impl) retval.push_back(p.first);
(Edited to remove move because there's no evidence that it's appropriate)
I do not know of a way to do this in one line using only the standard STL from C++11, without writing at least a (templated) helper function first.
You may be looking for a concept where the 2 iterators become one object and C++ starts to support behaviour similar to the LINQ extension methods in .NET:
http://www.boost.org/doc/libs/1_52_0/libs/range/doc/html/index.html
You can get at least half of what you're looking for by using an insert iterator.
Allocate the vector without specifying a size,
std::vector<T1> retval;
...and then populate it by using back_inserter (from #include <iterator>):
std::transform(impl.cbegin(), impl.cend(), back_inserter(retval),[](const impl_node& in) { return *in.first; });
Well, we could start with this:
template<typename Output, typename Input, typename Transformation>
auto transform( Input const& input, Transformation t )->Output {
Output retval;
retval.reserve(input.size());
using std::cbegin; using std::cend;
std::transform(cbegin(input), cend(input), std::back_inserter(retval), t);
return retval;
}
Then work up to something like this:
namespace aux{
using std::cbegin;
// Declaration only: used inside decltype to find cbegin via ADL.
template<typename T>
auto adl_cbegin( T&& t )->decltype(cbegin(std::forward<T>(t)));
}
template<typename Input, typename Transformation>
auto transform_vec( Input const& input, Transformation t )->
std::vector<typename std::decay<decltype(t(*aux::adl_cbegin(input)))>::type>
{
// std::decay strips references and cv-qualifiers from the transformation's result type.
typedef std::vector<typename std::decay<decltype(t(*aux::adl_cbegin(input)))>::type> Output;
Output retval;
// retval.reserve(input.size()); -- need a way to do this if Input has an easy way to get size. Too lazy to bother right now.
using std::cbegin; using std::cend;
std::transform(cbegin(input), cend(input), std::back_inserter(retval), t);
return retval;
}
Notes: this takes anything iterable (vectors, arrays, anything std::cbegin and std::cend accept) and produces a std::vector of whatever type the transformation returns,
and, from there, upgrade to producing a std::pair of boost::transform_iterator on the input range, so we can then insert the resulting transformation into an arbitrary container, and we only do the transformation work if we actually dereference the iterators.
Or, you know, just use the std::back_inserter(input) directly. :) The downside to that approach is that it doesn't do the reserve, so there are performance hits.
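For the lazy variant without Boost, something like the following sketch could serve. It is a stripped-down stand-in for boost::transform_iterator, not its actual implementation: lazy_transform_iterator, make_lazy and demo_firsts are names invented here, only input-iterator operations are provided, and the callable is assumed to outlive the iterators because only a pointer to it is stored.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Applies f to the underlying element only when the iterator is dereferenced.
template <typename It, typename F>
class lazy_transform_iterator {
public:
    using value_type = typename std::decay<
        decltype(std::declval<const F&>()(*std::declval<const It&>()))>::type;
    using reference = value_type;  // results are returned by value
    using pointer = void;
    using difference_type = typename std::iterator_traits<It>::difference_type;
    using iterator_category = std::input_iterator_tag;

    lazy_transform_iterator(It it, const F& f) : it_(it), f_(&f) {}

    value_type operator*() const { return (*f_)(*it_); }
    lazy_transform_iterator& operator++() { ++it_; return *this; }
    lazy_transform_iterator operator++(int) { lazy_transform_iterator t(*this); ++it_; return t; }

    friend bool operator==(const lazy_transform_iterator& a, const lazy_transform_iterator& b) { return a.it_ == b.it_; }
    friend bool operator!=(const lazy_transform_iterator& a, const lazy_transform_iterator& b) { return !(a == b); }

private:
    It it_;
    const F* f_;  // not owned
};

template <typename It, typename F>
lazy_transform_iterator<It, F> make_lazy(It it, const F& f) {
    return lazy_transform_iterator<It, F>(it, f);
}

// Usage sketch: build the result vector straight from the adapted range.
std::vector<int> demo_firsts() {
    typedef std::pair<int, std::string> impl_node;
    std::vector<impl_node> impl = {{1, "one"}, {2, "two"}, {3, "three"}};
    auto first_of = [](const impl_node& n) { return n.first; };
    return std::vector<int>(make_lazy(impl.cbegin(), first_of),
                            make_lazy(impl.cend(), first_of));
}
```

Because the category is only input_iterator_tag, the vector constructor cannot size itself up front; forwarding the underlying iterator's category would fix that at the cost of more boilerplate.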

C++ algorithms that create their output-storage instead of being applied to existing storage?

The C++ std algorithms define a number of algorithms that take an input and an output sequence, and create the elements of the output sequence from the elements of the input sequence. (Best example being std::transform.)
The std algorithms obviously take iterators, so there's no question that the container for the OutputIterator has to exist prior to the algorithm being invoked.
That is:
std::vector<int> v1; // e.g. v1 := {1, 2, 3, 4, 5};
std::vector<int> squared;
squared.reserve(v1.size()); // not strictly necessary
std::transform(v1.begin(), v1.end(), std::back_inserter(squared),
[](int x) { return x*x; } ); // λ for convenience, needn't be C++11
And this is fine as far as the std library goes. When I find iterators too cumbersome, I often look to Boost.Range to simplify things.
In this case however, it seems that the mutating algorithms in Boost.Range also use OutputIterators.
So I'm currently wondering whether there's any convenient library out there, that allows me to write:
std::vector<int> const squared = convenient::transform(v1, [](int x) { return x*x; });
-- and if there is none, whether there is a reason that there is none?
Edit: example implementation (not sure if this would work in all cases, and whether this is the most ideal one):
template<typename C, typename F>
C transform(C const& input, F fun) {
C result;
std::transform(input.begin(), input.end(), std::back_inserter(result), fun);
return result;
}
(Note: I think convenient::transform will have the same performance characteristics as the handwritten one, as the returned vector won't be copied due to (N)RVO. Anyway, I think performance is secondary for this question.)
Edit/Note: Of the answers(comments, really) given so far, David gives a very nice basic generic example.
And Luc mentions a possible problem with std::back_inserter wrt. genericity.
Both just go to show why I'm hesitating to whip this up myself and why a "proper" (properly tested) library would be preferable to coding this myself.
My question phrased in bold above, namely is there one, or is there a reason there is none remains largely unanswered.
This is not meant as an answer to the question itself, it's a complement to the other answers -- but it wouldn't fit in the comments.
"well - what if you wanted list or deque or some other sequence type container - it's pretty limiting."
namespace detail {
template<typename Iter, typename Functor>
struct transform {
Iter first, last;
Functor functor;
template<typename Container> // SFINAE is also available here
operator Container()
{
Container c;
std::transform(first, last, std::back_inserter(c), std::forward<Functor>(functor));
return c;
}
};
} // detail
template<typename Iter, typename Functor>
detail::transform<Iter, typename std::decay<Functor>::type>
transform(Iter first, Iter last, Functor&& functor)
{ return { first, last, std::forward<Functor>(functor) }; }
While this would work with a handful of containers, it's still not terribly generic since it requires that the container be 'compatible' with std::back_inserter(c) (BackInsertable?). Possibly you could use SFINAE to instead use std::inserter with c.begin() if c.push_back() is not available (left as an exercise to the reader).
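One way to approach that exercise is sketched below, using expression SFINAE on c.push_back. All names (sketch::make_inserter, sketch::transform_into, the demo functions) are made up for this example, and the fallback inserts at c.end() rather than c.begin(), which is an arbitrary choice of hint.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <utility>
#include <vector>

namespace sketch {

// Preferred overload (0 is an int, so this is the exact match): only viable
// when c.push_back(value) is a valid expression.
template <typename Container>
auto make_inserter(Container& c, int)
    -> decltype(c.push_back(std::declval<typename Container::value_type>()),
                std::back_inserter(c)) {
    return std::back_inserter(c);
}

// Fallback overload: chosen when the one above is removed by SFINAE.
template <typename Container>
auto make_inserter(Container& c, long) -> decltype(std::inserter(c, c.end())) {
    return std::inserter(c, c.end());
}

// Container-returning transform that works for both kinds of container.
template <typename Out, typename In, typename F>
Out transform_into(const In& in, F f) {
    Out out;
    std::transform(in.begin(), in.end(), make_inserter(out, 0), f);
    return out;
}

}  // namespace sketch

// Usage sketches.
std::set<int> demo_set() {
    std::vector<int> v = {3, 1, 2};
    return sketch::transform_into<std::set<int>>(v, [](int x) { return x * x; });
}

std::vector<int> demo_vec() {
    std::vector<int> v = {3, 1, 2};
    return sketch::transform_into<std::vector<int>>(v, [](int x) { return x * x; });
}
```

The dummy int/long parameter is just a ranking trick so that the push_back version wins whenever both overloads are viable.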
All of this also assume that the container is DefaultConstructible -- consider containers that make use of scoped allocators. Presumably that loss of genericity is a feature, as we're only trying to cover the 'simplest' uses.
And this is in fact why I would not use such a library: I don't mind creating the container just outside, next to the algorithm, to separate the concerns. (I suppose this can be considered my answer to the question.)
IMHO, the point of such an algorithm is to be generic, i.e. mostly container agnostic. What you are proposing is that the transform function be very specific, and return a std::vector, well - what if you wanted list or deque or some other sequence type container - it's pretty limiting.
Why not wrap if you find it so annoying? Create your own little utilities header which does this - after all, it's pretty trivial...
The Boost.Range.Adaptors can be kind of seen as container-returning algorithms. Why not use them?
The only thing that needs to be done is to define a new range adaptor converted<T> that can be piped into the adapted ranges and produces the desired result container:
template<class T> struct converted{}; // dummy tag class
template<class FwdRange, class T>
T operator|(FwdRange const& r, converted<T>){
return T(r.begin(), r.end());
}
Yep, that's it. No need for anything else. Just pipe that at the end of your adaptor list.
Here could be a live example on Ideone. Alas, it isn't, because Ideone doesn't provide Boost in C++0x mode.. meh. In any case, here's main and the output:
int main(){
using namespace boost::adaptors;
auto range = boost::irange(1, 10);
std::vector<int> v1(range.begin(), range.end());
auto squared = v1 | transformed([](int i){ return i * i; });
boost::for_each(squared, [](int i){ std::cout << i << " "; });
std::cout << "\n========================\n";
auto modded = squared | reversed
| filtered([](int i){ return (i % 2) == 0; })
| converted<std::vector<int>>(); // gimme back my vec!
modded.push_back(1);
boost::for_each(modded, [](int i){ std::cout << i << " "; });
}
Output:
1 4 9 16 25 36 49 64 81
========================
64 36 16 4 1
There is no single correct way of enabling
std::vector<int> const squared =
convenient::transform(v1, [](int x) { return x*x; });
without a potential performance cost. One option is to name the container type explicitly:
std::vector<int> const squared =
convenient::transform<std::vector> (v1, [](int x) { return x*x; });
Note the explicit mention of the container type: iterators don't tell you anything about the container they belong to. This becomes obvious if you recall that the standard allows a container's iterator to be an ordinary pointer.
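A sketch of how that explicit form might be spelled, using a template template parameter so that only the container template is named and the element type is deduced from the functor. This is one possible reading of the call syntax above, not code from the answer; it assumes the input works with std::begin/std::end and that the target supports push_back, and the demo functions are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

namespace convenient {

// Container is a template such as std::vector or std::list; the element type
// is whatever f returns, with references and cv-qualifiers stripped.
template <template <typename...> class Container, typename Input, typename F>
auto transform(const Input& input, F f)
    -> Container<typename std::decay<decltype(f(*std::begin(input)))>::type> {
    using Elem = typename std::decay<decltype(f(*std::begin(input)))>::type;
    Container<Elem> result;
    std::transform(std::begin(input), std::end(input), std::back_inserter(result), f);
    return result;
}

}  // namespace convenient

// Usage sketches: same input, two different target containers.
std::vector<int> demo_vector() {
    std::vector<int> v1 = {1, 2, 3};
    return convenient::transform<std::vector>(v1, [](int x) { return x * x; });
}

std::list<int> demo_list() {
    std::vector<int> v1 = {1, 2, 3};
    return convenient::transform<std::list>(v1, [](int x) { return x * x; });
}
```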
Letting the algorithm take a container instead of iterators is not a solution, either. That way, the algorithm can't know how to correctly get the first and last element. For example, a built-in array of long int has no begin(), end() or length() methods, not all containers have random access iterators, and operator[] is not always defined. So there is no truly generic way to take containers.
Another possibility that allows for container-agnostic, container-returning algorithms would be some kind of generic factory (see live at http://ideone.com/7d4E2):
// (not production code; is even lacking allocator-types)
//-- Generic factory. -------------------------------------------
#include <list>
template <typename ElemT, typename CacheT=std::list<ElemT> >
struct ContCreator {
CacheT cache; // <-- Temporary storage.
// Conversion to target container type.
template <typename ContT>
operator ContT () const {
// can't even move ...
return ContT (cache.begin(), cache.end());
}
};
Not so much magic there apart from the templated cast operator. You then return that thing from your algorithm:
//-- A generic algorithm, like std::transform :) ----------------
ContCreator<int> some_ints () {
ContCreator<int> cc;
for (int i=0; i<16; ++i) {
cc.cache.push_back (i*4);
}
return cc;
}
And finally use it like this to write magic code:
//-- Example. ---------------------------------------------------
#include <vector>
#include <iostream>
int main () {
typedef std::vector<int>::iterator Iter;
std::vector<int> vec = some_ints();
for (Iter it=vec.begin(), end=vec.end(); it!=end; ++it) {
std::cout << *it << '\n';
}
}
As you see, in operator T there's a range copy.
A move might be possible by means of template specialization in case the target and source containers are of the same type.
Edit: As David points out, you can of course do the real work inside the conversion operator, which will probably come at no extra cost (with some more work it can be made more convenient; this is just for demonstration):
#include <list>
template <typename ElemT, typename Iterator>
struct Impl {
Impl(Iterator it, Iterator end) : it(it), end(end) {}
Iterator it, end;
// "Conversion" + Work.
template <typename ContT>
operator ContT () {
ContT ret;
for ( ; it != end; ++it) {
ret.push_back (*it * 4);
}
return ret;
}
};
template <typename Iterator>
Impl<int,Iterator> foo (Iterator begin, Iterator end) {
return Impl<int,Iterator>(begin, end);
}
#include <vector>
#include <iostream>
int main () {
typedef std::vector<int>::iterator Iter;
const int ints [] = {1,2,4,8};
std::vector<int> vec = foo (ints, ints + sizeof(ints) / sizeof(int));
for (Iter it=vec.begin(), end=vec.end(); it!=end; ++it) {
std::cout << *it << '\n';
}
}
The one requirement is that the target has a push_back method. Using std::distance to reserve a size may lead to sub-optimal performance if the target-container-iterator is not a random-access one.
Again, a no-answer, but rather a follow up from the comments to another answer
On the genericity of the returned type in the question's code
The code as it stands does not allow conversion of the return type, but that is easily solved by providing two templates:
template <typename R, typename C, typename F>
R transform( C const & c, F f ) {
R res;
std::transform( c.begin(), c.end(), std::back_inserter(res), f );
return res;
}
template <typename C, typename F>
C transform( C const & c, F f ) {
return transform<C,C,F>(c,f);
}
std::vector<int> src;
std::vector<int> v = transform( src, functor );
std::deque<int> d = transform<std::deque<int> >( src, functor );

What is wrong with `std::set`?

In the other topic I was trying to solve this problem. The problem was to remove duplicate characters from a std::string.
std::string s= "saaangeetha";
Since the order was not important, I sorted s first, then used std::unique, and finally resized it to get the desired result:
aeghnst
That is correct!
Now I want to do the same, but this time keep the order of the characters intact. That is, I want this output:
sangeth
So I wrote this:
template<typename T>
struct is_repeated
{
std::set<T> unique;
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
s.erase(std::remove_if(s.begin(), s.end(), is_repeated<char>()), s.end());
std::cout << s ;
}
Which gives this output:
saangeth
That is, a is repeated, though other repetitions gone. What is wrong with the code?
Anyway I change my code a bit: (see the comment)
template<typename T>
struct is_repeated
{
std::set<T> & unique; //made reference!
is_repeated(std::set<T> &s) : unique(s) {} //added line!
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
std::set<char> set; //added line!
s.erase(std::remove_if(s.begin(),s.end(),is_repeated<char>(set)),s.end());
std::cout << s ;
}
Output:
sangeth
Problem gone!
So what is wrong with the first solution?
Also, if I don't make the member variable unique a reference type, then the problem doesn't go away.
What is wrong with std::set or is_repeated functor? Where exactly is the problem?
I also note that if the is_repeated functor is copied somewhere, then every member of it is also copied. I don't see the problem here!
Functors are supposed to be designed in a way where a copy of a functor is identical to the original functor. That is, if you make a copy of one functor and then perform a sequence of operations, the result should be the same no matter which functor you use, or even if you interleave the two functors. This gives the STL implementation the flexibility to copy functors and pass them around as it sees fit.
With your first functor, this claim does not hold because if I copy your functor and then call it, the changes you make to its stored set do not reflect in the original functor, so the copy and the original will perform differently. Similarly, if you take your second functor and make it not store its set by reference, the two copies of the functor will not behave identically.
The reason that your final version of the functor works, though, is that the set is stored by reference, which means that any number of copies of the functor will behave identically to one another.
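A tiny illustration of that point, reusing the first functor from the question; the function copies_share_state is just a made-up name for the experiment.

```cpp
#include <cassert>
#include <set>

// The first (by-value) functor from the question.
template <typename T>
struct is_repeated {
    std::set<T> unique;
    bool operator()(T c) { return !unique.insert(c).second; }
};

// Returns what the original reports for 'a' after a copy has already seen 'a'.
bool copies_share_state() {
    is_repeated<char> original;
    is_repeated<char> copy = original;  // copies the (empty) set
    copy('a');                          // recorded in the copy's set only
    return original('a');               // original has never seen 'a'
}
```

The original answers "not repeated" even though a copy has already seen the character, which is exactly the divergence an algorithm can trigger when it copies the predicate internally.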
Hope this helps!
In GCC (libstdc++), remove_if is implemented essentially as
template<typename It, typename Pred>
It remove_if(It first, It last, Pred predicate) {
first = std::find_if(first, last, predicate);
// ^^^^^^^^^
if (first == last)
return first;
else {
It result = first;
++ first;
for (; first != last; ++ first) {
if (!predicate(*first)) {
// ^^^^^^^^^
*result = std::move(*first);
++ result;
}
}
return result;
}
}
Note that your predicate is passed by-value to find_if, so the struct, and therefore the set, modified inside find_if will not be propagated back to caller.
Since the first duplicate appears at:
saaangeetha
// ^
The initial "sa" will be kept after the find_if call. Meanwhile, the predicate's set is empty (the insertions within find_if are local). Therefore the loop afterwards will keep the 3rd a.
sa | angeth
// ^^ ^^^^^^
// || kept by the loop in remove_if
// ||
// kept by find_if
Not really an answer, but as another interesting tidbit to consider, this does work, even though it uses the original functor:
#include <set>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
template<typename T>
struct is_repeated {
std::set<T> unique;
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
std::remove_copy_if(s.begin(), s.end(),
std::ostream_iterator<char>(std::cout),
is_repeated<char>());
return 0;
}
Edit: I don't think it affects this behavior, but I've also corrected a minor slip in your functor (operator() should apparently take a parameter of type T, not char).
I suppose the problem could lie in that the is_repeated functor is copied somewhere inside the implementation of std::remove_if. If that is the case, the default copy constructor is used and this in turn calls std::set copy constructor. You end up with two is_repeated functors possibly used independently. However as the sets in both of them are distinct objects, they don't see the mutual changes. If you turn the field is_repeated::unique to a reference, then the copied functor still uses the original set which is what you want in this case.
Functor classes should be pure functions and have no state of their own. See item 39 in Scott Meyers' book Effective STL for a good explanation of this. The gist is that your functor class may be copied one or more times inside the algorithm.
The other answers are correct in that the issue is that the functor you are using is not safe to copy. In particular, the STL that comes with gcc (4.2) implements std::remove_if as a combination of std::find_if, to locate the first element to delete, followed by std::remove_copy_if to complete the operation.
template <typename ForwardIterator, typename Predicate>
ForwardIterator remove_if( ForwardIterator first, ForwardIterator end, Predicate pred ) {
first = std::find_if( first, end, pred ); // [1]
ForwardIterator i = first;
return first == end? first
: std::remove_copy_if( ++i, end, first, pred ); // [2]
}
The copy in [1] means that the characters seen so far, including the first 'a', are recorded only in that copy of the functor and are then forgotten. The functor is copied again in [2], which would be fine except that the original it is copied from still holds an empty set.
Depending on the implementation, remove_if can make copies of your predicate. Either refactor your functor to make it stateless, or use Boost.Ref, which exists "for passing references to function templates (algorithms) that would usually take copies of their arguments", like so:
#include <set>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <boost/ref.hpp>
#include <boost/bind.hpp>
template<typename T>
struct is_repeated {
std::set<T> unique;
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
is_repeated<char> pred; // named object: boost::ref cannot bind to a temporary
s.erase(std::remove_if(s.begin(), s.end(), boost::bind<bool>(boost::ref(pred),_1)), s.end());
std::cout << s;
return 0;
}
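With C++11 available, std::ref from <functional> can play the same role without Boost. The following is a sketch under that assumption; dedupe_keep_order is a made-up helper name, and it relies on the algorithm visiting the characters in order, which common implementations do.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <set>
#include <string>

template <typename T>
struct is_repeated {
    std::set<T> unique;
    bool operator()(T c) { return !unique.insert(c).second; }
};

// Removes repeated characters while keeping the first occurrence of each.
std::string dedupe_keep_order(std::string s) {
    is_repeated<char> pred;
    // std::ref wraps pred in a std::reference_wrapper, so every copy the
    // algorithm makes still calls the one and only pred object.
    s.erase(std::remove_if(s.begin(), s.end(), std::ref(pred)), s.end());
    return s;
}
```

For the string from the question, dedupe_keep_order("saaangeetha") gives "sangeth".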