std::inserter with set - insert to begin() or end()? [duplicate] - c++

I have some code that looks like this:
std::set<int> s1, s2, out;
// ... s1 and s2 are populated ...
std::set_intersection(s1.begin(), s1.end(),
s2.begin(), s2.end(),
std::inserter(out, out.end()));
I've read inserts can be done in amortized constant time if the value being inserted to the set immediately follows the iterator given as a "hint". This would obviously be beneficial when running the set intersection, especially since everything being written to out is already in sorted order.
How do I guarantee this optimal performance? When creating the std::inserter, out is empty, so out.begin() == out.end() and I can't see that it makes any difference whether I specify out.begin() or out.end() as the hint. However, if this is interpreted as inserting every element at begin(), it doesn't seem that I would get the optimum algorithmic performance. Can this be done better?

I've chosen Alexander Gessler's answer as the 'correct' answer, because it led me to this solution, which I thought I would post anyway. I've written a last_inserter(), which guarantees that the insert position is always an iterator to the last element (or begin() if empty), because set wants an iterator to the element preceding the actual insert position for best performance (so not end() - that would be one after the actual insert position).
The usage as per the original example is like this:
std::set<int> s1, s2, out;
// ... s1 and s2 are populated ...
std::set_intersection(s1.begin(), s1.end(),
s2.begin(), s2.end(),
last_inserter(out)); // note no iterator provided
This guarantees that the insert hint is always an iterator to the last element, hopefully providing best-case performance when using an output iterator to a set with a sorted range, as above.
Below is my implementation. I think it's platform specific to Visual C++ 2010's STL implementation, because it's based heavily on the existing insert_iterator, and I can only get it working by deriving from std::_Outit. If anyone knows how to make this portable, let me know:
// VC10 STL wants this to be a checked output iterator. I haven't written one, but
// this needs to be defined to silence warnings about this.
#define _SCL_SECURE_NO_WARNINGS
template<class Container>
class last_inserter_iterator : public std::_Outit {
public:
typedef last_inserter_iterator<Container> _Myt;
typedef Container container_type;
typedef typename Container::const_reference const_reference;
typedef typename Container::value_type _Valty;
last_inserter_iterator(Container& cont)
: container(cont)
{
}
_Myt& operator=(const _Valty& _Val)
{
container.insert(get_insert_hint(), _Val);
return (*this);
}
_Myt& operator=(_Valty&& _Val)
{
container.insert(get_insert_hint(), std::forward<_Valty>(_Val));
return (*this);
}
_Myt& operator*()
{
return (*this);
}
_Myt& operator++()
{
return (*this);
}
_Myt& operator++(int)
{
return (*this);
}
protected:
Container& container;
typename Container::iterator get_insert_hint() const
{
// Container is empty: no last element to insert ahead of; just insert at begin.
if (container.empty())
return container.begin();
else
{
// Otherwise return iterator to last element in the container. std::set wants the
// element *preceding* the insert position as a hint, so this should be an iterator
// to the last actual element, not end().
return (--container.end());
}
}
};
template<typename Container>
inline last_inserter_iterator<Container> last_inserter(Container& cont)
{
return last_inserter_iterator<Container>(cont);
}
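One possible portable variant is sketched below, assuming C++11. Instead of deriving from the implementation-specific std::_Outit, it declares the five iterator typedefs by hand. The names end_hint_insert_iterator, end_hint_inserter and intersect are hypothetical. It also assumes the C++11 wording for hinted insertion, under which the insert is amortized constant when the new element goes immediately before the hint, so end() is used as the hint for ascending input. Under the older wording described above, the hint from get_insert_hint() would be used instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <utility>

// Hypothetical name; not part of any library.
template <class Container>
class end_hint_insert_iterator {
public:
    // Output-iterator typedefs written out by hand, replacing std::_Outit.
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef std::ptrdiff_t difference_type;
    typedef void pointer;
    typedef void reference;
    typedef Container container_type;

    explicit end_hint_insert_iterator(Container& cont) : container(&cont) {}

    end_hint_insert_iterator& operator=(const typename Container::value_type& value) {
        // Assumption: C++11 hint semantics (insert just before the hint).
        container->insert(container->end(), value);
        return *this;
    }
    end_hint_insert_iterator& operator=(typename Container::value_type&& value) {
        container->insert(container->end(), std::move(value));
        return *this;
    }
    // Dereference and increment are no-ops, as in std::insert_iterator.
    end_hint_insert_iterator& operator*() { return *this; }
    end_hint_insert_iterator& operator++() { return *this; }
    end_hint_insert_iterator operator++(int) { return *this; }

private:
    Container* container;  // pointer, so the iterator stays copy-assignable
};

template <class Container>
end_hint_insert_iterator<Container> end_hint_inserter(Container& cont) {
    return end_hint_insert_iterator<Container>(cont);
}

// Usage mirroring the original example.
std::set<int> intersect(const std::set<int>& s1, const std::set<int>& s2) {
    std::set<int> out;
    std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
                          end_hint_inserter(out));
    return out;
}
```

Whether this actually beats plain std::inserter(out, out.end()) depends on the library, so it is worth measuring before adopting it.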

You could use a custom functor instead of std::inserter that calls out.end() again each time a new element is inserted.
Alternatively, if your values are sorted in descending order, out.begin() will be fine.

According to http://gcc.gnu.org/onlinedocs/gcc-4.8.0/libstdc++/api/a01553_source.html
insert_iterator&
operator=(typename _Container::value_type&& __value)
{
iter = container->insert(iter, std::move(__value));
++iter;
return *this;
}
Here iter is initially the iterator you passed to std::inserter. After each assignment, iter points one past the value you just inserted, so if you're inserting in ascending order the hint should be optimally efficient.
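The same bookkeeping can be written out by hand, which may make it easier to see what the hint is at each step. This is a minimal sketch that mirrors the libstdc++ snippet above; the function name fill_sorted is hypothetical.

```cpp
#include <set>
#include <vector>

// Copies an ascending sequence into a set, carrying the hint forward the
// same way insert_iterator does: insert at the hint, then step past the
// element that was just inserted.
std::set<int> fill_sorted(const std::vector<int>& sorted_values) {
    std::set<int> out;
    std::set<int>::iterator hint = out.end();
    for (std::vector<int>::const_iterator v = sorted_values.begin();
         v != sorted_values.end(); ++v) {
        hint = out.insert(hint, *v);  // iterator to the element with value *v
        ++hint;                       // one past it; end() for ascending input
    }
    return out;
}
```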

std::multiset define comparator for insertion and comparison

I'm using a std::multiset of pointers to objects to implement Z-ordering in my game, so I don't need to sort the structure on each insertion. I use a comparator for insertion by the object's depth:
struct rendererComparator
{
bool operator ()(const Renderable* r1, const Renderable* r2) const
{
return r1->depth < r2->depth;
}
};
std::multiset<Renderable*, rendererComparator> m_Renderables;
However, when it comes to erasing an element from the multiset, the call to erase removes all elements that have the same depth, which is undesirable. I tried the suggestions in this question: "In std::multiset is there a function or algorithm to erase just one sample (unicate or duplicate) if an element is found", but
auto iterator = m_Renderables.find(renderable);
if (iterator != m_Renderables.end())
{
m_Renderables.erase(renderable);
}
this still erases all the elements with the same depth because of the comparator.
Is it possible to define two comparators for std::multiset without Boost, one for insertion and one for lookup? (See also: "How can I set two kind of comparator (one for insert, one for find) on this multiset?")
Thanks
Edit: Jignatious pointed out that I wasn't erasing the iterator (typo by me). I solved it by using std::find_if
auto iterator = std::find_if(m_Renderables.begin(), m_Renderables.end(), [renderable](const Renderable* r1) { return r1 == renderable; });
if (iterator != m_Renderables.end())
{
m_Renderables.erase(iterator);
}
The problem is on this line:
m_Renderables.erase(renderable);
which erases all elements with the same value.
You need to erase with the iterator from the find() function call instead. That will erase the single element that the iterator points to:
m_Renderables.erase(iterator);
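Note that std::multiset::find() returns an iterator to an element that compares equivalent to the argument under the set's comparator (which one is unspecified if there are several), or the past-the-end iterator if there is none. With a comparator that only looks at depth, the element found may therefore be a different object with the same depth.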
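One way to erase exactly the requested pointer while still using the tree's ordering is to narrow the search with equal_range() and then compare addresses inside that range. The sketch below assumes a stand-in Renderable with only a depth member; the typedef RenderableSet and the function eraseExact are hypothetical names.

```cpp
#include <set>
#include <utility>

struct Renderable { int depth; };  // stand-in for the question's class

struct rendererComparator {
    bool operator()(const Renderable* r1, const Renderable* r2) const {
        return r1->depth < r2->depth;
    }
};

typedef std::multiset<Renderable*, rendererComparator> RenderableSet;

// Erases exactly `target` (matched by address) and leaves other objects
// with the same depth in place. Returns true if the pointer was found.
bool eraseExact(RenderableSet& renderables, Renderable* target) {
    std::pair<RenderableSet::iterator, RenderableSet::iterator> range =
        renderables.equal_range(target);  // all elements with the same depth
    for (RenderableSet::iterator it = range.first; it != range.second; ++it) {
        if (*it == target) {
            renderables.erase(it);  // iterator overload: removes one element
            return true;
        }
    }
    return false;
}
```

Compared with std::find_if over the whole container, this only walks the elements that share the target's depth.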
Instead of multiset, you can use std::set with a comparator like this:
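// requires <functional> for std::less
struct Element
{
    int value;
    // The default argument lets Element be default-constructed, which
    // std::set needs because Element is also used as the comparator type.
    Element(int v = 0) : value(v) {}
    bool operator() (Element* const& left, Element* const& right) const
    {
        if (left->value == right->value)
            return std::less<Element*>()(left, right); // tie-break on address
        return left->value < right->value;
    }
};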
It will store duplicate values like a multiset, but erase will not remove every equal-valued element, insert will not reject or replace equal values, and find locates the exact pointer.
std::set<Element*, Element> set;
set.insert(new Element(10));
auto last = new Element(10);
set.insert(last); // 10 10 like in multiset
set.erase(last); // will delete proper references

Implementation of a contiguous (flat) unordered container

I am trying to implement or conceptually design a container that has contiguous memory but where the element order is unimportant (and that is exploited for insertion/removal of objects).
This is something that is similar to std::vector, but lifting the constraint that when an element is removed the relative order of the other elements is preserved, as in this case the last element can be put in place of the removed one.
I more or less know how to implement it (based on std::vector and some special back referenced iterator) but I am looking for a reference implementation to avoid reinventing the wheel.
I am familiar with Boost.Container, but I didn't find such container.
boost::container::flat_set is close, but it maintains the order, which is unnecessary. In some sense, I am looking for some sort of "boost::container::unordered_flat_set" or "unordered_vector".
This is the behavior that I expect:
unordered_flat_set<T> ufs(100); // allocates 100 elements
ufs.reserve(120);
unordered_flat_set<T>::iterator it = ...; // find something
ufs.erase(it); // overwrite last element to that position, destroy last element
ufs.insert(T{}); //add element at "end", only if necessary reallocate, keep buffer memory in multiples of 2 (or 1.6). Element order is not fundamental, can be altered completely by a call to "erase".
ufs.size(); // report size
Both erase and insert are O(1), (unless reallocation is necessary).
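The O(1) erase described here is often called "swap and pop". A minimal sketch on a plain std::vector follows; the helper name unordered_erase is hypothetical, and it assumes T is move-assignable.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Removes the element at `index` in O(1) by moving the last element into
// its slot and shrinking the vector by one. Element order is not preserved.
// Precondition: index < v.size().
template <class T>
void unordered_erase(std::vector<T>& v, std::size_t index) {
    if (index + 1 != v.size()) {
        v[index] = std::move(v.back());  // skip self-assignment for the last slot
    }
    v.pop_back();
}
```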
Is this a concept that does not already exist in standard or non-standard containers?
(Perhaps it is the concept of being unordered that doesn't play well with the current containers.
After all, the only "unordered" containers at present are std::unordered_set and its relatives, and they are fairly new.)
This is a very minimal reference implementation, mainly to give a concrete realization of the concept I am looking for. In fact, I want to find out whether the concept already exists so that I can apply it to an existing code base.
I am not trying to reinvent the wheel.
#include<cassert> // for assert() used in main
#include<iostream>
#include<vector>
template<class T>
class unordered_vector{
std::vector<T> impl_;
public:
unordered_vector(){}
void reserve(int i){impl_.reserve(i);}
struct iterator{
std::vector<T>* back_ptr;
int i;
T& operator*(){return back_ptr->operator[](i);}
iterator operator++(){++i; return *this;}
iterator operator--(){--i; return *this;}
bool operator==(iterator const& other) const{return back_ptr == other.back_ptr and i == other.i;}
bool operator!=(iterator const& other) const{return not(*this == other);}
};
int size(){return impl_.size();}
iterator erase(iterator it){
if(it.i != size() - 1) *it = impl_.back(); // overwrite the erased slot with the last element; should I use placement new here so as not to rely on assignment?
impl_.pop_back(); // destroy the now redundant last element
return it; // returned for compatibility: refers to the moved-in element, or equals end()
}
iterator insert(T t){
impl_.push_back(t); return {&impl_, size()-1};
}
iterator begin(){return {&impl_, 0};} // does an unordered container have a begin ?? ok, for compatibility, like std::unordered_set
iterator end(){return {&impl_, (int)impl_.size()};} // same question,
T& operator[](int i){return impl_[i];} // same question, if it is unordered v[i] has not a "salient" meaning.
};
int main(){
unordered_vector<double> uv;
uv.reserve(10);
uv.insert(1.1);
uv.insert(2.3);
uv.insert(5.4);
uv.insert(3.1);
std::cout << uv.size() << std::endl;
auto it = uv.begin();
assert( uv.begin() != uv.end());
assert( it != uv.end() );
for(auto it = uv.begin(); it != uv.end(); ++it){
std::cout << *it << std::endl;
}
}
Please see the sfl library that I recently uploaded to GitHub:
https://github.com/slavenf/sfl-library
It is a C++11 header-only library that offers flat ordered and unordered containers that store elements contiguously in memory. All containers meet the requirements of Container, AllocatorAwareContainer and ContiguousContainer.

Iterate over std::multimap and delete certain entries

I want to iterate over all items in a std::multimap (all values of all keys), and delete all entries that satisfy some condition:
#include <map>
typedef int KEY_TYPE;
typedef int VAL_TYPE;
bool shouldRemove(const KEY_TYPE&, const VAL_TYPE&);
void removeFromMap(std::multimap<KEY_TYPE,VAL_TYPE>& map){
for (auto it = map.begin(); it != map.end(); it++){
if (shouldRemove(it->first,it->second))
map.erase(it);
}
}
The iteration works unless the first item gets deleted, in which case the following error is raised:
map/set iterator not incrementable
How can the removeFromMap function be rewritten in order to work properly? The code should work for all kinds of key- and value types of the map.
I am using C++ 11 and Visual Studio 2013.
You need to increment your iterator before you do the erase. When you do map.erase(it); the iterator it becomes invalid. However, other iterators in the map will still be valid. Therefore you can fix this by doing a post-increment on the iterator...
auto it = map.begin();
const auto end = map.end();
while (it != end)
{
if (shouldRemove(it->first,it->second))
{
map.erase(it++);
// ^^ Note the increment here.
}
else
{
++it;
}
}
The post-increment applied to it inside the map.erase() parameters will ensure that it remains valid after the item is erased by incrementing the iterator to point to the next item in the map just before erasing.
map.erase(it++);
... is functionally equivalent to...
auto toEraseIterator = it; // Remember the iterator to the item we want to erase.
++it; // Move to the next item in the map.
map.erase(toEraseIterator); // Erase the item.
As @imbtfab points out in the comments, you can also use it = map.erase(it) to do the same thing in C++11 without the need for post-incrementing.
Note also that the for loop has now been changed to a while loop since we're controlling the iterator manually.
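For reference, the C++11 form that relies on the iterator returned by erase() could look like the sketch below. The body of shouldRemove is a stand-in invented for illustration (it removes negative values); the question only declares it.

```cpp
#include <map>

// Stand-in predicate for illustration only: remove entries whose value
// is negative.
bool shouldRemove(const int& /*key*/, const int& value) {
    return value < 0;
}

void removeFromMap(std::multimap<int, int>& map) {
    for (auto it = map.begin(); it != map.end(); ) {
        if (shouldRemove(it->first, it->second))
            it = map.erase(it);  // C++11: erase returns the following iterator
        else
            ++it;                // only advance when nothing was erased
    }
}
```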
Additionally, if you're looking to make your removeFromMap function as generic as possible, you should consider using template parameters and passing your iterators in directly, rather than passing in a reference to the multimap. This will allow you to use any map-style container type, rather than forcing a multimap to be passed in.
e.g.
template <typename Iterator>
void removeFromMap(Iterator it, const Iterator &end){
...
}
This is how the standard C++ <algorithm> functions do it also (e.g. std::sort(...)).
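Because erase() is a member function of the container, a generic version that actually removes elements needs access to the container as well as the iterators. One way to get that, sketched here under the assumption of C++11 (where erase(iterator) returns the next iterator), is to make the container type and the predicate template parameters. The names eraseIf and dropNegativeValues are hypothetical.

```cpp
#include <map>
#include <utility>

// Hypothetical helper: erases every element for which pred(element) is true.
// Assumes Container::erase(iterator) returns the iterator after the erased
// element, as the C++11 associative containers and std::list do.
template <typename Container, typename Predicate>
void eraseIf(Container& c, Predicate pred) {
    for (auto it = c.begin(); it != c.end(); ) {
        if (pred(*it))
            it = c.erase(it);
        else
            ++it;
    }
}

// Example use with a multimap: drop every entry whose value is negative.
void dropNegativeValues(std::multimap<int, int>& m) {
    eraseIf(m, [](const std::pair<const int, int>& kv) { return kv.second < 0; });
}
```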

Trying to get a vector iterator to iterate in a different order

My question is twofold:
1. I have a vector of objects and a vector of integers, and I want to iterate over my object vector in the order given by the integer vector. For example, if {water,juice,milk,vodka} is my object vector and {1,0,3,2} is my integer vector, I want a const iterator over my object vector that yields juice first, then water, then vodka, and milk last. Is there a simple way of doing this?
2. Suppose I have a function returning a const iterator (itr) to an unknown (but accessible) vector. That is, I can use itr.getvalue(), but I don't have the size of the vector I'm iterating over. Is there a way to write a while loop and detect the end of the vector using only the iterator?
Question 1:
Omitting most of the boilerplate needed for a proper iterator, the following is how it would work:
template<typename Container, typename Iterator>
class index_iterator
{
public:
typedef typename Container::value_type value_type;
index_iterator(Container& c, Iterator iter):
container(c),
iterator(iter)
{
}
value_type& operator*() { return container[*iterator]; }
index_iterator& operator++() { ++iterator; return *this; }
bool operator==(index_iterator const& other)
{
return &container == &other.container && iterator == other.iterator;
}
// ...
private:
Container& container;
Iterator iterator;
};
template<typename C, typename I>
index_iterator<C, I> indexer(C& container, I iter)
{
return index_iterator<C, I>(container, iter);
}
Then you could write e.g.
std::vector<std::string> vs;
std::vector<int> vi;
// fill vs and vi
std::copy(indexer(vs, vi.begin()),
indexer(vs, vi.end()),
std::ostream_iterator<std::string>(std::cout, " "));
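For completeness, here is one way the omitted boilerplate might be filled in so that the class works with std::copy. This is a sketch, not the answer's original code: it declares the iterator typedefs by hand, models only an input iterator, and stores a pointer instead of a reference so that the iterator is assignable.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

template <typename Container, typename IndexIterator>
class index_iterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef typename Container::value_type value_type;
    typedef std::ptrdiff_t difference_type;
    typedef value_type* pointer;
    typedef value_type& reference;

    index_iterator(Container& c, IndexIterator iter)
        : container(&c), iterator(iter) {}

    // Dereferencing looks up the container element at the current index.
    reference operator*() const { return (*container)[*iterator]; }
    pointer operator->() const { return &**this; }

    index_iterator& operator++() { ++iterator; return *this; }
    index_iterator operator++(int) {
        index_iterator old(*this);
        ++iterator;
        return old;
    }

    bool operator==(const index_iterator& other) const {
        return container == other.container && iterator == other.iterator;
    }
    bool operator!=(const index_iterator& other) const {
        return !(*this == other);
    }

private:
    Container* container;    // pointer rather than reference: keeps it assignable
    IndexIterator iterator;  // position within the index sequence
};

template <typename C, typename I>
index_iterator<C, I> indexer(C& container, I iter) {
    return index_iterator<C, I>(container, iter);
}
```

No bounds checking is done, so every index is assumed to be a valid position in the container.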
Question 2:
No, it isn't possible.
1
#include <iostream>
#include <string>
#include <vector>
int main() {
    std::vector<std::string> foods{"water", "juice", "milk", "vodka"};
    std::vector<unsigned int> indexes{1,0,3,2};
    for (unsigned int i : indexes) { // ranged-for; use normal iteration if you must
        std::cout << foods[i] << " ";
    }
    // Output: juice water vodka milk
}
If you really want to wrap this behaviour into a single iterator for foods, this can be done but it gets a bit more complicated.
2
suppose I have a function returning const iterator (itr) to a unknown (but accessible) vector meaning, I can use (itr.getvalue()) but i don't have the size of the vector I'm iterating on, is there a way to make a while loop and know the end or the vector by iterator means?
If you don't have the vector for its size, and you don't have the vector's end iterator then, no, you can't. You can't reliably iterate over anything with just one iterator; you need a pair or a distance.
Others have already covered number 1. For number 2, it basically comes down to a question of what you're willing to call an iterator. It's certainly possible to define a class that will do roughly what you're asking for -- a single object that both represents a current position and has some way of figuring out when it's been incremented as much as possible.
Most people would call that something like a range rather than an iterator, though. You'd have to use it somewhat differently from a normal iterator. Most iterators are used by explicitly comparing them to another iterator representing the end of the range. In this case, you'd pass two separate positions when you created the "iterator" (one for the beginning/current position, the other for the end position) and you'd overload operator bool (the most obvious choice) to indicate whether the current position still lies before the end. You'd use it something like: while (my_iterator) operate_on(*my_iterator++); which is quite a bit different from using a normal iterator.
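A minimal sketch of such a range-like object, assuming C++11 for explicit operator bool, is shown below. The class name cursor and the helper sum are hypothetical.

```cpp
#include <iterator>
#include <vector>

// Hypothetical "cursor": holds both the current and the end position, so it
// can report by itself whether any elements are left.
template <typename Iterator>
class cursor {
public:
    cursor(Iterator first, Iterator last) : current(first), end(last) {}

    // True while the current position has not reached the end.
    explicit operator bool() const { return current != end; }

    typename std::iterator_traits<Iterator>::reference operator*() const {
        return *current;
    }
    cursor& operator++() { ++current; return *this; }

private:
    Iterator current;
    Iterator end;
};

// Usage: the loop condition tests the cursor itself, not a second iterator.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (cursor<std::vector<int>::const_iterator> c(values.begin(), values.end());
         c; ++c) {
        total += *c;
    }
    return total;
}
```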
I wish to have a const iterator for my object vector that will have
juice for the first object
typedef std::vector<Drink> Drinks;
Drinks drinks;
drinks.push_back("water");
drinks.push_back("juice");
drinks.push_back("milk");
drinks.push_back("vodka");
Drinks::const_iterator i = drinks.begin();
const iterator (itr) to a unknown (but accessible) vector
Drinks::const_iterator itr = some_func();
while (itr != drinks.end()) {
doStuff;
++itr;
}

What's wrong with my vector<T>::erase here?

I have two vector<T> in my program, called active and non_active respectively. This refers to the objects it contains, as to whether they are in use or not.
I have some code that loops the active vector and checks for any objects that might have gone non active. I add these to a temp_list inside the loop.
Then after the loop, I take my temp_list and do non_active.insert of all elements in the temp_list.
After that, I call erase on my active vector and pass it the temp_list range to erase.
For some reason, however, the erase crashes.
This is the code:
non_active.insert(non_active.begin(), temp_list.begin(), temp_list.end());
active.erase(temp_list.begin(), temp_list.end());
I get this assertion:
Expression:("_Pvector == NULL || (((_Myvec*)_Pvector)->_Myfirst <= _Ptr && _Ptr <= ((_Myvect*)_Pvector)->_Mylast)",0)
I've looked online and seen that there is an erase-remove idiom, but I'm not sure how I'd apply it to removing a range of elements from a vector<T>.
I'm not using C++11.
erase expects a range of iterators passed to it that lie within the current vector. You cannot pass iterators obtained from a different vector to erase.
Here is a possible, but inefficient, C++11 solution supported by lambdas:
active.erase(std::remove_if(active.begin(), active.end(), [&temp_list](const T& x)
{
return std::find(temp_list.begin(), temp_list.end(), x) != temp_list.end();
}), active.end());
And here is the equivalent C++03 solution without the lambda:
template<typename Container>
class element_of
{
Container& container;
public:
element_of(Container& container) : container(container) {}
template<typename T>
bool operator()(const T& x) const
{
return std::find(container.begin(), container.end(), x)
!= container.end();
}
};
// ...
active.erase(std::remove_if(active.begin(), active.end(),
element_of<std::vector<T> >(temp_list)),
active.end());
If you replace temp_list with a std::set and the std::find call with a call to the set's find member function, the performance should be acceptable.
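A sketch of that set-based variant, kept C++03-compatible since the question rules out C++11, might look like this. The names in_set and erase_all_in are hypothetical, and T is assumed to be usable with operator< so that it can be stored in a std::set.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Functor that reports whether a value is contained in a given set.
// Each lookup is O(log n) instead of the O(n) scan done by std::find.
template <typename T>
class in_set {
    const std::set<T>& lookup;
public:
    explicit in_set(const std::set<T>& s) : lookup(s) {}
    bool operator()(const T& x) const {
        return lookup.find(x) != lookup.end();
    }
};

// Removes from `active` every element that appears in `to_remove`.
template <typename T>
void erase_all_in(std::vector<T>& active, const std::set<T>& to_remove) {
    active.erase(std::remove_if(active.begin(), active.end(),
                                in_set<T>(to_remove)),
                 active.end());
}
```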
The erase method is intended to accept iterators into the same container object. You're passing iterators into temp_list to erase elements from active, which is not allowed, and for good reason: a sequence's range erase specifies a range of positions within that sequence to remove. If the iterators could come from elsewhere, they would describe a range of values to search for and erase rather than a range of positions, which is a much more costly operation.
The type of logic you're trying to perform suggests to me that a set or list might be better suited for the purpose. That is, you're trying to erase various elements from the middle of a container that match a certain condition and transfer them to another container, and you could eliminate the need for temp_list this way.
With list, for example, it could be as easy as this:
for (ActiveList::iterator it = active.begin(); it != active.end();)
{
if (it->no_longer_active())
{
inactive.push_back(*it);
it = active.erase(it);
}
else
++it;
}
However, sometimes vector can outperform these solutions, and maybe you have need for vector for other reasons (like ensuring contiguous memory). In that case, std::remove_if is your best bet.
Example:
bool not_active(const YourObjectType& obj);
active_list.erase(
remove_if(active_list.begin(), active_list.end(), not_active),
active_list.end());
More info on this can be found under the topic, 'erase-remove idiom' and you may need predicate function objects depending on what external states are required to determine if an object is no longer active.
You can actually make the erase/remove idiom usable for your case. You just need to move the value over to the other container before std::remove_if possibly shuffles it around: in the predicate.
template<class OutIt, class Pred>
struct copy_if_predicate{
copy_if_predicate(OutIt dest, Pred p)
: dest(dest), pred(p) {}
template<class T>
bool operator()(T const& v){
if(pred(v)){
*dest++ = v;
return true;
}
return false;
}
OutIt dest;
Pred pred;
};
template<class OutIt, class Pred>
copy_if_predicate<OutIt,Pred> copy_if_pred(OutIt dest, Pred pred){
return copy_if_predicate<OutIt,Pred>(dest,pred);
}
The function std::vector::erase requires the iterators to be iterators into this vector, but you are passing iterators from temp_list. You cannot erase elements from a container that are in a completely different container.
active.erase(temp_list.begin(), temp_list.end());
You are trying to erase elements from one container using iterators into a second one. Iterators into the second container do not refer to positions in the first.
I would like to suggest that this is an example of where std::list should be used. You can splice members from one list to another. Look at std::list::splice() for this.
Do you need random access? If not then you don't need a std::vector.
Note that with list, when you splice, your iterators, and references to the objects in the list remain valid.
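A minimal sketch of the splice approach follows. The Item type with its in_use flag and the function name move_inactive are hypothetical stand-ins for the question's objects.

```cpp
#include <list>

// Hypothetical element type standing in for the question's objects.
struct Item {
    int id;
    bool in_use;
};

// Moves every item that is no longer in use from `active` to the end of
// `inactive`. splice() relinks list nodes, so no Item is copied and
// pointers or references to the moved items stay valid.
void move_inactive(std::list<Item>& active, std::list<Item>& inactive) {
    for (std::list<Item>::iterator it = active.begin(); it != active.end(); ) {
        std::list<Item>::iterator next = it;
        ++next;  // remember the successor before `it` leaves `active`
        if (!it->in_use) {
            inactive.splice(inactive.end(), active, it);
        }
        it = next;
    }
}
```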
If you don't mind making the implementation "intrusive", your objects can contain their own iterator value, so they know where they are. Then when they change state, they can automate their own "moving" from one list to the other, and you don't need to traverse the whole list for them. (If you want this sweep to happen later, you can get them to "register" themselves for later moving.)
I will write an algorithm here now to run through one collection and if a condition exists, it will effect a std::remove_if but at the same time will copy the element into your "inserter".
//fwd iterator must be writable
template< typename FwdIterator, typename OutputIterator, typename Pred >
FwdIterator copy_and_remove_if( FwdIterator inp, FwdIterator end, OutputIterator outp, Pred pred )
{
for( FwdIterator test = inp; test != end; ++test )
{
if( pred(*test) ) // insert
{
*outp = *test;
++outp;
}
else // keep
{
if( test != inp )
{
*inp = *test;
}
++inp;
}
}
return inp;
}
This is a bit like std::remove_if but will copy the ones being removed into an alternative collection. You would invoke it like this (for a vector) where isInactive is a valid predicate that indicates it should be moved.
active.erase( copy_and_remove_if( active.begin(), active.end(), std::back_inserter(inactive), isInactive ), active.end() );
The iterators you pass to erase() should point into the vector itself; the assertion is telling you that they don't. This version of erase() is for erasing a range out of the vector.
You need to iterate over temp_list yourself and, for each element, locate it in active (for example with std::find) and pass the resulting iterator to active.erase().