About erasing an iterator of a boost multi-index - c++

I'd like to delete some elements from a Boost multi-index container by erasing iterators while traversing the collection.
What I am not sure about is whether any iterator invalidation is involved, and whether my code below would invalidate the first and last iterators.
If the code below is incorrect, what is the best way to do this, given the specific index (ordered_unique) used below?
#include <iostream>
#include <stdint.h>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/key_extractors.hpp>
#include <boost/shared_ptr.hpp>
using namespace std;
class MyClass{
public:
MyClass(int32_t id) : id_(id) {}
int32_t id() const
{ return id_; }
private:
int32_t id_;
};
typedef boost::shared_ptr<MyClass> MyClass_ptr;
typedef boost::multi_index_container<
MyClass_ptr,
boost::multi_index::indexed_by<
boost::multi_index::ordered_unique<
boost::multi_index::const_mem_fun<MyClass,int32_t,&MyClass::id>
>
>
> Coll;
int main() {
Coll coll;
// ..insert some entries 'coll.insert(MyClass_ptr(new MyClass(12)));'
Coll::iterator first = coll.begin();
Coll::iterator last = coll.end();
while(first != last) {
if((*first)->id() == 3)
coll.erase(first++);
else
++first;
}
}

The reason that erase for containers returns an iterator is to use that result:
first = coll.erase(first);
Then you don't have to worry about how the underlying implementation handles erase or whether it shifts elements around. (In a vector, for instance, your code would have skipped an element during iteration.) However, the documentation does state that:
It is tempting to see random access indices as an analogue of std::vector for use in Boost.MultiIndex, but this metaphor can be misleading, as both constructs, though similar in many respects, show important semantic differences. An advantage of random access indices is that their iterators, as well as references to their elements, are stable, that is, they remain valid after any insertions or deletions.
Still, just seeing coll.erase(first++) is a red flag for me, so I prefer to do it the other way.
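A minimal sketch of the "use the iterator that erase returns" idiom, written generically. The names erase_while_visiting and erase_threes are hypothetical, and std::set stands in for the question's Coll so that the snippet needs only the standard library. The loop relies only on erase(iterator) returning the iterator that follows the removed element, which the ordered indices of Boost.MultiIndex also provide.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: erases every element for which pred(element) is true
// and returns how many elements were removed.
template<typename Container, typename Pred>
std::size_t erase_while_visiting(Container& c, Pred pred)
{
    std::size_t erased = 0;
    for (typename Container::iterator it = c.begin(); it != c.end(); ) {
        if (pred(*it)) {
            it = c.erase(it);   // continue from the iterator erase hands back
            ++erased;
        } else {
            ++it;
        }
    }
    return erased;
}

// Usage with a standard container standing in for the question's Coll.
inline std::size_t erase_threes(std::set<int>& ids)
{
    return erase_while_visiting(ids, [](int id) { return id == 3; });
}
```

With the container from the question, the predicate would presumably take a const MyClass_ptr& and test p->id() == 3.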

Related

Are boost::multi_index iterators invalidated when erasing or modifying values that are the key of a different index?

In testing it seems to work fine, but I could not find any mention of the expected behaviour in the documentation.
Essentially, if my multi_index_container has 2 ordered_non_unique indices using keys A and B respectively, if I iterate over a range from A and modify the B value (that might cause re-ordering), are the iterators for A invalidated?
Iterators are never invalidated as long as the element is not erased. Please note that invalidation is not the same as repositioning (caused by re-ordering).
Iterators to an index dependent on key A will not be invalidated nor repositioned (i.e., the index keeps its order) upon changes on a different key B, as long as the affected element is not erased (which can happen if the index dependent on key B is unique).
If you want to safely range over an A-index modifying B keys even in the case of erasures, you can do as exemplified below:
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/key.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <iostream>
#include <iterator>
using namespace boost::multi_index;
struct element
{
int a;
int b;
};
using container=multi_index_container<
element,
indexed_by<
ordered_unique<key<&element::a>>,
ordered_unique<key<&element::b>>
>
>;
int main()
{
container c={{0,0},{1,1},{2,2},{3,3},{4,4},{5,5}};
auto print=[](auto& c){
for(const auto& x:c)std::cout<<"{"<<x.a<<","<<x.b<<"}";
std::cout<<"\n";
};
std::cout<<"before: ";
print(c);
for(auto first=c.begin(),last=c.end();first!=last;){
// we get next position now in case first will be invalidated
auto next=std::next(first);
c.modify(first,[](auto& x){
x.b*=2;
});
first=next;
}
std::cout<<"after: ";
print(c);
}
Output
before: {0,0}{1,1}{2,2}{3,3}{4,4}{5,5}
after: {0,0}{3,6}{4,8}{5,10}
Expanded answer: When you're modifying the key of the index you're ranging on, you can either do a first pass to store all the iterators in the range before doing any actual modification (see modify_unstable_range here) or, in case you want to do the thing in just one pass, store the addresses of modified elements along the way to avoid revisitation:
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/key.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <iostream>
#include <iterator>
#include <unordered_set>
using namespace boost::multi_index;
struct element
{
int a;
int b;
};
using container=multi_index_container<
element,
indexed_by<
ordered_unique<key<&element::a>>,
ordered_unique<key<&element::b>>
>
>;
int main()
{
container c={{0,0},{1,1},{2,2},{3,3},{4,4},{5,5}};
auto print=[](auto& c){
for(const auto& x:c)std::cout<<"{"<<x.a<<","<<x.b<<"}";
std::cout<<"\n";
};
std::cout<<"before: ";
print(c);
std::unordered_set<const element*> visited;
for(auto first=c.begin(),last=c.end();first!=last;){
// we get next position now before first is invalidated/repositioned
auto next=std::next(first);
if(c.modify(first,[](auto& x){
x.a*=2; // note we're modifying the key of the index we're at
})){
// element successfully modified, store address to avoid revisitation
visited.insert(&*first);
}
// move to next nonvisited element
first=next;
while(first!=last&&visited.find(&*first)!=visited.end())++first;
}
std::cout<<"after: ";
print(c);
}
Output
before: {0,0}{1,1}{2,2}{3,3}{4,4}{5,5}
after: {0,0}{6,3}{8,4}{10,5}

C++: 'unique vector' data structure

I need a data structure like std::vector or std::list whose elements are unique. Most of the time I will call push_back on it, and sometimes maybe erase. When I insert an element that is already there, I need to be notified, either by a boolean or by an exception.
And the most important property it should have is that it preserves the order of insertion. Each time I iterate over it, it should return the elements in the order they were inserted.
Another way to think of it: a queue that guarantees the uniqueness of its elements. But I don't want to pop elements; instead I want to iterate over them just as we do for a vector or list.
What is the best data structure for my needs?
You can use a std::set.
Its insert method returns a std::pair<iterator,bool>. The bool is false when the element already exists in the set (the element is not added in that case).
Use a struct with a regular std::vector and a std::set.
When you push, check the set for existence of the element. When you need to iterate, iterate over the vector. If you need to erase from the vector, also erase from the set.
Basically, use the set as an aside, only for fast "presence of an element" check.
#include <set>
#include <vector>
// written as a template here; a fixed element type works just as well
template<typename T>
class unique_vector
{
public:
// implement the various operations you need, like push_back and operator[]
// alternatively, consider inheriting from std::vector
private:
std::set<T> m_set; // fast lookup for existence of elements
std::vector<T> m_vector; // vector of elements
};
I would prefer using std::unordered_set to track the elements already stored in the std::vector: its lookup time is O(1) on average, while the lookup time of std::set is O(log n).
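The set-plus-vector idea from the answers above could be filled in roughly as follows. This is only a sketch under assumptions: the member names (push_back, erase, seen_, items_) are illustrative, std::unordered_set is used for the presence check as suggested, and T is assumed to be hashable and equality comparable.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

template<typename T>
class unique_vector {
public:
    typedef typename std::vector<T>::const_iterator const_iterator;

    // Appends x and returns true, or returns false if x is already present.
    bool push_back(const T& x) {
        if (!seen_.insert(x).second) return false;
        items_.push_back(x);
        return true;
    }

    // Removes x if present; the search in the vector part is linear.
    bool erase(const T& x) {
        if (seen_.erase(x) == 0) return false;
        items_.erase(std::find(items_.begin(), items_.end(), x));
        return true;
    }

    // Iteration follows insertion order because it walks the vector.
    const_iterator begin() const { return items_.begin(); }
    const_iterator end() const { return items_.end(); }
    std::size_t size() const { return items_.size(); }

private:
    std::unordered_set<T> seen_;  // presence check only
    std::vector<T> items_;        // elements in insertion order
};
```

Note that each element is stored twice here, which may matter for large or expensive-to-copy types.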
You can use Boost.MultiIndex for this:
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/sequenced_index.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/identity.hpp>
using namespace boost::multi_index;
template<typename T>
using unique_list=multi_index_container<
T,
indexed_by<
sequenced<>,
hashed_unique<identity<T>>
>
>;
#include <iostream>
int main()
{
unique_list<int> l;
auto print=[&](){
const char* comma="";
for(const auto& x:l){
std::cout<<comma<<x;
comma=",";
}
std::cout<<"\n";
};
l.push_back(0);
l.push_back(1);
l.push_back(2);
l.push_back(0);
l.push_back(2);
l.push_back(4);
print();
}
Output
0,1,2,4

Creating map of boost::tuple<std::string, std::string, int> and std::vector<int>

I want to create a map whose key is a combination of two strings and one int, and whose value can be multiple ints for a given key.
So I tried to create a map from boost::tuple to std::vector. I wrote a sample program for this, shown below:
#include "stdafx.h"
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <functional>
#include <boost/tuple/tuple.hpp>
#include <boost/unordered_map.hpp>
using namespace std;
typedef boost::tuple<std::string, std::string, int> tpl_t;
struct key_hash : public std::unary_function<tpl_t, std::size_t>
{
std::size_t operator()(const tpl_t& k) const
{
return boost::get<0>(k)[0] ^ boost::get<1>(k)[0] ^ boost::get<2>(k);
}
};
struct key_equal : public std::binary_function<tpl_t, tpl_t, bool>
{
bool operator()(const tpl_t& v0, const tpl_t& v1) const
{
return (
boost::get<2>(v0) == boost::get<2>(v1) &&
boost::get<0>(v0) == boost::get<0>(v1) &&
boost::get<1>(v0) == boost::get<1>(v1)
);
}
};
typedef boost::unordered_map<tpl_t, std::vector<int>, key_hash,key_equal> map_t;
void function1(map_t& myMap, std::string file, std::string txt, int num1, int num2)
{
tpl_t key = boost::make_tuple(file, txt, num1);
map_t::iterator itr = myMap.find(key);
if(itr != myMap.end())
{
itr->second.push_back(num2);
}
else
{
std::vector<int> num2Vec;
num2Vec.push_back(num2);
myMap.insert(std::make_pair(boost::make_tuple(file,txt,num1),num2Vec));
}
}
int main()
{
map_t myMap;
function1(myMap, "file1", "text", 5, 10);
function1(myMap, "file1", "text_t", 5, 30);
function1(myMap, "file2", "text", 5, 50);
}
This program is working fine but I want to know if there is any better way to do this. I am worried about performance as size of map can grow to anything. I have not measured performance though.
Thanks,
Shrik
I am worried about performance as size of map can grow to anything. I have not measured performance though.
You should worry about having a performance measurement in place before you worry about the eventuality of your design being unsuitable for the task.
Design a number of use cases, and create a sample data distribution - the common cases for composite keys and values, a standard deviation to each side, and the tails. Consider not only the data set itself but also its setup and use profile - the frequency of inserts, searches, removals.
That said, overall your approach of modelling the composite key as a tuple is sensible, although, subjectively, I'd prefer a struct instead, but that's a very minor comment.
For the values, consider using a multimap instead of a map with a vector. This will likely be faster, but it depends on the number of values.
Here are a couple more things to consider:
Does your (multi-)map have to be ordered or can it remain unordered? Depending on the use profile, an unordered map could be significantly faster.
Could you use your knowledge of the keys to make comparisons faster? For example, if the strings are long and rarely vary (e.g., almost always "file1"), could you benefit from comparing the integer part of the two keys first?
Could you benefit by having a hierarchical map instead of a map with a composite key?
It's better to have many of the questions above answered with sample data and scenarios which form part of the test suite for your program. That way you can observe the changes to performance as you change your data structures.
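Several of the suggestions above (a struct instead of a tuple, a multimap instead of a map of vectors, comparing the integer first) can be combined in one small sketch. All names here (Key, KeyHash, MultiMap, add, values_for) are hypothetical, and the way the hash mixes the three members is only an illustrative choice. Whether it is actually faster than the original should be decided by measurement.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical key type standing in for the boost::tuple in the question.
struct Key {
    std::string file;
    std::string text;
    int num;
};

// Compare the cheap integer first, then the strings.
inline bool operator==(const Key& x, const Key& y) {
    return x.num == y.num && x.file == y.file && x.text == y.text;
}

struct KeyHash {
    std::size_t operator()(const Key& k) const {
        // Mix all three members; the multiplier is an illustrative choice.
        std::size_t h = std::hash<int>()(k.num);
        h = h * 31 + std::hash<std::string>()(k.file);
        h = h * 31 + std::hash<std::string>()(k.text);
        return h;
    }
};

// One entry per (key, value) pair instead of one vector per key.
using MultiMap = std::unordered_multimap<Key, int, KeyHash>;

inline void add(MultiMap& m, const std::string& file, const std::string& text,
                int num1, int num2) {
    m.emplace(Key{file, text, num1}, num2);
}

// Collects every value stored under k.
inline std::vector<int> values_for(const MultiMap& m, const Key& k) {
    std::vector<int> out;
    auto range = m.equal_range(k);
    for (auto it = range.first; it != range.second; ++it)
        out.push_back(it->second);
    return out;
}
```

Unlike the hash in the question, this one uses the whole of both strings, so keys that share a first character do not automatically collide.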

std::insert_iterator and iterator invalidation

I tried writing a generic, in place, intersperse function. The function should intersperse a given element into a sequence of elements.
#include <vector>
#include <list>
#include <algorithm>
#include <iostream>
#include <iterator>
template<typename ForwardIterator, typename InserterFunc>
void intersperse(ForwardIterator begin, ForwardIterator end, InserterFunc ins,
// we cannot use rvalue references here,
// maybe taking by value and letting users feed in std::ref would be smarter
const typename ForwardIterator::value_type& elem) {
if(begin == end) return;
while(++begin != end) {
// bugfix would be something like:
// begin = (ins(begin) = elem); // insert_iterator is convertible to a normal iterator
// or
// begin = (ins(begin) = elem).iterator(); // get the iterator to the last inserted element
// begin now points to the inserted element and we need to
// increment the iterator once again, which is safe
// ++begin;
ins(begin) = elem;
}
}
int main()
{
typedef std::list<int> container;
// as expected tumbles, falls over and goes up in flames with:
// typedef std::vector<int> container;
typedef container::iterator iterator;
container v{1,2,3,4};
intersperse(v.begin(), v.end(),
[&v](iterator it) { return std::inserter(v, it); },
23);
for(auto x : v)
std::cout << x << std::endl;
return 0;
}
The example works only for containers that do not invalidate their
iterators on insertion. Should I simply get rid of the iterators and
accept a container as the argument or am I missing something about
insert_iterator that makes this kind of usage possible?
The example works only for containers that do not invalidate their iterators on insertion.
Exactly.
Should I simply get rid of the iterators and accept a container as the argument
That would be one possibility. Another would be not making the algorithm in-place (i.e., writing the output to a different container or output iterator).
am I missing something about insert_iterator that makes this kind of usage possible?
No. insert_iterator is meant for repeated inserts at a single place in a container, e.g. by a transform algorithm.
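As an illustration of that intended use, here is a small sketch in which std::transform writes all of its results through one std::inserter positioned in the middle of a vector. The function name insert_doubled and the doubling lambda are made up for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: inserts 2*x for every x in src into dst, starting at
// index pos, and returns the result.
inline std::vector<int> insert_doubled(std::vector<int> dst,
                                       std::vector<int>::difference_type pos,
                                       const std::vector<int>& src)
{
    // Every write goes through the same insert iterator, so the inserted
    // values end up next to each other, in source order.
    std::transform(src.begin(), src.end(),
                   std::inserter(dst, dst.begin() + pos),
                   [](int x) { return 2 * x; });
    return dst;
}
```

For example, insert_doubled({1, 5}, 1, {1, 2, 3}) yields 1 2 4 6 5.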
The problems with your implementation have absolutely nothing to do with the properties of insert_iterator. All kinds of insert iterators in the C++ standard library are guaranteed to remain valid, even if you perform insertion into a container that potentially invalidates iterators on insert. This is, of course, true only if all insertions are performed through the insert iterator.
In other words, the implementation of insert iterators guarantees that the iterator will automatically "heal" itself, even if the insertion leads to a potentially iterator-invalidating event in the container.
The problem with your code is that begin and end iterators can potentially get invalidated by insertion into certain container types. It is begin and end that you need to worry about in your code, not the insert iterator.
Meanwhile, you do it completely backwards for some reason. You seem to care about refreshing the insert iterator (which is completely unnecessary), while completely ignoring begin and end.
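One way to act on this, sketched under the assumption that taking the container itself is acceptable (the option raised in the question): refresh the position from the iterator that insert returns and re-read end() on every pass, so that no stale begin or end is ever used. The signature below is an illustrative rewrite, not the asker's original interface.

```cpp
#include <list>
#include <vector>

// In-place intersperse that takes the container instead of an iterator pair.
template<typename Container>
void intersperse(Container& c, const typename Container::value_type& elem)
{
    typename Container::iterator it = c.begin();
    if (it == c.end()) return;
    ++it;
    while (it != c.end()) {      // end() is re-evaluated after every insert
        it = c.insert(it, elem); // valid iterator to the inserted element
        ++it;                    // back on the element that followed it
        ++it;                    // next gap
    }
}

// Example call on a container that does invalidate iterators on insertion.
inline std::vector<int> interspersed_example()
{
    std::vector<int> v{1, 2, 3, 4};
    intersperse(v, 23);
    return v; // 1 23 2 23 3 23 4
}
```

The same function works unchanged for std::list, since it only relies on insert(iterator, value) returning an iterator to the new element.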

How to write standard C++ iterator?

I have the following simple Graph class, where for each Node, I store a set of outgoing Arcs:
#include <iostream>
#include <vector>
#include <map>
#include <set>
struct Arc {
char label;
int targetNode;
};
struct Graph {
std::vector<int> nodes;
std::map< int, std::set<Arc*> > outgoingArcsPerNode;
};
How can I provide a standard C++ iterator over all the arcs in the graph (order of iteration doesn't matter) that hides how the arcs are stored in the graph?
I would like to use it similar to the following:
int main() {
Graph g;
for (Graph::const_iterator it = g.arcsBegin(); it != g.arcsEnd(); ++it) {
Arc* a = *it;
}
}
I heard of boost::iterator, but I find it confusing. Maybe someone could give a hint how to use it in this case?
If you don't want to use boost, have a look at what iterators must provide : STL documentation.
Otherwise, you may use boost iterator library. See the iterator_facade tutorial which is very close to what you're asking.
Create a class that holds two iterators: one over the map and another over the current set.
Each ++ is applied to the set iterator. When it reaches the end of its set, increment the map iterator and reinitialize the set iterator from the next set.
You can also use boost::iterator_facade. It will not help you implement the iteration algorithm itself, but it will minimize the effort of making your iterator meet STL expectations.
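The two-iterator approach described above might look like this when written by hand, without boost::iterator_facade. It is a sketch rather than a complete iterator: the ArcMap typedef and the helper skipEmptySets are hypothetical names, and only the forward-iterator operations needed by the loop in the question are provided.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <vector>

struct Arc {
    char label;
    int targetNode;
};

struct Graph {
    typedef std::map<int, std::set<Arc*> > ArcMap;

    std::vector<int> nodes;
    ArcMap outgoingArcsPerNode;

    class const_iterator {
    public:
        typedef std::forward_iterator_tag iterator_category;
        typedef Arc* value_type;
        typedef std::ptrdiff_t difference_type;
        typedef Arc* const* pointer;
        typedef Arc* const& reference;

        const_iterator() : outer_(), outerEnd_(), inner_() {}
        const_iterator(ArcMap::const_iterator outer,
                       ArcMap::const_iterator outerEnd)
            : outer_(outer), outerEnd_(outerEnd), inner_() {
            if (outer_ != outerEnd_) {
                inner_ = outer_->second.begin();
                skipEmptySets();
            }
        }

        reference operator*() const { return *inner_; }
        pointer operator->() const { return &*inner_; }

        const_iterator& operator++() {
            ++inner_;
            skipEmptySets();
            return *this;
        }
        const_iterator operator++(int) {
            const_iterator tmp(*this);
            ++*this;
            return tmp;
        }

        bool operator==(const const_iterator& other) const {
            // The set iterator only matters while the map iterator is not at the end.
            return outer_ == other.outer_ &&
                   (outer_ == outerEnd_ || inner_ == other.inner_);
        }
        bool operator!=(const const_iterator& other) const {
            return !(*this == other);
        }

    private:
        // Advance until inner_ refers to an arc or outer_ reaches the end.
        void skipEmptySets() {
            while (outer_ != outerEnd_ && inner_ == outer_->second.end()) {
                ++outer_;
                if (outer_ != outerEnd_) inner_ = outer_->second.begin();
            }
        }

        ArcMap::const_iterator outer_, outerEnd_;
        std::set<Arc*>::const_iterator inner_;
    };

    const_iterator arcsBegin() const {
        return const_iterator(outgoingArcsPerNode.begin(),
                              outgoingArcsPerNode.end());
    }
    const_iterator arcsEnd() const {
        return const_iterator(outgoingArcsPerNode.end(),
                              outgoingArcsPerNode.end());
    }
};
```

With these members in place, the loop from the question (Graph::const_iterator it = g.arcsBegin(); it != g.arcsEnd(); ++it) should compile as written, and nodes whose arc set is empty are skipped.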