I'm in need of a container that has the properties of both a vector and a list. I need fast random access to elements within the container, but I also need to be able to remove elements in the middle of the container without moving the other elements. I also need to be able to iterate over all elements in the container, and see at a glance (without iteration) how many elements are in the container.
After some thought, I've figured out how I could create such a container, using a vector as the base container, and wrapping the actual stored data within a struct that also contained fields to record whether the element was valid, and pointers to the next/previous valid element in the vector. Combined with some overloading and such, it sounds like it should be fairly transparent and fulfill my requirements.
But before I actually work on creating yet another container, I'm curious if anyone knows of an existing library that implements this very thing? I'd rather use something that works than spend time debugging a custom implementation. I've looked through the Boost library (which I'm already using), but haven't found this in there.
If the order does not matter, I would just use a hash table mapping integers to pointers: std::tr1::unordered_map<int, T *> (or std::unordered_map<int, unique_ptr<T>> if C++0x is OK).
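Storing pointers means the objects themselves never move, whatever the hash table does internally when it grows (std::unordered_map in fact keeps references to its elements valid across rehashing, but not every hash table implementation does). The table supports very fast insertion / lookup / deletion. Iteration is fast too, but the elements will come out in an indeterminate order.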
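A minimal sketch of the hash-table approach, assuming C++11. The class name IndexedBag and its member names are hypothetical, chosen only for illustration; each value lives behind a std::unique_ptr, so its address is independent of anything the table does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical wrapper: integer keys map to heap-allocated values,
// so the address of a stored value does not change while it is stored.
template <typename T>
class IndexedBag {
public:
    // Store a copy of value under key (replacing any previous entry)
    // and return the address of the stored copy.
    T* put(int key, const T& value) {
        std::unique_ptr<T>& slot = items_[key];
        slot.reset(new T(value));
        return slot.get();
    }

    // Returns nullptr when the key is absent.
    T* find(int key) {
        auto it = items_.find(key);
        return it == items_.end() ? nullptr : it->second.get();
    }

    // Removes one entry; no other stored value is touched.
    bool remove(int key) { return items_.erase(key) == 1; }

    // Element count without iterating.
    std::size_t size() const { return items_.size(); }

private:
    std::unordered_map<int, std::unique_ptr<T>> items_;
};
```

Usage would be along the lines of bag.put(0, "a"), then bag.find(0) and bag.remove(0); size() answers the "how many elements" question at a glance.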
Alternatively, I think you can implement your own idea as a very simple combination of a std::vector and a std::list. Just maintain both a list<T> my_list and a vector<list<T>::iterator> my_vector. To add an object, push it onto the back of my_list and then push its iterator onto my_vector. (Set an iterator to my_list.end() and decrement it to get the iterator for the last element.) To look up an element, index into the vector and dereference the iterator. To delete, remove from the list (which you can do by iterator) and set the location in the vector to my_list.end().
std::list guarantees the elements within will not move when you delete them.
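A small check of that guarantee, as a sketch: the function name erase_keeps_addresses is made up for this example. It erases the middle node of a three-element list and confirms that the remaining elements are still at the same addresses and that the iterators to them still work.

```cpp
#include <cassert>
#include <list>

// Returns true if erasing a middle node leaves the other elements
// at the same addresses and keeps iterators to them usable.
bool erase_keeps_addresses() {
    std::list<int> values = {10, 20, 30};

    std::list<int>::iterator first = values.begin();
    std::list<int>::iterator middle = first;
    ++middle;
    std::list<int>::iterator last = middle;
    ++last;

    const int* first_addr = &*first;
    const int* last_addr = &*last;

    values.erase(middle);  // only the iterator to the erased node is invalidated

    return &values.front() == first_addr
        && &values.back() == last_addr
        && *first == 10
        && *last == 30;
}
```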
[update]
I am feeling motivated. First pass at an implementation:
#include <vector>
#include <list>
template <typename T>
class NairouList {
public:
typedef std::list<T> list_t;
typedef typename list_t::iterator iterator;
typedef std::vector<iterator> vector_t;
NairouList() : my_size(0)
{ }
void push_back(const T &elt) {
my_list.push_back(elt);
iterator i = my_list.end();
--i;
my_vector.push_back(i);
++my_size;
}
T &operator[](typename vector_t::size_type n) {
if (my_vector[n] == my_list.end())
throw "Dave's not here, man";
return *(my_vector[n]);
}
void remove(typename vector_t::size_type n) {
my_list.erase(my_vector[n]);
my_vector[n] = my_list.end();
--my_size;
}
size_t size() const {
return my_size;
}
iterator begin() {
return my_list.begin();
}
iterator end() {
return my_list.end();
}
private:
list_t my_list;
vector_t my_vector;
size_t my_size;
};
It is missing some Quality of Implementation touches... Like, you probably want more error checking (what if I delete the same element twice?) and maybe some const versions of operator[], begin(), end(). But it's a start.
That said, for "a few thousand" elements a map will likely serve at least as well. A good rule of thumb is "Never optimize anything until your profiler tells you to".
Looks like you might want a std::deque. Removing an element is not as efficient as with a std::list, but because deques are typically implemented as non-contiguous memory "blocks" that are managed via an additional pointer array/vector internal to the container (each "block" being an array of N elements), removing an element from the middle of a deque does not cause the same re-shuffling you would see with a vector: at most the elements on the shorter side of the removed position are shifted, rather than everything after it. Note that the remaining elements can still move, though.
Edit: On second thought, and after reviewing some of the comments, while I think a std::deque could work, I think a std::map or std::unordered_map will actually be better for you, since it allows the array-syntax indexing you want yet gives you fast removal of elements as well.
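A short sketch of the std::map variant. The function name demo_map_slots and the sample values are made up for illustration. One caveat worth keeping in mind: operator[] on a map inserts a default-constructed value when the key is absent, so find() is the safer way to test whether a slot still exists.

```cpp
#include <cassert>
#include <map>
#include <string>

// Fills four slots, removes one from the middle, and returns the
// remaining values concatenated in key order.
std::string demo_map_slots() {
    std::map<int, std::string> slots;
    slots[0] = "a";  // array-style indexing
    slots[1] = "b";
    slots[2] = "c";
    slots[3] = "d";

    slots.erase(1);  // removal leaves the other elements where they are

    std::string out;
    for (std::map<int, std::string>::const_iterator it = slots.begin();
         it != slots.end(); ++it) {
        out += it->second;
    }
    return out;  // "acd"; slots.size() is now 3
}
```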
I ran into some existing code that I inherited that looks up and stores references of structs that are stored in a tbb::concurrent_unordered_map. I know if it was an iterator it would be safe but a reference to the actual object seems fishy.
The code constantly inserts new items into the tbb::concurrent_unordered_map. Can an insert change the physical location of the items already contained in the tbb map, which would make the stored references point to the wrong place, as it can for a std::vector?
The tbb documentation states:
"Like std::list, insertion of new items does not invalidate any
iterators, nor change the order of items already in the map. Insertion
and traversal may be concurrent."
I know that for a std::list the location does not change when inserting new items, but because the concurrent_unordered_map documentation talks only about the order, I am worried that it does not explicitly say that items cannot move.
Below is some pseudo code that demonstrates the concept of the code I ran into.
struct MyStruct {
int i;
int j;
};
//some other thread will insert items into myMap
tbb::concurrent_unordered_map <int, MyStruct> myMap;
MyStruct& getMyStruct (int id)
{
auto itr=myMap.find (id);
if (itr!=myMap.end ()) return itr->second;
static MyStruct dummy {1,2};
return dummy;
}
class MyClass {
public:
MyClass (int id)
:m_myStruct {getMyStruct (id)}
{
}
void DoSomething () {
std::cout<<m_myStruct.i<<std::endl;
}
protected:
MyStruct& m_myStruct; //reference to an item held into a tbb::concurrent_unordered_map
};
This code has been running for over a year at a relatively high frequency which seems to suggest that it is safe but I would like to know for sure that it is safe to keep hold of a reference to an item contained in a tbb::concurrent_unordered_map.
Rewriting the code is a lot of work so I would prefer to not have to do it if it is ok to leave it as is.
Thanks,
Paul
If you are worried so much about going into gray area of not completely documented behavior, why don't you store and use iterators instead? One problem is its size though (run it):
#include <cstdio>
#include <tbb/tbb.h>
int main() {
tbb::concurrent_unordered_map<int, int> m;
// sizeof yields a size_t, so print it with %zu
printf("Size of iterator: %zu\n", sizeof(m.begin()));
}
This is because, besides the pointer to the element, the iterator holds a pointer to the container it belongs to. There is no list of pointers or any other structure that registers all the iterators users create, so the container cannot update an iterator to point to a different address (that would be especially tricky in a parallel program). In any case, such a relocation would cause additional user-visible calls to the move/copy constructors of the key/value types and would therefore be documented. There is no such warning in the documentation, and adding one would be a backward-incompatible change, which gives you a sort of indirect guarantee that you can safely keep pointers to your items instead of using iterators.
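For comparison only, here is the same pattern with the standard library's std::unordered_map, where the standard does state that rehashing invalidates iterators but not pointers or references to elements. This is an analogy under that assumption; it does not exercise tbb::concurrent_unordered_map and proves nothing about it. The function name reference_survives_inserts is made up.

```cpp
#include <cassert>
#include <unordered_map>

struct MyStruct {
    int i;
    int j;
};

// Standard-library analogy only; this does not test the tbb container.
// Takes a reference to one element, forces many rehashes by inserting,
// and checks that the reference still names the same object.
bool reference_survives_inserts() {
    std::unordered_map<int, MyStruct> m;
    m[1] = MyStruct{1, 2};

    MyStruct& ref = m.find(1)->second;
    const MyStruct* before = &ref;

    for (int k = 2; k < 10000; ++k) {
        m[k] = MyStruct{k, k};  // grows the table, triggering rehashes
    }

    return &m.find(1)->second == before && ref.i == 1 && ref.j == 2;
}
```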
It seems to me given what I know about linked lists that this should be possible but I haven't found anywhere that has the answer so I'm asking here.
Given two iterators to items in the same list, I'd like to take the item pointed to by iterator "frm" and "insert" it into the list before the item pointed to by iterator "to".
It seems that all that is needed is to change the pointers on the items in the list pointing to "frm" (to remove "frm"), then changing the pointers on the item pointing at "to" so that it references "frm" then changing the pointers on "frm" node to point to "to".
I looked everywhere for this and couldn't find an answer.
NOTE that I cannot use splice, as I do not have access to the list, only to the iterators to the items in the list.
template <typename T>
void move(typename std::list<T>::iterator frm, typename std::list<T>::iterator to) {
//remove the item from the list at frm
//insert the item at frm before the item at to
}
Iterators contain only the minimal information required to point to a piece of data. What you are missing is that a linked list has other bookkeeping that goes along with its nodes, so the list class looks essentially like the following:
template <typename Type>
class list {
int size; // for O(1) size()
Type* head;
Type* tail;
class Iterator {
Type* element;
// no back pointer to list<Type>*
};
...
};
To remove an element from the list you would need to update those data members as well, and to do that an iterator would have to contain a back pointer to the list itself, which the iterator interface does not require. Notice also that the algorithms in the STL do not actually modify the bookkeeping of the containers they operate on; at most they swap elements and rearrange things.
I would encourage you to look into the <algorithm> header, as well as into facilities like std::back_inserter and std::move_iterator, to get an idea of how iterators are wrapped to actually modify the container they represent.
How this is done is implementation-defined, but the C++ standard does provide iter_swap, though it doesn't do exactly this: it swaps the values the two iterators point to. This may be optimized to swap the pointers on the values held in the linked list, similar to what I have described, effectively reordering the items in the list without a full swap.
iter_swap() versus swap() -- what's the difference?
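For contrast, if a reference to the list can be passed along after all, the relinking described in the question is what std::list::splice does. The sketch below changes the question's signature by adding a list parameter; move_before and demo_move_before are hypothetical names.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper: unlike the signature in the question, it takes the list.
// Relinks the node at frm so that it sits directly before the node at to.
// No element is copied or moved, and both iterators stay valid.
template <typename T>
void move_before(std::list<T>& lst,
                 typename std::list<T>::iterator frm,
                 typename std::list<T>::iterator to) {
    lst.splice(to, lst, frm);  // no-op when frm == to or to follows frm directly
}

std::vector<int> demo_move_before() {
    std::list<int> lst = {1, 2, 3, 4, 5};
    auto frm = std::next(lst.begin(), 3);  // points at 4
    auto to = std::next(lst.begin(), 1);   // points at 2
    move_before(lst, frm, to);
    return std::vector<int>(lst.begin(), lst.end());  // 1 4 2 3 5
}
```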
Just for fun, I have implemented the simplest sorting algorithm imaginable:
template<typename Iterator>
void treesort(Iterator begin, Iterator end)
{
typedef typename std::iterator_traits<Iterator>::value_type element_type;
// copy data into the tree
std::multiset<element_type> tree(begin, end);
// copy data out of the tree
std::copy(tree.begin(), tree.end(), begin);
}
It's only about 20 times slower than std::sort for my test data :)
Next, I wanted to improve the performance with move semantics:
template<typename Iterator>
void treesort(Iterator begin, Iterator end)
{
typedef typename std::iterator_traits<Iterator>::value_type element_type;
// move data into the tree
std::multiset<element_type> tree(std::make_move_iterator(begin),
std::make_move_iterator(end));
// move data out of the tree
std::move(tree.begin(), tree.end(), begin);
}
But this did not affect the performance in a significant way, even though I am sorting std::strings.
Then I remembered that associative containers are constant from the outside, that is, std::move and std::copy will do the same thing here :( Is there any other way to move the data out of the tree?
std::set and std::multiset only provide const access to their elements. This means you cannot move something out of the set. If you could move items out (or modify them at all), you could break the set by changing the sort order of the items. So C++11 forbids it.
So your attempt to use the std::move algorithm will just invoke the copy constructor.
I believe you could make a custom allocator for the multiset to use (3rd template argument) which, in its destroy method, actually moves the elements back to the user's container. Then erase each element in the set, and during its destruction it should move your string back to the original container. I think the custom allocator would need two-phase construction (pass it the begin iterator that was passed to your treesort function to hold as a member, but not during construction, because it has to be default constructible).
Obviously this would be bizarre and is a silly workaround for not having a pop method in set/multiset. But it should be possible.
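If a C++17 compiler is available (an assumption; the thread itself targets C++11), the associative containers gained an extract() member that unlinks a node and hands it over as a node handle whose value() is a non-const reference. A sketch of the tree sort on top of that, with the hypothetical name treesort_extract:

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Requires C++17 for std::multiset::extract.
template <typename Iterator>
void treesort_extract(Iterator begin, Iterator end) {
    typedef typename std::iterator_traits<Iterator>::value_type element_type;

    // move data into the tree
    std::multiset<element_type> tree(std::make_move_iterator(begin),
                                     std::make_move_iterator(end));

    // move data out of the tree, smallest element first
    while (!tree.empty()) {
        // extract() removes the node from the tree without destroying it;
        // the node handle owns the element and exposes it as non-const.
        auto node = tree.extract(tree.begin());
        *begin = std::move(node.value());
        ++begin;
    }
}
```

Because the node has already left the tree when its value is moved from, the ordering invariant of the multiset is never violated.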
I like Dave's idea of a freaky allocator that remembers the source of each move constructed object and automatically moves back on destruction, I'd never thought of doing that!
But here's an answer closer to your original attempt:
template<typename Iterator>
void treesort_mv(Iterator begin, Iterator end)
{
typedef typename std::iterator_traits<Iterator>::value_type element_type;
// move the elements to tmp storage
std::vector<element_type> tmp(std::make_move_iterator(begin),
std::make_move_iterator(end));
// fill the tree with sorted references
typedef std::reference_wrapper<element_type> element_ref;
std::multiset<element_ref, std::less<element_type>> tree(tmp.begin(), tmp.end());
// move data out of the vector, in sorted order
std::move(tree.begin(), tree.end(), begin);
}
This sorts a multiset of references, so they don't need to be moved out of the tree.
However, when moving back into the original range the move assignments are not necessarily safe for self-assignment, so I moved them into a vector first, so that when re-assigning them back to the original range there will not be self-assignments.
This is marginally faster than your original version in my tests. It probably loses efficiency because it has to allocate the vector as well as all the tree nodes. That and the fact that my compiler uses COW strings so moving isn't much faster than copying anyway.
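One more possible factor, as far as I can tell: when std::move runs over a range of reference_wrapper objects, the assignment goes through the wrapper's conversion to element_type&, which is an lvalue, so copy assignment is selected. A variant that moves explicitly through get() might look like the sketch below; the name treesort_ref is made up, and I have not measured it.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

template <typename Iterator>
void treesort_ref(Iterator begin, Iterator end) {
    typedef typename std::iterator_traits<Iterator>::value_type element_type;

    // move the elements to tmp storage
    std::vector<element_type> tmp(std::make_move_iterator(begin),
                                  std::make_move_iterator(end));

    // fill the tree with sorted references into tmp
    typedef std::reference_wrapper<element_type> element_ref;
    std::multiset<element_ref, std::less<element_type>> tree(tmp.begin(), tmp.end());

    // The tree only stores references, so moving from the referenced
    // object in tmp does not modify any element of the tree itself.
    for (const element_ref& ref : tree) {
        *begin = std::move(ref.get());
        ++begin;
    }
}
```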
Good day!
In his "Effective STL" Scott Meyers wrote
A third is to use the information in an ordered container of iterators to iteratively splice the list's elements into the positions you'd like them to be in. As you can see, there are lots of options. (Item 31, second part)
Can someone explain this approach to me?
More text (to understand the context):
The algorithms sort, stable_sort, partial_sort, and nth_element require random access iterators, so they may be applied only to vectors, strings, deques, and arrays. It makes no sense to sort elements in standard associative containers, because such containers use their comparison functions to remain sorted at all times. The only container where we might like to use sort, stable_sort, partial_sort, or nth_element, but can't, is list, and list compensates somewhat by offering its sort member function. (Interestingly, list::sort performs a stable sort.) If you want to sort a list, then, you can, but if you want to use partial_sort, or nth_element on the objects in a list, you have to do it indirectly. One indirect approach is to copy the elements into a container with random access iterators, then apply the desired algorithm to that. Another is to create a container of list::iterators, use the algorithm on that container, then access the list elements via the iterators. A third is to use the information in an ordered container of iterators to iteratively splice the list's elements into the positions you'd like them to be in. As you can see, there are lots of options.
I'm not sure what the confusion is, but I suspect it is what "splicing" refers to: std::list<T> has a splice() member function (well, actually several overloads) which transfers nodes between lists. That is, you create a std::vector<std::list<T>::const_iterator> and apply the sorting algorithm (e.g. std::partial_sort()) to it, with a comparison that dereferences the iterators. Then you create a new std::list<T> and use the splice() member with the iterators from the sorted vector to put the nodes into their correct order without moving the objects themselves.
This would look something like this:
std::vector<std::list<T>::const_iterator> tmp;
for (auto it(list.begin()), end(list.end()); it != end; ++it) {
tmp.push_back(it);
}
some_sort_of(tmp);
std::list<T> result;
for (auto it(tmp.begin()), end(tmp.end()); it != end; ++it) {
result.splice(result.end(), list, *it); // *it is the stored list iterator
}
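Filled out into something compilable, the idea might look like the following sketch. The helper name partial_sort_list is hypothetical, and std::partial_sort stands in for "some_sort_of"; any algorithm that needs random access iterators could take its place.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: puts the `count` smallest elements at the front of
// the list in ascending order. The remaining elements follow in whatever
// order std::partial_sort leaves them. Nodes are relinked, never copied.
template <typename T>
void partial_sort_list(std::list<T>& lst, std::size_t count) {
    typedef typename std::list<T>::iterator list_iter;

    // One iterator per node, stored in a random access container.
    std::vector<list_iter> tmp;
    for (list_iter it = lst.begin(); it != lst.end(); ++it) {
        tmp.push_back(it);
    }
    if (count > tmp.size()) {
        count = tmp.size();
    }

    // Sort the iterators by the values they point to.
    std::partial_sort(tmp.begin(),
                      tmp.begin() + static_cast<std::ptrdiff_t>(count),
                      tmp.end(),
                      [](list_iter a, list_iter b) { return *a < *b; });

    // Splice the nodes into a new list in the order of the sorted iterators.
    std::list<T> result;
    for (std::size_t i = 0; i < tmp.size(); ++i) {
        result.splice(result.end(), lst, tmp[i]);
    }
    lst.swap(result);
}
```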
Let's say you wanted to do a partial_sort on a list. You could store the iterators to the list in a set, by providing a comparison function that can sort using the iterators, like this:
struct iterator_less
{
bool operator() (std::list<int>::iterator lhs,
std::list<int>::iterator rhs) const
{
return (*lhs < *rhs);
}
};
typedef std::multiset<
std::list<int>::iterator, iterator_less
> iterator_set;
Then you could let the set perform the sort, and since it contains iterators into the list, you could use list::splice to splice the elements into a partially sorted list:
std::list<int> unsorted, partialSorted;
unsorted.push_back(11);
unsorted.push_back(2);
unsorted.push_back(2);
unsorted.push_back(99);
unsorted.push_back(2);
unsorted.push_back(4);
unsorted.push_back(5);
unsorted.push_back(7);
unsorted.push_back(34);
// First copy the iterators into the set
iterator_set itSet;
for( auto it = unsorted.begin(); it!=unsorted.end();++it)
{
itSet.insert(it);
}
// now if you want a partial_sort with the first 3 elements, iterate through the
// set grabbing the first item in the set and then removing it.
int count = 3;
while(count--)
{
iterator_set::iterator setTop = itSet.begin();
partialSorted.splice(
partialSorted.end(), // append, so the smallest elements end up in ascending order
unsorted,
*setTop);
itSet.erase(setTop);
}
partialSorted.splice(
partialSorted.end(),
unsorted,
unsorted.begin(),
unsorted.end());
An ordered container would be either std::set or std::map. If you're willing to make a comparator that takes iterators you would use std::set<std::list<mydata>::iterator,comparator>, otherwise you could use std::map<mydata,std::list<mydata>::iterator>. You go through your list from begin() to end() and insert the iterators into the set or map; now you can use it to access the items in the list in sorted order by iterating the set or map, because it's automatically sorted.
Ordered containers are std::set and std::multiset. std::set is typically implemented as a balanced binary search tree (BST). So what it says is that you should create a std::set of list iterators (with a comparison function that compares the elements the iterators point to) and then use the inherent BST structure to do the sorting. Here is a link on BST to get you started.
Edit Ah. Just noticed "ordered container of iterators". That would imply creating an index into another container.
Boost Multi Index has many examples of such things, where a single collection is indexed by several different ordering predicates and the indices are nothing more than collections of 'pointers' (usually iterators) into the base container.
"A third is to use the information in an ordered container of iterators to iteratively splice the list's elements into the positions you'd like them to be in"
One thing I think would match that description is when doing std::sort_heap of a list/vector which has had std::make_heap/push_heap/pop_heap operating on it.
make_heap : convert a sequence to a heap
sort_heap : sort a heap
push_heap : insert an element in a heap
pop_heap : remove the top element from a heap
Heaps are organizations of elements within sequences, which make it (relatively) efficient to keep the collection in a known ordering under insert/removal. The order is implicit (like a recursive defined binary tree stored in a contiguous array) and can be transformed into the corresponding properly sorted sequence by doing the (highly efficient) sort_heap algorithm on it.
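A small sketch of those four algorithms working together. The heap functions require random access iterators, so the example uses a std::vector rather than a std::list; the function name demo_heap and the sample values are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

std::vector<int> demo_heap() {
    std::vector<int> v = {5, 1, 9, 3};

    std::make_heap(v.begin(), v.end());  // max-heap: v.front() is now 9

    v.push_back(7);                      // append, then restore the heap property
    std::push_heap(v.begin(), v.end());

    std::pop_heap(v.begin(), v.end());   // moves the largest element (9) to the back
    v.pop_back();                        // and this removes it

    std::sort_heap(v.begin(), v.end());  // turns the heap into an ascending sequence
    return v;                            // 1 3 5 7
}
```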
Been a while since I've used C++. Can I do something like this?:
for (vector<Node>::iterator n = active.begin(); n!=active.end(); ++n) {
n->ax /= n->m;
}
where Node is an object with a few floats in it?
If written in Java, what I'm trying to accomplish is something similar to:
for (Node n : this.active) {
n.ax /= n.m;
}
where active is an arrayList of Node objects.
I think I am forgetting some quirk about passing by reference or something (throws hands in the air in desperation).
Yes. This syntax basically works for almost all STL containers.
// this will walk the container from the beginning to the end.
for(container::iterator it = object.begin(); it != object.end(); it++)
{
}
object.begin() - basically gives an iterator to the first element of the container.
object.end() - the iterator is set to this value once it has gone through all elements. Note that to check the end we used !=.
operator ++ - Move the iterator to the next element.
Based on the type of container you may have other ways to navigate the iterator (say backwards, randomly to a spot in the container, etc). A good introduction to iterators is here.
Short answer: yes, you can.
The iterator is a proxy for the container element. In some cases the iterator is literally just a pointer to the element.
Your code works fine for me
#include <vector>
using std::vector;
struct Node{
double ax;
double m;
};
int main()
{
vector<Node> active;
for (vector<Node>::iterator n = active.begin(); n!=active.end(); ++n) {
n->ax /= n->m;
}
}
You can safely change an object contained in a container without invalidating iterators (with the associative containers, this applies only to the 'value' part of the element, not the 'key' part).
However, what you might be thinking of is that if you change the container (say by deleting or moving the element), then existing iterators might be invalidated, depending on the container, the operation being performed, and the details of the iterators involved (which is why you aren't allowed to change the 'key' of an object in an associative container - that would necessitate moving the object in the container in the general case).
In the case of std::vector, yes, you can manipulate the object simply by dereferencing the iterator. In the case of sorted containers, such as a std::set, you can't do that. Since a set inserts its elements in order, directly manipulating the contents isn't permitted since it might mess up the ordering.
What you have will work. You can also use one of the many STL algorithms to accomplish the same thing.
std::for_each(active.begin(), active.end(), [](Node &n){ n.ax /= n.m; });
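With C++11 there is also the range-based for loop, which is the closest match to the Java version in the question. The quirk about references is real here: the loop variable must be declared as Node& or the loop modifies a copy of each element. A minimal sketch, with divide_by_mass as a made-up function name:

```cpp
#include <cassert>
#include <vector>

struct Node {
    double ax;
    double m;
};

// Divides each node's ax by its m, in place.
void divide_by_mass(std::vector<Node>& active) {
    // Node& binds to the element itself; writing `Node n` instead
    // would change a temporary copy and leave the vector untouched.
    for (Node& n : active) {
        n.ax /= n.m;
    }
}
```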