Item in multiple lists - c++

So I have some legacy code that I would love to update to use more modern techniques, but I fear that, given the way things are designed, it is a non-option. The core issue is that a node is often in more than one list at a time. Something like this:
struct T {
T *next_1;
T *prev_1;
T *next_2;
T *prev_2;
int value;
};
This allows the code to have a single object of type T allocated and inserted into 2 doubly linked lists, nice and efficient.
Obviously I could just have 2 std::list<T*>'s and just insert the object into both...but there is one thing which would be way less efficient...removal.
Often the code needs to "destroy" an object of type T and this includes removing the element from all lists. This is nice because given a T* the code can remove that object from all lists it exists in. With something like a std::list I would need to search for the object to get an iterator, then remove that (I can't just pass around an iterator because it is in several lists).
Is there a nice c++-ish solution to this, or is the manually rolled way the best way? I have a feeling the manually rolled way is the answer, but I figured I'd ask.

As another possible solution, look at Boost Intrusive, which has an alternative list class with a lot of properties that may make it useful for your problem.
In this case, I think it'd look something like this:
#include <boost/intrusive/list.hpp>
using namespace boost::intrusive;
struct tag1; struct tag2;
typedef list_base_hook< tag<tag1> > base1;
typedef list_base_hook< tag<tag2> > base2;
class T: public base1, public base2
{
public:
    int value;
};
list<T, base_hook<base1> > list1;
list<T, base_hook<base2> > list2;
// constant time to get an iterator from a T item (item must already be in the list):
auto where_in_list1 = list1.iterator_to(item);
auto where_in_list2 = list2.iterator_to(item);
// once you have iterators, you can remove in constant time, e.g. list1.erase(where_in_list1)
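To illustrate what such hooks give you without depending on Boost, here is a minimal hand-rolled sketch. The names Hook, HookList and Item are made up for this example and are not part of Boost Intrusive; it only shows the idea that each list membership is a pair of links embedded in the object, so unlinking needs nothing but the object itself.

```cpp
#include <cassert>

// Hypothetical hand-rolled hook: one pair of links per list membership.
struct Hook {
    Hook* prev = nullptr;
    Hook* next = nullptr;

    bool linked() const { return next != nullptr; }

    // O(1): detach this hook from whatever circular list it is in.
    void unlink() {
        if (!linked()) return;
        prev->next = next;
        next->prev = prev;
        prev = next = nullptr;
    }
};

// Circular list with a sentinel head; it never owns the elements.
struct HookList {
    Hook head;

    HookList() { head.prev = head.next = &head; }
    HookList(const HookList&) = delete;
    HookList& operator=(const HookList&) = delete;

    bool empty() const { return head.next == &head; }

    void push_back(Hook& h) {
        assert(!h.linked());  // a hook can be in at most one list
        h.prev = head.prev;
        h.next = &head;
        head.prev->next = &h;
        head.prev = &h;
    }
};

struct Item {
    Hook hook1;  // membership in the first list
    Hook hook2;  // membership in the second list
    int value = 0;

    // Removing from every list needs only the object itself.
    void remove_from_all() {
        hook1.unlink();
        hook2.unlink();
    }
};
```

This is roughly the legacy next_1/prev_1/next_2/prev_2 layout with the pointer juggling factored out. Getting from a Hook back to its Item (which iteration needs) is left out here, and that is one of the things Boost Intrusive handles for you.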

Instead of managing your own next/previous pointers, you could indeed use a std::list. To solve the removal performance problem, you could store in the object itself an iterator pointing at its own position (one member for each std::list the element can be stored in).
You can extend this to store a vector or array of iterators in the class (in case you don't know the number of lists the element is stored in).

I think the proper answer depends on how performance-critical this application is. Is it in an inner loop that could potentially cost the program a user-perceivable runtime difference?
There is a way to create this sort of functionality by creating your own classes derived from some of the STL containers, but it might not even be worth it to you. At the risk of sounding tiresome, I think this might be an example of premature optimization.

The question to answer is why this C struct exists in the first place. You can't re-implement the functionality in C++ until you know what that functionality is. Some questions to help you answer that are,
Why lists? Does the data need to be in sequence, i.e., in order? Does the order mean something? Does the application require ordered traversal?
Why two containers? Does membership in a container indicate some kind of property of the element?
Why a double-linked list specifically? Is O(1) insertion and deletion important? Is reverse-iteration important?
The answer to some or all of these may be, "no real reason, that's just how they implemented it". If so, you can replace that intrusive C-pointer mess with a non-intrusive C++ container solution, possibly containing shared_ptrs rather than ptrs.
What I'm getting at is, you may not need to re-implement anything. You may be able to discard the entire business, and store the values in proper C++ containers.

How's this?
struct T {
    std::list<T*>::iterator entry1, entry2;
    int value;
};
std::list<T*> list1, list2;
// init a T* item:
T* item = new T;
item->entry1 = list1.end();
item->entry2 = list2.end();
// add a T* item to list 1:
item->entry1 = list1.insert(<where>, item);
// remove a T* item from list1
if (item->entry1 != list1.end()) {
    list1.erase(item->entry1); // this is O(1); note erase, not remove
    item->entry1 = list1.end();
}
// code for list2 management is similar
You could make T a class and use constructors and member functions to do most of this for you. If you have a variable number of lists, you can use a vector of iterators, std::vector<std::list<T*>::iterator>, to track the item's position in each list.
Note that if you use push_back or push_front to add to the list, you need to do item->entry1 = list1.end(); item->entry1--; or item->entry1 = list1.begin(); respectively to get the iterator pointed in the right place.
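As a rough sketch of the "make T a class" idea, something like the following could work. The class name Item, the ItemList alias and the member function names are invented for this example, and it assumes both lists outlive every Item.

```cpp
#include <cassert>
#include <list>

class Item;
using ItemList = std::list<Item*>;

// Hypothetical wrapper: each Item remembers where it sits in each list.
class Item {
public:
    int value;

    Item(ItemList& l1, ItemList& l2, int v)
        : value(v), list1_(l1), list2_(l2),
          entry1_(l1.end()), entry2_(l2.end()) {}

    // Destroying the object removes it from every list it is in.
    ~Item() { remove_from_all(); }

    Item(const Item&) = delete;
    Item& operator=(const Item&) = delete;

    void add_to_list1() {
        if (entry1_ == list1_.end())
            entry1_ = list1_.insert(list1_.end(), this);
    }

    void add_to_list2() {
        if (entry2_ == list2_.end())
            entry2_ = list2_.insert(list2_.end(), this);
    }

    // O(1) per list: no searching, the stored iterators say where we are.
    void remove_from_all() {
        if (entry1_ != list1_.end()) {
            list1_.erase(entry1_);
            entry1_ = list1_.end();
        }
        if (entry2_ != list2_.end()) {
            list2_.erase(entry2_);
            entry2_ = list2_.end();
        }
    }

private:
    ItemList& list1_;
    ItemList& list2_;
    ItemList::iterator entry1_;  // == list1_.end() means "not in list 1"
    ItemList::iterator entry2_;  // == list2_.end() means "not in list 2"
};
```

Using end() as the "not in the list" marker relies on the fact that std::list insertions and erasures do not invalidate the end iterator. The trade-off compared to the intrusive version is one extra node allocation per list membership.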

It sounds like you're talking about something that could be addressed by applying graph theory. As such the Boost Graph Library might offer some solutions.

list::remove is what you're after. It'll remove any and all objects in the list that compare equal to the value you pass in. Note that this is a linear search through the list, and the element type needs an operator==.
So:
list<T> listOne, listTwo;
// Things get added to the lists.
T thingToRemove;
listOne.remove(thingToRemove);
listTwo.remove(thingToRemove);
I'd also suggest converting your list node into a class; that way C++ will take care of memory for you.
class MyClass {
public:
int value;
// Any other values associated with T
};
list<MyClass> listOne, listTwo; // can add and remove MyClass objects w/o worrying about destroying anything.
You might even encapsulate the two lists into their own class, with add/remove methods for them. Then you only have to call one method when you want to remove an object.
class TwoLists {
private:
list<MyClass> listOne, listTwo;
// ...
public:
void remove(const MyClass& thing) {
listOne.remove(thing);
listTwo.remove(thing);
}
};

Related

Optimize search in std::deque

I'm writing a program that has different kinds of objects, all of which are children of a virtual class. I'm doing this for the advantages of polymorphism, which lets me call a certain method on all the objects from a manager class without checking the specific kind of each object.
The point is that the different kinds of objects sometimes need to get a list of the objects of a certain type.
At that moment my manager class loops through all the objects and checks the type of each one. It creates a list and returns it like this:
std::list<std::shared_ptr<Object>> ObjectManager::GetObjectsOfType(std::string type)
{
std::list<std::shared_ptr<Object>> objectsOfType;
for (int i = 0; i < m_objects.size(); ++i)
{
if (m_objects[i]->GetType() == type)
{
objectsOfType.push_back(m_objects[i]);
}
}
return objectsOfType;
}
m_objects is a deque. I know iterating over a data structure is normally expensive, but I want to know if it is possible to polish it a little bit, because right now this function takes a third of all the time used by the program.
My question is: is there any design pattern or function that I'm not taking into account that would reduce the cost of this operation in my program?
In the code as given, there is just a single optimization that can be done locally:
for (auto const& obj : m_objects)
{
if (obj->GetType() == type)
{
objectsOfType.push_back(obj);
}
}
The rationale is that operator[] is generally not the most efficient way to access a deque. Having said that, I don't expect a major improvement. Your locality of reference is very poor: You're essentially looking at two dereferences (shared_ptr and string).
A logical approach would be to make m_objects a std::multimap keyed by type.
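A minimal sketch of that multimap idea follows. The Object class here is a stand-in I made up (the question doesn't show its definition), Add is a hypothetical insertion function, and I return a std::vector instead of a std::list purely as an assumption about what the caller needs.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's base class.
struct Object {
    explicit Object(std::string t) : type(std::move(t)) {}
    virtual ~Object() = default;
    const std::string& GetType() const { return type; }
    std::string type;
};

class ObjectManager {
public:
    // The type string is read once, at insertion time.
    void Add(std::shared_ptr<Object> obj) {
        assert(obj != nullptr);
        std::string key = obj->GetType();
        m_objects.emplace(std::move(key), std::move(obj));
    }

    // O(log n) to find the range, then linear only in the number of matches.
    std::vector<std::shared_ptr<Object>>
    GetObjectsOfType(const std::string& type) const {
        std::vector<std::shared_ptr<Object>> result;
        auto range = m_objects.equal_range(type);
        for (auto it = range.first; it != range.second; ++it)
            result.push_back(it->second);
        return result;
    }

private:
    std::multimap<std::string, std::shared_ptr<Object>> m_objects;
};
```

The cost moves from every query to every insertion, which only pays off if GetObjectsOfType is called much more often than objects are added. It also assumes an object's type never changes after it has been added.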
Some things you can do to speed up:
Store the type on the base class, this will remove a somewhat expensive virtual lookup.
If type is a string, etc., change it to a simple type like an enum or int.
A vector is more efficient to traverse than a deque.
if staying with deque, use iterators or a range based for loop to avoid the random lookups (which are more expensive in deque)
Range based looks like this:
for (auto const& obj : m_objects)
{
if (obj->GetType() == type)
{
objectsOfType.push_back(obj);
}
}
Update: Also I would recommend against using a std::list (unless for some reason you have to) as it is not really performing well in many cases - again std::vector springs to the rescue !

std::list, move item in list using iterators only

It seems to me given what I know about linked lists that this should be possible but I haven't found anywhere that has the answer so I'm asking here.
Given two iterators to items in the same list, I'd like to take the item pointed to by iterator "frm" and "insert" it into the list before the item pointed to by iterator "to".
It seems that all that is needed is to change the pointers on the items in the list pointing to "frm" (to remove "frm"), then changing the pointers on the item pointing at "to" so that it references "frm" then changing the pointers on "frm" node to point to "to".
I looked everywhere for this and couldn't find an answer.
NOTE that I cannot use splice as I do not have access to the list only the iterators to the items in the list.
template <typename T>
void move(typename std::list<T>::iterator frm, typename std::list<T>::iterator to) {
//remove the item from the list at frm
//insert the item at frm before the item at to
}
Iterators contain the minimal information required to point to a piece of data. What you are missing is that a linked list has other bookkeeping that goes along with it as well, so essentially the list class looks something like the following:
template <typename Type>
class list {
int size; // for O(1) size()
Type* head;
Type* tail;
class Iterator {
Type* element;
// no back pointer to list<Type>*
};
...
};
And to remove an element from the list you would need to update those data members as well. To do that, an iterator would have to contain a back pointer to the list itself, which the iterator interface does not require. Notice also that the algorithms in the STL do not actually modify the bookkeeping of the containers they operate on; at most they swap elements and rearrange things.
I would encourage you to look into the <algorithm> header, as well as facilities like std::back_inserter and std::move_iterator, to get an idea of how iterators are wrapped to actually modify the container they represent.
How this is done is implementation-defined, but the C++ standard allows the use of iter_swap, though it doesn't do exactly this. It may be optimized to swap the pointers to the values held in the linked list, similar to what I have described, effectively reordering the items in the list without a full swap.
iter_swap() versus swap() -- what's the difference?
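For comparison, if the list can be made reachable after all, the relinking described in the question is what the single-element overload of std::list::splice does. The helper name move_before is made up for this sketch, and unlike the signature in the question it takes the list as a parameter.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Moves the element at `frm` so that it sits immediately before `to`.
// Both iterators must refer to `lst`; `frm` must be dereferenceable.
template <typename T>
void move_before(std::list<T>& lst,
                 typename std::list<T>::iterator frm,
                 typename std::list<T>::iterator to) {
    assert(frm != lst.end());
    // Single-element splice within one list: the node is relinked in O(1),
    // nothing is copied or moved, and existing iterators stay valid.
    // It is a no-op when frm == to or when frm is already just before to.
    lst.splice(to, lst, frm);
}
```

For example, with the list {1, 2, 3, 4}, moving the iterator to 4 before the iterator to 2 gives {1, 4, 2, 3}, and the iterator that pointed to 4 still points to 4 afterwards.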

std::forward_list -- erasing with a stored iterator

I'm trying to keep a global list of a particular (base) class's instances so that I can track them down by iterating through this global list at any time.
I believe the most proper way to address this is with an intrusive list. I have heard that one can encounter these creatures by digging into the Linux kernel, for example.
In the situation where I'm in, I don't really need such guarantees of performance, and using intrusive lists will complicate matters somewhat for me.
Here's what I've got so far to implement this concept of a class that knows about all of its instances.
class A {
static std::forward_list<A*> globallist;
std::forward_list<A*>::iterator listhandle;
public:
A() {
globallist.push_front(this);
listhandle = globallist.begin();
}
virtual ~A() {
globallist.erase_after(...); // problem
}
};
The problem is that there is no forward_list::erase(), and it really does not appear like saving globallist.before_begin() in the ctor would do me much good. I'm never supposed to dereference before_begin()'s iterator. Will it actually hold on to the position? If I save out before_begin's iterator, and then push_front() a new item, that iterator is probably still not capable of being dereferenced, but will it be serviceable for sending to erase_after()?
forward_list is a singly linked list. To remove a node in the middle of that, you must have a pointer to previous node, somehow. For example, you could do something like this:
class A {
static std::forward_list<A*> globallist;
std::forward_list<A*>::iterator prev_node;
public:
A() {
A* old_head = globallist.front();
globallist.push_front(this);
prev_node = globallist.before_begin();
old_head->prev_node = globallist.begin();
}
};
The case of pushing the first element into an empty list, as well as the removal logic, are left as an exercise for the reader (when removing, copy your prev_node to the next node's prev_node).
Or, just use std::list and avoid all this trouble.
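A minimal sketch of the std::list variant, keeping the member names from the question. The instances() accessor is something I added for illustration, and copying is disabled here only to keep the sketch short.

```cpp
#include <cassert>
#include <list>

class A {
public:
    // insert returns an iterator to the new node, which stays valid
    // until that node itself is erased.
    A() : listhandle(globallist.insert(globallist.end(), this)) {}

    // O(1) removal using the stored iterator; no predecessor is needed.
    virtual ~A() { globallist.erase(listhandle); }

    A(const A&) = delete;
    A& operator=(const A&) = delete;

    // Hypothetical accessor so callers can iterate over all live instances.
    static const std::list<A*>& instances() { return globallist; }

private:
    static std::list<A*> globallist;
    std::list<A*>::iterator listhandle;
};

// Definition of the static member (one translation unit only).
std::list<A*> A::globallist;
```

The price compared to forward_list is one extra pointer per node, which is usually a fair trade for not having to patch up a neighbour's prev_node on every insert and erase.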

Store pointers to objects in multiple containers

For the sake of presenting my question, let's assume I have a set of pointers (same type)
{p1, p2, ..., pn}
I would like to store them in multiple containers, as I need different strategies to access them. Suppose I want to store them in two containers, a linked list and a hash table. The linked list gives me the order and the hash table gives me fast access. Now, the problem is that if I remove a pointer from one container, I need to remember to remove it from the other container too. This makes the code hard to maintain. So the question is: are there other patterns or data structures to manage a situation like this? Would smart pointers help here?
If I understand correctly, you want to link your containers so that removing from one removes from all. I don't think this is directly possible. Possible solutions:
Re-design the whole object architecture, so that a pointer is not in many containers.
Use the Boost Multi-index Containers Library to get all the features you want in one container.
Use a map key instead of a direct pointer to track objects, and keep the pointer itself in one map.
Use std::weak_ptr so you can check whether an item has been deleted somewhere else, and turn it into a std::shared_ptr while it is in use (you need one container to hold the "master" std::shared_ptr that keeps the object around when it is not in use).
Create a function/method/class for deleting objects which knows all the containers, so that all deletion is in one place and you can't accidentally forget one.
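To make the std::weak_ptr option from the list above a bit more concrete, here is a small sketch. Widget, Registry and the member names are all invented for the example: the hash map holds the owning shared_ptr, and the list holds only weak_ptrs that are pruned lazily.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>

struct Widget { int value; };  // hypothetical payload type

struct Registry {
    // Owning container: the "master" shared_ptr lives here.
    std::unordered_map<std::string, std::shared_ptr<Widget>> byName;
    // Non-owning view that preserves insertion order.
    std::list<std::weak_ptr<Widget>> ordered;

    void add(const std::string& name, int value) {
        auto p = std::make_shared<Widget>(Widget{value});
        byName[name] = p;
        ordered.push_back(p);
    }

    // Erasing from the owner is enough; the list entry simply expires.
    void remove(const std::string& name) { byName.erase(name); }

    // Visit live elements in order, dropping expired entries on the way.
    int sum_live() {
        int total = 0;
        for (auto it = ordered.begin(); it != ordered.end();) {
            if (auto sp = it->lock()) {
                total += sp->value;
                ++it;
            } else {
                it = ordered.erase(it);
            }
        }
        return total;
    }
};
```

Stale entries stay in the list until the next traversal, so ordered.size() is not a reliable count of live objects. Whether that is acceptable depends on how the list is used.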
Why don't you create your own class which contains both a std::list and a std::unordered_map, and provides access functions so that you can traverse the elements linearly through the list and look them up randomly through the unordered_map? Deletion then removes from both containers and insertion inserts into both (a kind of wrapper class :P).
You can also consider using std::map with a comparison function that always keeps your data ordered in the desired way; you can then also access the elements randomly in log N time.
As usual, try to isolate this logic to make things easier to support: some small class with a safe public interface (sorry, I didn't compile this, it is just pseudocode).
template<class Id, class Ptr>
class Store
{
public:
void add(Id id, Ptr ptr)
{
m_ptrs.push_back(ptr);
m_ptrById.insert(std::make_pair(id, ptr));
}
void remove(Ptr ptr)
{
// remove in sync as well
}
private:
std::list<Ptr> m_ptrs;
std::map<Id, Ptr> m_ptrById;
};
Then use Store for keeping your pointers in sync.
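One way the elided remove could be filled in is sketched below. Note that this changes the design slightly, which is my assumption and not part of the answer above: the map stores an iterator into the list instead of a second copy of the pointer, and removal is by id instead of by pointer, so both containers can be updated without searching the list.

```cpp
#include <cassert>
#include <list>
#include <map>

template <class Id, class Ptr>
class Store {
public:
    // Returns false (and changes nothing) if the id is already present.
    bool add(const Id& id, Ptr ptr) {
        if (m_byId.count(id) != 0) return false;
        auto pos = m_ptrs.insert(m_ptrs.end(), ptr);
        m_byId.emplace(id, pos);
        return true;
    }

    // Removes from both containers; returns false if the id is unknown.
    bool remove(const Id& id) {
        auto it = m_byId.find(id);
        if (it == m_byId.end()) return false;
        m_ptrs.erase(it->second);  // O(1) via the stored list iterator
        m_byId.erase(it);
        return true;
    }

    // Returns a value-initialized Ptr (nullptr for raw pointers) if missing.
    Ptr find(const Id& id) const {
        auto it = m_byId.find(id);
        return it == m_byId.end() ? Ptr() : *it->second;
    }

    // Read-only view in insertion order.
    const std::list<Ptr>& ordered() const { return m_ptrs; }

private:
    std::list<Ptr> m_ptrs;
    std::map<Id, typename std::list<Ptr>::iterator> m_byId;
};
```

Because the containers are private and only add and remove touch them, they cannot drift out of sync. Swapping std::map for std::unordered_map would give the hash-table access the question asks for, provided Id is hashable.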
If I understand your problem correctly, you are less concerned with memory management (the new/delete issue) and more concerned with the actual "book keeping" of which element is valid or not.
So, I was thinking of wrapping each point with a "reference counter"
template< class Point >
class BookKeeping {
public:
enum { LIST_REF = 0x01,
HASH_REF = 0x02 };
BookKeeping( const Point& p ): m_p(p), m_refCount( 0x3 ) {} // assume object created in both containers
bool isValid() const { return m_refCount == 0x3; } // not "freed" from any container
void remove( unsigned int from ) { m_refCount = m_refCount & ~from; } // clear the bit(s) for that container
private:
Point m_p;
unsigned int m_refCount;
};
See the answer (the only one, by now) to this similar question. In that case a deque is proposed instead of a list, since the OP only wanted to insert/remove at the ends of the sequence.
Anyway, you might prefer to use the Boost Multi-index Containers Library.

need a std::vector with O(1) erase

I was surprised to find out that vector::erase moves elements when it is called. I thought it would swap the last element with the "to-be-deleted" element and reduce the size by one. My first reaction was: "let's extend std::vector and override erase()". But I found in many threads, like "Is there any real risk to deriving from the C++ STL containers?", that it can cause memory leaks. But I am not adding any new data member to vector, so there is no additional memory to be freed. Is there still a risk?
Some suggest that we should prefer composition over inheritance. I can't make sense of this advice in this context. Why should I waste my time in the "mechanical" task of wrapping every function of the otherwise wonderful std::vector class.? Inheritance indeed makes the most sense for this task - or am I missing something?
Why not just write a standalone function that does what you want:
template<typename T>
void fast_erase(std::vector<T>& v, size_t i)
{
v[i] = std::move(v.back());
v.pop_back();
}
All credit to Seth Carnegie though. I originally used "std::swap".
Delicate issue. The first guideline you're breaking is: "Inheritance is not for code reuse". The second is: "Don't inherit from standard library containers".
But: If you can guarantee, that nobody will ever use your unordered_vector<T> as a vector<T> you're good. However, if somebody does, the results may be undefined and/or horrible, regardless of how many members you have (it may seem to work perfectly but nevertheless be undefined behaviour!).
You could use private inheritance, but that would not free you from writing wrappers or pulling member functions in with lots of using statements, which would almost be as much code as composition (a bit less, though).
Edit: What I mean with using statements is this:
class Base {
public:
void dosmth();
};
class Derived : private Base {
public:
using Base::dosmth;
};
class Composed {
private:
Base base;
public:
void dosmth() {return base.dosmth(); }
};
You could do this with all member functions of std::vector. As you can see Derived is significantly less code than Composed.
The risk of inheritance is in the following example:
std::vector<something> *v = new better_vector<something>();
delete v;
That would cause problems because you deleted a pointer to a base class with no virtual destructor.
However if you always delete a pointer to your class like:
better_vector<something> *v = new better_vector<something>();
delete v;
Or don't allocate it on the heap, there is no danger. You don't need to call the parent destructor yourself; it runs automatically after your destructor.
I thought it would swap the last element with the "to-be-deleted"
element and reduce the size by one.
vector::erase maintains the order of the elements, while moving the last element into the erased slot and reducing the size by one does not. Since vector is implemented as an array, there is no O(1) way to erase and maintain the order of the elements at the same time (unless you remove the last element).
If maintaining the order of the elements is not important, then your solution is fine; otherwise, you had better use another container (for example list, which implements a doubly-linked list).
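To make the ordering difference concrete, here is a small side-by-side sketch. fast_erase is the swap-and-pop idea from the earlier answer with a guard added for the last element; ordered_erase is just a thin wrapper I added around vector::erase for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Swap-and-pop erase: O(1), but does not preserve element order.
template <typename T>
void fast_erase(std::vector<T>& v, std::size_t i) {
    assert(i < v.size());
    if (i + 1 != v.size())           // avoid self-move when erasing the last element
        v[i] = std::move(v.back());
    v.pop_back();
}

// Order-preserving erase: linear in the number of elements after index i.
template <typename T>
void ordered_erase(std::vector<T>& v, std::size_t i) {
    assert(i < v.size());
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(i));
}
```

Erasing index 1 from {1, 2, 3, 4, 5} leaves {1, 5, 3, 4} with fast_erase and {1, 3, 4, 5} with ordered_erase. Both are free functions, so no inheritance from std::vector is needed for either.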