The story begins with something I thought was pretty simple:
I need to design a class that will use some STL containers. I need to give users of the class access to an immutable version of those containers. I do not want users to be able to change the container (they cannot push_back() on a list, for instance), but I want users to be able to change the contained objects (get an element with back() and modify it):
class Foo
{
public:
// [...]
ImmutableListWithMutableElementsType getImmutableListWithMutableElements();
// [...]
};
// [...]
myList = foo.getImmutableListWithMutableElements();
myElement = myList.back();
myElement.change(42); // OK
// [...]
// myList.push_back(myOtherElement); // Not possible
At first glance, it seems that a const container will do. But of course, you can only use a const iterator on a const container, and you cannot change the content.
At second glance, things like a specialized container or iterator come to mind. I will probably end up with that.
Then, my thought is "Someone must have done that already!" or "An elegant, generic solution must exist!" and I'm here asking my first question on SO:
How do you design / transform a standard container into an immutable container with mutable content?
I'm working on it but I feel like someone will just say "Hey, I do that every time, it's easy, look !", so I ask...
Thank you for any hints, suggestions or wonderful generic ways to do that :)
EDIT:
After some experiments, I ended up with standard containers that hold specifically decorated smart pointers. It is close to Nikolai's answer.
The idea of an immutable container of mutable elements is not a killer concept; see the interesting notes in Oli's answer.
The idea of a specific iterator is right of course, but it does not seem practical as I need to adapt to any sort of container.
Thanks to you all for your help.
The simplest option would probably be a standard STL container of pointers, since const-ness is not propagated to the actual objects. One problem with this is that STL does not clean up any heap memory that you allocated. For that take a look at Boost Pointer Container Library or smart pointers.
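A minimal sketch of that idea, assuming std::unique_ptr is available and using a hypothetical Element type and elements() accessor (neither is from the question). The container is returned by const reference, so it cannot be resized, while the pointees stay mutable:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Element {                 // hypothetical element type
    int value = 0;
    void change(int v) { value = v; }
};

class Foo {
public:
    Foo() { elements_.push_back(std::make_unique<Element>()); }

    // The vector is const for the caller: no push_back(), no erase().
    // The pointed-to Elements are not const, so they can still be modified.
    const std::vector<std::unique_ptr<Element>>& elements() const {
        return elements_;
    }

private:
    std::vector<std::unique_ptr<Element>> elements_;
};

void demo() {
    Foo foo;
    const auto& list = foo.elements();
    list.back()->change(42);          // OK: the element is mutable
    // list.push_back(nullptr);       // would not compile: the vector is const
    assert(list.back()->value == 42);
}
```

The smart pointer also takes care of the cleanup problem mentioned above.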
Rather than providing the user with the entire container, could you just provide them non-const iterators to beginning and end? That's the STL way.
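As a rough illustration, assuming a hypothetical Element type and a Foo that privately owns a std::list, the class could expose only begin() and end(). A caller holding nothing but iterators can modify elements but has no way to insert or erase:

```cpp
#include <cassert>
#include <list>

struct Element { int value = 0; };   // hypothetical element type

class Foo {
public:
    using iterator = std::list<Element>::iterator;

    Foo() : elements_(3) {}          // three default elements, for illustration

    // Iterators allow traversal and element modification only; the list
    // itself stays private, so callers cannot change its size.
    iterator begin() { return elements_.begin(); }
    iterator end()   { return elements_.end(); }

private:
    std::list<Element> elements_;
};

void demo() {
    Foo foo;
    int sum = 0;
    for (Element& e : foo) {         // range-for works through begin()/end()
        e.value = 42;
        sum += e.value;
    }
    assert(sum == 126);
}
```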
You need a custom data structure iterator, a wrapper around your private list.
template<typename T>
class inmutable_list_it {
public:
    explicit inmutable_list_it(std::list<T>* real_list)
        : real_list_(real_list), it_(real_list->begin()) {}
    // Return references so that callers can modify the elements in place
    T& first() { return *(real_list_->begin()); }
    // Reset Iteration
    void reset() { it_ = real_list_->begin(); }
    // Returns current item
    T& current() { return *it_; }
    // Returns true if the iterator has a next element.
    bool hasNext();
private:
    std::list<T>* real_list_;
    typename std::list<T>::iterator it_;  // 'typename' is required for a dependent type
};
The painful solution:
/* YOU HAVE NOT SEEN THIS */
struct mutable_int {
mutable_int(int v = 0) : v(v) { }
operator int(void) const { return v; }
mutable_int const &operator=(int nv) const { v = nv; return *this; }
mutable int v;
};
Excuse me while I have to punish myself to atone for my sins.
Related
Can we overload the push_back() method in std::vector to allow non-duplicate elements? I know std::set and std::unordered_set are supposed to avoid duplicate elements, but std::set sorts the elements and std::unordered_set stores the elements in no particular order. I need to retrieve the elements in the order they are inserted, while ensuring duplicate elements are not inserted.
Edit: There's a possible duplicate for this question here. The best solution to that duplicate proposes an auxiliary data structure and a custom "add" method. This doesn't look good to me: even if I document it separately, users inserting data into a std::vector rarely consult the documentation for custom functions. If there's no efficient alternative, though, this can be a last resort.
Many people advise against it, but it seems there's some kind of urban legend going around that doing so will cause the universe to undergo vacuum decay and reality as we know it will dissolve.
You can publicly inherit from std::vector. But you have to think about what you can do with that.
If you inherit from vector, it is highly recommended that you don't add any data members to it. This can cause object slicing (google "c++ object slicing".) You also need to keep in mind that vector is not using virtual functions. That means you cannot override member functions. You can only shadow them, so it's not guaranteed that it will always be your push_back() function that gets called. The original will get called if you pass an object of your class to something that takes a reference to a vector, for example.
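The shadowing problem can be shown in a few lines. This is only a sketch with hypothetical names (UniqueInts, append_twice); the point is that code which sees a std::vector<int>& never reaches the derived push_back():

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical class that shadows (does not override) push_back().
struct UniqueInts : std::vector<int> {
    void push_back(int v) {
        if (std::find(begin(), end(), v) == end())
            std::vector<int>::push_back(v);
    }
};

// Sees only the base class, so it calls std::vector<int>::push_back.
void append_twice(std::vector<int>& v, int x) {
    v.push_back(x);
    v.push_back(x);
}

void demo() {
    UniqueInts u;
    u.push_back(1);
    u.push_back(1);          // ignored by the shadowing function
    assert(u.size() == 1);
    append_twice(u, 1);      // bypasses the uniqueness check entirely
    assert(u.size() == 3);
}
```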
So in the end, you'd need to add a push_back_unique() function instead. But that in turns means that can be served by a simple free function instead. So inheriting vector isn't needed. This of course means there's never a guarantee that the elements in the vector will be unique. Other code might use push_back() instead somewhere.
Inheriting vector makes sense if you want to add completely new convenience functions that don't impose or lift any restrictions that vector has. If you want something that looks like a vector but really isn't (because it has different behavior and/or restrictions), you should implement your own type that delegates the container functionality to vector by either inheriting privately from it, or by having it as a private data member, and then replicate the vector API through public wrapper functions.
But this is very tedious to implement. Usually, you don't really need all the API from vector. So I'd say just write a smaller class around vector that only provides the functionality you need. And that functionality sounds like it's going to be pretty much read-only, since allowing write access to the elements allows for setting an element to the same value as another, breaking the container's uniqueness. So you could do something like:
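#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

template<typename T>
class UniqueVector
{
public:
    // T&& is a plain rvalue reference here (T belongs to the class template),
    // so std::move is the right tool, not std::forward.
    void push_back(T&& elem)
    {
        if (std::find(vec_.begin(), vec_.end(), elem) == vec_.end()) {
            vec_.push_back(std::move(elem));
        }
    }
    const T& operator[](std::size_t index) const
    {
        return vec_[index];
    }
    auto begin() const
    {
        return vec_.cbegin();
    }
    auto end() const
    {
        return vec_.cend();
    }
private:
    std::vector<T> vec_;
};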
If you still want to allow write access to individual elements, then you can provide non-const functions that check if the value that is passed is already in the vector. Like:
void assign_if_unique(std::size_t index, T&& value)
{
    if (std::find(vec_.begin(), vec_.end(), value) == vec_.end()) {
        vec_[index] = std::move(value);
    }
}
This is a minimal example. You should obviously add the functions you actually want. Like size(), empty(), and whatever else you need.
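If the linear std::find becomes a bottleneck, one possible variant keeps an auxiliary std::unordered_set next to the vector, as the question's edit alludes to. This is a sketch under assumptions: the class name OrderedUniqueVector is hypothetical, T must be hashable and copyable, and each element is stored twice.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

template <typename T>
class OrderedUniqueVector {         // hypothetical name
public:
    // Returns true if the element was inserted, false if it was a duplicate.
    bool push_back(const T& elem) {
        if (!seen_.insert(elem).second) return false;  // already present
        vec_.push_back(elem);
        return true;
    }
    const T& operator[](std::size_t i) const { return vec_[i]; }
    std::size_t size() const { return vec_.size(); }
    auto begin() const { return vec_.cbegin(); }
    auto end() const { return vec_.cend(); }

private:
    std::vector<T> vec_;            // preserves insertion order
    std::unordered_set<T> seen_;    // average O(1) duplicate check
};

void demo() {
    OrderedUniqueVector<int> v;
    for (int n : {5, 8, 13, 5, 21}) v.push_back(n);
    assert(v.size() == 4);          // the second 5 was rejected
    assert(v[3] == 21);             // insertion order is kept
}
```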
You should first define a free function (see note 1) to implement your feature:
template<class T>
std::vector<T>&
push_back_unique(std::vector<T>& dest, T const& src)
{ /* ... */ }
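The body is left out above. One possible implementation, assuming a linear search is acceptable and T is equality-comparable, might look like this:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

template <class T>
std::vector<T>& push_back_unique(std::vector<T>& dest, T const& src) {
    // Append only if no equal element is already stored (O(n) search).
    if (std::find(dest.begin(), dest.end(), src) == dest.end())
        dest.push_back(src);
    return dest;   // returning dest allows calls to be chained
}

void demo() {
    std::vector<int> data;
    push_back_unique(push_back_unique(data, 5), 5);
    push_back_unique(data, 8);
    assert(data.size() == 2);
    assert(data[0] == 5 && data[1] == 8);
}
```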
If you use this a lot, and if it makes sense for your program, you might want to define an operator to do so:
template<class T>
std::vector<T>& operator<<(std::vector<T>& dest, T const& src)
{ return push_back_unique(dest, src); }
This allows:
std::vector<int> data;
data << 5 << 8 << 13 << 5 << 21;
for (auto n : data) std::cout << n << " "; // prints 5 8 13 21
1) This is because inheriting from standard containers is often bad practice and brings pitfalls.
I am starting to use C++11 features, and I like to use smart pointers only to own objects. Here is my class:
class MyClass {
public:
vector<MyObject*> get_objs() const;
private:
vector<unique_ptr<MyObject>> m_objs;
};
The semantics are that MyClass owns a series of MyObject instances, which are created through make_unique(). get_objs() returns a vector of raw pointers so that various callers can update the objects. Because those callers do not own the objects, the function does not return vector<unique_ptr>.
But this means I need to implement get_objs() like this:
vector<MyObject*> MyClass::get_objs() const
{
    vector<MyObject*> ret;
    // unique_ptr cannot be copied, so iterate by reference
    for (const auto& obj : m_objs) {
        ret.push_back(obj.get());
    }
    return ret;
}
My concern is get_objs() is called fairly often, each time there is an overhead to construct this raw pointer vector.
Is there something I could do here? If there is no c++11 tricks to save the overhead, should I just use type vector<MyObject*> for m_objs in the first place?
UPDATE 1
Jonathan Wakely's solution using operator[] improves mine so that caller can access individual object directly.
Is there any other solution? I do not mind going over all the places that call get_objs(), but I would like to see if there is an even better solution.
Another note - I cannot use BOOST, just some restriction I have to live with.
For a start you can use ret.reserve(m_objs.size()) to pre-allocate the right number of elements.
Alternatively, don't return a vector for callers to iterate over directly, but expose a vector-like interface instead:
class MyClass {
public:
struct iterator;
iterator begin();
iterator end();
MyObject* operator[](size_t n) { return m_objs[n].get(); }
private:
vector<unique_ptr<MyObject>> m_objs;
};
This allows the callers to modify the objects directly, rather than getting a container of pointers.
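The iterator type is only declared above. A minimal sketch of what it could look like, assuming a forward-only iterator is enough and using a stand-in MyObject and a hypothetical add() helper (a complete iterator would also provide the usual iterator_traits member types):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct MyObject { int value = 0; };   // stand-in for the real type

class MyClass {
public:
    // Dereferences to MyObject&, hiding the unique_ptr from callers.
    class iterator {
    public:
        explicit iterator(std::vector<std::unique_ptr<MyObject>>::iterator it)
            : it_(it) {}
        MyObject& operator*() const { return **it_; }
        MyObject* operator->() const { return it_->get(); }
        iterator& operator++() { ++it_; return *this; }
        bool operator==(const iterator& o) const { return it_ == o.it_; }
        bool operator!=(const iterator& o) const { return it_ != o.it_; }
    private:
        std::vector<std::unique_ptr<MyObject>>::iterator it_;
    };

    void add(int v) {                  // hypothetical helper for the example
        m_objs.push_back(std::make_unique<MyObject>());
        m_objs.back()->value = v;
    }
    iterator begin() { return iterator(m_objs.begin()); }
    iterator end() { return iterator(m_objs.end()); }
    MyObject* operator[](std::size_t n) { return m_objs[n].get(); }

private:
    std::vector<std::unique_ptr<MyObject>> m_objs;
};

void demo() {
    MyClass c;
    c.add(1);
    c.add(2);
    for (MyObject& o : c) o.value *= 10;   // modify in place, no copy of pointers
    assert(c[0]->value == 10 && c[1]->value == 20);
}
```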
class MyClass {
public:
std::vector<std::unique_ptr<MyObject>> const& get_objs() const {
return m_objs;
}
private:
std::vector<std::unique_ptr<MyObject>> m_objs;
};
A const std::unique_ptr<MyObject>& cannot steal ownership, and is not the same as a std::unique_ptr<const MyObject>: the pointed-to object stays mutable. A const std::vector<std::unique_ptr<MyObject>>& can only grant const access to its data, that is, to the pointers themselves.
In c++20 I would instead do this:
class MyClass {
public:
std::span<std::unique_ptr<MyObject> const> get_objs() const {
return {m_objs.begin(), m_objs.end()};
}
private:
std::vector<std::unique_ptr<MyObject>> m_objs;
};
which hides the implementation detail of "I am storing it in a vector" while exposing "I am storing it contiguously".
Prior to c++20, I advise finding or writing your own span type if you have the budget. They are quite useful.
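For illustration only, a hand-written span can be very small. The simple_span name and its interface are assumptions here, not a drop-in replacement for std::span; it is just a non-owning pointer plus a length:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical minimal span: a non-owning view over contiguous elements.
template <typename T>
class simple_span {
public:
    simple_span(T* data, std::size_t size) : data_(data), size_(size) {}
    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }
    T& operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
private:
    T* data_;
    std::size_t size_;
};

struct MyObject { int value = 0; };   // stand-in for the real type

class MyClass {
public:
    MyClass() { m_objs.push_back(std::make_unique<MyObject>()); }
    // The pointers are const (no reset, no ownership transfer),
    // but the pointed-to objects remain mutable.
    simple_span<const std::unique_ptr<MyObject>> get_objs() const {
        return {m_objs.data(), m_objs.size()};
    }
private:
    std::vector<std::unique_ptr<MyObject>> m_objs;
};

void demo() {
    MyClass c;
    for (const auto& p : c.get_objs()) p->value = 7;
    assert(c.get_objs()[0]->value == 7);
}
```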
If you can use Boost, try indirect_iterator (http://www.boost.org/doc/libs/1_55_0b1/libs/iterator/doc/indirect_iterator.html). You need to define iterator, begin and end in your class:
typedef boost::indirect_iterator<vector<unique_ptr<MyObject>>::iterator> iterator;
iterator begin() { return make_indirect_iterator(m_objs.begin()); }
Then your class exposes iterator, the value of which is reference (not pointer!) to MyObject. You can iterate and access the elements of the vector directly.
For the record, I think something like Jonathan Wakely's answer is the way to go. But since you asked for more possibilities, another one is to use shared_ptr instead of unique_ptr:
class MyClass {
public:
const vector<shared_ptr<MyObject>>& get_objs() const {
return m_objs;
}
private:
vector<shared_ptr<MyObject>> m_objs;
};
This improves the original code in two ways:
There is no longer any need to build up a new vector in get_objs; you can just return a reference to the one you have.
You no longer need to worry about wild pointers in the case where a caller keeps the return value alive longer than the object that returned it--shared_ptr ensures the pointed-to objects aren't deleted until all references have been released.
On another note, get_objs arguably should not be const. Calling code can't modify the vector itself, but it can modify the MyObjects it contains.
Another way is not to return any objects at all and encapsulate how you are storing the data (tell don't ask).
Usually when you get the items you end up iterating through them all, so it makes sense to wrap that up in the class and pass in a function you want your objects to pass through.
The non-Boost way would be something like
void MyClass::VisitItems(std::function<void(MyObject&)> f)
{
    // iterate by reference: unique_ptr cannot be copied
    for (auto& obj : m_objs)
    {
        f(*obj);
    }
}
I have the following problem and I wonder whether there's a better way to solve it:
class myObj {
public:
typedef std::shared_ptr<myObj> handle;
typedef std::shared_ptr<const myObj> const_handle;
int someMethod() { ... }
int someConstMethod() const { ... }
};
Now what I need is a container class that somehow allows you to modify or read a collection of myObj depending on its own constness, like so:
class myCollection {
public:
typedef std::list<myObj::handle> objList;
typedef std::list<myObj::const_handle> const_objList;
inline objList& modify() { return _obl; }
// it would be nice to do this, but it won't compile as
// objList and const_objList are completely different types
inline const_objList& read() const { return _obl; } // doh! compile error...
// returning a const objList won't help either as it would return non-const
// handles, obviously.
// so I am forced to do this, which sucks as i have to create a new list and copy
void read(const_objList &l) {
std::for_each(
_obl.begin(),
_obl.end(),
[&l] (myObj::handle &h) { l.push_back(h); }
// ok as handle can be cast to const_handle
); // for_each
}
private:
objList _obl;
};
So this solution actually works as a const myCollection would only allow you to get a list of const_handle which only allows you to call non-modifying methods of myObj (GOOD).
The problem is that the "read" method is really ugly (BAD).
Another method would be to expose somehow the list methods and return const_handle and handle as needed but it's a lot of overhead, especially if you want to use something more complex than a list.
Any idea?
A list-of-pointers-to-T is not a list-of-pointers-to-constant-T.
std::list<std::shared_ptr<int>> a;
std::list<std::shared_ptr<const int>>& ra = a; // illegal but imagine it's not
std::shared_ptr<const int> x = std::make_shared<const int>(42);
ra.push_back(x); // totally legal, right?
++**a.begin(); // oops... just incremented a const int
Now a list-of-pointers-to-T can, conceptually, be viewed as a constant-list-of-constant-pointers-to-constant-T, but std::list<std::shared_ptr<T>> does not support such deep const propagation. const std::list<std::shared_ptr<T>> contains constant pointers to non-constant objects.
You can write your own variant of list<> or your own variant of shared_ptr<> that have such support. It probably won't be very easy though. A const_propagating_shared_ptr is probably the easier of the two. It would have to encapsulate an std::shared_ptr<T> object and forward almost everything to it as-is. As opposed to std::shared_ptr<T> it would have separate const and non-const versions of operator->, operator*() and get().
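A rough sketch of such a wrapper, under the assumption that only get(), operator* and operator-> need forwarding. The class body and the read_first() helper are illustrative, and myObj is a reduced stand-in for the question's class:

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <utility>

// Hypothetical wrapper: const access to the wrapper yields const access to T.
template <typename T>
class const_propagating_shared_ptr {
public:
    explicit const_propagating_shared_ptr(std::shared_ptr<T> p)
        : p_(std::move(p)) {}

    T* get() { return p_.get(); }
    const T* get() const { return p_.get(); }
    T& operator*() { return *p_; }
    const T& operator*() const { return *p_; }
    T* operator->() { return p_.get(); }
    const T* operator->() const { return p_.get(); }
private:
    std::shared_ptr<T> p_;
};

struct myObj {                        // reduced stand-in for the question's class
    int n = 0;
    int someMethod() { return ++n; }
    int someConstMethod() const { return n; }
};

using objList = std::list<const_propagating_shared_ptr<myObj>>;

int read_first(const objList& l) {
    // l.front()->someMethod();       // would not compile: pointee is const here
    return l.front()->someConstMethod();
}

void demo() {
    objList l;
    l.emplace_back(std::make_shared<myObj>());
    l.front()->someMethod();          // non-const list: modification allowed
    assert(read_first(l) == 1);
}
```

One caveat of this sketch: copying a const wrapper produces a non-const wrapper that shares the same object, so a real implementation would have to decide how copies behave.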
Given what you stated that you want to accomplish, I don't think that your solution is too bad. Imagine that some other code may be modifying the internal collection, like adding or removing values. Returning a copy of the current state of the collection is safe for client code, since it can work on the copy, without the danger of element being deleted in the meantime. But I digress, this is getting into threading issues and may not be relevant.
You could write it more neatly:
inline const_objList read() const
{
const_objList cl(_obl.begin(), _obl.end());
return cl;
}
However, I do think that your problems derive from mixing two types of constness: constness of the members of the collection versus the constness of the collection itself.
Instead of Modify and Read methods, that deal with the list as a whole, I would try exposing const and non-const iterators to internal list, through corresponding const and non-const methods returning said iterators.
But this immediately begs the question: why then have myCollection in the first place?
Creating entirely new collection type around std::list doesn't seem needed, unless you get a lot of proverbial bang for the buck from other, added functionality that is not visible in your sample.
You can then make your added functionality free methods that take std::list of your handles as the input. Not everything requires an object and operations on objects need not necessarily be member methods, unless access to private data is required.
You mentioned maybe using another container instead of the list. But your class, as is, won't do it, unless you have a template, where template parameter can be one of STL containers.
Which then implies that you should expose iterators.
Namely, if you foresee changing the internal collection type, you would want to make the public interface to myCollection transparent regarding the collection type. Otherwise, clients will have to recompile each time you change your mind about the internal implementation.
EDIT -----
Finally, if implementing iterators (while interesting and most correct) is too much, why not go for simple getters like in this SO post:
smart pointer const correctness
I'll quote the topmost answer by Rüdiger Stevens (it assumes vector instead of list):
template <typename T>
class MyExample
{
private:
vector<shared_ptr<T> > data;
public:
shared_ptr<const T> get(int idx) const
{
return data[idx];
}
shared_ptr<T> get(int idx)
{
return data[idx];
}
void add(shared_ptr<T> value)
{
data.push_back(value);
}
};
This is a pretty straightforward architectural question, however it's been niggling at me for ages.
The whole point of using a list, for me anyway, is that it's O(1) insert/remove.
The only way to have an O(1) removal is to have an iterator for erase().
The only way to get an iterator is to keep hold of it from the initial insert() or to find it by iteration.
So, what to pass around; an Iterator or a pointer?
It would seem that if it's important to have fast removal, such as some sort of large list which is changing very frequently, you should pass around an iterator, and if you're not worried about the time to find the item in the list, then pass around the pointer.
Here is a typical cut-down example:
In this example we have some type called Foo. Foo is likely to be a base class pointer, but it's not here for simplicity.
Then we have FooManager, which holds a list of shared_ptr, FooPtr. The manager is responsible for the lifetime of the object once it's been passed to it.
Now, what to return from addFoo()?
If I return a FooPtr then I can never remove it from the list in O(1), because I will have to find it in the list.
If I return a std::list::iterator, FooPtrListIterator, then anywhere I need to remove the FooPtr I can, just by dereferencing the iterator.
In this example I have a contrived example of a Foo which can kill itself under some circumstance, Foo::killWhenConditionMet().
Imagine some Foo that has a timer which is ticking down to 0, at which point it needs to ask the manager to delete itself. The trouble is that 'this' is a naked Foo*, so the only way to delete itself, is to call FooManager::eraseFoo() with a raw pointer. Now the manager has to search for the object pointer to get an iterator so it can be erased from the list, and destroyed.
The only way around that is to store the iterator in the object. i.e Foo has a FooPtrListIterator as a member variable.
struct Foo;
typedef boost::shared_ptr<Foo> FooPtr;
typedef std::list<FooPtr> FooPtrList;
typedef FooPtrList::iterator FooPtrListIterator;
struct FooManager
{
FooPtrList l;
FooPtrListIterator addFoo(Foo *foo) {
return l.insert(l.begin(), FooPtr(foo));
}
void eraseFoo(FooPtrListIterator foo) {
l.erase(foo);
}
void eraseFoo(Foo *foo) {
for (FooPtrListIterator it=l.begin(), ite=l.end(); it!=ite; ++it) {
if ((*it).get()==foo){
eraseFoo(it);
return;
}
}
assert(false && "foo not found!"); // a bare string literal is always true and would never fire
}
};
FooManager g_fm;
struct Foo
{
int _v;
Foo(int v):_v(v) {
}
~Foo() {
printf("~Foo %d\n", _v);
}
void print() {
printf("%d\n", _v);
}
void killWhenConditionMet() {
// Do something that will eventually kill this object, like a timer
g_fm.eraseFoo(this);
}
};
void printList(FooPtrList &l)
{
printf("-\n");
for (FooPtrListIterator it=l.begin(), ite=l.end(); it!=ite; ++it) {
(*it)->print();
}
}
void test2()
{
FooPtrListIterator it1=g_fm.addFoo(new Foo(1));
printList(g_fm.l);
FooPtrListIterator it2=g_fm.addFoo(new Foo(2));
printList(g_fm.l);
FooPtrListIterator it3=g_fm.addFoo(new Foo(3));
printList(g_fm.l);
(*it2)->killWhenConditionMet();
printList(g_fm.l);
}
So, the questions I have are:
1. If an object needs to delete itself, or have some other system delete it, in O(1), do I have to store an iterator to the object inside the object? If so, are there any gotchas to do with iterators becoming invalid due to other container operations?
2. Is there simply another way to do this?
3. As a side question, does anyone know why none of the 'push*' STL container operations return the resulting iterator, meaning one has to resort to 'insert*'?
Please, no answers that say "don't pre-optimise", it drives me nuts. ;) This is an architectural question.
C++ standard in its [list.modifiers] section says that any list insertion operation "does not affect the validity of iterators and references", and any removal operation "invalidates only the iterators and references to the erased elements". So keeping iterators around would be safe.
Keeping iterators inside the objects also seems sane, especially if you don't call them iterators but give them a name like FooManagerHandlers, which the removal function processes in an opaque way. Indeed, you do not store "iterators", you store "representatives" of objects in an organized structure. These representatives are used to define the position of an object inside that structure. This is a separate, fairly high-level concept, and there's nothing illogical in implementing it.
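A condensed sketch of that handle idea, using std::shared_ptr instead of boost::shared_ptr and simplified versions of Foo and FooManager (the handle member and the size() helper are assumptions for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <utility>

struct Foo {
    explicit Foo(int v) : value(v) {}
    int value;
    // Opaque position inside the manager's list; set by FooManager::addFoo.
    std::list<std::shared_ptr<Foo>>::iterator handle;
};

class FooManager {
public:
    using Handle = std::list<std::shared_ptr<Foo>>::iterator;

    Handle addFoo(std::shared_ptr<Foo> foo) {
        Foo* raw = foo.get();
        Handle h = foos_.insert(foos_.end(), std::move(foo));
        raw->handle = h;             // the object remembers where it lives
        return h;
    }
    // O(1): no search needed, the object carries its own position.
    // Note: foo may be destroyed by this call if the list held the last reference.
    void eraseFoo(Foo& foo) { foos_.erase(foo.handle); }
    std::size_t size() const { return foos_.size(); }

private:
    std::list<std::shared_ptr<Foo>> foos_;
};

void demo() {
    FooManager m;
    auto a = std::make_shared<Foo>(1);
    auto b = std::make_shared<Foo>(2);
    m.addFoo(a);
    m.addFoo(b);
    m.eraseFoo(*a);                  // other elements' handles stay valid
    assert(m.size() == 1);
    m.eraseFoo(*b);
    assert(m.size() == 0);
}
```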
However, the point of using lists is not just O(1) insert/remove, but also keeping elements in an order. If you don't need any order, then you would probably find hash tables more useful.
The one problem I see with storing the iterator in the object is that you must be careful about deleting the object through some other iterator: your object's destructor does not know where it was destroyed from, so you can end up with an invalid iterator in the destructor.
The reason that push* does not return an iterator is that it is the inverse of pop*, allowing you to treat your container as a stack, queue, or deque.
So I have some legacy code which I would love to use more modern techniques. But I fear that given the way that things are designed, it is a non-option. The core issue is that often a node is in more than one list at a time. Something like this:
struct T {
T *next_1;
T *prev_1;
T *next_2;
T *prev_2;
int value;
};
This allows the code to allocate a single object of type T and insert it into 2 doubly linked lists, nice and efficient.
Obviously I could just have 2 std::list<T*>'s and just insert the object into both...but there is one thing which would be way less efficient...removal.
Often the code needs to "destroy" an object of type T and this includes removing the element from all lists. This is nice because given a T* the code can remove that object from all lists it exists in. With something like a std::list I would need to search for the object to get an iterator, then remove that (I can't just pass around an iterator because it is in several lists).
Is there a nice c++-ish solution to this, or is the manually rolled way the best way? I have a feeling the manually rolled way is the answer, but I figured I'd ask.
As another possible solution, look at Boost Intrusive, which has an alternate list class with a lot of properties that may make it useful for your problem.
In this case, I think it'd look something like this:
using namespace boost::intrusive;
struct tag1; struct tag2;
typedef list_base_hook< tag<tag1> > base1;
typedef list_base_hook< tag<tag2> > base2;
class T: public base1, public base2
{
int value;
};
list<T, base_hook<base1> > list1;
list<T, base_hook<base2> > list2;
// constant time to get iterator of a T item:
where_in_list1 = list1.iterator_to(item);
where_in_list2 = list2.iterator_to(item);
// once you have iterators, you can remove in constant time, etc.
Instead of managing your own next/previous pointers, you could indeed use an std::list. To solve the performance of remove problem, you could store an iterator to the object itself (one member for each std::list the element can be stored in).
You can extend this to store a vector or array of iterators in the class (in case you don't know the number of lists the element is stored in).
I think the proper answer depends on how performance-critical this application is. Is it in an inner loop that could potentially cost the program a user-perceivable runtime difference?
There is a way to create this sort of functionality by creating your own classes derived from some of the STL containers, but it might not even be worth it to you. At the risk of sounding tiresome, I think this might be an example of premature optimization.
The question to answer is why this C struct exists in the first place. You can't re-implement the functionality in C++ until you know what that functionality is. Some questions to help you answer that are,
Why lists? Does the data need to be in sequence, i.e., in order? Does the order mean something? Does the application require ordered traversal?
Why two containers? Does membership in the container indicate some kind of property of the element?
Why a double-linked list specifically? Is O(1) insertion and deletion important? Is reverse-iteration important?
The answer to some or all of these may be, "no real reason, that's just how they implemented it". If so, you can replace that intrusive C-pointer mess with a non-intrusive C++ container solution, possibly containing shared_ptrs rather than ptrs.
What I'm getting at is, you may not need to re-implement anything. You may be able to discard the entire business, and store the values in proper C++ containers.
How's this?
struct T {
std::list<T*>::iterator entry1, entry2;
int value;
};
std::list<T*> list1, list2;
// init a T* item:
item = new T;
item->entry1 = list1.end();
item->entry2 = list2.end();
// add a T* item to list 1:
item->entry1 = list1.insert(<where>, item);
// remove a T* item from list1
if (item->entry1 != list1.end()) {
list1.erase(item->entry1); // this is O(1); list::remove takes a value and searches
item->entry1 = list1.end();
}
// code for list2 management is similar
You could make T a class and use constructors and member functions to do most of this for you. If you have a variable number of lists, you can use a std::vector<std::list<T*>::iterator> to track the item's position in each list.
Note that if you use push_back or push_front to add to the list, you need to do item->entry1 = list1.end(); item->entry1--; or item->entry1 = list1.begin(); respectively to get the iterator pointed in the right place.
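The push_back case can also be written with std::prev, which avoids the separate decrement. A small sketch, with a hypothetical append() helper and a T reduced to one list entry:

```cpp
#include <cassert>
#include <iterator>
#include <list>

struct T {
    std::list<T*>::iterator entry1;
    int value = 0;
};

// Appends item to lst and records its position; std::prev(lst.end()) is the
// iterator to the element that push_back just added.
void append(std::list<T*>& lst, T& item) {
    lst.push_back(&item);
    item.entry1 = std::prev(lst.end());
}

void demo() {
    std::list<T*> list1;
    T a, b;
    append(list1, a);
    append(list1, b);
    assert(*a.entry1 == &a);
    list1.erase(a.entry1);           // O(1) removal through the stored iterator
    assert(list1.size() == 1 && list1.front() == &b);
}
```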
It sounds like you're talking about something that could be addressed by applying graph theory. As such the Boost Graph Library might offer some solutions.
list::remove is what you're after. It'll remove any and all objects in the list with the same value as what you passed into it.
So:
list<T> listOne, listTwo;
// Things get added to the lists.
T thingToRemove;
listOne.remove(thingToRemove);
listTwo.remove(thingToRemove);
I'd also suggest converting your list node into a class; that way C++ will take care of memory for you.
class MyClass {
public:
int value;
// Any other values associated with T
};
list<MyClass> listOne, listTwo; // can add and remove MyClass objects w/o worrying about destroying anything.
You might even encapsulate the two lists into their own class, with add/remove methods for them. Then you only have to call one method when you want to remove an object.
class TwoLists {
private:
list<MyClass> listOne, listTwo;
// ...
public:
void remove(const MyClass& thing) {
listOne.remove(thing);
listTwo.remove(thing);
}
};