Storing object references in a simple container - c++

I am looking for a way to insert multiple objects of type A inside a container object, without making copies of each A object during insertion. One way would be to pass the A objects by reference to the container, but, unfortunately, as far as I've read, the STL containers only accept passing objects by value for insertions (for many good reasons). Normally, this would not be a problem, but in my case, I DO NOT want the copy constructor to be called and the original object to get destroyed, because A is a wrapper for a C library, with some C-style pointers to structs inside, which will get deleted along with the original object...
I only require a container that can return one of its objects, given a particular index, and store a certain number of items which is determined at runtime, so I thought that maybe I could write my own container class, but I have no idea how to do this properly.
Another approach would be to store pointers to A inside the container, but since I don't have a lot of knowledge on this subject, what would be a proper way to insert pointers to objects in an STL container? For example this:
std::vector<A *> myVector;
for (unsigned int i = 0; i < n; ++i)
{
A *myObj = new A();
myVector.push_back(myObj);
}
might work, but I'm not sure how to handle it properly and how to dispose of it in a clean way. Should I rely solely on the destructor of the class which contains myVector as a member to dispose of it? What happens if this destructor throws an exception while deleting one of the contained objects?
Also, some people suggest using stuff like shared_ptr or auto_ptr or unique_ptr, but I am getting confused with so many options. Which one would be the best choice for my scenario?

You can use boost or std reference_wrapper.
#include <boost/ref.hpp>
#include <vector>
struct A {};
int main()
{
A a, b, c, d;
std::vector< boost::reference_wrapper<A> > v;
v.push_back(boost::ref(a)); v.push_back(boost::ref(b));
v.push_back(boost::ref(c)); v.push_back(boost::ref(d));
return 0;
}
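Since C++11 the same idea can be written with std::reference_wrapper and std::ref from <functional>. The following is only a minimal sketch, assuming a C++11 compiler; the id member of A and the function name refs_alias_originals are made up for illustration.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for the asker's wrapper class.
struct A {
    int id;
};

// Stores references to a and b in a vector and modifies a through it.
// Returns true if no copy was made, i.e. the first element aliases a.
bool refs_alias_originals(A& a, A& b)
{
    std::vector<std::reference_wrapper<A>> v;
    v.push_back(std::ref(a));
    v.push_back(std::ref(b));

    v[0].get().id = 42;   // writes through to the original a
    return &v[0].get() == &a && a.id == 42;
}
```

The vector holds no A objects of its own, so neither the copy constructor nor the destructor of A runs on insertion.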
When using reference_wrapper, you need to be aware of object lifetimes to avoid dangling references.
int main()
{
std::vector< boost::reference_wrapper<A> > v;
{
A a, b, c, d;
v.push_back(boost::ref(a)); v.push_back(boost::ref(b));
v.push_back(boost::ref(c)); v.push_back(boost::ref(d));
// a, b, c, d get destroyed by the end of the scope
}
// now you have a vector full of dangling references, which is a very bad situation
return 0;
}
If you need to handle such situations you need a smart pointer.
Smart pointers are also an option, but it is crucial to know which one to use. If your data is actually shared, use shared_ptr; if the container owns the data, use unique_ptr.
Anyway, I don't see what the wrapper part of A would change. If it contains pointers internally and obeys the rule of three, nothing can go wrong. The destructor will take care of cleaning up. This is the typical way to handle resources in C++: acquire them when your object is initialized, delete them when the lifetime of your object ends.
If you purely want to avoid the overhead of construction and deletion, you might want to use vector::emplace_back.

In C++11, you can construct container elements in place using emplace functions, avoiding the costs and hassle of managing a container of pointers to allocated objects:
std::vector<A> myVector;
for (unsigned int i = 0; i < n; ++i)
{
myVector.emplace_back();
}
If the objects' constructor takes arguments, then pass them to the emplace function, which will forward them.
However, objects can only be stored in a vector if they are either copyable or movable, since they have to be moved when the vector's storage is reallocated. You might consider making your objects movable, transferring ownership of the managed resources, or using a container like deque or list that doesn't move objects as it grows.
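As a rough illustration of making the objects movable, here is a sketch of a wrapper that transfers its handle on move and forbids copying. It assumes C++11; Handle, live_handles and make_wrappers are hypothetical names, and new/delete stand in for whatever create/destroy calls the real C library provides.

```cpp
#include <vector>

// Hypothetical stand-in for a resource owned through a C library handle.
struct Handle {
    int value;
};

static int live_handles = 0;   // counts handles currently allocated

class A {
public:
    explicit A(int value) : h_(new Handle{value}) { ++live_handles; }
    ~A() { release(); }

    // Moving transfers the handle and leaves the source empty.
    A(A&& other) noexcept : h_(other.h_) { other.h_ = nullptr; }
    A& operator=(A&& other) noexcept
    {
        if (this != &other) {
            release();
            h_ = other.h_;
            other.h_ = nullptr;
        }
        return *this;
    }

    // Copying is disabled so the handle can never be freed twice.
    A(const A&) = delete;
    A& operator=(const A&) = delete;

    int value() const { return h_->value; }

private:
    void release()
    {
        if (h_) {
            delete h_;
            --live_handles;
            h_ = nullptr;
        }
    }
    Handle* h_;
};

// Builds n wrappers in place; reallocation moves them instead of copying.
std::vector<A> make_wrappers(int n)
{
    std::vector<A> v;
    for (int i = 0; i < n; ++i) v.emplace_back(i);
    return v;
}
```

Because the move operations are noexcept, the vector moves elements when it grows, and each handle is released exactly once.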
UPDATE: Since this won't work on your compiler, the best option is probably std::unique_ptr - that has no overhead compared to a normal pointer, will automatically delete the objects when erased from the vector, and allows you to move ownership out of the vector if you want.
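A possible shape for the unique_ptr approach is sketched below, assuming a compiler with std::unique_ptr. The counter alive and the helpers make_objects and take_last are invented for the example; A here is a placeholder that can be neither copied nor moved.

```cpp
#include <memory>
#include <utility>
#include <vector>

static int alive = 0;   // number of A objects currently alive

// Hypothetical wrapper that is neither copyable nor movable.
struct A {
    A() { ++alive; }
    ~A() { --alive; }
    A(const A&) = delete;
    A& operator=(const A&) = delete;
};

// Fills a vector with n heap-allocated A objects owned by the vector.
std::vector<std::unique_ptr<A>> make_objects(unsigned n)
{
    std::vector<std::unique_ptr<A>> v;
    for (unsigned i = 0; i < n; ++i)
        v.push_back(std::unique_ptr<A>(new A()));   // std::make_unique needs C++14
    return v;
}

// Moves ownership of the last element out of the vector.
std::unique_ptr<A> take_last(std::vector<std::unique_ptr<A>>& v)
{
    std::unique_ptr<A> p = std::move(v.back());
    v.pop_back();
    return p;
}
```

Only the pointers are moved around inside the vector; the A objects stay where they were allocated and are deleted when their owning unique_ptr is destroyed.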
If that's not available, then std::shared_ptr (or std::tr1::shared_ptr or boost::shared_ptr, if that's not available) will also give you automatic deletion, for a (probably small) cost in efficiency.
Whatever you do, don't try to store std::auto_ptr in a standard container. Its destructive copying behaviour makes it easy to accidentally delete the objects when you don't expect it.
If none of these are available, then use a pointer as in your example, and make sure you remember to delete the objects once you've finished with them.

Related

c++ return structures and vectors optimally

I am reading a lot of different things on C++ optimization and I am getting quite mixed up. I would appreciate some help. Basically, I want to clear up what needs to be a pointer or not when I am passing vectors and structures as parameters or returning vectors and structures.
Say I have a Structure that contains 2 elements: an int and then a vector of integers. I will be creating this structure locally in a function, and then returning it. This function will be called multiple times and generate a new structure every time. I would like to keep the last structure created in a class member (lastStruct_ for example). So before returning the struct I could update lastStruct_ in some way.
Now, what would be the best way to do this, knowing that the vector in the structure can be quite large (so I need to avoid copies)? Does the vector in the struct need to be a pointer? If I want to share lastStruct_ with other classes by creating a get_lastStruct() method, should I return a reference to lastStruct_, a pointer, or not care about that? Should lastStruct_ be a shared pointer?
This is quite confusing to me because apparently C++ knows how to avoid copying, but I also see a lot of people recommending the use of pointers while others say a pointer to a vector makes no sense at all.
struct MyStruct {
std::vector<int> pixels;
int foo;
};
class MyClass {
MyStruct lastStruct_;
public:
MyStruct create_struct();
MyStruct getLastStruct();
};
MyStruct MyClass::create_struct()
{
MyStruct s = {std::vector<int>(100, 1), 1234};
lastStruct_ = s;
return s;
}
MyStruct MyClass::getLastStruct()
{
return lastStruct_;
}
If the only copy you're trying to remove is the one that happens when you return it from your factory function, I'd say containing the vector directly will always be faster.
Why? Two things. Return value optimisation (RVO/NRVO) will remove any need for temporaries when returning. This is enough for almost all cases.
When return value optimisation doesn't apply, move semantics will. Returning a named local variable (e.g. return my_struct;) performs an implicit move when NRVO doesn't apply.
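One way to convince yourself of this is to compare the address of the vector's buffer inside the function with the address the caller sees. The sketch below assumes C++11; the out-parameter buffer_inside exists only for the check and is not part of the original code.

```cpp
#include <vector>

struct MyStruct {
    std::vector<int> pixels;
    int foo;
};

// Builds the struct locally and returns it by value. The address of the
// pixel buffer inside the function is reported through buffer_inside so the
// caller can check that no deep copy took place.
MyStruct create_struct(const int** buffer_inside)
{
    MyStruct s;
    s.pixels.assign(100000, 1);
    s.foo = 1234;
    *buffer_inside = s.pixels.data();
    return s;   // NRVO, or an implicit move if NRVO is not applied
}
```

If the caller's pixels.data() equals the reported address, the buffer was handed over rather than duplicated.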
So why is it always faster than a shared pointer? Because when copying the shared pointer, you must dereference the control block to increase the owner count. And since it's an atomic operation, the incrementation is not free.
Also, using a shared pointer brings shared ownership and non-locality. If you were to use a shared pointer, use a pointer to const data to bring back value semantics.
Now that you added the code, it's much clearer what you're trying to do.
There's no way around the copy here. If you measure a performance degradation, then holding a std::shared_ptr<const std::vector<int>> might be the solution, since you keep value semantics but avoid copying the vector.
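A sketch of what that could look like, reusing the names from the question and assuming C++11. Returning a const reference from getLastStruct is an additional choice made here, not something the answer prescribes.

```cpp
#include <memory>
#include <vector>

struct MyStruct {
    std::shared_ptr<const std::vector<int>> pixels;   // immutable, shared buffer
    int foo;
};

class MyClass {
public:
    MyStruct create_struct()
    {
        MyStruct s{std::make_shared<std::vector<int>>(100, 1), 1234};
        lastStruct_ = s;   // copies the pointer and the int, not the vector
        return s;
    }
    const MyStruct& getLastStruct() const { return lastStruct_; }

private:
    MyStruct lastStruct_;
};
```

Both the returned struct and lastStruct_ point at the same vector, and because it is const neither holder can change it behind the other's back.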

C++ - Proper way of using std::vector & related memory management

Hi, I would like to ask a question that puzzles me.
I've a class like this:
class A {
private:
std::vector<Object*>* my_array_;
...
public:
std::vector<Object*>& my_array(); // getter
void my_array(const std::vector<Object*>& other_array); // setter
};
I wanted to ask you, based on your experience, what is the correct way of implementing the setter and getter in a (possible) SAFE manner.
The first solution came to my mind is the following.
First, when I do implement the setter, I should:
A) check the input is not referring to the data structure I already hold;
B) release the memory of ALL objects pointed by my_array_
C) copy each object pointed by other_array and add its copy to my_array_
D) finally end the function.
The getter may produce a copy of the inner array, just in case.
The questions are many:
- is this strategy overkill?
- does it really avoid problems?
- does anybody really use it or are there better approaches?
I've tried to look for the answer to this question, but found nothing so particularly focused on this problem.
Using smart pointers is a very good answer; thank you both. It seems I can't mark more than one answer as useful, so I apologize in advance. :-)
However, your answers have raised a new doubt.
If I use a vector of unique_ptr to objects, I will have to define a deep-copy constructor. Now that we are using smart pointers, is there a better way than iterating over the vector and copying each element?
I'd normally recommend not using a pointer to a vector as a member, but from your question it seems like it's shared between multiple instances.
That said, I'd go with:
class A {
private:
std::shared_ptr<std::vector<std::unique_ptr<Object> > > my_array_;
public:
std::shared_ptr<std::vector<std::unique_ptr<Object> > > my_array(); // getter
void my_array(std::shared_ptr<std::vector<std::unique_ptr<Object> > > other_array); // setter
};
No checks necessary, no memory management issues.
If the inner Objects are also shared, use a std::shared_ptr instead of the std::unique_ptr.
I think you are overcomplicating things having a pointer to std::vector as data member; remember that C++ is not Java (C++ is more "value" based than "reference" based).
Unless there is a strong reason to use a pointer to a std::vector as data member, I'd just use a simple std::vector stored "by value".
Now, regarding the Object* pointers in the vector, you should ask yourself: are those observing pointers or are those owning pointers?
If the vector just observes the Objects (and they are owned by someone else, like an object pool allocator or something), you can use raw pointers (i.e. simple Object*).
But if the vector has some ownership semantics on the Objects, you should use shared_ptr or unique_ptr smart pointers. If the vector is the only owner of Object instances, use unique_ptr; else, use shared_ptr (which uses a reference counting mechanism to manage object lifetimes).
class A
{
public:
// A vector which owns the pointed Objects
typedef std::vector<std::shared_ptr<Object>> ObjectArray;
// Getter
const ObjectArray& MyArray() const
{
return m_myArray;
}
// Setter
// (new C++11 move semantics pattern: pass by value and move from the value)
void MyArray(ObjectArray otherArray)
{
m_myArray = std::move(otherArray);
}
private:
ObjectArray m_myArray;
};
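For the deep-copy follow-up in the question, one possible sketch of a copy constructor over a vector of unique_ptr is shown below. It assumes C++11 and that Object is copy-constructible; the Object definition and the add and at helpers are hypothetical. A polymorphic Object would need something like a virtual clone function instead of the plain copy used here.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type; assumed to be copy-constructible.
struct Object {
    int value;
};

class A {
public:
    A() = default;

    // Deep copy: every Object is duplicated, so the two vectors share nothing.
    A(const A& other)
    {
        objects_.reserve(other.objects_.size());
        for (const auto& p : other.objects_)
            objects_.push_back(std::unique_ptr<Object>(new Object(*p)));
    }

    // Copy-and-swap style assignment built on the copy constructor.
    A& operator=(A other)
    {
        objects_.swap(other.objects_);
        return *this;
    }

    void add(int value)
    {
        objects_.push_back(std::unique_ptr<Object>(new Object{value}));
    }

    Object& at(std::size_t i) { return *objects_[i]; }

private:
    std::vector<std::unique_ptr<Object>> objects_;
};
```

The loop is still there, but it is confined to one place, and the smart pointers take care of cleanup if a copy throws halfway through.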

Is it wrong to dereference a pointer to get a reference?

I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value. And I feel dirty converting back to a reference, it just seems wrong.
Is it?
To clarify...
MyType *pObj = ...
MyType &obj = *pObj;
Isn't this 'dirty', since you can (even if only in theory since you'd check it first) dereference a NULL pointer?
EDIT: Oh, and you don't know if the objects were dynamically created or not.
Ensure that the pointer is not NULL before you try to convert the pointer to a reference, and that the object will remain in scope as long as your reference does (or remain allocated, in reference to the heap), and you'll be okay, and morally clean :)
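If the null check is needed in many places, it can be wrapped in a small helper. This is only a sketch; checked_deref is a made-up name, and throwing std::invalid_argument is just one of several reasonable reactions to a null pointer.

```cpp
#include <stdexcept>

// Hypothetical helper: converts a pointer to a reference, refusing null.
template <typename T>
T& checked_deref(T* p)
{
    if (!p) throw std::invalid_argument("null pointer");
    return *p;
}
```

The helper cannot do anything about lifetime: the caller still has to make sure the object outlives the reference.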
Initialising a reference with a dereferenced pointer is absolutely fine, nothing wrong with it whatsoever. If p is a pointer, and if dereferencing it is valid (so it's not null, for instance), then *p is the object it points to. You can bind a reference to that object just like you bind a reference to any object. Obviously, you must make sure the reference doesn't outlive the object (like any reference).
So for example, suppose that I am passed a pointer to an array of objects. It could just as well be an iterator pair, or a vector of objects, or a map of objects, but I'll use an array for simplicity. Each object has a function, order, returning an integer. I am to call the bar function once on each object, in order of increasing order value:
void bar(Foo &f) {
// does something
}
bool by_order(Foo *lhs, Foo *rhs) {
return lhs->order() < rhs->order();
}
void call_bar_in_order(Foo *array, int count) {
std::vector<Foo*> vec(count); // vector of pointers
for (int i = 0; i < count; ++i) vec[i] = &(array[i]);
std::sort(vec.begin(), vec.end(), by_order);
for (int i = 0; i < count; ++i) bar(*vec[i]);
}
The reference that my example has initialized is a function parameter rather than a variable directly, but I could just have validly done:
for (int i = 0; i < count; ++i) {
Foo &f = *vec[i];
bar(f);
}
Obviously a vector<Foo> would be incorrect, since then I would be calling bar on a copy of each object in order, not on each object in order. bar takes a non-const reference, so quite aside from performance or anything else, that clearly would be wrong if bar modifies the input.
A vector of smart pointers, or a boost pointer vector, would also be wrong, since I don't own the objects in the array and certainly must not free them. Sorting the original array might also be disallowed, or for that matter impossible if it's a map rather than an array.
No. How else could you implement operator=? You have to dereference this in order to return a reference to yourself.
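A minimal illustration of that point, using a made-up Counter class:

```cpp
// Hypothetical value type used only to illustrate the point.
class Counter {
public:
    explicit Counter(int n = 0) : n_(n) {}

    Counter& operator=(const Counter& other)
    {
        n_ = other.n_;
        return *this;   // this is a pointer; dereferencing it yields the reference
    }

    int value() const { return n_; }

private:
    int n_;
};
```

Returning *this is what lets assignments chain, as in a = b = c.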
Note though that I'd still store the items in the STL container by value -- unless your object is huge, overhead of heap allocations is going to mean you're using more storage, and are less efficient, than you would be if you just stored the item by value.
My answer doesn't directly address your initial concern, but it appears you encounter this problem because you have an STL container that stores pointer types.
Boost provides the ptr_container library to address these types of situations. For instance, a ptr_vector internally stores pointers to types, but returns references through its interface. Note that this implies that the container owns the pointer to the instance and will manage its deletion.
Here is a quick example to demonstrate this notion.
#include <string>
#include <boost/ptr_container/ptr_vector.hpp>
void foo()
{
boost::ptr_vector<std::string> strings;
strings.push_back(new std::string("hello world!"));
strings.push_back(new std::string());
const std::string& helloWorld(strings[0]);
std::string& empty(strings[1]);
}
I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value.
Just to be clear: STL containers were designed to support certain semantics ("value semantics"), such as "items in the container can be copied around." Since references aren't rebindable, they don't support value semantics (i.e., try creating a std::vector<int&> or std::list<double&>). You are correct that you cannot put references in STL containers.
Generally, if you're using references instead of plain objects you're either using base classes and want to avoid slicing, or you're trying to avoid copying. And, yes, this means that if you want to store the items in an STL container, then you're going to need to use pointers to avoid slicing and/or copying.
And, yes, the following is legit (although in this case, not very useful):
#include <iostream>
#include <vector>
// note signature, inside this function, i is an int&
// normally I would pass a const reference, but you can't add
// a "const int*" to a "std::vector<int*>"
void add_to_vector(std::vector<int*>& v, int& i)
{
v.push_back(&i);
}
int main()
{
int x = 5;
std::vector<int*> pointers_to_ints;
// x is passed by reference
// NOTE: this line could have simply been "pointers_to_ints.push_back(&x)"
// I simply wanted to demonstrate (in the body of add_to_vector) that
// taking the address of a reference returns the address of the object the
// reference refers to.
add_to_vector(pointers_to_ints, x);
// get the pointer to x out of the container
int* pointer_to_x = pointers_to_ints[0];
// dereference the pointer and initialize a reference with it
int& ref_to_x = *pointer_to_x;
// use the reference to change the original value (in this case, to change x)
ref_to_x = 42;
// show that x changed
std::cout << x << '\n';
}
Oh, and you don't know if the objects were dynamically created or not.
That's not important. In the above sample, x is on the stack and we store a pointer to x in pointers_to_ints. Sure, pointers_to_ints uses a dynamically-allocated array internally (and delete[]s that array when the vector goes out of scope), but that array holds the pointers, not the pointed-to things. When pointers_to_ints falls out of scope, the internal int*[] is delete[]-ed, but the int*s are not deleted.
This, in fact, makes using pointers with STL containers hard, because the STL containers won't manage the lifetime of the pointed-to objects. You may want to look at Boost's pointer containers library. Otherwise, you'll either (1) want to use STL containers of smart pointers (like boost::shared_ptr, which is legal for STL containers) or (2) manage the lifetime of the pointed-to objects some other way. You may already be doing (2).
If you want the container to actually contain objects that are dynamically allocated, you shouldn't be using raw pointers. Use unique_ptr or whatever similar type is appropriate.
There's nothing wrong with it, but please be aware that on machine-code level a reference is usually the same as a pointer. So, usually the pointer isn't really dereferenced (no memory access) when assigned to a reference.
So in real life the reference can be 0 and the crash occurs when using the reference, which can happen much later than its assignment.
Of course what happens exactly heavily depends on compiler version and hardware platform as well as compiler options and the exact usage of the reference.
Officially the behaviour of dereferencing a 0-Pointer is undefined and thus anything can happen. This anything includes that it may crash immediately, but also that it may crash much later or never.
So always make sure that you never assign a 0-Pointer to a reference: bugs like this are very hard to find.

STL: Stores references or values?

I've always been a bit confused about how STL containers (vector, list, map...) store values. Do they store references to the values I pass in, or do they copy/copy construct +store the values themselves?
For example,
int i;
vector<int> vec;
vec.push_back(i);
// does &(vec[0]) == &i;
and
class abc;
abc inst;
vector<abc> vec;
vec.push_back(inst);
// does &(vec[0]) == &inst;
Thanks
STL Containers copy-construct and store values that you pass in. If you want to store objects in a container without copying them, I would suggest storing a pointer to the object in the container:
class abc;
abc inst;
vector<abc *> vec;
vec.push_back(&inst);
This is the most logical way to implement the container classes to prevent accidentally storing references to variables on defunct stack frames. Consider:
class Widget {
public:
void AddToVector(int i) {
v.push_back(i);
}
private:
vector<int> v;
};
Storing a reference to i would be dangerous as you would be referencing the memory location of a local variable after returning from the method in which it was defined.
That depends on your type. If it's a simple value type, and cheap to copy, then storing values is probably the answer. On the other hand, if it's a reference type, or expensive to copy, you'd better store a smart pointer (not auto_ptr, since its special copy semantics prevent it from being stored in a container. Go for a shared_ptr). With a plain pointer you're risking memory leakage and access to freed memory, while with references you're risking the latter. A smart pointer avoids both.
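The two address comparisons from the question can be checked directly. A small sketch, with an invented value member on abc and made-up function names:

```cpp
#include <vector>

struct abc {
    int value;
};

// Returns true if the vector stored its own copy of inst.
bool stores_copy(const abc& inst)
{
    std::vector<abc> vec;
    vec.push_back(inst);
    return &vec[0] != &inst && vec[0].value == inst.value;
}

// Returns true if a vector of pointers refers to the original object.
bool stores_original(abc& inst)
{
    std::vector<abc*> vec;
    vec.push_back(&inst);
    return vec[0] == &inst;
}
```

So &(vec[0]) == &inst is false for a vector<abc>, while a vector<abc*> holds the address of the original.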

preventing data from being freed when vector goes out of scope

Is there a way to transfer ownership of the data contained in a std::vector (pointed to by, say T*data) into another construct, preventing having "data" become a dangling pointer after the vector goes out of scope?
EDIT: I DON'T WANT TO COPY THE DATA (which would be an easy but inefficient solution).
Specifically, I'd like to have something like:
template<typename T>
T* transfer_ownership(vector<T>&v){
T*data=&v[0];
v.clear();
...//<--I'd like to make v's capacity 0 without freeing data
}
int main(){
double*data=NULL;
{
vector<double>v;
...//grow v dynamically
data=transfer_ownership<double>(v);
}
...//do something useful with data (user responsible for freeing it later)
// for example mxSetData(mxArray*A,double*data) from matlab's C interface
}
The only thing that comes to my mind to emulate this is:
{
vector<double>*v=new vector<double>();
//grow *v...
data=&(*v)[0];
}
and then data will later either be freed or (in my case) used as mxSetData(mxArray*A,double*data). However, this results in a small memory leak (the data structure handling v's capacity, size, etc., though not the data itself, of course).
Is it possible without leaking ?
A simple workaround would be swapping the vector with one you own:
vector<double> myown;
vector<double> someoneelses = foo();
std::swap( myown, someoneelses );
A tougher but maybe better approach is write your own allocator for the vector, and let it allocate out of a pool you maintain. No personal experience, but it's not too complicated.
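The swap workaround above can be wrapped in a function to show that the buffer really changes hands. This is a sketch under the assumption of a C++11 compiler (for the efficient return by value); take_contents is a hypothetical name.

```cpp
#include <utility>
#include <vector>

// Takes over the contents of source by swapping with an empty vector.
// After the call, source is empty and the returned vector owns the old buffer.
std::vector<double> take_contents(std::vector<double>& source)
{
    std::vector<double> mine;
    std::swap(mine, source);   // exchanges the internal buffers; no elements are copied
    return mine;
}
```

The data is still owned by a vector, so this keeps it alive past the original vector's scope but does not hand out a raw buffer that can be freed independently.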
The point of using a std::vector is not to have to worry about the data in it:
Keep your vector all along your application;
Pass it by const-ref to other functions (to avoid unnecessary copies);
And feed functions expecting a pointer-to-T with &v[0].
If you really don't want to keep your vector, you will have to copy your data -- you can't transfer ownership because std::vector guarantees it will destroy its content when going out-of-scope. In that case, use the std::copy() algorithm.
If your vector contains values, you can only copy them out element by element (which is what std::copy does). If you keep non-primitive objects in a vector and don't want to copy them (for use in another data structure), consider storing pointers.
Does something like this work for you?
int main()
{
double *data = 0;
{
vector<double> foo;
// insert some elements to foo
data = new double[foo.size()];
std::copy(foo.begin(), foo.end(), &data[0]);
}
// Pass data to Matlab function.
delete [] data;
return 0;
}
Since you don't want to copy data between containers, but want to transfer ownership of data between containers, I suggest using a container of smart pointers as follows.
void f()
{
std::vector<boost::shared_ptr<double> > doubles;
InitVector(doubles);
std::vector<boost::shared_ptr<double> > newDoubles(doubles);
}
Standard containers copy the elements you insert into them, so (short of swapping two containers of the same type) you can't hand their data over to another construct without copying it. If you want to minimize the overhead of copying expensive objects, then it is a good idea to use a reference-counted smart pointer to wrap your expensive data structure. boost::shared_ptr is suitable for this task since it is fairly cheap to make a copy of it.