Vector of objects containing references or pointers - c++

I store some objects in a vector. When I call a member function of such an object that uses a reference, the program gets terminated (no error). I wrote the following code to run some tests. It seems like after adding elements, the reference in the first entry fails. Why is that and what can I do to avoid this issue? It's exactly the same behaviour when I use pointers instead of references.
#include <iostream>
#include <vector>
using namespace std;
class A{
public:
A(int i) : var(i), ref(var) {}
int get_var() {return var;}
int get_ref() {return ref;}
private:
int var;
int& ref;
};
int main ()
{
vector<A> v;
for(unsigned int i=0;i<=2 ;i++){
v.emplace_back(i+5);
cout<<"entry "<<i<<":"<<endl;
cout<<" var="<<v.at(i).get_var()<<endl;
cout<<" ref="<<v.at(i).get_ref()<<endl;
}
cout<<endl;
for(unsigned int i=0;i<=2 ;i++){
cout<<"entry "<<i<<":"<<endl;
cout<<" var="<<v.at(i).get_var()<<endl;
cout<<" ref="<<v.at(i).get_ref()<<endl;
}
return 0;
}
The output is:
entry 0:
var=5
ref=5
entry 1:
var=6
ref=6
entry 2:
var=7
ref=7
entry 0:
var=5
ref=0 /////////////here it happens!
entry 1:
var=6
ref=6
entry 2:
var=7
ref=7
v has 3 entries

It's because your calls to emplace_back are causing the vector to grow. When the new size exceeds the current capacity, the vector allocates a larger block of memory and moves or copies every element into it. Your "ref" is still referencing the old memory location.
Whether or not a particular call reallocates is implementation dependent; standard library implementations are free to reserve extra memory for the vector so they don't have to reallocate every single time you add something to the back.
It's mentioned in the standard documentation for emplace_back:
Iterator validity
If a reallocation happens, all iterators, pointers
and references related to this container are invalidated. Otherwise,
only the end iterator is invalidated, and all other iterators,
pointers and references to elements are guaranteed to keep referring
to the same elements they were referring to before the call.
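A minimal sketch of how to observe this, assuming a hypothetical helper named count_reallocations (not part of the question's code). It counts how often capacity() changes during push_back, since a capacity change means the storage was reallocated and old pointers or references are no longer valid:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n ints and counts how often the storage is
// reallocated. `reserved` is the capacity requested up front (0 = none).
std::size_t count_reallocations(std::size_t n, std::size_t reserved) {
    std::vector<int> v;
    v.reserve(reserved);
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t before = v.capacity();
        v.push_back(static_cast<int>(i));
        if (v.capacity() != before) {
            ++reallocations;  // pointers/references taken earlier are now invalid
        }
    }
    return reallocations;
}
```

Calling count_reallocations(100, 100) should report no reallocation, because reserve guarantees room for 100 elements. Without a reservation the exact count depends on the implementation's growth policy.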
To avoid the problem you could either (as JAB suggested in the comments) create the reference on the fly instead of storing it as a member variable:
int& get_ref() {return var;}
... although I would much rather use a smart pointer instead of this sort of thing.
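Or, as RnMss suggested, implement the copy constructor so that ref refers to the new object's own var whenever the object is copied by vector (assigning with *this = other would not compile here, because a class with a reference member has no implicit copy assignment operator):
A(A const& other) : var(other.var), ref(var) {}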
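For completeness, here is a sketch of the whole class with that copy constructor. The ref_is_own_var member and the all_refs_valid_after_growth helper are hypothetical additions used only to check the result:

```cpp
#include <cassert>
#include <vector>

class A {
public:
    explicit A(int i) : var(i), ref(var) {}
    // Copy the value, but bind ref to this object's own var, not the source's.
    A(const A& other) : var(other.var), ref(var) {}
    // A reference member cannot be reseated, so assignment only copies the value.
    A& operator=(const A& other) { var = other.var; return *this; }
    int get_var() const { return var; }
    int get_ref() const { return ref; }
    // Hypothetical check: does ref still refer to this object's var?
    bool ref_is_own_var() const { return &ref == &var; }
private:
    int var;
    int& ref;
};

// Hypothetical helper: grows a vector (forcing reallocations), then checks
// every element's reference.
bool all_refs_valid_after_growth(int count) {
    std::vector<A> v;
    for (int i = 0; i < count; ++i) v.emplace_back(i + 5);
    for (const A& a : v) {
        if (!a.ref_is_own_var() || a.get_ref() != a.get_var()) return false;
    }
    return true;
}
```

Because a copy constructor is declared, the compiler does not generate a move constructor, so the vector uses this copy constructor when it relocates elements.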

Okay, so here is what's happening. It really helps to understand your objects in terms of memory location, and remember that vector is allowed to move objects around in memory.
v.emplace_back(5)
You create an A-object in the vector. This object now resides in a block of memory ranging from 0x1234 to 0x123C. Member variable var sits at 0x1234 and member variable ref sits at 0x1238. For this object, the value of var is 0x0005 and the value of ref is 0x1234.
While adding elements to the vector, the vector runs out of space during the second insert. So, it resizes and moves the current elements (which at this moment is just the first element) from location 0x1234 to location 0x2000. This means the member elements also moved, so var is now located at address 0x2000 and ref is now located at 0x2004. But their values were copied, so the value of var is still 0x0005 and the value of ref is still 0x1234.
ref is pointing at an invalid location (but var still contains the right value!). Trying to access the memory ref now points to is undefined behavior and generally bad.
Something like this would be a much more typical approach to providing reference access to a member attribute:
int & get_ref() {return var;}
Having references as member attributes isn't wrong in and of itself, but if you are storing a reference to an object, you have to make sure that that object doesn't move.

Related

Pointer gets modified after a push_back

Let us consider the following c++ code
#include <iostream>
#include <vector>
class A {
int x, y;
public:
A(int x, int y) : x(x), y(y){}
friend std::ostream & operator << (std::ostream & os, const A & a){
os << a.x << " " << a.y;
return os;
}
};
int main(){
std::vector<A> a;
std::vector<const A*> b;
for(int i = 0; i < 5; i++){
a.push_back(A(i, i + 1));
b.push_back(&a[i]);
}
while(!a.empty()){
a.pop_back();
}
for(auto x : b)
std::cout << *x << std::endl;
return 0;
}
Using a debugger I noticed that after the first insertion is done to a
the address of a[0] changes. Consequently, when I'm printing in the second
for loop I get an invalid reference to the first entry. Why does this happen?
Thanks for your help!
for(int i = 0; i < 5; i++){
a.push_back(A(i, i + 1)); //add a new item to a
b.push_back(&a[i]); // point at the new item in a
}
The immediate problem is Iterator invalidation. As a grows, it reallocates its storage for more capacity. This may leave the pointers in b pointing to memory that has been returned to the freestore (probably the heap). Accessing these pointers invokes Undefined Behaviour and anything could happen. There are a few solutions to this, such as reserving space ahead of time to eliminate reallocation or using a container with more forgiving invalidation rules, but whatever you do is rendered moot by the next problem.
while(!a.empty()){
a.pop_back(); // remove item from `a`
}
Since the items in b point to items in a and there are no items in a, all of the pointers in b now reference invalid objects and cannot be accessed without invoking Undefined Behaviour.
All of the items in a referenced by items in b must remain alive as long as the item in b exists or be removed from a and b.
In this trivial case that answer is simple, don't empty a, but that defeats the point of the example. There are many solutions to the general case (just use a, store copies rather than pointers in b, use std::shared_ptr and store shared_ptrs to As in both a and b) but to make useful suggestions we need to know how a and b are being consumed.
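As a rough sketch of the shared_ptr option, assuming a hypothetical Item struct in place of A and a helper named sum_after_clearing_owner: both vectors share ownership, so emptying a does not destroy the objects that b still refers to.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Item { int x; int y; };  // hypothetical stand-in for class A

// Fills both vectors, empties `a`, then reads every object through `b`.
int sum_after_clearing_owner(int count) {
    std::vector<std::shared_ptr<Item>> a;
    std::vector<std::shared_ptr<const Item>> b;
    for (int i = 0; i < count; ++i) {
        a.push_back(std::make_shared<Item>(Item{i, i + 1}));
        b.push_back(a.back());  // shares ownership with the entry in a
    }
    while (!a.empty()) {
        a.pop_back();           // objects stay alive: b still owns them
    }
    int sum = 0;
    for (const auto& p : b) sum += p->x + p->y;
    return sum;
}
```

Whether shared ownership is the right design still depends on how a and b are consumed; this only shows that the dangling pointers go away.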
std::vector is basically a dynamic array. Size of a dynamic array is not known at compile time and keeps changing at runtime. Therefore, whenever you fill elements into it, it has to keep growing. When it can't grow contiguously, the system has to look for a new contiguous block of memory that could hold that many elements. This answers your first question, as the base address of the vector changes.
Consequently, the addresses of all elements in the vector change. This is a sufficient reason to cause the error in your second question. Moreover, you empty the contents of the first vector, which the elements in your second vector point to. Obviously, this would cause an invalid dereference inside your second for loop.
When you add more elements to a std::vector than it has capacity, it will allocate new storage, move all of its elements to the new, larger, storage, and then finally free its old storage. When this happens, all pointers, references, and iterators to the elements in the vector's old storage become invalid.
To avoid having this happen you can use std::vector::reserve to pre-allocate enough storage for all of the elements you're going to add to the vector. I would advise against doing that though. It's brittle and very easy to screw something up and wander into undefined behavior. If you need to store elements of one vector in another you should prefer storing indices. Another option is to use an address-stable container like std::list instead of std::vector.
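A small sketch of the index approach, assuming a hypothetical Point struct and helper name. An index stays meaningful after a reallocation, whereas a pointer does not (indices can still go stale if elements are erased or reordered):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { int x; int y; };  // hypothetical stand-in for class A

// Stores indices into `a` instead of pointers, then reads through them
// after `a` has grown (and possibly reallocated) several times.
int sum_x_via_indices(int count) {
    std::vector<Point> a;
    std::vector<std::size_t> b;
    for (int i = 0; i < count; ++i) {
        a.push_back(Point{i, i + 1});
        b.push_back(a.size() - 1);  // remember the position, not the address
    }
    int sum = 0;
    for (std::size_t idx : b) sum += a.at(idx).x;
    return sum;
}
```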

Values changing after adding object pointer to vector

Let's say I have a struct "A" with four members there. The four members are named:
one
two
three
four
I also have a struct "B" with five members there. Four of the five members come from struct A.
Then I have a vector with struct B pointers. I add every created struct B to this vector.
This looks like this:
std::vector<B*> vec;
for (A& a : input.buffer())
{
B b =
{
a.one, a.two, a.three, a.four, random value
};
vec.push_back(&b);
}
function_which_needs_a_const_pointer_to_the_first_element_and_size_of_vector (vec.front (), vec.size ());
Now I have used a number of std::cout statements at different points in the code.
I print the following values:
a.one = 1304505
b.one (just before the push_back) = 1304505
vec[0]->one (after the push_back) = 24050434
So as you can see, I am troubled by values that change after adding to the vector, as a result of which the rest of the code can no longer function correctly.
Does anyone have any idea how I can solve this? I probably do something stupid.
I tried google now for two days, but nothing seemed to help.
Thanks to some of your comments I now know that there are dangling pointers. If I make it a vector of shared pointers instead of raw pointers, I get an "invalid conversion from shared pointer to const raw pointer" error.
So now, we know the issue. But what is the best way to fix it? Because I am not allowed to touch that const raw pointer in that function.
Before you all press the down vote button; None of you is still able to give me the correct solution.
The immediate cause of the problem is that your objects don't exist when you try to use them.
You should never store a pointer to a local variable (acquired with &) for use after that variable has gone out of scope.
The more fundamental source of the problem is that you have misunderstood the exercise.
The function you're not allowed to modify wants a pointer to the vector's first element, but vec.front() is not a pointer to the vector's first element - it is the vector's first element.
(This element happens to be a pointer, but it's not a pointer to the beginning of the vector).
You can get a pointer to vec's first element with &vec[0] or vec.data() or &vec.front().
This is what you should pass to the function, and your vector's type should be vector<B>.
That is,
std::vector<B> vec;
for (A& a: input.buffer())
{
B b ={ a.one, a.two, a.three, a.four, random value };
vec.push_back(b);
}
function_with_long_name(vec.data(), vec.size());
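A self-contained sketch of the same idea. The struct layouts, the fifth member (set to 0 here instead of a random value), and the function sum_of_one standing in for the function you may not modify are all assumptions made for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct A { int one, two, three, four; };
struct B { int one, two, three, four, five; };

// Hypothetical stand-in for the function that takes a const pointer to the
// first element plus the element count.
long sum_of_one(const B* first, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += first[i].one;
    return total;
}

long build_and_call(const std::vector<A>& input) {
    std::vector<B> vec;          // stores B objects, not pointers to locals
    vec.reserve(input.size());
    for (const A& a : input) {
        vec.push_back(B{a.one, a.two, a.three, a.four, 0});
    }
    // data() points at the first B of a contiguous array of vec.size() elements.
    return sum_of_one(vec.data(), vec.size());
}
```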
When you vec.push_back(&b);, you are pushing a pointer to an object that is about to cease to exist. There isn't a B to point to when you evaluate vec[0]->one.
You pass the address of a local variable that becomes unavailable when it goes out of scope; that's why your vector will contain pointers to memory with no valid data.
See comments in code.
std::vector<B> vec; // just copy them all
for (A& a : input.buffer())
{
B b =
{
a.one, a.two, a.three, a.four, random value
};
vec.push_back(b); // not storing a pointer anymore, just copy
}
function_which_needs_a_const_pointer_to_the_first_element_and_size_of_vector(&vec.front(), vec.size()); // pass the address of the vector's first element

Returning a reference of an object inside a vector

Suppose I have the following:
class Map
{
std::vector<Continent> continents;
public:
Map();
~Map();
Continent* getContinent(std::string name);
};
Continent* Map::getContinent(std::string name)
{
Continent * c = nullptr;
for (int i = 0; i < continents.size(); i++)
{
if (continents[i].getName() == name)
{
c = &continents[i];
break;
}
}
return c;
}
You can see here that there are continent objects that live inside the vector called continents. Would this be a correct way of getting the object's reference, or is there a better approach to this? Is there an underlying issue with vector which would cause this to misbehave?
It is OK to return a pointer or a reference to an object inside std::vector under one condition: the content of the vector must not change after you take the pointer or a reference.
This is easy to do when you initialize a vector at start-up or in the constructor, and never change it again. In situations where the vector is more dynamic than that, returning by value, rather than by pointer, is a more robust approach.
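One possible shape for the by-value variant, assuming C++17 for std::optional and a hypothetical minimal Continent with a public name and a made-up bonus field. The names addContinent and findContinent are illustrative, not from the question:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Continent {  // hypothetical minimal Continent
    std::string name;
    int bonus;
};

class Map {
    std::vector<Continent> continents;
public:
    void addContinent(Continent c) { continents.push_back(std::move(c)); }
    // Returns a copy, so the result stays valid even if the vector
    // reallocates later; std::nullopt signals "not found".
    std::optional<Continent> findContinent(const std::string& name) const {
        for (const Continent& c : continents) {
            if (c.name == name) return c;
        }
        return std::nullopt;
    }
};
```

The trade-off is that changes made to the returned copy do not affect the Continent stored in the Map.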
I would advise you against doing something like the above. std::vector manages its own memory, which includes reallocating and moving the array when it runs out of capacity, and that will leave you with a dangling pointer. On the other hand, if the map contains a const vector, which means it is guaranteed not to be altered, what you are doing would work.
Thanks
Sudharshan
The design is flawed, as others have pointed out.
However, if you don't mind using more memory, giving up contiguous storage, and losing random-access iterators, then a drop-in replacement would be to use std::list instead of std::vector.
The std::list does not invalidate pointers or references to the internal data when resized. The only time when a pointer / reference is invalidated is if you are removing the item being pointed to / referred to.
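A quick sketch that checks this property, using a hypothetical helper name. It takes the address of the first element, appends more elements, and confirms the address and value are unchanged:

```cpp
#include <cassert>
#include <list>

// Returns true if the first element's address and value are unchanged
// after `extra` further insertions into the list.
bool list_keeps_first_address(int extra) {
    std::list<int> values;
    values.push_back(0);
    const int* first = &values.front();
    for (int i = 1; i <= extra; ++i) values.push_back(i);
    return first == &values.front() && *first == 0;
}
```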

Does set::insert save a copy or a pointer? C++

Does the function set::insert save a pointer to the element or a copy of it? In other words, can I write the following code, or do I have to make sure that the pointers are not deleted?
int *a;
*a=new int(1);
set<int> _set;
_set.insert (*a);
delete a;
*a=new int(2);
_set.insert (*a);
delete a;
I gave the example with int, but my real program uses classes that I created.
All STL containers store a copy of the inserted data. Look here in section "Description" in the third paragraph: A Container (and std::set models a Container) owns its elements. And for more details look at the following footnote [1]. In particular for the std::set look here under the section "Type requirements". The Key must be Assignable.
Apart from that you can test this easily:
#include <iostream>
#include <set>
#include <vector>
struct tester {
tester(int value) : value(value) { }
tester(const tester& t) : value(t.value) {
std::cout << "Copy construction!" << std::endl;
}
int value;
};
// In order to use tester with a set:
bool operator < (const tester& t, const tester& t2) {
return t.value < t2.value;
}
int main() {
tester t(2);
std::vector<tester> v;
v.push_back(t);
std::set<tester> s;
s.insert(t);
}
You'll always see Copy construction!.
If you really want to store something like a reference to an object you either can store pointers to these objects:
tester* t = new tester(10);
{
std::set<tester*> s;
s.insert(t);
// do something awesome with s
} // here s goes out of scope, and so do the contained objects,
// i.e. the *pointers* to tester objects. The referenced objects
// still exist and thus we must delete them at the end of the day:
delete t;
But in this case you have to take care of deleting the objects correctly and this is sometimes very difficult. For example exceptions can change the path of execution dramatically and you never reach the right delete.
Or you can use smart pointers like boost::shared_ptr:
{
std::set< boost::shared_ptr<tester> > s;
s.insert(boost::shared_ptr<tester>(new tester(20)));
// do something awesome with your set
} // here s goes out of scope and destructs all its contents,
// i.e. the shared_ptr<tester> objects. The referenced objects are
// deleted only if no other shared_ptr still refers to them.
Now the smart pointers take care of this for you and delete their referenced objects at the right time. If you copied one of the inserted smart pointers and transferred it somewhere else, the commonly referenced object won't be deleted until the last smart pointer referencing this object goes out of scope.
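The same behaviour can be sketched with std::shared_ptr, the standard library counterpart available since C++11 (swapped in here for boost::shared_ptr). The helper name value_survives_set is hypothetical:

```cpp
#include <cassert>
#include <memory>
#include <set>

struct tester {
    explicit tester(int value) : value(value) {}
    int value;
};

// Reads the object through a surviving copy of the smart pointer after the
// set that also held it has been destroyed.
int value_survives_set(int v) {
    std::shared_ptr<tester> kept;
    {
        std::set<std::shared_ptr<tester>> s;
        auto p = std::make_shared<tester>(v);
        s.insert(p);   // the set stores its own copy of the shared_ptr
        kept = p;
        assert(p.use_count() == 3);  // p, the copy inside s, and kept
    }   // s and p are destroyed here; kept still owns the object
    assert(kept.use_count() == 1);
    return kept->value;
}
```

The tester object is deleted only when kept, the last owner, is destroyed or reset.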
Oh and by the way: Never use std::auto_ptrs as elements in the standard containers. Their strange copy semantics aren't compatible with the way the containers are storing and managing their data and how the standard algorithms are manipulating them. I'm sure there are many questions here on StackOverflow concerning this precarious issue.
std::set will copy the element you insert.
You are saving pointers into the set.
The object pointed at by the pointer is not copied.
Thus after calling delete the pointer in the set is invalid.
Note: You probably want to just save integers.
int a(1);
set<int> s;
s.insert(a); // pushes 1 into the set
s.insert(2); // pushes 2 into the set.
Couple of other notes:
Be careful with underscores at the beginning of identifier names.
Use smart pointers to hold pointers.
Ptr:
std::auto_ptr<int> a(new int(1));
set<int*> s;
s.insert(a.release());
// Note. Set now holds a RAW pointer that you should delete before the set goes away.
// Or convert into a boost::ptr_set<int> so it takes ownership of the pointer.
int *a;
*a=new int(1);
This code is wrong because a is uninitialized, so *a tries to use whatever garbage address it happens to hold.
And every STL container copies elements unless you use move semantics, with the insert() and push_back() overloads taking rvalue references in C++11 (formerly C++0x).

Is it wrong to dereference a pointer to get a reference?

I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value. And I feel dirty converting back to a reference, it just seems wrong.
Is it?
To clarify...
MyType *pObj = ...
MyType &obj = *pObj;
Isn't this 'dirty', since you can (even if only in theory since you'd check it first) dereference a NULL pointer?
EDIT: Oh, and you don't know if the objects were dynamically created or not.
Ensure that the pointer is not NULL before you try to convert the pointer to a reference, and that the object will remain in scope as long as your reference does (or remain allocated, in reference to the heap), and you'll be okay, and morally clean :)
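A tiny sketch of that check, assuming a hypothetical MyType with a single value member and a helper named set_value:

```cpp
#include <cassert>

struct MyType { int value; };  // hypothetical stand-in for the asker's type

// Binds a reference only after confirming the pointer is not null.
bool set_value(MyType* pObj, int v) {
    if (pObj == nullptr) return false;  // nothing to refer to
    MyType& obj = *pObj;                // safe: pObj points to a live object
    obj.value = v;
    return true;
}
```

The null check does not cover a dangling pointer; the caller still has to guarantee that the object outlives the reference.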
Initialising a reference with a dereferenced pointer is absolutely fine, nothing wrong with it whatsoever. If p is a pointer, and if dereferencing it is valid (so it's not null, for instance), then *p is the object it points to. You can bind a reference to that object just like you bind a reference to any object. Obviously, you must make sure the reference doesn't outlive the object (like any reference).
So for example, suppose that I am passed a pointer to an array of objects. It could just as well be an iterator pair, or a vector of objects, or a map of objects, but I'll use an array for simplicity. Each object has a function, order, returning an integer. I am to call the bar function once on each object, in order of increasing order value:
void bar(Foo &f) {
// does something
}
bool by_order(Foo *lhs, Foo *rhs) {
return lhs->order() < rhs->order();
}
void call_bar_in_order(Foo *array, int count) {
std::vector<Foo*> vec(count); // vector of pointers
for (int i = 0; i < count; ++i) vec[i] = &(array[i]);
std::sort(vec.begin(), vec.end(), by_order);
for (int i = 0; i < count; ++i) bar(*vec[i]);
}
The reference that my example has initialized is a function parameter rather than a variable directly, but I could just as validly have done:
for (int i = 0; i < count; ++i) {
Foo &f = *vec[i];
bar(f);
}
Obviously a vector<Foo> would be incorrect, since then I would be calling bar on a copy of each object in order, not on each object in order. bar takes a non-const reference, so quite aside from performance or anything else, that clearly would be wrong if bar modifies the input.
A vector of smart pointers, or a boost pointer vector, would also be wrong, since I don't own the objects in the array and certainly must not free them. Sorting the original array might also be disallowed, or for that matter impossible if it's a map rather than an array.
No. How else could you implement operator=? You have to dereference this in order to return a reference to yourself.
Note though that I'd still store the items in the STL container by value -- unless your object is huge, overhead of heap allocations is going to mean you're using more storage, and are less efficient, than you would be if you just stored the item by value.
My answer doesn't directly address your initial concern, but it appears you encounter this problem because you have an STL container that stores pointer types.
Boost provides the ptr_container library to address these types of situations. For instance, a ptr_vector internally stores pointers to types, but returns references through its interface. Note that this implies that the container owns the pointer to the instance and will manage its deletion.
Here is a quick example to demonstrate this notion.
#include <string>
#include <boost/ptr_container/ptr_vector.hpp>
void foo()
{
boost::ptr_vector<std::string> strings;
strings.push_back(new std::string("hello world!"));
strings.push_back(new std::string());
const std::string& helloWorld(strings[0]);
std::string& empty(strings[1]);
}
I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value.
Just to be clear: STL containers were designed to support certain semantics ("value semantics"), such as "items in the container can be copied around." Since references aren't rebindable, they don't support value semantics (i.e., try creating a std::vector<int&> or std::list<double&>). You are correct that you cannot put references in STL containers.
Generally, if you're using references instead of plain objects you're either using base classes and want to avoid slicing, or you're trying to avoid copying. And, yes, this means that if you want to store the items in an STL container, then you're going to need to use pointers to avoid slicing and/or copying.
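If C++11 is available, std::reference_wrapper (from <functional>) is one way to get reference-like elements into a container without writing the pointer syntax yourself. It is copyable and rebindable, which is what the containers need. A minimal sketch, with double_all as a hypothetical example function:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Doubles each referenced int in place. The container does not own the ints,
// so they must outlive the vector's use of them.
void double_all(std::vector<std::reference_wrapper<int>>& refs) {
    for (int& value : refs) {  // reference_wrapper<int> converts to int&
        value *= 2;
    }
}
```

Like a raw pointer, a reference_wrapper does nothing to keep the referenced object alive, and it cannot express "no object" the way a null pointer can.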
And, yes, the following is legit (although in this case, not very useful):
#include <iostream>
#include <vector>
// note signature, inside this function, i is an int&
// normally I would pass a const reference, but you can't add
// a "const int*" to a "std::vector<int*>"
void add_to_vector(std::vector<int*>& v, int& i)
{
v.push_back(&i);
}
int main()
{
int x = 5;
std::vector<int*> pointers_to_ints;
// x is passed by reference
// NOTE: this line could have simply been "pointers_to_ints.push_back(&x)"
// I simply wanted to demonstrate (in the body of add_to_vector) that
// taking the address of a reference returns the address of the object the
// reference refers to.
add_to_vector(pointers_to_ints, x);
// get the pointer to x out of the container
int* pointer_to_x = pointers_to_ints[0];
// dereference the pointer and initialize a reference with it
int& ref_to_x = *pointer_to_x;
// use the reference to change the original value (in this case, to change x)
ref_to_x = 42;
// show that x changed
std::cout << x << '\n';
}
Oh, and you don't know if the objects were dynamically created or not.
That's not important. In the above sample, x is on the stack and we store a pointer to x in pointers_to_ints. Sure, pointers_to_ints uses a dynamically-allocated array internally (and delete[]s that array when the vector goes out of scope), but that array holds the pointers, not the pointed-to things. When pointers_to_ints falls out of scope, the internal int*[] is delete[]-ed, but the int*s are not deleted.
This, in fact, makes using pointers with STL containers hard, because the STL containers won't manage the lifetime of the pointed-to objects. You may want to look at Boost's pointer containers library. Otherwise, you'll either (1) want to use STL containers of smart pointers (like boost::shared_ptr, which is legal for STL containers) or (2) manage the lifetime of the pointed-to objects some other way. You may already be doing (2).
If you want the container to actually contain objects that are dynamically allocated, you shouldn't be using raw pointers. Use unique_ptr or whatever similar type is appropriate.
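A short sketch of what that can look like, assuming C++14 for std::make_unique and a hypothetical helper name. The vector owns the objects through unique_ptr, and because each object lives in its own allocation, its address does not change when the vector itself reallocates:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Returns true if a non-owning pointer to the first object still refers to
// it after the vector has grown by `extra` elements.
bool pointee_address_stable(int extra) {
    std::vector<std::unique_ptr<std::string>> owned;
    owned.push_back(std::make_unique<std::string>("first"));
    const std::string* observer = owned.front().get();  // non-owning
    for (int i = 0; i < extra; ++i) {
        owned.push_back(std::make_unique<std::string>("other"));
    }
    return observer == owned.front().get() && *observer == "first";
}
```

The observer pointer becomes invalid as soon as the owning unique_ptr is erased from the vector or the vector is destroyed.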
There's nothing wrong with it, but please be aware that on machine-code level a reference is usually the same as a pointer. So, usually the pointer isn't really dereferenced (no memory access) when assigned to a reference.
So in real life the reference can be 0 and the crash occurs when using the reference, which can happen much later than its assignment.
Of course what happens exactly heavily depends on compiler version and hardware platform as well as compiler options and the exact usage of the reference.
Officially the behaviour of dereferencing a 0-Pointer is undefined and thus anything can happen. This anything includes that it may crash immediately, but also that it may crash much later or never.
So always make sure that you never assign a 0-Pointer to a reference; bugs like this are very hard to find.