I have two structs:
struct A
{};
struct B
{
A& a;
};
and I initialize them with A a0; B b{a0};. Now I want to move object a0 into a new one:
A a1{std::move(a0)};
I don't know how to inform b that the value of its member a should be changed. A plain reference cannot be rebound, so I was thinking of changing it to std::reference_wrapper or something similar. But there is still a problem: how do I inform object b (specifically its member, which is now some kind of smart reference) that the value of field a should be changed?
I was thinking about the observer pattern, where A would be the subject and the member of B referring to A would be the observer. After the subject is moved, it would send all observers its new address. That may be a solution, but maybe there is something easier?
As an appendix, here is why I use something like this. In my project, I have another structure (simplified version below):
struct C
{
A a;
B b{a};
};
A is a wrapper around memory (a pointer to the memory is passed to A's constructor, omitted here to keep things simple); it knows the size of the allocated memory and how to read and write it. B is an object which knows how to insert values into that memory (e.g. how to insert an int32_t in LE or BE, or how to serialize a compound object). These two structures are in a closed library (a company library, so I can open an issue to change it, but it is used in other projects, so I must be sure what kind of changes I need and whether they are really necessary). C gives me an interface to put only specific objects into memory. I get only raw memory to construct it, so I must handle creating objects A and B inside it, and outside C nobody needs to know what dependencies I use to write into this memory.
Structure C should be movable, because it will be returned using std::optional - its constructor is private and there exists static method create which builds object of C depending on status of some other operations needed to construct it (here described simply using bool argument):
static std::optional<C> create(bool v)
{
return v ? std::optional<C>{C()} : std::optional<C>{};
}
For testing, I also wrote a constructor for C:
C()
{
std::cout << "C::C()" << std::endl << &a << std::endl << &b.a << std::endl;
}
and a function which tries to build this object:
auto c = C::create(true);
std::cout << "main" << std::endl;
std::cout << &(c.value().a) << std::endl;
std::cout << &(c.value().b.a) << std::endl;
Here is the result of running this test:
C::C()
0x7ffe8498e560
0x7ffe8498e560
main
0x7ffe8498e570
0x7ffe8498e560
which shows that member b of C now holds the wrong reference.
I'm open to criticism, because I know it could be bad design and maybe I should do it in some other way.
Your class C should be:
struct C
{
C() = default;
// Re-bind b to this object's own a instead of copying the old reference.
C(const C& c) : a(c.a), b{this->a} {}
C(C&& c) : a(std::move(c.a)), b{this->a} {}
A a;
B b{a};
};
Then your reference is still valid.
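As a rough, self-contained sketch of that fix: A and B below are empty stand-ins for the library types from the question, and reference_follows_move is a hypothetical helper added only for this demonstration. It checks that b.a refers to the sibling member a after a C has been moved into a std::optional.

```cpp
#include <cassert>
#include <optional>
#include <utility>

struct A {};                 // stand-in for the memory wrapper

struct B {
    A& a;                    // non-owning reference, as in the question
};

struct C {
    C() = default;
    // Both constructors bind b to *this* object's a, never to the source's.
    C(const C& other) : a(other.a), b{a} {}
    C(C&& other) noexcept : a(std::move(other.a)), b{a} {}

    A a;                     // declared first, so it is initialized before b
    B b{a};
};

// Hypothetical helper: true if b.a still refers to the sibling member a
// after the C object has been moved into the optional.
bool reference_follows_move() {
    std::optional<C> c{C()};
    return &c.value().b.a == &c.value().a;
}
```

Calling reference_follows_move() from main should report true; with the compiler-generated move constructor the two addresses would differ, as in the question's output.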
I'm still fairly new to C++ and am confused about references and move semantics. For a compiler I'm writing that generates C++17 code, I need to be able to have structs with fields that are other structs. Since the struct definitions will be generated from the user's code in the other language, they could potentially be very large, so I'm storing the inner struct as a reference. This is also necessary to deal with incomplete types that are declared at the beginning but defined later, which may happen in the generated code. (I avoided using pointers because adding * all over the place for dereferencing makes the code generation less straightforward.)
The language I'm compiling from has no aliasing, so something like Outer b = a should always be a "deep-copy". So in this case, b.inner should be a copy of a.inner and not a reference to it. But I can't figure out how to set up the constructors to create the deep-copy behavior in C++. I tried many different configurations of the constructors for Outer, and I tried both Inner& and Inner&& for storing inner.
Here is a mock example of how the generated code would look:
#include <iostream>
#include <utility> // std::move
template<typename T>
T copy(T a) {
return a;
}
struct Inner;
struct Outer {
Inner&& inner;
Outer(Inner&& a);
Outer(Outer& a);
};
struct Inner {
int v;
};
Outer::Outer(Inner&& a) : inner(std::move(a)) {
std::cout << " -- Constructor 1 --" << std::endl;
}
// Copy the insides of the original object, then move that rvalue to the new object?
Outer::Outer(Outer& a) : inner(std::move(copy(a.inner))) {
std::cout << " -- Constructor 2 --" << std::endl;
}
int main() {
Outer a = {Inner {30}};
std::cout << a.inner.v << std::endl; // Should be: 30
a.inner.v += 1;
std::cout << a.inner.v << std::endl; // Should be: 31
Outer b = a; // Copy a to b
std::cout << a.inner.v << std::endl; // Should be: 31
std::cout << b.inner.v << std::endl; // Should be: 31
b.inner.v += 1;
std::cout << a.inner.v << std::endl; // Should be: 31
std::cout << b.inner.v << std::endl; // Should be: 32
return 0;
}
And this is what it currently outputs (it may vary by implementation):
-- Constructor 1 --
30
31
-- Constructor 2 --
297374876
32574
297374876
32574
Clearly this output is incorrect, and I think I must have a dangling reference somewhere among other things. How should I set up Outer to get the proper behavior here?
References in C++ are (almost always) non owning aliases.
You do not want a non owning alias.
Thus, do not use references.
You could have an owning (smart) pointer and a reference alias to make some code generation easier. Do not do this. The result of doing it is a class with mixed semantics; there is no coherent, sensible operator= and copy/move constructors you can write in that case.
My advice would be to:
Write a value_ptr that inherits from unique_ptr but copies on assignment.
then either:
Generate code with ->
or
Add a helper method that returns *ptr reference, and generate code that does method().
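A minimal sketch of what such a value_ptr might look like. Note the assumptions: this version wraps a std::unique_ptr rather than inheriting from it (a simplification chosen here to keep the example short), and the names clone and copy_then_modify are hypothetical. Inner is defined up front, so the incomplete-type case from the question is not covered.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical value_ptr: owns a heap object like unique_ptr does,
// but copying the pointer deep-copies the object it points to.
template <typename T>
class value_ptr {
public:
    value_ptr() = default;
    explicit value_ptr(T value) : ptr_(std::make_unique<T>(std::move(value))) {}

    value_ptr(const value_ptr& other) : ptr_(clone(other.ptr_)) {}
    value_ptr(value_ptr&&) noexcept = default;

    value_ptr& operator=(const value_ptr& other) {
        if (this != &other) ptr_ = clone(other.ptr_);
        return *this;
    }
    value_ptr& operator=(value_ptr&&) noexcept = default;

    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_.get(); }

private:
    // Copies the pointee if there is one; an empty pointer stays empty.
    static std::unique_ptr<T> clone(const std::unique_ptr<T>& p) {
        if (!p) return nullptr;
        return std::make_unique<T>(*p);
    }
    std::unique_ptr<T> ptr_;
};

struct Inner { int v; };
struct Outer { value_ptr<Inner> inner; };   // generated code uses inner->v

// Mirrors the question's main(): copy, then mutate only the copy.
std::pair<int, int> copy_then_modify() {
    Outer a{value_ptr<Inner>(Inner{30})};
    a.inner->v += 1;
    Outer b = a;          // deep copy via value_ptr's copy constructor
    b.inner->v += 1;
    return {a.inner->v, b.inner->v};
}
```

Because value_ptr handles the copying, Outer needs no hand-written constructors at all, which may keep the generated code simpler.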
(I avoided using pointers because adding * all over the place for dereferencing makes the code generation less straightforward.)
Don't let your desired interface interfere so much with your implementation. Separation of interface and implementation is a powerful tool.
Your goal is a deep copy. Your temporaries will not live long enough. Something has to own the copied data so it both lives long enough (no dangling references) and does not live too long (no leaked memory). A reference does not own its data. Since the data will not be directly part of your structure, you need a pointer with ownership semantics.
This does not mean that the code has to add de-referencing "all over the place". To aid your interface, you could have a reference to the object owned by the pointer. Normally this would be wasted space, but it might serve a purpose in your project, assuming your assessment about code generation is accurate.
Example:
struct Outer {
// Order matters here! The pointer must be declared before the reference!
// (This should be less of a problem for generated code than it can be for
// code edited by human programmers.)
const std::unique_ptr<Inner> inner_ptr;
Inner & inner;
// The idea is that `inner` refers to `*inner_ptr`, and the `const` on
// `inner_ptr` will prevent `inner` from becoming a dangling reference.
// Copy constructor
Outer(const Outer& src) :
inner_ptr(std::make_unique<Inner>(src.inner)), // Make a copy
inner(*inner_ptr) // Reference to the copy
{}
// The compiler-generated assignment operator will be deleted because
// of the reference member, just as in the question's code
// (so having it deleted because of the `unique_ptr` is not an issue).
// However, to make this explicit:
Outer& operator=(const Outer&) = delete;
};
With the above setup, you could still access the members of the inner data via syntax like object.inner.field. While this is redundant with access via the object.inner_ptr->field syntax, you indicated that you have established a need for the former syntax.
For the benefit of future readers:
This approach has drawbacks that would normally cause me to recommend against it. It is a judgement call as to which drawbacks are greater – those in this approach or the "less straightforward" code generation. Sometimes machine-generated code needs a bit of inefficiency to ensure that corner cases function correctly. So this might be acceptable in this particular case.
If I may stray a bit from your desired syntax, a neater option would be to have an accessor function. Whether or not this is applicable in your situation depends on details that are appropriately out-of-scope for this question. It might be worth considering.
Instead of wasting space by storing a reference in the structure, you could generate the reference as needed via a member function. This has the side-effect of removing the need to mark the pointer const.
struct Outer {
// Note the lack of restrictions imposed on the data.
// All that might be needed is an assertion that inner_ptr will never be null.
std::unique_ptr<Inner> inner_ptr;
// Here, `inner` will be a member function instead of member data.
Inner & inner() { return *inner_ptr; }
// And a const version for good measure.
const Inner & inner() const { return *inner_ptr; }
// Copy constructor
Outer(const Outer& src) :
inner_ptr(std::make_unique<Inner>(src.inner())) // Make a copy
{}
// With this setup, the compiler-generated copy assignment
// operator is still deleted because of the `unique_ptr`.
// However, a compiler-generated *move* assignment is
// available if you specifically request it.
Outer& operator=(const Outer&) = delete;
Outer& operator=(Outer &&) = default;
};
With this setup, access to the members of the inner data could be done via syntax like object.inner().field. I don't know if the extra parentheses will cause the same issues as the asterisks would.
Deep copying only makes sense when the class has ownership. A reference isn't generally used for ownership.
Clearly this output is incorrect, and I think I must have a dangling reference somewhere among other things
You've guessed correctly. In the declaration: Outer a = {Inner {30}}; The instance of Inner is a temporary object and its lifetime extends until the end of that declaration. After that, the reference member is left dangling.
so I'm storing the inner struct as a reference
A reference doesn't store an object. A reference refers to an object that is stored somewhere else.
How should I setup Outer to get the proper behavior here?
It seems that a smart pointer might be useful for your use case:
struct Outer {
std::unique_ptr<Inner> inner;
};
You'll need to define a deep copy constructor and assignment operator though.
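One way those two members might be written, as a sketch rather than a definitive implementation. It assumes inner is never null (no move operations are declared, so moves fall back to copies), and copies_are_independent is a hypothetical helper used only to exercise the class.

```cpp
#include <cassert>
#include <memory>

struct Inner { int v; };

struct Outer {
    std::unique_ptr<Inner> inner;   // invariant assumed here: never null

    explicit Outer(Inner value) : inner(std::make_unique<Inner>(value)) {}

    // Deep copy: allocate a new Inner instead of sharing the source's.
    Outer(const Outer& other) : inner(std::make_unique<Inner>(*other.inner)) {}

    // Copy the pointed-to value into our own, already allocated Inner.
    Outer& operator=(const Outer& other) {
        *inner = *other.inner;
        return *this;
    }
};

// Hypothetical helper: checks that copies do not alias the original.
bool copies_are_independent() {
    Outer a(Inner{31});
    Outer b = a;              // copy constructor
    b.inner->v += 1;
    Outer c(Inner{0});
    c = a;                    // copy assignment
    return a.inner->v == 31 && b.inner->v == 32 && c.inner->v == 31 &&
           a.inner.get() != c.inner.get();
}
```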
I am struggling to understand the difference in behaviour of a raw pointer and a unique_ptr. I have class A with a variable x and class B with a pointer to an instance of A:
class A
{
public:
A(int y);
int x;
};
A::A(int y) : x(y)
{
}
class B
{
public:
B(A &);
A *p;
};
B::B(A &a)
{
p = &a;
}
This behaves as I would expect:
int main()
{
A a(2);
B b(a);
cout << a.x << " " << b.p->x << endl;
a.x = 4;
cout << a.x << " " << b.p->x << endl;
}
gives
2 2
4 4
Changing the raw pointer to a std::unique_ptr gives a different result:
class A
{
public:
A(int y);
int x;
};
A::A(int y) : x(y)
{
}
class B
{
public:
B(A &);
std::unique_ptr<A> p;
};
B::B(A &a)
{
p = std::make_unique<A>(a);
}
gives
2 2
4 2
Have I fundamentally misunderstood something about unique_ptrs?
make_unique creates a fresh object, one that the unique_ptr has exclusive access to. So in the second example you have two objects, not one, and when you change the value of a.x in the first object it doesn't affect the other object held by the unique_ptr.
A unique pointer needs to own whatever it points to. Your code can be made to behave like the raw-pointer version by keeping everything else unchanged and pointing the unique_ptr at a itself (no make_unique). But it will have undefined behavior, since you'll create a unique pointer to an object that is owned elsewhere, and the unique_ptr will eventually try to delete it.
To compare apples to apples, the raw pointer code should read p=new A(a);. That’s what make_unique does.
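To make the comparison concrete, here is a small sketch that puts the three cases side by side. The function name compare_alias_and_copy and the variable names are made up for this illustration; A is the class from the question with its constructor declared.

```cpp
#include <cassert>
#include <memory>

struct A {
    explicit A(int y) : x(y) {}
    int x;
};

// Hypothetical helper: true if only the aliasing pointer sees the change to a.x.
bool compare_alias_and_copy() {
    A a(2);
    A* alias = &a;                        // first version of B: points at a itself
    auto owned = std::make_unique<A>(a);  // second version of B: points at a new copy
    A* raw_copy = new A(a);               // raw-pointer equivalent of make_unique

    a.x = 4;
    bool ok = alias->x == 4 && owned->x == 2 && raw_copy->x == 2;

    delete raw_copy;                      // unique_ptr does this step automatically
    return ok;
}
```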
Try reading the following expression from the smart pointer version:
std::make_unique<A>(a);
"make a unique A from a" (no mention of pointers!)
The result is a unique_ptr, but when reading the expression, read it as making (an object whose type is) the template parameter. The function parameters are parameters to the constructor. In this case, you are making an A object from an A object, which pulls in the copy constructor.
Once you understand that the smart pointer version is making a new A object (and your raw pointer version does not), your results should make sense.
The "unique" in "unique A" might be tricky to understand. Think of it as an object that no one else can lay claim to. It might be a copy of another object, but, taking the role of the unique_ptr, it is your copy, your responsibility to clean up after, and no one else's. Your preciousss, which you will not share (c.f. std::make_shared).
Note that a local variable (like the a in the main function) is the responsibility of the compiler, so it is ineligible to be the object to which a unique_ptr points (or any smart pointer, for that matter).
Let's say that I have some arbitrary class, A:
class A {
//... stuff
};
I want to call into an external API that takes in a shared pointer to some type, like so (I cannot change this interface):
//...much later
void foo(std::shared_ptr<A> _a){
//operate on _a as a shared_ptr
}
However, in the (legacy) code I'm working with, the class A instance I'm working with is allocated on the stack (which I cannot get around):
A a;
//...some stuff on a
//Now time to call foo
On top of this, an instance of class A is quite large, on the order of 1 GB per instance.
I know I could call
foo(std::make_shared<A>(a));
but that would allocate memory for a copy of A, which I would really like to avoid.
Question
Is there a way to hack together some call to std::make_shared (possibly with move semantics) so that I am not forced to allocate memory for another instance of class A?
I've tried something like this:
foo(std::make_shared<A>(std::move(a)));
But from what I can tell, a new instance of A is still created.
Example code
#include <iostream>
#include <memory>
using namespace std;
class A{
public:
A(int _var=42) : var(_var){cout << "Default" << endl;}
A(const A& _rhs) : var(_rhs.var){cout << "Copy" << endl;}
A(A&& _rhs) : var(std::move(_rhs.var)){cout << "Move" << endl;}
int var;
};
void foo(std::shared_ptr<A> _a){
_a->var = 43;
cout << _a->var << endl;
}
int main() {
A a;
cout << a.var << endl;
foo(std::make_shared<A>(std::move(a)));
cout << a.var << endl;
a.var = 44;
foo(std::make_shared<A>(std::move(a)));
cout << a.var << endl;
return 0;
}
Output:
Default
42
Move
43
42
Move
43
44
This is possible with the shared_ptr constructor that allows for an "empty instance with non-null stored pointer":
A x;
std::shared_ptr<A> i_dont_own(std::shared_ptr<A>(), &x);
(It's "overload (8)" on the cppreference documentation.)
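A short sketch of how that constructor could be used with the question's foo. The A and foo below are simplified stand-ins, and call_foo_without_copy is a hypothetical helper. The resulting shared_ptr owns nothing (its use_count is 0), so it must not outlive the stack object.

```cpp
#include <cassert>
#include <memory>

struct A { int var = 42; };

// Stand-in for the external API from the question.
void foo(std::shared_ptr<A> p) { p->var = 43; }

// Hypothetical helper: passes a stack object to foo without copying it.
bool call_foo_without_copy() {
    A a;                                                // lives on the stack
    std::shared_ptr<A> view{std::shared_ptr<A>{}, &a};  // aliasing constructor: owns nothing
    bool non_owning = view.use_count() == 0 && view.get() == &a;
    foo(view);                                          // modifies a in place
    return non_owning && a.var == 43;
}
```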
If you know that the shared pointer you pass to foo() will not be stored, copied, etc. (i.e. it will not outlive your object), you can make a std::shared_ptr that points to the object on the stack with an empty deleter:
void emptyDeleter( A * ) {}
A a;
foo( std::shared_ptr<A>( &a, emptyDeleter ) );
Again, you need to make sure that the shared pointer or any copy of it will not outlive the object, and document this hack well.
Assuming class A supports move semantics, do this:
std::shared_ptr<A> newA = std::make_shared<A>(std::move(a));
Do not use a anymore, use only newA. You can now pass newA to the function.
If class A does not support move semantics, there is no safe/sane way to do this. Any hack will only happen to work, and may break in the future. If you control enough of the class code, you may be able to add support for move semantics.
But from what I can tell, a new instance of A is still created.
Why do you care? What you're trying to avoid is copying all the data in the instance, and this does that.
The point of move semantics is to move the data from one instance to another without having to do an allocate/copy/free. Of course, this makes the original instance "empty", so don't use that anymore.
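A sketch of what that means in practice, under the assumption that the large class keeps its bulk data in a heap buffer (here a std::vector inside a hypothetical Big type). If the data were stored inline, for example in a plain array member, a move would still copy every byte.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical large class: the bulk of its data lives in a heap buffer.
struct Big {
    std::vector<int> data;
    explicit Big(std::size_t n) : data(n, 7) {}
};

// Hypothetical helper: true if the moved-to object reuses the original buffer.
bool move_reuses_buffer() {
    Big a(1000);
    const int* before = a.data.data();
    // A new Big object is created, but it takes over a's buffer
    // instead of allocating and copying 1000 ints.
    auto p = std::make_shared<Big>(std::move(a));
    return p->data.data() == before && p->data.size() == 1000 && a.data.empty();
}
```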
I have written the following sample code:
#include <iostream>
class B
{
int Value;
public:
B(int V) : Value(V) {}
int GetValue(void) const { return Value;}
};
class A
{
const B& b;
public:
A(const B &ObjectB) : b(ObjectB) {}
int GetValue(void) { return b.GetValue();}
};
B b(5);
A a1(B(5));
A a2(b);
A a3(B(3));
int main(void)
{
std::cout << a1.GetValue() << std::endl;
std::cout << a2.GetValue() << std::endl;
std::cout << a3.GetValue() << std::endl;
return 0;
}
Compiled with mingw-g++ and executed on Windows 7, I get
6829289
5
1875385008
So, what I get from the output is that the two anonymous objects are destroyed as soon as the initialization has completed, even though they are declared in a global context.
Hence my question: is there a way to be sure that a const reference stored in a class will always refer to a valid object?
One thing you can do in class A:
A(B&&) = delete;
That way, the two lines that try to construct an A from a B temporary will fail to compile.
That obviously won't prevent you providing some other B object with a lifetime shorter than the A object referencing it, but it's a step in the right direction and may catch some accidental/careless abuses. (Other answers already discuss design scenarios and alternatives - I won't cover that same ground / whether references can be safe(r) in the illustrated scenario is already an important question.)
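A compact sketch of the effect, using the question's classes with slightly modernized member names (value_ and b_ are renamed here for illustration, and value_through_reference is a hypothetical helper). The static_asserts show which constructions the compiler accepts.

```cpp
#include <cassert>
#include <type_traits>

class B {
    int value_;
public:
    explicit B(int v) : value_(v) {}
    int GetValue() const { return value_; }
};

class A {
    const B& b_;
public:
    A(const B& b) : b_(b) {}
    A(B&&) = delete;             // reject temporaries at compile time
    int GetValue() const { return b_.GetValue(); }
};

// Binding to a named object still compiles; binding to a temporary does not.
static_assert(std::is_constructible<A, B&>::value, "lvalue B is accepted");
static_assert(!std::is_constructible<A, B>::value, "temporary B is rejected");

int value_through_reference() {
    B b(5);          // outlives the A below
    A a(b);
    return a.GetValue();
}
```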
No, there is not. Remember that references are typically implemented as pointers under the hood, and normally don't control the lifetime of the object they reference (see here for an exception, although it doesn't apply in this case). I would recommend simply storing a B object by value in A, if you need that guarantee.
Also, you could utilize an object such as a shared_ptr in C++11, which will only eliminate the object once both the pointer in the function and in the object have been destroyed.
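A sketch of the shared_ptr variant, assuming it is acceptable for B to be heap-allocated. The member names and the helper value_after_original_handle_is_gone are made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class B {
    int value_;
public:
    explicit B(int v) : value_(v) {}
    int GetValue() const { return value_; }
};

class A {
    std::shared_ptr<const B> b_;   // shares ownership, so B cannot be destroyed first
public:
    explicit A(std::shared_ptr<const B> b) : b_(std::move(b)) {}
    int GetValue() const { return b_->GetValue(); }
};

// Hypothetical helper: A keeps B alive after the caller drops its handle.
int value_after_original_handle_is_gone() {
    auto b = std::make_shared<B>(3);
    A a(b);               // shared_ptr<B> converts to shared_ptr<const B>
    b.reset();            // the caller's handle is gone; A still owns B
    return a.GetValue();
}
```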
I'm learning C++ at the moment and though I grasp the concept of pointers and references for the better part, some things are unclear.
Say I have the following code (assume Rectangle is valid, the actual code is not important):
#include <iostream>
#include "Rectangle.h"
void changestuff(Rectangle& rec);
int main()
{
Rectangle rect;
rect.set_x(50);
rect.set_y(75);
std::cout << "x,y: " << rect.get_x() << rect.get_y() << sizeof(rect) << std::endl;
changestuff(rect);
std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
Rectangle* rectTwo = new Rectangle();
rectTwo->set_x(15);
rectTwo->set_y(30);
std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
changestuff(*rectTwo);
std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
std::cout << rectTwo << std::endl;
}
void changestuff(Rectangle& rec)
{
rec.set_x(10);
rec.set_y(11);
}
Now, the actual Rectangle object isn't passed, merely a reference to it; its address.
Why should I use the 2nd method over the first one? Why can't I pass rectTwo to changestuff, but *rectTwo? In what way does rectTwo differ from rect?
There really isn't any reason you can't. In C, you only had pointers. C++ introduced references, and passing by reference is usually the preferred way in C++. It produces cleaner code that is syntactically simpler.
Let's take your code and add a new function to it:
#include <iostream>
#include "Rectangle.h"
void changestuff(Rectangle& rec);
void changestuffbyPtr(Rectangle* rec);
int main()
{
Rectangle rect;
rect.set_x(50);
rect.set_y(75);
std::cout << "x,y: " << rect.get_x() << rect.get_y() << sizeof(rect) << std::endl;
changestuff(rect);
std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
changestuffbyPtr(&rect);
std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
Rectangle* rectTwo = new Rectangle();
rectTwo->set_x(15);
rectTwo->set_y(30);
std::cout << "x,y: " << rectTwo->get_x() << rectTwo->get_y() << std::endl;
changestuff(*rectTwo);
std::cout << "x,y: " << rectTwo->get_x() << rectTwo->get_y() << std::endl;
changestuffbyPtr(rectTwo);
std::cout << "x,y: " << rectTwo->get_x() << rectTwo->get_y() << std::endl;
std::cout << rectTwo << std::endl;
}
void changestuff(Rectangle& rec)
{
rec.set_x(10);
rec.set_y(11);
}
void changestuffbyPtr(Rectangle* rec)
{
rec->set_x(10);
rec->set_y(11);
}
Difference between using the stack and heap:
#include <iostream>
#include "Rectangle.h"
Rectangle* createARect1();
Rectangle* createARect2();
int main()
{
// this is being created on the stack which because it is being created in main,
// belongs to the stack for main. This object will be automatically destroyed
// when main exits, because the stack that main uses will be destroyed.
Rectangle rect;
// rectTwo is being created on the heap. The memory here will *not* be released
// after main exits (well technically it will be by the operating system)
Rectangle* rectTwo = new Rectangle();
// this is going to create a memory leak unless we explicitly call delete on r1.
Rectangle* r1 = createARect1();
// this should cause a compiler warning:
Rectangle* r2 = createARect2();
}
Rectangle* createARect1()
{
// this will be creating a memory leak unless we remember to explicitly delete it:
Rectangle* r = new Rectangle;
return r;
}
Rectangle* createARect2()
{
// this is not allowed, since when the function returns the rect will no longer
// exist since its stack was destroyed after the function returns:
Rectangle r;
return &r;
}
It is also worth mentioning that a huge difference between pointers and references is that you cannot create a reference that is uninitialized. So this is perfectly legal:
int *b;
while this is not:
int& b;
A reference has to refer to something. This makes references awkward in polymorphic situations where you do not yet know, at the point of declaration, which object will be referred to. For instance:
// let's assume A is some interface:
class A
{
public:
// pure virtual: must be declared virtual to use "= 0"
virtual void doSomething() = 0;
};
class B : public A
{
public:
void doSomething() {}
};
class C : public A
{
public:
void doSomething() {}
};
int main()
{
// since A contains a pure virtual function, we can't instantiate it. But we can
// instantiate B and C
B* b = new B;
C* c = new C;
// or
A* ab = new B;
A* ac = new C;
// but what if we didn't know at compile time which one to create? B or C?
// we have to use pointers here, since a reference can't point to null or
// be uninitialized
A* a1 = 0;
if (decideWhatToCreate() == CREATE_B)
a1 = new B;
else
a1 = new C;
}
In C++, objects can be allocated on the heap or on the stack. The stack is valid only locally, that is when you leave the current function, the stack and all contents will be destroyed.
On the contrary, heap-objects (which must be specifically allocated using new) will live as long as you don't delete them.
Now the idea is that you, as a caller, should not need to know what a method does internally (encapsulation). Since the method might actually store and keep the reference you have passed to it, this might be dangerous: when the calling function returns, its stack-objects will be destroyed, but the stored references are kept.
In your simple example, it all doesn't matter too much because the program will end when main() exits anyhow. However, for every program that is just a little more complex, this can lead to serious trouble.
You need to understand that references are NOT pointers. They may be implemented using them (or they may not) but a reference in C++ is a completely different beast to a pointer.
That being said, any function that takes a reference can be used with pointers simply by dereferencing them (and vice versa). Given:
class A {};
void f1( A & a ) {} // parameter is reference
void f2( A * a ) {} // parameter is pointer
you can say:
A a;
f1( a );
f2( &a );
and:
A * p = new A;
f1( *p );
f2( p );
Which should you use when? Well that comes down to experience, but general good practice is:
prefer to allocate objects automatically on the stack rather than using new whenever possible
pass objects using references (preferably const references) whenever possible
rectTwo differs from rect in that rect is an instance of a Rectangle on the stack and rectTwo is the address of a Rectangle on the heap. If you pass a Rectangle by value, a copy of it is made, and any changes made inside changestuff() will not be visible outside of it.
Passing it by reference means that changestuff will have the memory address of the Rectangle instance itself, and changes are not limited to the scope of changestuff (because neither is the Rectangle).
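The difference can be sketched with a minimal stand-in for Rectangle (the question's real class is not shown, so the struct and the function names below are assumptions made for illustration):

```cpp
#include <cassert>

// Hypothetical minimal Rectangle; only the two coordinates matter here.
struct Rectangle {
    int x;
    int y;
};

// Takes a copy: the caller's object is left untouched.
void change_copy(Rectangle rec) { rec.x = 10; rec.y = 11; }

// Takes a reference: the caller's object is modified.
void change_original(Rectangle& rec) { rec.x = 10; rec.y = 11; }

bool demo_value_vs_reference() {
    Rectangle rect{50, 75};
    change_copy(rect);
    bool unchanged = rect.x == 50 && rect.y == 75;
    change_original(rect);
    bool changed = rect.x == 10 && rect.y == 11;
    return unchanged && changed;
}
```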
Edit: your comment made the question more clear. Generally, a reference is safer than a pointer.
From Wikipedia:
It is not possible to refer directly to a reference object after it is defined; any occurrence of its name refers directly to the object it references.
Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.
References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid.
References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global variables must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor.
Additionally, objects allocated on the heap can lead to memory leaks, whereas objects allocated on the stack will not.
So, use pointers when they are necessary, and references otherwise.
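A tiny sketch of the reseating point from the quote above (the function name is made up for this example): assigning to a reference writes through to the object it refers to, while assigning to a pointer changes what the pointer refers to.

```cpp
#include <cassert>

bool references_cannot_be_reseated() {
    int a = 1;
    int b = 2;

    int& ref = a;
    ref = b;             // assigns b's value to a; ref still refers to a
    bool ref_ok = (&ref == &a) && a == 2;

    int* ptr = &a;
    ptr = &b;            // reseats the pointer; a is untouched
    ptr = nullptr;       // a pointer may also refer to nothing
    return ref_ok && ptr == nullptr && a == 2 && b == 2;
}
```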
Quite a few application domains require the use of pointers. Pointers are needed when you have intimate knowledge about how your memory is laid out. This knowledge could be because you intended the memory to be laid out in a certain way, or because the layout is out of your control. When this is the case you need pointers.
Why would you manually structure the memory for a certain problem domain? Because an optimal memory layout for certain problems can be orders of magnitude faster than traditional techniques.
Example domains:
Enterprise Databases.
Kernel design.
Drivers.
General purpose Linear Algebra.
Binary Data serialization.
Slab Memory allocators for transaction processing (web-servers).
Video game engines.
Embedded real-time programming.
Image processing
Unicode Utility functions.
You are right to say that the actual Rectangle object isn't passed, merely a reference to it. In fact you can never 'pass' any object or anything else really. You can only 'pass' a copy of something as a parameter to a function.
The something that you can pass could be a copy of a value, like an int, or a copy of an object, or a copy of a pointer or reference. So, in my mind, passing a copy of either a pointer or a reference is logically the same thing - syntactically it's different, hence the parameter being either rect or *rectTwo.
References in C++ are a distinct advantage over C, since they allow the programmer to declare and define operators that look syntactically identical to those that are available for integers.
eg. the form: a=b+c can be used for ints or Rectangles.
This is why you can have changestuff(rect); because the parameter is a reference and a reference to (pointer to) rect is taken automatically. When you have the pointer Rectangle* rectTwo; it is an 'object' in its own right and you can operate on it, eg reassign it or increment it. C++ has chosen to not convert this to a reference to an object, you have to do this manually by 'dereferencing' the pointer to get to the object, which is then automatically converted to a reference. This is what *rectTwo means: dereferencing a pointer.
So, rectTwo is a pointer to a Rectangle, but rect is a rectangle, or a reference to a Rectangle.