Reference member variables as class members - c++

At my workplace I see this style used extensively:
#include <iostream>
using namespace std;
class A
{
public:
A(int& thing) : m_thing(thing) {}
void printit() { cout << m_thing << endl; }
protected:
const int& m_thing; //usually would be more complex object
};
int main(int argc, char* argv[])
{
int myint = 5;
A myA(myint);
myA.printit();
return 0;
}
Is there a name to describe this idiom? I am assuming it is to prevent the possibly large overhead of copying a big complex object?
Is this generally good practice? Are there any pitfalls to this approach?

Is there a name to describe this idiom?
In UML it is called aggregation. It differs from composition in that the member object is not owned by the referring class. In C++ you can implement aggregation in two different ways, through references or pointers.
I am assuming it is to prevent the possibly large overhead of copying a big complex object?
No, that would be a really bad reason to use this. The main reason for aggregation is that the contained object is not owned by the containing object and thus their lifetimes are not bound. In particular, the referenced object must outlive the referring one. It might have been created much earlier and might live beyond the end of the lifetime of the container. Besides that, the state of the referenced object is not controlled by the class, but can change externally. If the reference is not const, then the class can change the state of an object that lives outside of it.
Is this generally good practice? Are there any pitfalls to this approach?
It is a design tool. In some cases it will be a good idea, in some it won't. The most common pitfall is that the lifetime of the object holding the reference must never exceed the lifetime of the referenced object. If the enclosing object uses the reference after the referenced object was destroyed, you will have undefined behavior. In general it is better to prefer composition to aggregation, but if you need it, it is as good a tool as any other.
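To make the aggregation versus composition distinction concrete, here is a minimal sketch. The names Settings, Logger and Snapshot are hypothetical and not from the question; the only point is that the aggregate observes external changes while the composed copy does not.

```cpp
#include <cassert>

// Hypothetical type standing in for the "more complex object".
struct Settings {
    int verbosity = 0;
};

// Aggregation: Logger refers to a Settings object owned by someone else.
class Logger {
public:
    explicit Logger(const Settings& settings) : m_settings(settings) {}
    int verbosity() const { return m_settings.verbosity; }
private:
    const Settings& m_settings;  // not owned; must outlive this Logger
};

// Composition: Snapshot owns its own copy of the Settings.
class Snapshot {
public:
    explicit Snapshot(const Settings& settings) : m_settings(settings) {}
    int verbosity() const { return m_settings.verbosity; }
private:
    Settings m_settings;  // owned copy; lifetime bound to this Snapshot
};

void demo() {
    Settings settings;            // declared first, so it is destroyed last
    Logger logger(settings);
    Snapshot snapshot(settings);
    settings.verbosity = 3;       // state changes outside both classes
    assert(logger.verbosity() == 3);    // the aggregate sees the change
    assert(snapshot.verbosity() == 0);  // the copy does not
}
```

Because settings is declared before logger, it is destroyed after it, which is exactly the lifetime ordering the answer asks for.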

It's called dependency injection via constructor injection: class A gets the dependency as an argument to its constructor and saves a reference to the dependency as a private member.
There's an interesting introduction on Wikipedia.
For const-correctness I'd write:
using T = int;
class A
{
public:
A(const T &thing) : m_thing(thing) {}
// ...
private:
const T &m_thing;
};
but a problem with this class is that it accepts references to temporary objects:
T t;
A a1{t}; // this is ok, but...
A a2{T()}; // ... this is BAD.
It's better to add (requires C++11 at least):
class A
{
public:
A(const T &thing) : m_thing(thing) {}
A(const T &&) = delete; // prevents rvalue binding
// ...
private:
const T &m_thing;
};
Anyway if you change the constructor:
class A
{
public:
A(const T *thing) : m_thing((assert(thing), *thing)) {} // check before dereferencing; needs <cassert>
// ...
private:
const T &m_thing;
};
it's pretty much guaranteed that you won't have a pointer to a temporary.
Also, since the constructor takes a pointer, it's clearer to users of A that they need to pay attention to the lifetime of the object they pass.
Somewhat related topics are:
Should I prefer pointers or references in member data?
Using reference as class members for dependencies
GotW #88
Forbid rvalue binding via constructor to member const reference
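One way to convince yourself that the deleted overload really rejects temporaries is to ask the type traits. This is only a sketch: the thing() accessor is an addition for illustration, everything else mirrors the class above.

```cpp
#include <cassert>
#include <type_traits>

using T = int;

class A {
public:
    A(const T& thing) : m_thing(thing) {}
    A(const T&&) = delete;  // selected for rvalues, so temporaries are rejected
    const T& thing() const { return m_thing; }  // accessor added for the demo
private:
    const T& m_thing;
};

// Lvalues are accepted, rvalues are rejected at compile time.
static_assert(std::is_constructible<A, T&>::value, "lvalue accepted");
static_assert(std::is_constructible<A, const T&>::value, "const lvalue accepted");
static_assert(!std::is_constructible<A, T&&>::value, "rvalue rejected");
static_assert(!std::is_constructible<A, const T&&>::value, "const rvalue rejected");

void demo() {
    T t = 5;
    A a(t);
    assert(&a.thing() == &t);  // the member aliases the caller's object
}
```

Note that this only catches temporaries; it does nothing about an lvalue that is destroyed before the A that refers to it.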

Is there a name to describe this idiom?
There is no name for this usage, it is simply known as "Reference as class member".
I am assuming it is to prevent the possibly large overhead of copying a big complex object?
Yes and also scenarios where you want to associate the lifetime of one object with another object.
Is this generally good practice? Are there any pitfalls to this approach?
Depends on your usage. Using any language feature is like "choosing horses for courses". It is important to note that almost every language feature exists because it is useful in some scenario.
There are a few important points to note when using references as class members:
You need to ensure that the referred object is guaranteed to exist till your class object exists.
You need to initialize the member in the constructor member initializer list. You cannot have a lazy initialization, which could be possible in case of pointer member.
The compiler will not generate a usable copy assignment operator=() (the implicit one is defined as deleted), and you will have to provide one yourself if you need it. It is cumbersome to determine what action your = operator should take in such a case. So basically your class becomes non-assignable.
References cannot be NULL or made to refer to another object. If you need reseating, it is not possible with a reference as it is with a pointer.
For most practical purposes (unless you are really concerned about high memory usage due to member size) just having a member instance, instead of a pointer or reference member, should suffice. This saves you a whole lot of worrying about the other problems that reference/pointer members bring along, at the expense of extra memory usage.
If you must use a pointer, make sure you use a smart pointer instead of a raw pointer. That would make your life much easier with pointers.
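The assignability and reseating points can be checked with type traits. The sketch below uses hypothetical names (Config, HoldsRef, HoldsWrapper); std::reference_wrapper is shown as one possible alternative when reseating is wanted, not as the recommended fix for every case.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

struct Config { int level; };  // hypothetical referenced type

class HoldsRef {
public:
    explicit HoldsRef(const Config& c) : m_config(c) {}
    int level() const { return m_config.level; }
private:
    const Config& m_config;  // reference member: cannot be reseated
};

class HoldsWrapper {
public:
    explicit HoldsWrapper(const Config& c) : m_config(c) {}
    int level() const { return m_config.get().level; }
private:
    std::reference_wrapper<const Config> m_config;  // reseatable, never null
};

static_assert(std::is_copy_constructible<HoldsRef>::value, "copying still works");
static_assert(!std::is_copy_assignable<HoldsRef>::value, "assignment is deleted");
static_assert(std::is_copy_assignable<HoldsWrapper>::value, "wrapper is assignable");

void demo() {
    Config low{1}, high{9};
    HoldsWrapper a(low), b(high);
    assert(a.level() == 1);
    a = b;                   // reseats a to refer to 'high'
    assert(a.level() == 9);
    assert(low.level == 1);  // the previously referenced object is untouched
}
```

The lifetime rule from the first bullet still applies to the wrapper: it does not own anything either.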

C++ provides a good mechanism to manage the lifetime of an object through class/struct constructs. This is one of the best features of C++ over other languages.
When you have member variables exposed through ref or pointer it violates the encapsulation in principle. This idiom enables the consumer of the class to change the state of an object of A without A having any knowledge or control of it. It also enables the consumer to hold on to a ref/pointer to A's internal state, beyond the lifetime of the object of A. This is bad design. Instead the class could be refactored to hold a ref/pointer to the shared object (not own it) and these could be set using the constructor (mandating the lifetime rules). The shared object's class may be designed to support multithreading/concurrency as the case may apply.

I wanted to add a point, with some code, that was (somewhat) introduced in manilo's (great!) answer:
As David Rodríguez - dribeas mentioned (in his great answer as well!), there are two "forms" of aggregation: by pointer and by reference. Take into account that if the latter is used (by reference, as in your example), then the container class can NOT have a default constructor, because all class members of reference type MUST be initialized at construction time.
The code below will NOT compile if you uncomment the default ctor implementation (g++ version 11.3.0 outputs the error below):
error: uninitialized reference member in ‘class AggregatedClass&’ [-fpermissive]
MyClass()
#include <iostream>
using namespace std;
class AggregatedClass
{
public:
explicit AggregatedClass(int a) : m_a(a)
{
cout << "AggregatedClass::AggregatedClass - set m_a:" << m_a << endl;
}
void func1()
{
cout << "AggregatedClass::func1" << endl;
}
~AggregatedClass()
{
cout << "AggregatedClass::~AggregatedClass" << endl;
}
private:
int m_a;
};
class MyClass
{
public:
explicit MyClass(AggregatedClass& obj) : m_aggregatedClass(obj)
{
cout << "MyClass::MyClass(AggregatedClass& obj)" << endl;
}
/* this ctor can not be compiled
MyClass()
{
cout << "MyClass::MyClass()" << endl;
}
*/
void func1()
{
cout << "MyClass::func1" << endl;
m_aggregatedClass.func1();
}
~MyClass()
{
cout << "MyClass::~MyClass" << endl;
}
private:
AggregatedClass& m_aggregatedClass;
};
int main(int argc, char** argv)
{
cout << "main - start" << endl;
// first we need to create the aggregated object
AggregatedClass aggregatedObj(15);
MyClass obj(aggregatedObj);
obj.func1();
cout << "main - end" << endl;
return 0;
}
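For comparison, here is a sketch of the other form of aggregation, by pointer, where a default constructor is possible. Part, ByReference and ByPointer are hypothetical names, and attach() is just one way to set the pointer later.

```cpp
#include <cassert>
#include <type_traits>

struct Part { int id; };  // hypothetical aggregated type

class ByReference {
public:
    explicit ByReference(Part& part) : m_part(part) {}
    int id() const { return m_part.id; }
private:
    Part& m_part;  // must be bound in every constructor
};

class ByPointer {
public:
    ByPointer() = default;  // allowed: the pointer simply starts out null
    explicit ByPointer(Part& part) : m_part(&part) {}
    void attach(Part& part) { m_part = &part; }  // lazy initialization
    bool attached() const { return m_part != nullptr; }
    int id() const { assert(m_part); return m_part->id; }
private:
    Part* m_part = nullptr;  // not owned
};

static_assert(!std::is_default_constructible<ByReference>::value,
              "a reference member leaves no room for a default ctor here");
static_assert(std::is_default_constructible<ByPointer>::value,
              "a pointer member can start out null");

void demo() {
    Part part{15};
    ByPointer holder;  // no Part yet
    assert(!holder.attached());
    holder.attach(part);
    assert(holder.id() == 15);
}
```

The price of the pointer form is that every use has to consider the null state, which the reference form rules out by construction.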

Member references are usually considered bad. They make life hard compared to member pointers. But it's not particularly unusual, nor is it some special named idiom or thing. It's just aliasing.

Related

Deep-copy of struct with reference member in C++17

I'm still fairly new to C++ and am confused about references and move semantics. For a compiler I'm writing that generates C++17 code, I need to be able to have structs with fields that are other structs. Since the struct definitions will be generated from the user's code in the other language, they could potentially be very large, so I'm storing the inner struct as a reference. This is also necessary to deal with incomplete types that are declared at the beginning but defined later, which may happen in the generated code. (I avoided using pointers because adding * all over the place for dereferencing makes the code generation less straightforward.)
The language I'm compiling from has no aliasing, so something like Outer b = a should always be a "deep-copy". So in this case, b.inner should be a copy of a.inner and not a reference to it. But I can't figure out how to setup the constructors to create the deep-copy behavior in C++. I tried many different configurations of the constructors for Outer, and I tried both Inner& and Inner&& for storing inner.
Here is a mock example of how the generated code would look:
#include <iostream>
template<typename T>
T copy(T a) {
return a;
}
struct Inner;
struct Outer {
Inner&& inner;
Outer(Inner&& a);
Outer(Outer& a);
};
struct Inner {
int v;
};
Outer::Outer(Inner&& a) : inner(std::move(a)) {
std::cout << " -- Constructor 1 --" << std::endl;
}
// Copy the insides of the original object, then move that rvalue to the new object?
Outer::Outer(Outer& a) : inner(std::move(copy(a.inner))) {
std::cout << " -- Constructor 2 --" << std::endl;
}
int main() {
Outer a = {Inner {30}};
std::cout << a.inner.v << std::endl; // Should be: 30
a.inner.v += 1;
std::cout << a.inner.v << std::endl; // Should be: 31
Outer b = a; // Copy a to b
std::cout << a.inner.v << std::endl; // Should be: 31
std::cout << b.inner.v << std::endl; // Should be: 31
b.inner.v += 1;
std::cout << a.inner.v << std::endl; // Should be: 31
std::cout << b.inner.v << std::endl; // Should be: 32
return 0;
}
And this is what it currently outputs (it may vary by implementation):
-- Constructor 1 --
30
31
-- Constructor 2 --
297374876
32574
297374876
32574
Clearly this output is incorrect, and I think I must have a dangling reference somewhere among other things. How should I setup Outer to get the proper behavior here?
References in C++ are (almost always) non owning aliases.
You do not want a non owning alias.
Thus, do not use references.
You could have an owning (smart) pointer and a reference alias to make some code generation easier. Do not do this. The result of doing it is a class with mixed semantics; there is no coherent, sensible operator= and copy/move constructors you can write in that case.
My advice would be to:
Write a value_ptr that inherits from unique_ptr but copies on assignment.
then either:
Generate code with ->
or
Add a helper method that returns *ptr reference, and generate code that does method().
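A rough sketch of what such a value_ptr could look like follows. It is an assumption-laden illustration, not a finished utility: it wraps a unique_ptr instead of inheriting from it, it requires T to be a complete, copyable type where it is used, and it needs C++14 or later for std::make_unique.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Sketch of an owning pointer that deep-copies its pointee on copy.
template <typename T>
class value_ptr {
public:
    explicit value_ptr(T value) : m_ptr(std::make_unique<T>(std::move(value))) {}

    value_ptr(const value_ptr& other) : m_ptr(clone(other)) {}
    value_ptr& operator=(const value_ptr& other) {
        if (this != &other) m_ptr = clone(other);
        return *this;
    }
    value_ptr(value_ptr&&) noexcept = default;
    value_ptr& operator=(value_ptr&&) noexcept = default;

    T& operator*() { return *m_ptr; }
    const T& operator*() const { return *m_ptr; }
    T* operator->() { return m_ptr.get(); }
    const T* operator->() const { return m_ptr.get(); }

private:
    // A moved-from value_ptr is empty; copying it yields another empty one.
    static std::unique_ptr<T> clone(const value_ptr& other) {
        if (!other.m_ptr) return nullptr;
        return std::make_unique<T>(*other.m_ptr);
    }
    std::unique_ptr<T> m_ptr;
};

struct Inner { int v; };
struct Outer { value_ptr<Inner> inner; };  // generated code would use ->

void demo() {
    Outer a{value_ptr<Inner>(Inner{30})};
    a.inner->v += 1;
    Outer b = a;          // deep copy through value_ptr's copy constructor
    b.inner->v += 1;
    assert(a.inner->v == 31);
    assert(b.inner->v == 32);
}
```

With this, Outer needs no hand-written copy constructor or assignment operator; the compiler-generated ones do the deep copy.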
(I avoided using pointers because adding * all over the place for dereferencing makes the code generation less straightforward.)
Don't let your desired interface interfere so much with your implementation. Separation of interface and implementation is a powerful tool.
Your goal is a deep copy. Your temporaries will not live long enough. Something has to own the copied data so it both lives long enough (no dangling references) and does not live too long (no leaked memory). A reference does not own its data. Since the data will not be directly part of your structure, you need a pointer with ownership semantics.
This does not mean that the code has to add de-referencing "all over the place". To aid your interface, you could have a reference to the object owned by the pointer. Normally this would be wasted space, but it might serve a purpose in your project, assuming your assessment about code generation is accurate.
Example:
struct Outer {
// Order matters here! The pointer must be declared before the reference!
// (This should be less of a problem for generated code than it can be for
// code edited by human programmers.)
const std::unique_ptr<Inner> inner_ptr;
Inner & inner;
// The idea is that `inner` refers to `*inner_ptr`, and the `const` on
// `inner_ptr` will prevent `inner` from becoming a dangling reference.
// Copy constructor
Outer(const Outer& src) :
inner_ptr(std::make_unique<Inner>(src.inner)), // Make a copy
inner(*inner_ptr) // Reference to the copy
{}
// The compiler-generated assignment operator will be deleted because
// of the reference member, just as in the question's code
// (so having it deleted because of the `unique_ptr` is not an issue).
// However, to make this explicit:
Outer& operator=(const Outer&) = delete;
};
With the above setup, you could still access the members of the inner data via syntax like object.inner.field. While this is redundant with access via the object.inner_ptr->field syntax, you indicated that you have established a need for the former syntax.
For the benefit of future readers:
This approach has drawbacks that would normally cause me to recommend against it. It is a judgement call as to which drawbacks are greater – those in this approach or the "less straightforward" code generation. Sometimes machine-generated code needs a bit of inefficiency to ensure that corner cases function correctly. So this might be acceptable in this particular case.
If I may stray a bit from your desired syntax, a neater option would be to have an accessor function. Whether or not this is applicable in your situation depends on details that are appropriately out-of-scope for this question. It might be worth considering.
Instead of wasting space by storing a reference in the structure, you could generate the reference as needed via a member function. This has the side-effect of removing the need to mark the pointer const.
struct Outer {
// Note the lack of restrictions imposed on the data.
// All that might be needed is an assertion that inner_ptr will never be null.
std::unique_ptr<Inner> inner_ptr;
// Here, `inner` will be a member function instead of member data.
Inner & inner() { return *inner_ptr; }
// And a const version for good measure.
const Inner & inner() const { return *inner_ptr; }
// Copy constructor
Outer(const Outer& src) :
inner_ptr(std::make_unique<Inner>(src.inner())) // Make a copy
{}
// With this setup, the compiler-generated copy assignment
// operator is still deleted because of the `unique_ptr`.
// However, a compiler-generated *move* assignment is
// available if you specifically request it.
Outer& operator=(const Outer&) = delete;
Outer& operator=(Outer &&) = default;
};
With this setup, access to the members of the inner data could be done via syntax like object.inner().field. I don't know if the extra parentheses will cause the same issues as the asterisks would.
Deep copying only makes sense when the class has ownership. A reference isn't generally used for ownership.
Clearly this output is incorrect, and I think I must have a dangling reference somewhere among other things
You've guessed correctly. In the declaration: Outer a = {Inner {30}}; The instance of Inner is a temporary object and its lifetime extends until the end of that declaration. After that, the reference member is left dangling.
so I'm storing the inner struct as a reference
A reference doesn't store an object. A reference refers to an object that is stored somewhere else.
How should I setup Outer to get the proper behavior here?
It seems that a smart pointer might be useful for your use case:
struct Outer {
std::unique_ptr<Inner> inner;
};
You'll need to define a deep copy constructor and assignment operator though.
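A minimal sketch of those two members, assuming Inner is complete and copyable at this point and that inner is never null when copied from (a moved-from Outer would violate that). It uses std::make_unique, so C++14 or later.

```cpp
#include <cassert>
#include <memory>

struct Inner { int v; };

struct Outer {
    std::unique_ptr<Inner> inner;

    explicit Outer(Inner value) : inner(std::make_unique<Inner>(value)) {}

    // Deep copy: allocate a new Inner holding a copy of the source's data.
    Outer(const Outer& other) : inner(std::make_unique<Inner>(*other.inner)) {}
    Outer& operator=(const Outer& other) {
        if (this != &other) inner = std::make_unique<Inner>(*other.inner);
        return *this;
    }

    // Moves just transfer ownership of the existing Inner.
    Outer(Outer&&) noexcept = default;
    Outer& operator=(Outer&&) noexcept = default;
};

void demo() {
    Outer a{Inner{30}};
    a.inner->v += 1;
    Outer b = a;          // copy constructor: b gets its own Inner
    b.inner->v += 1;
    assert(a.inner->v == 31);
    assert(b.inner->v == 32);
    a = b;                // copy assignment: still a deep copy
    b.inner->v += 1;
    assert(a.inner->v == 32);
    assert(b.inner->v == 33);
}
```

Access becomes a.inner->v rather than a.inner.v, which is the syntax change the question was hoping to avoid.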

Can this technique for creating a container of heterogenous functors be salvaged?

This blog post describes a technique for creating a container of heterogeneous pointers. The basic trick is to create a trivial base class (i.e. no explicit function declarations, no data members, nothing) and a templated derived class for storing std::function<> objects with arbitrary signatures, then make the container hold unique_ptrs to objects of the base class. The code is also available on GitHub.
I don't think this code can be made robust; std::function<> can be created from a lambda, which might include a capture, which might include a by-value copy of a nontrivial object whose destructor must be called. When the Func_t type is deleted by unique_ptr upon removal from the map, only its (trivial) destructor will be called, so the std::function<> objects never get properly deleted.
I've replaced the use-case code from the example on GitHub with a "non-trivial type" that is then captured by value inside a lambda and added to the container. In the code below, the parts copied from the example are noted in comments; everything else is mine. There's probably a simpler demonstration of the problem, but I'm struggling a bit to even get a valid compile out of this thing.
#include <map>
#include <memory>
#include <functional>
#include <typeindex>
#include <iostream>
// COPIED FROM https://plus.google.com/+WisolCh/posts/eDZMGb7PN6T
namespace {
// The base type that is stored in the collection.
struct Func_t {};
// The map that stores the callbacks.
using callbacks_t = std::map<std::type_index, std::unique_ptr<Func_t>>;
callbacks_t callbacks;
// The derived type that represents a callback.
template<typename ...A>
struct Cb_t : public Func_t {
using cb = std::function<void(A...)>;
cb callback;
Cb_t(cb p_callback) : callback(p_callback) {}
};
// Wrapper function to call the callback stored at the given index with the
// passed argument.
template<typename ...A>
void call(std::type_index index, A&& ... args)
{
using func_t = Cb_t<A...>;
using cb_t = std::function<void(A...)>;
const Func_t& base = *callbacks[index];
const cb_t& fun = static_cast<const func_t&>(base).callback;
fun(std::forward<A>(args)...);
}
} // end anonymous namespace
// END COPIED CODE
class NontrivialType
{
public:
NontrivialType(void)
{
std::cout << "NontrivialType{void}" << std::endl;
}
NontrivialType(const NontrivialType&)
{
std::cout << "NontrivialType{const NontrivialType&}" << std::endl;
}
NontrivialType(NontrivialType&&)
{
std::cout << "NontrivialType{NontrivialType&&}" << std::endl;
}
~NontrivialType(void)
{
std::cout << "Calling the destructor for a NontrivialType!" << std::endl;
}
void printSomething(void) const
{
std::cout << "Calling NontrivialType::printSomething()!" << std::endl;
}
};
// COPIED WITH MODIFICATIONS
int main()
{
// Define our functions.
using func1 = Cb_t<>;
NontrivialType nt;
std::unique_ptr<func1> f1 = std::make_unique<func1>(
[nt](void)
{
nt.printSomething();
}
);
// Add to the map.
std::type_index index1(typeid(f1));
callbacks.insert(callbacks_t::value_type(index1, std::move(f1)));
// Call the callbacks.
call(index1);
return 0;
}
This produces the following output (using G++ 5.1 with no optimization):
NontrivialType{void}
NontrivialType{const NontrivialType&}
NontrivialType{NontrivialType&&}
NontrivialType{NontrivialType&&}
NontrivialType{const NontrivialType&}
Calling the destructor for a NontrivialType!
Calling the destructor for a NontrivialType!
Calling the destructor for a NontrivialType!
Calling NontrivialType::printSomething()!
Calling the destructor for a NontrivialType!
I count five constructor calls and four destructor calls. I think that indicates that my analysis is correct--the container cannot properly destroy the instance it owns.
Is this approach salvageable? When I add a virtual =default destructor to Func_t, I see a matching number of ctor/dtor calls:
NontrivialType{void}
NontrivialType{const NontrivialType&}
NontrivialType{NontrivialType&&}
NontrivialType{NontrivialType&&}
NontrivialType{const NontrivialType&}
Calling the destructor for a NontrivialType!
Calling the destructor for a NontrivialType!
Calling the destructor for a NontrivialType!
Calling NontrivialType::printSomething()!
Calling the destructor for a NontrivialType!
Calling the destructor for a NontrivialType!
... so I think this change might be sufficient. Is it?
(Note: the correctness--or lack thereof--of this approach is independent of whether the idea of a container of heterogeneous functions is a good idea. In a few very specific cases, there may be some merit, for instance, when designing an interpreter; e.g., a Python class may be thought of as just such a container of heterogeneous functions plus a container of heterogeneous data types. But in general, my decision to ask this question does not indicate that I think this is likely to be a good idea in very many cases.)
This is basically someone trying to implement type erasure and getting it horribly wrong.
Yes, you need a virtual destructor. The dynamic type of the thing being deleted is obviously not Func_t, so it's plainly UB if the destructor isn't virtual.
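To see the effect of the virtual destructor in isolation, here is a small sketch with hypothetical names (ErasedBase, TypedCallback, LifetimeCounter). It counts live objects instead of printing, so a leak would show up as a non-zero count.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Base with a virtual destructor: deleting through it runs the derived dtor.
struct ErasedBase {
    virtual ~ErasedBase() = default;
};

template <typename... A>
struct TypedCallback : ErasedBase {
    using cb = std::function<void(A...)>;
    cb callback;
    explicit TypedCallback(cb c) : callback(std::move(c)) {}
};

// Tracks how many instances are currently alive.
struct LifetimeCounter {
    static int alive;
    LifetimeCounter() { ++alive; }
    LifetimeCounter(const LifetimeCounter&) { ++alive; }
    ~LifetimeCounter() { --alive; }
};
int LifetimeCounter::alive = 0;

int alive_after_erase() {
    {
        LifetimeCounter probe;
        // The lambda holds a by-value copy of probe inside a std::function.
        std::unique_ptr<ErasedBase> erased =
            std::make_unique<TypedCallback<>>([probe]() { (void)probe; });
        assert(LifetimeCounter::alive >= 2);  // probe plus the captured copy
    }   // erased is deleted through ErasedBase*, then probe is destroyed
    return LifetimeCounter::alive;
}
```

With the virtual destructor in place the function returns 0; without it, the delete through the base pointer would be undefined behavior, so there is nothing meaningful to measure.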
The whole design is completely broken, anyway.
The point of type erasure is that you have a bunch of different types that share a common characteristic (e.g. "can be called with an int and get a double back"), and you want to make them into a single type that has that characteristic (e.g., std::function<double(int)>). By its nature, type erasure is a one-way street: once you have erased the type, you can't get it back without knowing what it is.
What does erasing something down to an empty class mean? Nothing, other than "it's a thing". It's a std::add_pointer_t<std::common_type_t<std::enable_if_t<true>, std::void_t<int>>> (more commonly known as void*), obfuscated in template clothing.
There are plenty of other problems with the design. Because the type was erased into nothingness, it had to recover the original type in order to do anything useful with it. But you can't recover the original type without knowing what it is, so it ends up using the type of arguments passed to call to infer the type of the thing stored in the map. That is ridiculously error-prone, because A..., which represents the types and value categories of the arguments passed to call, is highly unlikely to match exactly the parameter types of std::function's template argument. For instance, if you have a std::function<void(int)> stored in there, and you tried to call it with an int x = 0; call(/* ... */ , x);, it's undefined behavior. Go figure.
To make it worse, any mismatch is hidden behind a static_cast that causes undefined behavior, making it harder to find and fix. There's also the curious design that requires the user to pass a type_index, when you "know" what the type is anyway, but it's just a sideshow when compared to all the other problems with this code.
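The mismatch described above comes from how a forwarding parameter pack is deduced. This sketch uses hypothetical helpers (Signature, deduce, matches_stored_int) to show that an lvalue int argument deduces int&, which names a different specialization than the one stored for int.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Stand-in for Cb_t<A...>: only the template arguments matter here.
template <typename... A>
struct Signature {};

// Same deduction rules as call(std::type_index, A&&... args).
template <typename... A>
Signature<A...> deduce(A&&...) { return {}; }

// True only if the deduced pack matches a callback stored as Signature<int>.
template <typename... A>
bool matches_stored_int(A&&...) {
    return std::is_same<Signature<A...>, Signature<int>>::value;
}

void demo() {
    int x = 0;
    static_assert(std::is_same<decltype(deduce(x)), Signature<int&>>::value,
                  "an lvalue argument deduces int&");
    static_assert(std::is_same<decltype(deduce(0)), Signature<int>>::value,
                  "an rvalue argument deduces int");
    assert(!matches_stored_int(x));            // the static_cast would be wrong
    assert(matches_stored_int(0));
    assert(matches_stored_int(std::move(x)));
}
```

So whether the cast in call() targets the right type depends on the value category of each argument at the call site, which nothing in the interface makes visible.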

C++11 best practice to use rvalue

I am new to C++11. In fact until recently, I programmed only using dynamic allocation, in a way similar to Java, e.g.
void some_function(A *a){
a->changeInternalState();
}
A *a = new A();
some_function(a);
delete a;
// example 2
some_function( new A() ); // suppose there is **no** memory leak.
Now I want to reproduce similar code with C++11, but without pointers.
I need to be able to pass a newly created object of class A directly to the function useA(). There seems to be a problem if I want to do so with a normal non-const reference, but it works if I do it with an rvalue reference.
Here is the code:
#include <stdio.h>
class A{
public:
void print(){
++p; // e.g. change internal state
printf("%d\n", p);
}
int p;
};
// normal reference
void useA(A & x){
x.print();
}
// rvalue reference
void useA(A && x){
useA(x);
}
int main(int argc, char** argv)
{
useA( A{45} ); // <--- newly created class
A b{20};
useA(b);
return 0;
}
It compiles and executes correctly, but I am not sure, if this is the correct acceptable way to do the work?
Are there some best practices for this kind of operations?
Normally you would not design the code so that a temporary object gets modified. Then you would write your useA function as:
void useA(A const & x){
x.print();
}
and declare A::print as const. This binds to both rvalues and lvalues. You can use mutable for class member variables which might change value but without the object logically changing state.
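Put together, that could look like the sketch below. It departs from the question's code in a few labeled ways: print() returns the value instead of calling printf so the behavior can be checked, p is private, and the timesPrinted counter is a hypothetical example of state that may change without the object logically changing.

```cpp
#include <cassert>

class A {
public:
    explicit A(int value) : p(value) {}

    // const: callable on temporaries bound to const A& and on const objects.
    int print() const {
        ++timesPrinted;  // bookkeeping only, so it is marked mutable
        return p;
    }
    int printed() const { return timesPrinted; }

private:
    int p;
    mutable int timesPrinted = 0;
};

// A single overload binds to both lvalues and rvalues.
int useA(const A& x) {
    return x.print();
}

void demo() {
    assert(useA(A{45}) == 45);  // temporary
    A b{20};
    assert(useA(b) == 20);      // lvalue
    assert(b.printed() == 1);
}
```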
Another plan is to keep just A &, but write:
{ A temp{45}; useA(temp); }
If you really do want to modify a temporary object, you can write the pair of lvalue and rvalue overloads as you have done in your question. I believe this is acceptable practice for that case.
The best thing about C++11 move semantics is that most of the time, you get them "for free" without having to explicitly add any &&s or std::move()s in your code. Usually, you only need to use these things explicitly if you're writing code that does manual memory management, such as the implementation of a smart pointer or a container class, where you would have had to write a custom destructor and copy constructor anyway.
In your example, A is just an int. For ints, a move is no different from a copy, because there's no opportunity for optimization even if the int happens to be a disposable temporary. Just provide a single useA() function that takes an ordinary reference. It'll have the same behavior.

Does this code subvert the C++ type system?

I understand that having a const method in C++ means that an object is read-only through that method, but that it may still change otherwise.
However, this code apparently changes an object through a const reference (i.e. through a const method).
Is this code legal in C++?
If so: Is it breaking the const-ness of the type system? Why/why not?
If not: Why not?
Note 1: I have edited the example a bit, so answers might be referring to older examples.
Edit 2: Apparently you don't even need C++11, so I removed that dependency.
#include <iostream>
using namespace std;
struct DoBadThings { int *p; void oops() const { ++*p; } };
struct BreakConst
{
int n;
DoBadThings bad;
BreakConst() { n = 0; bad.p = &n; }
void oops() const { bad.oops(); } // can't change itself... or can it?
};
int main()
{
const BreakConst bc;
cout << bc.n << endl; // 0
bc.oops(); // O:)
cout << bc.n << endl; // 1
return 0;
}
Update:
I have migrated the lambda to the constructor's initialization list, since doing so allows me to subsequently say const BreakConst bc;, which -- because bc itself is now const (instead of merely the pointer) -- would seem to imply (by Stroustrup) that modifying bc in any way after construction should result in undefined behavior, even though the constructor and the caller would have no way of knowing this without seeing each others' definitions.
The oops() method isn't allowed to change the constness of the object. Furthermore it doesn't do it. It's your anonymous function that does it. This anonymous function isn't in the context of the object, but in the context of the main() method which is allowed to modify the object.
Your anonymous function doesn't change the this pointer of oops() (which is defined as const and therefore can't be changed) and also in no way derives some non-const variable from this this-pointer. It doesn't have any this-pointer itself. It just ignores the this-pointer and changes the bc variable of the main context (which is kind of passed as parameter to your closure). This variable is not const and therefore can be changed. You could also pass any anonymous function changing a completely unrelated object. This function doesn't know that it's changing the object that stores it.
If you would declare it as
const BreakConst bc = ...
then the main function also would handle it as const object and couldn't change it.
Edit:
In other words: The const attribute is bound to the concrete l-value (reference) accessing the object. It's not bound to the object itself.
Your code is correct, because you don't use the const reference to modify the object. The lambda function uses a completely different reference, which just happens to be pointing to the same object.
In general, such cases do not subvert the type system, because the type system in C++ does not formally guarantee that you can't modify a const object or something reached through a const reference. However, modifying a const object is undefined behaviour.
From [7.1.6.1] The cv-qualifiers:
A pointer or reference to a cv-qualified type need not actually point
or refer to a cv-qualified object, but it is treated as if it does; a
const-qualified access path cannot be used to modify an object even if
the object referenced is a non-const object and can be modified through
some other access path.
Except that any class member declared mutable (7.1.1) can be modified,
any attempt to modify a const object during its lifetime (3.8) results
in undefined behavior.
I have already seen something similar. Basically you invoke a const function that invokes something else that modifies the object without knowing it.
Consider this as well:
#include <iostream>
using namespace std;
class B;
class A
{
friend class B;
B* pb;
int val;
public:
A(B& b);
void callinc() const;
friend ostream& operator<<(ostream& s, const A& a)
{ return s << "A value is " << a.val; }
};
class B
{
friend class A;
A* pa;
public:
void incval() const { ++pa->val; }
};
inline A::A(B& b) :pb(&b), val() { pb->pa = this; }
inline void A::callinc() const { pb->incval(); }
int main()
{
B b;
const A a(b); // EDIT: WAS `A a(b)`
cout << a << endl;
a.callinc();
cout << a << endl;
}
This is not C++11, but it does the same thing.
The point is that const is not transitive.
callinc() doesn't itself change a, and incval() doesn't change b.
Note that in main you can even declare const A a(b); instead of A a(b); and everything compiles the same.
This has worked for decades, and in your sample you're just doing the same: you simply replaced class B with a lambda.
EDIT
Changed the main() to reflect the comment.
The issue is one of logical const versus bitwise const. The compiler
doesn't know anything about the logical meaning of your program, and
only enforces bitwise const. It's up to you to implement logical const.
This means that in cases like you show, if the pointed to memory is
logically part of the object, you should refrain from modifying it in a
const function, even if the compiler will let you (since it isn't part
of the bitwise image of the object). This may also mean that if part of
the bitwise image of the object isn't part of the logical value of the
object (e.g. an embedded reference count, or cached values), you make it
mutable, or even cast away const, in cases where you modify it without
modifying the logical value of the object.
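As an illustration of the cached-value case, here is a minimal sketch. The Samples class and its members are hypothetical; the cache is part of the bitwise image of the object but not of its logical value, so it is marked mutable.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

class Samples {
public:
    explicit Samples(std::vector<int> values) : m_values(std::move(values)) {}

    // Logically const: the observable value never changes, only the cache.
    int total() const {
        if (!m_cached) {
            m_total = std::accumulate(m_values.begin(), m_values.end(), 0);
            m_cached = true;
            ++m_computations;
        }
        return m_total;
    }
    int computations() const { return m_computations; }

private:
    std::vector<int> m_values;       // logical state
    mutable bool m_cached = false;   // cache bookkeeping, not logical state
    mutable int m_total = 0;
    mutable int m_computations = 0;
};

void demo() {
    const Samples s(std::vector<int>{1, 2, 3});
    assert(s.total() == 6);
    assert(s.total() == 6);          // second call is served from the cache
    assert(s.computations() == 1);
}
```

Because the members are mutable, modifying them through a const object is well defined, unlike the pointer trick in the question.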
The const feature merely helps against accidental misuse. It is not designed to prevent dedicated software hacking. It is the same as private and protected membership, someone could always take the address of the object and increment along the memory to access class internals, there is no way to stop it.
So, yes you can get around const. If nothing else you can simply change the object at the memory level but this does not mean const is broken.

Avoiding need for #define with expression templates

With the following code, "hello2" is not displayed as the temporary string created on Line 3 dies before Line 4 is executed. Using a #define as on Line 1 avoids this issue, but is there a way to avoid this issue without using #define? (C++11 code is okay)
#include <iostream>
#include <string>
class C
{
public:
C(const std::string& p_s) : s(p_s) {}
const std::string& s;
};
int main()
{
#define x1 C(std::string("hello1")) // Line 1
std::cout << x1.s << std::endl; // Line 2
const C& x2 = C(std::string("hello2")); // Line 3
std::cout << x2.s << std::endl; // Line 4
}
Clarification:
Note that I believe Boost uBLAS stores references, this is why I don't want to store a copy. If you suggest that I store by value, please explain why Boost uBLAS is wrong and storing by value will not affect performance.
Expression templates that do store by reference typically do so for performance, but with the caveat that they only be used as temporaries.
Taken from the documentation of Boost.Proto (which can be used to create expression templates):
Note An astute reader will notice that the object y defined above will be left holding a dangling reference to a temporary int. In the sorts of high-performance applications Proto addresses, it is typical to build and evaluate an expression tree before any temporary objects go out of scope, so this dangling reference situation often doesn't arise, but it is certainly something to be aware of. Proto provides utilities for deep-copying expression trees so they can be passed around as value types without concern for dangling references.
In your initial example this means that you should do:
std::cout << C(std::string("hello2")).s << std::endl;
That way the C temporary never outlives the std::string temporary. Alternatively, you could make s a non-reference member, as others pointed out.
Since you mention C++11, in the future I expect expression trees to store by value, using move semantics to avoid expensive copying and wrappers like std::reference_wrapper to still give the option of storing by reference. This would play nicely with auto.
A possible C++11 version of your code:
class C
{
public:
explicit
C(std::string const& s_): s { s_ } {}
explicit
C(std::string&& s_): s { std::move(s_) } {}
std::string const&
get() const& // notice lvalue *this
{ return s; }
std::string
get() && // notice rvalue *this
{ return std::move(s); }
private:
std::string s; // not const to enable moving
};
This would mean that code like C("hello").get() would only allocate memory once, but still play nice with
std::string clvalue("hello");
auto c = C(clvalue);
std::cout << c.get() << '\n'; // no problem here
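The idea of storing by value while leaving reference storage as an explicit opt-in could look roughly like the sketch below. Node and demo_node are hypothetical names, not part of any library; only std::reference_wrapper and std::cref are standard.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical generic node: always stores a T by value.
template <class T>
struct Node {
    explicit Node(T v) : value(std::move(v)) {}
    T value;
};

std::string demo_node() {
    // Default: the node owns its string, so a temporary argument is safe.
    Node<std::string> owned{std::string("hello")};

    // Opt-in: the caller explicitly asks for reference semantics and is
    // therefore responsible for keeping `name` alive.
    std::string name = "a";
    Node<std::reference_wrapper<const std::string>> borrowed{std::cref(name)};
    name += "b";  // the change is visible through the wrapper

    return owned.value + ":" + borrowed.value.get();  // "hello:ab"
}
```

With this shape, a dangling reference can only arise where the caller wrote std::cref (or std::ref), which makes the risky spots easy to find.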
but is there a way to avoid this issue without using #define?
Yes.
Define your class as: (don't store the reference)
class C
{
public:
C(const std::string & p_s) : s(p_s) {}
const std::string s; //store the copy!
};
Store the copy!
Demo : http://www.ideone.com/GpSa2
The problem with your code is that std::string("hello2") creates a temporary that lives only until the end of the full expression on Line 3. After that the temporary is destroyed, but x2.s still refers to it (the dead object).
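The order of events can be made visible with a small sketch. Noisy, Holder, trace and run are hypothetical names used for illustration; the code records when the inner temporary is destroyed and deliberately never reads the dangling reference.

```cpp
#include <string>

std::string trace;  // records the order of events (illustrative only)

struct Noisy {
    ~Noisy() { trace += "~Noisy "; }
};

struct Holder {
    explicit Holder(const Noisy& n) : ref(n) {}
    const Noisy& ref;  // same pattern as C::s in the question
};

std::string run() {
    trace.clear();
    // Binding to a const reference extends the lifetime of the Holder
    // temporary, but not of the Noisy temporary it refers to.
    const Holder& h = Holder(Noisy());
    trace += "next-statement ";
    (void)h;  // h.ref is already dangling here, so it is not read
    return trace;  // "~Noisy next-statement "
}
```

The destructor entry appears before "next-statement", which matches what happens to the std::string temporary between Line 3 and Line 4.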
After your edit:
Storing by reference is dangerous and error-prone. You should do it only when you are 100% sure that the referenced variable will outlive the object that holds the reference.
Some older std::string implementations are copy-on-write: until you change a string value, all copies refer to the same buffer. To test yours, you can overload operator new (size_t) and put a debug statement in it. On a copy-on-write implementation, multiple copies of the same string allocate memory only once. Be aware that the C++11 rules effectively rule out copy-on-write strings, so on a conforming C++11 library each copy of a long string allocates.
Your class definition should not store by reference, but by value, as:
class C {
const std::string s; // s is NOT a reference now
};
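If this question is meant in a general sense (not specific to strings), then the best way is to use dynamic allocation.
class C {
public:
C() : p (new MyClass()) {} // just an example, can be allocated from outside also
~C() { delete p; }
C(const C&) = delete; // owning raw pointer: forbid copies to avoid a double delete
C& operator=(const C&) = delete;
private:
MyClass *p;
};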
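When the object is allocated from outside and several holders may need it, a std::shared_ptr member is one way to get the same effect without manual delete. This is a sketch under that assumption; SharedHolder and make_holder are hypothetical names, and MyClass is a stand-in for the class in the answer.

```cpp
#include <memory>
#include <string>
#include <utility>

struct MyClass {  // stand-in for the answer's MyClass
    std::string label;
};

class SharedHolder {  // hypothetical name
public:
    explicit SharedHolder(std::shared_ptr<const MyClass> p) : p_(std::move(p)) {}
    const std::string& label() const { return p_->label; }
private:
    std::shared_ptr<const MyClass> p_;  // keeps the object alive
};

SharedHolder make_holder() {
    auto obj = std::make_shared<MyClass>();
    obj->label = "still alive";
    return SharedHolder(obj);  // obj goes out of scope, the MyClass does not
}
```

The holder cannot dangle, at the cost of one heap allocation and reference counting.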
Without looking at BLAS, expression templates typically make heavy use of temporary objects of types you aren't even supposed to know exist. If Boost is storing references like this within theirs, then they would suffer the same problem you see here. But as long as those temporary objects remain temporary, and the user doesn't store them for later, everything is fine, because the temporaries they reference remain alive for as long as the temporary objects do. The trick is that you perform a deep copy when the intermediate object is turned into the final object that the user stores. You've skipped this last step here.
In short, it's a dangerous move, which is perfectly safe as long as the user of your library doesn't do anything foolish. I wouldn't recommend making use of it unless you have a clear need and you're well aware of the consequences. And even then, there might be a better alternative; I've never worked with expression templates in any serious capacity.
As an aside, since you tagged this C++0x, auto x = a + b; seems like it would be one of those "foolish" things users of your code can do to make your optimization dangerous.
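The "references in the intermediate nodes, deep copy into the final object" pattern described in this answer can be sketched as follows. This is not how uBLAS or Proto is implemented; Sum, Vec and demo_sum are hypothetical names chosen for the illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical lazy node: holds references to its operands, so it is only
// valid while those operands are alive.
template <class L, class R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec {
    std::vector<double> data;

    explicit Vec(std::vector<double> d) : data(std::move(d)) {}

    // Converting from an expression node evaluates it element by element.
    // This is the deep copy into the object the user keeps.
    template <class L, class R>
    Vec(const Sum<L, R>& e) : data(e.size()) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
    }

    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) { return {a, b}; }

template <class L, class R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) { return {a, b}; }

double demo_sum() {
    Vec a(std::vector<double>{1.0, 2.0});
    Vec b(std::vector<double>{10.0, 20.0});
    Vec c(std::vector<double>{100.0, 200.0});
    Vec r = a + b + c;   // all nodes are evaluated before the statement ends
    return r[0] + r[1];  // 111 + 222
}
```

Writing auto x = a + b + c; instead would keep the outer Sum node, whose lhs refers to the inner a + b temporary that is destroyed at the end of the statement, which is exactly the hazard the aside points out.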