Scope of operator-> - c++

When overloading operator ->, one would eventually have to return a pointer. If this pointer points to an object created inside the function like:
struct Something {
char c;
Something(char k){
c = k;
}
};
struct Hello {
int a;
Something* operator->(){
Something s(a);
return &s;
}
};
Would then dereferencing this result in undefined behavior (as Can a local variable's memory be accessed outside its scope?):
Hello h {5};
cout << h->c;
If so, can this be resolved, still using operator->?

Would then dereferencing this result in undefined behavior?
Yes.
If so, can this be resolved, still using operator->?
The following applies to any other member function.
Create Something instance as a member of Hello. Then the pointer to it remains valid for the duration of Hello's lifetime. You can construct Something from the integer you are storing immediately, so you can get rid of it.
If Something takes additional resources (memory, handles, etc.), you probably want to construct it at the time of operator-> call. You can choose std::unique_ptr (dynamic) or std::optional (static) for this. Or, if Something supports an uninitialized state (e.g. default-constructed), you can initialize (e.g. move-assign) to it later.
If you cannot alter Hello class then your only option is to overload operator-> to return by Something by value directly.
But, I must say, the circumstances of this problem are starting to get weird and the way of overloading operators even more so.

Related

Why program didn't crash when lifetime of temporary has ended? [duplicate]

A question came up here on SO asking "Why is this working" when a pointer became dangling. The answers were that it's UB, which means it may work or not.
I learned in a tutorial that:
#include <iostream>
struct Foo
{
int member;
void function() { std::cout << "hello";}
};
int main()
{
Foo* fooObj = nullptr;
fooObj->member = 5; // This will cause a read access violation but...
fooObj->function(); // Because this doesn't refer to any memory specific to
// the Foo object, and doesn't touch any of its members
// It will work.
}
Would this be the equivalent of:
static void function(Foo* fooObj) // Foo* essentially being the "this" pointer
{
std::cout << "Hello";
// Foo pointer, even though dangling or null, isn't touched. And so should
// run fine.
}
Am I wrong about this? Is it UB even though as I explained just calling a function and not accessing the invalid Foo pointer?
You're reasoning about what happens in practice. Undefined behavior is allowed to do the thing you expect... but it is not guaranteed.
For the non-static case, this is straightforward to prove using the rule found in [class.mfct.non-static]:
If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
Note that there's no consideration about whether the non-static member function accesses *this. The object is simply required to have the correct dynamic type, and *(Foo*)nullptr certainly does not.
In particular, even on platforms which use the implementation you describe, the call
fooObj->func();
gets converted to
__assume(fooObj); Foo_func(fooObj);
and is optimization-unstable.
Here's an example which will work contrary to your expectations:
int main()
{
Foo* fooObj = nullptr;
fooObj->func();
if (fooObj) {
fooObj->member = 5; // This will cause a read access violation!
}
}
On real systems, this is likely to end up with an access violation on the commented line, because the compiler used the fact that fooObj can't be null in fooObj->func() to eliminate the if test following it.
Don't do things that are UB even if you think you know what your platform does. Optimization instability is real.
Also, the Standard is even more restrictive that you might think. This will also cause UB:
struct Foo
{
int member;
void func() { std::cout << "hello";}
static void s_func() { std::cout << "greetings";}
};
int main()
{
Foo* fooObj = nullptr;
fooObj->s_func(); // well-formed call to static member,
// but unlike Foo::s_func(), it requires *fooObj to be a valid object of type Foo
}
The relevant portions of the Standard are found in [expr.ref]:
The expression E1->E2 is converted to the equivalent form (*(E1)).E2
and the accompanying footnote
If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
This means that the code in question definitely evaluates (*fooObj), attempting to create a reference to a non-existent object. There have been several proposals to make this allowed and only forbid allowing lvalue->rvalue conversion on such a reference, but those have been rejected this far; even forming the reference is illegal in all versions of the Standard to date.
In practice this is usually how major compilers implement member functions, yes. This means that your test program would probably appear to run "just fine".
Having said that, dereferencing a pointer pointing to nullptr is undefined behavior which means that all bets are off and the whole program and it's output is meaningless, anything could happen.
You can never rely on this behavior, optimizers in particular could mess all of this code up because they're allowed to assume that fooObj is never nullptr.
Compiler isn't obliged by standard to implement member function by passing it a pointer to the class instance. Yes, there is pseudo-pointer "this", but it is unrelated element, guaranteed to be "understood".
nullptr pointer doesn't point on any existing object, and -> () calls a member of that object. From standard's view, this is nonsense and result of such operation is undefined (and potentially, catastrophic).
If function() would be virtual, then call is allowed to fail, because address of function would be unavailable (vtable might be implemented as part of object and doesn't exist if object doesn't).
if the member function (method) behaves like that and meant to be called like that it should be a static member function (method). Static method doesn't access non-static fields and doesn't call non-static methods of class. If it is static, the call could look like this as well:
Foo::function();

Is this undefined behaviour in C++ calling a function from a dangling pointer

A question came up here on SO asking "Why is this working" when a pointer became dangling. The answers were that it's UB, which means it may work or not.
I learned in a tutorial that:
#include <iostream>
struct Foo
{
int member;
void function() { std::cout << "hello";}
};
int main()
{
Foo* fooObj = nullptr;
fooObj->member = 5; // This will cause a read access violation but...
fooObj->function(); // Because this doesn't refer to any memory specific to
// the Foo object, and doesn't touch any of its members
// It will work.
}
Would this be the equivalent of:
static void function(Foo* fooObj) // Foo* essentially being the "this" pointer
{
std::cout << "Hello";
// Foo pointer, even though dangling or null, isn't touched. And so should
// run fine.
}
Am I wrong about this? Is it UB even though as I explained just calling a function and not accessing the invalid Foo pointer?
You're reasoning about what happens in practice. Undefined behavior is allowed to do the thing you expect... but it is not guaranteed.
For the non-static case, this is straightforward to prove using the rule found in [class.mfct.non-static]:
If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
Note that there's no consideration about whether the non-static member function accesses *this. The object is simply required to have the correct dynamic type, and *(Foo*)nullptr certainly does not.
In particular, even on platforms which use the implementation you describe, the call
fooObj->func();
gets converted to
__assume(fooObj); Foo_func(fooObj);
and is optimization-unstable.
Here's an example which will work contrary to your expectations:
int main()
{
Foo* fooObj = nullptr;
fooObj->func();
if (fooObj) {
fooObj->member = 5; // This will cause a read access violation!
}
}
On real systems, this is likely to end up with an access violation on the commented line, because the compiler used the fact that fooObj can't be null in fooObj->func() to eliminate the if test following it.
Don't do things that are UB even if you think you know what your platform does. Optimization instability is real.
Also, the Standard is even more restrictive that you might think. This will also cause UB:
struct Foo
{
int member;
void func() { std::cout << "hello";}
static void s_func() { std::cout << "greetings";}
};
int main()
{
Foo* fooObj = nullptr;
fooObj->s_func(); // well-formed call to static member,
// but unlike Foo::s_func(), it requires *fooObj to be a valid object of type Foo
}
The relevant portions of the Standard are found in [expr.ref]:
The expression E1->E2 is converted to the equivalent form (*(E1)).E2
and the accompanying footnote
If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
This means that the code in question definitely evaluates (*fooObj), attempting to create a reference to a non-existent object. There have been several proposals to make this allowed and only forbid allowing lvalue->rvalue conversion on such a reference, but those have been rejected this far; even forming the reference is illegal in all versions of the Standard to date.
In practice this is usually how major compilers implement member functions, yes. This means that your test program would probably appear to run "just fine".
Having said that, dereferencing a pointer pointing to nullptr is undefined behavior which means that all bets are off and the whole program and it's output is meaningless, anything could happen.
You can never rely on this behavior, optimizers in particular could mess all of this code up because they're allowed to assume that fooObj is never nullptr.
Compiler isn't obliged by standard to implement member function by passing it a pointer to the class instance. Yes, there is pseudo-pointer "this", but it is unrelated element, guaranteed to be "understood".
nullptr pointer doesn't point on any existing object, and -> () calls a member of that object. From standard's view, this is nonsense and result of such operation is undefined (and potentially, catastrophic).
If function() would be virtual, then call is allowed to fail, because address of function would be unavailable (vtable might be implemented as part of object and doesn't exist if object doesn't).
if the member function (method) behaves like that and meant to be called like that it should be a static member function (method). Static method doesn't access non-static fields and doesn't call non-static methods of class. If it is static, the call could look like this as well:
Foo::function();

Modifying self within const method via non const pointer to self

In following example const object can modify itself via const method, because in that method it acces itself via non-const pointer. (same program on ideone)
#include <iostream>
struct Object;
Object * g_pObject;
struct Object
{
Object():m_a(0){}
void ModifySelfViaConstMethod() const
{
g_pObject->m_a = 37;
}
int m_a;
};
int main()
{
Object o;
g_pObject = &o;
const Object & co = o;
std::cout << co.m_a << "\n";
co.ModifySelfViaConstMethod();
std::cout << co.m_a << "\n";
return 0;
}
I am not so good with reading c++ standard, so I ask here:
What does standard says about this?
a)const method doesn't guarantee you that your object stays unmodified when you do stuff like this
b) Is it well defined that and it must compile
c) other ?
When you declare a const function, it's "for your own good".
In other words, you declare it const because according to your initial design, it is not supposed to change any object with which it will be invoked during runtime.
If during some later point in the implementation of this function you end up changing the object, the compiler will "yell at you", telling you it's wrong.
Of course, the compiler will be able to identify such attempt, only when applied on this.
In the given example, the compiler cannot identify the problem because it requires a comparison between this and g_pObject, and such comparison can only take place during runtime.
When a method is declared as const, the compiler ensures that the instance pointed to by the this pointer is not modified. If you try to modify the this instance, the compiler to fail. However, the compiler has no way of knowing that g_pObject and this are actually pointing at the same instance. That requires a run-time comparison, and no compiler is going to waste time performing run-time comparisons of every pointer used inside of a const method in the off-chance that they might match the this pointer. So if you are going to modify an Object via an external pointer, you are going to have to do your own check, eg:
void ModifySelfViaConstMethod() const
{
if (g_pObject != this)
g_pObject->m_a = 37;
}
What does standard says about this?
It says (paraphrasing) that this has type const Object *, so that you can't directly modify members or call non-const member functions via this. It says nothing about what you can do with any global variables that the function might have access to; it only controls direct access to the object the function is called on.
const method doesn't guarantee you that your object stays unmodified when you do stuff like this
No it doesn't. It states the intent that the function won't modify the object, and provides some protection against accidentally breaking that intent. It doesn't prevent a suitably deranged programmer from using const_cast, or (as here) uncontrolled coupling via global variables to break the promise.
Is it well defined that and it must compile
Yes. o is not itself constant, so there's nothing to stop you taking a non-const pointer or reference to it. The const on the member function only restricts access to the object via this, not to arbitrary objects via other pointers.
Constancy in C++ is a safety instrument, not security.
The code where constancy is honored will most probably work as expected, and all unintentional attempts to cast constancy away will be alerted by the compiler.
In cases when "I know what I am doing" one can find the whole variety of tools, from const_cast operator and mutable keyword to the banal C style cast.

Can a private variable be accessed through its address?

Would it be possible for a public function to return a pointer to a private variable in the class. If so / if not, what would happen? would it crash or is there anything highly unsafe about this? Can the pointed data be read or written to?
Thanks
Yes, a member function may return a pointer (or reference) to a private data member. There is nothing wrong with this except that in most circumstances it breaks encapsulation.
The data member can certainly be read via the returned pointer or reference. Whether or not it can be written to depends on whether the returned pointer or reference is to a const-qualified object (i.e., if you return a const T*, you won't be able to modify the pointed-to T). For example:
class Example
{
public:
int* get() { return &i; }
const int* get_const() const { return &i; }
private:
int i;
};
int main()
{
Example e;
int* p = e.get();
int a = *p; // yes, we can read the data via the pointer
*p = 42; // yes, we can modify the data via the pointer
const int* cp = e.get_const();
int b = *cp; // yes, we can read the data via the pointer
*cp = 42; // error: pointer is to a const int
}
Yes it can be returned and it can be read and written to. It is no more or less dangerous than taking the address of any other variable. Public/private/protected are syntactical constructs that are checked at compile time, and they aren't "contagious" or part of the type of something.
A private member is no different from a public member, or any other type of instance. So, yes, you can take pointers to them, and return them.
When taking pointers to any instance-based member, you have to be careful that the parent class is not deleted and doesn't go out of scope, unless you take that pointer and make a true copy of the data/object it points to. If it is deleted or goes out of scope, the pointer becomes a dangling pointer, and you can't use it anymore without your app exploding (or working on non-existent objects, and thus making your program do crazy unexpected things, but not crashing).
Design considerations:
Exposing any of your internal implementation details is potentially a violation of encapsulation. However, if you only want to encapsulate HOW the object you're returning is created/retrieved, then this is a reasonable solution. This would allow you to change the class to get the member object some other way (like querying a file, or an internal dictionary) without breaking code that calls these methods.
"would it crash or is there anything highly unsafe about this? Can the pointed data be read or written to?"
private/protected/public have no effect whatsoever at runtime, so they can't influence the program execution or even crash the program. They are checked at compile time and just cause your compiler to throw an error while compiling if there is a violation.
The const qualifier won't protect you on those cases. Considering James's response, please try to cast the const int * to int *
int* cp = (int*)e.get_const();
You will still be able to modify.

C++ reference type recommended usage

I am programming in C++ more then 5 years, and have never met any place where reference of the variable is recommended to use except as a function argument (if you don't want to copy what you pass as your function argument). So could someone point cases where C++ variable reference is recommended (I mean it gives any advantage) to use.
As a return value of an opaque collection accessor/mutator
The operator[] of std::map returns a reference.
To shorten the text needed to reference a variable
If you miss old-school with Foo do ... statement (that's Pascal syntax), you can write
MyString &name = a->very->long_->accessor->to->member;
if (name.upcase() == "JOHN") {
name += " Smith";
}
another example of this can be found in Mike Dunlavey's answer
To state that something is just a reference
References are also useful in wrapper objects and functors--i.e. in intermediate objects that logically contact no members but only references to them.
Example:
class User_Filter{
std::list<User> const& stop_list;
public: Functor (std::list<User> const& lst)
: stop_list(lst) { }
public: bool operator()(User const& u) const
{ return stop_list.exists(u); }
};
find_if(x.begin(),x.end(),User_Filter(user_list));
The idea here that it's a compile error if you don't initialize a reference in constructor of such an object. The more checks in compile time--the better programs are.
Here's a case where it's handy:
MyClass myArray[N];
for (int i = 0; i < N; i++){
MyClass& a = myArray[i];
// in code here, use a instead of myArray[i], i.e.
a.Member = Value;
}
Use references wherever you want, pointers when you are forced to.
References and pointers share part of their semantics: they are an alias to an element that is not present. The main difference is with memory managements: references express clearly that you are not responsible for the resource. On the other hand, with pointers it is never really clear (unless you mean smart pointers): are you assumed to delete the pointer or will it be deleted externally?
You must use pointers when you must manage memory, want to allow for optional semantics or need to change the element referred to at a later time.
In the rest of cases, where you can use a reference or a pointer, references are clearer and should be preferred.
Now, as you point out, they are really not needed: you can always use pointers for all the reference uses (even parameter passing), but the fact that you can use a single tool for everything does not mean there are no better suited tools for the job.
I tend to use reference members instead of pointers for externally controlled non-optional construction parameters.
EDIT (added example):
Let's say that you have a database and a DAO class having the database as a dependency:
struct Database {};
struct PersonDao {
const Database &m_d;
PersonDao(const Database &d): m_d(d) {}
};
Furthermore, the scope of the database is controlled externally from the DAO:
int main() {
Database d;
PersonDao pd(d);
}
In this case it makes sense to use a reference type, since you don't ever want DAO::m_d to be null, and its lifetime is controlled externally (from the main function in this case).
I use references in function arguments not just to avoid copies but also instead of pointers to avoid having to deal with NULL pointers where appropriate. Pointers model a "maybe there's a value, but maybe not (NULL)", references are a clear statement that a value is required.
... and to make it absolutely clear (-> comments). I tend to avoid pointers to model "maybe there are several values" - a vector is a better option here. Pointers to several values often end up in C-style programming because you usually have to pass the # of elements as well separately.
Use a const reference to give a name to a value, e.g.:
const Vec3 &ba=b-a;
This names the value, but doesn't necessarily create a variable for it. In theory, this gives the compiler more leeway and may allow it to avoid some copy constructor calls.
(Related non-duplicated Stack Overflow question at Const reference to temporary. The Herb Sutter link there has more information about this.)
The argument to the copy-constructor MUST be passed as a reference, since otherwise the copy constructor would need to call it self in an endless recursion (stack overflow).
I tend to agree, but perhaps const return values.
Well you kind of have two choices for aliasing other values(ignoring shared_ptrs and the like): pointers and references.
References must be initialized at construction to refer to something else. So semantically a reference can never be NULL. In reality, though, the underlying data can go away, giving you problems often more difficult to debug than if a pointer went away. So I'm not sure there's a real advantage here unless you were disciplined and consistent with how they were used vis-a-vis referring to items that were dynamically allocated. If you did this with pointers too, you'd avoid the same problems.
Perhaps more importantly, references can be used without thinking about all the issues that arise with pointers. This is probably the main advantage. Semantically a reference is the thing. If you guarantee as the caller/callee that the underlying memory doesn't go away, you don't have to confuse the user with any of the questions that come along with pointers (Do I need to free this? Could this be NULL? etc) and can safely use a reference for convenience.
An example of this might be a function that looks up the corresponding string for an enum,
const std::string& ConvertToString( someEnum val)
{
static std::vector< std::string > lookupTable;
if (lookupTable.empty())
{
// fill in lookup table
}
// ignoring the cast that would need to happen
return lookupTable[val]
}
Here the contract between the caller and the callee guarantees that the return type will always be there. You can safely return a reference, and avoid some of the questions that pointers invite.
References make code prettier. So use them whenever it takes a reference to beautify your code.
i would like to enlist some cases:
1) while writing singleton classes
class singleton
{
singleton();
explicit singleton(const singleton&);
singleton& operator=(const singleton&);
public:
static singleton& instance()
{
static singleton inst;
return inst;
}
};// this is called the 'Meyers' singleton pattern. refer to More Effective C++ by Scott Meyers
it has all the benefits, but avoids using the new operator
**2)**here is no such thing as a null reference. A reference must always refer to some object. As a result, if you have a variable whose purpose is to refer to another object, but it is possible that there might not be an object to refer to, you should make the variable a pointer, because then you can set it to null. On the other hand, if the variable must always refer to an object, i.e., if your design does not allow for the possibility that the variable is null, you should probably make the variable a reference
**3)**Because a reference must refer to an object, C++ requires that references be initialized:
string& rs; // error! References must
// be initialized
string s("xyzzy");
string& rs = s; // okay, rs refers to s
Pointers are subject to no such restriction
The fact that there is no such thing as a null reference implies that it can be more efficient to use references than to use pointers. That's because there's no need to test the validity of a reference before using it
**4)**Another important difference between pointers and references is that pointers may be reassigned to refer to different objects. A reference, however, always refers to the object with which it is initialized: ยค Item M1, P10
string s1("Nancy");
string s2("Clancy");
string& rs = s1; // rs refers to s1
string *ps = &s1; // ps points to s1
rs = s2; // rs still refers to s1,
// but s1's value is now
// "Clancy"
ps = &s2; // ps now points to s2;
// s1 is unchanged
Stream operators are an obvious example
std::ostream & operator<< (std::ostream &, MyClass const &...) {
....
}
mystream << myClassVariable;
You obviously don't want a pointer as checking for NULL makes using an operator very tedious i.s.o. convenient
I've used a reference to an ostream instead of a pointer. I supppose that I prefer references to pointers when the class has a lot of operators.