Are C++ references guaranteed to use pointers "internally"? - c++

Looking at references in C++ I noticed that all implementations I looked at used a pointer internally.
Does the C++ Standard guarantee that a reference will use a pointer internally or would it be ok for an implementation to use a more "efficient" solution? (I would currently not see how it could be done "better" because when a new stack frame is created there's not really a bulletproof way to know easily at what offset from the stack base pointer the variable that is being referenced is at because the stack is quite dynamic)
Note: I do understand the difference between a pointer and a reference in C++ (This question has nothing to do with that)

If you mean that a reference requires the compiler to allocate storage for a pointer, then that's unspecified.
§ 8.3.2/4
It is unspecified whether or not a reference requires storage.
EDIT: To record Martin Bonner's comment as a useful, practical note,
[F]or debugging purposes it can be quite useful to know what is going on "under the hood". (E.g. to answer questions like "why hasn't this gone completely off the rails?"). In practise, compilers all implement references as pointers (unless they can optimize the reference completely away).

No, it does not make any guarantees about how references are implemented. The C++ language only defines the semantics of references, not their implementation.

The standard doesn't say how a reference is implemented, just how it works.
It also doesn't say anything about stack frames, that's another implementation detail.
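To make the "semantics, not implementation" point concrete, here is a minimal sketch. The names Holder, holder_size and same_object are made up for illustration. It shows what the standard does pin down (a reference names the same object as its target) next to what it leaves open (how much room a reference member takes up inside a struct).

```cpp
#include <cstddef>

// Hypothetical struct used only for illustration.
struct Holder {
    int& ref;   // whether this member needs storage is unspecified by the standard
};

// Implementation-specific: often the size of a pointer, but nothing guarantees it.
inline std::size_t holder_size() { return sizeof(Holder); }

// Guaranteed semantics: taking the address of a reference yields the
// address of the object it refers to.
inline bool same_object(int& r, int& target) { return &r == &target; }
```

On the compilers mentioned above, holder_size() will usually match sizeof(int*), but that is an observation about those implementations rather than something the standard promises.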

Related

C++ standard conforming method to assign address of program memory to pointer

How can I assign the address of a location in process memory to a pointer object in C++ in a standard-conforming way?
For example, this is undefined behavior, because it is not defined in the C++ Standard:
CInterpretator* pInterpretatorObj= reinterpret_cast<CInterpretator*>(0x1000FFFF);
or this, without reinterpret_cast, but with the same effect:
CInterpretator* pInterpretatorObj= static_cast<CInterpretator*>(static_cast<void*>(0x1000FFFF));
Would it be standard-conforming to use an .asm file with a public function that returns the address of the object, and to call that "foreign" function from the C++ code to get the address?
To me that looks very ugly, though. Maybe there are better methods for this.
There is no standards-compliant way of doing this. Standard C++ does not have a notion of memory layout, nor of particular integers being meaningful when cast to pointers (other than those which came from casting pointers to integers).
The good news is, “undefined behavior” is undefined by the standard. Implementations are free to offer guarantees that certain types of otherwise-UB code will do something meaningful. So if you want guaranteed correctness, rather than “just happened to work”, you won’t get that from the Standard but you may get it from your compiler documentation.
For literally all C++ compilers I know of, using reinterpret_cast as you’ve done here will do what you expect it to do.
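For contrast, the one case mentioned above that does have meaning is an integer that itself came from a pointer. A minimal sketch, assuming the platform provides std::uintptr_t (it is an optional type, though mainstream implementations have it); round_trip is a made-up helper name:

```cpp
#include <cstdint>

// Converts a pointer to an integer and back again. The result compares
// equal to the original pointer, unlike an integer literal picked by hand.
inline int* round_trip(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits);
}
```

A hard-coded value such as 0x1000FFFF gets no such promise from the standard, so whether it points at anything usable is down to the compiler and platform documentation.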

Why hasn't not_null made it into the C++ standard yet?

After adding the comment "// not null" to a raw pointer for the Nth time I wondered once again whatever happened to the not_null template.
The C++ core guidelines were created quite some time ago now and a few things have made it into the standard, including for example std::span (some, like string_view and std::array, originated before the core guidelines themselves but are sometimes conflated). Given its relative simplicity, why hasn't not_null (or anything similar) made it into the standard yet?
I scan the ISO mailings regularly (but perhaps not thoroughly) and I am not even aware of a proposal for it.
Possibly answering my own question. I do not recall coming across any cases where it would have prevented a bug in code I've worked on as we try not to write code that way.
The guidelines themselves are quite popular, making it into clang-tidy and sonar for example. The support libraries seem a little less popular.
For example, boost has been available as a package on Linux from near the start. I am not aware of any implementation of GSL that is, though I presume it is bundled with Visual C++ on Windows.
Since people have asked in the comments.
Myself I would use it to document intent.
A construct like not_null<> potentially has semantic value which a comment does not.
Enforcing it is secondary though I can see its place. This would preferably be done with zero overhead (perhaps at compile time only for a limited number of cases).
I was thinking mainly about the case of a raw pointer member variable. I had forgotten about the case of passing a pointer to a function for which I always use a reference to mean not-null and also to mean "I am not taking ownership".
Likewise (for class members) we could also document ownership owned<> not_owned<>.
I suppose there is also whether the associated object is allowed to be changed. This might be too high level though. You could use reference members instead of pointers to document this. I avoid reference members myself because I almost always want copyable and assignable types. However, see for example Should I prefer pointers or references in member data? for some discussion of this.
Another dimension is whether another entity can modify a variable.
"const" says I promise not to modify it. In multi-threaded code we would like to say almost the opposite. That is "other code promises not to modify it while we are using it" (without an explicit lock) but this is way off topic...
There is one big technical issue that is likely unsolvable which makes standardizing not_null a problem: it cannot work with move-only smart pointers.
The most important use case for not_null is with smart pointers (for raw pointers a reference usually is adequate, but even then, there are times when a reference won't work). not_null<shared_ptr<T>> is a useful thing that says something important about the API that consumes such an object.
But not_null<unique_ptr<T>> doesn't work. It cannot work. The reason being that moving from a unique pointer leaves the old object null. Which is exactly what not_null is expected to prevent. Therefore, not_null<T> always forces a copy on its contained T. Which... you can't do with a unique_ptr, because that's the whole point of the type.
Being able to say that the unique_ptr consumed by an API is not null is good and useful. But you can't actually do that with not_null, which puts a hole in its utility.
So long as move-only smart pointers can't work with not_null, standardizing the class becomes problematic.
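The conflict can be sketched in a few lines. This is not the GSL implementation; not_null_sketch, rejects_null and moved_from_unique_ptr_is_null are hypothetical names, and the wrapper only illustrates the invariant being discussed (checked once at construction, then never allowed to be broken).

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper: verifies non-null once, then only exposes the
// pointer by const reference so callers can copy it but never move it out.
template <class Ptr>
class not_null_sketch {
public:
    explicit not_null_sketch(Ptr p) : ptr_(std::move(p)) {
        if (ptr_ == nullptr) throw std::invalid_argument("null pointer");
    }
    const Ptr& get() const { return ptr_; }
private:
    Ptr ptr_;
};

// A null shared_ptr is rejected at construction.
inline bool rejects_null() {
    try {
        not_null_sketch<std::shared_ptr<int>> bad{std::shared_ptr<int>()};
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

// Moving from a unique_ptr leaves the source null, which is exactly the
// state the wrapper is supposed to rule out.
inline bool moved_from_unique_ptr_is_null() {
    auto a = std::make_unique<int>(1);
    auto b = std::move(a);
    return a == nullptr && b != nullptr;
}
```

With shared_ptr, get() lets a consumer take a copy and the wrapper stays valid. With unique_ptr, the only way to hand the pointer on is to move it out, and the second helper shows what that does to the source.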

Alternate Reference Implementation C++

As far as I can tell, in c++ references are implemented as constant pointers.
Say you have y, which is a reference to variable x.
Why would it not be more performant and efficient (especially when passing variables to functions) to do either of the following?
A. Every time y is mentioned, it gets replaced by x in a pre-processing stage.
B. Both x and y refer to the same memory address in the compiler's symbol table.
As far as I can tell, in c++ references are implemented as constant pointers.
Might be correct, might not be. I actually don't know for sure. In any case, that is an implementation detail. What matters is how the C++ standard specifies references, and it does not say that they must be implemented as constant pointers.
why would it not be more performant and efficient...
When it is possible and more performant the compiler will do that. The so-called as-if rule allows the compiler to perform any optimization that does not change observable behavior of the program in accordance with the C++ standard. The standard does not specify how references are implemented in detail. It is up to the compiler to implement them in the most efficient way.
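A small sketch of the two situations, with made-up function names. In the first, the compiler can see both x and y, so under the as-if rule it is free to treat y as just another name for x. In the second, the function is compiled without knowing which object it will be handed, so (unless the call is inlined) an address typically has to be passed at run time.

```cpp
// Local alias: y and x are visible together, so no pointer is needed
// for the compiler to know what y refers to.
inline int bump_through_alias() {
    int x = 41;
    int& y = x;
    ++y;            // modifies x
    return x;
}

// Reference parameter: the referred-to object differs from call to call,
// so a symbol-table substitution is not possible in general.
inline void bump(int& value) { ++value; }
```

Options A and B from the question roughly describe what an optimizer can do for the first case; the second case is where implementations usually fall back to passing an address.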

What exactly was the rationale behind introducing references in c++?

From the discussion that has happened in my recent question (Why is a c++ reference considered safer than a pointer?), it raises another question in my mind: What exactly was the rationale behind introducing references in c++?
Section 3.7 of Stroustrup's Design and Evolution of C++ describes the introduction of references into the language. If you're interested in the rationale behind any feature of C++, I highly recommend this book.
References were introduced primarily to support operator overloading. Doug McIlroy recalls that once I was explaining some problems with a precursor to the current operator overloading scheme to him. He used the word reference with the startling effect that I muttered "Thank you," and left his office to reappear the next day with the current scheme essentially complete. Doug had reminded me of Algol68.
C passes every function argument by value, and where passing an object by value would be inefficient or inappropriate the user can pass a pointer. This strategy doesn't work where operator overloading is used. In that case, notational convenience is essential because users cannot be expected to insert address-of operators if the objects are large. For example:
a = b - c;
is acceptable (that is, conventional) notation, but
a = &b - &c;
is not. Anyway, &b - &c already has a meaning in C, and I didn't want to change that.
It is not possible to change what a reference refers to after initialization. That is, once a C++ reference is initialized, it cannot be re-bound. I had in the past been bitten by Algol68 references where r1 = r2 can either assign through r1 to the object referred to or assign a new reference value to r1 (re-binding r1) depending on the type of r2. I wanted to avoid such problems in C++.
You need them for operator overloading (of course we can now go down the rabbit hole of "what was the rationale for introducing operator overloading?")
How would you type std::auto_ptr::operator*() without references? Or std::vector::operator[]?
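As a rough illustration of both points (Vec2 and SmallArray are invented types, not anything from the book or the standard library): operands taken by const reference keep the a - b notation without copying, and an operator[] that returns a reference lets the caller assign to the element.

```cpp
#include <cstddef>

// Hypothetical value type for the operator example.
struct Vec2 {
    double x;
    double y;
};

// Operands are passed by const reference: no copies are made and the
// caller writes a - b rather than &a - &b.
inline Vec2 operator-(const Vec2& lhs, const Vec2& rhs) {
    return Vec2{lhs.x - rhs.x, lhs.y - rhs.y};
}

// Hypothetical fixed-size container for the operator[] example.
class SmallArray {
public:
    // Returning int& makes arr[i] = v store into the element itself.
    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }
private:
    int data_[4] = {0, 0, 0, 0};
};
```

Without references, operator[] would have to return either a copy (so assignment through it is lost) or a pointer (so every use needs an explicit *).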
References bind to objects implicitly. This has large advantages when you consider things like binding to temporaries or operator overloading: without it, C++ programs would be full of & and *. When you think about it, the basic use case of a pointer is actually to behave as a reference. In addition, it's much harder to screw up references: you don't perform any pointer arithmetic yourself, can't automatically convert from arrays (a terrible thing), etc.
References are cleaner, easier, and safer than pointers.
It's interesting because most other languages don't have references like C++ has them (aliases), they just have pointer-style references.
If code takes the address of a variable and passes it to a routine, the compiler has no way of knowing whether that address might get stored someplace and used long after the called routine has exited, and possibly after the variable has ceased to exist. By contrast, if code gives a routine a reference to a variable, it has somewhat more assurance that the reference will only be used while that routine is running. Once that routine returns, the reference will no longer be used.
Things end up getting a little 'broken' by the fact that C++ allows code to take the address of a reference. This ability was provided to allow compatibility with older routines which expected pointers rather than references. If a reference is passed to a routine which takes its address and stores it someplace, all bets are off. On the other hand, if as a matter of policy one forbids using the address of a reference in any way that might be persisted, one can pretty well gain the assurances that references provide.
To allow for operator overloading. They wanted operators to be overloadable both for objects and pointers, so they needed a way to refer to an object by something other than a pointer. Hence the reference was introduced. It is in "The Design and Evolution of C++".

If de-referencing a NULL pointer is an invalid thing to do, how should auto pointers be implemented?

I thought dereferencing a NULL pointer was dangerous, if so then what about this implementation of an auto_ptr?
http://ootips.org/yonat/4dev/smart-pointers.html
If the default constructor is invoked without a parameter the internal pointer will be NULL, then when operator*() is invoked won't that be dereferencing a null pointer?
Therefore what is the industrial strength implementation of this function?
Yes, dereferencing NULL pointer = bad.
Yes, constructing an auto_ptr with NULL creates a NULL auto_ptr.
Yes, dereferencing a NULL auto_ptr = bad.
Therefore what is the industrial strength implementation of this function?
I don't understand the question. If the definition of the function in question created by the industry itself is not "industrial strength" then I have a very hard time figuring out what might be.
std::auto_ptr is intended to provide essentially the same performance as a "raw" pointer. To that end, it doesn't (necessarily) do any run-time checking that the pointer is valid before being dereferenced.
If you want a pointer that checks validity, it's relatively easy to provide that, but it's not the intended purpose of auto_ptr. In fairness, I should add that the real intent of auto_ptr is rather an interesting question -- its specification was changed several times during the original standardization process, largely because of disagreements over what it should try to accomplish. The version that made it into the standard does have some uses, but quite frankly, not very many. In particular, it has transfer-of-ownership semantics that make it unsuitable for storage in a standard container (among other things), removing one of the obvious purposes for smart pointers in general.
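For what "relatively easy to provide" might look like, here is a minimal sketch. checked_ptr and null_deref_throws are hypothetical names, and the sketch stores a std::unique_ptr rather than an auto_ptr, since auto_ptr was later removed from the standard. The only point is the check inside operator*.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical owning pointer that pays for a null check on every dereference.
template <class T>
class checked_ptr {
public:
    checked_ptr() = default;
    explicit checked_ptr(std::unique_ptr<T> p) : ptr_(std::move(p)) {}

    T& operator*() const {
        if (!ptr_) throw std::logic_error("dereferenced a null checked_ptr");
        return *ptr_;
    }
private:
    std::unique_ptr<T> ptr_;
};

// Dereferencing a default-constructed checked_ptr throws instead of
// running into undefined behavior.
inline bool null_deref_throws() {
    checked_ptr<int> empty;
    try {
        (void)*empty;
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

The branch in operator* is the run-time cost that auto_ptr deliberately avoids in order to stay as cheap as a raw pointer.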
Its purpose is to help prevent memory leaks by ensuring that delete is performed on the underlying pointer whenever the auto_ptr goes out of scope (or itself is deleted).
Just like in higher-level languages such as C#, trying to dereference a null pointer/object will still explode, as it should.
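The delete-on-scope-exit behavior can be observed with a small sketch. Tracker and destroyed_after_scope are made-up names, and std::unique_ptr stands in for auto_ptr here (an assumption made so the example compiles on current compilers); the scope-exit behavior shown is the same idea.

```cpp
#include <memory>

// Hypothetical helper that counts how many times it has been destroyed.
struct Tracker {
    explicit Tracker(int& counter) : destroyed(counter) {}
    ~Tracker() { ++destroyed; }
    int& destroyed;
};

inline int destroyed_after_scope() {
    int destroyed = 0;
    {
        std::unique_ptr<Tracker> owner(new Tracker(destroyed));
        // code here may return early or throw; the Tracker is still deleted
    }   // owner goes out of scope: delete runs on the Tracker
    return destroyed;
}
```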
Do what you would do if you dereferenced a NULL pointer. On many platforms, this means throw an exception.
Well, just like you said: dereferencing null pointer is illegal, leads to undefined behavior. This immediately means that you must not use operator * on a default-constructed auto_ptr. Period.
Where exactly you see a problem with "industrial strength" of this implementation is not clear to me.
@Jerry Coffin: It is naughty of me to answer your answer here rather than the OP's question but I need more lines than a comment allows.
You are completely right about the ridiculous semantics of the current version, it is so completely rotten that a new feature: "mutable" HAD to be added to the language just to allow these insane semantics to be implemented.
The original purpose of "auto_ptr" was exactly what boost::scoped_ptr does (AFAIK), and I'm happy to see a version of that finally made it into the new Standard. The reason for the name "auto_ptr" is that it should model a first class pointer on the stack, i.e. an automatic variable.
This auto_ptr was a National Body requirement, based on the following logic: if we have catchable exceptions in C++, we MUST have a way to manage pointers which is exception safe IN the Standard. This also applies to non-static class members (although that's a slightly harder problem which required a change to the syntax and semantics of constructors).
In addition, a reference counting pointer was required, but because there are many possible implementations with different tradeoffs, one can accept that this was left out of the Standard until a later time.
Have you ever played that game where you pass a message around a ring of people and at the end someone reads out the input and output messages? That's what happened. The original intent got lost because some people thought that the auto_ptr, now we HAD to have it, could be made to do more... and finally what got put in the standard can't even do what the original simple scoped_ptr style one did (auto_ptr semantics don't assure the pointed at object is destroyed because it could be moved elsewhere).
If I recall, the key problem was returning the value of an auto_ptr: the core design simply doesn't allow that (it's uncopyable). A sane solution like
return ap.swap(NULL)
unfortunately still destroys the intended invariant. The right way is probably closer to:
return ap.clone();
which copies the object and returns the copy, destroying the original: the compiler is then free to optimise away the copy (as written not exception safe .. the clone might leak if another exception is thrown before it returns: a ref counted pointer solves this of course).