Taking the address of a temporary object - c++

§5.3.1 Unary operators, Section 3
The result of the unary & operator is a pointer to its operand. The operand shall be an lvalue or a qualified-id.
What exactly does "shall be" mean in this context? Does it mean it's an error to take the address of a temporary? I was just wondering, because g++ only gives me a warning, whereas comeau refuses to compile the following program:
#include <string>
int main()
{
&std::string("test");
}
g++ warning: taking address of temporary
comeau error: expression must be an lvalue or a function designator
Does anyone have a Microsoft compiler or other compilers and can test this program, please?

The word "shall" in the standard language means a strict requirement. So, yes, your code is ill-formed (it is an error) because it attempts to apply address-of operator to a non-lvalue.
However, the problem here is not an attempt of taking address of a temporary. The problem is, again, taking address of a non-lvalue. Temporary object can be lvalue or non-lvalue depending on the expression that produces that temporary or provides access to that temporary. In your case you have std::string("test") - a functional style cast to a non-reference type, which by definition produces a non-lvalue. Hence the error.
If you wished to take address of a temporary object, you could have worked around the restriction by doing this, for example
const std::string &r = std::string("test");
&r; // this expression produces address of a temporary
whith the resultant pointer remaining valid as long as the temporary exists. There are other ways to legally obtain address of a temporary object. It is just that your specific method happens to be illegal.

When the word "shall" is used in the C++ Standard, it means "must on pain of death" - if an implementation does not obey this, it is faulty.

It is permitted in MSVC with the deprecated /Ze (extensions enabled) option. It was allowed in previous versions of MSVC. It generates a diagnostic with all warnings enabled:
warning C4238: nonstandard extension used : class rvalue used as lvalue.
Unless the /Za option is used (enforce ANSI compatibility), then:
error C2102: '&' requires l-value

&std::string("test"); is asking for the address of the return value of the function call (we'll ignore as irrelevant the fact that this function is a ctor). It didn't have an address until you assign it to something. Hence it's an error.

The C++ standard is a actually a requirement on conformant C++ implementations. At places it is written to distinguish between code that conformant implementations must accept and code for which conformant implementations must give a diagnostic.
So, in this particular case, a conformant compiler must give a diagnostic if the address of an rvalue is taken. Both compilers do, so they are conformant in this respect.
The standard does not forbid the generation of an executable if a certain input causes a diagnostic, i.e. warnings are valid diagnostics.

I'm not a standards expert, but it certainly sounds like an error to me. g++ very often only gives a warning for things that are really errors.

user defined conversion
struct String {
std::string str;
operator std::string*() {
return &str;
}
};
std::string *my_str = String{"abc"};

Related

Could a compiler optimisaton give an lvalue rather than rvalue?

The following code:
int x = 0;
x+0 = 10;
unsurprisingly produces the compiler error
lvalue required as left operand of assignment
However, is it guaranteed that all standards-conforming compilers will produce a similar error, or could a compiler legitimately treat line 2 as
x = 10;
which then would compile?
Yes, it is guaranteed you get an error (or more precisely, some diagnostic). Compiler optimization never makes ill-formed code well formed.
Yes, compilers must reject this.
A good description of value categories can be found here:
https://medium.com/#barryrevzin/value-categories-in-c-17-f56ae54bccbe
One of the takeaways is, the value category of an expression can be determined by looking at the type decltype ((expr)).
Compiler optimations do not change the types of expressions, and they generally happen after name resolution, determination of types, overload resolution etc.
Gcc is known to perform some constant folding in the front end but i would be shocked if any version of gcc compiles your example.

What makes this usage of pointers unpredictable?

I'm currently learning pointers and my professor provided this piece of code as an example:
//We cannot predict the behavior of this program!
#include <iostream>
using namespace std;
int main()
{
char * s = "My String";
char s2[] = {'a', 'b', 'c', '\0'};
cout << s2 << endl;
return 0;
}
He wrote in the comments that we can't predict the behavior of the program. What exactly makes it unpredictable though? I see nothing wrong with it.
The behaviour of the program is non-existent, because it is ill-formed.
char* s = "My String";
This is illegal. Prior to 2011, it had been deprecated for 12 years.
The correct line is:
const char* s = "My String";
Other than that, the program is fine. Your professor should drink less whiskey!
The answer is: it depends on what C++ standard you're compiling against. All the code is perfectly well-formed across all standards‡ with the exception of this line:
char * s = "My String";
Now, the string literal has type const char[10] and we're trying to initialize a non-const pointer to it. For all other types other than the char family of string literals, such an initialization was always illegal. For example:
const int arr[] = {1};
int *p = arr; // nope!
However, in pre-C++11, for string literals, there was an exception in §4.2/2:
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; [...]. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ]
So in C++03, the code is perfectly fine (though deprecated), and has clear, predictable behavior.
In C++11, that block does not exist - there is no such exception for string literals converted to char*, and so the code is just as ill-formed as the int* example I just provided. The compiler is obligated to issue a diagnostic, and ideally in cases such as this that are clear violations of the C++ type system, we would expect a good compiler to not just be conforming in this regard (e.g. by issuing a warning) but to fail outright.
The code should ideally not compile - but does on both gcc and clang (I assume because there's probably lots of code out there that would be broken with little gain, despite this type system hole being deprecated for over a decade). The code is ill-formed, and thus it does not make sense to reason about what the behavior of the code might be. But considering this specific case and the history of it being previously allowed, I do not believe it to be an unreasonable stretch to interpret the resulting code as if it were an implicit const_cast, something like:
const int arr[] = {1};
int *p = const_cast<int*>(arr); // OK, technically
With that, the rest of the program is perfectly fine, as you never actually touch s again. Reading a created-const object via a non-const pointer is perfectly OK. Writing a created-const object via such a pointer is undefined behavior:
std::cout << *p; // fine, prints 1
*p = 5; // will compile, but undefined behavior, which
// certainly qualifies as "unpredictable"
As there is no modification via s anywhere in your code, the program is fine in C++03, should fail to compile in C++11 but does anyway - and given that the compilers allow it, there's still no undefined behavior in it†. With allowances that the compilers are still [incorrectly] interpreting the C++03 rules, I see nothing that would lead to "unpredictable" behavior. Write to s though, and all bets are off. In both C++03 and C++11.
†Though, again, by definition ill-formed code yields no expectation of reasonable behavior
‡Except not, see Matt McNabb's answer
Other answers have covered that this program is ill-formed in C++11 due to the assignment of a const char array to a char *.
However the program was ill-formed prior to C++11 also.
The operator<< overloads are in <ostream>. The requirement for iostream to include ostream was added in C++11.
Historically, most implementations had iostream include ostream anyway, perhaps for ease of implementation or perhaps in order to provide a better QoI.
But it would be conforming for iostream to only define the ostream class without defining the operator<< overloads.
The only slightly wrong thing that I see with this program is that you're not supposed to assign a string literal to a mutable char pointer, though this is often accepted as a compiler extension.
Otherwise, this program appears well-defined to me:
The rules that dictate how character arrays become character pointers when passed as parameters (such as with cout << s2) are well-defined.
The array is null-terminated, which is a condition for operator<< with a char* (or a const char*).
#include <iostream> includes <ostream>, which in turn defines operator<<(ostream&, const char*), so everything appears to be in place.
You can't predict the behaviour of the compiler, for reasons noted above. (It should fail to compile, but may not.)
If compilation succeeds, then the behaviour is well-defined. You certainly can predict the behaviour of the program.
If it fails to compile, there is no program. In a compiled language, the program is the executable, not the source code. If you don't have an executable, you don't have a program, and you can't talk about behaviour of something that doesn't exist.
So I'd say your prof's statement is wrong. You can't predict the behaviour of the compiler when faced with this code, but that's distinct from the behaviour of the program. So if he's going to pick nits, he'd better make sure he's right. Or, of course, you might have misquoted him and the mistake is in your translation of what he said.
As others have noted, the code is illegitimate under C++11, although it was valid under earlier versions. Consequently, a compiler for C++11 is required to issue at least one diagnostic, but behavior of the compiler or the remainder of the build system is unspecified beyond that. Nothing in the Standard would forbid a compiler from exiting abruptly in response to an error, leaving a partially-written object file which a linker might think was valid, yielding a broken executable.
Although a good compiler should always ensure before it exits that any object file it is expected to have produced will be either valid, non-existent, or recognizable as invalid, such issues fall outside the jurisdiction of the Standard. While there have historically been (and may still be) some platforms where a failed compilation can result in legitimate-appearing executable files that crash in arbitrary fashion when loaded (and I've had to work with systems where link errors often had such behavior), I would not say that the consequences of syntax errors are generally unpredictable. On a good system, an attempted build will generally either produce an executable with a compiler's best effort at code generation, or won't produce an executable at all. Some systems will leave behind the old executable after a failed build, since in some cases being able to run the last successful build may be useful, but that can also lead to confusion.
My personal preference would be for disk-based systems to to rename the output file, to allow for the rare occasions when that executable would be useful while avoiding the confusion that can result from mistakenly believing one is running new code, and for embedded-programming systems to allow a programmer to specify for each project a program that should be loaded if a valid executable is not available under the normal name [ideally something which which safely indicates the lack of a useable program]. An embedded-systems tool-set would generally have no way of knowing what such a program should do, but in many cases someone writing "real" code for a system will have access to some hardware-test code that could easily be adapted to the purpose. I don't know that I've seen the renaming behavior, however, and I know that I haven't seen the indicated programming behavior.

Binding r-value to l-value reference is non-standard Microsoft C++ extension

I've been working on a project recently, and I decided to install ReSharper C++ to Visual Studio. When it analysed my code it spat out a bunch of new warnings (apparently I have bad coding habits..). One of them that took me a while to figure out was Binding r-value to l-value reference is non-standard Microsoft C++ extension. I've recreated the warning with the following code:
Type foo(Type t1, Type t2) {
return Type(t1.value & t2.value);
}
The expression t1.value & t2.value triggers the warning. I understand the second part of the warning, which means that my code only compiles due to a Microsoft extension, and other compilers would refuse to compile it. I'm using an overloaded operator, which returns an object (called Datum) which Type takes as a constructor parameter, as a reference (Type::Type(Datum& dat)).
With some playing around, I managed to make the warning go away by refactoring the code:
Type bar(Type t1, Type t2) {
Datum datum = t1.value & t2.value;
return Type(datum);
}
To my understanding, this is functionally equivalent to the code that generated the warning. What I'd really like to know is whether there's something here that I should be aware of, because I'm pretty confused about why one function complains and one doesn't.
I think I've got it figured out. I already had the question typed out, so I'm going to post it with what I found, for the reference of others. I don't really have enough knowledge to go into detail, so please feel free to expand on or correct my answer if it isn't satisfactory :)
That's one way to remove the warning: the variable is an lvalue, so can bind directly to the dodgy constructor's reference parameter, while the expression result is an rvalue which can't.
A better solution is to fix the constructor to take its argument by value or constant reference:
Type(Datum dat) // value
Type(Datum const & dat) // constant reference
Now you can give the argument as either an lvalue or an rvalue.
From what I can see, the reason I got such behaviour is because the Type constructor is taking a reference to the Datum object, rather than passing it by value. This causes a warning in Type foo(Type, Type) because the compiler doesn't like taking the reference of expressions, which would be due to the semantics of expression evaluation.
Again, please feel free to elaborate on or correct my findings, as this is simply the result of my experimentation and inference.

Why is returning a reference to a function local value not a compile error?

The following code invokes undefined behaviour.
int& foo()
{
int bar = 1234;
return bar;
}
g++ issues a warning:
warning: reference to local variable ‘bar’ returned [-Wreturn-local-addr]
clang++ too:
warning: reference to stack memory associated with local variable 'bar' returned [-Wreturn-stack-address]
Why is this not a compile error (ignoring -Werror)?
Is there a case where returning a ref to a local var is valid?
EDIT As pointed out, the spec mandates this be compilable. So, why does the spec not prohibit such code?
I would say that requiring this to make the program ill-formed (that is, make this a compilation error) would complicate the standard considerably for little benefit. You'd have to exactly spell out in the standard when such cases shall be diagnosed, and all compilers would have to implement them.
If you specify too little, it will not be too useful. And compilers probably already check for this to emit warnings, and real programmers compile with -Wall_you_can_give_me -Werror anyway.
If you specify too much, it will be difficult (or impossible) for compilers to implement the standard.
Consider this class (for which you only have the header and a library):
class Foo
{
int x;
public:
int& getInteger();
};
And this code:
int& bar()
{
Foo f;
return f.getInteger();
}
Now, should the standard be written to make this ill-formed or not? Probably not, what if Foo is implemented like this:
#include "Foo.h"
int global;
int& Foo::getInteger()
{
return global;
}
At the same time, it could be implemented like this:
#include "Foo.h"
int& Foo::getInteger()
{
return x;
}
Which of course would give you a dangling reference.
My point is that the compiler cannot really know whether returning a reference is OK or not, except for a few trivial cases (returning a reference to a function-scope automatic variable or parameter of non-reference type). I don't think it's worth it to complicate the standard for that. Especially as most compilers already warn about this as a quality-of-implementation matter.
Also, because you may want to get the current stack pointer (whatever that means on your particular implementation).
This function:
void* get_stack_pointer (void) { int x; return &x; };
AFAIK, it is not undefined behavior if you don't dereference the resulting pointer.
is much more portable than this one:
void* get_stack_pointer (void) {
register void* sp asm ("%esp"); return sp; }
As to why you may want to get the stack pointer: well, there are cases where you have a valid reason to get it: for instance the conservative Boehm garbage collector needs to scan the stack (so wants the stack pointer and the stack bottom).
And if you returned a C++ reference on which you would only take its address using the & unary operator, getting such an address is IIUC legal (it is IMHO the only licit operation you can do on it).
Another reason to get the stack pointer would be to get a non-NULL pointer address (which you could e.g. hash) different of any heap, local or static data. However, you could use (void*)1 or (void*)-1 for that purpose.
So the compiler is right in only warning against this.
I guess that a C++ compiler should accept
int& get_sp_ref(void) { int x; return x; }
void show_sp(void) {
std::cout << (&(get_sp_ref())) << std::endl; }
For the same reason C allows you to return a pointer to a memory block that's been freed.
It's valid according to the language specification. It's a horribly bad idea (and is nowhere close to being guaranteed to work) but it's still valid inasmuch as it's not forbidden.
If you're asking why the standard allows this, it's probably because, when references were introduced, that's the way they worked. Each iteration of the standard has certain guidelines to follow (such as minimising the possibility of "breaking changes", those that render existing well-formed programs invalid) and the standard is an agreement between user and implementer, with undoubtedly more implementers than users sitting on the committees :-)
It may be worth pushing that idea through as a potential change and seeing what ISO say but I suspect it would be considered one of those "breaking changes" and therefore very suspect.
To expand on the earlier answers, the ISO C++ standard does not capture the distinction between warnings and errors to begin with; it simply uses the term 'diagnostic' when referring to what a compiler must emit upon seeing an ill-formed program. Quoting N3337, 1.4, paragraphs 1 and 2:
The set of diagnosable rules consists of all syntactic and semantic rules in this
International Standard except for those rules containing an explicit notation that “no
diagnostic is required” or which are described as resulting in “undefined behavior.”
Although this International Standard states only requirements on C++ implementations,
those requirements are often easier to understand if they are phrased as requirements on
programs, parts of programs, or execution of programs. Such requirements have the
following meaning:
If a program contains no violations of the rules in this International Standard, a
conforming implementation shall, within its resource limits, accept and correctly execute
that program.
If a program contains a violation of any diagnosable rule or an occurrence of a
construct described in this Standard as “conditionally-supported” when the
implementation does not support that construct, a conforming implementation shall issue
at least one diagnostic message.
If a program contains a violation of a rule for which no diagnostic is required, this
International Standard places no requirement on implementations with respect to that
program.
Something not mentioned by other answers yet is that this code is OK if the function is never called.
The compiler isn't required to diagnose whether a function might ever be called or not. For example you might set up a program which looks for counterexamples to Fermat's Last Theorem, and calls this function if it finds one. It would be a mistake for the compiler to reject such a program.
Returning reference into local variable is bad idea, however some people may create code which requires that, so compiler should only warn about that and don't determine valid (valid structure) code as erroneous.
Angew already posted sample with local variable that is actually global. However there is some other (IMHO better) sample.
Object& GetSmth()
{
Object* obj = new Object();
return *obj;
}
In this case reference to local object is valid and caller after usage should dealocate memory.
IMPORTANT NOTE I don't encourage and don't recommend to use such coding style, because it is bad, usually it is hard to understand what is going on and it leads in some kind of problems like memory leaks or crashes. It is just a sample which shows why this particular situation cannot be treated as error.

Unforgiving GCC C++ compiler

MS VS x86 compiler has no problem with the following definitions, but GCC (ARM) complains. Is GCC dumb or is MSVS_x86 too clever?
bool checkPointInside(const CIwVec2& globalPoint) {
return checkPointIn(globalPoint, CIwVec2());
}; /// error: no matching function for call to 'Fair::Sprite::checkPointIn(const CIwVec2&, CIwVec2)'
bool checkPointIn(const CIwVec2& globalPoint, CIwVec2& localPoint) {
return false;
};
According to the C++ standard, you cannot bind an rvalue to a non-const reference. The Microsoft compiler has an evil extension that allows this, however. So g++ is correct not to accept your program.
g++ is right to complain and Microsoft has this wrong. The problem with this code is that the checkPointIn function takes it's second parameter by reference, meaning that it must take an lvalue (a variable, or a dereferenced pointer, for example). However, the code in checkPointInside is passing in a temporary object, which is an rvalue. For historical reasons the Microsoft compiler allows this, though it's explicitly forbidden by the spec. Usually, if you crank the warning level all the way up in the Microsoft compiler, it will indeed flag this code as erroneous.
To fix this, either have checkPointIn take its last argument by value or by const reference. The latter is probably the better choice, since const references can bind to rvalues if necessary and avoid making costly copies in other cases.
You can't create references to temporaries (only constant references or r-value references in C++0x).
This is happening when you call checkPoint with CIwVec2() as second parameter.