Show where temporaries are created in C++

What is the fastest way to uncover where temporaries are created in my C++ code?
The answer is not always easily deducible from the standard and compiler optimizations can further eliminate temporaries.
I have experimented with godbolt.org and it's fantastic. Unfortunately, when it comes to temporaries it often hides the trees behind a wood of assembly, and aggressive compiler optimization options make the assembly totally unreadable.
Any other means to accomplish this?

"compiler optimizations can further eliminate temporaries."
It seems you have a slight misunderstanding of the C++ semantics. The C++ Standard talks about temporaries to define the formal semantics of a program. This is a compact way to describe a large set of possible executions.
An actual compiler doesn't need to behave at all like this, and often it won't. Real compilers know about registers; real compilers don't pretend that PODs have (trivial) constructors and destructors. This happens even before optimizations: I don't know of any compiler that will generate trivial ctors even in debug mode.
Now some semantics described by the Standard can only be achieved by a fairly close approximation. When destructors have visible side effects (think std::cout), temporaries of those types cannot be entirely eliminated. But real compilers might implement the visible side effect while not allocating any storage. The notion of a temporary existing or not existing is a binary view, and in reality there are intermediate forms.
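To see those intermediate forms concretely, here is a minimal sketch (the Tracer type is hypothetical, purely for illustration): the temporary's destructor performs I/O, so that visible side effect must occur even if the compiler never allocates storage for the temporary itself.

#include <iostream>

// Hypothetical type whose destructor has a visible side effect (I/O).
struct Tracer {
    ~Tracer() { std::cout << "~Tracer\n"; }
};

int main() {
    Tracer{};   // a discarded temporary: the compiler may never allocate
                // storage for it, but "~Tracer" must still be printed
                // before the next line's output
    std::cout << "done\n";
}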

Due to the "as-if" rule, it is probably unreliable to inspect the compiler's output in order to see where temporaries are created.
But reading the code (and coding) while keeping in mind the following paragraph of the standard may help in finding where temporaries are created or not, [class.temporary]/2
The materialization of a temporary object is generally delayed as long as possible in order to avoid creating unnecessary temporary objects. [ Note: Temporary objects are materialized:
when binding a reference to a prvalue ([dcl.init.ref], [expr.type.conv], [expr.dynamic.cast], [expr.static.cast], [expr.const.cast], [expr.cast]),
when performing member access on a class prvalue ([expr.ref], [expr.mptr.oper]),
when performing an array-to-pointer conversion or subscripting on an array prvalue,
when initializing an object of type std​::​initializer_­list from a braced-init-list ([dcl.init.list]),
for certain unevaluated operands ([expr.typeid], [expr.sizeof]), and
when a prvalue appears as a discarded-value expression.
In this paragraph coming from the C++17 standard, the term prvalue has a new definition [basic.lval]/1:
A prvalue is an expression whose evaluation initializes an object or a bit-field, or computes the value of the operand of an operator, as specified by the context in which it appears.
And in the latest draft (pre-C++20), the paragraph [basic.lval] has been moved into Expressions [expr], so what we knew as value categories is evolving into expression categories.
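As a hedged illustration of a few of the bullets above (make is a hypothetical helper, not part of the quoted text), consider:

#include <string>

std::string make() { return "hello"; }   // make() is a prvalue expression

int main() {
    // Member access on a class prvalue: a temporary is materialized so
    // that .size() has an object to operate on.
    auto n = make().size();

    // Binding a reference to a prvalue materializes a temporary whose
    // lifetime is extended to that of the reference.
    const std::string& r = make();

    // A prvalue as a discarded-value expression: materialized, then
    // destroyed at the end of the full-expression.
    make();

    return static_cast<int>(n + r.size());
}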

Related

Why (if that is the case) does the standard say that copying uninitialized memory with memcpy is UB?

When a class member cannot have a sensible meaning at the moment of construction, I don't initialize it. Obviously that only applies to POD types; you cannot NOT initialize an object with constructors.
The advantage of that, apart from saving the CPU cycles spent initializing something to a value that has no meaning, is that I can detect erroneous usage of these variables with valgrind; that is not possible when I just give those variables some random value.
For example,
struct MathProblem {
    bool finished;
    double answer;
    MathProblem() : finished(false) { }
};
Until the math problem is solved (finished) there is no answer. It makes no sense to initialize answer in advance (to, say, zero) because that might not be the answer. answer only has a meaning after finished has been set to true.
Usage of answer before it is initialized is therefore an error and perfectly OK to be UB.
However, a trivial copy of answer before it is initialized is currently ALSO UB (if I understand the standard correctly), and that doesn't make sense: the default copy and move constructors should simply be able to make a trivial copy (aka, as-if using memcpy), initialized or not. I might want to move this object into a container:
v.push_back(MathProblem());
and then work with the copy inside the container.
Is moving an object with an uninitialized, trivially copyable member indeed defined as UB by the standard? And if so, why? It doesn't seem to make sense.
Is moving an object with an uninitialized, trivially copyable member indeed defined as UB by the standard?
Depends on the type of the member. Standard says:
[basic.indet]
When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced ([expr.ass]).
If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
If an indeterminate value of unsigned ordinary character type ([basic.fundamental]) or std​::​byte type ([cstddef.syn]) is produced by the evaluation of:
the second or third operand of a conditional expression,
the right operand of a comma expression,
the operand of a cast or conversion ([conv.integral], [expr.type.conv], [expr.static.cast], [expr.cast]) to an unsigned ordinary character type or std​::​byte type ([cstddef.syn]), or
a discarded-value expression,
then the result of the operation is an indeterminate value.
If an indeterminate value of unsigned ordinary character type or std​::​byte type is produced by the evaluation of the right operand of a simple assignment operator ([expr.ass]) whose first operand is an lvalue of unsigned ordinary character type or std​::​byte type, an indeterminate value replaces the value of the object referred to by the left operand.
If an indeterminate value of unsigned ordinary character type is produced by the evaluation of the initialization expression when initializing an object of unsigned ordinary character type, that object is initialized to an indeterminate value.
If an indeterminate value of unsigned ordinary character type or std​::​byte type is produced by the evaluation of the initialization expression when initializing an object of std​::​byte type, that object is initialized to an indeterminate value.
None of the exceptional cases apply to your example object, so UB applies.
with memcpy is UB?
It is not. std::memcpy interprets the object as an array of bytes, in which exceptional case there is no UB. You still have UB if you attempt to read the indeterminate copy (unless the exceptions above apply).
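A rough sketch of that distinction, reusing the MathProblem type from the question (the commented-out line marks the read that would be UB):

#include <cstring>

struct MathProblem {
    bool finished;
    double answer;                  // deliberately left indeterminate
    MathProblem() : finished(false) { }
};

int main() {
    MathProblem a;                  // a.answer is indeterminate
    MathProblem b;

    // OK: memcpy interprets both objects as arrays of bytes, and copying
    // indeterminate bytes as unsigned characters is one of the exceptions.
    std::memcpy(&b, &a, sizeof a);

    // double d = b.answer;         // UB: reading the indeterminate copy,
                                    // and double is not an exempted type
    return b.finished ? 1 : 0;      // fine: finished was initialized
}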
why?
The C++ standard doesn't include a rationale for most rules. This particular rule has existed since the first standard. It is slightly stricter than the related C rule, which is about trap representations. To my understanding, there is no established convention for trap handling, and the authors didn't wish to restrict implementations by specifying it, so they opted to specify it as UB. This also has the effect of allowing the optimiser to deduce that indeterminate values will never be read.
I might want to move this object into a container:
Moving an uninitialised object into a container is typically a logic error. It is unclear why you might want to do such thing.
The design of the C++ Standard was heavily influenced by the C Standard, whose authors (according to the published Rationale) intended and expected that implementations would, on a quality-of-implementation basis, extend the semantics of the language by meaningfully processing programs in cases where it was clear that doing so would be useful, even if the Standard didn't "officially" define the behavior of those programs. Consequently, both standards place more priority upon ensuring that they don't mandate behaviors in cases where doing so might make some implementations less useful, than upon ensuring that they mandate everything that should be supported by quality general-purpose implementations.
There are many cases where it may be useful for an implementation to extend the semantics of the language by guaranteeing that using memcpy on any valid region of storage will, at worst, behave in a fashion consistent with populating the destination with some possibly-meaningless bit pattern with no outside side effects, and few if any where it would be either easier or more useful to have it do something else. The only situations where anyone should care about whether the behavior of memcpy is defined in a particular situation involving valid regions of storage would be those in which some alternative behavior would be genuinely more useful than the commonplace one. If such situations exist, compiler writers and their customers would be better placed than the Committee to judge which behavior would be most useful.
As an example of a situation where an alternative behavior might be more useful, consider code which uses memcpy to copy a partially-written structure, and then uses it to make two copies of that structure. In some cases, having the compiler write only the parts of the two destination structures which had been written in the original may improve efficiency, but that behavior would be observably different from having the first memcpy behave as though it stores some bit pattern to its destination. Note that while such a change would not adversely affect a program's overall behavior if no copies of the uninitialized parts of the structure are ever used in a way that would affect behavior, the Standard has no nice way of distinguishing scenarios that could or could not occur under such a model, and thus leaves all such scenarios undefined.

Is it legal to convert a pointer/reference to a fixed array size to a smaller size

Is it legal as per the C++ standard to convert a pointer or reference to a fixed array (e.g. T(*)[N] or T(&)[N]) to a pointer or reference to a smaller fixed array of the same type and CV qualification (e.g. T(*)[M] or T(&)[M])?
Basically, would this always be well-formed for all instantiations of T (regardless of layout-type):
void consume(T(&array)[2]);
void receive(T(&array)[6])
{
    consume(reinterpret_cast<T(&)[2]>(array));
}
I don't see any references to this being a valid conversion in:
expr.reinterpret.cast,
expr.static.cast,
conv.array, or even
basic.types
However, it appears that all major compilers accept this and generate proper code, even when optimized, when using T = std::string (compiler explorer) (not that this proves much if it is undefined behavior).
It's my understanding that this should be illegal as per the type-system, since an object of T[2] was never truly created, which means a reference of T(&)[2] would be invalid.
I'm tagging this question c++11 because this is the version I am most interested in the answer for, but I would be curious to know whether the answer is different in newer versions as well.
There’s not much to say here except no, in any language version: the types are simply unrelated. C++20 does allow conversion from T (*)[N] to T (*)[] (and similarly for references), but that doesn’t mean you can treat two different Ns equivalently. The closest you’re going to get to a “reference” for this rule is [conv.array]/1 (“The result is a pointer to the first element of the array.”, but no object of type T[2] exists in your example) and a note in [defns.undefined] (“Undefined behavior may be expected when this document omits any explicit definition of behavior”).
Part of the reason that compilers don’t “catch” you is that such reinterpret_casts are valid to return to the real type of an object after another reinterpret_cast used to “sneak” it through an interface that expects a pointer or reference to a different type (but doesn’t use it as that type!). That means that the code as given is legitimate, but the obvious sort of definition for consume and caller for receive would together cause undefined behavior. (The other part is that optimizers often leave code alone that’s always undefined unless it can eliminate a branch.)
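A minimal sketch of that point, assuming hypothetical functions forward_only and use_real as stand-ins for such an interface: the cast round-trips legitimately as long as nothing uses the array through the wrong type.

#include <string>

using T = std::string;

void use_real(T (&arr)[6]) { arr[5] = "ok"; }   // uses the true type

// Interface that nominally expects T(&)[2] but only forwards it onward.
void forward_only(T (&fake)[2], void (*cb)(T (&)[6])) {
    // Casting back to the real type yields the original reference: fine.
    cb(reinterpret_cast<T (&)[6]>(fake));
}

int main() {
    T storage[6];
    // The cast itself merely "sneaks" the reference through an interface
    // expecting a different type; no T[2] object is ever used as such.
    forward_only(reinterpret_cast<T (&)[2]>(storage), use_real);
    // UB would be: actually reading or writing storage *as* a T[2],
    // e.g. the "obvious" consume() that indexes its T(&)[2] parameter.
}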
A late additional answer that is really more of a comment, but would far exceed the allowed comment length:
At first: great question! It's remarkable that such a quite obvious issue is hard to verify and generates a lot of confusion even among experts. Worth mentioning that I've seen code of this category quite often already...
Some words about undefined behavior first
I think at least the question about the pointer usage is a great example where one has to admit that theoretical undefined behavior arising from one aspect of the language can sometimes be "beaten" by two other strong aspects:
1. Are there other standard clauses that reduce the degree of UB for the aspect of interest in several cases? Are there perhaps even clauses whose priorities within the standard are ambiguous relative to each other? (There are several prominent examples still existing in C++20, see the conversion-type-id handling for operator auto() for instance...)
2. Are there (Turing-) provable arguments that any theoretical and practical compiler realization has to behave as you expect, because other constraints from the language determine it that way? That is, even if UB can quirkily mean the compiler could apply "I can do what I want here, even the biggest mess" to your case, it might be provable that ensuring other specified(!) language aspects makes that effectively impossible.
So with respect to point 2, there's an often underrated aspect: what are the constraints (if definable) imposed by the model of the abstract machine that determine the outcome of any theoretical (compiler-) implementation for the given code?
So, many words so far, but does anything from 1) apply to your concrete case (the pointer way)?
As several users mentioned within the comments, a chance for that lies here, basic.types#basic.compound-4:
Two objects a and b are pointer-interconvertible if:
...
(4.4) there exists an object c such that a and c are
pointer-interconvertible, and c and b are pointer-interconvertible.
That's the simple rule of transitivity. Can we actually find such a c (for arrays)?
Within the same section, the standard says further on:
If two objects are pointer-interconvertible, then they have the same
address, and it is possible to obtain a pointer to one from a pointer
to the other via a reinterpret_­cast. [ Note: An array object and its
first element are not pointer-interconvertible, even though they have
the same address.  — end note ]
demolishing our dreams of the approach via pointer-to-the-first-element usage. There is no such c for arrays.
Do we have another chance? You mentioned expr.reinterpret.cast#7 :
An object pointer can be explicitly converted to an object pointer of
a different type.70 When a prvalue v of type “pointer to T1” is
converted to the type “pointer to cv T2”, the result is static_cast<cv
T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout
types ([basic.types]) and the alignment requirements of T2 are no
stricter than those of T1, or if either type is void. Converting a
prvalue of type “pointer to T1” to the type “pointer to T2” (where T1
and T2 are object types and where the alignment requirements of T2 are
no stricter than those of T1) and back to its original type yields the
original pointer value. The result of any other such pointer
conversion is unspecified.
This looks promising at first glance, but the devil is in the details. It solely ensures that you can apply the pointer conversion, since the alignment requirements of both arrays are equal; it does not establish pointer-interconvertibility (i.e. usage of the object itself) a priori.
As Davis already said: with the pointer to the first element, one could still use reinterpret_cast as a kind of fake facade, fully standard compliant, as long as the wrongly-typed pointer to T[2] is only really used as a forwarder, all actual use cases refer to the element pointer via a corresponding reinterpret_cast, and all use cases "are aware" that the actual type was a T[6]. It is trivial to see that this is still hacky as hell for many scenarios. At least a type alias emphasizing the forwarding quality would be recommended here.
So a strict interpretation of the standard here is: it's undefined behavior, with the note that we all know it should work well with all common modern compilers on many common platforms (I know, the latter was not your question).
Do we have some chances according to my point 2) about effective "weak UB" from above?
I don't think so, as long as only the abstract machine is in focus here. For instance, IMO there's no restriction in the standard preventing a compiler/environment from handling (abstract) allocation schemes differently between arrays of different sizes (changed intrinsics for threshold sizes, for instance) while still ensuring alignment requirements. To be very quirky here, one could say a very exotic compiler would be allowed to use underlying dynamic-storage-duration mechanisms even for scoped objects that appear to live on what we know as the stack. A related possible issue is the question of proper deallocation of arrays of dynamic storage duration (see the similar debate about UB in the context of inheritance from classes that do not provide virtual destructors). I highly doubt that it's trivial to validate that the standard guarantees a valid cleanup here a priori, i.e. effectively calling ~T[6] for your example in all cases.

Can std::move() or its explicit equivalent on a local variable allow elision?

For example:
Big create()
{
    Big x;
    return std::move(x);
    // or, spelled out:
    // return static_cast<typename std::remove_reference<decltype(x)>::type&&>(x); // why not elide here?
}
Assuming that applying std::move() when returning a local variable inhibits copy elision because compilers can't make any assumptions about the inner workings of functions in general, what about cases where those assumptions are not necessary, for example when:
std::move(x) is inlined (probably always)
std::move(x) is written as: static_cast<typename std::remove_reference<T>::type&&>(t)
According to the current Standard, an implementation is allowed to apply NRVO...
— in a return statement in a function with a class return type, when the
expression is the name of a non-volatile automatic object (other than
a function parameter or a variable introduced by the
exception-declaration of a handler (18.3)) with the same type
(ignoring cv-qualification) as the function return type, the copy/move
operation can be omitted by constructing the automatic object directly
into the function call’s return object
Obviously, neither 1) nor 2) qualify. Apart from the fact that using std::move() to return a local variable is redundant, why is this restriction necessary?
You should be clear on exactly what "allow elision" means. First of all, the compiler can do anything it wants, under the "as-if" rule. That is, the compiler can spit out any assembly it wants, as long as that assembly behaves correctly. That means that the compiler can elide any constructor it wants, but it does have to prove that the program will behave the same whether or not the constructor is called.
So why the special rules for elision? Well, these are cases where the compiler can elide constructor calls (and therefore, destructor calls too) without proving that the behavior is the same. This is very useful, because there are lots of types where the constructor is very non-trivial (like say, string), and the compilers in practice are generally not capable of proving that they are safe to elide (in a reasonable time frame) (in the past, there was even lack of clarity on whether optimizing out a heap allocation was legal to begin with, since it is basically mutation of a global variable).
So, we want to have elision for performance reasons. However, it is basically designating a special case in the standard, in terms of behavior. The bigger the special case, the more complexity we are introducing to the standard. So the goal should be to make the permitted situation for elision to be broad enough to cover the useful cases we care about, but no broader.
You are approaching this as: why not make the special case as big as practical? In reality, it is the opposite. To extend the allowable situations for elision, it needs to be shown to be very worthwhile.
After re-reading the question, I understand it differently. I read the question as 'Why does std::move() inhibit (N)RVO?'
The quote from the standard provided in the question has the wrong emphasis. It should be:
in a return statement in a function with a class return type, when the
expression is the name of a non-volatile automatic object (other than
a function parameter or a variable introduced by the
exception-declaration of a handler (18.3)) with the same type
(ignoring cv-qualification) as the function return type
What inhibits NRVO here is not that std::move() is called, but the fact that the returned expression is no longer the name of a local object: the result of std::move is not of type Big, but Big&&. It doesn't match the function's return type!
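A short sketch contrasting the two return statements (Big is just a placeholder type; the actual codegen of course depends on the compiler):

#include <utility>

struct Big { /* imagine something expensive to copy */ };

Big create_elided() {
    Big x;
    return x;              // expression is the *name* of a local object of
                           // the return type: NRVO may construct x directly
                           // in the caller's return slot
}

Big create_moved() {
    Big x;
    return std::move(x);   // expression has type Big&&, not Big, and is no
                           // longer the name of an object: NRVO is disabled
                           // and the move constructor is used instead
}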

What exactly is a 'side-effect' in C++?

Is it a standard term which is well defined, or just a term coined by developers to explain a concept (and if so, what is the concept)? As I understand it, this has something to do with the all-confusing sequence points, but I am not sure.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
For reference, some questions talking about side effects:
Is comma operator free from side effect?
Force compiler to not optimize side-effect-less statements
Side effects when passing objects to function in C++
A "side effect" is defined by the C++ standard in [intro.execution], by:
Reading an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
The term "side-effect" arises from the distinction between imperative languages and pure functional languages. A C++ expression can do three things:
compute a result (or compute "no result" in the case of a void expression),
raise an exception instead of evaluating to a result,
in addition to 1 or 2, otherwise alter the state of the abstract machine on which the program is nominally running.
Item (3) describes side-effects, the "main effect" being to evaluate the result of the expression. Exceptions are a slightly awkward special case, in that altering the flow of control does change the state of the abstract machine (by changing the current point of execution), but isn't a side-effect. The code to construct, handle and destroy the exception may have its own side-effects, of course.
The same principles apply to functions, with the return value in place of the result of the expression.
So, int foo(int a, int b) { return a + b; } just computes a return value, it doesn't alter anything else. Therefore it has no side-effects, which sometimes is an interesting property of a function when it comes to reasoning about your program (e.g. to prove that it is correct, or by the compiler when it optimizes). int bar(int &a, int &b) { return ++a + b; } does have a side-effect, since modifying the caller's object a is an additional effect of the function beyond simply computing a return value. It would not be permitted in a pure functional language.
The stuff in your quote about "has finished being evaluated" refers to the fact that the result of an expression (or return value of a function) can be a "temporary object", which is destroyed at the end of the full expression in which it occurs. So creating a temporary isn't a "side-effect" by that definition: other changes are.
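To make foo and bar from above concrete, here is a minimal compilable sketch; the comments note what a conforming optimiser may and may not discard:

#include <iostream>

int foo(int a, int b) { return a + b; }         // no side effects
int bar(int& a, int& b) { return ++a + b; }     // side effect: modifies a

int main() {
    int x = 1, y = 2;
    foo(x, y);          // result discarded, no side effects: the whole
                        // call may be removed under the as-if rule
    bar(x, y);          // result discarded, but the write to x must happen
    std::cout << x;     // library I/O is itself a side effect; prints 2
}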
What exactly is a 'side-effect' in C++? Is it a standard term which is well defined...
c++11 draft - 1.9.12: Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access to a volatile object is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
The significance is that, as expressions are being evaluated they can modify the program state and/or perform I/O. Expressions are allowed in myriad places in C++: variable assignments, if/else/while conditions, for loop setup/test/modify steps, function parameters etc.... A couple examples: ++x and strcat(buffer, "append this").
In a C++ program, the Standard grants the optimiser the right to generate code representing the program operations, but requires that all the operations associated with steps before a sequence point appear before any operations related to steps after the sequence point.
The reason C++ programmers tend to care about sequence points and side effects is that there aren't as many sequence points as you might expect. For example: given x = 1; f(++x, ++x);, you may expect a call to f(2, 3), but it's actually undefined behaviour. This behaviour is left undefined so the compiler's optimiser has more freedom to arrange operations with side effects to run in the most efficient order possible - perhaps even in parallel. It also avoids burdening compiler writers with detecting such conditions.
1. Is comma operator free from side effect?
Yes - a comma operator introduces a sequence point: the steps on the left must be complete before those on the right execute. There is a list of sequence points at http://en.wikipedia.org/wiki/Sequence_point - you should read it! (If you have to ask about side effects, then be careful in interpreting this answer - the "comma operator" is NOT invoked between function arguments, array initialisation elements etc. The comma operator is relatively rarely used and somewhat obscure; do some reading if you're not sure what the comma operator really is.)
2. Force compiler to not optimize side-effect-less statements
I assume you mean "side-effect-ful" statements. Compilers are not obliged to support any such option. What behaviour would they exhibit if they tried? - the Standard doesn't define what they should do in such situations. Sometimes a majority of programmers might share an intuitive expectation, but other times it's really arbitrary.
3. Side effects when passing objects to function in C++
When calling a function, all the parameters must have been completely evaluated - and their side effects triggered - before the function call takes place. BUT, there are no restrictions requiring the compiler to evaluate one parameter expression fully before any other; evaluations can be overlapping, in parallel etc. So, in f(expr1, expr2), some of the steps in evaluating expr2 might run before anything from expr1, but expr1 might still complete first - the order is unspecified.
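A small sketch of that last point (trace is a hypothetical helper added for illustration): both side effects must be complete before sum is entered, but their relative order is up to the implementation.

#include <iostream>

int trace(int v) { std::cout << v << ' '; return v; }   // side effect: I/O
int sum(int a, int b) { return a + b; }

int main() {
    // Both trace calls (including their output) finish before sum runs,
    // but whether "1 2" or "2 1" is printed is unspecified.
    int r = sum(trace(1), trace(2));
    std::cout << "= " << r << '\n';
}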
1.9.6
The observable behavior of the abstract machine is its sequence of
reads and writes to volatile data and calls to library I/O
functions.
A side-effect is anything that affects observable behavior.
Note that there are exceptions specified by the standard, where observable behavior doesn't have to conform to that of the abstract machine - see return value optimization, temporary copy elision.

Ways to accidentally create temporary objects in C++?

Years ago I believed that C was absolutely pure compared to C++ because the compiler couldn't generate any code that you couldn't predict. I now believe counterexamples include the volatile keyword and memory barriers (in multiprocessor programming or device drivers for memory-mapped hardware devices, where plain assembly language would be even more pure than the optimizations of a C compiler).
At the moment I'm trying to enumerate the unpredictable things a C++ compiler can do. The main complaint that sticks in my mind about C++ is that the compiler will implicitly instantiate temporary objects, but I believe these cases can all be expected. The cases I'm thinking of are:
when a class defines a converting constructor (a constructor taking a single argument of another type) without using the explicit keyword
when a class defines an implicit conversion operator: operator T()
when a function accepts an object by value instead of by reference
when a function returns an object by value instead of by reference
Are there any others?
I suppose "unpredictable" means "something in accordance with the standard but different from what the programmer expects when writing code", right?
I guess you can see from the code where objects are being instantiated or copied, even if it's maybe not obvious. It might be hard to understand though.
Some stuff is just implemented in certain ways by (all?) compiler vendors, but it could be done differently. E.g., late binding (i.e. calling an overridden virtual method) is usually implemented using function pointers in the background. This is maybe the fastest way of doing it, but I suppose it could be done differently, and that would be unexpected. I don't know of any compiler that does it differently, though.
Lots of stuff is unexpected in the sense that C++ is overly complex - hardly anybody understands the full language. So unexpected also depends on your knowledge.
12.2 Temporary objects
1 Temporaries of class type are created in various contexts: binding an rvalue to a reference (8.5.3), returning an rvalue (6.6.3), a conversion that creates an rvalue (4.1, 5.2.9, 5.2.11, 5.4), throwing an exception (15.1), entering a handler (15.3), and in some initializations (8.5).
4 There are two contexts in which temporaries are destroyed at a different point than the end of the full-expression.
In fact I suggest taking a look at the entire 12.2.
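A hedged sketch touching a few of those contexts (Wrapper, by_value and by_return are illustrative names, not taken from the quoted text):

#include <string>

void by_value(std::string s) { }               // pass by value
std::string by_return() { return "tmp"; }      // return by value

struct Wrapper {
    Wrapper(const std::string&) { }            // binds a temporary below
};

int main() {
    // Binding an rvalue to a reference: a std::string temporary is
    // materialized from the literal and bound to the const reference
    // parameter for the duration of the constructor call.
    Wrapper w("implicit");

    // Passing by value: the argument is a temporary std::string built
    // from the literal, destroyed at the end of the full-expression.
    by_value("copied");

    // Returning by value: the result is a temporary in the caller; here
    // its lifetime is extended by binding it to a const reference.
    const std::string& r = by_return();
    return static_cast<int>(r.size());
}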
At the moment I'm trying to enumerate the unpredictable things a C++ compiler can do. The main complaint that sticks in my mind about C++ is that the compiler will implicitly instantiate temporary objects, but I believe these cases can all be expected.
The compiler does not create temporaries implicitly -- it obeys the standard. Unless, of course, you invoke undefined behavior. Note that there is something called copy elision, together with the return value optimization, which may actually reduce the number of temporaries that would otherwise be created.
An interesting link about common pitfalls related to this subject:
http://www.gotw.ca/gotw/002.htm