C++11 lambdas: member variable capture gotcha - c++

Consider this code:
#include <memory>
#include <iostream>
class A
{
public:
A(int data) : data_(data)
{ std::cout << "A(" << data_ << ")" << std::endl; }
~A() { std::cout << "~A()" << std::endl; }
void a() { std::cout << data_ << std::endl; }
private:
int data_;
};
class B
{
public:
B(): a_(new A(13)) { std::cout << "B()" << std::endl; }
~B() { std::cout << "~B()" << std::endl; }
std::function<void()> getf()
{
return [=]() { a_->a(); };
}
private:
std::shared_ptr<A> a_;
};
int main()
{
std::function<void()> f;
{
B b;
f = b.getf();
}
f();
return 0;
}
Here it looks like I'm capturing a_ shared pointer by value, but when I run it on Linux (GCC 4.6.1), this is printed:
A(13)
B()
~B()
~A()
0
Obviously, 0 is wrong, because A is already destroyed. It looks like this is actually captured and is used to look up this->a_. My suspicion is confirmed when I change the capture list from [=] to [=,a_]. Then the correct output is printed and the lifetime of the objects is as expected:
A(13)
B()
~B()
13
~A()
The question:
Is this behaviour specified by the standard, implementation-defined, or undefined? Or I'm crazy and it's something entirely different?

Is this behaviour specified by the standard
Yes. Capturing member variables is always done via capturing this; it is the only way to access a member variable. In the scope of a member function a_ is equivalent to (*this).a_. This is true in Lambdas as well.
Therefore, if you use this (implicitly or explicitly), then you must ensure that the object remains alive while the lambda instance is around.
If you want to capture it by value, you must explicitly do so:
std::function<void()> getf()
{
auto varA = a_;
return [=]() { varA->a(); };
}
If you need a spec quote:
The lambda-expression’s compound-statement yields the function-body ( 8.4 ) of the function call operator, but for purposes of name lookup (3.4), determining the type and value of this (9.3.2) and transforming id-expressions referring to non-static class members into class member access expressions using (*this) ( 9.3.1 ),
the compound-statement is considered in the context of the lambda-expression.

Related

Overwriting object with new object of same type and using closure using this

In the following code an object is overwritten with a new object of same type, where a lambda-expression creates a closure that uses this of the old object. The old address (this) remains the same, the new object has the same layout, so this should be ok and not UB. But what about non trivial objects or other cases?
struct A {
void g(A& o, int v) {
o = A{.x = v, .f = [this]{
std::cout << "f" << this->x << '\n';
}};
}
int x{0};
std::function<void()> f;
~A() {
std::cout << "dtor" << x << '\n';
}
};
void test() {
A a;
a.g(a, 2);
a.f();
}
You are not actually replacing any object. You are just assigning from another object to the current one. o = simply calls the implicit copy assignment operator which will copy-assign the individual members from the temporary A constructed in the assignment expression with A{...}.
The lambda is going to capture this from this in g, not from the temporary object.
std::function will always keep a copy of the lambda referring to the original object on which g was called and since that is its parent object, it cannot outlive it.
So there is no problem here. The only exception would be that you call f during the destruction of the A object, in which case using the captured pointer may be forbidden.
Here is a slightly modified code with a corner case. I create a temporary in a function and call g on it passing it a more permanent object. The temporary vanishes and the long life object now has a closure refering to an object after its end of life. Invoking f is UB:
#include <iostream>
#include <functional>
struct A {
void g(A& o, int v) {
o = A{ .x = v, .f = [this] {
std::cout << "f" << this->x << ' ' << this << '\n';
} };
}
int x{ 0 };
std::function<void()> f;
~A() {
std::cout << "dtor" << x << ' ' << this << '\n';
}
};
void test(A& a) {
A b{ 2 };
b.g(a, 3);
}
int main() {
A a{ 1 };
std::cout << a.x << '\n';
test(a);
std::cout << a.x << '\n';
a.f(); // UB because a.f uses an object after its end of life
}
The output is:
1
dtor3 0135F9C0
dtor2 0135FA30
3
f341072 0135FA30
dtor3 0135FAA8
proving that the invocation of a.f() tried to use the object at address 0135FA30 (in that specific run) after it has been destroyed.

How can I access the `typeid` of a captured this pointer in a lambda?

I have the following code:
#include <iostream>
class Bobo
{public:
int member;
void function()
{
auto lambda = [this]() { std::cout << member << '\n'; };
auto lambda2 = [this]() { std::cout << typeid(*this).name() << '\n'; };
lambda();
lambda2();
}
};
int main()
{
Bobo bobo;
bobo.function();
}
The line std::cout << typeid(*this).name(); in lambda2() understandably prints out:
class <lambda_49422032c40f80b55ca1d0ebc98f567f>
However how can I access the 'this' pointer that's been captured so the typeid operator can return type class Bobo?
Edit: The result I get is from compiling this code in Visual Studio Community 2019.
This seems to be VS's bug; when determining the type of this pointer in lambda:
For the purpose of name lookup, determining the type and value of the
this pointer and for accessing non-static class members, the body of
the closure type's function call operator is considered in the context
of the lambda-expression.
struct X {
int x, y;
int operator()(int);
void f()
{
// the context of the following lambda is the member function X::f
[=]()->int
{
return operator()(this->x + y); // X::operator()(this->x + (*this).y)
// this has type X*
};
}
};
So the type of this should be Bobo* in the lambda.
As #songyuanyao suggests, your could should work and produce the appropriate typeid, so it's probably a bug. But - here's a workaround for you:
#include <iostream>
class Bobo
{public:
int member;
void function() {
auto lambda = [this]() { std::cout << member << '\n'; };
auto lambda2 = [my_bobo = this]() {
std::cout << typeid(std::decay_t<decltype(*my_bobo)>).name() << '\n';
};
lambda();
lambda2();
}
};
int main() {
Bobo bobo;
bobo.function();
}
Note that you can replaced typeid(...).name() with the proper type name, obtained (at compile-time!) as per this answer:
std::cout << type_name<std::decay_t<decltype(*my_bobo)>>() << '\n';

How to write a constructor for non-named class [duplicate]

Is there a way to declare a constructor or a destructor in an unnamed class? Consider the following
void f()
{
struct {
// some implementation
} inst1, inst2;
// f implementation - usage of instances
}
Follow up question : The instances are ofcourse constructed (and destroyed) as any stack based object. What gets called? Is it a mangled name automatically assigned by the compiler?
The simplest solution is to put a named struct instance as a member in the unnamed one, and put all of the functionality into the named instance. This is probably the only way that is compatible with C++98.
#include <iostream>
#include <cmath>
int main() {
struct {
struct S {
double a;
int b;
S() : a(sqrt(4)), b(42) { std::cout << "constructed" << std::endl; }
~S() { std::cout << "destructed" << std::endl; }
} s;
} instance1, instance2;
std::cout << "body" << std::endl;
}
Everything that follows requires C++11 value initialization support.
To avoid the nesting, the solution for the construction is easy. You should be using C++11 value initialization for all members. You can initialize them with the result of a lambda call, so you can really execute arbitrarily complex code during the initialization.
#include <iostream>
#include <cmath>
int main() {
struct {
double a { sqrt(4) };
int b { []{
std::cout << "constructed" << std::endl;
return 42; }()
};
} instance1, instance2;
}
You can of course shove all the "constructor" code to a separate member:
int b { [this]{ constructor(); return 42; }() };
void constructor() {
std::cout << "constructed" << std::endl;
}
This still doesn't read all that cleanly, and conflates the initialization of b with other things. You could move the constructor call to a helper class, at the cost of the empty class still taking up a bit of space within the unnamed struct (usually one byte if it's the last member).
#include <iostream>
#include <cmath>
struct Construct {
template <typename T> Construct(T* instance) {
instance->constructor();
}
};
int main() {
struct {
double a { sqrt(4) };
int b { 42 };
Construct c { this };
void constructor() {
std::cout << "constructed" << std::endl;
}
} instance1, instance2;
}
Since the instance of c will use some room, we might as well get explicit about it, and get rid of the helper. The below smells of a C++11 idiom, but is a bit verbose due to the return statement.
struct {
double a { sqrt(4) };
int b { 42 };
char constructor { [this]{
std::cout << "constructed" << std::endl;
return char(0);
}() };
}
To get the destructor, you need the helper to store both the pointer to an instance of the wrapped class, and a function pointer to a function that calls the destructor on the instance. Since we only have access to the unnamed struct's type in the helper's constructor, it's there that we have to generate the code that calls the destructor.
#include <iostream>
#include <cmath>
struct ConstructDestruct {
void * m_instance;
void (*m_destructor)(void*);
template <typename T> ConstructDestruct(T* instance) :
m_instance(instance),
m_destructor(+[](void* obj){ static_cast<T*>(obj)->destructor(); })
{
instance->constructor();
}
~ConstructDestruct() {
m_destructor(m_instance);
}
};
int main() {
struct {
double a { sqrt(4) };
int b { 42 };
ConstructDestruct cd { this };
void constructor() {
std::cout << "constructed" << std::endl;
}
void destructor() {
std::cout << "destructed" << std::endl;
}
} instance1, instance2;
std::cout << "body" << std::endl;
}
Now you're certainly complaining about the redundancy of the data stored in the ConstructDestruct instance. The location where the instance is stored is at a fixed offset from the head of the unnamed struct. You can obtain such offset and wrap it in a type (see here). Thus we can get rid of the instance pointer in the ConstructorDestructor:
#include <iostream>
#include <cmath>
#include <cstddef>
template <std::ptrdiff_t> struct MInt {};
struct ConstructDestruct {
void (*m_destructor)(ConstructDestruct*);
template <typename T, std::ptrdiff_t offset>
ConstructDestruct(T* instance, MInt<offset>) :
m_destructor(+[](ConstructDestruct* self){
reinterpret_cast<T*>(reinterpret_cast<uintptr_t>(self) - offset)->destructor();
})
{
instance->constructor();
}
~ConstructDestruct() {
m_destructor(this);
}
};
#define offset_to(member)\
(MInt<offsetof(std::remove_reference<decltype(*this)>::type, member)>())
int main() {
struct {
double a { sqrt(4) };
int b { 42 };
ConstructDestruct cd { this, offset_to(cd) };
void constructor() {
std::cout << "constructed " << std::hex << (void*)this << std::endl;
}
void destructor() {
std::cout << "destructed " << std::hex << (void*)this << std::endl;
}
} instance1, instance2;
std::cout << "body" << std::endl;
}
Unfortunately, it doesn't seem possible to get rid of the function pointer from within ConstructDestruct. This isn't that bad, though, since its size needs to be non-zero. Whatever is stored immediately after the unnamed struct is likely to be aligned to a multiple of a function pointer size anyway, so there may be no overhead from the sizeof(ConstructDestruct) being larger than 1.
You can not declare a constructor or destructor for an unnamed class because the constructor and destructor names need to match the class name. In your example, the unnamed class is local. It has no linkage so neither mangled name is created.
If you are thinking of C++ names, then any class that has objects has to have a destructor whether you create it explicitly or not. So yes, the compiler knows how to assign a name. Whether that naming convention is any of your business, however, probably not.
Actually, you can create a structure or also a namespace without a name. You still need to have names somewhere because at the time you link all of that, the linker needs some kind of a name to make it all work, although in many cases it will be local names that are resolved immediately at compile time--by the assembler.
One way to know of the names assigned by the compiler is to look at the debug strings and see what corresponds to the different addresses that you're interested in. When you compile with -g then you should get all the necessary debug for your debugger to place your current at the right place with the right "names"... (I've see the namespaces without a name it says " namespace", I'm pretty sure structures use the same trick at a higher level.)

C++ bind reference member to constructor parameter

When running the following code on gcc 8 (https://wandbox.org/, with "g++ prog.cc -Wall -Wextra -std=c++1z"):
#include <iostream>
class B{
public:
B(): copy(false){ std::cout << "B-constructed" << std::endl;}
B(const B& b): copy(true){ std::cout << "B-copy-constructed" << std::endl; }
~B(){ std::cout << (copy?"B-destructed":"B-(copy)-destructed") << std::endl;}
bool copy;
};
class A{
public:
A(B b): bref(b){std::cout << "A-constructed" << std::endl;}
~A() {std::cout << "A-destructed" << std::endl;}
B &bref;
};
void f(){
B b;
A a(b);
std::cout << "f over" << std::endl;
}
int main()
{
f();
std::cout << "main over" << std::endl;
return 0;
}
the following output is yielded:
B-constructed
B-copy-constructed
A-constructed
B-destructed
f over
A-destructed
B-(copy)-destructed
main over
The order of object destructions seems unusual. It is as if the lifetime of the constructor's parameter is extended. Does the standard say anything about binding member references to constructor parameters?
I don't think that this quote from the standard applies, as the parameter is not a temporary object (however I do not know the definition of a "temporary expression"):
A temporary expression bound to a reference member in a mem-initializer is ill-formed. [ Example:
struct A {
A() : v(42) { } // error
const int& v;
};
—end example ]
Your destructor has a logical error, since you print that a copy is destructed when copy is wrong.
Change this:
~B(){ std::cout << (copy?"B-destructed":"B-(copy)-destructed") << std::endl;}
to this:
~B(){ std::cout << (copy?"B-(copy)-destructed":"B-destructed") << std::endl;}
which now outputs:
B-constructed
B-copy-constructed
A-constructed
B-(copy)-destructed
f over
A-destructed
B-destructed
main over
nice and clear (Order of member constructor and destructor calls).
Does the standard say anything about binding member references to constructor parameters?
Similarly, before the lifetime of an object has started but after the
storage which the object will occupy has been allocated or, after the
lifetime of an object has ended and before the storage which the
object occupied is reused or released, any glvalue that refers to the
original object may be used but only in limited ways. For an object
under construction or destruction, see [class.cdtor]. Otherwise, such
a glvalue refers to allocated storage
([basic.stc.dynamic.deallocation]), and using the properties of the
glvalue that do not depend on its value is well-defined.
Source

Would this restrict the class to be have a lifetime in the current frame only?

I wanted to restrict a specific class to be creatable on the stack only (not via allocation). The reason for this is that on the stack, the object which lifetime has begun last, will be the first to be destroyed, and I can create a hierarchy. I did it like this:
#include <cstddef>
#include <iostream>
class Foo {
public:
static Foo createOnStack() {
return {};
}
~Foo () {
std::cout << "Destructed " << --i << std::endl;
}
protected:
static int i;
Foo () {
std::cout << "Created " << i++ << std::endl;
}
Foo (const Foo &) = delete;
};
int Foo::i = 0;
The constructor normally should push the hierarchy stack, and the destructor pops it. I replaced it here for proof of concept. Now, the only way you can use such an object is by storing it in a temporary reference like this:
int main() {
Foo && a = Foo::createOnStack();
const Foo& b = Foo::createOnStack();
return 0;
}
My question now is, how safe is this with the C++ standard? Is there still a way to legally create a Foo on the heap or hand it down from your function into another frame (aka return it from your function) without running into undefined behaviour?
EDIT: link to example https://ideone.com/M0I1NI
Leaving aside the protected backdoor, C++17 copy elision breaks this in two ways:
#include<iostream>
#include<memory>
struct S {
static S make() {return {};}
S(const S&)=delete;
~S() {std::cout << '-' << this << std::endl;}
private:
S() {std::cout << '+' << this << std::endl;}
};
S reorder() {
S &&local=S::make();
return S::make();
}
int main() {
auto p=new S(S::make()),q=new S(S::make()); // #1
delete p; delete q;
reorder(); // #2
}
The use of new is obvious and has been discussed.
C++17 also allows prvalues to propagate through stack frames, which means that a local can get created before a return value and get destroyed while that return value is alive.
Note that the second case already existed (formally in C++14 and informally long before) in the case where local is of type S but the return value is some other (movable) type. You can't assume in general that even automatic object lifetimes nest properly.