According to the Rust book, the following code triggers the "closure may outlive the current function" error:
use std::thread;
fn main() {
let x = 1;
thread::spawn(|| {
println!("x is {}", x);
});
}
It is hard to picture when and how a closure would outlive the current function. Can you provide examples or a specification?
Because the closure is moved into a thread, and threads may outlive the current function (they are not automatically joined at the end of the function; the crossbeam crate offers that kind of feature), the situation is the same as moving the closure onto the heap.
If you look at the following piece of code, you can see that moving a closure that borrows a local onto the heap and returning it is forbidden. Threads follow the same borrowing rules, so a spawned thread cannot hold references to the locals of the function that spawned it.
fn foo() -> Box<dyn FnOnce()> {
let x = 1;
Box::new(|| {
println!("x is {}", x);
})
}
fn main() {
let f = foo();
}
Note that the compiler suggests a solution to the problem in the error message:
help: to force the closure to take ownership of `x` (and any other referenced variables), use the `move` keyword, as shown:
| Box::new(move || {
This is the most obscure problem I have ever had, which makes it hard to ask about. I'll try to make the question as clear as possible by posting minimal code and providing some context. Here is the code where the problem occurs:
// Before lambda, pointer variable in question is fine
menu->onSelect = [=] ()
{
window->pushCallback('e', [=]()
{
// Here the pointer captured changes, causing a segfault later on
});
};
So now for some context:
The signature for pushCallback looks like this:
void Window::pushCallback(int key, std::function<void()> callback) {}
It stores a function to call when the key code denoted by the first argument is called, in this case 'e'.
Menu::onSelect is also a std::function, storing a single function to execute when a menu item is chosen.
What makes this problem obscure is that the pointer only changes the second time the code is stepped through. Note that this is an interactive ncurses program. After much testing I have found that the pointer changes between the two comment lines shown above, which suggests that the change happens when the variable is captured by the nested lambdas. I have also made the pointer const in all classes that refer to it.
You cannot use automatic member capture in nested lambdas like that because the inner lambda's capture isn't performed until the outer lambda is executed. I believe that there are some differences between pre-C++14 and C++14/17, but I don't remember them offhand. I'll look into it and update my answer.
Also, I vaguely remember running into some differences between G++ and Clang, but that had to do with the compiler complaining, not with implementation differences that would cause a segfault. Again, I'll update once I can remember.
Workaround: explicitly capture the variable, or save it in a temporary variable.
std::shared_ptr<int> ptr; // This assumes that you are talking about shared pointers since I don't think you would have this problem with raw pointers unless the object is being destroyed before your lambda is being invoked
auto x = [ptr]()
{
// You could also create a temporary variable here to avoid ambiguity
// std::shared_ptr<int> ptr2;
return [ptr]() { return ptr; };
};
I verified that my original hypothesis is wrong. Shared pointers work properly in this case (but make sure you're not capturing by reference, or all bets are off).
#include <iostream>
#include <memory>
auto foo(std::shared_ptr<int> p)
{
return [=](int x) { return [=](int y) { return *p + x + y; }; };
}
int main()
{
auto p = std::make_shared<int>(42);
auto func = foo(p);
std::cout << func(1)(2) << std::endl;
++(*p);
std::cout << func(1)(2) << std::endl;
return 0;
}
I recommend looking at the error in gdb and possibly setting a hardware break on the pointer if you are concerned that it is being changed.
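A minimal sketch of how nested by-value captures behave with a raw pointer, for comparison with the question's code. The globals `on_select` and `on_key` and the functions `install_callbacks` and `pointer_survives_nesting` are hypothetical stand-ins, not part of the original program. The outer closure copies the pointer when it is created, and the inner closure copies it again from the outer closure when the outer one runs:

```cpp
#include <functional>

// Hypothetical stand-ins for Menu::onSelect and the stored key callback.
std::function<void()> on_select;
std::function<int*()> on_key;

// Stores nested lambdas that both capture `target` by value.
void install_callbacks(int* target) {
    on_select = [=]() {
        // `target` here is the copy stored inside the outer closure.
        on_key = [=]() { return target; };
    };
}

// Returns true if the pointer seen by the inner lambda equals the original.
bool pointer_survives_nesting() {
    int value = 7;
    int* original = &value;
    install_callbacks(original);
    on_select();  // the inner capture happens here, not earlier
    return on_key() == original;
}
```

If the captured pointer still appears to change in a real program, a watchpoint on the closure's storage may show what overwrites it.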
I am writing a transpiler for educational purposes.
My transpiler transpiles from my language to C.
I am now writing the closure syntax analyzer and code generation component.
I have seen people say that closures in C++ are actually transformed into unnamed structure types with the captured values as member variables.
Here is the reference.
This code
int c = 10;
auto closure = [=] () -> void {
std::cout << c << std::endl;
};
is, so they say, transformed into something like this under the hood.
struct UNNAMED_TYPE_0 {
int c;
void operator() () const {
std::cout << c << std::endl;
}
};
// assume the closure is initialized and variables are assigned
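By the same reasoning, a by-reference capture can be pictured as a struct with a reference member. A rough sketch, where `ByRefClosure` and `run_both` are hypothetical names; the real closure type is unnamed and its exact layout is left to the compiler:

```cpp
// Hand-written stand-in for how `[&c]() { c = 100; }` behaves.
struct ByRefClosure {
    int& c;                     // refers to the caller's variable
    void operator()() const {   // a const call operator can still write through a reference
        c = 100;
    }
};

// Runs the hand-written closure and a real lambda on separate variables.
int run_both() {
    int a = 10;
    int b = 10;
    ByRefClosure manual{a};
    auto lambda = [&b]() { b = 100; };
    manual();
    lambda();
    return a + b;  // both variables were changed through their references
}
```

Because the struct only stores a reference, it is valid only for as long as the referenced variable is alive.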
If someone wants to mutate that int c when the closure executes, they have to capture the variable by reference: [&c] () -> void { /* mutation comes here */ }. But there is a problem if we declare int c inside a function and create the closure inside that function, like this:
function<void()> aFunction() {
int c = 10;
auto closure = [&c] () -> void { c = 100; };
return closure;
}
aFunction() ();
int c is captured by reference, but as soon as aFunction's stack frame is destroyed, int c is destroyed as well. Writing through the dangling reference is undefined behavior; if we are lucky, it crashes with a segmentation fault (core dumped).
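One way to avoid the dangling reference in C++ is to allocate the variable on the heap and capture a `std::shared_ptr` by value. A minimal sketch under that assumption; `make_setter` and `value_after_call` are hypothetical names introduced for illustration:

```cpp
#include <functional>
#include <memory>
#include <utility>

// Returns a setter closure plus a handle for observing the shared variable.
std::pair<std::function<void()>, std::shared_ptr<int>> make_setter() {
    auto c = std::make_shared<int>(10);  // lives on the heap, not in this frame
    std::function<void()> closure = [c]() { *c = 100; };
    return {closure, c};
}

// Calls the closure after make_setter has returned and reads the result.
int value_after_call() {
    auto result = make_setter();
    result.first();          // safe: the closure keeps the int alive
    return *result.second;
}
```

The int is freed when the last `shared_ptr` copy, including the one inside the closure, is destroyed.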
In Java,
// suppose this callback interface exists
public interface VoidCallback {
public void method();
}
public void aMethod() {
int c = 10;
VoidCallback callback = () -> c = 10; /* this gives an error */
// local variables referenced from a lambda expression must be final or effectively final
}
Java handles closures like this and ensures that captured variables (implicit captures, that is) are never mutated. In other words, Java gives the closure a copy of each captured variable rather than a reference. For reference or class types, only the object pointer is copied. Although the pointer itself cannot be changed, you can still mutate the contents of the object it points to. This is basically the same as the former case.
In Objective-C,
__block int c = 0;
// they say, this `int c` is allocated in heap instead of stack
// so that it exists until the closure is finished executing.
void (^ closure) (void) = ^void() {
c = 10; // this is valid
// this actually changed the `int c`'s value
};
In Swift
var a : Int = 10;
var closure = { [] () -> Void in
a = 10; // this is valid by default
// unless `var a` is declared as `let a`
};
So this means that Objective-C and Swift store captured primitives behind pointers, so that they can be mutated.
P.S.: Please note that a Swift closure capture list is only for class or reference types; I mean implicit captures of primitive types here.
This
__block int c = 0;
// they say, this `int c` is allocated in heap instead of stack
// so that it exists until the closure is finished executing.
void (^ closure) (void) = ^void() {
c = 10; // this is valid
// this actually changed the `int c`'s value
};
is basically the same as this
int * c = malloc(sizeof(int));
*c = 0;
void (^ closure) (void) = ^void() {
*c = 10;
if (c) {
free(c);
c = NULL;
}
};
Freeing the variable as soon as the closure has run is a bad idea, I think.
What if there are many closures that point to the variable and mutate it when executed?
What if those closures are passed around or executed on different threads?
I came up with a solution based on reference counting.
When a closure that mutates the variable is created, the variable is retained.
When a closure that mutates the variable is destroyed, the variable is released.
When no closure refers to the variable any more, the variable is truly deallocated.
To ensure thread safety, I would lock and unlock the counter whenever the closures update the reference count.
If there is another technique, please guide me.
An explanation in any language is greatly appreciated.
Currently, I have zero knowledge of assembly language.
For moderators:
as this question is a kind of research, I beg you not to flag it as too broad.
The following strikes me: "I am writing a Transpiler for educational purpose. My transpiler transpiles from my language to C language." That means the specification of your language defines how closures are supposed to behave. We can't tell you how your language is supposed to operate.
Now, you have already found a bunch of options:
C++ doesn't do anything special to the local variable. If you keep a reference to it and use that reference after the variable has gone out of scope, bad luck. That is the C++ spirit: it puts no overhead on you, but it lets you shoot yourself in the foot if you don't pay attention.
Java checks the code and gives you an error if you try to do anything it cannot guarantee to be valid. It doesn't allow you to shoot yourself in the foot even if you badly want to.
The other languages seem to convert the local variable with limited scope to a heap-based one. I'm not sure about their object model, but in Python, for example, you don't have anything similar to C++ local variables or Java primitive types at all, just as you don't have deterministic destructor calls (you get something similar using with there, just for completeness), so this doesn't make any difference. These languages impose an overhead on you in order to guarantee that you don't have any dangling references (perhaps even if you don't really need it).
Now, the first thing is to decide which one best fits your language's object model. Only then does the question of how best to implement it come up. Concerning the implementation, there are a bunch of different approaches. Using a reference counter is one (implemented with lock-free atomic operations, though), using a linked list is another, and using a garbage collector is a third.
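The reference-counter option mentioned above could look roughly like the following sketch, which uses an atomic counter instead of a lock. Every name here (`BoxedInt`, `box_new`, `retain`, `release`, `SetClosure`, `demo_refcount`) is hypothetical; it illustrates the idea and is not what any particular compiler emits:

```cpp
#include <atomic>

// A heap-allocated captured variable with an intrusive reference count.
struct BoxedInt {
    int value;
    std::atomic<int> refs{1};  // the creator holds the first reference
    explicit BoxedInt(int v) : value(v) {}
};

BoxedInt* box_new(int initial) { return new BoxedInt(initial); }

void retain(BoxedInt* b) { b->refs.fetch_add(1, std::memory_order_relaxed); }

// Drops one reference; returns true if this call freed the box.
bool release(BoxedInt* b) {
    if (b->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete b;
        return true;
    }
    return false;
}

// Hand-written closure of the kind a transpiler might generate.
struct SetClosure {
    BoxedInt* c;
    explicit SetClosure(BoxedInt* box) : c(box) { retain(c); }
    SetClosure(const SetClosure& other) : c(other.c) { retain(c); }
    SetClosure& operator=(const SetClosure&) = delete;
    ~SetClosure() { release(c); }
    void operator()() const { c->value = 10; }
};

int demo_refcount() {
    BoxedInt* c = box_new(0);  // refs == 1
    int seen = 0;
    {
        SetClosure f(c);       // refs == 2
        SetClosure g(f);       // refs == 3
        g();
        seen = c->value;
    }                          // both closures destroyed, refs == 1
    bool freed = release(c);   // the enclosing scope drops its reference
    return freed ? seen : -1;
}
```

The counter only protects the variable's lifetime. Closures that write to `value` from several threads would still need their own synchronization.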
I have a piece of code which calls an async rdma-write. The rdma API receives a void* context which I would like to use to pass a callback to be called when the operation finishes.
void invoke_async_operation(... some stuff to capture ...) {
...
MyCallbackType* my_callback = // Create callback somehow
rdma_post_write(..., my_callback);
...
}
void on_complete(void* context) {
(*(MyCallbackType*)context)();
}
I thought using a lambda here would be best, because it can easily capture all the context required for the later callback invocation. However, I saw in "What is the lifetime of a C++ lambda expression?" that a lambda's lifetime is limited to the scope where it was defined.
Note that I can't copy the lambda, because context is a pointer.
What is the correct approach here? Should I insist on using lambdas and prolong their lifetime somehow, or is there a better way? Thanks.
Lifetime of a lambda
The object that represents the lambda expression, and through which it is invoked, does indeed obey the usual scoping rules.
However, this object can be copied (e.g. by passing it as an argument to a function or a constructor, by assigning it to a global, or whatever else you want to do), so that the lambda can be invoked at any later point, even after the scope in which it was initially defined has been left.
Precisely because lambdas can survive this long, you can find quite a few questions, blogs, and books that advise careful use of lambda captures, especially captures by reference, because the lambda itself (and not its anonymous proxy object) can be called even after the referenced objects have been destroyed.
Your callback issue
You are constrained in your design by the use of an OS callback that can only convey the raw pointer that was passed to it when the callback was set up.
One way to approach this is to use a std::function object from the standard <functional> library. Here is a small function to show you how it works:
function<void()>* preparatory_work() {
auto l = [](){ cout<< "My lambda is fine !" <<endl; } ; // lambda
function<void ()> f = l; // functor
auto p = new function<void()>(l); // a functor on the heap
l(); // invoke the lambda object
f(); // invoke the functor
(*p)(); // invoke the functor via a pointer
return p;
}
Function objects are as handy to use as any other object and as easy to declare as function pointers. They are, however, much more powerful than function pointers, because they can refer to basically any callable object.
As you can see in the example above, I allocated a function object with new and returned a pointer to it. So you could indeed invoke this function later:
int main() {
auto fcp = preparatory_work(); // fcp is a pointer
(*fcp)();
// and even with casting as you would like
void *x = (void*)fcp;
(*(function<void()>*)x)(); // YES !!!
}
Here an online demo
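The example above never deletes the heap-allocated function object. A minimal sketch of one way to manage its lifetime, assuming the completion hook runs exactly once per operation: the hook retakes ownership and frees the callback after running it. `on_complete` mirrors the question; `make_context` and `round_trip` are hypothetical names:

```cpp
#include <functional>
#include <memory>

using Callback = std::function<void()>;

// Completion hook that receives the raw context pointer from the C-style API.
void on_complete(void* context) {
    // Retake ownership so the callback is freed after it has run.
    std::unique_ptr<Callback> cb(static_cast<Callback*>(context));
    (*cb)();
}

// Builds a callback on the heap and hands out a raw pointer for the C-style API.
void* make_context(int* result, int value) {
    auto cb = std::make_unique<Callback>([result, value]() { *result = value; });
    return cb.release();  // ownership passes to whoever calls on_complete
}

// Simulates the library invoking the hook with the stored context.
int round_trip() {
    int result = 0;
    void* context = make_context(&result, 42);
    on_complete(context);  // in real code the library would trigger this later
    return result;
}
```

If the operation can fail without ever invoking the hook, the callback would leak, so that path needs its own cleanup.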
Imagine the following code:
void async(connection *, std::function<void(void)>);
void work()
{
auto o = std::make_shared<O>();
async(&o->member, [] { do_something_else(); } );
}
async will, for example, start a thread that uses the member of o that was passed as a pointer. But written like this, o goes out of scope right after async() has been called, so it is deleted, and so is member.
How can this be solved correctly and nicely(!)?
Apparently one solution is to add o to the capture list. Captures are guaranteed not to be optimized out even if they are not used.
async(&o->member, [o] { do_something_else(); } );
However, recent compilers (clang-5.0) include -Wunused-lambda-capture in the -Wextra collection, and this case produces the unused-lambda-capture warning.
I added (void) o; inside the lambda, which silences this warning.
async(&o->member, [o] {
(void) o;
do_something_else();
});
Is there a more elegant way to solve this problem of scope?
(The origin of this problem is derived from using write_async of boost::asio)
Boost.Asio seems to suggest using enable_shared_from_this to keep whatever owns the "connection" alive while there are operations pending that use it. For example:
class task : public std::enable_shared_from_this<task> {
public:
static std::shared_ptr<task> make() {
return std::shared_ptr<task>(new task());
}
void schedule() {
async(&conn, [t = shared_from_this()]() { t->run(); });
}
private:
task() = default;
void run() {
// whatever
}
connection conn;
};
Then to use task:
auto t = task::make();
t->schedule();
This seems like a good idea, as it encapsulates all the logic for scheduling and executing a task within the task itself.
I suggest that your async function is not optimally designed. If async invokes the function at some arbitrary point in the future, and it requires that the connection be alive at that time, then I see two possibilities. You could make whatever owns the logic that underlies async also own the connection. For example:
class task_manager {
void async(connection*, std::function<void ()> f);
connection* get_connection(size_t index);
};
This way, the connection will always be alive when async is called.
Alternatively, you could have async take a unique_ptr<connection> or shared_ptr<connection>:
void async(std::shared_ptr<connection>, std::function<void ()> f);
This is better than capturing the owner of connection in the closure, which may have unforeseen side-effects (including that async may expect the connection to stay alive after the function object has been invoked and destroyed).
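If the connection is a member of a shared object, as in the question, the aliasing constructor of `std::shared_ptr` can produce a `shared_ptr<connection>` that points at the member while sharing ownership of the whole object. A minimal sketch, assuming an async function that accepts such a pointer; `start_async`, `pending_op`, and `member_outlives_owner` are hypothetical names, and the `connection` and `O` definitions are placeholders:

```cpp
#include <functional>
#include <memory>
#include <utility>

struct connection { int id = 0; };
struct O { connection member; };

// Hypothetical pending operation: holds the connection and callback for later.
struct pending_op {
    std::shared_ptr<connection> conn;
    std::function<void()> callback;
};

// Hypothetical async entry point that takes shared ownership of the connection.
pending_op start_async(std::shared_ptr<connection> conn, std::function<void()> f) {
    return pending_op{std::move(conn), std::move(f)};
}

// Returns true if the member is still usable after its owner left scope.
bool member_outlives_owner() {
    pending_op op;
    {
        auto o = std::make_shared<O>();
        o->member.id = 5;
        // Aliasing constructor: points at o->member, shares ownership of *o.
        op = start_async(std::shared_ptr<connection>(o, &o->member), [] {});
    }  // o goes out of scope, but op.conn still keeps the O alive
    return op.conn->id == 5;
}
```

With this approach the lambda no longer needs a dummy capture, so the unused-capture warning does not arise.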
Not a great answer, but...
It doesn't seem like there's necessarily a "better"/"cleaner" solution, although I'd suggest a more "self descriptive" solution might be to create a functor for the thread operation which explicitly binds the member function and the shared_ptr instance inside it. Using a dummy lambda capture doesn't necessarily capture the intent, and someone might come along later and "optimize" it to a bad end. Admittedly, though, the syntax for binding a functor with a shared_ptr is somewhat more complex.
My 2c, anyway (and I've done similar to my suggestion, for reference).
A solution I've used in a project of mine is to derive the class from enable_shared_from_this and let it leak during the asynchronous call through a data member that stores a copy of the shared pointer.
See Resource class for further details and in particular member methods leak and reset.
Once cleaned up it looks like the following minimal example:
#include<memory>
struct S: std::enable_shared_from_this<S> {
void leak() {
ref = this->shared_from_this();
}
void reset() {
ref.reset();
}
private:
std::shared_ptr<S> ref;
};
int main() {
auto ptr = std::make_shared<S>();
ptr->leak();
// do whatever you want and notify who
// is in charge to reset ptr through
// ptr->reset();
}
The main risk is that if you never reset the internal pointer, you'll have an actual leak. In my case it was easy to deal with, because the underlying library requires a resource to be explicitly closed before it is discarded, and I reset the pointer when it is closed. Until then, living resources can be retrieved through a dedicated function (the walk member function of the Loop class, again a mapping to something offered by the underlying library), and one can still close them at any time, so leaks are completely avoided.
In your case you must find your own way to avoid the problem, and that could be difficult, but it mostly depends on the actual code, so I cannot say.
A possible drawback is that you are forced to create your objects in dynamic storage through a shared pointer; otherwise the whole thing breaks and doesn't work.
I have a program, where I cannot use the standard std::async and threading mechanisms. Instead I have to code the program like so:
void processor( int argument, std::function<void(int)> callback ) {
int blub = 0;
std::shared_ptr<object> objptr = getObject();
// Function is called later.
// All the internal references are bound here!
auto func = [=, &blub]() {
// !This will fail since blub is accessed by reference!
blub *= 2;
// Since objptr is copied by value it works.
// objptr holds the value of getObject().
objptr->addSomething(blub);
// Finally we need to call another callback to return a value
callback(blub);
};
objptr = getAnotherObject();
// Puts func onto a queue and returns immediately.
// func is executed later.
startProcessing(func);
}
I now would like to know whether I am doing it right or what the best way of using lambdas as asynchronous callbacks is.
EDIT: Added expected behavior to the code comments.
See answer/comments for possible solutions for the problem with blub.
The function object will contain a reference to the local variable blub. As in every other situation in the language, this won't keep the local variable alive after the function ends.
Copies of all the other captured objects will be stored within the function object, since they're captured by value. This means there's no issue with them.
If you want blub to live after the function ends, you cannot tie its lifetime to the function: you need dynamic storage duration. A std::unique_ptr can be used to handle the cleanup of such an object, but it gets a bit annoying because you can't "capture-by-move" into a lambda.
auto blub = std::make_unique<int>(0); // [1]
std::shared_ptr<object> objptr = getObject();
// use std::bind to store the unique_ptr with the lambda
auto func = std::bind([=](std::unique_ptr<int>& blub) {
*blub *= 2;
objptr->addSomething(*blub);
callback(*blub);
}, std::move(blub)); // move the unique_ptr into the function object
objptr = getAnotherObject();
// func is not copyable because it holds a unique_ptr
startProcessing(std::move(func)); // move it
As an added note, the old deprecated std::auto_ptr would actually work fine here, because if the lambda captures it by value it gets copied and its strange copy semantics are exactly what's needed.
1. See GOTW #102 for make_unique.
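Since C++14, an init-capture can move a `unique_ptr` directly into a lambda, which removes the need for the std::bind workaround. A minimal sketch, assuming a `startProcessing` that accepts any callable; the version here is a stand-in that runs the task immediately, and `doubled` is a hypothetical helper:

```cpp
#include <functional>
#include <memory>
#include <utility>

// Stand-in for the question's queue: runs the task immediately.
template <typename Task>
void startProcessing(Task&& task) {
    std::forward<Task>(task)();
}

void processor(int start, std::function<void(int)> callback) {
    auto blub = std::make_unique<int>(start);
    // The init-capture moves the unique_ptr into the closure (C++14).
    auto func = [blub = std::move(blub), callback]() {
        *blub *= 2;        // the pointee is not const, only the pointer is
        callback(*blub);
    };
    startProcessing(std::move(func));  // the closure is move-only
}

// Runs processor and returns the value passed to the callback.
int doubled(int start) {
    int out = 0;
    processor(start, [&out](int v) { out = v; });
    return out;
}
```

A move-only closure cannot be stored in a `std::function`, which requires a copyable target, so a real queue would need a move-only wrapper such as C++23's `std::move_only_function`.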