I am making a std::vector of callback std::functions, and I'm having a little trouble understanding the captures. They seem to be going out of scope when I try to use them if I capture by reference. If I capture by value, everything works.
The code that uses these callback functions expects a certain signature, so assuming I can't modify the code that's using these, I need to stick with capture variables instead of passing things as function arguments.
When is localVar being captured? Is it when the lambda is defined, or when it is called? Does the answer change depending on whether I capture by value or reference?
Here's a little example that I would like to understand:
#include <iostream>
#include <functional>
#include <vector>
int main(int argc, char **argv)
{
int n(5);
// make a vector of lambda functions
std::vector<std::function<const int(void)> > fs;
for(size_t i = 0; i < n; ++i){
int localVar = i;
auto my_lambda = [&localVar]()->int // change &localVar to localVar and it works
{
return localVar+100;
};
fs.push_back(my_lambda);
}
// use the vector of lambda functions
for(size_t i = 0; i < n; ++i){
std::cout << fs[i]() << "\n";
}
return 0;
}
The reference is captured when you create the lambda. The value of the referred-to object is never captured. When you call the lambda, it uses the reference to read the referred-to object's current value each time you use it (just like any other reference). If you use the reference after the referred-to object ceases to exist, you are using a dangling reference, which is undefined behavior.
In this case, auto my_lambda = [&localVar]()->int creates a lambda with a reference named localVar to the local variable localVar.
std::cout << fs[i]() << "\n"; calls one of the lambdas. However, when the lambda executes return localVar+100;, it tries to use the reference localVar to the local variable localVar (local to the first for loop), but that local variable no longer exists. You have undefined behavior.
If you drop the ampersand and take localVar by value (auto my_lambda = [localVar]()->int), you will instead capture a copy of the value as it is at the moment the lambda is created. Since it's a copy, it doesn't matter what happens to the original localVar.
You can read about this at http://en.cppreference.com/w/cpp/language/lambda#Lambda_capture
They seem to be going out of scope when I try to use them if I capture by reference
That's right. You created a lambda that encapsulates a reference to a local variable. The variable went out of scope, leaving that reference dangling. This is no different to any other reference.
Capturing "happens" at the point where you define the lambda — that is the purpose of it! If it occurred later, when you call the lambda (which time?), the things you wanted to capture would be long gone, or at least unreachable.
Capturing allows us to "save" things that we can name now, for later. But if you capture by reference, you'd better ensure the thing referred-to still exists when you come to use that reference.
I searched everywhere but could not find an answer to my question. I am trying to write an example that shows that capturing a local variable of the enclosing function by reference is dangerous because it may not exist anymore when it is actually referenced. Here's my example:
#include <functional>
#include <iostream>
std::function<int (int)> test2(int l) {
int k = 10;
return [&] (int y) { return ++k + 100; };
}
void test(std::function<int (int)> k) {
std::cout << k(100);
}
int main() {
test(test2(100));
std::function<int (int)> func = test2(100);
test(func);
return 0;
}
I tried to reproduce stack corruption from trying to access and modify a local variable that doesn't exist on the stack frame by returning a lambda function from test2 that captures a local variable k and modifies it.
std::function<int (int)> func = test2(100);
test(func);
prints out a garbage value which indicates something went wrong as expected. However,
test(test2(100));
prints out "111". This is confusing to me as I thought when test2(100) returns a lambda function of type std::function, the stack frame for test2 will be gone, and when test is invoked, it should not be able to access the value of k. I'd appreciate any ideas or keywords I can use to search for answers.
I have run your test on my machine and the results are, as expected, total garbage in both cases. Getting a correct-looking answer once in a while in this situation is very misleading. A dangling reference or pointer may still appear to yield the old value as long as the memory it points to has not yet been overwritten by something else.
In a nutshell, C++ lambdas do not extend the lifetime of objects captured by reference or by pointer when the stack frame that holds them unwinds. The same applies to capturing the this pointer of a class: if the object is destroyed, any use of this-> inside the lambda is undefined behavior.
I am trying to figure out why the following code does not cause an overload ambiguity:
float foo2(float& i)
{
cout << "call from reference" << endl;
return i;
}
float foo2(float i)
{
cout << "call from non reference"<<endl;
return i;
}
int main()
{
cout<<foo2(2); // print "call from non reference"
}
The foo2 whose parameter is not passed by reference is called. Why? How can I call the foo2 that takes its parameter by reference?
The foo2 whose parameter is not passed by reference is called. Why?
Because you cannot pass a constant or any other computed expression to a function that takes a non-const reference. To bind a non-const reference you need an lvalue: roughly, something that can appear on the left-hand side of an assignment expression. Since 2 cannot appear on the left side of an assignment, it cannot be used to call the overload that expects a float&. When the reference is const, you can pass anything convertible to the parameter type, because C++ will create a temporary object, initialize it from the expression, and bind the const reference to that temporary.
How can I call the foo2 that takes its parameter by reference?
There is no obvious way of doing that, because the moment you pass a variable or another expression that can become a reference, the compiler will complain that you are making an ambiguous call:
float f;
foo2(f); // <<== This will not compile
There is a way to call it, though: you can make a function pointer that matches only one of the two function signatures, and use it to make your call:
typedef float (*fptr_with_ref)(float&);
int main()
{
cout<<foo2(2) << endl; // print "call from non reference"
fptr_with_ref foo2ptr(foo2); // Only one overload matches
float n = 10;
cout<<foo2ptr(n) << endl; // Calls foo2(float&)
}
The 2 you gave as an argument to foo2 is an rvalue, which cannot bind to a non-const reference. Thus the overload accepting float& cannot be called with it.
Your question is in C++ terms, not in design terms. C++ supports things in a way because it makes sense in terms of design. C++ allows you to do a lot of things, but if you want to make good software with it you cannot stick to mere literal C++ rules, you need to go further. In practice that means you end up making your own rules.
In your case - well, I'd never make the reference variant if I did not actually change the variable in the function.
If you adopt such 'rules' for yourself, then you immediately see why the ref function doesn't bind: what is the point, in the first place, to change a loose constant?
If you make the right rules, you'll see that they are beautifully supported by C++. Like ... function changes an object: pass non-const ref. No change? const ref. Optional? const pointer. Take over mem management? non-const pointer.
Note that that is just the beginning, especially when multi-threading comes into play. You have to add things to the 'contract' of the function. Example for const ref: must the object stay 'alive' after the call? Can the object change during the call? And so forth.
I am passing my local variables by reference to two lambdas. I call these lambdas outside of the function scope. Is this undefined?
std::pair<std::function<int()>, std::function<int()>> addSome() {
int a = 0, b = 0;
return std::make_pair([&a,&b] {
++a; ++b;
return a+b;
}, [&a, &b] {
return a;
});
}
int main() {
auto f = addSome();
std::cout << f.first() << " " << f.second();
return 0;
}
If it is not undefined, then why are changes made in one lambda not reflected in the other lambda?
Am I misunderstanding pass-by-reference in the context of lambdas?
I am writing to the variables and it seems to work fine, with no runtime errors, and with output 2 0. If it worked, I would expect output 2 1.
Yes, this causes undefined behavior. The lambdas will reference stack-allocated objects that have gone out of scope. (Technically, as I understand it, the behavior is defined until the lambdas access a and/or b. If you never invoke the returned lambdas then there is no UB.)
This is undefined behavior the same way that it's undefined behavior to return a reference to a stack-allocated local and then use that reference after the local goes out of scope, except that in this case it's being obfuscated a bit by the lambda.
Further, note that before C++17 the order in which the lambdas are invoked is unspecified: the compiler is free to invoke f.second() before f.first() because both are part of the same full-expression. (Since C++17, the operands of << are evaluated left to right, so f.first() is called first.) Therefore, even if we fix the undefined behavior caused by using references to destroyed objects, both 2 0 and 2 1 are valid outputs of this program under the older rules, and which one you get depends on the order in which your compiler decides to execute the lambdas. Note that this is not undefined behavior: the compiler cannot do just anything; it simply has some freedom in deciding the order in which to do some things.
(Keep in mind that each << in your main() function invokes an overloaded operator<< function, and the order in which function arguments are evaluated is unspecified. Before C++17 this applied to overloaded operators as well, so compilers were free to emit code that evaluates all of the function arguments within the same full-expression in any order, with the constraint that all arguments to a function must be evaluated before that function is invoked.)
To fix the first problem, use std::shared_ptr to create a reference-counted object. Capture this shared pointer by value, and the lambdas will keep the pointed-to object alive as long as they (and any copies thereof) exist. This heap-allocated object is where we will store the shared state of a and b.
To fix the second problem, evaluate each lambda in a separate statement.
Here is your code rewritten with the undefined behavior fixed, and with f.first() guaranteed to be invoked before f.second():
std::pair<std::function<int()>, std::function<int()>> addSome() {
// We store the "a" and "b" ints instead in a shared_ptr containing a pair.
auto numbers = std::make_shared<std::pair<int, int>>(0, 0);
// a becomes numbers->first
// b becomes numbers->second
// And we capture the shared_ptr by value.
return std::make_pair(
[numbers] {
++numbers->first;
++numbers->second;
return numbers->first + numbers->second;
},
[numbers] {
return numbers->first;
}
);
}
int main() {
auto f = addSome();
// We break apart the output into two statements to guarantee that f.first()
// is evaluated prior to f.second().
std::cout << f.first();
std::cout << " " << f.second();
return 0;
}
Unfortunately C++ lambdas can capture by reference but don't solve the "upwards funarg problem".
Doing so would require allocating captured locals in "cells" and garbage collection or reference counting for deallocation. C++ does not do this, and unfortunately this makes C++ lambdas a lot less useful and more dangerous than closures in other languages like Lisp, Python or JavaScript.
More specifically in my experience you should avoid at all costs implicit capture by reference (i.e. using the [&](…){…} form) for lambda objects that survive the local scope because that's a recipe for random segfaults later during maintenance.
Always plan carefully about what to capture and how and about the lifetime of captured references.
Of course, it is safe to capture everything by reference with [&] if the lambda is only used within the same scope, for example to pass code to algorithms like std::sort without defining a named comparator outside the function, or as a locally used utility function. I find this use very readable and nice: you get a lot of context implicitly, and there is no need to (1) make up a global name for something that will never be reused anywhere else, or (2) pass a lot of context or create extra classes just for that context.
An approach that can work sometimes is capturing by value a shared_ptr to a heap-allocated state. This is basically implementing by hand what Python does automatically (but pay attention to reference cycles to avoid memory leaks: Python has a garbage collector, C++ doesn't).
When you are going out of scope, make a copy of the locals you use with capture by value ([=]):
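// MyType is not defined in the original; std::function<int()> (from <functional>) is assumed here.
std::function<int()> func()
{
    int x = 5;
    // When called, local x will no longer be in scope, so use capture by value.
    // mutable is needed because by-value captures are const by default.
    return [=]() mutable {
        x += 2;
        return x;
    };
}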
When you are in the same scope, better to use capture by reference ([&]):
void func(void)
{
int x = 5;
//When called, local x will still be in scope; safe to use capture by reference.
([&] {
x += 2;
})(); //Lambda is immediately invoked here, in the same scope as x, with ().
}
Why are captured-by-value values const, but captured-by-reference objects not:
int a;
auto compile_error = [=]()
{
    a = 1; // error: a by-value capture is const inside a non-mutable lambda
};
auto compiles_ok = [&]()
{
    a = 1;
};
To me this seems illogical, but it seems to be the standard? Especially as the unwanted modification of a captured value may be an annoying bug, but chances are high that the consequences are limited to lambda scope, whereas unwanted modification of objects captured by reference will often lead to more serious effects.
So why not capture by const reference per default? Or at least support [const &] and [&]? What are the reasons for this design?
As a workaround, are you supposed to capture std::cref-wrapped const references by value?
Let's say you are capturing a pointer by value. The pointer itself is const, but access to the object it points to is not.
int i = 0;
int* p = &i;
auto l = [=]{ ++*p; };
l();
std::cout << i << std::endl; // outputs 1
This lambda is equivalent to:
struct lambda {
int* p;
lambda(int* p_) : p(p_) {}
void operator()() const { ++*p; }
};
The const on the operator()() makes usage of p equivalent to declaring it as:
int* const p;
Similar thing happens with a reference. The reference itself is "const" (in quotes because references cannot be reseated), but access to the object it refers to is not.
Captured references are also const. Or rather, references are always implicitly const -- there is no syntax in the language that allows you to change where a reference points to. a = 1; when a is a reference is not changing the reference, but changing the thing that the reference references.
When you talk about "const reference", I think you are confused. You are talking about "reference to const int" (const int &). The "const" there refers to the thing the reference points to, not the reference itself. It's analogous with pointers: with "pointer to const int" (const int *), the pointer itself is not const -- you can assign to a variable of this type all you want. A real "const pointer" would be int *const. Here, you cannot assign to something of this type; but you can modify the int it points to. Hence, the "const" for the pointer or reference is separate from the "const" for the thing it points to. You can also have a "const pointer to const int": const int *const.
My logic says the following: lambdas are just a piece of code, with optional needed references. In the case when you need to actually copy something (which usually happens for memory management purposes, such as copying a shared_ptr), you still don't really want the lambda to have its own state. That's quite an unusual situation.
I think only the 2 following options feel "right"
Enclose some code using some local variables, so that you can "pass it around"
The same as above, only add memory management, because maybe you want to execute that piece of code asynchronously or something, and the creating scope will disappear.
But when a lambda is "mutable", i.e., it captures values which aren't const, this means that it actually supports its own state. Meaning, every time you call a lambda, it could yield a different result, which isn't based on its actual closure, but again, on its internal state, which is kind of counter-intuitive in my book, considering that lambdas originate in functional languages.
However, C++, being C++, gives you a way to bypass that limitation, by also making it a bit uglier, just to make sure you're aware of the fact you're doing something strange.
I hope this resonates with you.
Is the following function safe in C++03 or C++11 or does it exhibit UB?
string const &min(string const &a, string const &b) {
return a < b ? a : b;
}
int main() {
cout << min("A", "B");
}
Is it OK to return a reference to an object passed to the function by reference?
Is it guaranteed that the temporary string object is not destroyed too soon?
Is there any chance that the given function min could exhibit UB (if it does not in the given context)?
Is it possible to make an equivalent, but safe function while still avoiding copying or moving?
Is it OK to return a reference to an object passed to the function by reference?
As long as the object isn't destroyed before you access it via that reference, yes.
Is it guaranteed that the temporary string object is not destroyed too soon?
In this case, yes. A temporary lasts until the end of the full expression which creates it, so it is not destroyed until after being streamed to cout.
Is there any chance that the given function min could exhibit UB (if it does not in the given context)?
Yes, here is an example:
auto const & r = min("A", "B"); // r is a reference to one of the temporaries
cout << r; // Whoops! Both temporaries have been destroyed
Is it possible to make an equivalent, but safe function while still avoiding copying or moving?
I don't think so; but this function is safe as long as you don't keep hold of a reference to its result.
Your temporary objects stay alive until the semicolon that ends the cout statement in main, so this way of using it is safe.
Yes, it's safe. The string temporaries for both "A" and "B" survive until the end of the full-expression, which here is the semicolon.
Is it guaranteed that the temporary string object is not destroyed too soon
For your specific case, yes, but for the following code, no:
int main() {
const string &tempString(min("A", "B"));
cout << tempString;
}
Besides that, I agree with what "Mike Seymour" said.
Yes, it is safe to return the reference here: in the code above, the string temporaries for both "A" and "B" survive until the end of the full-expression, that is, the semicolon. Note the general difference between the two passing styles: with pass-by-value, the function works on a copy of the argument, so changes do not affect the original, whereas with pass-by-reference, changes are made to the original object.