C++11 lambda cannot access reference - c++

I have a problem with a lambda function in C++: I'm trying to define an asynchronous loader that fills an array of objects given a list of strings as input.
The code looks like this (not exactly, but I hope you get the idea):
void loadData() {
while (we_have_data()) {
std::string str = getNext();
array.resize(array.size() + 1);
element &e = array.back();
tasks.push_back([&, str] () {
std::istringstream iss(str);
iss >> e;
});
}
for (auto task: tasks) {
task();
}
}
When at the end I scan the list of tasks and execute them, the application crashes on the first access to the variable e inside the lambda. If I run inside a debugger, I can find the right values inside the object e itself. I am doing something wrong, but I don't really understand what.

You are holding a dangling reference. When you do
tasks.push_back([&, str] () {
std::istringstream iss(str);
iss >> e;
});
You capture by reference the element returned by array.back(), since a reference to e is really a reference to whatever e refers to. Unfortunately, resize is called on every pass of the while loop, and when the array reallocates, all references previously obtained from back() are invalidated. The lambdas then refer to objects that no longer exist.

The scope of element& e is the while-loop.
After every iteration of the while-loop, you have lambda functions with a captured reference to different e's, which have all gone out-of-scope.

You capture e (that is, array.back()) by reference when creating the lambda. Subsequent resizes of the array (with possible reallocations) leave that reference dangling, which in turn causes an error when you access it. Any attempt (not limited to the lambda) to access elements of the array through a previously obtained reference after the array has been resized (and reallocated) hits the same dangling reference problem.
An alternative: instead of using two loops, why not execute the task immediately in the while loop? That avoids the dangling reference without having to get pointer- or iterator-based alternatives working.
A further alternative: if the elements in the array can be shared, a std::shared_ptr solution could work. The caveat is that the shared_ptr must be captured by value, so that the lambda shares ownership of the element with the array as it is resized.
A sampling of the idea...
void loadData() {
while (we_have_data()) {
std::string str = getNext();
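// Assumes array is a std::vector<std::shared_ptr<element>>.
// resize() would only add a null pointer, so allocate the element explicitly.
array.push_back(std::make_shared<element>());
std::shared_ptr<element> e = array.back();
tasks.push_back([e, str] () {
std::istringstream iss(str);
iss >> *e;
});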
}
for (auto task: tasks) {
task();
}
}

You have two strikes against you here.
Firstly, you are capturing a reference to an element of a vector that is likely to be resized and thus relocated.
Secondly, you are capturing a reference to a local (stack) variable that goes out of scope. Within the loop, the compiler probably uses the same memory location for 'e' each time, so all of the references would point to the same stack location.
A simpler solution would be to store the element number:
while (we_have_data()) {
std::string str = getNext();
size_t e = array.size();
array.resize(e + 1);
tasks.push_back([&, e, str] () {
std::istringstream iss(str);
iss >> array[e];
});
}
If you have C++14 and your strings are long, you may want to consider:
tasks.push_back([&, e, str = std::move(str)] () {
All of this assumes that array will not undergo further manipulations or go out of scope while the tasks are running.
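For reference, here is a minimal self-contained sketch of the index-capture approach. The element type, its operator>>, and the loadData signature are assumptions made for illustration, since the original definitions are not shown; the vector is captured by reference and the index by value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical element type: a single int parsed from text.
struct element {
    int value = 0;
};

std::istream& operator>>(std::istream& in, element& e) {
    return in >> e.value;
}

// Builds one deferred task per input string, then runs them all.
// Each task stores the element's index by value, so reallocation of
// `array` while tasks are being queued cannot invalidate anything.
std::vector<element> loadData(const std::vector<std::string>& inputs) {
    std::vector<element> array;
    std::vector<std::function<void()>> tasks;

    for (const std::string& str : inputs) {
        const std::size_t index = array.size();
        array.resize(index + 1);
        tasks.push_back([&array, index, str] {
            assert(index < array.size());  // array must not have shrunk
            std::istringstream iss(str);
            iss >> array[index];
        });
    }

    // `array` is alive and no longer resized while the tasks run.
    for (const auto& task : tasks) {
        task();
    }
    return array;
}
```

Calling loadData({"10", "20", "30"}) should give three elements holding 10, 20 and 30. The same caveat as above applies: the tasks must not outlive the vector they refer to.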

Related

Why do 'for' and 'for_each' result in different functions being generated by iterating through array elements using lambdas?

I'm trying to better understand the interactions of lambda expressions and iterators.
What is the difference between these three snippets of code? onSelect is an std::function that is called when a component is selected.
Example 1 and 3 seem to work quite nicely. Example 2 returns the same index value, regardless of the component clicked.
My intuition is that Example 2 only results in one symbol being generated, and therefore the function only points to the first value. My question is, why would for_each result in multiple function definitions being generated, and not the normal for loop?
components[0].onSelect = [&]{ cout<<0; };
components[1].onSelect = [&]{ cout<<1; };
components[2].onSelect = [&]{ cout<<2; };
components[3].onSelect = [&]{ cout<<3; };
//And so on
vs
for (int i = 0; i < numComponents; ++i)
{
components[i].onSelect = [&]
{
cout<<components[i];
};
}
vs
int i = 0;
std::for_each (std::begin (components), std::end (components), [&](auto& component)
{
component.onSelect = [&]{
cout<<i;
});
What is the difference between these three snippets of code?
Well, only the first one is legal.
My intuition is that Example 2 only results in one symbol being generated
Each lambda expression generates a unique unnamed class type in the smallest enclosing scope. You have one block scope (inside the for loop) and one lambda expression so yes, there's only one type.
Each instance of that type (one per iteration) could differ in state, because they could all capture different values of i. They don't, though, they all capture exactly the same lexical scope by reference.
and therefore the function only points to the first value
A lambda expression always produces an object of class type (a closure), not a function. A lambda with an empty capture list is convertible to a plain function pointer, but you don't have an empty capture list. Finally, the lambda didn't capture only the first value, or any value, but an unusable reference to the variable i, because you explicitly asked to capture by reference ([&]) instead of by value.
That is, they all get the same reference to i whatever its particular value at the time they're instantiated, and i will have been set to numComponents and then gone out of scope before any of them can be invoked. So, even if it hadn't gone out of scope, referring to components[i] would almost certainly be Undefined Behaviour. But as it has gone out of scope, it is a dangling reference. This is an impressive density of bugs in a small amount of code.
Compare the minimal change:
for (int i = 0; i < numComponents; ++i) {
components[i].onSelect = [i, &components]
{
cout<<components[i];
};
}
which captures i by value, which is presumably what you really wanted, and only takes components by reference. This works correctly with no UB.
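A small sketch of the by-value capture, written so that the result can be checked instead of printed. The Component struct and the helper names bindIndices and select are hypothetical stand-ins, not the asker's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical component type with a selection callback.
struct Component {
    std::function<int()> onSelect;
};

// Gives each component a callback that reports its own index.
void bindIndices(std::vector<Component>& components) {
    for (int i = 0; i < static_cast<int>(components.size()); ++i) {
        // `i` is copied into the closure, so each callback keeps the
        // value it saw when it was created, even after the loop ends.
        components[i].onSelect = [i] { return i; };
    }
}

// Invokes the callback of the component at `index`.
int select(const std::vector<Component>& components, std::size_t index) {
    assert(index < components.size());
    return components[index].onSelect();
}
```

After bindIndices runs on a vector of four components, select(components, 3) should return 3, no matter how long ago the loop variable went out of scope.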
My question is, why would for_each result in multiple function definitions being generated, and not the normal for loop?
You have two nested lambda expressions in example 3, but we're only concerned with the inner one. That's still a single lambda expression in a single scope, so it's only generating one class type. The main difference is that the i to which it has (again) captured a reference, has presumably not gone out of scope by the time you try calling the lambda.
For example, if you actually wrote (and a minimal reproducible example would have shown this explicitly)
int i = 0;
std::for_each (std::begin (components), std::end (components), [&](auto& component)
{
component.onSelect = [&]{
cout<<i;
};
});
for (i = 0; i < numComponents; ++i)
components[i].onSelect();
then the reason it would appear to work is that i happens to hold the expected value whenever you call the lambda. Each copy of it still has a reference to the same local variable i though. You can demonstrate this by simply writing something like:
int i = 0;
std::for_each (std::begin (components), std::end (components), [&](auto& component)
{
component.onSelect = [&]{
cout<<i;
};
});
components[0].onSelect();
components[1].onSelect();
i = 2;
components[1].onSelect();

What does the '&' sign do when using 'auto' for iteration

Recently I ran into a very peculiar question when using auto in C++. Just look at the following code snippet:
my main function:
#include <list>
#include <iostream>
#include <cstdio>
using namespace std;
void read(const list<int>& con); // defined below
int main(){
int a = 10, b = 20, c = 30;
list<int> what;
what.push_back(a);
what.push_back(b);
what.push_back(c);
read(what);
return 0;
}
And here's function read:
void read(const list<int>& con){
for (auto it : con){
printf("%p\n", &it);
cout << it << endl;
}
return ;
}
And here is the output:
0x7fffefff66a4
10
0x7fffefff66a4
20
0x7fffefff66a4
30
What the heck is that? Same address with different content !?
Even stranger, if I modify the for-loop by adding an '&', that is:
for (auto& it : con){
all the output immediately makes sense: the address changes on each iteration.
So my question is,
Why does the '&' sign make a change under this circumstance?
for (auto it : con){
Same address with different content !?
This is very typical for variables with automatic storage duration. This has nothing to do with auto in C++†. You would get the same result if you had used int:
for (int it : con){
The lifetime of it (as well as of each automatic variable within the loop) is just a single iteration. Since the lifetime of the it from the previous iteration has ended, the next iteration can re-use the same memory, and that's why the address is the same.
Why does the '&' sign make a change under this circumstance?
Because T& declares a reference to type T. Reference variables are different from non-references (object variables). Instead of holding a value such as an object would, a reference instead "refers" to another object.
When you use the addressof operator on a reference, the result will be the address of the referred object; not the address of the reference (which might not even have an address, since it's not an object). That is why the address changes in the latter case. In this case, the references would refer to the int objects that are stored in the nodes of what (because con itself is a reference, and refers to the passed object).
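The difference can be checked directly by comparing addresses instead of printing them. This is only a sketch; the two function names are made up for illustration.

```cpp
#include <cassert>
#include <list>

// True if `auto&` binds directly to each element stored in the list.
bool referenceAliasesElements(const std::list<int>& con) {
    auto it = con.begin();
    for (const auto& item : con) {
        assert(it != con.end());
        if (&item != &*it) return false;  // expected: the very same object
        ++it;
    }
    return true;
}

// True if plain `auto` yields a separate object on every iteration.
bool copyIsSeparateObject(const std::list<int>& con) {
    auto it = con.begin();
    for (auto item : con) {
        assert(it != con.end());
        if (&item == &*it) return false;  // a copy never shares the element's address
        ++it;
    }
    return true;
}
```

Both functions should return true for any list: the reference form yields the addresses of the list's own int objects, while the copy form yields the address of a local variable.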
† I say "in C++" because in C, auto is a storage class specifier that signifies automatic storage duration. C++ inherited that meaning, but C++11 dropped it and reused the keyword for type deduction. The old usage was already redundant in C, where auto is a vestigial keyword from the B language.
In C++, auto declares a type that will be deduced from context.
Let's look at the expanded version of the range-based for loop syntax first.
for( auto it: container) {
...
}
is conceptually the same as
for( auto _it = container.begin(); _it != container.end(); ++_it) {
auto it = *_it;
...
}
while the reference form:
for( auto& it: container)
is the same as
for( auto _it = container.begin(); _it != container.end(); ++_it) {
auto &it = *_it;
...
}
So in the first case it is a copy of the item in the container, while in the second case it is an (lvalue) reference to the item. Hence, if you modify it in the second loop, the change affects the items in the container.
The address issue can be explained the same way. In the copy example, the local variable has the same address in each loop iteration: because the lifetimes do not overlap, the compiler has no reason not to reuse the same stack slot. If you factor the code into a function, though, you may see the address change between invocations (because the stack depth might differ). In the reference example, the address is different every time, because taking the address of a reference yields the address of the object being referenced (in this case, the item in the container).
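To make the copy-versus-reference distinction concrete, here is a short sketch with two hypothetical helper functions: one modifies only per-iteration copies, the other modifies the container's elements through references.

```cpp
#include <vector>

// Doubles each loop variable. `item` is a copy, so `values` is unchanged.
void doubleCopies(std::vector<int>& values) {
    for (auto item : values) {
        item *= 2;  // modifies the local copy only
    }
}

// Doubles each element in place. `item` refers to the stored element.
void doubleInPlace(std::vector<int>& values) {
    for (auto& item : values) {
        item *= 2;  // modifies the element inside the vector
    }
}
```

Starting from {1, 2, 3}, doubleCopies leaves the vector as it was, while doubleInPlace turns it into {2, 4, 6}.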
Note that auto is standing in for int in your case. So it's a red herring. Consider
for (int i = 0; i < 10; ++i){
int j = i;
cout << (void*)&j << '\n';
}
Since j has automatic storage duration, it is most likely created at the same address each time, holding a different value on each iteration: conceptually, j is pushed onto the stack and popped off again on every iteration (let's set aside compiler optimisations). That is what is happening in your case with for (auto it : con){. it has automatic storage duration.
When you write
for (auto& it : con){
it is a reference to an int within the container con, so its address will differ on each iteration.

C++ Custom Stack Printing Issues

I am creating a custom class Stack to store a couple string variables. When I attempt to print the stack, it says that the stack is always empty, which is not correct. I am using vectors to represent the custom stack, so the way I go about my print method should work, but for some reason it does not. What is my error? Is it in my isEmpty method?
void stack::printStack() {
std::vector<std::string> v;
if(stack::isEmpty()) {
std::cout << "Stack is empty! " << std::endl;
}
else {
for(int i = 0; i != v.size(); i++) {
std::cout << v[i] << std::endl;
}
}
}
You are returning a copy of the underlying vector with
std::vector<std::string> stack::getVector();
This results in all calls such as stack::getVector().push_back(n); making no modifications to the stack::v, but modifying the returned temporary instead.
I don't see why you shouldn't be using v directly in the member functions:
v.push_back(n);
Or, if you don't want to do that (for some reason), provide both a const and a non-const overload of getVector that return references:
std::vector<std::string> const& stack::getVector() const { return v; }
std::vector<std::string> &stack::getVector() { return v; }
Note that the overload returning std::vector<std::string> & breaks encapsulation.
You are creating a local variable here:
void stack::printStack(){
std::vector<std::string> v;
That v is not the same as the v member variable. That local v is empty, thus your loop never prints.
Also, use descriptive variable names. Using a single letter variable name such as v is not a good idea.
Also, with respect to returning a vector by reference or by copy, see this question and answer: Returning vector copies.
So do you want to return a copy of the vector, or do you want to return the actual vector? If it is the latter, return a reference, if it's the former, then return a copy (as your current code is doing). Note that there are implications in returning a copy as opposed to returning a reference (as the link shows).
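Putting both points together, a vector-backed stack might look roughly like the sketch below. The class layout and the member names (Stack, items_, push, print) are assumptions, because the original class definition is not shown.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Minimal sketch of a vector-backed stack (layout assumed for illustration).
class Stack {
public:
    void push(const std::string& item) { items_.push_back(item); }
    bool isEmpty() const { return items_.empty(); }

    // Read-only view of the storage: callers cannot modify the stack
    // through it, and no copy is made.
    const std::vector<std::string>& items() const { return items_; }

    // Prints the member vector, not a freshly declared local one.
    void print(std::ostream& out) const {
        if (isEmpty()) {
            out << "Stack is empty!\n";
            return;
        }
        for (const std::string& item : items_) {
            out << item << '\n';
        }
    }

private:
    std::vector<std::string> items_;
};
```

Passing std::cout to print gives the usual console output, while passing a std::ostringstream makes the result easy to inspect in a test.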

std::vector does strange thing

(Sorry if my sentences are full of mistakes, I'll do my best to write something readable.) Hi, I'm working on a function that reads a file, stores every line whose first char is ":", and removes every dash contained in the string. Every time such a line is found, push_back() is used to store it in a vector. The problem is that, every time push_back() is used, all the elements in the vector take the value of the last one. I don't understand why this happens. Here's the code:
string listContent;
size_t dashPos;
vector<char*>cTagsList;
while(!SFHlist.eof())
{
getline(SFHlist,listContent);
if(listContent[0]==':')
{
listContent.erase(0,1);
dashPos = listContent.rfind("-",string::npos);
while(dashPos!=string::npos)
{
listContent.pop_back();
dashPos = listContent.rfind("-",string::npos);
}
char* c_listContent = (char*)listContent.c_str();
cTagsList.push_back(c_listContent);
}
}
I first thought it was a problem with the end of the file but aborting the searching process before reaching this point gives the same results.
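The documentation for the c_str() method of std::string states:
The pointer returned may be invalidated by further calls to other member functions that modify the object.
Every pointer you store points into the buffer of the same listContent object, which is overwritten by each getline, so all elements appear to hold the last line (and any of them may be left dangling if the string reallocates). If you're allowed to use a std::vector<std::string> instead of the vector of char*, you're fine, since a copy of listContent is pushed into the vector each time, i.e.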
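std::string listContent;
std::vector<std::string> cTagsList;
// Test the read itself; looping on eof() also processes a failed final read.
while (std::getline(SFHlist, listContent))
{
if (!listContent.empty() && listContent[0] == ':')
{
listContent.erase(0, 1);
// Erase each dash where it is found instead of popping the last character.
std::size_t dashPos = listContent.rfind('-');
while (dashPos != std::string::npos)
{
listContent.erase(dashPos, 1);
dashPos = listContent.rfind('-');
}
cTagsList.push_back(listContent); // stores an independent copy
}
}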
(I haven't tested it)
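A more compact sketch of the same idea, reading from any std::istream so that it can be tried without a file. The function name readTags is hypothetical, and the dash removal uses the erase-remove idiom instead of a search loop.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads lines from `in`, keeps those starting with ':', and strips the
// leading ':' and every '-' before storing an owning copy of the text.
std::vector<std::string> readTags(std::istream& in) {
    std::vector<std::string> tags;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] != ':') {
            continue;  // not a tag line
        }
        line.erase(0, 1);
        line.erase(std::remove(line.begin(), line.end(), '-'), line.end());
        tags.push_back(line);  // copies the text, so `line` can be reused
    }
    return tags;
}
```

Fed a std::istringstream containing ":a-b", "skip" and ":c--d" on separate lines, this should return the two strings "ab" and "cd".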

What does this C++ / C++11 construction mean?

I've got this short snippet of code, and I don't understand what this construction means. I know it reads numbers from input and counts their frequencies in an unordered_map. But what is [&]? And what is the meaning of (int x)? What does input(cin) stand for, that is, the "cin" in parentheses? And how can for_each iterate from input(cin) to the empty eof parameter? I don't understand this whole construction.
unordered_map<int,int> frequency;
istream_iterator<int> input(cin);
istream_iterator<int> eof;
for_each(input, eof, [&] (int x)
{ frequency[x]++; });
istream_iterator allows you to iteratively extract items from an istream, which you pass in to the constructor. The eof object is explained thus:
A special value for this iterator exists: the end-of-stream; When an
iterator is set to this value has either reached the end of the stream
(operator void* applied to the stream returns false) or has been
constructed using its default constructor (without associating it with
any basic_istream object).
for_each is a loop construct that takes iterator #1 and increments it until it becomes equal with iterator #2. Here it takes the iterator that wraps standard input cin and increments it (which translates to extracting items) until there is no more input to consume -- this makes input compare equal to eof and the loop ends.
The construct [&] (int x) { frequency[x]++; } is an anonymous function; it is simply a shorthand way to write functions inline. Approximately the same effect could be achieved with
unordered_map<int,int> frequency; // this NEEDS to be global now
istream_iterator<int> input(cin);
istream_iterator<int> eof;
void consume(int x) {
frequency[x]++;
}
for_each(input, eof, consume);
So in a nutshell: this code reads integers from standard input until all available data is consumed, keeping a count of each integer's appearance frequency in a map.
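Here is a self-contained variant that can be tried without typing into standard input. It is only a sketch: the function name countFrequencies is made up, and it accepts any std::istream, so a std::istringstream can stand in for cin.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <unordered_map>

// Counts how often each integer occurs in the stream.
std::unordered_map<int, int> countFrequencies(std::istream& in) {
    std::unordered_map<int, int> frequency;
    std::istream_iterator<int> input(in);  // extracts ints from `in`
    std::istream_iterator<int> eof;        // default-constructed end marker
    // The lambda captures `frequency` by reference, so no global is needed.
    std::for_each(input, eof, [&frequency](int x) { ++frequency[x]; });
    return frequency;
}
```

For the input text "1 2 2 3 3 3", the returned map should hold 1 for key 1, 2 for key 2 and 3 for key 3.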
There are two parts of your question.
The first one concerns stream iterators. An std::istream_iterator<T> is constructed from some std::istream & s, and upon dereferencing, it behaves like { T x; s >> x; return x; }. Once the extraction fails, the iterator becomes equal to a default-constructed iterator, which serves as the "end" iterator.
Stream iterators allow you to treat a stream as a container of tokens. For example:
// The extra parentheses keep this from being parsed as a function declaration.
std::vector<int> v((std::istream_iterator<int>(std::cin)),
std::istream_iterator<int>());
std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));
C++11 introduces lambda expressions, which define anonymous functions or functors (called closures). A simple one looks like this:
auto f = [](int a, int b) -> double { return double(a) / double(b); };
auto q = f(1, 2); // q == 0.5
This f could have been written as an ordinary free function, but a free function would have had to appear at namespace scope, or else as a static member function of a local class. (The lambda is in fact close to the latter: its type is an unnamed class with a function call operator.) Note that the type of a lambda expression has no name you can write, so the closure object is typically stored using the new auto keyword.
Lambdas become more useful when they act as complex function objects which can capture ambient state. Your example could have been written like this:
auto f = [&frequency](int x) -> void { ++frequency[x]; };
The variables that appear in between the first square brackets are captured. This lambda is equivalent to the following local class and object:
struct F
{
F(std::unordered_map<int, int> & m) : m_(m) { }
void operator()(int x) { ++m_[x]; }
private:
std::unordered_map<int, int> & m_;
} f(frequency);
Variables without & in the capture list are captured by value, i.e. a copy is made in the closure object. As a short-hand, you can say [=] or [&] to capture everything respectively by value or by reference.
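The difference between the two capture modes shows up as soon as the captured variable changes after the lambda is created. A minimal sketch, with a made-up function name:

```cpp
#include <utility>

// Returns what a by-value closure and a by-reference closure observe
// after the captured variable has been changed.
std::pair<int, int> captureDemo() {
    int counter = 1;
    auto byValue = [counter] { return counter; };       // copies counter now
    auto byReference = [&counter] { return counter; };  // refers to counter
    counter = 42;
    return {byValue(), byReference()};
}
```

The by-value closure still sees 1, while the by-reference closure sees 42, so captureDemo() returns the pair (1, 42).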
This is the standard library's std::for_each (which predates C++11), iterating from input until it equals eof and calling the lambda [&] (int x) { frequency[x]++; } for each value.
So this code calculates the frequency of each integer read from the istream and saves the counts in a map.