Auto in lambda bug in g++-5? - c++

The following code contains two lambdas that are identical apart from the parameter declaration, yet when compiled with g++-5 they produce different answers. The lambda that uses the auto keyword in its parameter declaration compiles fine, but returns zero instead of the correct count of 1. Why? I should add that the code produces the correct output with g++-6.
g++-5 -std=c++14 file.cc
./a.out
Output:
f result=0 (incorrect result from lambda f)
...
g result=1 (correct result from lambda g)
...
#include<iostream>
#include<set>
#include<vector>
#include<algorithm>
using namespace std;
enum obsMode { Hbw, Lbw, Raw, Search, Fold};
int main(int , char **)
{
static set<obsMode> legal_obs_modes = {Hbw, Lbw, Raw, Search, Fold};
vector<obsMode> obs_mode = { Hbw,Lbw,Hbw,Lbw};
// I named the lambdas to illustrate the issue
auto f = [&] (auto i) -> void
{
cout << "f result=" << legal_obs_modes.count(i) << endl;
};
auto g = [&] (obsMode i) -> void
{
cout << "g result=" << legal_obs_modes.count(i) << endl;
};
// f does not work
for_each(obs_mode.begin(), obs_mode.end(), f);
// g does work
for_each(obs_mode.begin(), obs_mode.end(), g);
return 0;
}

A somewhat deep dive into the issue shows what the problem is.
It seems to exist all the way from 5.1 up to 6.1 (according to Compiler Explorer), and I notice this targeted-for-fix-in-6.2 bug report which may be related. The errant code was:
#include <iostream>
#include <functional>
int main() {
static int a;
std::function<void(int)> f = [](auto) { std::cout << a << '\n'; };
a = 1;
f(0);
}
and it printed 0 rather than the correct 1. Basically, the use of statics and lambdas caused some troubles in that a static variable was made available to the lambda, at the time of lambda creation, as a copy. For that particular bug report, it meant that static variable always seemed to have the value it had when the lambda was created, regardless of what you'd done with it in the meantime.
I originally thought that this couldn't be related since the static in this question was initialised on declaration, and never changed after lambda creation. However, if you place the following line before creating the lambdas and as the first line in each lambda, and compile (again, on Compiler Explorer) with x86-64 6.1 with options --std=c++14:
cout << &legal_obs_modes << ' ' << legal_obs_modes.size() << '\n';
then you'll see something very interesting (I've reformatted a little for readability):
0x605220 5
0x605260 0 f result=0
0x605260 0 f result=0
0x605260 0 f result=0
0x605260 0 f result=0
0x605220 5 g result=1
0x605220 5 g result=1
0x605220 5 g result=1
0x605220 5 g result=1
The failing f ones have a size of zero rather than five, and a totally different address. The zero size is indication enough that count will return zero, simply because there are no elements in an empty set. I suspect the different address is a manifestation of the same problem covered in the linked bug report.
You can actually see this in the Compiler Explorer assembler output, where the two different lambdas load up a different set address:
mov edi, 0x605260 ; for f
mov edi, 0x605220 ; for g
Making the set automatic instead of static causes the problem to go away entirely. The address is the same within both lambdas and outside of them, 0x7ffd808eb050 (on-stack rather than in static area, hence the vastly changed value). This tends to gel with the fact that statics aren't actually captured in lambdas, because they're always supposed to be at the same address, so can just be used as-is.
So, the problem appears to be that the f lambda, with its auto-deduced parameter, is making a copy of the static data instead of using it in situ. And I don't mean a good copy, I mean one akin to being from a photocopier that ran out of toner sometime in 2017 :-)
So, in answer to your specific question about whether this was a bug or not, I think the consensus would be a rather emphatic yes.
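If you are stuck on an affected compiler, one possible way to sidestep the issue is to stop naming the static inside the generic lambda and instead hand the set over explicitly with a C++14 init-capture. The following is only a sketch: the helper name count_legal is made up for illustration, and I have not checked it against every affected g++ release.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

enum obsMode { Hbw, Lbw, Raw, Search, Fold };

// Hypothetical helper (not from the question): counts how many entries of
// obs_mode are present in the static set, using a generic lambda.
std::size_t count_legal(const std::vector<obsMode>& obs_mode)
{
    static std::set<obsMode> legal_obs_modes = {Hbw, Lbw, Raw, Search, Fold};

    std::size_t hits = 0;
    // The init-capture binds a reference to the static explicitly, so the
    // lambda body never names the static variable itself.
    auto f = [&modes = legal_obs_modes, &hits](auto i) {
        hits += modes.count(i);
    };
    std::for_each(obs_mode.begin(), obs_mode.end(), f);
    return hits;
}
```

On a conforming compiler, calling count_legal with the question's vector { Hbw, Lbw, Hbw, Lbw } should report four hits, one per element.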

Related

Is Lambda Expression just capture objects before it? [duplicate]

I have a Visual Studio 2010 C++ program, the main function of which is:
vector<double> v(10);
double start = 0.0; double increment = 10.0;
auto f = [&start, increment]() { return start += increment; };
generate(v.begin(), v.end(), f);
for(auto it = v.cbegin(); it != v.cend(); ++it) { cout << *it << ", "; }
cout << endl << "Changing vars to try again..." << endl;
start = 15; increment = -1.5;
generate(v.begin(), v.end(), f);
for(auto it = v.cbegin(); it != v.cend(); ++it) { cout << *it << ", "; }
return 0;
When I compile this in MS Visual Studio, the first generate does what I expected, resulting in "10, 20, ... 100, ". The second does not; the lambda "sees" the change in start but not the change in increment, so I get "25, 35, ... 115, ".
MSDN explains that
The Visual C++ compiler binds a lambda expression to its captured variables when the expression is declared instead of when the expression is called. ... [T]he reassignment of [a variable captured by value] later in the program does not affect the result of the expression.
So my question is: is this standards-compliant C++11 behavior, or is it Microsoft's own eccentric implementation? Bonus: if it is standard behavior, why was the standard written that way? Does it have to do with enforcing referential transparency for functional programming?
With a lambda expression, the bound variables are captured at the time of declaration.
This sample will make it very clear: https://ideone.com/Ly38P
std::function<int()> dowork()
{
int answer = 42;
auto lambda = [answer] () { return answer; };
// can do what we want
answer = 666;
return lambda;
}
int main()
{
auto ll = dowork();
return ll(); // 42
}
It is clear that the capture must be happening before the invocation, since the variables being captured no longer exist (neither in scope nor in lifetime) at that later time.
It's bound at creation time. Consider:
#include <functional>
#include <iostream>
std::function<int(int)> foo;
void sub()
{
int a = 42;
foo = [a](int x) -> int { return x + a; };
}
int main()
{
sub();
int abc = 54;
abc = foo(abc); // Note a no longer exists here... but it was captured by
// value, so the caller shouldn't have to care here...
std::cout << abc; //96
}
There's no a here when the function is called, so there'd be no way for the compiler to go back and update it. If you capture a by reference, then you have undefined behavior. But if you capture by value, any reasonable programmer would expect this to work.
I think you are confusing the mechanism of capture with the mechanism of variable passing. They are not the same thing even if they bear some superficial resemblance to one another. If you need the current value of a variable inside a lambda expression, capture it by reference (though, of course, that reference is bound to a particular variable at the point the lambda is declared).
When you 'capture' a variable, you are creating something very like a closure. And closures are always statically scoped (i.e. the 'capture' happens at the point of declaration). People familiar with the concept of a lambda expression would find C++'s lambda expressions highly strange and confusing if it were otherwise. Adding a brand new feature to a programming language that is different from the same feature in other programming languages in some significant way would make C++ even more confusing and difficult to understand than it already is. Also, everything else in C++ is statically scoped, so adding some element of dynamic scoping would be very strange for that reason as well.
Lastly, if capture always happened by reference, then that would mean a lambda would only be valid as long as the stack frame was valid. Either you would have to add garbage collected stack frames to C++ (with a huge performance hit and much screaming from people who are depending on the stack being largely contiguous) or you would end up creating yet another feature where it was trivially easy to blow your foot off with a bazooka by accident as the stack frame referenced by a lambda expression would go out of scope and you'd basically be creating a lot of invisible opportunities to return local variables by reference.
Yes, it has to capture by value at the point of creation, because otherwise you could attempt to use a variable that no longer exists when the lambda/function is actually called (which is exactly the risk you take when capturing by reference).
The standard supports capturing both by value AND by reference to address both possible use cases. If you tell the compiler to capture by value it's captured at the point the lambda is created. If you ask to capture by reference, it will capture a reference to the variable which will then be used at the point the lambda is called (requiring of course that the referenced variable must still exist at the point the call is made).
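To see both modes side by side, here is a minimal sketch. The function name capture_demo and the variable n are made up for illustration; they are not from the question.

```cpp
#include <utility>

// Returns {value seen through the by-value capture,
//          value seen through the by-reference capture}.
std::pair<int, int> capture_demo()
{
    int n = 1;
    auto by_value = [n] { return n; };       // copies n now, while it is 1
    auto by_reference = [&n] { return n; };  // refers to n itself
    n = 2;                                   // change n after both lambdas exist
    return std::make_pair(by_value(), by_reference());  // {1, 2}
}
```

The by-value lambda keeps reporting 1 because its copy was taken when the lambda was created, while the by-reference lambda reads n at call time and so sees 2.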

Lambda captures unexpected variables

I was trying to figure out how lambda works in C++.
And something strange happened. It's so weird that I don't know how to describe it correctly. I tried googling several keywords, but didn't find anything that mentioned this behavior.
I first tried this code.
#include <iostream>
#include <utility>
using namespace std ;
auto func() {
int a = 0 ;
auto increase = [ &a ]( int i = 1 ){ a += i ; } ;
auto print = [ &a ](){ cout << a << '\n' ; } ;
pair< decltype(increase), decltype(print) >
p = make_pair( increase, print ) ;
return p ;
}
int main() {
auto lambdas = func() ;
auto increase = lambdas.first ;
auto print = lambdas.second ;
print() ;
increase() ;
print() ;
increase( 123456 ) ;
print() ;
return 0;
}
The output is, as expected:
-1218965939
-1218965938
-1218842482
However, after I added this to func():
cout << typeid( decltype( print ) ).name() << '\n'
<< typeid( decltype( increase ) ).name() << '\n' ;
the output became
Z4funcvEUlvE0_
Z4funcvEUliE_
0
1
123457
I did not expect this to happen.
[UPDATE]
The variable a should have been "dead" because its lifetime had ended.
But I'm curious why the code that examines typeid and decltype causes a to seemingly be resurrected.
You are binding to a by reference. But this is a local variable which gets stored on the stack. It's undefined behavior to access it once the function finishes executing.
It's the same as if you returned a pointer to a and then started using it from the caller.
None of the output from your program is "as expected".
The lambdas in func() capture by reference a locally-scoped variable that goes out of scope as soon as func() returns.
After func() returns, a no longer exists, like any other function-local object. As such, the captured references now refer to an object that went out of scope and was destroyed, and any use of the referenced value is undefined behavior.
Worse, the code also sets the value via the no-longer-valid reference. On a traditional implementation, this will scribble over some random part of the stack, which can lead to the entire process crashing.
Pure chance.
As I suspect you know, you are printing unspecified values through a dangling reference.
In your first example, the dangling reference tries to "read" from a memory location that has since been re-used for something else.
In your second example, the couts and/or typeids have affected the bloody guts of the implementation of your compiled program such that the memory location of a happens to be untouched by the time you illegally print its value.
But there is no point in trying to rationalise about this any further, and you could get a different result the next time you run the program. Or your computer could explode. Or the timeline could be altered such that you had never been born. Don't try to explain the symptoms of UB — just avoid it.
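If the goal is two lambdas that keep sharing a counter after func() has returned, one way to avoid the dangling reference is to let them share ownership of the state. This is just a sketch under my own assumptions: make_counter and the Counter alias are hypothetical names, the second lambda returns the value instead of printing it, and the default argument from the question is dropped because std::function cannot carry it.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical alias: first adds to the counter, second reads it.
using Counter = std::pair<std::function<void(int)>, std::function<int()>>;

// Hypothetical replacement for func(): both lambdas hold a shared_ptr copy,
// so the int stays alive for as long as either lambda does.
Counter make_counter()
{
    auto a = std::make_shared<int>(0);
    auto increase = [a](int i) { *a += i; };  // shared_ptr captured by value
    auto read = [a] { return *a; };
    return Counter(increase, read);
}
```

With this version, calling first(1) and then first(123456) on the returned pair leaves second() at 123457 on every run, because nothing refers to a destroyed local any more.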

Weird Behaviour with const_cast [duplicate]

I know that using const_cast is generally bad idea, but I was playing around with it and I came across a weird behaviour, where:
Two pointers have the same address value, yet when de-referenced, give different data values.
Does anyone have an explanation for this?
Code
#include <iostream>
int main()
{
const int M = 10;
int* MPtr = const_cast<int*>(&M);
(*MPtr)++;
std::cout << "MPtr = " << MPtr << " (*MPtr) = " << (*MPtr) << std::endl;
std::cout << " &M = " << &M << " M = " << M << std::endl;
}
Output
MPtr = 0x7fff9b4b6ce0 (*MPtr) = 11
&M = 0x7fff9b4b6ce0 M = 10
The program has undefined behaviour because you may not modify a const object.
From the C++ Standard
4 Certain other operations are described in this International
Standard as undefined (for example, the effect of attempting to modify
a const object). [ Note: This International Standard imposes no
requirements on the behavior of programs that contain undefined
behavior. —end note ]
So, aside from the "it's undefined behaviour" (which it is), the compiler is perfectly fine to use the fact that M is a constant, thus won't change, in the evaluation of cout ... << M << ..., so can use an instruction that has the immediate value 10, instead of the actual value stored in the memory of M. (Of course, the standard will not say how this works, more than "it's undefined", and compilers are able to choose different solutions in different circumstances, etc, etc, so it's entirely possible that you'll get different results if you modify the code, use a different compiler, different version of compiler or the wind is blowing in a different direction).
Part of the tricky bit with "undefined behaviour" is that it includes things that are "perfectly what you may expect" as well as "nearly what you'd expect". The compiler could also decide to start tetris if it discovers this is what you are doing.
And yes, this is very much one of the reasons why you SHOULD NOT use const_cast. At the very least NOT on things that were originally const - it's OK if you have something along these lines:
int x;
void func(const int* p)
{
...
int *q = const_cast<int *>(p);
*q = 7;
}
...
func(&x);
In this case, x is not actually const; it is only viewed through a pointer to const inside func, so writing to it after the const_cast is well defined. Of course, anyone reading the signature may still assume that x is not changed in func, and thus you could have problems....
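Spelled out as something you can compile, the acceptable case looks roughly like this. The names set_to_seven and demo_non_const_object are made up for the sketch; the point is only that the object being written to was never declared const.

```cpp
// Writes through a pointer-to-const after casting the const away.
// Hypothetical helper; well defined only if *p was not declared const.
void set_to_seven(const int* p)
{
    int* q = const_cast<int*>(p);
    *q = 7;
}

int demo_non_const_object()
{
    int x = 0;         // x itself is not const
    set_to_seven(&x);  // fine: the underlying object is modifiable
    return x;          // 7
}
```

Passing the address of something like const int M = 10 to set_to_seven instead would put you straight back into the undefined behaviour from the question.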

Different sizes of lambda expressions in VS2010?

Out of curiosity, I tested the size of a lambda expression. My first thought was that it would be 4 bytes, like a function pointer. Strangely, the output of my first test was 1:
auto my_lambda = [&]() -> void {};
std::cout << sizeof(my_lambda) << std::endl;
Then I tested with some calculations inside the lambda, the output still being 1:
auto my_lambda2 = [&]() -> void {int i=5, j=23; std::cout << i*j << std::endl;};
std::cout << sizeof(my_lambda2) << std::endl;
My next idea was kinda random, but the output finally changed, displaying the expected 4:
auto my_lambda3 = [&]() -> void {std::cout << sizeof(my_lambda2) << std::endl;};
std::cout << sizeof(my_lambda3) << std::endl;
At least in Visual Studio 2010. Ideone still displays 1 as the output.
I know of the standard rule, that a lambda expression cannot appear in an unevaluated context, but afaik that only counts for direct lambda use like
sizeof([&]() -> void {std::cout << "Forbidden." << std::endl;})
on which VS2010 prompts me with a compiler error.
Anyone got an idea what's going on?
Thanks to @Hans Passant's comment under the question, the solution was found. My original approach was wrong in that I thought every object would be captured by the lambda, but that isn't the case: only those in the enclosing scope are, and only if they are used.
And for every one of those captured objects, 4 bytes are used (the size of a reference on that platform).
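A small sketch to try this on your own compiler follows. The function names are hypothetical, and the exact numbers are implementation-specific, so treat the results as an observation rather than a guarantee. I capture by value here because the closure then has to store a copy, whereas the layout used for by-reference captures is left to the implementation.

```cpp
#include <cstddef>

// Hypothetical helper: size of a closure that captures nothing.
std::size_t size_without_capture()
{
    auto empty = [] {};
    return sizeof(empty);       // commonly 1, like an empty class
}

// Hypothetical helper: size of a closure holding one by-value capture.
std::size_t size_with_value_capture()
{
    long long big = 42;
    auto holds_copy = [big] { return big; };
    return sizeof(holds_copy);  // at least sizeof(long long)
}
```

Note that sizeof is applied to the named closure objects here, not to a lambda expression directly, so the unevaluated-context rule mentioned in the question does not get in the way.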
Visual Studio probably doesn't implement lambda objects as functions. You're probably getting an object back. Who knows what it looks like. If you're truly interested you could always look at your variables with a debugger and see what they look like...if it'll let you.