Getting unexpected result when compiling with clang optimization

I found a bug in my code that only happens when I enable compiler optimizations -O1 or greater. I traced the bug, and it seems that I can't use the Boost type_erased adaptor on a Boost transformed range when optimizations are enabled. I wrote this C++ program to reproduce it:
#include <iostream>
#include <vector>
#include <boost/range/adaptor/transformed.hpp>
#include <boost/range/adaptor/type_erased.hpp>
using namespace boost::adaptors;
using namespace std;
int addOne(int b) {
return b + 1;
}
int main(int, char**) {
vector<int> nums{ 1, 2, 3 };
auto result1 = nums | transformed(addOne) | type_erased<int, boost::forward_traversal_tag>();
auto result2 = nums | transformed(addOne);
auto result3 = nums | type_erased<int, boost::forward_traversal_tag>();
for (auto n : result1)
cout << n << " ";
cout << endl;
for (auto n : result2)
cout << n << " ";
cout << endl;
for (auto n : result3)
cout << n << " ";
cout << endl;
}
When I run this program without any optimizations, I get the following output:
2 3 4
2 3 4
1 2 3
When I run it with the -O1 flag, I get the following:
1 1 1
2 3 4
1 2 3
I am using clang++ to compile it. The version of clang that I am using is:
Apple LLVM version 8.0.0 (clang-800.0.38)
I don't know if I am doing something wrong, or if it is a boost/clang bug.
edit:
Changed it to
type_erased<int, boost::forward_traversal_tag, const int>()
and it works now. The third template argument is the reference type; setting the reference to const prolongs the lifetime of the temporary created by transformed.

EDIT: In fact there's more to this than meets the eye. There is another usability issue, and the OP's self-answer addresses it.
You're falling into the number 1 pitfall with Boost Range v2 (and Boost Proto etc.).
nums | transformed(addOne) is a temporary. The type_erased adaptor stores a reference to that.
After assigning the type-erased adaptor to the resultN variable, the temporary is destructed.
What you have is a dangling reference :(
This is a highly unintuitive effect, and the number 1 reason why I limit the use of Range V2 in my codebase: I've been there all too often.
Here is a workaround:
auto tmp = nums | transformed(addOne);
auto result = tmp | type_erased<int, boost::forward_traversal_tag>();
-fsanitize=address,undefined confirms that the UB is gone when using the named temporary.

Using
type_erased<int, boost::forward_traversal_tag, const int>()
works. The third template argument is the reference type; setting the reference to const prolongs the lifetime of the temporary created by transformed.

Related

Is this gcc and clang optimizer bug with minmax and structured binding?

This program, built with -std=c++20 flag:
#include <algorithm>
#include <iostream>
using namespace std;
int main() {
auto [n, m] = minmax(3, 4);
cout << n << " " << m << endl;
}
produces the expected result 3 4 when no optimization flags -Ox are used. With optimization flags it outputs 0 0. I tried it with multiple gcc versions with -O1, -O2 and -O3 flags.
Clang 13 works fine, but clang 10 and 11 output 0 4198864 with optimization level -O2 and higher. Icc works fine. What is happening here?
The code is here: https://godbolt.org/z/Wd4ex8bej

The overload of std::minmax taking two arguments returns a pair of references to the arguments. The lifetime of the arguments, however, ends at the end of the full expression, since they are temporaries.
Therefore the output line is reading dangling references, causing your program to have undefined behavior.
Instead you can use std::tie to receive by-value:
#include <iostream>
#include <tuple>
#include <algorithm>
int main() {
int n, m;
std::tie(n,m) = std::minmax(3, 4);
std::cout << n << " " << m << std::endl;
}
Or you can use the std::initializer_list overload of std::minmax, which returns a pair of values:
#include <iostream>
#include <algorithm>
int main() {
auto [n, m] = std::minmax({3, 4});
std::cout << n << " " << m << std::endl;
}

recursive application of C++20 range adaptor causes a compile time infinite loop

The ranges library in C++20 supports the expression
auto view = r | std::views::drop(n);
to remove the first n elements of a range r with the range adaptor drop.
However if I recursively drop elements from a range, the compiler enters an infinite loop.
Minimal working example: (takes infinite time to compile in GCC 10)
#include <ranges>
#include <iostream>
#include <array>
#include <string>
using namespace std;
template<ranges::range T>
void printCombinations(T values) {
if(values.empty())
return;
auto tail = values | views::drop(1);
for(auto val : tail)
cout << values.front() << " " << val << endl;
printCombinations(tail);
}
int main() {
auto range1 = array<int, 4> { 1, 2, 3, 4 };
printCombinations(range1);
cout << endl;
string range2 = "abc";
printCombinations(range2);
cout << endl;
}
expected output:
1 2
1 3
1 4
2 3
2 4
3 4
a b
a c
b c
Why does this take infinite time to compile and how should I resolve the problem?

Let's take a look at the string case (just because that type is shorter) and manually examine the call stack.
printCombinations(range2) calls printCombinations<string>. The function recursively calls itself with tail. What's the type of tail? That's drop_view<ref_view<string>>. So we call printCombinations<drop_view<ref_view<string>>>. Straightforward so far.
Now, we again recursively call ourselves with tail. What's the type of tail now? Well, we just wrap. It's drop_view<drop_view<ref_view<string>>>. And then we recurse again with drop_view<drop_view<drop_view<ref_view<string>>>>. And then we recurse again with drop_view<drop_view<drop_view<drop_view<ref_view<string>>>>>. And so forth, infinitely, until the compiler explodes.
Can we fix this by maintaining the same algorithm? Actually, yes. P1739 was about reducing this kind of template instantiation bloat (although it didn't have an example as amusing as this one). And so drop_view has a few special cases for views that it recognizes and won't rewrap.
The type of "hello"sv | views::drop(1) is still string_view, not drop_view<string_view>. So printCombinations(string_view(range2)) should only generate a single template instantiation.
But it looks like libstdc++ doesn't implement this feature yet. So you can either implement it manually (but only trafficking in, say, subrange) or abandon the recursive approach here.

Although this is a very old question, I came across this annoying bug (or maybe feature) today, and this is how I solved it without much change to the original code.
#include <ranges>
#include <iostream>
#include <array>
#include <string>
#include <span> /////// <------ added this
using namespace std;
template<ranges::range T>
void printCombinations(T values_) { // <--- Changed values to values_
auto values = std::span(values_); // <--- defined values here
if(values.empty())
return;
auto tail = values | views::drop(1);
for(auto val : tail)
cout << values.front() << " " << val << endl;
printCombinations(tail);
}
int main() {
auto range1 = array<int, 4> { 1, 2, 3, 4 };
printCombinations(range1);
cout << endl;
string range2 = "abc";
printCombinations(range2);
cout << endl;
}
By creating a span, we make the type of the variable simply span instead of the deeply nested type names shown in the accepted answer.
You can check on Godbolt that it works exactly as expected.
UPDATE:
You can replace the use of std::span with std::ranges::subrange and it will cover even more cases. For example, std::span does not work with std::ranges::views::reverse.

C++11 : map::lower_bound doesn't work correctly for 2 or less elements in Linux

If I run the following C++11 example on Linux (Debian 7, GCC 4.8.2, Eclipse CDT), the while loop never ends. The first iteration is correct: the iterator is decremented by 1 and refers to the first map element. But the second and later iterations are incorrect: the decrement operator doesn't move the iterator, which still refers to the first element.
If I uncomment the third element in the map initialization, the while loop stops.
Could you please tell me what I did wrong?
Thank you very much for every comment.
#include <iostream>
#include <map>
using namespace std;
int main() {
std::map<int, int> mymap = {{1, 100}, {2, 200}/*, {3, 300}*/};
auto it = mymap.lower_bound(2);
cout << "mymap key: " << it->first << endl;
while(--it != mymap.end())
cout << "mymap key: " << it->first << endl;
return 0;
}
Note: This code works correctly on Windows (Visual Studio 2013 Express).

You pass a begin() iterator to this line:
while(--it != mymap.end())
And --begin() yields undefined behaviour.

C++11 Lambda closure involving a stack variable by reference that leaves scope is allowed but getting undefined behavior?

I know C++ fairly well. I have used lambdas and closures in other languages. For my learning, I wanted to see what I could do with these in C++.
Fully knowing the "danger" and expecting the compiler to reject this, I created a lambda in a function using a function stack variable by reference and returned the lambda. The compiler allowed it and strange things occurred.
Why did the compiler allow this? Is this just a matter of the compiler not being able to detect that I did something very, very bad and the results are just "undefined behavior"? Is this a compiler issue? Does the spec have anything to say about this?
Tested on a recent mac, with MacPorts-installed gcc 4.7.1 and the -std=c++11 compile option.
Code used:
#include <functional>
#include <iostream>
using namespace std;
// This is the same as actsWicked() except for the commented out line
function<int (int)> actsStatic() {
int y = 0;
// cout << "y = " << y << " at creation" << endl;
auto f = [&y](int toAdd) {
y += toAdd;
return y;
};
return f;
}
function<int (int)> actsWicked() {
int y = 0;
cout << "actsWicked: y = " << y << " at creation" << endl;
auto f = [&y](int toAdd) {
y += toAdd;
return y;
};
return f;
}
void test(const function<int (int)>& f, const int arg, const int expected) {
const int result = f(arg);
cout << "arg: " << arg
<< " expected: " << expected << " "
<< (expected == result ? "=" : "!") << "= "
<< "result: " << result << endl;
}
int main(int argc, char **argv) {
auto s = actsStatic();
test(s, 1, 1);
test(s, 1, 2);
test(actsStatic(), 1, 1);
test(s, 1, 3);
auto w = actsWicked();
test(w, 1, 1);
test(w, 1, 2);
test(actsWicked(), 1, 1);
test(w, 1, 3);
return 0;
}
Results:
arg: 1 expected: 1 == result: 1
arg: 1 expected: 2 == result: 2
arg: 1 expected: 1 != result: 3
arg: 1 expected: 3 != result: 4
actsWicked: y = 0 at creation
arg: 1 expected: 1 == result: 1
arg: 1 expected: 2 == result: 2
actsWicked: y = 0 at creation
arg: 1 expected: 1 == result: 1
arg: 1 expected: 3 != result: 153207395

Returning a lambda that captures a local variable by reference is the same as returning a reference to a local variable directly; it results in undefined behaviour:
5.1.2 Lambda expressions [expr.prim.lambda]
22 - [ Note: If an entity is implicitly or explicitly captured by reference, invoking the function call operator of the corresponding lambda-expression after the lifetime of the entity has ended is likely to result in undefined behavior. —end note ]
Specifically, the undefined behaviour in this case is in lvalue-to-rvalue conversion:
4.1 Lvalue-to-rvalue conversion [conv.lval]
1 - A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue.
If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.
The compiler is not required to diagnose this form of undefined behaviour, although as compiler support for lambdas improves it is likely that compilers will be able to diagnose this case and offer an appropriate warning.
Since lambda closure types are well defined, just opaque, your example is equivalent to:
struct lambda {
int &y;
lambda(int &y): y(y) {}
int operator()(int toAdd) {
y += toAdd;
return y;
}
} f{y};
return f;
In general terms, C++ solves the funarg problem by making it the responsibility of the programmer and providing facilities (mutable lambda capture, move semantics, unique_ptr etc.) to allow the programmer to solve it efficiently.

std::istream_iterator<> with copy_n() and friends

The snippet below reads three integers from std::cin; it writes two into numbers and discards the third:
std::vector<int> numbers(2);
copy_n(std::istream_iterator<int>(std::cin), 2, numbers.begin());
I'd expect the code to read exactly two integers from std::cin, but it turns out this is a correct, standard-conforming behaviour. Is this an oversight in the standard? What is the rationale for this behaviour?
From 24.5.1/1 in the C++03 standard:
After it is constructed, and every time ++ is used, the iterator reads and stores a value of T.
So in the code above at the point of call the stream iterator already reads one integer. From that point onward every read by the iterator in the algorithm is a read-ahead, yielding the value cached from the previous read.
The latest draft of the next standard, n3225, doesn't seem to bear any change here (24.6.1/1).
On a related note, 24.5.1.1/2 of the current standard in reference to the istream_iterator(istream_type& s) constructor reads
Effects: Initializes in_stream with s. value may be initialized during construction or the first time it is referenced.
With emphasis on "value may be initialized ..." as opposed to "shall be initialized". This seems to contradict 24.5.1/1, but maybe that deserves a question of its own.

Unfortunately the implementer of copy_n has failed to account for the read-ahead in the copy loop. The Visual C++ implementation works as you expect on both stringstream and std::cin. I also checked the case from the original example where the istream_iterator is constructed inline.
Here is the key piece of code from the STL implementation.
template<class _InIt,
class _Diff,
class _OutIt> inline
_OutIt _Copy_n(_InIt _First, _Diff _Count,
_OutIt _Dest, input_iterator_tag)
{ // copy [_First, _First + _Count) to [_Dest, ...), arbitrary input
*_Dest = *_First; // 0 < _Count has been guaranteed
while (0 < --_Count)
*++_Dest = *++_First;
return (++_Dest);
}
Here is the test code
#include <algorithm>
#include <iostream>
#include <istream>
#include <sstream>
#include <vector>
#include <iterator>
int main()
{
std::stringstream ss;
ss << 1 << ' ' << 2 << ' ' << 3 << ' ' << 4 << std::endl;
ss.seekg(0);
std::vector<int> numbers(2);
std::istream_iterator<int> ii(ss);
std::cout << *ii << std::endl; // shows that read ahead happened.
std::copy_n(ii, 2, numbers.begin());
int i = 0;
ss >> i;
std::cout << numbers[0] << ' ' << numbers[1] << ' ' << i << std::endl;
std::istream_iterator<int> ii2(std::cin);
std::cout << *ii2 << std::endl; // shows that read ahead happened.
std::copy_n(ii2, 2, numbers.begin());
std::cin >> i;
std::cout << numbers[0] << ' ' << numbers[1] << ' ' << i << std::endl;
return 0;
}
/* Output
1
1 2 3
4 5 6
4
4 5 6
*/

Today I encountered a very similar problem, and here is the example:
#include <iostream>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <string>
struct A
{
float a[3];
unsigned short int b[6];
};
void ParseLine( const std::string & line, A & a )
{
std::stringstream ss( line );
std::copy_n( std::istream_iterator<float>( ss ), 3, a.a );
std::copy_n( std::istream_iterator<unsigned short int>( ss ), 6, a.b );
}
void PrintValues( const A & a )
{
for ( int i =0;i<3;++i)
{
std::cout<<a.a[i]<<std::endl;
}
for ( int i =0;i<6;++i)
{
std::cout<<a.b[i]<<std::endl;
}
}
int main()
{
A a;
const std::string line( "1.1 2.2 3.3 8 7 6 3 2 1" );
ParseLine( line, a );
PrintValues( a );
}
Compiling the above example with g++ 4.6.3 produces one result:
1.1 2.2 3.3 7 6 3 2 1 1
and compiling with g++ 4.7.2 produces another:
1.1 2.2 3.3 8 7 6 3 2 1
The C++11 standard says this about copy_n:
template<class InputIterator, class Size, class OutputIterator>
OutputIterator copy_n(InputIterator first, Size n, OutputIterator result);
Effects: For each non-negative integer i < n, performs *(result + i) = *(first + i).
Returns: result + n.
Complexity: Exactly n assignments.
As you can see, it is not specified what exactly happens with the iterators, which means it is implementation dependent.
My opinion is that your example should not read the 3rd value, which means this is a small flaw in the standard that they haven't specified the behavior.

I don't know the exact rationale, but as the iterator also has to support operator*(), it will have to cache the values it reads. Allowing the iterator to cache the first value at construction simplifies this. It also helps in detecting end-of-stream when the stream is initially empty.
Perhaps your use case is one the committee didn't consider?

Today, nine years after you, I ran into the same problem. While following this thread and experimenting, I noticed that we can advance the iterator one step before each read after the first. (cin does not skip the trailing newline automatically either, and we help it with cin.ignore(), so I guess we can help this implementation too.)
#include<bits/stdc++.h>
using namespace std;
int main(){
freopen("input.txt","r",stdin);
istream_iterator<int> it(cin);
ostream_iterator<int> cout_it(cout, " ");
copy_n(it, 5, cout_it);
cout<<"\nAnd for the rest of the stream\n";
for(int i=0;i<10;i++){
it++;
copy_n(it, 1, cout_it);
}
return 0;
}
and that should produce output like:
1 2 3 4 5
And for the rest of the stream
6 7 8 9 10 11 12 13 14 15
