LLVM coverage confused by if-constexpr - c++

I have encountered a weird problem with LLVM coverage when using constant expressions in an if-statement:
#include <cstring>
#include <type_traits>
template<typename T>
int foo(const T &val)
{
int idx = 0;
if constexpr(std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(int))
{
memcpy(&idx, &val, sizeof(T));
}
else
{
//store val and assign its index to idx
}
return idx;
}
The instantiations executed:
int idx1 = foo<int>(10);
int idx2 = foo<long long>(10);
int idx3 = foo<std::string>(std::string("Hello"));
int idx4 = foo<std::vector<int>>(std::vector<int>{1,2,3,4,5});
In none of these instantiations is the sizeof(T) <= sizeof(int) part of the condition ever shown as executed. And yet in the first instantiation (int) the body of the first branch is indeed executed, as it should be; in none of the others is it shown as executed.
Relevant part of the compilation command line:
/usr/bin/clang++ -g -O0 -Wall -Wextra -fprofile-instr-generate -fcoverage-mapping -target x86_64-pc-linux-gnu -pipe -fexceptions -fvisibility=default -fPIC -DQT_CORE_LIB -DQT_TESTLIB_LIB -I(...) -std=c++17 -o test.o -c test.cpp
Relevant part of the linker command line:
/usr/bin/clang++ -Wl,-m,elf_x86_64,-rpath,/home/michael/Qt/5.11.2/gcc_64/lib -L/home/michael/Qt/5.11.2/gcc_64/lib -fprofile-instr-generate -fcoverage-mapping -target x86_64-pc-linux-gnu -o testd test.o -lpthread -fuse-ld=lld
When the condition is extracted into its own function, both the int and long long instantiations are correctly shown in the coverage as executing the sizeof(T) <= sizeof(int) part. What might be causing such behaviour, and how can it be solved? Is it a bug in Clang/LLVM coverage?
Any ideas?
EDIT: This seems to be a known bug in LLVM (not clear yet if LLVM-cov or Clang though):
https://bugs.llvm.org/show_bug.cgi?id=36086
https://bugs.chromium.org/p/chromium/issues/detail?id=845575

First of all, sizeof(T) <= sizeof(int) is evaluated at compile time in your code, so chances are that evaluation is simply not profiled for coverage.
Next, of the other three types only long long is trivially copyable, but its size is (highly likely) larger than that of int, so the then-clause is not executed for it. In fact it is not even compiled: since everything happens inside a templated function, the discarded branch of if constexpr is never instantiated.

Related

Optimization flag removing undefined reference to extern variable

Considering the following piece of code :
extern int var;
void foo(int & param)
{
(void) param;
}
int main(void)
{
foo(*(&var));
return 0;
}
Compiled this way :
$ g++ -Os -o test.o -c test.cpp
$ g++ test.o
But when I remove the -Os flag, there is an undefined reference to var.
What kind of optimizations are enabled by -Os that skip this undefined reference? (I tried replacing the flag with all the optimizations it enables according to the GCC documentation, but I can't reproduce the behaviour without -Os.)
Another question: when I compile the sample in one go:
$ g++ -c test.c
there is no error even though there is no optimization flag. Why?
After performing some binary search on the flags, the relevant one appears to be -fipa-pure-const, demo here. The description is "Discover which functions are pure or constant. Enabled by default at -O1 and higher.", which presumably includes noticing that foo doesn't actually do anything with param.
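A quick way to see the effect end to end (a sketch, assuming g++ is on the PATH; the /tmp paths are arbitrary): with -Os the call to foo is deleted once the function is discovered to have no side effects, and with it goes the only reference to var, so the link succeeds despite the missing definition.

```shell
# Write the question's program to a scratch file.
cat > /tmp/ipa_demo.cpp <<'EOF'
extern int var;
void foo(int & param)
{
    (void) param;
}
int main(void)
{
    foo(*(&var));
    return 0;
}
EOF

# With -Os (which enables -fipa-pure-const), the dead call and the
# reference to `var` are optimized away, so linking succeeds:
g++ -Os -o /tmp/ipa_demo /tmp/ipa_demo.cpp && echo "links with -Os"

# Without optimization the relocation for `var` survives to link time:
g++ -O0 -o /tmp/ipa_demo0 /tmp/ipa_demo.cpp 2>/dev/null \
    || echo "undefined reference at -O0"
```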

Why does g++ 11 trigger "maybe uninitialized" warning in connection with this pointer casting

After a compiler update from g++ v7.5 (Ubuntu 18.04) to v11.2 (Ubuntu 22.04), the following code triggers the maybe-uninitialized warning:
#include <cstdint>
void f(std::uint16_t v)
{
(void) v;
}
int main()
{
unsigned int pixel = 0;
std::uint16_t *f16p = reinterpret_cast<std::uint16_t*>(&pixel);
f(*f16p);
}
Compiling on Ubuntu 22.04, g++ v11.2 with -O3 -Wall -Werror.
The example code is a reduced form of a real use case, and its ugliness is not the issue here.
Since we have -Werror enabled, it leads to a build error, so I'm trying to figure out how to deal with it right now. Is this an instance of this gcc bug, or is there another explanation for the warning?
https://godbolt.org/z/baaMTxhae
You don't initialize any object of type std::uint16_t, so one is used uninitialized.
The warning is suppressed by -fno-strict-aliasing, which allows pixel to be accessed through a std::uint16_t lvalue.
Note that -fstrict-aliasing is the default at -O2 or higher.
It could also be fixed with a may_alias pointer:
using aliasing_uint16_t = std::uint16_t [[gnu::may_alias]];
aliasing_uint16_t* f16p = reinterpret_cast<std::uint16_t*>(&pixel);
f(*f16p);
Or you can use std::start_lifetime_as:
static_assert(sizeof(int) >= sizeof(std::uint16_t) && alignof(int) >= alignof(std::uint16_t));
// (C++23, not implemented in g++11 or 12 yet)
auto *f16p = std::start_lifetime_as<std::uint16_t>(&pixel);
// (C++17)
auto *f16p = std::launder(reinterpret_cast<std::uint16_t*>(new (&pixel) char[sizeof(std::uint16_t)]));

gcc c++ how to disable the `-Wno-error=permissive` error when `-fpermissive` and `-Werror` are both on?

For this struct and function:
typedef struct data_s
{
int i1;
int i2;
} data_t;
void print_data_passed_by_ptr(const data_t *data)
{
printf(" i1 = %i\n"
" i2 = %i\n\n",
data->i1,
data->i2);
}
this works fine in C:
// Print R-value struct passed by ptr
print_data_passed_by_ptr(&(data_t){
.i1 = 7,
.i2 = 8,
});
My C build command is:
gcc -Wall -Wextra -Werror -O3 -std=c17 \
struct_pass_R_values_by_cpp_reference_and_ptr.c -o bin/a && bin/a
But, in C++ it fails with
error: taking address of temporary [-fpermissive]
My C++ build command is:
g++ -Wall -Wextra -Werror -O3 -std=c++17 \
struct_pass_R_values_by_cpp_reference_and_ptr.c -o bin/a && bin/a
So, I added -fpermissive to my C++ build command:
g++ -Wall -Wextra -Werror -O3 -std=c++17 -fpermissive \
struct_pass_R_values_by_cpp_reference_and_ptr.c -o bin/a && bin/a
and now the C++ build fails with this:
error: taking address of temporary [-Werror=permissive]
I tried turning off -Werror=permissive with -Wno-error=permissive (see here for that GCC documentation: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html, and search the page for -Wno-error=), but that's not a valid option. New build cmd I attempted:
g++ -Wall -Wextra -Werror -O3 -std=c++17 -fpermissive -Wno-error=permissive \
struct_pass_R_values_by_cpp_reference_and_ptr.c -o bin/a && bin/a
...fails with:
cc1plus: error: -Werror=permissive: no option -Wpermissive
So, how do I solve this to force this C code to build in C++? Either suppressing the warning/error, OR providing some modification to the code other than the one shown just below are acceptable answers. I want the code to compile as C also, not just C++, in the end.
I know I can use "const reference" in C++ instead of ptr, like this, and that's great and all and it might answer somebody else's question, but that's not my question:
void print_data_passed_by_cpp_reference(const data_t& data)
{
printf(" i1 = %i\n"
" i2 = %i\n\n",
data.i1,
data.i2);
}
// Print R-value struct passed by C++ reference
print_data_passed_by_cpp_reference({
.i1 = 9,
.i2 = 10,
});
I also know I can remove -Werror and keep -fpermissive to make it build, with warnings, like this:
eRCaGuy_hello_world/cpp$ g++ -Wall -Wextra -O3 -std=c++17 -fpermissive \
struct_pass_R_values_by_cpp_reference_and_ptr.c -o bin/a && bin/a
struct_pass_R_values_by_cpp_reference_and_ptr.c: In function ‘int main()’:
struct_pass_R_values_by_cpp_reference_and_ptr.c:87:5: warning: taking address of temporary [-fpermissive]
});
^
Hello world.
i1 = 7
i2 = 8
i1 = 9
i2 = 10
...but I'd really like to keep -Werror on and make that warning go away.
How to automatically pass an R-value parameter into a function as a const ptr for C and as a const reference for C++
(emphasis added to my original quote)
So, how do I solve this to force this C code to build in C++? Either suppressing the warning/error, OR providing some modification to the code other than the one shown just below are acceptable answers. I want the code to compile as C also, not just C++, in the end.
This works! It is one approach. If there are ways to disable the warning/error in gcc via command-line options I'd still like to know those though.
This is pretty clever I think. It passes the R-value by const ptr for C and by const reference for C++ by using two separate definitions for the print_data() function and the DATA_T macro, depending on the language.
#ifndef __cplusplus
// For C
void print_data(const data_t *data)
{
printf(" i1 = %i\n"
" i2 = %i\n\n",
data->i1,
data->i2);
}
#else
// For C++
void print_data(const data_t& data)
{
printf(" i1 = %i\n"
" i2 = %i\n\n",
data.i1,
data.i2);
}
#endif
#ifndef __cplusplus
// For C
#define DATA_T &(data_t)
#else
// For C++
#define DATA_T // leave empty
#endif
Usage:
// Print R-value struct passed by C++ reference, OR by C ptr, depending on
// whether this code is compiled as C or C++
print_data(DATA_T{
.i1 = 9,
.i2 = 10,
});
Build commands:
# For C
gcc -Wall -Wextra -Werror -O3 -std=c17 \
struct_pass_R_values_by_cpp_reference_and_ptr.c -o bin/a && bin/a
# For C++
g++ -Wall -Wextra -Werror -O3 -std=c++17 \
struct_pass_R_values_by_cpp_reference_and_ptr.c -o bin/a && bin/a

Detecting this undefined behavior in gcc/clang?

I am trying to detect the following undefined behavior:
% cat undef.cxx
#include <iostream>
class C
{
int I;
public:
int getI() { return I; }
};
int main()
{
C c;
std::cout << c.getI() << std::endl;
return 0;
}
For some reason all my naive attempts have failed so far:
% g++ -Wall -pedantic -o undef -fsanitize=undefined undef.cxx && ./undef
21971
same goes for:
% clang++ -Weverything -o undef -fsanitize=undefined undef.cxx && ./undef
0
Is there a way to use a magic flag in gcc/clang to report a warning/error for the above code at compile time ? at run time ?
References:
% g++ --version
g++ (Debian 10.2.1-6) 10.2.1 20210110
and
% clang++ --version
Debian clang version 11.0.1-2
It turns out my g++ version handles it just fine; all I was missing was the optimization flag:
% g++ -O2 -Wall -pedantic -o undef -fsanitize=undefined undef.cxx && ./undef
undef.cxx: In function ‘int main()’:
undef.cxx:7:25: warning: ‘c.C::I’ is used uninitialized in this function [-Wuninitialized]
7 | int getI() { return I; }
| ^
0
This is clearly documented upstream:
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wuninitialized
Because these warnings depend on optimization, the exact variables or
elements for which there are warnings depend on the precise
optimization options and version of GCC used.
Here is upstream 'meta'-bug to track all those related issues:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639

C++ operator overload performance issue

Consider following scheme. We have 3 files:
main.cpp:
#include <cstdio>
#include <ctime>
#include "class.h"
int main() {
clock_t begin = clock();
int a = 0;
for (int i = 0; i < 1000000000; ++i) {
a += i;
}
clock_t end = clock();
printf("Number: %d, Elapsed time: %f\n",
a, double(end - begin) / CLOCKS_PER_SEC);
begin = clock();
C b(0);
for (int i = 0; i < 1000000000; ++i) {
b += C(i);
}
end = clock();
printf("Number: %d, Elapsed time: %f\n",
a, double(end - begin) / CLOCKS_PER_SEC);
return 0;
}
class.h:
#include <iostream>
struct C {
public:
int m_number;
C(int number);
void operator+=(const C & rhs);
};
class.cpp
#include "class.h"
C::C(int number)
: m_number(number)
{
}
void
C::operator+=(const C & rhs) {
m_number += rhs.m_number;
}
Files are compiled using clang++ with flags -std=c++11 -O3.
What I expected were very similar performance results, since I thought the compiler would optimize the operator calls away rather than emit real function calls. The reality, though, was a bit different; here is the result:
Number: -1243309312, Elapsed time: 0.000003
Number: -1243309312, Elapsed time: 5.375751
I played around a bit and found out that if I paste all of the code from class.* into main.cpp, the speed improves dramatically and the results are very similar:
Number: -1243309312, Elapsed time: 0.000003
Number: -1243309312, Elapsed time: 0.000003
Then I realized that this behavior is probably caused by the fact that the compilation of main.cpp and class.cpp is completely separate, and therefore the compiler is unable to perform adequate optimizations.
My question: Is there any way of keeping the 3-file scheme and still achieving the optimization level as if the files were merged into one and then compiled? I have read something about 'unity builds', but that seems like overkill.
Short answer
What you want is link time optimization. Try the answer from this question. I.e., try:
clang++ -O4 -emit-llvm main.cpp -c -o main.bc
clang++ -O4 -emit-llvm class.cpp -c -o class.bc
llvm-link main.bc class.bc -o all.bc
opt -std-compile-opts -std-link-opts -O3 all.bc -o optimized.bc
clang++ optimized.bc -o yourExecutable
You should see that your performance reaches the one that you had when pasting everything into main.cpp.
Long answer
The problem is that the compiler cannot inline your overloaded operator during linking, because it no longer has its definition in a form which it can use to inline it (it cannot inline bare machine code). Thus, the operator call in main.cpp will stay a real function call to the function declared in class.cpp. A function call is very expensive in comparison to a simple inlined addition which can be optimized further (e.g., vectorized).
When you enable link time optimization, the compiler is able to do this. As you see above, you first create llvm intermediate representation byte code (the .bc files, which I will simply call llvm code hereinafter) instead of machine code.
You then link these files to a new .bc file which still contains llvm code instead of machine code. In contrast to machine code, the compiler is able to perform inlining on llvm code. opt is the llvm optimizer (be sure to install llvm), which performs the inlining and further link time optimizations. Then, we call clang++ a final time to generate executable machine code from the optimized llvm code.
For People with GCC
The answer above is only for clang. GCC (g++) users must use the -flto flag during compilation and during linking to enable link time optimization. It is simpler than with clang, simply add -flto everywhere:
g++ -c -O2 -flto main.cpp
g++ -c -O2 -flto class.cpp
g++ -o myprog -flto -O2 main.o class.o
The technique you are looking for is called Link Time Optimization.
From the timing data, it is obvious that the compiler doesn't just generate better code for the trivial case: it doesn't execute any code at all to sum up a billion numbers. That doesn't happen in real life. You are not performing a useful benchmark. You want to test code that is at least complicated enough to avoid stupid/clever tricks like this.
I'd re-run the test, but change the loop to
for (int i = 0; i < 1000000000; ++i) if (i != 1000000) {
// ...
}
so that the compiler is forced to actually add up the numbers.