Illegal Instruction when compiling with -mfma

Illegal Instruction when compiling with -mfma - c++

I am compiling with GCC 5.3.0 on Intel SandyBridge E5-2670. When I use these flags -O3 -DEIGEN_NO_DEBUG -std=c++11 -Wall -Wextra -Werror -march=native -ffast-math the code runs without error. When I add -mfma I get illegal instruction.
I figured that using -march=native would never produce illegal instructions. I ran the program with gdb and bt but it shows a valid (at least to me) stack so I don't think -mfma exposed a bad pointer or other memory problem.
#0 0x000000000043a59c in ConvexHull::SortConvexHull() ()
#1 0x000000000043badd in ConvexHull::ConvexHull(Eigen::Matrix<double, -1, -1, 0, -1, -1>) ()
#2 0x000000000040b794 in Group::BuildCatElement() ()
#3 0x0000000000416b60 in SurfaceModel::ProcessGroups() ()
#4 0x00000000004435c6 in MainLoop(Inputs&, std::ostream&) ()
#5 0x000000000040494e in main ()
Then I recompiled with debugging (-O0 -g), all other options the same and gdb comes back with
0x00000000004140df in Eigen::internal::pmadd<double __vector(4)>(double __vector(4) const&, double __vector(4) const&, double __vector(4) const&) (a=..., b=..., c=...)
at ./../eigen-eigen-5a0156e40feb/Eigen/src/Core/arch/AVX/PacketMath.h:178
178 __asm__("vfmadd231pd %[a], %[b], %[c]" : [c] "+x" (res) : [a] "x" (a), [b] "x" (b));
The backtrace shows that the error starts at line 259
using namespace Eigen;
252 gridPnts.rowwise() -= gridPnts.colwise().mean(); //gridPnts is MatrixXd (X by 3)
253 Matrix3d S = gridPnts.transpose() * gridPnts;
254 S /= static_cast<double>(gridPnts.rows() - 1);
255 Eigen::SelfAdjointEigenSolver<MatrixXd> es(S);
256 Eigen::Matrix<double, 3, 2> trans;
257 trans = es.eigenvectors().block<3, 2>(0, 1);
258 MatrixXd output(gridPnts.rows(), 2);
259 output = gridPnts * trans;
The point of compiling with -mfma was to see if I could improve performance. Is this a bug in Eigen or more likely did I use it incorrectly?

To debug illegal instruction you should first of all look at the disassembly, not backtrace or source code. In your case though, even from the source code you can easily see that the offending (illegal) instruction is vfmadd231pd, which is from the FMA instruction set extension. But SandyBridge CPUs, one of which you have, don't support this ISA extension, so by enabling it in the compiler you have shot yourself in the foot.
On Linux you can check whether your CPU supports FMA by this shell command:
grep -q '\<fma\>' /proc/cpuinfo && echo supported || echo not supported

-mfma adds the FMA instruction set to the set of allowed instructions. You need at least an Intel-Haswell or AMD-Piledriver CPU for that.
Adding -mInstructionSet additionally to -march=native will never help -- either it was included already or it will allow the compiler to use illegal instructions (on your CPU).

Related

How do you find out the cause of rare crashes that are caused by things that are not caught by try catch (access violation, divide by zero, etc.)?

I am a .NET programmer who is starting to dabble into C++. In C# I would put the root function in a try catch, this way I would catch all exceptions, save the stack trace, and this way I would know what caused the exception, significantly reducing the time spent debugging.
But in C++ some stuff(access violation, divide by zero, etc.) are not caught by try catch. How do you deal with them, how do you know which line of code caused the error?
For example let's assume we have a program that has 1 million lines of code. It's running 24/7, has no user-interaction. Once in a month it crashes because of something that is not caught by try catch. How do you find out which line of code caused the crash?
Environment: Windows 10, MSVC.

C++ is meant to be a high performance language and checks are expensive. You can't run at C++ speeds and at the same time have all sorts of checks. It is by design.
Running .Net this way is akin to running C++ in debug mode with sanitizers on. So if you want to run your application with all the information you can, turn on debug mode in your cmake build and add sanitizers, at least undefined and address sanitizers.
For Windows/MSVC it seems that address sanitizers were just added in 2021. You can check the announcement here: https://devblogs.microsoft.com/cppblog/addresssanitizer-asan-for-windows-with-msvc/
For Windows/mingw or Linux/* you can use Gcc and Clang's builtin sanitizers that have largely the same usage/syntax.
To set your build to debug mode:
cd <builddir>
cmake -DCMAKE_BUILD_TYPE=debug <sourcedir>
To enable sanitizers, add this to your compiler command line: -fsanitize=address,undefined
One way to do that is to add it to your cmake build so altogether it becomes:
cmake -DCMAKE_BUILD_TYPE=debug \
-DCMAKE_CXX_FLAGS_DEBUG_INIT="-fsanitize=address,undefined" \
<sourcedir>
Then run your application binary normally as you do. When an issue is found a meaningful message will be printed along with a very informative stack trace.
Alternatively you can set so the sanitizer breaks inside the debugger (gdb) so you can inspect it live but that only works with the undefined sanitizer. To do so, replace
-fsanitize=address,undefined
with
-fsanitize-undefined-trap-on-error -fsanitize-trap=undefined -fsanitize=address
For example, this code has a clear problem:
void doit( int* p ) {
*p = 10;
}
int main() {
int* ptr = nullptr;
doit(ptr);
}
Compile it in the optimized way and you get:
$ g++ -O3 test.cpp -o test
$ ./test
Segmentation fault (core dumped)
Not very informative. You can try to run it inside the debugger but no symbols are there to see.
$ g++ -O3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
...
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) r
Starting program: /tmp/test
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555044 in main ()
(gdb)
That's useless so we can turn on debug symbols with
$ g++ -g3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
...
Reading symbols from ./test...
(gdb) r
Starting program: /tmp/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
test.cpp:4:5: runtime error: store to null pointer of type 'int'
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555259 in doit (p=0x0) at test.cpp:4
4 *p = 10;
Then you can inspect inside:
(gdb) p p
$1 = (int *) 0x0
Now, turn on sanitizers to get even more messages without the debugger:
$ g++ -O0 -g3 test.cpp -fsanitize=address,undefined -o test
$ ./test
test.cpp:4:5: runtime error: store to null pointer of type 'int'
AddressSanitizer:DEADLYSIGNAL
=================================================================
==931717==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x563b7b66c259 bp 0x7fffd167c240 sp 0x7fffd167c230 T0)
==931717==The signal is caused by a WRITE memory access.
==931717==Hint: address points to the zero page.
#0 0x563b7b66c258 in doit(int*) /tmp/test.cpp:4
#1 0x563b7b66c281 in main /tmp/test.cpp:9
#2 0x7f36164a9082 in __libc_start_main ../csu/libc-start.c:308
#3 0x563b7b66c12d in _start (/tmp/test+0x112d)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /tmp/test.cpp:4 in doit(int*)
==931717==ABORTING
That is much better!

clang and clang++ with ASAN generate different output

I'm trying to add ASAN (Google's/Clang's address sanitize) to our project and stuck at this problem.
For example, we have this simple C++ code
#include <iostream>
int main() {
std::cout << "Started Program\n";
int* i = new int();
*i = 42;
std::cout << "Expected i: " << *i << std::endl;
}
Then, I build it with clang++
clang++-3.8 -o memory-leak++ memory_leak.cpp -fsanitize=address -fno-omit-frame-pointer -g
The program gives this output
Started Program
Expected i: 42
=================================================================
==14891==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x4f2040 in operator new(unsigned long) (memory-leak+++0x4f2040)
#1 0x4f4f00 in main memory_leak.cpp:4:11
#2 0x7fae13ce6f44 in __libc_start_main /build/eglibc-SvCtMH/eglibc-2.19/csu/libc-start.c:287
SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
Cool, it works and symbolizer gives meaningful information too.
Now, I build this with clang
clang-3.8 -o memory-leak memory_leak.cpp -std=c++11 -fsanitize=address -fno-omit-frame-pointer -g -lstdc++
And the program gives this output
Started Program
Expected i: 42
=================================================================
==14922==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x4c3bc8 in malloc (memory-leak+0x4c3bc8)
#1 0x7f024a8e4dac in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x5edac)
#2 0x7f0249998f44 in __libc_start_main /build/eglibc-SvCtMH/eglibc-2.19/csu/libc-start.c:287
SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
Ok, it detects memory leak, but the stack trace looks strange and it doesn't really include the memory_leak.cpp:4:11 line.
I've spent quite a while trying to narrow down this problem in our codebase and eventually, the only difference, is clang vs clang++.
Why it's event a problem, can't we use clang++?
We use bazel, which uses CC compiler instead of CXX for some blah-balh reasons. We cannot blindly force to use it CXX because we have CC dependencies which cannot be build by CXX. So...
Any idea how to get the same ASAN output when used with clang and clang++? Or, how to make Bazel to use clang++ for C++ targets and clang for C targets?

This seems to be a bug in Clang, could you file bug report in their tracker? (EDIT: this was [resolved as not-a-bug](Asan developers https://github.com/google/sanitizers/issues/872) so probly needs to be fixed by Bazel developers instead).
Some details: when you use ordinary clang, it decides not to link C++ part of Asan runtime as can be seen in Tools.cpp:
if (SanArgs.linkCXXRuntimes())
StaticRuntimes.push_back("asan_cxx");
and SanitizerArgs.cpp:
LinkCXXRuntimes =
Args.hasArg(options::OPT_fsanitize_link_cxx_runtime) || D.CCCIsCXX();
(note the D.CCCIsCXX part, it checks for clang vs. clang++ whereas instead they need to check the file type).
C++ part of the runtime contains interceptor for operator new so this would explain why it's missing when you link with clang instead of clang++. On a positive side, you should be able to work around this by adding -fsanitize-link-c++-runtime to your flags.
As for the borked stack, by default Asan unwinds stack with frame pointer based unwinder which has problems unwinding through code which wasn't built with -fno-omit-frame-pointer (like libstdc++.so in your case). See e.g. this answer for another example of such behavior and available workarounds.

Is sse2 enabled by default in g++?

When I run g++ -Q --help=target, I get
-msse2 [disabled].
However, if I create the assembly code of with default options as
g++ -g mycode.cpp -o mycode.o; objdump -S mycode.o > default,
and a sse2 version with
g++ -g -msse2 mycode.cpp -o mycode.sse2.o; objdump -S mycode.sse2.o > sse2,
and finally a non-sse2 version with
g++ -g -mno-sse2 mycode.cpp -o mycode.nosse2.o; objdump -S mycode.nosse2.o > nosse2
I see basically no difference between default and sse2, but a big difference between default and nosse2, so this tells me that, by default, g++ is using sse2 instructions, even though I am being told it is disabled ... what is going on here?
I am compiling on a Xeon E5-2680 under Linux with gcc-4.4.7 if it matters.

If you are compiling for 64bit, then this is totally fine and documented behavior.
As stated in the gcc docs the SSE instruction set is enabled by default when using an x86-64 compiler:
-mfpmath=unit
Generate floating point arithmetics for selected unit unit. The choices for unit are:
`387'
Use the standard 387 floating point coprocessor present majority of chips and emulated otherwise. Code compiled with this option will run almost everywhere. The temporary results are computed in 80bit precision instead of precision specified by the type resulting in slightly different results compared to most of other chips. See -ffloat-store for more detailed description.
This is the default choice for i386 compiler.
`sse'
Use scalar floating point instructions present in the SSE instruction set. This instruction set is supported by Pentium3 and newer chips, in the AMD line by Athlon-4, Athlon-xp and Athlon-mp chips. The earlier version of SSE instruction set supports only single precision arithmetics, thus the double and extended precision arithmetics is still done using 387. Later version, present only in Pentium4 and the future AMD x86-64 chips supports double precision arithmetics too.
For the i386 compiler, you need to use -march=cpu-type, -msse or -msse2 switches to enable SSE extensions and make this option effective. For the x86-64 compiler, these extensions are enabled by default.
The resulting code should be considerably faster in the majority of cases and avoid the numerical instability problems of 387 code, but may break some existing code that expects temporaries to be 80bit.
This is the default choice for the x86-64 compiler.

Using -fsanitize=memory with clang on linux with libstdc++

With the system supplied libstdc++ the clang memory sanitizer is basically unusable due to false positives - eg the code below fails.
#include <iostream>
#include <fstream>
int main(int argc, char **argv)
{
double foo = 1.2;
std::ofstream out("/tmp/junk");
auto prev = out.flags(); //false positive here
out.setf(std::ios::scientific);
out << foo << std::endl;
out.setf(prev);
}
Building libstdc++ as described here:
https://code.google.com/p/memory-sanitizer/wiki/InstrumentingLibstdcxx
and running it like so:
LD_LIBRARY_PATH=~/goog-gcc-4_8/build/src/.libs ./msan_example
gives the foolowing output
/usr/bin/llvm-symbolizer: symbol lookup error: /home/hal/goog-gcc-4_8/build/src/.libs/libstdc++.so.6: undefined symbol: __msan_init
Both centos7 (epel clang) and ubuntu fail in this manner.
Is there something I'm doing wrong?
Previous stack overflow question Using memory sanitizer with libstdc++
edit
Using #eugins suggestion, compile command line for this code is:
clang++ -std=c++11 -fPIC -O3 -g -fsanitize=memory -fsanitize-memory-track-origins -fPIE -fno-omit-frame-pointer -Wall -Werror -Xlinker --build-id -fsanitize=memory -fPIE -pie -Wl,-rpath,/home/hal/goog-gcc-4_8/build/src/.libs/libstdc++.so test_msan.cpp -o test_msan
$ ./test_msan
==19027== WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f66df377d4e in main /home/hal/tradingsystems/test_msan.cpp:9
#1 0x7f66ddefaaf4 in __libc_start_main (/lib64/libc.so.6+0x21af4)
#2 0x7f66df37780c in _start (/home/hal/tradingsystems/test_msan+0x7380c)
Uninitialized value was created by an allocation of 'out' in the stack frame of function 'main'
#0 0x7f66df377900 in main /home/hal/tradingsystems/test_msan.cpp:6
SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/hal/tradingsystems/test_msan.cpp:9 main
Exiting

MSan spawns llvm-symbolizer process to translate stack trace PCs into function names and file/line numbers. Because of the LD_LIBRARY_PATH setting, the instrumented libstdc++ is loaded into both the main MSan process (which is good) and the llvm-symbolizer process (which won't work).
Preferred way of dealing with it is though RPATH setting (at link time):
-Wl,-rpath,/path/to/libstdcxx_msan
You could also check this msan/libc++ guide which is more detailed and up-to-date:
https://code.google.com/p/memory-sanitizer/wiki/LibcxxHowTo

Using libc++ causes GDB to segfault on OS X

I'm trying to use C++11 (with Clang and libc++ on OS X) for a program, but whenever I debug with gdb and try to inspect standard containers, gdb segfaults. Here's a minimal example:
file.cpp:
#include <iostream>
#include <string>
int main(int argc, char* argv[])
{
std::string str = "Hello world";
std::cout << str << std::endl; // Breakpoint here
}
If I compile for C++11 using the following:
$ c++ --version
Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)
Target: x86_64-apple-darwin12.4.0
Thread model: posix
$
$ c++ -ggdb -std=c++11 -stdlib=libc++ -Wall -pedantic -O0 -c file.cpp
$ c++ -ggdb -std=c++11 -stdlib=libc++ -Wall -pedantic -O0 file.o -o program
And then debug as follows, it crashes when I try to p str.size():
$ gdb program
GNU gdb 6.3.50-20050815 (Apple version gdb-1824) (Wed Feb 6 22:51:23 UTC 2013)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done
(gdb) br file.cpp:8
Breakpoint 1 at 0x100000d80: file file.cpp, line 8.
(gdb) run
Starting program: /Users/mjbshaw/School/cs6640/2/program
Reading symbols for shared libraries ++............................. done
Breakpoint 1, main (argc=1, argv=0x7fff5fbffab0) at file.cpp:8
8 std::cout << str << std::endl; // Breakpoint here
(gdb) p str.size()
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
std::__1::operator<< <char, std::__1::char_traits<char>, std::__1::allocator<char> > (__os=#0x7fff5fc3d628, __str=#0x1) at string:1243
1243
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__is_long() const) will be abandoned.
If I don't run this in gdb, I get no crash and it works fine (but I need gdb to debug my program). Also, if I remove -std=c++11 -stdlib=libc++ from the compiling options, then it works fine (even in gdb), but I need C++11 for my program.
Are there some known issues with gdb and C++11 (specifically libc++)? I know libc++ and libstdc++ can cause issues if used together, but I'm not trying to use them together (at least not consciously; all I want to use is libc++). Am I specifying some compilation options wrong? Is there a way to properly compile for C++11 on OS X and still be able to debug properly?

GDB 6.3 is almost nine years old. That's just about eternity in Internet years. The product has improved greatly since then. Updating to the last stable release is a must for every developer.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js