I have a fairly large C++ code (multiple hpp and cpp files). It compiles properly. However, when I run, I get this error:
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: basic_string
Abort trap: 6
I understand that this error indicates an out-of-range error on a string, but how do I know where this error occurs? Is there a way to print out the file and line number that is throwing this error? I'm not worried about solving the issue yet, just about finding the offending line.
I'm trying to use the gfortran option -fcheck=bounds,pointer to look for runtime errors in some code. What do the error reports look like, and where/when do they appear? Are they written to standard error, or output, or some file? Are they written and flushed at the point of occurrence, or at the end of execution? Does an error report terminate execution?
In reverse order:
An error report does terminate execution.
That makes it moot whether they're buffered or flushed immediately.
They're written to standard error.
Passing an invalid pointer to a routine looks like this:
At line 556 of file ../../topslave.F
Fortran runtime error: Pointer actual argument 'buffer2' is not associated
In Crash Reporter
1. What's the difference between an Exception Type and an Exception Code?
2. Why are there multiple codes?
3. Where should I look to decode them?
4. How does an EXC_BREAKPOINT exception gain two error codes when INT3 doesn't look like it takes an error code?
For example I have this one right now:
Exception Type: EXC_BREAKPOINT (SIGTRAP)
Exception Codes: 0x0000000000000001, 0x0000000000000000
I've done some digging ... Possibly the exception codes are listed here:
https://opensource.apple.com/source/xnu/xnu-2050.22.13/osfmk/mach/kern_return.h
Apple helpfully provide this note which says:
"Exception Codes: Processor specific information about the exception
encoded into one or more 64-bit hexadecimal numbers. Typically, this
field will not be present because the Crash Reporter parses the
exception codes to present them as a human-readable description in the
other fields."
https://developer.apple.com/library/content/technotes/tn2151/_index.html
Well sure - but it's present in my crash report log.
Maybe these are the codes Intel describe here:
"An exception is an event that typically occurs when an instruction
causes an error. For example, an attempt to divide by zero generates
an exception. However, some exceptions, such as breakpoints, occur
under other conditions. Some types of exceptions may provide error
codes. An error code reports additional information about the error.
An example of the notation used to show an exception and error code is
shown below: #PF(fault code)"
But Vol 2A 3-471 of the Intel processor manual says that the breakpoint exception doesn't take a fault code.
So I'm lost.
I tried to run the makefile from https://github.com/nasadi/Zambezi. It shows an error like this:
In file included from src/driver/buildContiguous.c:7:0:
src/shared/dictionary/Dictionary.h: In function ‘readDictionary’:
src/shared/dictionary/Dictionary.h:132:8: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
fread(&id, sizeof(int), 1, fp);
Can anyone help me run the program? Do I need to install any packages? I am new to C programming.
In fact, this is not an error, it is a warning. When the compiler emits a warning, it means that the code is valid and compiles, but may contain a logic error.
In your case the compiler says that the return value of the fread function is not examined. Ignoring it can lead to a situation where, for example, the end of the file is reached but the program is unaware of it and continues execution. The variable that was supposed to be read from the file then holds a wrong value, and such invalid values may cause the program to crash later on.
To summarize: if there are no other errors, your program has compiled successfully and can be run.
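To make the point about the unexamined return value concrete, here is a minimal sketch of a checked read. The helper name readInt is hypothetical and is not part of the Zambezi sources; it only illustrates comparing the value returned by fread with the number of items requested.

```cpp
#include <cstdio>

// Hypothetical helper: reads one int from fp into *out.
// Returns true only if a complete int was actually read.
bool readInt(std::FILE* fp, int* out) {
    // fread returns the number of complete items read (here 0 or 1).
    if (std::fread(out, sizeof(int), 1, fp) != 1) {
        if (std::feof(fp)) {
            std::fprintf(stderr, "unexpected end of file\n");
        } else {
            std::fprintf(stderr, "read error\n");
        }
        return false;
    }
    return true;
}
```

A caller would then write something like if (!readInt(fp, &id)) { /* stop or report */ } instead of the bare fread call. Using the return value this way should also make the -Wunused-result warning go away.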
How does one determine where the mistake is in the code that causes a segmentation fault?
Can my compiler (gcc) show the location of the fault in the program?
GCC can't do that, but GDB (a debugger) sure can. Compile your program using the -g switch, like this:
gcc program.c -g
Then use gdb:
$ gdb ./a.out
(gdb) run
<segfault happens here>
(gdb) backtrace
<offending code is shown here>
Here is a nice tutorial to get you started with GDB.
Where the segfault occurs is generally only a clue as to where "the mistake which causes" it is in the code. The given location is not necessarily where the problem resides.
Also, you can give valgrind a try: if you install valgrind and run
valgrind --leak-check=full <program>
then it will run your program and display stack traces for any segfaults, as well as any invalid memory reads or writes and memory leaks. It's really quite useful.
You could also use a core dump and then examine it with gdb. To get useful information you also need to compile with the -g flag.
Whenever you get the message:
Segmentation fault (core dumped)
a core file is written, typically into your current directory (the exact location depends on system configuration). You can examine it with the command
gdb your_program core_file
The file contains the state of the memory when the program crashed. A core dump can be useful during the deployment of your software.
Make sure your system doesn't set the core dump file size to zero. You can set it to unlimited with:
ulimit -c unlimited
Careful though: core dumps can become huge.
There are a number of tools available which help debugging segmentation faults and I would like to add my favorite tool to the list: Address Sanitizers (often abbreviated ASAN).
Modern¹ compilers come with the handy -fsanitize=address flag, which adds more error checking at the cost of some compile-time and run-time overhead.
According to the documentation these checks include catching segmentation faults by default. The advantage here is that you get a stack trace similar to gdb's output, but without running the program inside a debugger. An example:
int main() {
    volatile int *ptr = (int*)0;
    *ptr = 0;
}
$ gcc -g -fsanitize=address main.c
$ ./a.out
AddressSanitizer:DEADLYSIGNAL
=================================================================
==4848==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x5654348db1a0 bp 0x7ffc05e39240 sp 0x7ffc05e39230 T0)
==4848==The signal is caused by a WRITE memory access.
==4848==Hint: address points to the zero page.
#0 0x5654348db19f in main /tmp/tmp.s3gwjqb8zT/main.c:3
#1 0x7f0e5a052b6a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x26b6a)
#2 0x5654348db099 in _start (/tmp/tmp.s3gwjqb8zT/a.out+0x1099)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /tmp/tmp.s3gwjqb8zT/main.c:3 in main
==4848==ABORTING
The output is slightly more complicated than what gdb would output but there are upsides:
There is no need to reproduce the problem to receive a stack trace. Simply enabling the flag during development is enough.
ASANs catch a lot more than just segmentation faults. Many out of bounds accesses will be caught even if that memory area was accessible to the process.
¹ That is Clang 3.1+ and GCC 4.8+.
All of the above answers are correct and recommended; this answer is intended only as a last-resort if none of the aforementioned approaches can be used.
If all else fails, you can always recompile your program with various temporary debug-print statements (e.g. fprintf(stderr, "CHECKPOINT REACHED # %s:%i\n", __FILE__, __LINE__);) sprinkled throughout what you believe to be the relevant parts of your code. Then run the program, and observe what the last debug-print printed just before the crash was -- you know your program got that far, so the crash must have happened after that point. Add or remove debug-prints, recompile, and run the test again, until you have narrowed it down to a single line of code. At that point you can fix the bug and remove all of the temporary debug-prints.
It's quite tedious, but it has the advantage of working just about anywhere -- the only times it might not is if you don't have access to stdout or stderr for some reason, or if the bug you are trying to fix is a race-condition whose behavior changes when the timing of the program changes (since the debug-prints will slow down the program and change its timing).
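As a rough sketch of the debug-print approach, the fprintf call can be wrapped in a macro so that each checkpoint is a single short line. The macro name CHECKPOINT and the function sumValues are made up for this illustration. The macro has to be a macro rather than a function so that __FILE__ and __LINE__ expand at the place where it is used.

```cpp
#include <cstdio>

// Hypothetical macro: prints the current source file and line to stderr.
// stderr is typically unbuffered, so the message should appear even if
// the program crashes immediately afterwards.
#define CHECKPOINT() \
    std::fprintf(stderr, "CHECKPOINT REACHED # %s:%i\n", __FILE__, __LINE__)

// Hypothetical function under suspicion, instrumented before and after
// the loop that is believed to crash.
int sumValues(const int* data, int count) {
    CHECKPOINT();
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        sum += data[i];
    }
    CHECKPOINT();
    return sum;
}
```

If only the first checkpoint shows up before the crash, the fault lies somewhere in the loop; moving the checkpoints closer together narrows it down further.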
Lucas's answer about core dumps is good. In my .cshrc I have:
alias core 'ls -lt core; echo where | gdb -core=core -silent; echo "\n"'
to display the backtrace by entering 'core'. And the date stamp, to ensure I am looking at the right file :(.
Added: If there is a stack corruption bug, then the backtrace applied to the core dump is often garbage. In this case, running the program within gdb can give better results, as per the accepted answer (assuming the fault is easily reproducible). And also beware of multiple processes dumping core simultaneously; some OS's add the PID to the name of the core file.
This is a crude way to find the exact line after which there was the segmentation fault.
Define line logging function
#include <iostream>
// Print the line number; std::endl flushes so the output survives a crash.
void log(int line) {
    std::cout << line << std::endl;
}
Find and replace every semicolon after the log function with "; log(__LINE__);"
Make sure to remove the log calls that this inserts in place of the semicolons inside for (;;) loop headers.
If you have a reproducible exception like segmentation fault, you can use a tool like a debugger to reproduce the error.
I have used the following approach to find the source code location even for non-reproducible errors. It's based on the Microsoft compiler tool chain, but the idea behind it is general.
1. Save the MAP file for each binary (DLL, EXE) before you give it to the customer.
2. If an exception occurs, look up the address in the MAP file and determine the function whose start address is just below the exception address. As a result you know the function where the exception occurred.
3. Subtract the function start address from the exception address. The result is the offset in the function.
4. Recompile the source file containing the function with assembly listing enabled. Extract the function's assembly listing.
5. The assembly includes the offset of each instruction in the function. Look up the source code line that matches the offset in the function.
6. Evaluate the assembler code for the specific source code line. The offset points exactly to the assembler instruction that caused the thrown exception. Evaluate the code of this single source code line. With a bit of experience with the compiler output you can say what caused the exception.
7. Be aware that the reason for the exception might be at a totally different location, e.g. the code dereferenced a NULL pointer, but the actual reason why the pointer is NULL can be somewhere else.
The steps 6. and 7. are beneficial since you asked only for the line of code. But I recommend that you should be aware of it.
I hope you get a similar environment with the GCC compiler for your platform. If you don't have a usable MAP file, use the tool chain tools to get the addresses of the functions. I am sure the ELF file format supports this.
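The lookup and subtraction described above (finding the function whose start address is closest below the exception address, then computing the offset into it) can be sketched roughly as follows. The Symbol and Location types and the locate function are hypothetical names for this illustration. The sketch assumes the MAP file has already been parsed into a list sorted by start address, and it does not read any real MAP file format.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// One function entry taken from a MAP file (hypothetical in-memory layout).
struct Symbol {
    std::uint64_t start;  // start address of the function
    std::string name;
};

// Result of the lookup: the enclosing function and the offset into it.
struct Location {
    std::string function;
    std::uint64_t offset;  // exception address minus function start
};

// Finds the symbol with the greatest start address that is not above
// `address`. `symbols` must be sorted by ascending start address.
// Returns false if the address lies below the first symbol.
bool locate(const std::vector<Symbol>& symbols, std::uint64_t address,
            Location* out) {
    // upper_bound yields the first symbol that starts after `address`;
    // the symbol just before it is the one containing the address.
    auto it = std::upper_bound(
        symbols.begin(), symbols.end(), address,
        [](std::uint64_t addr, const Symbol& s) { return addr < s.start; });
    if (it == symbols.begin()) {
        return false;
    }
    --it;
    out->function = it->name;
    out->offset = address - it->start;
    return true;
}
```

The resulting offset is the number to look for in the function's assembly listing. Note that the sketch cannot tell whether the address lies past the end of the last function, since it only knows start addresses.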
In case any of you (like me!) were looking for this same question but with gfortran, not gcc, the compiler is much more powerful these days and before resorting to the use of the debugger, you can also try out these compile options. For me, this identified exactly the line of code where the error occurred and which variable I was accessing out of bounds to cause the segmentation fault error.
-O0 -g -Wall -fcheck=all -fbacktrace