I'm writing this program and when I build it, it runs OK. But when I turn on the debugger I get this message:
Program received signal SIGSEGV, Segmentation fault.
In ntdll!LdrWx86FormatVirtualImage () (C:\Windows\system32\ntdll.dll)
#0 7788E3C6 ntdll!LdrWx86FormatVirtualImage() (C:\Windows\system32\ntdll.dll:??)
#1 ?? ?? () (??:??)
Could you tell me what these errors mean?
It's a segmentation fault.
The probable culprit is passing an invalid pointer around (for example, a pointer to a buffer that is too small) or trying to dereference a null pointer.
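To make the two causes above concrete, here is a minimal sketch of a copy routine that refuses both a null pointer and a destination that is too small. The name copy_checked is hypothetical and not taken from the question; it only illustrates the kind of check whose absence typically ends in a SIGSEGV.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst only when both pointers are valid
// and dst has room for the string plus its terminating '\0'.
bool copy_checked(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || src == nullptr) {
        return false;  // dereferencing either one would be a null-pointer access
    }
    const std::size_t needed = std::strlen(src) + 1;
    if (needed > dst_size) {
        return false;  // copying would write past the end of dst
    }
    std::memcpy(dst, src, needed);
    return true;
}
```

With a 4-byte buffer, copy_checked(buf, sizeof buf, "hello") returns false instead of overrunning the buffer.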
I have a small to medium size application which combines Fortran and C++. The main program is written in Fortran, but one module is in C++. This module returns pointers to class objects which are stored on the Fortran side. During the creation of one of these pointers the system throws the following error:
malloc(): memory corruption
Thread 1 "bc_test" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007ffff4a60801 in __GI_abort () at abort.c:79
#2 0x00007ffff4aa9897 in __libc_message (action=action@entry=do_abort,
fmt=fmt@entry=0x7ffff4bd6b9a "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3 0x00007ffff4ab090a in malloc_printerr (
str=str@entry=0x7ffff4bd4e0e "malloc(): memory corruption") at malloc.c:5350
#4 0x00007ffff4ab4994 in _int_malloc (av=av@entry=0x7ffff4e0bc40 <main_arena>,
bytes=bytes@entry=44) at malloc.c:3738
#5 0x00007ffff4ab72ed in __GI___libc_malloc (bytes=44) at malloc.c:3065
#6 0x00007ffff50bc298 in operator new(unsigned long) ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x0000555555578967 in My_Class::My_Class(this=0x7fffffffd4e0, n=11)
at /home/.../my_class.cpp:20
Using gdb I have found that the error is thrown during a call to new. More specifically during a call to new within the constructor of an object being created via new (a basic new call works as expected). The line throwing the error is the following:
int* test = new int[n];
in this case n is an integer with n=11.
I don't think that the problem is due to a lack of memory as I have only allocated 2 small class instances and a few basic variables at this point. I also believe this would throw a different error if this were the problem.
Unfortunately I haven't managed to create a MWE. I've now run out of ideas of how to fix this problem. What can cause this error? How can it be debugged beyond finding the line throwing the error?
Other Stack Overflow results concerning "malloc(): memory corruption" errors are due to accessing unallocated memory. However, that doesn't seem to be the case here, as it is the allocation call itself which is throwing the error.
Memory corruption errors do not always manifest themselves in the place where the error was committed. As a result the gdb backtrace is often useless for finding the error. Instead a memory analysis/debugging tool such as Valgrind should be used.
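As an illustration of why the crash site and the bug site differ: an earlier write one element past the end of a heap array is silently accepted with raw new[], and can damage the allocator's bookkeeping so that a later, perfectly valid new aborts. The sketch below uses a hypothetical helper, fill_checked, which is not from the question; it performs the same kind of write with bounds-checked access so the mistake is reported where it happens.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: writes `count` values into `buf` using checked access.
// Returns false when a write would land outside the buffer.
bool fill_checked(std::vector<int>& buf, std::size_t count) {
    try {
        for (std::size_t i = 0; i < count; ++i) {
            buf.at(i) = static_cast<int>(i);  // throws std::out_of_range past the end
        }
    } catch (const std::out_of_range&) {
        return false;
    }
    return true;
}
```

For a buffer of 11 elements, fill_checked(buf, 11) succeeds and fill_checked(buf, 12) reports the overrun. For raw arrays, running the program under Valgrind, or building with GCC or Clang using -g -fsanitize=address, should flag the same kind of write at the offending line.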
I am running a cpp program in ubuntu. I am getting a signal SIGSEGV, Segmentation fault.
I tried to use gdb to find the exact line that causes the segmentation fault. I got the idea from this question:
Determine the line of code that causes a segmentation fault?
gdb returns this:
Thread 1 "incremental_sat" received signal SIGSEGV, Segmentation fault.
0x00007ffff7857c50 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(char const*) const () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
Please see the screenshot of the terminal below:
I am using data structures built from vector and unordered_map, holding int and string values.
How can I find out which lines of code are responsible for this error?
I am new to debugging in C++. Any suggestions on how to proceed with this information?
UPDATE: after the suggestion of running "bt" command, I am adding the output:
I observe that the segmentation fault occurs just after calling the cnf_transformation_out_diff() function: not even the first cout statement inside cnf_transformation_out_diff() is printed.
Finally, I resolved the bug with the help of my friend Arpan.
In one scenario, the data structure gates_out_diff remained empty. I hadn't added a safety check, and the program was accessing gates_out_diff[i][1], which resulted in a segmentation fault.
It is running after I fixed that case. It took me one day. Hope it saves someone's time.
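For anyone hitting the same thing, a safety check of this kind could look roughly like the sketch below. The post does not show the real type of gates_out_diff, so the vector-of-vectors alias GateTable and the helper second_field are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed shape of the data; the original post does not show the real type.
using GateTable = std::vector<std::vector<int>>;

// Hypothetical helper: reads gates[i][1] only if both indices are in range.
// Returns false (and leaves `out` untouched) when the element does not exist.
bool second_field(const GateTable& gates, std::size_t i, int& out) {
    if (i >= gates.size() || gates[i].size() < 2) {
        return false;  // empty table, missing row, or row too short
    }
    out = gates[i][1];
    return true;
}
```

Called on an empty table, second_field returns false instead of reading through an invalid index.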
I use gdb as follows to locate a segmentation fault, but it only shows ??, which confuses me. What does it mean? How can I avoid it?
$ gdb program core
...
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000048d0000048c in ?? ()
(gdb) bt
#0 0x0000046a00000469 in ?? ()
#1 0x0000046c0000046b in ?? ()
#2 0x0000046e0000046d in ?? ()
#3 0x000004700000046f in ?? ()
#4 0x0000047300000472 in ?? ()
#5 0x0000047600000475 in ?? ()
#6 0x0000047800000477 in ?? ()
#7 0x0000047a00000479 in ?? ()
#8 0x0000047d0000047b in ?? ()
...
I found that an array was being accessed out of bounds and fixed it. But I am still confused by the phenomenon above.
0x0000048d0000048c
This looks like you've called a function through a function pointer, but that pointer has been overwritten with two integers: 0x48d == 1165 and 0x48c == 1164 (do these values look like something that your program is using?).
You should use bt to tell you how you got there.
You should probably use Valgrind or Address Sanitizer to check for uninitialized or dangling memory and buffer overflow (which are some of the common ways to end up with invalid function pointer).
Update:
Now that you show the stack trace, it's almost 100% certain that you have some local array of integers which you've overflowed (filling it with values like 1129, 1130, 1131, etc.), thus corrupting your stack.
Address Sanitizer (available in recent versions of GCC) should point you straight at where the bug is.
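To see how two consecutive integers turn into one of those bogus addresses, here is a small sketch of the arithmetic, assuming a little-endian 64-bit machine where the int at the lower address becomes the low half of the 8-byte value. The function name as_address is made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Combines two adjacent 32-bit ints the way a little-endian 64-bit machine
// would read them back as a single 8-byte value (lower address = low half).
std::uint64_t as_address(std::uint32_t first, std::uint32_t second) {
    return (static_cast<std::uint64_t>(second) << 32) | first;
}
```

Here as_address(1164, 1165) gives 0x0000048d0000048c, the faulting address, and as_address(1129, 1130) gives 0x0000046a00000469, frame #0 of the backtrace.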
This means that your program crashed in a function unknown to gdb (a function that is not in the symbol table).
Try these two options, in the given order:
If you are debugging a target, make sure that all your code layers are compiled with the -g option if you are using gcc.
You can manually give gdb the symbol table with the command file "binary_with_symbol_table", and it will give you the function and the address of the bug.
Note that many exceptions may be hidden behind a segmentation fault.
My program recently crashed with the following stack:
Program terminated with signal 7, Bus error.
#0 0x00007f0f323beb55 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f0f323beb55 in raise () from /lib64/libc.so.6
#1 0x00007f0f35f8042e in skgesigOSCrash () from /usr/lib/oracle/11.2/client64/lib/libclntsh.so.11.1
#2 0x00007f0f36222ca9 in kpeDbgSignalHandler () from /usr/lib/oracle/11.2/client64/lib/libclntsh.so.11.1
#3 0x00007f0f35f8063e in skgesig_sigactionHandler () from /usr/lib/oracle/11.2/client64/lib/libclntsh.so.11.1
#4 <signal handler called>
What should I check in my code to avoid this? Or is this something Oracle should fix?
The main reasons you could get a bus error revolve around inaccessible memory. This could have many causes:
Accessing through a deleted pointer.
Accessing through an uninitialized pointer.
Accessing through a NULL pointer.
Accessing the address which is not yours. It could be due to overflow errors.
Try adding the following to the $ORACLE_HOME/network/admin/*.ora file:
DIAG_ADR_ENABLED=OFF
DIAG_SIGHANDLER_ENABLED=FALSE
DIAG_DDE_ENABLED=FALSE
This sounds like an Oracle issue.
And also Oracle's libraries seem to be compiled by Intel compilers.
Yesterday I ran into a problem that cost me 24 hours of frustration. It boiled down to unexpected crashes occurring at random. To complicate things, the debugging reports followed no pattern either. To complicate it even more, all debugging traces led to either random Qt sources or native DLLs, i.e. suggesting every time that the issue was not on my side.
Here are a few examples of such lovely reports:
Program received signal SIGSEGV, Segmentation fault.
0x0000000077864324 in ntdll!RtlAppendStringToString () from C:\Windows\system32\ntdll.dll
(gdb) bt
#0 0x0000000077864324 in ntdll!RtlAppendStringToString () from C:\Windows\system32\ntdll.dll
#1 0x000000002efc0230 in ?? ()
#2 0x0000000002070005 in ?? ()
#3 0x000000002efc0000 in ?? ()
#4 0x000000007787969f in ntdll!RtlIsValidHandle () from C:\Windows\system32\ntdll.dll
#5 0x0000000000000000 in ?? ()
warning: HEAP: Free Heap block 307e5950 modified at 307e59c0 after it was freed
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000778bf0b2 in ntdll!ExpInterlockedPopEntrySListFault16 () from C:\Windows\system32\ntdll.dll
(gdb) bt
#0 0x00000000778bf0b2 in ntdll!ExpInterlockedPopEntrySListFault16 () from C:\Windows\system32\ntdll.dll
#1 0x000000007786fd34 in ntdll!RtlIsValidHandle () from C:\Windows\system32\ntdll.dll
#2 0x0000000077910d20 in ntdll!RtlGetLastNtStatus () from C:\Windows\system32\ntdll.dll
#3 0x00000000307e5950 in ?? ()
#4 0x00000000307e59c0 in ?? ()
#5 0x00000000ffffffff in ?? ()
#6 0x0000000000220f10 in ?? ()
#7 0x0000000077712d60 in WaitForMultipleObjectsEx () from C:\Windows\system32\kernel32.dll
#8 0x0000000000000000 in ?? ()
Program received signal SIGSEGV, Segmentation fault.
0x0000000000a9678a in QBasicAtomicInt::ref (this=0x8) at ../../include/QtCore/../../../qt-src/src/corelib/arch/qatomic_x86_64.h:121
121 : "memory");
(gdb) bt
#0 0x0000000000a9678a in QBasicAtomicInt::ref (this=0x8) at ../../include/QtCore/../../../qt-src/src/corelib/arch/qatomic_x86_64.h:121
#1 0x00000000009df08e in QVariant::QVariant (this=0x21e4d0, p=...) at d:/Distributions/qt-src/src/corelib/kernel/qvariant.cpp:1426
#2 0x0000000000b4dde9 in QList<QVariant>::value (this=0x323bd480, i=1) at ../../include/QtCore/../../../qt-src/src/corelib/tools/qlist.h:666
#3 0x00000000009ccff7 in QObject::property (this=0x3067e900,
name=0xa9d042a <QCDEStyle::drawPrimitive(QStyle::PrimitiveElement, QStyleOption const*, QPainter*, QWidget const*) const::pts5+650> "_q_stylerect")
at d:/Distributions/qt-src/src/corelib/kernel/qobject.cpp:3742
#4 0x0000000000000000 in ?? ()
As you can see this stuff is pretty nasty, it gives one no useful information. But, there was one thing I didn't pay attention to. It was a weird warning during compilation which is also hard to catch with an eye:
In file included from d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/qsharedpointer.h:50:0,
from d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/QSharedPointer:1,
from ../../../../source/libraries/Project/sources/Method.hpp:4,
from ../../../../source/libraries/Project/sources/Slot.hpp:4,
from ../../../../source/libraries/Project/sources/Slot.cpp:1:
d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/qsharedpointer_impl.h: In instantiation of 'static void QtSharedPointer::ExternalRefCount<T>::deref(QtSharedPointer::ExternalRefCount<T>::Data*, T*) [with T = Project::Method::Private; QtSharedPointer::ExternalRefCount<T>::Data = QtSharedPointer::ExternalRefCountData]':
d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/qsharedpointer_impl.h:336:11: required from 'void QtSharedPointer::ExternalRefCount<T>::deref() [with T = Project::Method::Private]'
d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/qsharedpointer_impl.h:401:38: required from 'QtSharedPointer::ExternalRefCount<T>::~ExternalRefCount() [with T = Project::Method::Private]'
d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/qsharedpointer_impl.h:466:7: required from here
d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/qsharedpointer_impl.h:342:21: warning: possible problem detected in invocation of delete operator: [enabled by default]
d:/Libraries/x64/MinGW-w64/4.7.2/Qt/4.8.4/include/QtCore/qsharedpointer_impl.h:337:28: warning: 'value' has incomplete type [enabled by default]
Actually, I turned to this warning only as a last resort, because in my desperate pursuit of the bug the code had already been instrumented with logging to death, literally.
After reading it carefully, I recalled that, for instance, if one uses std::unique_ptr or boost::scoped_ptr for Pimpl, one must provide a destructor, otherwise the code won't even compile. However, I also remembered that std::shared_ptr does not care about the destructor and works fine without it. That was another reason why I didn't pay attention to this strange warning. Long story short, when I added a destructor, the random crashing stopped. It looks like Qt's QSharedPointer has some design flaws compared to std::shared_ptr. I guess it would be better if the Qt developers turned this warning into an error, because debugging marathons like that are simply not worth one's time, effort and nerves.
My questions are:
What's wrong with QSharedPointer? Why is the destructor so vital?
Why did the crashes happen when there was no destructor? These objects (which use Pimpl + QSharedPointer) are created on the stack and no other objects have access to them after their death. However, the crashes happened at some random time after their death.
Has anyone run into issues like that before? Please share your experience.
Are there other pitfalls like that in Qt, ones that I must know about to stay safe in the future?
Hopefully, these questions and my post in general will help others to avoid the hell I've been to for the past 24 hours.
The issue has been worked around in Qt 5, see https://codereview.qt-project.org/#change,26974
The compiler calling the wrong destructor or assuming a different memory layout probably led to some kind of memory corruption. I'd say a compiler should give an error for this issue and not a warning.
You'll run into a similar issue with std::unique_ptr, which can also cause broken destructors if used with an incomplete type. The fix is pretty trivial, of course: I declare a destructor for the class in the header, then define it in the implementation file as
MyClass::~MyClass() = default;
The reason this is an issue for std::unique_ptr but not for std::shared_ptr is that the deleter is part of the type of the former, whereas the latter captures it in its control block at the point where the pointer is constructed.
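For completeness, a minimal single-file sketch of the Pimpl pattern with std::unique_ptr and an out-of-line destructor. The class Widget and its members are hypothetical; in a real project the two halves would be split between a header and an implementation file, as the comments indicate.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- would live in widget.hpp (hypothetical names) ---
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();  // only declared here, because Impl is still incomplete
    const std::string& name() const;

private:
    struct Impl;                  // forward declaration: incomplete type
    std::unique_ptr<Impl> impl_;  // fine as a member while Impl is incomplete
};

// --- would live in widget.cpp ---
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}

// Defined where Impl is complete, so unique_ptr's deleter can call
// the real Impl destructor.
Widget::~Widget() = default;

const std::string& Widget::name() const { return impl_->name; }
```

If the destructor were left implicit, it would be generated inline in the header, where Impl is incomplete. Standard library implementations typically reject that with a static assertion, rather than only warning as in the QSharedPointer build output quoted in the question.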