What does this std::ostream related stack trace mean? - c++

I'm trying to build a C++ library for an integrated/mobile platform. The platform has a decent set of libs, including stdc++. The library I'm trying to build uses ofstream and whenever it attempts to use a class that depends on ofstream, I get a 'bad_cast' exception.
0 0xb082d9b1 in SignalKill ()
from /home/preet/bbndk-2.0.1/target/qnx6/x86/lib/libc.so.3
1 0xb081aa7e in raise ()
from /home/preet/bbndk-2.0.1/target/qnx6/x86/lib/libc.so.3
2 0xb0818cb8 in abort ()
from /home/preet/bbndk-2.0.1/target/qnx6/x86/lib/libc.so.3
3 0xb87c48bf in __gnu_cxx::__verbose_terminate_handler ()
at ../../../../../libstdc++-v3/libsupc++/vterminate.cc:93
4 0xb87c23d6 in __cxxabiv1::__terminate (
handler=0xb87c47c0 <__gnu_cxx::__verbose_terminate_handler()>)
at ../../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
5 0xb87c2421 in std::terminate ()
at ../../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
6 0xb87c2563 in __cxxabiv1::__cxa_throw (obj=0x859e710, tinfo=0xb87f4c24,
dest=0xb87c0670 <std::bad_cast::~bad_cast()>)
at ../../../../../libstdc++-v3/libsupc++/eh_throw.cc:83
7 0xb875e88c in std::__throw_bad_cast ()
at ../../../../../libstdc++-v3/src/functexcept.cc:52
8 0xb8798c0d in __check_facet<std::ctype<char> > (__f=<optimized out>)
at /home/builder/hudson/650-gcc-4.4/svn/linux-x86-o-ntox86/i486-pc-nto-qnx6.5.0/pic/libstdc++-v3/include/bits/basic_ios.h:49
9 widen (__c=<optimized out>, this=<optimized out>)
at /home/builder/hudson/650-gcc-4.4/svn/linux-x86-o-ntox86/i486-pc-nto-qnx6.5.0/pic/libstdc++-v3/include/bits/basic_ios.h:440
10 std::endl<char, std::char_traits<char> > (__os=...)
at /home/builder/hudson/650-gcc-4.4/svn/linux-x86-o-ntox86/i486-pc-nto-qnx6.5.0/pic/libstdc++-v3/include/ostream:539
11 0xb8793c2d in std::ostream::operator<< (this=0x84db220,
__pf=0x804f64c <_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_#plt>)
at /home/builder/hudson/650-gcc-4.4/svn/linux-x86-o-ntox86/i486-pc-nto-qnx6.5.0/pic/libstdc++-v3/include/ostream:113
12 0x0805240d in QDecViewport::QDecViewport (this=0x86da6c0, parent=0x0)
at ../qml_osg_viewport/qdecviewport.cpp:12
13 0x08051cca in QDeclarativePrivate::QDeclarativeElement<QDecViewport>::QDeclarativeElement (this=0x86da6c0)
at /usr/local/Trolltech/QtLighthouse-4.8.2-i386/include/QtDeclarative/qdeclarativeprivate.h:83
14 0x08051d3c in QDeclarativePrivate::createInto<QDecViewport> (
memory=0x86da6c0)
at /usr/local/Trolltech/QtLighthouse-4.8.2-i386/include/QtDeclarative/qdeclarativeprivate.h:91
15 0xb8ad5ec5 in ?? ()
16 0x086da6c0 in ?? ()
17 0x00000000 in ?? ()
Frames 7-11 are relevant and the ones I need help understanding. The line of code frame 12 is referring to is just:
OSG_INFO << "Hello OSG" << std::endl;
Where OSG_INFO is a stream redirector used for logging. I'm able to use std::cout in the same way without any issue. Unmangling frame 11 gives me:
__pf=0x804f64c <std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)#plt>)
Which is still pretty cryptic... I'd understand things going wrong if I were passing something really strange to the ofstream output operator, but it's just text. Does anyone have any suggestions?

std::endl has the following behavior, citing C++11 §27.7.3.8/1:
Calls os.put(os.widen('\n')), then os.flush().
Frame 9 says that endl's call to widen is failing, i.e. that OSG_INFO.widen('\n') is failing. widen, in turn, has the following behavior (§27.5.5.3/12):
Returns: use_facet< ctype<char_type> >(getloc()).widen(c)
use_facet itself will throw bad_cast if the facet is not present in the imbued locale (§22.3.2/3), but your stack trace doesn't indicate that this is the case. (Then again, I haven't dug through the libstdc++ internals to verify that it's doing things by the book...)
I assume that __check_facet is called before use_facet (or use_facet was inlined and disappeared from the stack trace), with the same net effect; this implies that OSG_INFO has been imbued with some locale that does not have the std::ctype<char> facet present – bad times!
Alternatively, it may have been imbued with some locale with a facet present that simply doesn't handle widen('\n') gracefully. But there's no way to know for sure, and nothing else we can tell you, without knowing what OSG_INFO is and/or how it's implemented.

Related

How does a unsigned char vector deallocation crash a program with a segfault...? [closed]

I'm literally deallocating a vector of unsigned chars during just normal object deallocation, and it crashes with a segfault at the vector_base deallocation free():
[Switching to Thread 17648.0x3528]
0x00007ff9ba0a9606 in ntdll!RtlAllocateHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
(gdb) back
#0 0x00007ff9ba0a9606 in ntdll!RtlAllocateHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
#1 0x00007ff9ba0a5d21 in ntdll!RtlFreeHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2 0x00007ff9b9839c9c in msvcrt!free () from C:\WINDOWS\System32\msvcrt.dll
#3 0x00000000004bc540 in __gnu_cxx::new_allocator<unsigned char>::deallocate(unsigned char*, unsigned long long) ()
#4 0x00000000004ea87b in std::allocator_traits<std::allocator<unsigned char> >::deallocate(std::allocator<unsigned char>&, unsigned char*, unsigned long long) ()
#5 0x00000000004df392 in std::_Vector_base<unsigned char, std::allocator<unsigned char> >::_M_deallocate(unsigned char*, unsigned long long) ()
#6 0x00000000004df436 in std::_Vector_base<unsigned char, std::allocator<unsigned char> >::~_Vector_base() ()
#7 0x000000000050110d in std::vector<unsigned char, std::allocator<unsigned char> >::~vector() ()
#8 0x0000000000420dd4 in Text::~Text() ()
#9 0x000000000041c9f7 in Scene::clearOnScreenText() ()
#10 0x0000000000410a52 in Application::NextScene() ()
#11 0x0000000000412a41 in Application::update() ()
#12 0x00000000004119a8 in Application::Run()::{lambda()#2}::operator()() const ()
#13 0x000000000041501a in std::_Function_handler<void (), Application::Run()::{lambda()#2}>::_M_invoke(std::_Any_data const&) ()
#14 0x00000000004cad92 in std::function<void ()>::operator()() const ()
#15 0x000000000051fba9 in std::intervalThread(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::function<void ()>, std::function<void ()>, std::function<void ()>, std::function<bool ()>, long long, std::vector<long long, std::allocator<long long> >*, bool)::{lambda()#1}::operator()() const ()
Just three questions:
How is this even possible?
What did I monumentally do wrong...?
Most importantly, how does one fix this?
Side note:
I have had nothing but problems with deallocating memory recently (specifically in this program), so could there be something wrong with MinGW, or is GDB possibly not reading the stack correctly? All debugging symbols are off, and optimization is at 0.
How is this even possible?
With undefined behavior, anything is possible. :) More helpfully, what's probably happening is that your heap has been corrupted (e.g. by a bad memory write somewhere), and the new_allocator<unsigned char>::deallocate() method tried to dereference a bad pointer in the heap's metadata, which caused the crash... but the damage had been silently done sometime earlier in your program's execution.
Another possibility is that clearOnScreenText() tried to call delete on an invalid (but non-NULL) Text * pointer, and so when Text::~Text() tries to run the destructor of the std::vector<unsigned char> member variable, it's trying to destroy a "vector object" that is really just arbitrary bytes that are not a valid state for a vector, with catastrophic consequences.
Most importantly, how does one fix this?
If you can run your code on Linux, valgrind is a valuable tool in situations like this. Under Windows, there are similar tools (I think one is called Electric Fence, but I forget what else there is out there). Short of that, you might have to just start playing "twenty questions" with the code, by commenting out various parts of the program until the crash goes away, then adding them back in until the crash comes back, and repeating until you have a better understanding of which parts of the code are required to execute in order to reproduce the crash. Once you've figured out what code to look at, you can start trying to figure out what is wrong in the suspect code. Very tedious, but sometimes that is the only way.
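If the second possibility (a delete on a stale or already-deleted Text * pointer) turns out to be the cause, one way to rule that class of bug out is to let a container of std::unique_ptr own the objects. This is only a sketch under assumptions: the Text and Scene definitions below are hypothetical stand-ins, since the question does not show the real classes, and this does nothing for heap corruption caused by a bad write elsewhere.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's Text class: it owns a byte buffer.
struct Text {
    std::vector<unsigned char> pixels;
    explicit Text(std::size_t byteCount) : pixels(byteCount) {}
};

// Hypothetical stand-in for Scene. Each Text is owned by exactly one
// unique_ptr, so it is destroyed exactly once and never through a raw
// pointer that might be dangling.
class Scene {
public:
    void addText(std::size_t byteCount) {
        texts_.push_back(std::make_unique<Text>(byteCount));
    }

    // Destroys every Text; calling this again on an empty scene is harmless.
    void clearOnScreenText() { texts_.clear(); }

    std::size_t textCount() const { return texts_.size(); }

private:
    std::vector<std::unique_ptr<Text>> texts_;
};
```

Note that the trace shows clearOnScreenText() being reached from a worker thread, so if another thread touches the same container at the same time, access would still need to be synchronized.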

Why does gmp crash with "invalid next size" to realloc here?

I have a simple function using the gmp C++ bindings:
#include <inttypes.h>
#include <memory>
#include <gmpxx.h>
mpz_class f(uint64_t n){
std::unique_ptr<mpz_class[]> m = std::make_unique<mpz_class[]>(n + 1);
m[0] = 0;
m[1] = 1;
for(uint64_t i = 2; i <= n; ++i){
m[i] = m[i-1] + m[i-2];
}
return m[n];
}
int main(){
mpz_class fn;
for(uint64_t n = 0;; n += 1){
fn = f(n);
}
}
Presumably make_unique should allocate a fresh array and free it when the function returns since the unique pointer owning it has its lifetime end. Presumably the mpz_class object returned should be a copy and not affected by this array getting deleted. The program crashes with the error:
realloc(): invalid next size
and if I look at the core dump in gdb I get the stack trace:
#0 raise()
#1 abort()
#2 __libc_message()
#3 malloc_printerr()
#4 _int_realloc()
#5 realloc()
#6 __gmp_default_reallocate()
#7 __gmpz_realloc()
#8 __gmpz_add()
#9 __gmp_binary_plus::eval(v, w, z)
#10 __gmp_expr<...>::eval(this, this, p)
#11 __gmp_set_expr<...>(expr, z)
#12 __gmp_expr<...>::operator=<...>(expr, this)
#13 f(n)
#14 main(argc, argv)
This isn't helpful to me, except that it suggests the problem may come from gmpxx using expression templates (stack frames 9-12 indicate this; valgrind and stack frame 12 put the last line of my code executed before the error at m[1] = 1;). Valgrind says there is an invalid read of size 8 at this line, but lists stack entries corresponding to the rest of the trace after it, and then says there is an invalid write at the next instruction. The invalid read is 8 bytes after "a block of size 24 alloc'd [by make_unique]", while the invalid write is to null. Obviously this line should cause neither, as it should only read a pointer and then write to part of the buffer it points to, which definitely does not have address 0x0. I decided to use the C++ bindings, even though I always use gmp from C, because I thought it would be faster to write, but this error ensured that was not the case. Is this a problem with gmp, or am I allocating the array wrong? I get similar errors if I use new and delete directly or if I manually inline the function call. I feel like the problem may have to do with mpz_class actually storing an expression template and not a proper concretized value.
I'm using GCC 9.2.0 with g++ -std=c++17 -O2 -g -Wall ... and GMP 6.1.2-3.
Neither Clang nor GCC report any errors.
If we run under Valgrind, we see:
==1948514== Invalid read of size 8
==1948514== at 0x489B0F0: __gmpz_set_si (in /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2)
==1948514== by 0x10945E: __gmp_expr<__mpz_struct [1], __mpz_struct [1]>::assign_si(long) (gmpxx.h:1453)
==1948514== by 0x1094E3: __gmp_expr<__mpz_struct [1], __mpz_struct [1]>::operator=(int) (gmpxx.h:1538)
==1948514== by 0x109248: f(unsigned long) (59678712.cpp:8)
==1948514== by 0x109351: main (59678712.cpp:18)
==1948514== Address 0x4e08ca0 is 8 bytes after a block of size 24 alloc'd
==1948514== at 0x483650F: operator new[](unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1948514== by 0x10953F: std::_MakeUniq<__gmp_expr<__mpz_struct [1], __mpz_struct [1]> []>::__array std::make_unique<__gmp_expr<__mpz_struct [1], __mpz_struct [1]> []>(unsigned long) (unique_ptr.h:855)
==1948514== by 0x10920C: f(unsigned long) (59678712.cpp:6)
==1948514== by 0x109351: main (59678712.cpp:18)
This demonstrates that when we call f(0), we write to m[1], which is out of bounds. That's undefined behaviour, so anything could happen. Luckily you got a crash, rather than something more subtle.
Simple fix:
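mpz_class f(uint64_t n) {
if (!n) return 0; // for n == 0 the array has one element, so m[1] would be out of bounds
// ... rest of the function unchanged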
BTW, prefer <cstdint> to <inttypes.h>, and write as std::uint64_t etc.
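For reference, here is how the guarded function might look in full. To keep the sketch compilable without GMP installed, it substitutes std::uint64_t for mpz_class (so results overflow past the 93rd Fibonacci number), and the name fib_sketch is made up; the guard works the same way with mpz_class.

```cpp
#include <cstdint>
#include <memory>

// Sketch of the guarded function, with std::uint64_t standing in for
// mpz_class so that it builds without GMP.
std::uint64_t fib_sketch(std::uint64_t n) {
    // For n == 0 the array below would have exactly one element, so the
    // write to m[1] would be out of bounds. Return early instead.
    if (n == 0) return 0;

    auto m = std::make_unique<std::uint64_t[]>(n + 1);
    m[0] = 0;
    m[1] = 1;
    for (std::uint64_t i = 2; i <= n; ++i) {
        m[i] = m[i - 1] + m[i - 2];
    }
    return m[n];
}
```

Only the last two values are ever read, so keeping two running variables instead of the whole array would avoid both the allocation and the edge case.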

Can I get valgrind to tell me _which_ value is uninitialized?

I ran valgrind on some code as follows:
valgrind --tool=memcheck --leak-check=full --track-origins=yes ./test
It returns the following error:
==24860== Conditional jump or move depends on uninitialised value(s)
==24860== at 0x4081AF: GG::fl(M const&, M const&) const (po.cpp:71)
==24860== by 0x405CDB: MO::fle(M const&, M const&) const (m.cpp:708)
==24860== by 0x404310: M::operator>=(M const&) const (m.cpp:384)
==24860== by 0x404336: M::operator<(M const&) const (m.cpp:386)
==24860== by 0x4021FD: main (test.cpp:62)
==24860== Uninitialised value was created by a heap allocation
==24860== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==24860== by 0x40653F: GODA<unsigned int>::allocate_new_block() (goda.hpp:82)
==24860== by 0x406182: GODA<unsigned int>::GODA(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) (goda.hpp:103)
==24860== by 0x402A0E: M::init(unsigned long) (m.cpp:63)
==24860== by 0x403831: M::M(std::initializer_list<unsigned int>, MO const*) (m.cpp:248)
==24860== by 0x401B56: main (test.cpp:31)
So line 71 has an error. OK, great. Here are the lines leading up to line 71 of po.cpp (line 71 is last):
DEG_TYPE dtk = t.ord_deg();
DEG_TYPE duk = u.ord_deg();
bool searching = dtk == duk;
NVAR_TYPE n = t.nv();
NVAR_TYPE k = 0;
for (/* */; searching and k < n; ++k) { // this is line 71
OK, so which value of line 71 is uninitialized?
certainly not k;
I manually checked (= "stepping through gdb") that t's constructor initializes the value that is returned by t.nv(), so certainly not n (in fact n is set to 6, the correct value);
searching is determined by dtk and duk, but I also manually checked that t's and u's constructors initialize the values that are returned by .ord_deg() (in fact both dtk and duk are set to 3, the correct value).
I'm at a complete loss here. Is there some option that will tell valgrind to report which precise value it thinks is uninitialized?
Update
In answer to one question, here is line 61 of test.cpp:
M s { 1, 0, 5, 2, 0 };
So it constructs using an initializer list. Here's that constructor:
M::M(
initializer_list<EXP_TYPE> p, const MO * ord
) {
common_init(ord);
init_e(p.size());
NVAR_TYPE i = 0;
last = 0;
for (
auto pi = p.begin();
pi != p.end();
++pi
) {
if (*pi != 0) {
e[last] = i;
e[last + 1] = *pi;
last += 2;
}
++i;
}
ord->set_data(*this);
}
Here's the data in the class, adding comments showing where it's initialized:
NVAR_TYPE n; // init_e()
EXP_TYPE * e; // common_init()
NVAR_TYPE last; // common_init()
DEG_TYPE od; // common_init(), revised in ord->set_data()
const MO * o; // common_init()
MOD * o_data; // common_init(), revised in ord->set_data()
Is there some option that will tell valgrind to report which precise
value it thinks is uninitialized?
The best you can do is to use --track-origins=yes (you are already using this option). Valgrind will tell you only the approximate location of uninitialised values (the origin, in Valgrind's terms), but not the exact variable name. See the Valgrind manual for --track-origins:
When set to yes, Memcheck keeps track of the origins of all
uninitialised values. Then, when an uninitialised value error is
reported, Memcheck will try to show the origin of the value. An origin
can be one of the following four places: a heap block, a stack
allocation, a client request, or miscellaneous other sources (eg, a
call to brk).
For uninitialised values originating from a heap block, Memcheck shows
where the block was allocated. For uninitialised values originating
from a stack allocation, Memcheck can tell you which function
allocated the value, but no more than that -- typically it shows you
the source location of the opening brace of the function. So you
should carefully check that all of the function's local variables are
initialised properly.
You can use gdb+vgdb+valgrind to debug your program under valgrind.
Then, when valgrind stops on the error reported above, you can examine the
definedness of the variables you are interested in using the monitor request
'xb' or 'get_vbits' by asking the address of the variable, and then examining
the vbits for the size of the variable.
For example:
p &searching
=> 0xabcdef
monitor xb 0xabcdef 1
=> will show you the value of searching and the related vbits.
For more details, see 'Debugging your program using Valgrind gdbserver and GDB' http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver
and 'Memcheck Monitor Commands' http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.monitor-commands
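A related option that stays inside the program is a Memcheck client request: the macro VALGRIND_CHECK_VALUE_IS_DEFINED from valgrind/memcheck.h makes Memcheck report an error right at the call site if the given lvalue is not fully defined, which narrows the search to one variable at a time. The sketch below is an illustration only: the function degrees_match is hypothetical, unsigned is assumed for DEG_TYPE, and the fallback macro exists purely so the code still compiles where the Valgrind headers are not installed.

```cpp
// Use the real client-request macro when the Valgrind headers are available.
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif

// Fallback (an assumption for portability): compile the check away when
// the header is missing, so the code still builds.
#ifndef VALGRIND_CHECK_VALUE_IS_DEFINED
#  define VALGRIND_CHECK_VALUE_IS_DEFINED(lvalue) (0)
#endif

// Hypothetical helper mirroring `bool searching = dtk == duk;`.
// Under Memcheck, each check reports an error at this exact line if the
// corresponding value is undefined; outside Valgrind the checks do nothing.
bool degrees_match(unsigned dtk, unsigned duk) {
    (void)VALGRIND_CHECK_VALUE_IS_DEFINED(dtk);
    (void)VALGRIND_CHECK_VALUE_IS_DEFINED(duk);
    return dtk == duk;
}
```

Placing one such check per variable just before line 71 (for searching and n as well) would show which of them Memcheck considers undefined, without needing a gdb session.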
From the Valgrind documentation,
4.2.2. Use of uninitialised values
...
Sources of uninitialised data tend to be:
- Local variables in procedures which have not been initialised, as in the example above.
- The contents of heap blocks (allocated with malloc, new, or a similar function) before you (or a constructor) write something there.
You have:
==24860== Uninitialised value was created by a heap allocation
==24860== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==24860== by 0x40653F: GODA<unsigned int>::allocate_new_block() (goda.hpp:82)
So it is very likely that the malloc used by GODA<unsigned int>::allocate_new_block() is causing this error.
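If the block really is read before every element has been written, one way to confirm it is to hand out zeroed memory temporarily and see whether the report disappears. This is a sketch under assumptions: allocate_zeroed_block is a hypothetical helper and not GODA's actual allocate_new_block(), whose code is not shown in the question.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical replacement for a raw malloc() of `count` unsigned ints.
// std::make_unique<T[]>(n) value-initializes the array, so every element
// starts at 0 instead of holding indeterminate bytes.
std::unique_ptr<unsigned int[]> allocate_zeroed_block(std::size_t count) {
    return std::make_unique<unsigned int[]>(count);
}
```

Zeroing silences Memcheck rather than proving the logic correct, so if the error goes away, the next step is still to find which element was read without being set.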
You could use clang-tidy as an alternative to find uninitialized variables.
QtCreator 4.7 has full integration of clang-tidy, select "Clang-Tidy and Clazy" in the debug pane, press run, select the file(s) you want to test.
You need to understand how memcheck works. In order to avoid generating excessive errors, uninitialized values don't get flagged until they have a possible impact on your code. The uninitialized information gets propagated by assignments.
// if ord_deg returns something that is uninitialized, dtk and/or duk will be
// flagged internally as uninitialized but no error issued
DEG_TYPE dtk = t.ord_deg();
DEG_TYPE duk = u.ord_deg();
// again transitively if either dtk or duk is flagged as uninitialized then
// searching will be flagged as uninitialized, and again no error issued
bool searching = dtk == duk;
// if nv() returns something that is uninitialized, n will be
// flagged internally as uninitialized
NVAR_TYPE n = t.nv();
// k is flagged as initialized
NVAR_TYPE k = 0;
// OK now the values of searching and n affect your code flow
// if either is uninitialized then memcheck will issue an error
for (/* */; searching and k < n; ++k) { // this is line 71
You are looking at the wrong stack trace.
Valgrind tells you that uninitialized value was created by a heap allocation:
==24860== Uninitialised value was created by a heap allocation
==24860== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==24860== by 0x40653F: GODA<unsigned int>::allocate_new_block() (goda.hpp:82)
==24860== by 0x406182: GODA<unsigned int>::GODA(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) (goda.hpp:103)
==24860== by 0x402A0E: M::init(unsigned long) (m.cpp:63)
==24860== by 0x403831: M::M(std::initializer_list<unsigned int>, MO const*) (m.cpp:248)
==24860== by 0x401B56: main (test.cpp:31)
You can omit a few top stack frames from third party library code, because it is less likely that the error is in third party code. You should look more closely at this stack frame, which appears to be your code:
==24860== by 0x402A0E: M::init(unsigned long) (m.cpp:63)
Most likely, the uninitialized variable comes from the code at line 63 of m.cpp.

Segfault from Opencv Mat::create

I am getting a segmentation fault from the following call to Mat::create:
void PoissonBlend::blend(Mat& src, Mat& dst, Mat& mask, Mat& out){
Mat outer(mask.rows, mask.cols, CV_8U);
When I run my program in gdb I can see that both rows and cols are valid, and I have tried several different data types, but no matter what I get a Segfault on this line.
My program defines several other Mats in main(), before the call to blend, and all of them work perfectly fine. Has anyone else ever run into this before? This error is driving me crazy. I can't find any difference between this call to create and any of the others in my program, yet this one fails every time.
My gdb output is:
Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=0x7ffff6fbe740 <main_arena>, bytes=307228) at malloc.c:3879
#0 _int_malloc (av=0x7ffff6fbe740 <main_arena>, bytes=307228) at malloc.c:3879
#1 0x00007ffff6c88fc5 in __GI___libc_malloc (bytes=307228) at malloc.c:2924
#2 0x00007ffff791594d in cv::fastMalloc(unsigned long) () from /usr/lib/libopencv_core.so.2.3
#3 0x00007ffff78884bc in cv::Mat::create(int, int const*, int) () from /usr/lib/libopencv_core.so.2.3
#4 0x00000000004243da in cv::Mat::create (this=0x7fffffffdab0, _rows=480, _cols=640, _type=0) at /usr/include/opencv2/core/mat.hpp:368
#5 0x0000000000427608 in cv::Mat::Mat (this=0x7fffffffdab0, _rows=480, _cols=640, _type=0) at /usr/include/opencv2/core/mat.hpp:68
#6 0x00000000004255a7 in PoissonBlend::blend (this=0x7fffffffdd13, src=..., dst=..., mask=..., out=...)
at /home/adam/WorkingCode/rasc/trunk/src/Poisson.cpp:95
#7 0x0000000000423eb2 in main () at /home/adam/WorkingCode/rasc/trunk/src/PoissonTest.cpp:45
Since the crash is within malloc.c, I suspect that you probably have memory corruption. Try running the program under Valgrind to detect this.

What does the second column in the gdb stack trace mean?

I have a stack trace that looks like this:
#3 0x00007fffde86c206 in GetMedia (p_ml=0xb91560, id=<value optimized out>, select=ML_MEDIA, reload=<value optimized out>) at ../../../modules/media_library/sql_media_library.c:1170
#4 0x00007fffde86a7d0 in GetInputItemFromMedia (p_ml=0xb91560, i_media=12276000) at ../../../modules/media_library/sql_media_library.c:1204
#5 0x00007ffff6765eab in ml_CreateInputItem (this=0x7784f0) at ../../../../include/vlc_media_library.h:887
#6 MLModel::popupInfo (this=0x7784f0) at ../../../../modules/gui/qt4/components/playlist/media_library/ml_model.cpp:528
#7 0x00007ffff67a7204 in MLModel::qt_metacall (this=0x7784f0, _c=<value optimized out>, _id=17710, _a=<value optimized out>) at components/playlist/media_library/ml_model.moc.cpp:79
#8 0x00007ffff4ec8e3f in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) () from /usr/lib/libQtCore.so.4
I'm wondering what the second column signifies. Also, what does the lack of it signify? As can be seen, frame #6 does not have this address, and I believe my problem (a segfault) is caused by something related.
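That column contains the code address for that frame: the return address in the function on that line, i.e. where execution resumes once the function called in the frame just above returns. Its lack probably means that the function was inlined into its caller, so it has no separate return address of its own.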