Building QEMU with TSan completed without any problems,
but it hits a FATAL error during startup:
ThreadSanitizer: can't find longjmp buf
FATAL: ThreadSanitizer CHECK failed: ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:544 "((0)) != (0)" (0x0, 0x0)
#0 <null> <null> (libtsan.so.0+0x891b4)
#1 <null> <null> (libtsan.so.0+0xa74ae)
#2 <null> <null> (libtsan.so.0+0x2b0b2)
#3 siglongjmp <null> (libtsan.so.0+0x2cb64)
#4 qemu_coroutine_switch util/coroutine-ucontext.c:221 (qemu-system-x86_64+0xc0bcbd)
#5 qemu_aio_coroutine_enter util/qemu-coroutine.c:147 (qemu-system-x86_64+0xc089c7)
#6 qemu_coroutine_enter util/qemu-coroutine.c:170 (qemu-system-x86_64+0xc08b5a)
...
#23 main /root/qemu-4.2.0/vl.c:4436 (qemu-system-x86_64+0x5c92ec)
#24 __libc_start_main <null> (libc.so.6+0x271e2)
#25 _start <null> (qemu-system-x86_64+0x2c6b9d)
It looks like a known TSan issue. --with-coroutine=gthread seems to have worked in the past,
but it was removed (https://patchwork.kernel.org/patch/9704545/).
I tried both the ucontext and sigaltstack backends, but both failed.
My question is: does TSan still work with the current version (QEMU 4.2.0)?
There is work in progress to add TSan support to QEMU, and it is not in any release yet.
Our work in progress branch is here: https://github.com/rf972/qemu/tree/tsan_v0.
It is worth mentioning that our WIP branch contains several important patches that we picked up from Emilio Cota in this branch.
This patch seems related to your issue. It modifies the same area you referenced above and adds support for TSan fiber annotations to coroutine-ucontext.
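As a rough illustration of what fiber annotations look like, here is a minimal sketch. It is not the QEMU patch: the names Coroutine, coroutine_enter, coroutine_yield and run_coroutine_demo are hypothetical, and it switches stacks with swapcontext() instead of QEMU's sigsetjmp/siglongjmp. The __tsan_* calls are the fiber API declared in <sanitizer/tsan_interface.h>; when the file is built without TSan they are replaced by no-ops, so the sketch also compiles and runs in a normal build.

```cpp
#include <ucontext.h>
#include <cassert>
#include <vector>

// Detect a TSan build (GCC defines __SANITIZE_THREAD__, clang uses __has_feature).
#if defined(__SANITIZE_THREAD__)
#define DEMO_TSAN 1
#elif defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define DEMO_TSAN 1
#endif
#endif

#ifdef DEMO_TSAN
#include <sanitizer/tsan_interface.h>
static void *fiber_create() { return __tsan_create_fiber(0); }
static void *fiber_current() { return __tsan_get_current_fiber(); }
static void fiber_switch(void *f) { __tsan_switch_to_fiber(f, 0); }
static void fiber_destroy(void *f) { __tsan_destroy_fiber(f); }
#else
// No-op fallbacks so the sketch builds without TSan.
static void *fiber_create() { return nullptr; }
static void *fiber_current() { return nullptr; }
static void fiber_switch(void *) {}
static void fiber_destroy(void *) {}
#endif

// Hypothetical coroutine type, for illustration only.
struct Coroutine {
    ucontext_t ctx{};
    ucontext_t caller{};
    std::vector<char> stack;
    void *tsan_fiber = nullptr;   // TSan's view of the coroutine stack
    void *tsan_caller = nullptr;  // TSan's view of whoever entered us
    void (*fn)(Coroutine *) = nullptr;
    bool done = false;
    int value = 0;
};

static thread_local Coroutine *g_starting = nullptr;

// Tell TSan about the switch immediately before changing stacks.
static void coroutine_yield(Coroutine *co) {
    fiber_switch(co->tsan_caller);
    swapcontext(&co->ctx, &co->caller);
}

static void coroutine_enter(Coroutine *co) {
    assert(!co->done);
    co->tsan_caller = fiber_current();
    g_starting = co;  // only read on the first entry
    fiber_switch(co->tsan_fiber);
    swapcontext(&co->caller, &co->ctx);
}

static void trampoline() {
    Coroutine *co = g_starting;
    co->fn(co);
    co->done = true;
    coroutine_yield(co);  // final switch back; never resumed afterwards
}

int run_coroutine_demo() {
    Coroutine co;
    co.stack.resize(64 * 1024);
    getcontext(&co.ctx);
    co.ctx.uc_stack.ss_sp = co.stack.data();
    co.ctx.uc_stack.ss_size = co.stack.size();
    co.ctx.uc_link = nullptr;
    co.fn = [](Coroutine *c) { c->value = 1; coroutine_yield(c); c->value = 2; };
    co.tsan_fiber = fiber_create();
    makecontext(&co.ctx, trampoline, 0);

    coroutine_enter(&co);   // runs until the first yield
    int first = co.value;
    coroutine_enter(&co);   // runs to completion
    fiber_destroy(co.tsan_fiber);
    return first * 10 + co.value;
}
```

Calling run_coroutine_demo() enters the coroutine twice. The point of the sketch is only the placement of the annotations: each stack switch is preceded by a call that tells TSan which fiber is about to run, so the runtime does not have to guess from the jump buffer.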
Related
I'm new to custom hardware design, and I'm about to scale up production of my custom hardware, which is working well on a few boards. I need some help deciding how to proceed with the prototypes and how to scale up given their current state.
This hardware is based on the i.MX6Q processor and MT41K256M16TW-107 IT:P memory. It is most similar to the nitrogen6_max development board.
I'm having trouble with my hardware that is really difficult to figure out, because some boards work really well and some do not (from 7 production units, 4 boards are functioning really well, and one board gets segmentation faults and kernel panics while running a Linux application). When I run memory calibration on the bad boards, the results look the same as those of the good boards.
The segmentation fault points to some memory issue. I got a backtrace from the core dump using GDB on Linux:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 gcoHARDWARE_QuerySamplerBase (Hardware=0x22193dc, Hardware@entry=0x0,
VertexCount=0x7ef95370, VertexCount@entry=0x7ef95368, VertexBase=0x40000,
FragmentCount=FragmentCount@entry=0x2217814, FragmentBase=0x0) at
gc_hal_user_hardware_query.c:6020
6020 gc_hal_user_hardware_query.c: No such file or directory.
[Current thread is 1 (Thread 0x76feb010 (LWP 697))]
(gdb) bt
#0 gcoHARDWARE_QuerySamplerBase (Hardware=0x22193dc, Hardware@entry=0x0,
VertexCount=0x7ef95370, VertexCount@entry=0x7ef95368, VertexBase=0x40000,
FragmentCount=FragmentCount@entry=0x2217814, FragmentBase=0x0) at
gc_hal_user_hardware_query.c:6020
#1 0x765d20e8 in gcoHAL_QuerySamplerBase (Hal=<optimized out>,
VertexCount=VertexCount@entry=0x7ef95368, VertexBase=<optimized out>,
FragmentCount=FragmentCount@entry=0x2217814,
FragmentBase=0x0) at gc_hal_user_query.c:692
#2 0x681e31ec in gcChipRecompileEvaluateKeyStates (chipCtx=0x0,
gc=0x7ef95380) at src/chip/gc_chip_state.c:2115
#3 gcChipValidateRecompileState (gc=0x7ef95380, gc@entry=0x21bd96c,
chipCtx=0x0, chipCtx@entry=0x2217814) at src/chip/gc_chip_state.c:2634
#4 0x681c6da8 in __glChipDrawValidateState (gc=0x21bd96c) at
src/chip/gc_chip_draw.c:5217
#5 0x68195688 in __glDrawValidateState (gc=0x21bd96c) at
src/glcore/gc_es_draw.c:585
#6 __glDrawPrimitive (gc=0x21bd96c, mode=<optimized out>) at
src/glcore/gc_es_draw.c:943
#7 0x68171048 in glDrawArrays (mode=4, first=6, count=6) at
src/glcore/gc_es_api.c:399
#8 0x76c9ac72 in CEGUI::OpenGL3GeometryBuffer::draw() const () from
/usr/lib/libCEGUIOpenGLRenderer-0.so.2
#9 0x76dd1aee in CEGUI::RenderQueue::draw() const () from
/usr/lib/libCEGUIBase-0.so.2
#10 0x76e317d8 in CEGUI::RenderingSurface::draw(CEGUI::RenderQueue const&,
CEGUI::RenderQueueEventArgs&) () from /usr/lib/libCEGUIBase-0.so.2
#11 0x76e31838 in CEGUI::RenderingSurface::drawContent() () from
/usr/lib/libCEGUIBase-0.so.2
#12 0x76e36d30 in CEGUI::GUIContext::drawContent() () from
/usr/lib/libCEGUIBase-0.so.2
#13 0x76e31710 in CEGUI::RenderingSurface::draw() () from
/usr/lib/libCEGUIBase-0.so.2
#14 0x001bf79c in tengri::gui::cegui::System::Impl::draw (this=0x2374f08) at
codebase/src/gui/cegui/system.cpp:107
#15 tengri::gui::cegui::System::draw (this=this@entry=0x2374e74) at
codebase/src/gui/cegui/system.cpp:212
#16 0x000b151e in falcon::osd::view::MainWindowBase::Impl::preNativeUpdate
(this=0x2374e10) at codebase/src/osd/view/MainWindow.cpp:51
#17 falcon::osd::view::MainWindowBase::preNativeUpdate
(this=this@entry=0x209fe30) at codebase/src/osd/view/MainWindow.cpp:91
#18 0x000c4686 in falcon::osd::view::FBMainWindow::update (this=0x209fe00)
at
codebase/include/falcon/osd/view/FBMainWindow.h:56
#19 falcon::osd::view::App::Impl::execute (this=0x209fdb0) at
codebase/src/osd/view/app_view_osd_falcon.cpp:139
#20 falcon::osd::view::App::execute (this=<optimized out>) at
codebase/src/osd/view/app_view_osd_falcon.cpp:176
#21 0x000475f6 in falcon::osd::App::execute (this=this@entry=0x7ef95c84) at
codebase/src/osd/app_osd_falcon.cpp:75
#22 0x00047598 in main () at codebase/src/main.cpp:5
(gdb) Quit
Here I have attached the NXP tool calibration results for 2 good boards and 1 bad board (the one getting segmentation faults). Click on the following links.
Board 1
Board 2
Board 3
I ran an overnight stress test using stressapptest, but it did not report any faults and the test passed.
Of the 3 boards above, Board 1 and Board 2 work really well, while Board 3 gets kernel panics when running the same application. Can you help me find any clue in the results from these 3 boards?
I did a production run of 50 units 6 months ago and only 30 worked properly, but that was with Alliance memory AS4C256M16D3A-12BCN. So could this be a design issue? If it is an issue with the DDR layout or the whole design, why do some boards work really well?
Could this be a manufacturing issue? If so, how could it happen within the same production run, where some boards work and some do not?
Does stressapptest stress the power supply as well? Do you know of any Linux app that can stress power as well?
I don't have much experience with mass production, but I would like to move forward after learning from and correcting these issues. I would be very thankful for a prompt reply.
I have a single threaded program that crashes consistently at certain points right after free() is called when running in non-debug mode.
In debug mode, however, the debugger breaks on the line that calls free() even though no breakpoints are set. When I try to step to the next line, the debugger breaks again on the same line. Stepping once more resumes execution as normal. No crash, no segfault, nothing.
EDIT-1: Contrary to what I wrote above, crashes in non-debug mode
turns out to be inconsistent, which makes me think I am somehow
writing somewhere that I shouldn't. (Breaks in debug mode are
still consistent, though.)
The call stack at the breaks shows some Windows library functions (I think) called after the function that contains the free() statement. I have no idea how to interpret them, and consequently I have no idea how to go about debugging in this situation.
I have provided the call stacks at the break points below. Can someone point me in a direction where I can tackle the problem? What might be causing the breaks in debug mode?
The program is run on Windows Vista, compiled with gcc 4.9.2, and the debugger used is gdb. Assume a double release is not the case. (I use ::operator new and ::operator delete overloads that catch that. The situation described is the same without these overloads as well.)
Note that the involuntary breaks in the debugger are consistent: they happen every time, at the same execution point.
Here is the call stack at the initial break:
(Note that free_wrapper() is the function that houses free() statement that causes the crash/breaks.)
#0 0x770186ff ntdll!DbgBreakPoint() (C:\Windows\system32\ntdll.dll:??)
#1 0x77082edb ntdll!RtlpNtMakeTemporaryKey() (C:\Windows\system32\ntdll.dll:??)
#2 0x7706b953 ntdll!RtlImageRvaToVa() (C:\Windows\system32\ntdll.dll:??)
#3 0x77052c4f ntdll!RtlQueryRegistryValues() (C:\Windows\system32\ntdll.dll:??)
#4 0x77083f3b ntdll!RtlpNtMakeTemporaryKey() (C:\Windows\system32\ntdll.dll:??)
#5 0x7704bcfd ntdll!EtwSendNotification() (C:\Windows\system32\ntdll.dll:??)
#6 0x770374d5 ntdll!RtlEnumerateGenericTableWithoutSplaying() (C:\Windows\system32\ntdll.dll:??)
#7 0x75829dc6 KERNEL32!HeapFree() (C:\Windows\system32\kernel32.dll:??)
#8 0x75a99c03 msvcrt!free() (C:\Windows\system32\msvcrt.dll:??)
#9 0x350000 ?? () (??:??)
--> #10 0x534020 free_wrapper(pv=0x352af0) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\Unrelated\MemMgmt.cpp:282)
#11 0x407f74 operator delete(pv=0x352af0) (C:\dm\bin\codes\CodeBlocks\ProjTemp\main.cpp:1002)
#12 0x629a74 __gnu_cxx::new_allocator<char>::deallocate(this=0x22f718, __p=0x352af0 "\nÿÿÿÿÿÿº\r%") (C:/Program Files/CodeBlocks/MinGW/lib/gcc/mingw32/4.9.2/include/c++/ext/new_allocator.h:110)
#13 0x6c2257 std::allocator_traits<std::allocator<char> >::deallocate(__a=..., __p=0x352af0 "\nÿÿÿÿÿÿº\r%", __n=50) (C:/Program Files/CodeBlocks/MinGW/lib/gcc/mingw32/4.9.2/include/c++/bits/alloc_traits.h:383)
#14 0x611940 basic_CDataUnit<std::allocator<char> >::~basic_CDataUnit(this=0x22f714, __vtt_parm=0x781df4 <VTT for basic_CDataUnit_TDB<std::allocator<char> >+4>, __in_chrg=<optimized out>) (include/DataUnit/CDataUnit.h:112)
#15 0x61dfa1 basic_CDataUnit_TDB<std::allocator<char> >::~basic_CDataUnit_TDB(this=0x22f714, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) (include/DataUnit/CDataUnit_TDB.h:125)
#16 0x503898 CTblSegHandle::UpdateChainedRowData(this=0x353cf8, new_row_data=..., old_row_fetch_res=..., vColTypes=..., block_hnd=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\SegHandles\CTblSegHandle.cpp:912)
#17 0x502fcc CTblSegHandle::UpdateRowData(this=0x353cf8, new_row_data=..., old_row_fetch_res=..., vColTypes=..., block_hnd=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\SegHandles\CTblSegHandle.cpp:764)
#18 0x443272 UpdateRow(row_addr=..., new_data_unit=..., vColTypes=..., block_hnd=..., seg_hnd=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\DbUtilities.cpp:910)
#19 0x443470 UpdateRow(row_addr=..., vColValues=..., vColTypes=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\DbUtilities.cpp:935)
#20 0x4023e3 test_RowChaining() (C:\dm\bin\codes\CodeBlocks\ProjTemp\main.cpp:234)
#21 0x4081c6 main() (C:\dm\bin\codes\CodeBlocks\ProjTemp\main.cpp:1034)
And here is the call stack when I step to the next line and debugger breaks one last time before resuming normal execution:
#0 0x770186ff ntdll!DbgBreakPoint() (C:\Windows\system32\ntdll.dll:??)
#1 0x77082edb ntdll!RtlpNtMakeTemporaryKey() (C:\Windows\system32\ntdll.dll:??)
#2 0x77052c7f ntdll!RtlQueryRegistryValues() (C:\Windows\system32\ntdll.dll:??)
#3 0x77083f3b ntdll!RtlpNtMakeTemporaryKey() (C:\Windows\system32\ntdll.dll:??)
#4 0x7704bcfd ntdll!EtwSendNotification() (C:\Windows\system32\ntdll.dll:??)
#5 0x770374d5 ntdll!RtlEnumerateGenericTableWithoutSplaying() (C:\Windows\system32\ntdll.dll:??)
#6 0x75829dc6 KERNEL32!HeapFree() (C:\Windows\system32\kernel32.dll:??)
#7 0x75a99c03 msvcrt!free() (C:\Windows\system32\msvcrt.dll:??)
#8 0x350000 ?? () (??:??)
--> #9 0x534020 free_wrapper(pv=0x352af0) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\Unrelated\MemMgmt.cpp:282)
#10 0x407f74 operator delete(pv=0x352af0) (C:\dm\bin\codes\CodeBlocks\ProjTemp\main.cpp:1002)
#11 0x629a74 __gnu_cxx::new_allocator<char>::deallocate(this=0x22f718, __p=0x352af0 "\nÿÿÿÿÿÿº\r%") (C:/Program Files/CodeBlocks/MinGW/lib/gcc/mingw32/4.9.2/include/c++/ext/new_allocator.h:110)
#12 0x6c2257 std::allocator_traits<std::allocator<char> >::deallocate(__a=..., __p=0x352af0 "\nÿÿÿÿÿÿº\r%", __n=50) (C:/Program Files/CodeBlocks/MinGW/lib/gcc/mingw32/4.9.2/include/c++/bits/alloc_traits.h:383)
#13 0x611940 basic_CDataUnit<std::allocator<char> >::~basic_CDataUnit(this=0x22f714, __vtt_parm=0x781df4 <VTT for basic_CDataUnit_TDB<std::allocator<char> >+4>, __in_chrg=<optimized out>) (include/DataUnit/CDataUnit.h:112)
#14 0x61dfa1 basic_CDataUnit_TDB<std::allocator<char> >::~basic_CDataUnit_TDB(this=0x22f714, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) (include/DataUnit/CDataUnit_TDB.h:125)
#15 0x503898 CTblSegHandle::UpdateChainedRowData(this=0x353cf8, new_row_data=..., old_row_fetch_res=..., vColTypes=..., block_hnd=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\SegHandles\CTblSegHandle.cpp:912)
#16 0x502fcc CTblSegHandle::UpdateRowData(this=0x353cf8, new_row_data=..., old_row_fetch_res=..., vColTypes=..., block_hnd=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\SegHandles\CTblSegHandle.cpp:764)
#17 0x443272 UpdateRow(row_addr=..., new_data_unit=..., vColTypes=..., block_hnd=..., seg_hnd=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\DbUtilities.cpp:910)
#18 0x443470 UpdateRow(row_addr=..., vColValues=..., vColTypes=...) (C:\dm\bin\codes\CodeBlocks\ProjTemp\src\DbUtilities.cpp:935)
#19 0x4023e3 test_RowChaining() (C:\dm\bin\codes\CodeBlocks\ProjTemp\main.cpp:234)
#20 0x4081c6 main() (C:\dm\bin\codes\CodeBlocks\ProjTemp\main.cpp:1034)
When I see a call stack that looks like yours, the most common cause is heap corruption. A double free or an attempt to free a pointer that was never allocated can produce similar call stacks. Since you characterize the crash as inconsistent, heap corruption is the more likely candidate, because double frees and frees of unallocated pointers tend to crash consistently in the same place. To hunt down issues like this I usually:
1. Install Debugging Tools for Windows.
2. Open a command prompt with elevated privileges.
3. Change directory to the directory that Debugging Tools for Windows is installed in.
4. Enable full page heap by running gflags.exe -p /enable applicationName.exe /full
5. Launch the application with the debugger attached and recreate the issue.
6. Disable full page heap for the application by running gflags.exe -p /disable applicationName.exe
Running the application with full page heap places an inaccessible page at the end of each allocation so that the program stops immediately if it accesses memory beyond the allocation (this is according to the GFlags and PageHeap documentation page). If a buffer overflow is causing the heap corruption, this setting should cause the debugger to break when the overflow occurs.
Make sure to disable page heap when you are done debugging. Running under full page heap can greatly increase memory pressure on an application by making every heap allocation consume an entire page.
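To make the mechanism concrete, here is a minimal sketch of the same idea on POSIX systems, using mmap() and mprotect(). It is only an analogy and not how the Windows page heap is implemented; the names GuardedBlock, guarded_alloc, guarded_free and overflow_is_trapped are hypothetical. Each allocation is placed so that it ends exactly where an inaccessible guard page begins, which means the first byte written past the end faults immediately.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstring>

// Hypothetical guarded allocation: user memory followed by a PROT_NONE page.
struct GuardedBlock {
    char *user;          // pointer handed to the caller
    void *base;          // start of the whole mapping
    std::size_t mapped;  // total mapped size, including the guard page
};

GuardedBlock guarded_alloc(std::size_t n) {
    assert(n > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t data_pages = (n + page - 1) / page;
    const std::size_t mapped = (data_pages + 1) * page;
    void *base = mmap(nullptr, mapped, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return {nullptr, nullptr, 0};
    char *guard = static_cast<char *>(base) + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, mapped);
        return {nullptr, nullptr, 0};
    }
    // The allocation ends exactly at the start of the guard page.
    return {guard - n, base, mapped};
}

void guarded_free(const GuardedBlock &b) {
    if (b.base != nullptr) munmap(b.base, b.mapped);
}

// Writes one byte past the end in a child process and reports whether
// the child was killed by a memory fault.
bool overflow_is_trapped(std::size_t n) {
    GuardedBlock b = guarded_alloc(n);
    if (b.user == nullptr) return false;
    std::memset(b.user, 0xAB, n);  // in-bounds writes are fine
    pid_t pid = fork();
    if (pid < 0) { guarded_free(b); return false; }
    if (pid == 0) {
        volatile char *p = b.user;
        p[n] = 1;   // one past the end: lands on the guard page
        _exit(0);   // only reached if the overflow was NOT trapped
    }
    int status = 0;
    waitpid(pid, &status, 0);
    guarded_free(b);
    return WIFSIGNALED(status) &&
           (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

The sketch also shows where the memory cost comes from: even a 50-byte request occupies one data page plus one guard page.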
You can use valgrind to check whether there are any invalid reads or writes, or any invalid frees, in your code.
valgrind -v --leak-check=full --show-reachable=yes --log-file=log_valgrind ./Process
The log_valgrind file will then contain any invalid reads and writes that were detected.
Some time ago we split our big project, which was built almost entirely from static libraries, into many projects with dynamic libraries.
Since then we have started seeing problems on shutdown.
Sometimes the process does not terminate. With gdb I found that a segfault occurs during object destruction, but the process is then blocked in futex_wait.
I've since improved the code: global objects are now created inside functions instead of being global static data. That reduced the problem: it no longer happens in my development environment.
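For readers unfamiliar with the pattern, the change described above is roughly the following. This is a generic sketch with hypothetical names (Logger, global_logger), not the actual code from the project: the object becomes a function-local static, so it is constructed on first use rather than during library load.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one of the formerly global objects.
class Logger {
public:
    void log(const std::string &message) {
        assert(!message.empty());
        lines_.push_back(message);
    }
    std::size_t count() const { return lines_.size(); }

private:
    std::vector<std::string> lines_;
};

// Before: `static Logger g_logger;` at namespace scope, constructed at
// load time in an order that is unspecified across translation units.
// After: constructed on the first call; initialization is thread-safe
// since C++11.
Logger &global_logger() {
    static Logger instance;
    return instance;
}
```

Note that a function-local static is still destroyed at exit, in reverse order of construction, so this fixes initialization order but not necessarily destruction order across shared libraries.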
However, processes still get stuck on shutdown in the test environment (rarely) and in the production environment (often). So we need to restart the container manually or add some kind of health check.
We are trying to reproduce this situation in a standalone Docker container running under Kubernetes, where the process runs under circusd, and we see the following:
#0 malloc_consolidate (av=0xf47fc400 <main_arena>) at malloc.c:4151
#1 0xf46ff1ab in _int_free (av=0xf47fc400 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:4057
#2 0xf48c6e68 in operator delete(void*) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#3 0xf52d173d in std::_Deque_base<boost::log::v2_mt_posix::record_view, std::allocator<boost::log::v2_mt_posix::record_view> >::~_Deque_base() () from /usr/local/lib/liblog.so.0
#4 0xf52d18b3 in std::deque<boost::log::v2_mt_posix::record_view, std::allocator<boost::log::v2_mt_posix::record_view> >::~deque() () from /usr/local/lib/liblog.so.0
#5 0xf52d1940 in boost::log::v2_mt_posix::sinks::bounded_fifo_queue<4000u, boost::log::v2_mt_posix::sinks::drop_on_overflow>::~bounded_fifo_queue() () from /usr/local/lib/liblog.so.0
#6 0xf52d462e in boost::log::v2_mt_posix::sinks::asynchronous_sink<cout_sink, boost::log::v2_mt_posix::sinks::bounded_fifo_queue<4000u, boost::log::v2_mt_posix::sinks::drop_on_overflow>
>::~asynchronous_sink() () from /usr/local/lib/liblog.so.0
#7 0xf52d47f4 in asynchronous_sink<cout_sink>::~asynchronous_sink() () from /usr/local/lib/liblog.so.0
#8 0xf52c199a in boost::detail::sp_counted_impl_pd<asynchronous_sink<cout_sink>*, boost::detail::sp_ms_deleter<asynchronous_sink<cout_sink> >
>::dispose() () from /usr/local/lib/liblog.so.0
#9 0xf51f3e7b in boost::log::v2_mt_posix::core::~core() () from /usr/lib/libboost_log.so.1.58.0
#10 0xf51f6529 in boost::detail::sp_counted_impl_p<boost::log::v2_mt_posix::core>::dispose() () from /usr/lib/libboost_log.so.1.58.0
#11 0xf51f6160 in boost::shared_ptr<boost::log::v2_mt_posix::core>::~shared_ptr() () from /usr/lib/libboost_log.so.1.58.0
#12 0xf46bcfb3 in __cxa_finalize (d=0xf526fa88) at cxa_finalize.c:56
#13 0xf51eaab3 in ?? () from /usr/lib/libboost_log.so.1.58.0
#14 0xf7769e2c in _dl_fini () at dl-fini.c:252
#15 0xf46bcc21 in __run_exit_handlers (status=status@entry=0, listp=0xf47fc3a4 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#16 0xf46bcc7d in __GI_exit (status=0) at exit.c:104
#17 0xf46a572b in __libc_start_main (main=0x8060dc0, argc=5, argv=0xffdd1514, init=0x8088090, fini=0x8088100, rtld_fini=0xf7769c50 <_dl_fini>, stack_end=0xffdd150c) at libc-start.c:321
#18 0x080630cc in ?? ()
I have no idea how to progress from here. What is happening? Why do we get the segfault during boost::log::core destruction in this environment?
Does anyone have advice, perhaps based on experience, on how I can track this down?
We are facing a crash in the quickfix library when we receive a very large message from the counterparty. I put a debug build on the machine, and the crash dump is below the message.
I am using quickfix version 1.13.3 and gcc 4.4.7 on CentOS 6.x. When I looked at FieldMap.cpp:174, it is a delete statement.
A similar issue with allocators was probably reported here:
http://sourceforge.net/p/quickfix/mailman/message/10833533/
which seems to have been fixed in 1.12.4. However, since I’m using a later version, this should not be the case here.
The configure script on my machine gives the following output related to the allocators:
checking for boost::pool_allocator... yes
checking for boost::fast_pool_allocator... yes
checking __gnu_cxx::__pool_alloc... yes
checking __gnu_cxx::__mt_alloc... yes
checking __gnu_cxx::bitmap_allocator... yes
Any pointers on how I can go about fixing this issue?
Stack Trace:
(gdb) bt
#0 0x00007ffff6858084 in FIX::FieldMap::clear (this=0x7ffff5271630) at FieldMap.cpp:174
#1 0x00007ffff6858a49 in FIX::FieldMap::~FieldMap (this=0x7ffff5271630, __in_chrg=<value optimized out>) at FieldMap.cpp:35
#2 0x000000000061bf66 in FIX::Message::~Message (this=0x7ffff5271630, __in_chrg=<value optimized out>)
at /usr/local/include/quickfix/Message.h:58
#3 0x00007ffff6806d14 in FIX::Session::next (this=0x9c6480, msg= "8=FIX.4.4\001\071=166387\001\063\065=W\001\063\064=3\001\064\071=BCSGATEWAY\001\065\062=20151109-21:10:23.243\001\065\066=MDFOREX\001\065\065=LAN\001\061\066\067=CS\001\062\060\067=XSGO\001\062\066\062=1\001\062\066\070=935\001\062\066\071=5\001\062\067\060=3998.8\001\062\067\062=20151109\001\062\070\066=6\001\062\071\060=1\001\062\066\071=7\001\062\067\060=4150\001\062\071\060=1\001\062\066\071=8\001\062\067\060=3950.1\001\062\071\060="..., timeStamp=..., queued=<value optimized out>) at Session.cpp:1309
#4 0x00007ffff682fecc in FIX::SocketConnection::readMessages (this=0x7fffe8000f90, s=...) at SocketConnection.cpp:234
#5 0x00007ffff682fff5 in FIX::SocketConnection::read (this=0x7fffe8000f90, s=...) at SocketConnection.cpp:124
#6 0x00007ffff6821e51 in FIX::ConnectorWrapper::onEvent (this=0x7ffff5271d60, socket=23) at SocketConnector.cpp:67
#7 0x00007ffff682e03d in FIX::SocketMonitor::processReadSet (this=0x9cb0a0, strategy=..., readSet=...) at SocketMonitor.cpp:287
#8 0x00007ffff682edcd in FIX::SocketMonitor::block (this=0x9cb0a0, strategy=..., poll=false, timeout=<value optimized out>)
at SocketMonitor.cpp:243
#9 0x00007ffff6821cc8 in FIX::SocketConnector::block (this=<value optimized out>, strategy=<value optimized out>,
poll=<value optimized out>, timeout=<value optimized out>) at SocketConnector.cpp:144
#10 0x00007ffff682b021 in FIX::SocketInitiator::onStart (this=0x9cadf0) at SocketInitiator.cpp:96
#11 0x00007ffff68247fa in FIX::Initiator::startThread (p=<value optimized out>) at Initiator.cpp:336
#12 0x0000003284c07a51 in start_thread () from /lib64/libpthread.so.0
#13 0x00000032848e893d in clone () from /lib64/libc.so.6
I am seeing a core dump on Solaris in the exit procedure of my program. How can I debug and fix this kind of core dump?
(gdb) where
#0 0xff2cc0c0 in kill () from /usr/lib/libc.so.1
#1 0x0004dac0 in run_before_killed_handler (sig=11) at NdmpServer.cpp:1186
#2 signal handler called
#3 0xfee0ad50 in ?? ()
#4 0x00060a6c in proc_cleanup ()
#5 0xff2421ac in _exithandle () from /usr/lib/libc.so.1
#6 0xff2305d8 in exit () from /usr/lib/libc.so.1
#7 0x0003431c in _start ()
Your program apparently uses atexit(3C) to register an exit handler. The problem is occurring in that handler.
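To show how such a handler ends up on the stack below exit(), here is a small sketch. The names cleanup_first, cleanup_second and exit_handler_order are hypothetical and unrelated to the proc_cleanup in the trace. A child process registers two handlers with std::atexit() and then calls std::exit(); each handler reports through a pipe so the parent can see that they ran, and in which order.

```cpp
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstdlib>
#include <string>

static int g_report_fd = -1;  // write end of the pipe, set in the child

// Hypothetical exit handlers: each writes one letter to the pipe.
static void cleanup_first() {
    assert(g_report_fd >= 0);
    if (write(g_report_fd, "A", 1) != 1) { /* nothing useful to do at exit */ }
}

static void cleanup_second() {
    assert(g_report_fd >= 0);
    if (write(g_report_fd, "B", 1) != 1) { /* nothing useful to do at exit */ }
}

// Returns the letters in the order the child's exit handlers ran.
std::string exit_handler_order() {
    int fds[2];
    if (pipe(fds) != 0) return "";
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {
        close(fds[0]);
        g_report_fd = fds[1];
        std::atexit(cleanup_first);   // registered first, runs last
        std::atexit(cleanup_second);  // registered last, runs first
        std::exit(0);                 // handlers run inside exit()
    }
    close(fds[1]);
    std::string seen;
    char c = 0;
    while (read(fds[0], &c, 1) == 1) seen.push_back(c);
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return seen;
}
```

Handlers run in reverse order of registration, after main has returned or exit() has been called. A handler that touches a resource which has already been released by that point (a freed buffer, a closed handle, an unloaded library) is a typical way to get a crash like the one in the trace.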
Without knowing the finer details of Solaris memory layouts, 0xfee0ad50 seems to be on the OS side. What OS call are you trying (and failing) to make in proc_cleanup?