I have a test program that tries to parse an example XML file on SLES11, but the result is a segmentation fault.
However, if I link without libdb2, it works fine.
g++-8.3 -o testXmlParser main.cpp -m31 -lxml2
When I add -ldb2, I get the segmentation fault mentioned above, preceded by a "1: parser error : Document is empty" message:
g++-8.3 -o testXmlParser main.cpp -m31 -lxml2 -ldb2
My code:
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <iostream>
int main ()
{
xmlDoc *doc = NULL;
xmlNode *root_element = NULL;
std::cout << "log1" << std::endl;
doc = xmlParseEntity("/tmp/testXML.xml");
std::cout << "log2" << std::endl;
root_element = xmlDocGetRootElement(doc);
std::cout << "root element: "<<root_element->name << std::endl ;
return 0;
}
And the callstack:
#0 0x7b30399e in free () from /lib/libc.so.6
#1 0x7bb3bb92 in destroy () from /data/db2inst1/sqllib/lib32/libdb2.so.1
#2 0x7bb3cdf4 in gzclose () from /data/db2inst1/sqllib/lib32/libdb2.so.1
#3 0x7d1896f0 in ?? () from /usr/lib/libxml2.so.2
#4 0x7d187e80 in xmlFreeParserInputBuffer () from /usr/lib/libxml2.so.2
#5 0x7d1602f4 in xmlFreeInputStream () from /usr/lib/libxml2.so.2
#6 0x7d160336 in xmlFreeParserCtxt () from /usr/lib/libxml2.so.2
#7 0x7d17427c in xmlSAXParseEntity () from /usr/lib/libxml2.so.2
#8 0x00400c02 in main ()
Could you help me solve this problem?
This is a test program, the db2 is not used here, but used in our software where this problem comes from.
The problem is that libxml requires libz and you're not linking with it.
Since Db2 includes zlib in their libraries (see stack frames #1, #2) the symbols are getting resolved by the linker.
There must be some incompatibility between the zlib that libxml expects, and the version that is embedded into Db2.
Try adding '-lz' to your compile line, before '-ldb2', so that the linker will try to use that library first.
Db2 uses zlib internally and those symbols are (incorrectly) exported. This will be addressed via APAR
IT29520: ZLIB SYMBOLS INSIDE LIBDB2.SO ARE GLOBALLY VISIBLE WHICH MEANS THEY COLLIDE WITH ZLIB SYMBOLS INSIDE LIBZ.SO
With LD_DEBUG=all you'll see how symbols are mapped/resolved. You can try @memmertoIBM's suggestion, or put libdb2 behind zlib in LD_LIBRARY_PATH.
UPDATE: I've found a partial workaround. See bottom of this post.
After a number of hours of debugging a program, I've found that there is some kind of conflict between the netCDF and HDF5 libraries (the program reads/writes files of both formats).
I've boiled down the code to a tiny program that shows the issue.
This program segfaults:
#include <iostream>
#include <string>
#include "H5Cpp.h"
#include <netcdf>
using namespace std;
void stupidfunction() // Note that this is never called.
{
H5::Group grp1; // The mere potential existence of this makes netcdf segfault!
}
int main(int argn, char ** args)
{
std::string outputFilename = "/tmp/test.nc";
try
{
std::cout << "Now opening " << outputFilename << std::endl;
netCDF::NcFile sfc;
sfc.open(outputFilename, netCDF::NcFile::replace);
std::cout << "closing file" << std::endl;
sfc.close();
return true;
}
catch(netCDF::exceptions::NcException& e)
{
std::cout << "EX: " << e.what() << std::endl;
return false;
}
return 0;
}
(Compile command: h5c++ test.cpp -std=gnu++11 -O0 -g3 -lnetcdf_c++4 -lnetcdf -o test)
My installed (relevant) packages:
libnetcdf-c++4-1 4.3.1-2build1 amd64 C++ interface for scientific data access to large binary data
libnetcdf-c++4-dev 4.3.1-2build1 amd64 creation, access, and sharing of scientific data in C++
libnetcdf-dev 1:4.7.3-1 amd64 creation, access, and sharing of scientific data
libnetcdf15:amd64 1:4.7.3-1 amd64 Interface for scientific data access to large binary data
netcdf-bin 1:4.7.3-1 amd64 Programs for reading and writing NetCDF files
netcdf-doc 1:4.7.3-1 all Documentation for NetCDF
hdf5-helpers 1.10.4+repack-11ubuntu1 amd64 Hierarchical Data Format 5 (HDF5) - Helper tools
hdf5-tools 1.10.4+repack-11ubuntu1 amd64 Hierarchical Data Format 5 (HDF5) - Runtime tools
libhdf4-0 4.2.14-1ubuntu1 amd64 Hierarchical Data Format library (embedded NetCDF)
libhdf5-103:amd64 1.10.4+repack-11ubuntu1 amd64 Hierarchical Data Format 5 (HDF5) - runtime files - serial version
libhdf5-cpp-103:amd64 1.10.4+repack-11ubuntu1 amd64 Hierarchical Data Format 5 (HDF5) - C++ libraries
libhdf5-dev 1.10.4+repack-11ubuntu1 amd64 Hierarchical Data Format 5 (HDF5) - development files - serial versio
What is going on here that makes it segfault?
Is there anything I can do to avoid or fix this problem?
Any help is appreciated!
When run using gdb:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a47a01 in __vfprintf_internal (s=s@entry=0x7fffff7ff480,
format=format@entry=0x7ffff77e13a8 "can't locate ID",
ap=ap@entry=0x7fffff7ff5e0, mode_flags=mode_flags@entry=2)
at vfprintf-internal.c:1289
1289 vfprintf-internal.c: No such file or directory.
gdb backtrace:
(gdb) bt
#0 0x00007ffff7a47a01 in __vfprintf_internal (s=s@entry=0x7fffff7ff480, format=format@entry=0x7ffff77e13a8 "can't locate ID",
ap=ap@entry=0x7fffff7ff5e0, mode_flags=mode_flags@entry=2) at vfprintf-internal.c:1289
#1 0x00007ffff7a5cd4a in __vasprintf_internal (result_ptr=0x7fffff7ff5d8, format=0x7ffff77e13a8 "can't locate ID", args=0x7fffff7ff5e0, mode_flags=2)
at vasprintf.c:57
#2 0x00007ffff75c7e56 in H5E_printf_stack () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.103
#3 0x00007ffff76553b9 in H5I_inc_ref () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.103
...
many many lines repeating H5E_printf_stack, H5E__push_stack and H5I_inc_ref
...
#56139 0x00007ffff76553b9 in H5I_inc_ref () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.103
#56140 0x00007ffff75c7c2f in H5E__push_stack () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.103
#56141 0x00007ffff75c7e7e in H5E_printf_stack () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.103
#56142 0x00007ffff761fc85 in H5G_loc () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.103
#56143 0x00007ffff7546903 in H5Acreate1 () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.103
#56144 0x00007ffff790b11b in NC4_write_provenance () from /usr/lib/x86_64-linux-gnu/libnetcdf.so.15
#56145 0x00007ffff790b5a8 in ?? () from /usr/lib/x86_64-linux-gnu/libnetcdf.so.15
#56146 0x00007ffff790b7b0 in nc4_close_hdf5_file () from /usr/lib/x86_64-linux-gnu/libnetcdf.so.15
#56147 0x00007ffff790b9ea in NC4_close () from /usr/lib/x86_64-linux-gnu/libnetcdf.so.15
#56148 0x00007ffff78ca579 in nc_close () from /usr/lib/x86_64-linux-gnu/libnetcdf.so.15
#56149 0x00007ffff7f82270 in netCDF::NcFile::close() () from /usr/lib/x86_64-linux-gnu/libnetcdf_c++4.so.1
#56150 0x00005555555a7959 in main (argn=1, args=0x7fffffffe5b8) at test.cpp:29
(gdb)
Partial workaround:
If I specify the netCDF file format as classic or classic64, the error does not occur. i.e:
sfc.open(outputFilename, netCDF::NcFile::replace, netCDF::NcFile::classic);
or
sfc.open(outputFilename, netCDF::NcFile::replace, netCDF::NcFile::classic64);
I have a C program that was showing similar behaviour. I found that adding
#include <H5public.h>
:
if (H5dont_atexit() < 0)
{
fprintf(stderr, "failed HDF5 don't-atexit\n");
return 1;
}
at the start of main() fixes the issue. That does mean that files that you H5Fopen() don't get automatically H5Fclose()-ed, but it is possibly a lower-impact workaround.
I am using a global std::shared_ptr to handle automatic deletion of my Vulkan VkInstance. The pointer has a custom deleter that calls vkDestroyInstance when it goes out of scope. Everything works as expected until I enable the VK_LAYER_LUNARG_standard_validation layer at which point the vkDestroyInstance function causes a segfault.
I have added a minimal example below that produces the issue.
minimal.cpp
#include <vulkan/vulkan.h>
#include <iostream>
#include <memory>
#include <vector>
#include <cstdlib>
// The global self deleting instance
std::shared_ptr<VkInstance> instance;
int main()
{
std::vector<const char *> extensions = {VK_EXT_DEBUG_REPORT_EXTENSION_NAME};
std::vector<const char *> layers = {};
// Uncomment to cause segfault:
// layers.emplace_back("VK_LAYER_LUNARG_standard_validation");
VkApplicationInfo app_info = {};
app_info.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
app_info.pApplicationName = "Wat";
app_info.applicationVersion = VK_MAKE_VERSION(1, 0, 0);
app_info.pEngineName = "No Engine";
app_info.engineVersion = VK_MAKE_VERSION(1, 0, 0);
app_info.apiVersion = VK_API_VERSION_1_0;
VkInstanceCreateInfo instance_info = {};
instance_info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
instance_info.pApplicationInfo = &app_info;
instance_info.enabledExtensionCount =
static_cast<uint32_t>(extensions.size());
instance_info.ppEnabledExtensionNames = extensions.data();
instance_info.enabledLayerCount = static_cast<uint32_t>(layers.size());
instance_info.ppEnabledLayerNames = layers.data();
// Handles auto deletion of the instance when it goes out of scope
auto deleter = [](VkInstance *pInstance)
{
if (*pInstance)
{
vkDestroyInstance(*pInstance, nullptr);
std::cout << "Deleted instance" << std::endl;
}
delete pInstance;
};
instance = std::shared_ptr<VkInstance>(new VkInstance(nullptr), deleter);
if (vkCreateInstance(&instance_info, nullptr, instance.get()) != VK_SUCCESS)
{
std::cerr << "Failed to create a Vulkan instance" << std::endl;
return EXIT_FAILURE;
}
std::cout << "Created instance" << std::endl;
// When the program exits, everything should clean up nicely?
return EXIT_SUCCESS;
}
When running the above program as is, the output is what I would expect:
$ g++-7 -std=c++14 minimal.cpp -isystem $VULKAN_SDK/include -L$VULKAN_SDK/lib -lvulkan -o minimal
$ ./minimal
Created instance
Deleted instance
$
However as soon as I add back the VK_LAYER_LUNARG_standard_validation line:
// Uncomment to cause segfault:
layers.emplace_back("VK_LAYER_LUNARG_standard_validation");
I get
$ g++-7 -std=c++14 minimal.cpp -isystem $VULKAN_SDK/include -L$VULKAN_SDK/lib -lvulkan -o minimal
$ ./minimal
Created instance
Segmentation fault (core dumped)
$
When run with gdb, the backtrace shows the segfault occurring inside the vkDestroyInstance call:
$ g++-7 -std=c++14 -g minimal.cpp -isystem $VULKAN_SDK/include -L$VULKAN_SDK/lib -lvulkan -o minimal
$ gdb -ex run ./minimal
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
...
Starting program: /my/path/stackoverflow/vulkan/minimal
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Created instance
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff24c4334 in threading::DestroyInstance(VkInstance_T*, VkAllocationCallbacks const*) () from /my/path/Vulkan/1.1.77.0/x86_64/lib/libVkLayer_threading.so
(gdb) bt
#0 0x00007ffff24c4334 in threading::DestroyInstance(VkInstance_T*, VkAllocationCallbacks const*) () from /my/path/Vulkan/1.1.77.0/x86_64/lib/libVkLayer_threading.so
#1 0x00007ffff7bad243 in vkDestroyInstance () from /my/path/Vulkan/1.1.77.0/x86_64/lib/libvulkan.so.1
#2 0x000000000040105c in <lambda(VkInstance_T**)>::operator()(VkInstance *) const (__closure=0x617c90, pInstance=0x617c60) at minimal.cpp:38
#3 0x000000000040199a in std::_Sp_counted_deleter<VkInstance_T**, main()::<lambda(VkInstance_T**)>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose(void) (this=0x617c80) at /usr/include/c++/7/bits/shared_ptr_base.h:470
#4 0x0000000000401ef0 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x617c80) at /usr/include/c++/7/bits/shared_ptr_base.h:154
#5 0x0000000000401bc7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x6052d8 <instance+8>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:684
#6 0x0000000000401b6a in std::__shared_ptr<VkInstance_T*, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x6052d0 <instance>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#7 0x0000000000401b9c in std::shared_ptr<VkInstance_T*>::~shared_ptr (this=0x6052d0 <instance>, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr.h:93
#8 0x00007ffff724bff8 in __run_exit_handlers (status=0, listp=0x7ffff75d65f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#9 0x00007ffff724c045 in __GI_exit (status=<optimized out>) at exit.c:104
#10 0x00007ffff7232837 in __libc_start_main (main=0x40108c <main()>, argc=1, argv=0x7fffffffdcf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdce8) at ../csu/libc-start.c:325
#11 0x0000000000400ed9 in _start ()
(gdb)
The problem can be fixed by using a local instance (inside the main function) instead of a global one so I'm thinking I might not fully understand some nuances of the Vulkan linker when using layers.
In my actual application I want to use a lazily instantiated static class to keep track of all my Vulkan objects and so I run into the same problem when the program exits.
Setup
g++: 7.3.0
OS: Ubuntu 16.04
Nvidia Driver: 390.67 (also tried 396)
Vulkan SDK: 1.1.77.0 (also tried 1.1.73)
GPU: GeForce GTX TITAN (Dual SLI if that matters?)
Global variables are a bad idea. Their destruction is unordered relative to each other in most cases.
Clean up your state in main, not at static destruction time. Simple objects that depend only on memory (a small step up from POD) and don't cross depend tend not to cause problems, but go any further and you enter a hornet's nest.
Your global shared ptr is being cleared and the destruction code run after some arbitrary global state within Vulkan is being cleared. This is causing a segfault. The interesting thing here isn't "why this segfault" but rather "how can I avoid this kind of segfault". The answer to that is "stop using global state"; nothing else really works.
I am using a C++ logging library on a FreeBSD 10 machine and I am running into trouble closing threads when receiving a sigint.
I created a GitHub project for testing purposes (link). If you build it on FreeBSD 10, execute it and press [ctrl+c], it will terminate. You can find the build commands I use below.
$ git clone git@github.com:tijme/free-bsd-thread-bug.git
$ cd free-bsd-thread-bug && mkdir -p cmake-build-debug && cd cmake-build-debug
$ cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER="/usr/local/bin/gcc6" -DCMAKE_CXX_COMPILER="/usr/local/bin/g++6"
$ make -dA
$ ./FreeBSDThreadBug
Code I used (can also be found in the GitHub repository)
/* main.cpp */
#include "Example.h"
#include <iostream>
#include <csignal>
#include <thread>
#include <chrono>
Example* example = new Example();
void onSignal(int signum)
{
delete example;
exit(0);
}
int main() {
signal(SIGINT, onSignal);
std::this_thread::sleep_for(std::chrono::milliseconds(5000));
return 0;
}
/* Example.h */
#ifndef FREEBSDTHREADBUG_EXAMPLE_H
#define FREEBSDTHREADBUG_EXAMPLE_H
#include <thread>
#include <iostream>
#include <chrono>
class Example {
public:
Example();
~Example();
std::thread threadHandle;
void threadFunction();
};
#endif //FREEBSDTHREADBUG_EXAMPLE_H
/* Example.cpp */
#include "Example.h"
#include <thread>
#include <chrono>
Example::Example()
{
std::cout << "Main: starting thread" << std::endl;
threadHandle = std::thread(&Example::threadFunction, this);
std::cout << "Main: thread started" << std::endl;
}
Example::~Example()
{
std::cout << "THIS ID: " << std::this_thread::get_id() << std::endl;
std::cout << "THREAD ID: " << threadHandle.get_id() << std::endl;
std::cout << "Main: joining thread" << std::endl;
threadHandle.join();
std::cout << "Main: thread joined" << std::endl;
}
void Example::threadFunction() {
std::cout << "Thread: starting to sleep" << std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(2500));
std::cout << "Thread: sleep finished" << std::endl;
}
Correct output (on e.g. MacOS Sierra):
As you can see, the IDs of the threads are different, as expected.
$ ./FreeBSDThreadBug
Main: starting thread
Main: thread started
Thread: starting to sleep
^C
THIS ID: 0x7fffa428a3c0
THREAD ID: 0x70000c044000
Main: joining thread
Thread: sleep finished
Main: thread joined
Wrong output (termination, on FreeBSD 10.3):
The thread IDs are the same here, which is pretty weird.
$ ./FreeBSDThreadBug
Main: starting thread
Main: thread started
Thread: starting to sleep
^C
THIS ID: 0x801c06800
THREAD ID: 0x801c06800
Main: joining thread
terminate called after throwing an instance of 'std::system_error'
what(): Resource deadlock avoided
Abort (core dumped)
Core dump
Core was generated by `FreeBSDThreadBug'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00000008012d335a in thr_kill () from /lib/libc.so.7
[Current thread is 1 (LWP 100146)]
(gdb) bt full
#0 0x00000008012d335a in thr_kill () from /lib/libc.so.7
No symbol table info available.
#1 0x00000008012d3346 in raise () from /lib/libc.so.7
No symbol table info available.
#2 0x00000008012d32c9 in abort () from /lib/libc.so.7
No symbol table info available.
#3 0x0000000800ad8afd in __gnu_cxx::__verbose_terminate_handler () at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.3.0/libstdc++-v3/libsupc++/vterminate.cc:95
terminating = true
t = <optimized out>
#4 0x0000000800ad5b48 in __cxxabiv1::__terminate (handler=<optimized out>) at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
No locals.
#5 0x0000000800ad5bb1 in std::terminate () at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:57
No locals.
#6 0x0000000800ad5dc8 in __cxxabiv1::__cxa_throw (obj=obj@entry=0x80200e0a0, tinfo=0x800dd0bc0 <typeinfo for std::system_error>, dest=0x800b073b0 <std::system_error::~system_error()>)
at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.3.0/libstdc++-v3/libsupc++/eh_throw.cc:87
globals = <optimized out>
#7 0x0000000800b04cd1 in std::__throw_system_error (__i=11) at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.3.0/libstdc++-v3/src/c++11/functexcept.cc:130
No locals.
#8 0x0000000800b0792c in std::thread::join (this=0x801c5c058) at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.3.0/libstdc++-v3/src/c++11/thread.cc:139
__e = <optimized out>
#9 0x00000000004016fc in Example::~Example (this=0x801c5c058, __in_chrg=<optimized out>) at /root/FreeBSDThreadBug/Example.cpp:18
No locals.
#10 0x00000000004010b7 in onSignal (signum=2) at /root/FreeBSDThreadBug/main.cpp:11
No locals.
#11 0x000000080082fb4a in ?? () from /lib/libthr.so.3
No symbol table info available.
#12 0x000000080082f22c in ?? () from /lib/libthr.so.3
No symbol table info available.
#13 <signal handler called>
No symbol table info available.
#14 0x00000008012efb5a in _nanosleep () from /lib/libc.so.7
No symbol table info available.
#15 0x000000080082cc4c in ?? () from /lib/libthr.so.3
No symbol table info available.
#16 0x000000000040155d in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/local/lib/gcc6/include/c++/thread:322
__s = {__r = 2}
__ns = {__r = 500000000}
__ts = {tv_sec = 1, tv_nsec = 126917539}
#17 0x000000000040177a in Example::threadFunction (this=0x801c5c058) at /root/FreeBSDThreadBug/Example.cpp:24
No locals.
#18 0x0000000000402432 in std::__invoke_impl<void, void (Example::* const&)(), Example*>(std::__invoke_memfun_deref, void (Example::* const&)(), Example*&&) (
__f=@0x801c5e050: (void (Example::*)(Example * const)) 0x40172c <Example::threadFunction()>, __t=<unknown type in /root/FreeBSDThreadBug/cmake-build-debug/FreeBSDThreadBug, CU 0x552f, DIE 0xb2d7>)
at /usr/local/lib/gcc6/include/c++/functional:227
No locals.
#19 0x00000000004023bf in std::__invoke<void (Example::* const&)(), Example*>(void (Example::* const&)(), Example*&&) (__fn=@0x801c5e050: (void (Example::*)(Example * const)) 0x40172c <Example::threadFunction()>,
__args#0=<unknown type in /root/FreeBSDThreadBug/cmake-build-debug/FreeBSDThreadBug, CU 0x552f, DIE 0xb2d7>) at /usr/local/lib/gcc6/include/c++/functional:251
No locals.
#20 0x0000000000402370 in std::_Mem_fn_base<void (Example::*)(), true>::operator()<Example*>(Example*&&) const (this=0x801c5e050,
__args#0=<unknown type in /root/FreeBSDThreadBug/cmake-build-debug/FreeBSDThreadBug, CU 0x552f, DIE 0xb2d7>) at /usr/local/lib/gcc6/include/c++/functional:604
No locals.
#21 0x000000000040233b in std::_Bind_simple<std::_Mem_fn<void (Example::*)()> (Example*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=0x801c5e048) at /usr/local/lib/gcc6/include/c++/functional:1391
No locals.
#22 0x0000000000402289 in std::_Bind_simple<std::_Mem_fn<void (Example::*)()> (Example*)>::operator()() (this=0x801c5e048) at /usr/local/lib/gcc6/include/c++/functional:1380
No locals.
#23 0x0000000000402268 in std::thread::_State_impl<std::_Bind_simple<std::_Mem_fn<void (Example::*)()> (Example*)> >::_M_run() (this=0x801c5e040) at /usr/local/lib/gcc6/include/c++/thread:196
No locals.
#24 0x0000000800b0769f in std::execute_native_thread_routine (__p=0x801c5e040) at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.3.0/libstdc++-v3/src/c++11/thread.cc:83
__t = std::unique_ptr<std::thread::_State> containing 0x801c5e040
#25 0x000000080082a855 in ?? () from /lib/libthr.so.3
No symbol table info available.
#26 0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
System information
$ freebsd-version
10.3-RELEASE
$ /usr/local/bin/gcc6 --version
gcc6 (FreeBSD Ports Collection) 6.3.0
$ /usr/local/bin/g++6 --version
g++6 (FreeBSD Ports Collection) 6.3.0
$ cmake --version
cmake version 3.7.2
The original issue I created can be found on GitHub (link), however there is no fix yet.
I hope someone will be able to help me fix this issue. Thanks in advance.
That's not a bug, that's a feature.
You don't get a guarantee about where your signal is going to be delivered, and the set of things you're allowed to do in a signal handler is restricted.
See sigaction(3) for details about what you can do (and you can't do anything else). Your program is doing many things that are not allowed in a signal handler.
The correct thing to do is to signal something else in your program and return from the signal handler. An example technique for doing that is the "self pipe trick". Create a pipe and keep a handle to both ends. Read from one end in your normal I/O processing. If you get a signal, in the signal handler, write a byte to the other end of the pipe and return. When the byte is read from the pipe you know the signal has arrived and you can do extended processing safely.
Update:
As Michael Burr has pointed out, you can block particular threads from receiving particular signals using pthread_sigmask(3). However, to fix the underlying problem here you still need to not do the work in the signal handler.
I used msys64 - mingw32 (on Windows 7, 64 bit) to build a 3rd party library (libosmscout) using a supplied makefile. I was primarily interested in one particular example from the library (Demos/src/DrawMapCairo). With the makefile, the complete library was built successfully, including the demos. The example console application I am interested in works fine.
My intention, however, is to make my own application using the Code::Blocks IDE, which would use the functionality of the example app. Therefore I tried to build the example in Code::Blocks (New project -> console application, GCC 5.1 MinGW). After a while I managed to get a successful build with 0 errors/warnings. But the application doesn't work: it crashes with a SIGSEGV. "cout debugging" and stepping through in the debugger suggest that the issue is (or starts) at the line
osmscout::DatabaseRef database(new osmscout::Database(databaseParameter));
Source code with main() :
#includes...
static const double DPI=96.0;
int main(int argc, char* argv[])
{
std::string map;
std::string style;
std::string output;
size_t width,height;
double lon,lat,zoom;
if (argc!=9) {
std::cerr << "DrawMap <map directory> <style-file> <width> <height> <lon> <lat> <zoom> <output>" << std::endl;
return 1;
}
map=argv[1];
style=argv[2];
//next 6 lines not exactly as in source, but for shorter code:
osmscout::StringToNumber(argv[3],width);
osmscout::StringToNumber(argv[4],height);
sscanf(argv[5],"%lf",&lon);
sscanf(argv[6],"%lf",&lat);
sscanf(argv[7],"%lf",&zoom);
output=argv[8];
osmscout::DatabaseParameter databaseParameter;
osmscout::DatabaseRef database(new osmscout::Database(databaseParameter));
osmscout::MapServiceRef mapService(new osmscout::MapService(database));
if (!database->Open(map.c_str())) {
std::cerr << "Cannot open database" << std::endl;
return 1;
}
osmscout::StyleConfigRef styleConfig(new osmscout::StyleConfig (database->GetTypeConfig()));
if (!styleConfig->Load(style)) {
std::cerr << "Cannot open style" << std::endl;
}
cairo_surface_t *surface;
cairo_t *cairo;
surface=cairo_image_surface_create(CAIRO_FORMAT_RGB24,width,height);
if (surface!=NULL) {
cairo=cairo_create(surface);
if (cairo!=NULL) {
osmscout::MercatorProjection projection;
osmscout::MapParameter drawParameter;
osmscout::AreaSearchParameter searchParameter;
osmscout::MapData data;
osmscout::MapPainterCairo painter(styleConfig);
drawParameter.SetFontSize(3.0);
projection.Set(lon,
lat,
osmscout::Magnification(zoom),
DPI,
width,
height);
std::list<osmscout::TileRef> tiles;
mapService->LookupTiles(projection,tiles);
mapService->LoadMissingTileData(searchParameter,*styleConfig,tiles);
mapService->ConvertTilesToMapData(tiles,data);
if (painter.DrawMap(projection,
drawParameter,
data,
cairo)) {
if (cairo_surface_write_to_png(surface,output.c_str())!=CAIRO_STATUS_SUCCESS) {
std::cerr << "Cannot write PNG" << std::endl;
}
}
cairo_destroy(cairo);
}
else {
std::cerr << "Cannot create cairo cairo" << std::endl;
}
cairo_surface_destroy(surface);
}
else {
std::cerr << "Cannot create cairo surface" << std::endl;
}
return 0;
}
How can I find exactly what the problem is and solve it? What's really puzzling me is that the same code built with makefile works just fine.
EDIT:
After running GDB (Gnu Debugger, gdb32.exe) and then bt (backtrace) I get the following output:
[New Thread 3900.0x538]
Program received signal SIGSEGV, Segmentation fault.
0x777ec159 in ntdll!RtlDecodeSystemPointer ()
from C:\Windows\SysWOW64\ntdll.dll
(gdb)
(gdb) bt
#0 0x777e3c28 in ntdll!RtlQueryPerformanceCounter ()
from C:\Windows\SysWOW64\ntdll.dll
#1 0x00000028 in ?? ()
#2 0x00870000 in ?? ()
#3 0x777ec1ed in ntdll!RtlDecodeSystemPointer ()
from C:\Windows\SysWOW64\ntdll.dll
#4 0x777ec13e in ntdll!RtlDecodeSystemPointer ()
from C:\Windows\SysWOW64\ntdll.dll
#5 0x777e3541 in ntdll!RtlQueryPerformanceCounter ()
from C:\Windows\SysWOW64\ntdll.dll
#6 0x00000010 in ?? ()
#7 0x00000028 in ?? ()
#8 0x008700c4 in ?? ()
#9 0x77881dd3 in ntdll!RtlpNtEnumerateSubKey ()
from C:\Windows\SysWOW64\ntdll.dll
#10 0x7783b586 in ntdll!RtlUlonglongByteSwap ()
from C:\Windows\SysWOW64\ntdll.dll
#11 0x00870000 in ?? ()
#12 0x777e3541 in ntdll!RtlQueryPerformanceCounter ()
from C:\Windows\SysWOW64\ntdll.dll
#13 0x00000010 in ?? ()
#14 0x00000000 in ?? ()
(gdb)
What does this error mean and how to find what caused it to correct the fault?
I'm having trouble using boost_threads with clang. The clang version is 3.6.0 and the boost version is 1.55.0, both from the new Ubuntu 15.04. A program that used to work with previous versions of clang now segfaults at startup. There are no problems when I use g++ instead.
Here is an example program to illustrate the point.
#include <iostream>
#include <boost/thread.hpp>
using namespace std;
void output() {
try {
int x = 0;
for (;;) {
boost::this_thread::sleep(boost::posix_time::milliseconds(100));
cerr << x++ << endl;
}
} catch (boost::thread_interrupted&) {}
}
int main(int argc, char* argv[]) {
try {
boost::thread output_worker(output);
boost::this_thread::sleep(boost::posix_time::milliseconds(1000));
output_worker.interrupt();
output_worker.join();
} catch (...) {
cerr << "Unexpected error!" << endl;
exit(1);
}
}
If I compile it with g++ it works, i.e.
g++ thread.cpp -lboost_thread -lboost_system
If I compile it with clang
clang++ thread.cpp -lboost_thread -lboost_system
I get a segfault with the gdb trace below
Starting program: /home/dejan/test/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bd0580 in boost::exception_ptr boost::exception_detail::get_static_exception_object<boost::exception_detail::bad_alloc_>() ()
from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.55.0
(gdb) bt
#0 0x00007ffff7bd0580 in boost::exception_ptr boost::exception_detail::get_static_exception_object<boost::exception_detail::bad_alloc_>() ()
from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.55.0
#1 0x00007ffff7bcb16a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.55.0
#2 0x00007ffff7de95ba in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdf98, env=env@entry=0x7fffffffdfa8)
at dl-init.c:72
#3 0x00007ffff7de96cb in call_init (env=<optimized out>, argv=<optimized out>, argc=<optimized out>, l=<optimized out>) at dl-init.c:30
#4 _dl_init (main_map=0x7ffff7ffe188, argc=1, argv=0x7fffffffdf98, env=0x7fffffffdfa8) at dl-init.c:120
#5 0x00007ffff7dd9d0a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#6 0x0000000000000001 in ?? ()
#7 0x00007fffffffe2fe in ?? ()
#8 0x0000000000000000 in ?? ()
Am I doing something wrong?
Compiling using clang -std=c++11 makes boost change its internal implementation and actually solves the segmentation fault.
It is not an ideal solution, but it is the way I will be going with our code.