Getting signal 11 (SIGSEGV) when using libcurl in a multithreaded program - C++

I did this
A Google search suggested building libcurl with --enable-threaded-resolver, but that didn't help; I am still seeing the segfault.
I also tried setting the NoSignal option, but that didn't help either.
From the program's logs, it seems one thread hangs when calling the perform function (we use the curlpp library: we have a curlpp::Easy request, and request.perform hangs in one thread).
Also, the issue does not appear on every run and is specific to Linux; it works fine on macOS and Windows.
curl/libcurl version
I tried versions 7.21.2 and 7.66 (7.66 built with --enable-threaded-resolver).
Error
With version 7.21.2, I was getting:
==8866==
==8866== Process terminating with default action of signal 11 (SIGSEGV)
==8866== Access not within mapped region at address 0x0
==8866== at 0x551FCDD: _IO_file_underflow@@GLIBC_2.2.5 (in /usr/lib64/libc-2.17.so)
==8866== by 0x5520E81: _IO_default_uflow (in /usr/lib64/libc-2.17.so)
==8866== by 0x5513C33: _IO_getline_info (in /usr/lib64/libc-2.17.so)
==8866== by 0x551D6FC: fgets_unlocked (in /usr/lib64/libc-2.17.so)
==8866== by 0xF57D5AE: internal_getent (in /usr/lib64/libnss_files-2.17.so)
==8866== by 0xF57E7E2: _nss_files_gethostbyname4_r (in /usr/lib64/libnss_files-2.17.so)
==8866== by 0x558A1B3: gaih_inet.constprop.8 (in /usr/lib64/libc-2.17.so)
==8866== by 0x558B553: getaddrinfo (in /usr/lib64/libc-2.17.so)
==8866== by 0xCB6E013: Curl_getaddrinfo_ex (in /usr/local/qubole/libquboleodbc.so)
==8866== by 0xCB6AE65: Curl_getaddrinfo (in /usr/local/qubole/libquboleodbc.so)
==8866== by 0xCB51CF4: Curl_resolv (in /usr/local/qubole/libquboleodbc.so)
==8866== by 0xCB51FBA: Curl_resolv_timeout (in /usr/local/qubole/libquboleodbc.so)
With version 7.66 (built with --enable-threaded-resolver), I started getting:
==11063==
==11063== Process terminating with default action of signal 11 (SIGSEGV)
==11063== Access not within mapped region at address 0x0
==11063== at 0x551FCDD: _IO_file_underflow@@GLIBC_2.2.5 (in /usr/lib64/libc-2.17.so)
==11063== by 0x5520E81: _IO_default_uflow (in /usr/lib64/libc-2.17.so)
==11063== by 0x5513C33: _IO_getline_info (in /usr/lib64/libc-2.17.so)
==11063== by 0x551D6FC: fgets_unlocked (in /usr/lib64/libc-2.17.so)
==11063== by 0xFDAA5AE: internal_getent (in /usr/lib64/libnss_files-2.17.so)
==11063== by 0xFDAB7E2: _nss_files_gethostbyname4_r (in /usr/lib64/libnss_files-2.17.so)
==11063== by 0x558A1B3: gaih_inet.constprop.8 (in /usr/lib64/libc-2.17.so)
==11063== by 0x558B553: getaddrinfo (in /usr/lib64/libc-2.17.so)
==11063== by 0xCB7D2E3: Curl_getaddrinfo_ex (in /usr/local/qubole/libquboleodbc.so)
==11063== by 0xCB585C0: getaddrinfo_thread (in /usr/local/qubole/libquboleodbc.so)
==11063== by 0xCB8474A: curl_thread_create_thunk (in /usr/local/qubole/libquboleodbc.so)
==11063== by 0xF391EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
The above is seen when I use Valgrind.
operating system
RHEL 7.7
Any help is appreciated.

Related

Valgrind is not showing line numbers

My Fortran code fails if I run it without a debugger, but it runs to completion if I run it under valgrind. Surely a memory issue. I tried to debug the code using valgrind, and it does report errors like...
==988381== at 0x15EA7C: __nautilus_main_MOD_init_reaction_rates (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
==988381== by 0x16AD99: __nautilus_main_MOD_initialisation (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
==988381== by 0x10B44C: MAIN__ (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
==988381== by 0x10B30E: main (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
==988381== Address 0x4fb7c98 is 8 bytes before a block of size 20,864 alloc'd
==988381== at 0x483C855: malloc (vg_replace_malloc.c:381)
==988381== by 0x150140: __global_variables_MOD_initialize_global_arrays (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
==988381== by 0x16AD2B: __nautilus_main_MOD_initialisation (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
==988381== by 0x10B44C: MAIN__ (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
==988381== by 0x10B30E: main (in /home/seps/Desktop/softwares/DNautilus_spin/bin/dnautilus_spin)
But valgrind is not showing any line numbers. I compiled the code using flags like -g3, -g, -fbacktrace, -o0. Even then, valgrind is not showing line numbers.
I tried changing every flag except -g3 when compiling the code, and I tried different valgrind versions (valgrind-3.19.0, valgrind-3.20.0). I did all of this based on suggestions from a web search.
All I want is to get line numbers so that I can proceed further. Can anyone help me?
Thanks in advance.

Valgrind reporting memory leak in RocksDB

I am trying to profile the performance of RocksDB using Callgrind / KCacheGrind on a mac. I am running the command valgrind --tool=callgrind ./simple_example on one of the example programs that comes with RocksDB in the examples folder. I seem to be getting a memory leak, though, which prevents me from being able to do the performance profiling that I ultimately want to do.
==54628== Callgrind, a call-graph generating cache profiler
==54628== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al.
==54628== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==54628== Command: ./simple_example
==54628==
==54628== For interactive control, run 'callgrind_control -h'.
==54628==
==54628== Process terminating with default action of signal 11 (SIGSEGV)
==54628== Access not within mapped region at address 0x18
==54628== at 0x1016D25BA: _pthread_body (in /usr/lib/system/libsystem_pthread.dylib)
==54628== by 0x1016D250C: _pthread_start (in /usr/lib/system/libsystem_pthread.dylib)
==54628== by 0x1016D1BF8: thread_start (in /usr/lib/system/libsystem_pthread.dylib)
==54628== If you believe this happened as a result of a stack
==54628== overflow in your program's main thread (unlikely but
==54628== possible), you can try to increase the size of the
==54628== main thread stack using the --main-stacksize= flag.
==54628== The main thread stack size used in this run was 8388608.
--54628:0:schedule VG_(sema_down): read returned -4
==54628==
==54628== Events : Ir
==54628== Collected : 14961779
==54628==
==54628== I refs: 14,961,779
Segmentation fault: 11

Performance testing in C++ project

I wrote a C++ project with 10 threads. One thread loads data into memory (writes the buffer), and the other 9 threads simultaneously read the buffer and store the data in an SQLite database. All threads are synchronized with a mutex to avoid conflicts.
Now I need to evaluate the performance of this project, such as completion time per thread, memory usage, etc. How can I do this in a C++ environment? I used Valgrind to check these, but I think it is not working.
This is the command I run with Valgrind:
valgrind --tool=memcheck --leak-check=yes ./executable
It gives a message like this:
callers=20 --track-fds=yes ./monerosci
==24262== Memcheck, a memory error detector
==24262== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24262== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==24262== Command: ./monerosci
==24262==
valgrind: m_syswrap/syswrap-linux.c:5361 (vgSysWrap_linux_sys_fcntl_before): Assertion 'Unimplemented functionality' failed.
valgrind: valgrind
host stacktrace:
==24262== at 0x38083F48: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24262== by 0x38084064: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24262== by 0x380841F1: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24262== by 0x380FB399: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24262== by 0x380D6234: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24262== by 0x380D2D2A: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24262== by 0x380D43DE: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24262== by 0x380E3946: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
How can I test the performance of the project in C++?
Well, it seems there are two separate problems here:
1) memcheck is failing to run due to a bug or some limitation. Apparently one variation of an fcntl call is not supported by your version of valgrind. You could reduce the code size and remove libraries until you can pinpoint which call triggers this problem, or just run it under a different version of valgrind. However, I think memcheck will not give you the data you want...
2) memcheck is not a tool for profiling. Valgrind is composed of several different tools, selected with the --tool parameter. The one that will most likely give you the info you want is callgrind.
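Separately from callgrind, the "time per thread" part of the question can also be measured directly in the program. The following is only a minimal sketch using the standard library, not a replacement for a profiler: timeWorkers is a hypothetical helper name, and the work callback stands in for the real reader/writer thread body, which is not shown in the question.

```cpp
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs work(i) on `count` threads and returns each
// thread's wall-clock duration in milliseconds.
std::vector<double> timeWorkers(unsigned count,
                                const std::function<void(unsigned)>& work) {
    std::vector<double> elapsedMs(count, 0.0);
    std::vector<std::thread> threads;
    threads.reserve(count);
    for (unsigned i = 0; i < count; ++i) {
        threads.emplace_back([&elapsedMs, &work, i] {
            const auto start = std::chrono::steady_clock::now();
            work(i);  // placeholder for the real per-thread job
            const auto stop = std::chrono::steady_clock::now();
            // Each thread writes only its own slot, so no lock is needed here.
            elapsedMs[i] =
                std::chrono::duration<double, std::milli>(stop - start).count();
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return elapsedMs;
}
```

Calling timeWorkers(9, readerBody) with the project's own reader function (readerBody is an assumed name) would give one duration per reader thread. These are wall-clock times, so they include time spent waiting on the mutex.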

Is this error from Qt or my program?

I'm new to valgrind. I have written a program in C++ using Qt 5.5.1 libraries on Ubuntu 15.10. I'm using Qt Creator with Debug build set. I checked for memory leaks using Valgrind with the following command:
valgrind --leak-check=yes --track-origins=yes ./texteditor
Valgrind then gives me the following message:
==2977== Conditional jump or move depends on uninitialised value(s)
==2977== at 0x97ED1EC: ??? (in /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0.2400.28)
==2977== by 0x97EE58A: ??? (in /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0.2400.28)
==2977== by 0x5B3380B: g_cclosure_marshal_VOID__VOID (in /usr/lib/i386-linux-gnu/libgobject-2.0.so.0.4600.2)
==2977== by 0x5B31B8A: g_closure_invoke (in /usr/lib/i386-linux-gnu/libgobject-2.0.so.0.4600.2)
==2977== by 0x5B43FFB: ??? (in /usr/lib/i386-linux-gnu/libgobject-2.0.so.0.4600.2)
==2977== by 0x5B4CC95: g_signal_emit_valist (in /usr/lib/i386-linux-gnu/libgobject-2.0.so.0.4600.2)
==2977== by 0x5B4CFC4: g_signal_emit (in /usr/lib/i386-linux-gnu/libgobject-2.0.so.0.4600.2)
==2977== by 0x96ECD00: gtk_adjustment_changed (in /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0.2400.28)
==2977== by 0x5B35465: ??? (in /usr/lib/i386-linux-gnu/libgobject-2.0.so.0.4600.2)
==2977== by 0x5B384FC: g_object_thaw_notify (in /usr/lib/i386-linux-gnu/libgobject-2.0.so.0.4600.2)
==2977== by 0x96ED182: gtk_adjustment_configure (in /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0.2400.28)
==2977== by 0x4563C7F: ??? (in /home/tembo/Qt/5.5/gcc/lib/libQt5Widgets.so.5.5.1)
==2977== Uninitialised value was created by a stack allocation
==2977== at 0x456215F: ??? (in /home/tembo/Qt/5.5/gcc/lib/libQt5Widgets.so.5.5.1)
From the above message nothing points to the location of myProgram at all. Is this from Qt and other libraries or do I miss something pointing to myProgram?
By default Valgrind only shows the top 12 entries of the call stack, but this can be changed with the --num-callers=xx parameter. The functions from your own program code are likely further down on the stack.

How to solve segmentation fault problems happening in support libraries?

I have a very odd problem going on. I can replicate the problem by the following small sample code:
#include <openssl/ssl.h>
#include <openssl/err.h>
#include <iostream>

void printSSLErrors()
{
    int l_err = ERR_get_error();
    while(l_err!=0)
    {
        std::cout << "SSL ERROR: " << ERR_error_string(l_err, NULL) << std::endl;
        l_err = ERR_get_error();
    }
}

int main(int argc, char* argv[]) {
    SSL_library_init();
    SSL_load_error_strings();
    // context
    SSL_CTX* mp_ctx;
    if(!(mp_ctx = SSL_CTX_new(SSLv23_server_method())))
    {
        printSSLErrors();
        return 0;
    }
    std::cout << "CTX created OK" << std::endl;
    // set certificate and private key
    if(SSL_CTX_use_certificate_file(mp_ctx, argv[1], SSL_FILETYPE_PEM)!=1)
    {
        printSSLErrors();
        return 0;
    }
    std::cout << "Certificate intialised OK" << std::endl;
    if(SSL_CTX_use_PrivateKey_file(mp_ctx, argv[2], SSL_FILETYPE_PEM)!=1)
    {
        printSSLErrors();
        return 0;
    }
    std::cout << "Key intialised OK" << std::endl;
    SSL_CTX_free(mp_ctx);
    ERR_free_strings();
}
This program works as expected when I compile it and link it using -lssl. The problem, however, is that the OpenSSL routines are part of an application that also links in the mysqlclient libraries. If I recompile the above code with -lssl -lmysqlclient (note that I don't include or use anything from that library here) and execute the program again, I get a segmentation fault in the OpenSSL library. The most I can pull out of gdb is:
[Thread debugging using libthread_db enabled]
[New Thread -1208158528 (LWP 32359)]
CTX created OK
Certificate intialised OK
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208158528 (LWP 32359)]
0x001b1213 in X509_get_pubkey () from /lib/libcrypto.so.4
(gdb) backtrace
#0 0x001b1213 in X509_get_pubkey () from /lib/libcrypto.so.4
#1 0x00de8a6c in SSL_rstate_string () from /lib/libssl.so.4
#2 0x086f2258 in ?? ()
#3 0xbffceb64 in ?? ()
#4 0x086f1130 in ?? ()
#5 0xbffceaa8 in ?? ()
#6 0x086f2258 in ?? ()
#7 0x086f0d90 in ?? ()
#8 0x00df4858 in ?? () from /lib/libssl.so.4
#9 0x086f2258 in ?? ()
#10 0x086f1130 in ?? ()
#11 0xbffceaa8 in ?? ()
#12 0x00de9d50 in SSL_CTX_use_PrivateKey_file () from /lib/libssl.so.4
Previous frame inner to this frame (corrupt stack?)
(gdb) frame 0
#0 0x001b1213 in X509_get_pubkey () from /lib/libcrypto.so.4
For some reason this only happens when I use mysqlclient v 15 and not with mysqlclient v 16. This is probably too obscure for anyone to solve, but some comments on how linking against a dynamic library that the code itself doesn't even use can cause these errors would be very helpful.
The system is:
RHEL ES4, gcc 3.4.6, openssl-0.9.7a, MySQL-5.11
Any thoughts?
Edit: Here is the output to possibly clarify things a little more:
[Lieuwe ~]$ c++ openssl_test.cpp -lssl -o ssltest
[Lieuwe ~]$ ./ssltest /etc/httpd/conf/certs/test.crt /etc/httpd/conf/certs/test.key
CTX created OK
Certificate intialised OK
Key intialised OK
[Lieuwe ~]$ c++ openssl_test.cpp -lmysqlclient -lssl -o ssltest
[Lieuwe ~]$ ./ssltest /etc/httpd/conf/certs/test.crt /etc/httpd/conf/certs/test.key
CTX created OK
Certificate intialised OK
Segmentation fault (core dumped)
[Lieuwe ~]$
Note that for this purpose I use the .crt and .key files that the Apache server also uses (and they work there).
Edit 2: Here is the (relevant?) output of valgrind for the program
CTX created OK
--5429-- REDIR: 0x5F6C80 (memchr) redirected to 0x4006184 (memchr)
Certificate intialised OK
==5429== Invalid read of size 4
==5429== at 0xCF4205: X509_get_pubkey (in /lib/libcrypto.so.0.9.7a)
==5429== by 0xDE8A6B: (within /lib/libssl.so.0.9.7a)
==5429== by 0xDE9D4F: SSL_CTX_use_PrivateKey_file (in /lib/libssl.so.0.9.7a)
==5429== by 0x8048C77: main (in /home/liwu/ssltest)
==5429== Address 0x4219940 is 0 bytes inside a block of size 84 free'd
==5429== at 0x4004EFA: free (vg_replace_malloc.c:235)
==5429== by 0xC7FD00: CRYPTO_free (in /lib/libcrypto.so.0.9.7a)
==5429== by 0xCE53A7: (within /lib/libcrypto.so.0.9.7a)
==5429== by 0xCE5562: ASN1_item_free (in /lib/libcrypto.so.0.9.7a)
==5429== by 0xCE0560: X509_free (in /lib/libcrypto.so.0.9.7a)
==5429== by 0xDE979E: SSL_CTX_use_certificate_file (in /lib/libssl.so.0.9.7a)
==5429== by 0x8048C23: main (in /home/liwu/ssltest)
==5429==
==5429== Invalid read of size 4
==5429== at 0xCD4A5F: EVP_PKEY_copy_parameters (in /lib/libcrypto.so.0.9.7a)
==5429== by 0xDE8A7C: (within /lib/libssl.so.0.9.7a)
==5429== by 0xDE9D4F: SSL_CTX_use_PrivateKey_file (in /lib/libssl.so.0.9.7a)
==5429== by 0x8048C77: main (in /home/liwu/ssltest)
==5429== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==5429==
==5429== Process terminating with default action of signal 11 (SIGSEGV)
==5429== Access not within mapped region at address 0x0
==5429== at 0xCD4A5F: EVP_PKEY_copy_parameters (in /lib/libcrypto.so.0.9.7a)
==5429== by 0xDE8A7C: (within /lib/libssl.so.0.9.7a)
==5429== by 0xDE9D4F: SSL_CTX_use_PrivateKey_file (in /lib/libssl.so.0.9.7a)
==5429== by 0x8048C77: main (in /home/liwu/ssltest)
==5429==
I would suggest running your program under Valgrind. Valgrind is intended to provide help with exactly this kind of problem and it is generally much easier to use than a debugger.
If I were to hazard a guess, I would first suspect a memory error in your application (or, less likely, in one of the shared libraries) that is sensitive to the memory layout of the resulting executable. Adding a new shared library or, say, enabling debugging options could very well make the problem appear or disappear for no apparent reason.
The only logical explanation may be that the public key, which is needed for X509_get_pubkey(), cannot be located.
Can you please verify that the public key requested by the function is available?
I'd think that the MySQL client library is linked against another version of libssl. If you are on Linux: are both libraries installed via your distro's official repositories? Are you linking against the static (.a) or dynamic (.so) versions of those libraries?
You can play around with the nm command to find out more (read the manpage).
You can try to rebuild the mysql client library yourself to make sure the same libssl version is used and see whether the problem disappears.
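nm inspects library files on disk. A complementary check, sketched here as an assumption rather than a confirmed diagnosis, is to ask the dynamic loader at run time which shared object actually supplies a given symbol in the running process. providerOf is a hypothetical helper name; it relies on dlsym and dladdr from <dlfcn.h> as provided by glibc, and older systems may need -ldl at link time.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper: returns the path of the object that provides
// `symbol` in the current process, or an empty string if the symbol
// cannot be resolved.
std::string providerOf(const char* symbol) {
    // RTLD_DEFAULT searches in the same order the loader uses to
    // resolve global symbols, so the first definition found wins.
    void* addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == nullptr) {
        return "";
    }
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) {
        return "";
    }
    return info.dli_fname;
}
```

Printing providerOf("X509_get_pubkey") from the test program, built once with -lssl and once with -lmysqlclient -lssl, would show whether the symbol comes from a different object in the two builds.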