C++: How to interpret a gdb backtrace log in Ubuntu

I have written a fairly large C++ program. It runs well every time except for an error on exit, free(): invalid pointer, which usually leads to a core dump; apart from that, the program does what I need every time. It contains no pointer objects or pointer initialization, and although it is long (some 3000 lines) it is simply written (mostly int variables and some 2D vector arrays). My first thought was that some for loop was writing beyond vector bounds, so I ran Valgrind memcheck to see if there are any memory leaks, but Valgrind returned a clean report. Next I ran the program under gdb. It reported a SIGABRT, so I took a backtrace, and gdb returned the following:
> #0 0x00007ffff7018c37 in __GI_raise (sig=sig@entry=6)
> at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1 0x00007ffff701c028 in __GI_abort () at abort.c:89
> #2 0x00007ffff70552a4 in __libc_message (do_abort=do_abort@entry=1,
> fmt=fmt@entry=0x7ffff7167310 "*** Error in `%s': %s: 0x%s ***\n")
> at ../sysdeps/posix/libc_fatal.c:175
> #3 0x00007ffff706182e in malloc_printerr (ptr=<optimized out>,
> str=0x7ffff716345e "free(): invalid pointer", action=1) at malloc.c:4998
> #4 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0)
> at malloc.c:3842
> #5 0x000000000040f152 in __gnu_cxx::new_allocator<int>::deallocate (
> this=0x7fffff85cf60, __p=0x6c1f40)
> at /usr/include/c++/4.8/ext/new_allocator.h:110
> #6 0x000000000040ef54 in std::_Vector_base<int, std::allocator<int> >::_M_deallocate (this=0x7fffff85cf60, __p=0x6c1f40, __n=32)
> at /usr/include/c++/4.8/bits/stl_vector.h:174
> #7 0x000000000040ebc2 in std::_Vector_base<int, std::allocator<int> >::~_Vector_base (this=0x7fffff85cf60, __in_chrg=<optimized out>)
> at /usr/include/c++/4.8/bits/stl_vector.h:160
> #8 0x000000000040e928 in std::vector<int, std::allocator<int> >::~vector (
> this=0x7fffff85cf60, __in_chrg=<optimized out>)
> at /usr/include/c++/4.8/bits/stl_vector.h:416
> #9 0x000000000040e3b5 in main ()
I suppose my question is this: how do I interpret this backtrace? I only see the last frame, #9, calling the main function, and it has no other errors associated with it. I assume frames #0-8 are failures in the called libraries? Are there gdb options that would help pinpoint where the error originates in this backtrace? I am fairly new to gdb, so any suggestions would be most appreciated.
Edit Sample Code:
for(int k = 4; k < 7; k++)
{
if(TFNCWS[k] == 1)
{
TFNCWS[k] = 0;
TIM[k] = 1;
}
}

> How do I interpret this backtrace?
Very simply: your program crashes while destructing a vector<int> declared somewhere in your main function.
The most likely cause is stack corruption.
> I ran Valgrind memcheck to see if there are any memory leaks
Memory leaks have nothing to do with your problem, and Valgrind is exceedingly weak at detecting stack buffer overflows.
You should try AddressSanitizer (available in recent GCC and Clang, enabled with -fsanitize=address). Chances are it will point you straight at the problem.
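Alongside a sanitizer build, one cheap way to check for an out-of-range write is to swap operator[] for at() while debugging, so that a bad index throws instead of silently corrupting memory. The following is only a sketch: it assumes TFNCWS and TIM are std::vector<int> (the question does not show their declarations), and the helper name reset_flags_checked is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper mirroring the loop from the question.
// Assumption: TFNCWS and TIM are std::vector<int>. at() throws
// std::out_of_range instead of silently writing past the end.
bool reset_flags_checked(std::vector<int>& TFNCWS, std::vector<int>& TIM) {
    try {
        for (std::size_t k = 4; k < 7; ++k) {
            if (TFNCWS.at(k) == 1) {
                TFNCWS.at(k) = 0;
                TIM.at(k) = 1;
            }
        }
    } catch (const std::out_of_range&) {
        return false;  // some index in [4, 7) was outside one of the vectors
    }
    return true;
}
```

If the function returns false for your real data, one of the vectors is shorter than the loop assumes. Plain C arrays on the stack have no at(), which is where a sanitizer build is the more useful tool.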

Related

std::string::append crashes program with "std::bad_alloc"

I have a text file which contains a list of data relating to name, position, and height. My program parses this data into a vector map, then uses this data to construct an xml file using boost::property_tree. The text file is about 3500 lines, and the program consistently crashes at line 1773 with:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)
At first I thought maybe the size limit was being reached, but reading up on std::string shows that the target computer should be able to allocate the size required. Regardless, I decided to test with std::string::size , std::string::length, std::string::capacity, std::string::max_size which showed (respectively):
...
...
6572094845 6572094845 6626476032 9223372036854775807
6579537815 6579537815 6626476032 9223372036854775807
6586984998 6586984998 6626476032 9223372036854775807
6594436394 6594436394 6626476032 9223372036854775807
6601892003 6601892003 6626476032 9223372036854775807
6609351825 6609351825 6626476032 9223372036854775807
6616815856 6616815856 6626476032 9223372036854775807
6624284100 6624284100 6626476032 9223372036854775807
std::string::capacity was seen to increase once std::string::length == std::string::capacity.
gdb bt after compiling for debug:
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007fd67037e921 in __GI_abort () at abort.c:79
#2 0x00007fd6709d3957 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fd6709d9ae6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fd6709d9b21 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fd6709d9d54 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fd6709da2dc in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007fd670a6bb8b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007fd670a6d133 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x000056104c176f3a in main (argc=1, argv=0x7ffc0af8b9a8) at /home/code/hello_world/createWorld.cpp:224
Example line in text file being read:
713.258 235.418 ABCD1234567 2898
Code:
int main(int argc, char **argv)
{
CreateWorld *newWorld = new CreateWorld();
lastModelsParser *lastModels = new lastModelsParser();
/*
Code here reads creates ifs for xml data,
then reads xml successfully into a ptree
*/
vector<lastModelsParser::lastModel> _lastModels;
_lastModels = lastModels->getlastModels();
uint16_t lastModelsEntry = 0;
std::string newModelString;
for(auto i:_lastModels){
ptNewModel = newWorld->modelModifier(ptModel,
_lastModels.at(lastModelsEntry).pX,
_lastModels.at(lastModelsEntry).pY,
_lastModels.at(lastModelsEntry).name,
_lastModels.at(lastModelsEntry).height);
boost::property_tree::xml_parser::write_xml_element(modelOSS, ptNewModel.front().first, ptNewModel.back().second, 1, xml_settings);
newModelString.append(modelOSS.str()); // CRASHES HERE
lastModelsEntry++;
}
// append to world.xml
boost::property_tree::write_xml(worldOSS, ptWorld, xml_settings); // write xml data into OSStreams
boost::property_tree::write_xml(modelOSS, ptModel, xml_settings); // write xml data into OSStreams
size_t worldPos = worldOSS.str().find("</world>");
std::string newWorldString = worldOSS.str().insert(worldPos,newModelString+"\n\t");
newWorldFile << newWorldString ;
delete(lastModels);
delete(newWorld);
return EXIT_SUCCESS;
}
Edit. Valgrind output
valgrind --tool=massif --massif-out-file=memleak.txt ./createNewWorld
heap_tree=detailed
n2: 6636657886 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n2: 6626476282 0x5160B89: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
n2: 6626476282 0x5162131: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
n0: 6626476033 0x149F38: main (in /home/code/hello_world/createNewWorld)
n0: 249 in 2 places, all below massif's threshold (1.00%)
n0: 0 in 2 places, all below massif's threshold (1.00%)
n0: 10181604 in 18 places, all below massif's threshold (1.00%)
valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --verbose --log-file=valgrind-out_1.txt ./createNewWorld
...
--4758-- memcheck GC: 1000 nodes, 0 survivors (0.0%)
--4758-- memcheck GC: 1000 nodes, 0 survivors (0.0%)
--4758-- memcheck GC: 1000 nodes, 0 survivors (0.0%)
--4758-- memcheck GC: 1000 nodes, 0 survivors (0.0%)
==4758== Warning: set address range perms: large range [0xee015040, 0x1b37d5041) (undefined)
std::string::max_size is not the biggest string your system can allocate. It is only the biggest string the std::string class itself could represent, given unlimited contiguous memory.
You are already past 6 GB, and there is a good chance that there is simply not enough memory left for the copy needed by the next reallocation, during which the old and the new buffer exist at the same time. Do you have the 13 GB of RAM required for that step, and do your configured kernel limits permit allocation (not even commitment yet!) of that much for a single process?
Avoid storing everything in a single long string; partition it, or write it out much earlier. Reallocations then cause memory spikes of only up to 2x the biggest single allocation, not 2x your total current memory consumption.
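One way to write out earlier is to keep a bounded buffer and flush it to the output stream whenever it passes a threshold. This is a minimal sketch, not the asker's code: write_in_chunks and flush_threshold are hypothetical names, and any std::ostream (for example the output file stream) can serve as the sink.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: append pieces to a bounded buffer and flush it to
// `out` whenever it reaches `flush_threshold` bytes, so peak memory is tied
// to the threshold rather than to the total output size.
// Returns the number of flushes performed.
std::size_t write_in_chunks(const std::vector<std::string>& pieces,
                            std::ostream& out,
                            std::size_t flush_threshold) {
    std::string buffer;
    std::size_t flushes = 0;
    for (const std::string& piece : pieces) {
        buffer += piece;
        if (buffer.size() >= flush_threshold) {
            out << buffer;
            buffer.clear();  // empty the buffer before the next chunk
            ++flushes;
        }
    }
    if (!buffer.empty()) {   // write whatever is left over
        out << buffer;
        ++flushes;
    }
    return flushes;
}
```

In a test, a std::ostringstream can stand in for the file; in the real program the sink would be the output file stream, so the complete document never has to exist in memory as one string.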
The crash in this code was caused by the following lines, which made the output compound on itself with every iteration:
boost::property_tree::xml_parser::write_xml_element(modelOSS, ptNewModel.front().first, ptNewModel.back().second, 1, xml_settings);
newModelString.append(modelOSS.str());
So to explain:
On each iteration, the write_xml_element member function was writing into modelOSS then appending it to newModelString.
modelOSS is not "overwritten" and is instead appended to by default.
This appended stream is then appended onto newModelString.
Based off this, you can completely remove the line newModelString.append(modelOSS.str()); and it will function as normal.
For example, this is what was happening:
New entry = "A"
it = 0, modelOSS = "A", newModelString = "A"
New entry = "B"
it = 1, modelOSS = "A, B", newModelString = "A, A, B"
New entry = "C"
it = 2, modelOSS = "A, B, C", newModelString = "A, A, B, A, B, C"
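The accumulation described above can be reproduced in isolation with a plain std::ostringstream. The sketch below uses hypothetical function names and simple strings in place of the XML fragments. It contrasts the runaway version with one alternative fix, emptying the stream after each use; this is a different remedy from simply removing the append line.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Appends each entry through a shared stream WITHOUT resetting it, which
// reproduces the growth pattern: the stream keeps all earlier entries.
std::string join_without_reset(const std::vector<std::string>& entries) {
    std::ostringstream oss;
    std::string result;
    for (const std::string& e : entries) {
        oss << e;
        result.append(oss.str());  // re-appends everything written so far
    }
    return result;
}

// Same loop, but the stream is emptied after each use.
std::string join_with_reset(const std::vector<std::string>& entries) {
    std::ostringstream oss;
    std::string result;
    for (const std::string& e : entries) {
        oss << e;
        result.append(oss.str());
        oss.str("");   // discard the buffered text
        oss.clear();   // reset any error flags
    }
    return result;
}
```

For the entries "A", "B", "C", the first function yields "AABABC" and the second "ABC". Note that oss.clear() alone only resets the stream's state flags; it is oss.str("") that empties the buffer.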
#1 0x00007f64df15bba8 in abort () from /lib64/libc.so.6
#2 0x00007f64ddc36e20 in ?? () from /lib64/libstdc++.so.6
#3 0x00007f64ddc44f26 in ?? () from /lib64/libstdc++.so.6
#4 0x00007f64ddc44fe1 in std::terminate() () from /lib64/libstdc++.so.6
#5 0x00007f64ddc45378 in __cxa_throw () from /lib64/libstdc++.so.6
#6 0x00007f64ddc36a7d in ?? () from /lib64/libstdc++.so.6
#7 0x00007f64ddcfbe1d in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) ()
from /lib64/libstdc++.so.6
#8 0x00007f64ddcfe51b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) () from /lib64/libstdc++.so.6
#9 0x00007f63f4627d2a in encoder::xxx::xxx(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, encoder::TileInfo, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, encoder::TileInfo> > > const&, encoder::TileInfo const&, encoder::CrossTileChapters&, encoder::AddDataInfo const&) ()
bool encoder::xxx::xxx(const std::map<std::string, TileInfo> &mapTileInfos,
const TileInfo &tileInfo, CrossTileChapters &languangeTileChapters, const AddDataInfo &addInfo)
{
...
std::string language = tileInfo.tileType;
std::string addPoiName = "";
auto langCode2name = addInfo.langCode2name;
if (langCode2name.find(language) != langCode2name.end()) {
addPoiName = langCode2name[language];
}
std::string labels = languangeTileChapters.labelChapter->GetLablesString();
...
unsigned int languangeTileLabelCount = languangeTileChapters.labelChapter->GetLabelCount();
unsigned int nativeLabelCount = poiTileChapters.labelChapter->GetNativeLabelCount();
int offset = -1;
bool isFounded = false;
offset = GetPOIOffset(poiTileChapters.pointChapter, addInfo.poiSiteID, isFounded, true);
if (isFounded) {
labels.insert(labels.end(), nativeLabelCount - languangeTileLabelCount - 1, '\0');
} else {
labels.insert(labels.end(), nativeLabelCount - languangeTileLabelCount, '\0');
}
labels += addPoiName;
labels += '\0';
languangeTileChapters.labelChapter->SetLablesString(labels);
return true;
}

Race condition in OpenMP outside parallel block (ThreadSanitizer); false positive?

The following minimal example computes the sum of all the numbers from 1 to 1000 and is parallelized with OpenMP:
#include <iostream>
double sum;
void do_it() {
const size_t n = 1000;
#pragma omp parallel
{
#pragma omp for
for (size_t i = 1; i <= n; ++i) {
#pragma omp atomic
sum += static_cast<double>(i);
}
}
}
int main() {
sum = 0.;
do_it();
std::cout << sum << std::endl;
return 0;
}
I tried compiling this with both clang++-6.0.0 and g++-5.4.0 with ThreadSanitizer enabled. Both compilers produce a few warnings about race conditions in libomp.so/libgomp.so, which I assume are false positives, and the following warning about my code:
==================
WARNING: ThreadSanitizer: data race (pid=22081)
Read of size 8 at 0x000001555f48 by main thread:
#0 main /home/arekfu/src/foo/openmp.cc:20 (openmp+0x4be0ce)
Previous atomic write of size 8 at 0x000001555f48 by thread T11:
#0 __tsan_atomic64_compare_exchange_val ??:? (openmp+0x476470)
#1 .omp_outlined._debug__ /home/arekfu/src/foo/openmp.cc:12 (openmp+0x4be011)
#2 .omp_outlined. /home/arekfu/src/foo/openmp.cc:8 (openmp+0x4be011)
#3 __kmp_invoke_microtask ??:? (libomp.so.5+0x994b2)
Location is global '<null>' at 0x000000000000 (openmp+0x000001555f48)
Thread T11 (tid=22093, running) created by main thread at:
#0 pthread_create ??:? (openmp+0x4284db)
#1 __kmpc_threadprivate_register_vec ??:? (libomp.so.5+0x5bc1f)
#2 __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291 (libc.so.6+0x2082f)
SUMMARY: ThreadSanitizer: data race /home/arekfu/src/foo/openmp.cc:20 in main
==================
I cannot see any data race in my code though!
I have also tried replacing the atomic updates with a critical section, like this:
#pragma omp critical
{
sum += static_cast<double>(i);
}
This changes the warning, but the new one does not make much more sense:
==================
WARNING: ThreadSanitizer: data race (pid=27477)
Write of size 8 at 0x000001555f48 by thread T4:
#0 .omp_outlined._debug__ /home/arekfu/src/foo/openmp.cc:13 (openmp+0x4be0a2)
#1 .omp_outlined. /home/arekfu/src/foo/openmp.cc:8 (openmp+0x4be0a2)
#2 __kmp_invoke_microtask ??:? (libomp.so.5+0x994b2)
Previous write of size 8 at 0x000001555f48 by thread T3:
#0 .omp_outlined._debug__ /home/arekfu/src/foo/openmp.cc:13 (openmp+0x4be0a2)
#1 .omp_outlined. /home/arekfu/src/foo/openmp.cc:8 (openmp+0x4be0a2)
#2 __kmp_invoke_microtask ??:? (libomp.so.5+0x994b2)
Location is global '<null>' at 0x000000000000 (openmp+0x000001555f48)
Thread T4 (tid=27482, running) created by main thread at:
#0 pthread_create ??:? (openmp+0x42857b)
#1 __kmpc_threadprivate_register_vec ??:? (libomp.so.5+0x5bc1f)
#2 __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291 (libc.so.6+0x2082f)
Thread T3 (tid=27481, running) created by main thread at:
#0 pthread_create ??:? (openmp+0x42857b)
#1 __kmpc_threadprivate_register_vec ??:? (libomp.so.5+0x5bc1f)
#2 __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291 (libc.so.6+0x2082f)
SUMMARY: ThreadSanitizer: data race /home/arekfu/src/foo/openmp.cc:13 in .omp_outlined._debug__
==================
Are these warnings an indication of real data races, or are they false positives?
The "problem" is the read operation on sum in line 20:
std::cout << sum << std::endl; // here you are reading the value of sum
TSAN cannot infer an inter-thread happens-before relation between this read and the (atomic) updates in the loop. But such a relation does exist, since all threads are synchronized by the implicit barrier at the end of the omp parallel block. So yes, this is a false positive.
This post provides more information how to avoid such false positives with OpenMP: Can I use Thread Sanitizer for OpenMP programs?

STL vector.clear() causes memory double free or corruption in pthreads program

Here is code snippet:
pthread_mutex_lock(&hostsmap_mtx);
for (int i = 0; i < service_hosts[service].size(); ++i)
poolmanager->delPool(service, service_hosts[service][i].first);
service_hosts[service].clear();
for (int i = 0; i < servers->count; ++i) {
string temp(servers->data[i]);
int pos = temp.find(':');
string server = temp.substr(0, pos);
string port = temp.substr(pos + 1, temp.length() - pos - 1);
service_hosts[service].push_back(make_pair(server, atoi(port.c_str())));
config.server = server;
config.port = atoi(port.c_str());
poolmanager->addPool(service, config);
}
pthread_mutex_unlock(&hostsmap_mtx);
The type of service_hosts is map<string, vector<pair<string, int> > >
Crash reason:
*** Error in `./HttpProxy': double free or corruption (fasttop): 0x00007f6fe000a6b0 ***
And GDB bt:
#5 ~basic_string (this=0x7f6fe0000960, __in_chrg=<optimized out>)
at /usr/include/c++/4.8.3/bits/basic_string.h:539
#6 ~pair (this=0x7f6fe0000960, __in_chrg=<optimized out>)
at /usr/include/c++/4.8.3/bits/stl_pair.h:96
#7 _Destroy<std::pair<std::basic_string<char>, int> > (__pointer=0x7f6fe0000960)
at /usr/include/c++/4.8.3/bits/stl_construct.h:93
#8 __destroy<std::pair<std::basic_string<char>, int>*> (__last=<optimized out>,
__first=0x7f6fe0000960) at /usr/include/c++/4.8.3/bits/stl_construct.h:103
#9 _Destroy<std::pair<std::basic_string<char>, int>*> (__last=<optimized out>,
__first=<optimized out>) at /usr/include/c++/4.8.3/bits/stl_construct.h:126
#10 _Destroy<std::pair<std::basic_string<char>, int>*, std::pair<std::basic_string<char>, int> > (
__last=0x7f6fe0000970, __first=0x7f6fe0000960)
at /usr/include/c++/4.8.3/bits/stl_construct.h:151
#11 _M_erase_at_end (this=<optimized out>, __pos=0x7f6fe0000960)
at /usr/include/c++/4.8.3/bits/stl_vector.h:1352
#12 clear (this=0x7f6fe000a0f8) at /usr/include/c++/4.8.3/bits/stl_vector.h:1126
Any advice would be appreciated.
Double frees are very easy to diagnose using valgrind (or any other similar tool you have access to). It will tell you who freed the memory you are accessing which will lead you to the root of the problem. If you have problems reading the valgrind output post it here and we can help you out.
The root cause of this problem was not easy to find, but I eventually tracked it down: the STL string in this libstdc++ version is not thread-safe here, because string assignment is reference-counted with copy-on-write (COW). For example:
string str = "stl::string";
string temp = str; // shares str's buffer (reference-counted, COW)
string temp2 = str.c_str(); // deep copy: constructing from a const char * always copies the characters
My fix is to pass const char * into the STL containers, and to use const char * as the function parameter type, wherever several threads may contend for the same string. For example:
map<string, vector<pair<string, int> > > Hostmap;
string service = "service";
string host = "host";
//Note: not service and host, but service.c_str() and host.c_str()
Hostmap[service.c_str()].push_back(make_pair(host.c_str(), 9966));
Hope this question will provide some hint to the problem you encountered.

function crashes depending on function argument definition - double free or corruption

I have defined a class for the evaluation of bspline basis functions. I have nowhere used pointers, or new delete etc. The class is the following:
class bspline_basis{
//private:
public:
int k; /*! order of Bspline basis */
int nbreak; /*! Dimension of breakpoints vector. */
int nknots; /*! Dimension of knots vector. */
int nbasis; /*! Number of basis functions */
vector<double> breakpts; /*! Represents strictly increasing values of knots, excluding all multiplicities */
vector<double> knots; /*! Raw knot vector of BSpline basis, it may contain multiple entries due to multiplicity */
vector<double> Bix_nonzero; /*! size: (k). Stores nonzero components of bspline basis */
vector<double> Bix; /*! size: nbasis Stores all components of bspline basis. Not necessary - remove? */
int find_knot_span_of_x(const double &x); /*! Returns integer i: t_i <= x < t_{i+k}. Upon call it stores i in i_saved */
pair<int,int> find_nonzero_basis_at_x(const double &x); /*! Returns first, last index of nonzero basis B_i(x) at particular x. */
pair<int,int> find_base_nonzero_interval(const double &x); /*! Returns first (i) , last (i+k) index of knots t_i at particular x. */
int i_saved; // Temporary saves for speed up
double x_saved; // Temporary saves for speed up
/* !ESSENTIAL ROUTINES FOR EVALUATION! Add as optional argument another knot vector for use in evaluation of integrals */
void eval_nonzero_basis(const int &i, const double &x); /*! Evaluates non zero basis functions at x */
void eval_Bix(const int &i, const double &x); /*! Evaluates all basis functions at x */
/*! Default clamped knot vector constructor */
bspline_basis(const vector<double> &_breakpts, const int &_k);
/* Evaluation functions */
double get_Bix(const int &i, const double &x); /*! Value B_i(x) */
};
The function that is transparent to the user and evaluates the functions B_i(x) is
get_Bix(const int &i, const double &x);
When I use it inside a for loop, with a variable integer i, everything works great, that is, in this example:
// some constructor of class mybasis
for(double x=0; x<=10.; x+=0.01)
{
cout<< x << " ";
for (int i=0; i<nbasis; ++i)
cout<< mybasis.get_Bix(i,x)<<" ";
cout<<endl;
}
correct values are printed. However if I define a constant integer for the first argument of the function, like in this example:
int idx=3;
for(double x=0; x<=10.; x+=0.01)
{
cout<< x << " ";
//for (int i=0; i<nbasis; ++i)
cout<< mybasis.get_Bix(idx,x)<<" ";
cout<<endl;
}
I get the following error:
*** Error in `./test_class.xxx': double free or corruption (out): 0x0000000000740280 ***
when I run the code in gdb, and backtrace it, I get the following message:
(gdb) bt
#0 0x00007ffff7530bb9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff7533fc8 in __GI_abort () at abort.c:89
#2 0x00007ffff756de14 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7ffff767c668 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007ffff757a0ee in malloc_printerr (ptr=<optimised out>, str=0x7ffff767c798 "double free or corruption (out)", action=1) at malloc.c:4996
#4 _int_free (av=<optimised out>, p=<optimised out>, have_lock=0) at malloc.c:3840
#5 0x00000000004039ac in __gnu_cxx::new_allocator<double>::deallocate (this=0x7fffffffdc70, __p=0x608280) at /usr/include/c++/4.8/ext/new_allocator.h:110
#6 0x00000000004031be in std::_Vector_base<double, std::allocator<double> >::_M_deallocate (this=0x7fffffffdc70, __p=0x608280, __n=15) at /usr/include/c++/4.8/bits/stl_vector.h:174
#7 0x00000000004030b3 in std::_Vector_base<double, std::allocator<double> >::~_Vector_base (this=0x7fffffffdc70, __in_chrg=<optimised out>) at /usr/include/c++/4.8/bits/stl_vector.h:160
#8 0x00000000004028ed in std::vector<double, std::allocator<double> >::~vector (this=0x7fffffffdc70, __in_chrg=<optimised out>) at /usr/include/c++/4.8/bits/stl_vector.h:416
#9 0x00000000004017aa in bspline_basis::eval_Bix (this=0x7fffffffdd40, ii=4, x=@0x7fffffffdd10: 0.01) at bsplines_stackoverflow.hpp:247
#10 0x0000000000401f59 in bspline_basis::get_Bix (this=0x7fffffffdd40, i=4, x=@0x7fffffffdd10: 0.01) at bsplines_stackoverflow.hpp:331
#11 0x000000000040214e in main () at test_class.cpp:31
The problem must lie in the functions
double bspline_basis::get_Bix(const int &i, const double &x)
{
if (i<0 || i> nbasis){
DEBUG(i);
std::cerr<< "Index of Bix out of range, aborting ..." << endl;
throw 0;
}
if (x==x_saved && i==i_saved) {
return
Bix[i];
}else if ( x != x_saved && i == i_saved){
eval_Bix(i_saved,x); // Evaluate all nonzero and store to Bix.
x_saved=x; // Store x for subsequent evaluations.
return
Bix[i];
}else {
// a. Find knot span of x:
find_knot_span_of_x(x);
// b. Evaluate all nonzero Bix and pass them to Bix:
eval_Bix(i_saved,x);
x_saved=x; // Store x for subsequent evaluations.
i_saved=i; // Store knot span i for possible subsequent evaluations.
return
Bix[i];
}
}
and
/*!
Wrapper function for eval_nonzero_basis. Passes nonzero basis values to vector<double> Bix.
*/
void bspline_basis::eval_Bix (const int &ii, const double &x){
//pair<int,int> i_start_end = find_nonzero_basis_at_x(x);
int istart= ii-k+1;
pair<int,int> i_start_end = make_pair(istart,ii);
// Evaluate all nonzero entries. for this index.
eval_nonzero_basis(i_start_end.second, x);
// Initialize (to zeros) temporary vector of dimension nbasis
vector<double> Bix_temp(nbasis,0.0);
// Pass nonzero entries to temporary vector
for(int j= i_start_end.first; j <= i_start_end.second; ++j)
Bix_temp[j] = Bix_nonzero[j-i_start_end.first];
// move temporary vector to Bix
Bix=Bix_temp;
}
I cannot understand how an error can occur just because I defined the first argument outside the loop. Any help is much appreciated.
update: Please let me clarify that the problem is not that the index (idx=3) is outside the allowed range. If that were the case, the first for loop would crash too. The variable nbasis is much greater than 3.
update 2: Following @Adrian's suggestion, I've made Bix_temp a member of the class and re-run the code. The code still crashes, but now it produces some output (values of x and infs). This is the new output of the debugger:
9.97 inf
9.98 inf
9.99 inf
10 inf
0.002867sec
4.78333e-05min
*** Error in `/home/foivos/Documents/Grav_Ast/BSplines/Genetic_Algorithms/tests/test_new_Jeans/bsplines_class/test_class.xxx': double free or corruption (out): 0x0000000000608180 ***
Program received signal SIGABRT, Aborted.
0x00007ffff7530bb9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff7530bb9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff7533fc8 in __GI_abort () at abort.c:89
#2 0x00007ffff756de14 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7ffff767c668 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007ffff757a0ee in malloc_printerr (ptr=<optimised out>, str=0x7ffff767c798 "double free or corruption (out)", action=1) at malloc.c:4996
#4 _int_free (av=<optimised out>, p=<optimised out>, have_lock=0) at malloc.c:3840
#5 0x00000000004039e8 in __gnu_cxx::new_allocator<double>::deallocate (this=0x7fffffffdda8, __p=0x608180) at /usr/include/c++/4.8/ext/new_allocator.h:110
#6 0x00000000004031fa in std::_Vector_base<double, std::allocator<double> >::_M_deallocate (this=0x7fffffffdda8, __p=0x608180, __n=15) at /usr/include/c++/4.8/bits/stl_vector.h:174
#7 0x00000000004030ef in std::_Vector_base<double, std::allocator<double> >::~_Vector_base (this=0x7fffffffdda8, __in_chrg=<optimised out>) at /usr/include/c++/4.8/bits/stl_vector.h:160
#8 0x0000000000402929 in std::vector<double, std::allocator<double> >::~vector (this=0x7fffffffdda8, __in_chrg=<optimised out>) at /usr/include/c++/4.8/bits/stl_vector.h:416
#9 0x0000000000402646 in bspline_basis::~bspline_basis (this=0x7fffffffdd30, __in_chrg=<optimised out>) at bsplines_stackoverflow.hpp:57
#10 0x00000000004022b6 in main () at test_class.cpp:22
(gdb) frame 9
#9 0x0000000000402646 in bspline_basis::~bspline_basis (this=0x7fffffffdd30, __in_chrg=<optimised out>) at bsplines_stackoverflow.hpp:57
57 class bspline_basis{
(gdb) info args
this = 0x7fffffffdd30
__in_chrg = <optimised out>
(gdb)
Again let me emphasize that when I use the for loop index i I don't get any errors.
update 3: Found the bug, it was conceptual and inside the function
get_Bix(const int &i, const double &x)
Specifically, in the following lines of code:
// Was using the same index i, for different x, which was a mistake and was causing troubles.
}else if ( x != x_saved && i == i_saved){
eval_Bix(i_saved,x); // Evaluate all nonzero and store to Bix.
x_saved=x; // Store x for subsequent evaluations.
return
Bix[i];
}else {
Thank you all for your help.
Try:
if (i<0 || i>= nbasis){
instead of
if (i<0 || i> nbasis){
at the top of get_Bix.
I'm guessing you're in 3D, in which case you'll have components 0, 1 and 2, but not 3.
There's a destructor in the stack trace, and it must be that of Bix_temp at the end of eval_Bix.
I really don't know why it's happening, but making Bix_temp a member of the class would almost certainly fix it. You'd then have the hassle of zeroing it every time, but that's faster than the construction and destruction anyway.

Linux C++ program crashes with St9bad_alloc after map gets very huge

I am running a C++ program that builds an inverted index on 64-bit Red Hat Linux. My inverted index is defined as map<unsigned long long int, map<int,int> > invertID; and the program crashes at random points with what(): St9bad_alloc. The point of the crash differs each time: sometimes it reaches 100,000,000 keys and keeps running for a while; sometimes it fails at about 80,000,000 keys.
Googling around, I found that this error may come from new. I do not use the new keyword anywhere in my code, but the map does allocate memory, and I keep inserting key/value pairs in each iteration. So I decided to experiment with a try/catch statement.
In fact, here is the critical part of the code and output:
map<unsigned long long int, map<int,int> >::iterator mainMapIt = invertID.find(ID);
if (mainMapIt != invertID.end()){
//if this ImageID key exists in InvID sub-map
map<int,int> M = mainMapIt->second; // THIS IS LINE 174.
map<int,int>::iterator subMapIt = M.find(imageID);
if (subMapIt != M.end()){
//increment the number of this ImageID key
++invertID[ID][imageID];
}
else{
//add ImageID key with value 1 into the InvertID
try{
invertID[ID][imageID] = 1;
++totalPushBack;
}catch (bad_alloc ba){
cout << "CAUGHT 1: invertID[" << ID << "][" << imageID << endl;
}
}
}
else{
//create the first empty map with the key as image ID with value 1 and put it in implicitly to the invertID
try{
invertID[ID][imageID] = 1;
}catch (bad_alloc ba){
cout << "CAUGHT 2: invertID[" << ID << "][" << imageID << endl;
}
}
Output:
...
CAUGHT 2: invertID[21959247897][3856
CAUGHT 2: invertID[38022506156][3856
CAUGHT 2: invertID[29062506144][3856
terminate called after throwing an instance of 'std::bad_alloc'
what(): St9bad_alloc
I see that the error is thrown when I try to insert a new key. However, I was surprised that St9bad_alloc is still thrown after I wrapped the key insertion in a try/catch block. I took a backtrace, and here is the result:
(gdb) backtrace
#0 0x000000344ac30265 in raise () from /lib64/libc.so.6
#1 0x000000344ac31d10 in abort () from /lib64/libc.so.6
#2 0x00000034510becb4 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#3 0x00000034510bcdb6 in ?? () from /usr/lib64/libstdc++.so.6
#4 0x00000034510bcde3 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5 0x00000034510bceca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6 0x00000034510bd1d9 in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
#7 0x0000000000406544 in __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<int const, int> > >::allocate (
this=0x7fffffffdfc0, __n=1)
at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:88
#8 0x0000000000406568 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_get_node (this=0x7fffffffdfc0)
at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_tree.h:358
#9 0x0000000000406584 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_create_node (this=0x7fffffffdfc0, __x=...)
at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_tree.h:367
#10 0x00000000004065e3 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_clone_node (this=0x7fffffffdfc0, __x=0x21c082bd0)
at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_tree.h:381
#11 0x0000000000406634 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_copy (this=0x7fffffffdfc0, __x=0x21c082bd0, __p=0x7fffffffdfc8)
at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_tree.h:1226
#12 0x00000000004067e9 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_Rb_tree (this=0x7fffffffdfc0, __x=...)
at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_tree.h:570
#13 0x0000000000406885 in std::map<int, int, std::less<int>, std::allocator<std::pair<int const, int> > >::map (
this=0x7fffffffdfc0, __x=...) at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_map.h:175
#14 0x0000000000403039 in generateInvertID (pathToPF=0x6859a8 "/home/karl/c/000605.pf",
pathToC=0x38c139ed8 "/home/karl/c/000605.c", imageID=3856)
at InvertIndexGen.cpp:174
#15 0x0000000000403b46 in generateInvertIDForAllPFAndC () at InvertIndexGen.cpp:254
#16 0x0000000000403d0b in main (argc=1, argv=0x7fffffffe448) at InvertIndexGen.cpp:47
(gdb)
At #14, InvertIndexGen.cpp:174, in my code above, this is where it crashed:
map<int,int> M = mainMapIt->second; // THIS IS LINE 174.
It seems that assigning mainMapIt->second to M creates a copy of the respective map. This should be the reason for the St9bad_alloc as well.
But in this case, is there anything I can do? After all, invertID.max_size() returns 18446744073709551615, and I am using only about 100 million keys. I also see in top that my program uses only 10% of memory (we have 128 GB of RAM).
What measures can I take against this error? Some of my senior colleagues do the same thing, and they report that when their inverted index grows beyond 70-80% of memory in top, the program starts to go haywire. But my program uses only 10%, so what is going on here? What can we do to prevent this error?
EDIT: some comments suggested checking ulimit, so here it is:
-bash-3.2$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1056768
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1056768
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
map<int,int> M = mainMapIt->second; // THIS IS LINE 174.
copies the entire sub-map stored in second.
map<int,int>& M = mainMapIt->second; // THIS IS LINE 174.
binds a reference instead, which at least avoids this copy.
map<int,int> M = mainMapIt->second; // THIS IS LINE 174.
This line causes an unnecessary map copy, with all the memory allocations that entails.
Changing it to a reference would help:
map<int,int> & M = mainMapIt->second; // THIS IS LINE 174.
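To illustrate the reference-based approach, here is a minimal sketch with hypothetical helper names (add_occurrence, occurrences) and a type alias that is not in the question. It relies on the fact that map::operator[] value-initializes a missing int to 0, so a single expression covers both the "new key" and "existing key" branches without ever copying a sub-map. It leaves out the question's totalPushBack counter.

```cpp
#include <cassert>
#include <map>

// Alias for the nested map type from the question (the alias is an assumption).
using InvertedIndex = std::map<unsigned long long, std::map<int, int>>;

// Increments the count for (id, imageID), creating entries on demand.
// operator[] inserts a zero-initialized count if the key is missing.
void add_occurrence(InvertedIndex& index, unsigned long long id, int imageID) {
    ++index[id][imageID];
}

// Read-only lookup that binds the sub-map by const reference, not by value.
int occurrences(const InvertedIndex& index, unsigned long long id, int imageID) {
    auto outer = index.find(id);
    if (outer == index.end()) return 0;
    const std::map<int, int>& sub = outer->second;  // no copy is made here
    auto inner = sub.find(imageID);
    return inner == sub.end() ? 0 : inner->second;
}
```

This removes the per-iteration copy, but it does not reduce the memory held by the index itself; each map node still carries its own allocation overhead.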