I often wrap c++ classes using cython. All the calculations are done in c++ and cython is only used to pass variables to the constructor and get the results from c++.
For a recent project I have the following problem: the code (which initializes a class and then calls a method) always runs fine at first, but after calling the same method repeatedly (I can't pin down exactly when), it suddenly runs at least 1000 times slower than normal.
My question is: What could cause such a seemingly random behavior and how would you go about debugging it?
I know this is impossible to solve without seeing the code, but it's large and I don't know where the problematic behavior comes from. I'm just asking for hints and strategies of how to solve it.
Things I've tried:
Checked the c++ code for leaks.
Tried different compiler directives (#cython: wraparound=False, boundscheck=False, ...)
A hint could be that if I run the python code without the --pylab option of ipython it complains about a symbol not being found, but that's the only problem I have been able to identify.
It was an uninitialized member that occasionally caused a problem, not a memory leak. Thanks to those who tried to help!
The issue with ipython remains, but I posted that in another question.
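For readers hitting the same symptom, here is a minimal sketch of one way to rule out uninitialized members: give every member a default member initializer (C++11), so that no constructor can leave it indeterminate. The class Solver and its members are hypothetical and not taken from the original code; an uninitialized iteration limit is only one plausible way such a bug could show up as an occasional slowdown.

```cpp
#include <cstddef>

// Hypothetical class, for illustration only.
class Solver {
public:
    Solver() = default;

    // Members not listed in the initializer list keep their defaults below.
    explicit Solver(std::size_t max_iterations)
        : max_iterations_(max_iterations) {}

    std::size_t max_iterations() const { return max_iterations_; }
    double tolerance() const { return tolerance_; }

private:
    // Without these initializers, a constructor that forgets a member leaves
    // it with an indeterminate value, e.g. a huge iteration count on some runs.
    std::size_t max_iterations_ = 1000;
    double tolerance_ = 1e-9;
};
```

Tools such as valgrind or compiler warnings can also flag reads of uninitialized values, but defaults in the class definition remove the whole category of bug.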
I was writing some simple C++ code using DevC++ when this error came up:
I have no clue as to why I am getting this upon initialising a vector array (a graph adjacency list).
I couldn't do much to solve this problem since I am not an expert in C++ compilers. I tried reinstalling the program but that didn't help at all.
My compiler is TDM-GCC and in the compiler options I added "-std=c++11", which is executed when calling the compiler.
This line
std::vector<int> adj[NK];
defines an array of 100 million std::vector objects, along with a static initializer to create all of them.
Did you mean to create a single vector of size 100M?
std::vector<int> adj(NK);
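Note that std::vector<int> adj(NK) holds NK ints, not NK lists. If an adjacency list is what is wanted, one option is a vector of vectors sized at run time, so that the storage lives on the heap rather than in a huge static array. A minimal sketch, in which make_graph and add_edge are hypothetical helper names and the graph is assumed to be undirected:

```cpp
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>;

// One empty neighbour list per node, allocated on the heap at run time.
Graph make_graph(std::size_t node_count) {
    return Graph(node_count);
}

// Assumes an undirected graph; at() throws std::out_of_range on a bad index.
void add_edge(Graph& adj, std::size_t from, std::size_t to) {
    adj.at(from).push_back(static_cast<int>(to));
    adj.at(to).push_back(static_cast<int>(from));
}
```

Even in this form, 100 million empty vectors take a few gigabytes (typically 24 bytes each on a 64-bit build), so it is worth checking whether NK really needs to be that large.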
Despite the fact that I made a mistake in the code, it still compiled successfully after a very simple fix. The compiler used to be a 64-bit Release version. I changed the field to the 32-bit Release and the problem disappeared, despite the ridiculous amount of memory my program had needed.
Please note that your mileage may vary and this solution might have some side effects I am not aware of. However, this worked just fine for me and it seems that all my other c++ files compile without any errors.
I am working on the largest project I've ever worked on, and I've never debugged something like this, so I don't know where to get started.
Some info on the crash:
I am using Visual Studio, and the debugger is completely useless. The only information it gives me is that it appears to be happening during a call to "memcpy". The call stack is completely empty except for the memcpy function, and the local variables are listed but it does not have values for any of them.
It happens occasionally on any computer.
It does not ALWAYS happen under any (known) condition, but it only ever happens under a few conditions. In particular it only happens when a particular type of object is destroyed, although that's not necessarily the direct cause, and investigating the destruction process has not been helpful.
A little more about the project:
It is a game using SFML 2.0, linked statically.
I am not calling memcpy anywhere in my own code.
Some questions:
Where could the call to memcpy be coming from? Is it in SFML or elsewhere?
How do I (using visual studio) get more information on a crash when the debugger isn't working?
This is an answer to "Where could the call to memcpy be coming from?"
In most cases this is the result of calling the copy constructor of std::string with a NULL this pointer, or of a string operation on an already destructed string. This string can, of course, be a member of one of your classes.
This in itself won't help you find the problem when the project is really large. However, you can almost certainly assume that you are using a reference or pointer (or iterator) to a custom object that is already destructed. The most straightforward way to find this access would be to run your program, compiled without optimization and with debug info, under valgrind. Unfortunately that isn't available for Windows (see Is there a good Valgrind substitute for Windows?).
The main problem here seems to be that you aren't even getting a backtrace, because that would at least give a strong hint about where to look. I'm not familiar with Windows though, so I can only guess at the cause. Are you sure you have everything compiled with debug info?
Someone has reported a bug in my program, which brings up the following error message:
The error occurs in a C++ DLL of the program, which was compiled with VS2008. I can reproduce the error, but I am not able to find out what the problem is. I've already done a bunch of tests for memory leaks and wrong allocations, but with no success.
Now the weird thing: when I add a main function to the code, compile it as an EXE and then run exactly the same thing, everything is fine. The error occurs only in the DLL build.
The next strange thing is that when I press "Ignore", the program continues and does its job as expected.
So, I'm looking for 2 types of answers:
- Answers that help me find the bug
- Answers that help me to "auto-ignore" or hide this error message, so that it does not appear. That would be OK, since there is no difference in the result.
I'm thankful for any help or advice.
Thanks!
Update
As Joachim Pileborg suggested, I've created a simple C++ test project that calls my DLL, and it works perfectly! The program that normally calls the DLL is written in Delphi, so I think it could be a Delphi bug... Weird: the call into the DLL works in 99.9999..% of cases, but in one specific case an error occurs INSIDE the DLL. It's not the call itself that fails... A really, really strange story :S
Ok... that is a negative number. The simplest explanation: perhaps your code is calling malloc with an invalid or negative size?
Notably, that number in binary is 11111111111111111111111111111100, so your size calculation is probably wrong. It could, however, be a more complicated error, for example one caused by a buffer overflow.
You should not ignore this error at all; it can be a symptom of a more complicated and dangerous error.
Try to debug your allocation, and try to find exactly the piece of code that is doing this invalid allocation.
You can overload operator new or redefine malloc, replacing them with your own debug functions in which you check the passed argument (size).
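A minimal sketch of that idea for operator new, assuming a C++11 compiler. The helper is_suspicious_size and the 1 GiB threshold kMaxReasonableSize are hypothetical; pick a limit that suits the program. Allocations made directly through malloc do not pass through this hook.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical threshold: anything above 1 GiB is treated as a likely bug.
constexpr std::size_t kMaxReasonableSize = std::size_t{1} << 30;

bool is_suspicious_size(std::size_t size) {
    return size > kMaxReasonableSize;
}

// Replacement for the global operator new: checks the size, then
// forwards to malloc.
void* operator new(std::size_t size) {
    if (is_suspicious_size(size)) {
        std::fprintf(stderr, "suspicious allocation of %zu bytes\n", size);
        std::abort();  // put a breakpoint here to inspect the call stack
    }
    if (void* p = std::malloc(size == 0 ? 1 : size)) {
        return p;
    }
    throw std::bad_alloc{};
}

// Matching replacements so memory from malloc is released with free.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
```

With a debugger attached, the abort (or a breakpoint on that line) stops in the frame that requested the bad size, which is usually enough to find the faulty calculation.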
4294967292 is (unsigned)-4. Either you are computing the size to be allocated incorrectly, or some integer has overflowed, or something similar.
I'd try putting a breakpoint on malloc (or whatever allocation function that is) and check where the bad value comes from.
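The arithmetic can be checked directly. The sketch below assumes 32-bit sizes, as in the error value; as_unsigned32 and payload_size are hypothetical names, and the subtraction is just one example of how such a value can arise.

```cpp
#include <cstdint>

// Reinterpret a signed 32-bit value as unsigned, as happens when a negative
// size is passed to a function that takes an unsigned size.
constexpr std::uint32_t as_unsigned32(std::int32_t value) {
    return static_cast<std::uint32_t>(value);
}

// Hypothetical size calculation: wraps around when header > total.
constexpr std::uint32_t payload_size(std::uint32_t total, std::uint32_t header) {
    return total - header;
}

static_assert(as_unsigned32(-4) == 4294967292u, "-4 as unsigned 32-bit");
static_assert(as_unsigned32(-4) == 0xFFFFFFFCu, "binary 1111...1100");
static_assert(payload_size(4, 8) == 4294967292u, "4 - 8 wraps around");
```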
Through the debugger, I was able to see the call stack, and what function in the code displayed the error. After searching for this routine, I could find out how to disable displaying it.
When I'm using my debugger (in my particular case, it was Qt Creator together with GDB that inspired this) on my C++ code, sometimes even after calling make clean followed by make the debugger seems to freak out.
Sometimes it will seem to be lined up with another piece of code's line numbers, and will jump around. Sometimes this is off by one line, sometimes it is totally off and jumps around erratically.
Other times, it'll freak out by stepping into things I didn't ask it to step into, like while stepping over a function call, it might step into the string initialization routine that is part of it.
When I get seg faults, sometimes it's able to tell me where it happened perfectly, and other times it's not even able to display question marks for which functions called the code and from where, and all I see is assembly, even while running the exact same code repeatedly.
I can't seem to figure out a pattern to what causes these failures, and sometimes my debugger is perfectly well behaved.
What are the theoretical reasons behind these debugger freak outs, and what are the concrete steps I can take to prevent them?
There are 3 very common reasons:
You're debugging optimized code. This rarely works - optimized code can be reordered/inlined/precomputed/etc. to the point there's no chance whatsoever to map it back to the source code.
You're not debugging, for whatever reason, the binary matching the current source code.
You've invoked undefined behavior somewhere, and whatever your code did has damaged the scaffolding the debugger needs to keep its sanity. This is what usually happens when you get a segfault and can't get a sane stack trace: you've overwritten the information (e.g. stack pointers) the debugger needs to do its job.
And probably hundreds more. Among those I personally encounter: debugging multithreaded code and, depending on gcc/gdb versions and various other things, quite a handful of debugger bugs.
One possible reason is that debuggers are as buggy as any other program!
But the most common reason for a debugger not showing the right source location is that the compiler optimized the code in some way, so there is no simple correspondence between the source code and the executable code. A common optimization that confuses debuggers is inlining, and C++ is very prone to it.
For example, your string initialization routine was probably inlined into the function call, so as far as the debugger was concerned, there was just one function that happened to start with some string initialization code.
If you're tracking down an algorithm bug (as opposed to a coding bug that produces undefined behavior, or a concurrency bug), turning the optimization level down will help you track the bug, because the debugger will have a simpler view of the code.
I have the same problem as you, and I haven't solved it yet. One workaround I have come up with is to install a virtual machine with a Unix system in it and debug there, under Linux. Perhaps that will work.
I have found the reason: you should rebuild the project every time you change your code, or Qt will just run the old version of the code.
This is utterly mystifying me. I have, in my class declaration, two lines:
std::multimap<int, int> commands;
std::multimap<std::string, std::string> config;
The code compiles without issue, but when I run it, I get the following error:
*** glibc detected *** ./antares: free(): invalid pointer: 0xb5ac1b64 ***
Seems simple enough, except that it has nothing to do with how the two variables are later handled. I removed all references in the rest of the code to the variables - still crashed. I commented out one of the lines - either one, and the program ran without issue. How can the error not be with either particular variable? I'm working under the assumption that there isn't a bug in STL, but I've run out of ideas on how my code could possibly be doing this.
This one has me stumped, so I'd appreciate any help you can provide.
Wyatt
EDIT: I'm not suggesting there's a problem with STL, that was just me being a bit glib. I know the bug is in my code, what I want to know is - what could possibly be wrong that declaring an unreferenced variable would cause it to crash? Why would that affect my code at all?
My code is a few thousand lines long, so it's not really worth anyone's time reading through it, I'm just looking for someone to point me in the right direction.
You're correct to assume the problem isn't in GCC or the STL. However, if the maps are causing free errors, your other code is likely stack smashing (or heap smashing). A truly terrible bug to chase down. The worst part about stack smashing is that the object that breaks is not the object with the bug.
Here are some debugging tips.
Run the app under valgrind.
Define _GLIBCXX_DEBUG to enable STL debugging.
Add MALLOC_CHECK_=1 as an environment variable. This will give you better malloc error messages. More info here.
On rare occasions I have been able to add a memory watch to the location that will be smashed. But it is rare when you can predict where the smashing will occur.
You are right: the crash is not from these two lines - they just make it visible.
Here's how to diagnose this problem:
first, leave your variables defined (make your program crash)
second, remove or disable other parts of your code until the crash stops happening. Then you will know an approximate area that corrupts your memory.
third (once you have an area that when disabled stops the crash) start enabling parts of it until the crash happens again.
Edit: I'd say your problem is with the code of the class that contains your two multimaps (a copy constructor or assignment operator is missing, or something like that). It's just a wild guess, so don't put much stock in it.
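To illustrate that guess: a class that owns raw memory needs its own copy constructor and assignment operator. Otherwise two objects end up sharing one pointer, and the second destructor frees it again, which can surface as an invalid pointer error in unrelated objects. The Buffer class below is a hypothetical sketch, not the asker's code, and uses the copy-and-swap idiom for assignment.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a raw array.
class Buffer {
public:
    explicit Buffer(std::size_t size) : size_(size), data_(new int[size]()) {}

    // Deep copy: each object gets its own array.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // Copy-and-swap: the parameter is a copy whose destructor releases
    // the old array once the contents have been swapped.
    Buffer& operator=(Buffer other) {
        swap(other);
        return *this;
    }

    ~Buffer() { delete[] data_; }

    void swap(Buffer& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    int& operator[](std::size_t index) { return data_[index]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```

Members such as std::multimap or std::vector manage their own memory, so a class made only of those can rely on the compiler-generated copy operations.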