I need to build some old code on my office computer, which has gcc 4.4.5 installed. I edited the code (deleting .h or adding things like <cstring>) to bring it up to date so that gcc 4.4.5 can compile it. However, after a seemingly successful compile, the binary reports a buffer overflow every time I run it. The same code runs with no error on my computer at home (gcc 4.1.2). So is it possible that the changes I made caused this error? I am not sure, since I am not really a programmer.
Far more likely is that the original code was buggy in some way (undefined behaviour, buffer overflows and so on) but the old compiler created (or the old library contained) code that was more tolerant of these issues (a).
I'm afraid you will probably have to go and fix (or get someone to fix) the root cause of the problem. My question to you would be: "if you don't consider yourself a programmer, why are you editing the code and rebuilding it?".
My mother's not a coder either but she doesn't go around tinkering in the Linux kernel :-)
(a) Sometimes undefined behaviour actually works! That's actually its most annoying aspect. Far better that it would fail all the time so that we'd fix more problems before unleashing them on our poor customers. But, even when it works, that doesn't make it a good idea.
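To make the footnote a bit more concrete, here is a minimal sketch of one common kind of latent bug and a bounded alternative. The helper name copy_bounded is made up for illustration and does not come from the code in the question. An unchecked strcpy into a fixed-size buffer is undefined behaviour as soon as the source string is too long: one build may happen to tolerate it while another aborts.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not taken from the code in the question.
// Copies src into dst without ever writing more than dst_size bytes,
// always terminates dst, and reports whether the whole string fit.
bool copy_bounded(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || dst_size == 0 || src == nullptr) {
        return false;
    }
    const std::size_t len = std::strlen(src);
    const std::size_t n = (len < dst_size) ? len : dst_size - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return len < dst_size;  // false means the copy was truncated
}
```

Replacing an unchecked strcpy(buf, input) with something like copy_bounded(buf, sizeof buf, input) does not fix the design, but it turns silent memory corruption into a result you can test for.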
Every so often I (re)compile some C (or C++) file I am working on -- which by the way succeeds without any warnings -- and then I execute my program only to realize that nothing has changed since my previous compilation. To keep things simple, let's assume that I added an instruction to my source to print out some debugging information onto the screen, so that I have visual evidence of the trouble: indeed, I compile, execute, and unexpectedly nothing is printed onto the screen.
This happened to me once when I had buggy code (I ran past the bounds of a static array). Of course, if your code has some kind of hidden bug (What are all the common undefined behaviours that a C++ programmer should know about?) the compiled code can do pretty much anything.
This has happened to me twice when I used a ridiculously slow network hard drive which -- I guess -- simply did not update my executable file after compilation, so I kept running the old version despite the updated source. I am just speculating here, so feel free to correct me if such a phenomenon is impossible, but I suspect it had something to do with certain processes waiting for IO.
Well, such things could of course happen (and they indeed do), when you execute an old version in the wrong directory (that is: you execute something similar, but actually completely unrelated to your source).
It is happening again, and it annoys me enough to ask: how do you make sure that your executable matches the source you are working on? Should I compare the date strings of the source and the executable in the main function? Should I delete the executable prior to compilation? I guess people might do something similar by means of version control.
Note: I was warned that this might be a subjective topic likely doomed to be closed.
Just use good old version control.
In the simple case, you can add any visible version ID to the code (hash, revision ID, timestamp) and check it.
If your project has a lot of dependent files and you suspect that something older than the latest version ended up in the produced code, you can also track the version of every file used for the build (in addition to good makefile rules, obviously). This is VCS-dependent, but not a heavy trick.
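As a minimal sketch of the "visible version ID" idea, assuming a timestamp is good enough: the standard predefined macros __DATE__ and __TIME__ can be baked into the binary. The function name build_stamp is made up for illustration.

```cpp
#include <string>

// __DATE__ and __TIME__ are standard predefined macros. They expand to the
// date and time at which this translation unit was compiled.
inline std::string build_stamp() {
    return std::string(__DATE__) + " " + __TIME__;
}
```

Printing build_stamp() as the first thing in main shows at a glance which build is running. Note that the stamp only changes when the file containing it is recompiled, so the build rules have to rebuild that file every time for it to be trustworthy.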
Check the timestamp of your executable. That should give you a hint regarding whether or not it is recent/up-to-date.
Alternatively, calculate a checksum for your executable and display it on startup; if the checksum is the same as before, you know the executable was not updated.
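The timestamp check can also be automated. The following is only a sketch, assuming a C++17 compiler with std::filesystem; the function name binary_is_stale and the idea of passing the source list explicitly are my own choices, not something from the question.

```cpp
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Returns true when `binary` cannot be inspected or is older than any of the
// readable files in `sources`. Unreadable sources are skipped.
bool binary_is_stale(const fs::path& binary, const std::vector<fs::path>& sources) {
    std::error_code ec;
    const auto built = fs::last_write_time(binary, ec);
    if (ec) {
        return true;  // no timestamp for the binary: treat it as stale
    }
    for (const auto& src : sources) {
        const auto edited = fs::last_write_time(src, ec);
        if (!ec && edited > built) {
            return true;  // a source was modified after the binary was written
        }
    }
    return false;
}
```

A small wrapper script or test could call this before launching the program. On a network drive with clock skew the timestamps themselves may be unreliable, so treat the result as a hint, not proof.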
Unfortunately I am not working with open code right now, so please consider this a question of pure theoretical nature.
The C++ project I am working with is clearly broken by the following options. At least GCC 4.3 through 4.8 cause the same problems; I didn't notice any trouble with the 3.x series (these options might not have existed or might have worked differently there). The affected platforms are Linux x86 and Linux ARM. The options themselves are enabled automatically at the O1 or O2 level, so I first had to find out which options are causing it:
tree-dominator-opts
tree-dse
tree-fre
tree-pre
gcse
cse-follow-jumps
It's not my own code, but I have to maintain it, so how could I find the source of the trouble these options are causing? Once I disable the optimizations above with "-fno", the code works.
On a side note, the project works flawlessly with Visual Studio 2008, 2010 and 2013 without any noticeable problems or specific compiler options. Granted, the code is not 100% cross-platform, so some parts are Windows/Linux specific, but even then I'd like to know what's happening here.
It's not a vital question, since I can make the code run flawlessly, but I am still interested in how to track down such problems.
So to make it short: how do I identify and find the affected code?
I doubt it's a giant GCC bug and maybe there is not even a real fix for the code I am working with, but it's of real interest for me.
I take it that most of these options are eliminations of some kind and I also read the explanations for these, still I have no idea how I would start here.
First of all: try using a debugger. If the program crashes, check the backtrace for places to look for the faulty function. If the program misbehaves (wrong outputs), you should be able to tell where it occurs by carefully placing breakpoints.
If it didn't help and the project is small, you could try compiling a subset of your project with the "-fno" options that stop your program from misbehaving. You could brute-force your way to finding the smallest subset of faulty .cpp files and work your way from there. Note: finding a search algorithm with good complexity could save you a lot of time.
If, by any chance, there is a single faulty .cpp file, then you could further factor its contents into several .cpp files to see which functions are the cause of misbehavior.
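The search over .cpp files can be organized as a bisection. This is a rough sketch under two assumptions: exactly one file is responsible, and you can supply a predicate that rebuilds with the "-fno" options applied only to a given subset, runs the program, and reports whether it now behaves. The names BuildAndTest and find_faulty_file are made up for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback provided by the caller: rebuild with the "-fno-..."
// options applied only to `subset`, run the program, and return true if it
// now works.
using BuildAndTest = std::function<bool(const std::vector<std::string>& subset)>;

// Halves the candidate list on each rebuild, so n files need roughly
// log2(n) rebuilds. Assumes exactly one file is responsible.
std::string find_faulty_file(std::vector<std::string> files,
                             const BuildAndTest& works_when_deoptimized) {
    while (files.size() > 1) {
        const auto mid = files.begin() + static_cast<std::ptrdiff_t>(files.size() / 2);
        const std::vector<std::string> first_half(files.begin(), mid);
        if (works_when_deoptimized(first_half)) {
            files = first_half;                // culprit is in the first half
        } else {
            files.erase(files.begin(), mid);   // culprit is in the second half
        }
    }
    return files.empty() ? std::string() : files.front();
}
```

If more than one file is involved, the single-culprit assumption breaks and the result will be misleading; in that case you are back to testing larger subsets by hand.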
As the title mentions, I have a problem where one executable of a big project gives a segmentation fault when it runs, but only when it is compiled normally and not with debug flags.
We are working on Linux SUSE servers and the code is mostly C++. Through bt in gdb, I have been able to see exactly where the problem occurs, which brings me to the question. The file is an auto-generated one which has not been changed for years. The difference now is that we have updated a third-party component, gSOAP. Before updating the third-party version, it worked normally in both debug and non-debug builds.
With debug flags, the problem disappears magically (for newbies like me).
I am sorry, but it's not possible to include a lot of code, only the line in question:
/*------------------------------------------------------------.
| yynewstate -- Push a new state, which is found in yystate. |
`------------------------------------------------------------*/
yynewstate:
/* In all cases, when you get here, the value and location stacks
have just been pushed. So pushing a state here evens the stacks. */
yyssp++;
yysetstate:
*yyssp = yystate; <------------------ THIS LINE
So, any help would be appreciated. I actually don't understand why this problem arises and what steps I should take to solve it.
EDIT: I don't expect you to solve this particular case for me. I am asking more for help understanding why this could occur in programming; my case in this code is just an example.
First, please realize that you're using C++, not Java or any other language where the behaviour of your program is always predictable, where even runtime errors are predictable.
In C++, things are not predictable as in those languages. Just because your original program hasn't changed for years does not mean the program was error-free. That's how C++ works -- you think you have an error-free program, and it is not really error-free.
From your code, the crash happens because yyssp is pointing to something it shouldn't be pointing to, and dereferencing this pointer causes the fault. That is the only thing that can be concluded from the code you posted. Why is the pointer pointing where it is? We don't know; that is what you need to discover by debugging.
As to why things run differently in debug and release -- again, a bug like this allows a program to run in an unpredictable way. Add or remove code, run it on another machine, run it with differing compiler options, maybe even run it next week, and it might behave differently.
One thing you should not do -- if you make a totally irrelevant code change and magically your program works, do not claim the problem is fixed or resolved. No -- the problem is not fixed -- you've either masked it, or the bug is moved to another part of your code, hidden from you. Every fix that entails things like this must be reasoned as to why the fix addresses the problem.
Too many times, a naive programmer moves things around, adds or removes lines, and bingo, things work, and that becomes the fix. Don't fall into that trap.
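To illustrate the difference between masking and fixing, here is a generic sketch of a checked stack write. It is not the generated parser's code; StateStack and push_state are invented names, and the tiny capacity is only for demonstration. An unchecked write through a pointer that has run past its buffer corrupts whatever happens to follow, which is why the symptom moves around with compiler flags. A checked write fails the same way in every build.

```cpp
#include <cstddef>

// Illustrative fixed-size state stack (hypothetical, not the parser's own).
struct StateStack {
    static constexpr std::size_t kCapacity = 4;
    short data[kCapacity];
    std::size_t size = 0;
};

// Writes only when there is room. Returns false instead of writing past
// the end of the buffer.
bool push_state(StateStack& stack, short state) {
    if (stack.size >= StateStack::kCapacity) {
        return false;  // would overflow: report it instead of corrupting memory
    }
    stack.data[stack.size++] = state;
    return true;
}
```

When a check like this starts returning false, you have found where the invariant breaks, and the reason why still has to be traced back from there.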
Someone on my team found a temporary solution for this:
it was the optimization flags that this library is built with.
The default for our build was -O2, while the debug build uses a different setting.
Building the library with -O0 (changing the makefile) provides a temporary solution.
Alright, I need a sanity check.
I was working on my project by starting to add a new class yesterday and decided to compile my progress so far. Upon executing a release build of the application, I immediately noticed a bug (in this case, an asteroid field not showing up in my game). This confused me, since I hadn't touched any code that could have created this bug, and everything else is rendered fine as usual.
Fast forward a day and I think I've narrowed down the cause; however, it doesn't make any sense to me at all.
It only happens intermittently with release builds. (About 1 in 5 executions) But I can prevent it happening altogether by simply not including my new class in the project. Even though the class isn't being used or included by anything else in the project yet. I even went a step further and found just commenting out the class definitions and leaving the header is fine but the second I uncomment them it comes back. And for further testing I included it in an older build with the same results.
Would unused code affect the compilation of a release build? Does this sound like a Visual Studio bug? Or does it not make any sense at all? (In other words, did I just find a really convincing red herring?) Is there anything I can do to help figure out the source of this bug?
Would unused code affect the compilation of a release build?
Yes. As this SO answer explains, Visual C++ keeps unused functions that aren't marked as inline.
Does this sound like a Visual Studio bug?
It's unlikely, as I'll explain in answer to your next question. Eric Lippert explains that, because the compiler is used so often, the easy and commonly seen bugs have already been found and fixed, so it is far more likely that the bug is in your code than in the C++ compiler.
Did I just find a really convincing red herring?
Yes, I believe you did. Without seeing the code for drawing the asteroid field, I can't be sure, but it's likely that you are dereferencing a bad pointer somewhere in there. Now it starts to matter which data the pointer points to.
When the unused class is included in the compiled executable, the memory devoted to your code is larger, which means that your heap has to start higher in memory, and a bunch of things move to higher addresses. This means that a bad pointer can point at different data depending on whether the unused class is compiled into the executable or not. When you dereference that pointer, then, you get different results. That's why removing the code appears to "fix" the bug - whenever you take out the unused class, the bad pointer points at the data you want, rather than at other data. As @awoodland rightly points out in the comments, you're really unlucky without the unused class and lucky with it, because without it you fail to find a bug that could manifest itself in all kinds of weird ways once you start distributing the code to your friends (or customers, if this is a commercial product). These kinds of "bad pointer that happens to work" bugs can easily cause your code to appear to work correctly on some machines and fail dramatically on other machines, which is very difficult to debug. Better that you found the bug now than later.
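One way to make this kind of bug show up on every run, instead of 1 in 5, is to make sure a pointer is never left uninitialized and to check it before use. This is only a sketch with invented names (Asteroid, AsteroidField, attach, count); I have not seen your drawing code, so the real cause may look quite different.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

struct Asteroid {
    float x = 0.0f;
    float y = 0.0f;
};

// Hypothetical class: holds a non-owning pointer to the asteroid data.
class AsteroidField {
public:
    void attach(const std::vector<Asteroid>* rocks) { rocks_ = rocks; }

    // Fails loudly and identically in every build if attach() was forgotten,
    // instead of reading whatever the stray pointer happens to point at.
    std::size_t count() const {
        if (rocks_ == nullptr) {
            throw std::logic_error("AsteroidField used before attach()");
        }
        return rocks_->size();
    }

private:
    const std::vector<Asteroid>* rocks_ = nullptr;  // never left uninitialized
};
```

With the default member initializer, adding or removing unrelated code can no longer change what the pointer refers to, so the failure stops depending on memory layout.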
When I'm using my debugger (in my particular case, it was Qt Creator together with GDB that inspired this) on my C++ code, sometimes even after calling make clean followed by make the debugger seems to freak out.
Sometimes it will seem to be lined up with another piece of code's line numbers, and will jump around. Sometimes this is off by one line, sometimes it is totally off and it'll jump around erratically.
Other times, it'll freak out by stepping into things I didn't ask it to step into, like while stepping over a function call, it might step into the string initialization routine that is part of it.
When I get seg faults, sometimes it's able to tell me where it happened perfectly, and other times it's not even able to display question marks for which functions called the code and from where, and all I see is assembly, even while running the exact same code repeatedly.
I can't seem to figure out a pattern to what causes these failures, and sometimes my debugger is perfectly well behaved.
What are the theoretical reasons behind these debugger freak outs, and what are the concrete steps I can take to prevent them?
There are three very common reasons:
You're debugging optimized code. This rarely works - optimized code can be reordered/inlined/precomputed/etc. to the point where there's no chance whatsoever to map it back to the source code.
You're not debugging, for whatever reason, the binary matching the current source code.
You've invoked undefined behavior somewhere - whatever your code did has messed around with the scaffolding the debugger needs to keep its sanity. This is what usually happens when you get a segfault and you can't get a sane stack trace: you've overwritten/messed with the information (e.g. stack pointers) the debugger needs to do its job.
And probably hundreds more. Among the things I personally encounter: debugging multithreaded code and, depending on gcc/gdb versions and various other things, quite a handful of debugger bugs.
One possible reason is that debuggers are as buggy as any other program!
But the most common reason for a debugger not showing the right source location is that the compiler optimized the code in some way, so there is no simple correspondence between the source code and the executable code. A common optimization that confuses debuggers is inlining, and C++ is very prone to it.
For example, your string initialization routine was probably inlined into the function call, so as far as the debugger was concerned, there was just one function that happened to start with some string initialization code.
If you're tracking down an algorithm bug (as opposed to a coding bug that produces undefined behavior, or a concurrency bug), turning the optimization level down will help you track the bug, because the debugger will have a simpler view of the code.
I have the same question as you, and I cannot solve it yet. But I have come up with one possible workaround, which is to install a virtual machine, install a Unix system in it, and debug there under Linux. Perhaps it will work.
I have found the reason: you should rebuild the project every time you change your code, or Qt will just run the old version of the code.