Visual Studio 2005 C compiler problem when optimizing a switch statement - c++

General Question which may be of interest to others:
I ran into a, what I believe, C++-compiler optimization (Visual Studio 2005) problem with a switch statement. What I'd want to know is if there is any way to satisfy my curiosity and find out what the compiler is trying to but failing to do. Is there any log I can spend some time (probably too much time) deciphering?
My specific problem for those curious enough to continue reading - I'd like to hear your thoughts on why I get problems in this specific case.
I've got a tiny program with about 500 lines of code containing a switch statement. Some of its cases contain some assignment of pointers.
double *ptx, *pty, *ptz;
double **ppt = new double*[3];
//some code initializing etc ptx, pty and ptz
ppt[0]=ptx;
ppt[1]=pty; //<----- this statement causes problems
ppt[2]=ptz;
The middle statement seems to hang the compiler. The compilation never ends. OK, I didn't wait for longer than it took to walk down the hall, talk to some people, get a cup of coffee and return to my desk, but this is a tiny program which usually compiles in less than a second. Remove a single line (the one indicated in the code above) and the problem goes away, as it also does when removing the optimization (on the whole program or using #pragma on the function).
Why does this middle line cause a problem? The compilers optimizer doesn't like pty.
There is no difference in the vectors ptx, pty, and ptz in the program. Everything I do to pty I do to ptx and ptz. I tried swapping their positions in ppt, but pty was still the line causing a problem.
I'm asking about this because I'm curious about what is happening. The code is rewritten and is working fine.
Edit:
Almost two weeks later, I check out the closest version to the code I described above and I can't edit it back to make it crash. This is really annoying, embarrassing and irritating. I'll give it another try, but if I don't get it breaking anytime soon I guess this part of the question is obsolete and I'll remove it. Really sorry for taking your time.

If you need to make this code compilable without changing it too much consider using memcpy where you assign a value to ppt[1]. This should at least compile fine.
However, you problem seems more like another part of the source code causes this behaviour.
What you can also try is to put this stuff:
ppt[0]=ptx;
ppt[1]=pty; //<----- this statement causes problems
ppt[2]=ptz;
in another function.
This should also help compiler a bit to avoid the path it is taking to compile your code.

Did you try renaming pty to something else (i.e. pt_y)? I encountered a couple of times (i.e. with a variable "rect2") the problem that some names seem to be "reserved".

It sounds like a compiler bug. Have you tried re-ordering the lines? e.g.,
ppt[1]=pty;
ppt[0]=ptx;
ppt[2]=ptz;
Also what happens if you juggle about the values that are assigned (which will introduce bugs in your code, but may indicator whether its the pointer or the array that's the issue), e.g.:
ppt[0] = pty;
ppt[1] = ptz;
ppt[2] = ptx;
(or similar).

It's probably due to your declaration of ptx, pty and ptz with them being optimised out to use the same address. Then this action is causing your compiler problems later in your code.
Try
static double *ptx;
static double *pty;
static double *ptz;

Related

C++ What could cause a line of code in one place to change behavior in an unrelated function?

Consider this mock-up of my situation.
in an external header:
class ThirdPartyObject
{
...
}
my code: (spread among a few headers and source files)
class ThirdPartyObjectWrapper
{
private:
ThirdPartyObject myObject;
}
class Owner
{
public:
Owner() {}
void initialize();
private:
ThirdPartyObjectWrapper myWrappedObject;
};
void Owner::initialize()
{
//not weird:
//ThirdPartyObjectWrapper testWrappedObject;
//weird:
//ThirdPartyObject testObject;
}
ThirdPartyObject is, naturally, an object defined by a third party (static precompiled) library I'm using. ThirdPartyObjectWrapper is a convenience class that eliminates a lot of boiler-plating for working with ThirdPartyObject. Owner::initialize() is called shortly after an instance of Owner is created.
Notice the two lines I have labeled as "weird" and "not weird" in Owner::initialize(). All I'm doing here is creating a couple of objects on the stack with their default constructors. I don't do anything with those objects and they get destroyed when they leave scope. There are no build or linker errors involved, I can uncomment either or both lines and the code will build.
However, if I uncomment "weird" then I get a segmentation fault, and (here's why I say it's weird) it's in a completely unrelated location. Not in the constructor of testObject, like you might expect, but in the constructor of Owner::myObjectWrapper::myObject. The weird line never even gets called, but somehow its presence or absence consistently changes the behavior of an unrelated function in a static library.
And consider that if I only uncomment "not weird" then it runs fine, executing the ThirdPartyObject constructor twice with no problems.
I've been working with C++ for a year so it's not really a surprise to me that something like this would be able happen, but I've about reached the limit of my ability to figure out how this gotcha is happening. I need the input of people with significantly more C++ experience than me.
What are some possibilities that could cause this to happen? What might be going on here?
Also, note, I'm not asking for advice on how to get rid of the segfault. Segfaults I understand, I suspect it's a simple race condition. What I don't understand is the behavior gotcha so that's the only thing I'm trying to get answers for.
My best lead is that it has to do with headers and macros. The third party library actually already has a couple of gotchas having to do with its headers and macros, for example the code won't build if you put your #include's in the wrong order. I'm not changing any #include's so strictly this still wouldn't make sense, but perhaps the compiler is optimizing includes based on the presence of a symbol here? (it would be the only mention of ThirdPartyObject in the file)
It also occurs to me that because I am using Qt, it could be that the Meta-Object Compiler (which generates supplementary code between compilations) might be involved in this. Very unlikely, as Qt has no knowledge of the third party library where the segfault is happening and this is not actually relevant to the functionality of the MOC (since at no point ThirdPartyObject is being passed as an argument), but it's worth investigating at least.
Related questions have suggested that it could be a relatively small buffer overflow or race condition that gets tripped up by compiler optimizations. Continuing to investigate but all leads are welcome.
Typical culprits:
Some build products are stale and not binary-compatible.
You have a memory bug that has corrupted the state of your process, and are seeing a manifestation of that in a completely unrelated location.
Fixing #1 is trivial: delete the build folder and build again. If you're not building in a shadow build folder, you've set yourself up for failure, hopefully you now know enough to stop :)
Fixing #2 is not trivial. View manual memory management and possible buffer overflows with suspicion. Use modern C++ programming techniques to leverage the compiler to help you out: store things by value, use containers, use smart pointers, and use iterators and range-for instead of pointers. Don't use C-style arrays. Abhor C-style APIs of the (Type * array, int count) kind - they don't belong in C++.
What fun. I've boiled this down to the bottom.
//#include <otherthirdpartyheader.h>
#include <thirdpartyobject.h>
int main(...)
{
ThirdPartyObject test;
return 0;
}
This code runs. If I uncomment the first include, delete all build artifacts, and build again, then it breaks. There's obviously a header/macro component, and probably some kind of compiler-optimization component. But, get this, according to the library documentation it should give me a segfault every time because I haven't been doing a required initialization step. So the fact that it runs at all indicates unexpected behavior.
I'm chalking this up to library-specific issues rather than broad spectrum C++ issues. I'll be contacting the vendor going forward from here, but thanks everyone for the help.

Writing an MCVE (minimal source code that reproduces an error) automatically?

When I want to ask a question on e.g. stackoverflow I usually have to post the source code.
The problem is, I am using quite a big custom framework, classes structure etc. and the problem-related parts may be localized in many places (sometimes it's very hard to detect which parts of code are important for question). I cannot post the full source code (it would be too big to read efficiently).
For that reason I usually make an effort to write an minimal code (usually in one main.cpp instead of tons of classes) that reproduces the problem.
I wonder - is it possible to automate that process?
The typical things to do here is to replace methods/functions' calls with their bodies, merge files into one .cpp, remove all the "not called" methods & classes, unused variables etc.
The real difficulty here is telling the difference between "it doesn't do what I want", "the bug went away because I removed the essential code" and the case of "it now crashes because I removed something important". And really, hitting the delete key after marking some code "don't need it" is the easy part.
Finding out what is essential to show a problem is the hard part, and it's very difficult to automate this - because it is necessary to understand the difference between what the code should do and what it actually does. Just randomly removing code will not work, because the "new" code may be broken because you removed some essential step, not because you remove unused crud - only humans [that understand the problem] can do that.
Consider this:
Object* something;
void Initialize()
{
something = new Object(1, 2, 3);
}
int main()
{
Initialize();
// Some more code, some of which SOMETIMES sets something = NULL.
something->doStuff(); // Will crash if object is NULL.
}
If we remove Initialize, the code will fail every time, not just every third time. But it's because the Object has not been initialized, rather than the bug in the code [which may be that we should add if (something) before something->doStuff(), or because we shouldn't set it to NULL in the "some more code", so "don't do that"].
When I work on a tricky problem, especially at work where we have test systems that automatically produce code for testing different functionality under different conditions, my first step is to make [or take some existing] code a "small standalone test", which is small and simple, and only does "what is necessary", rather than trying to reduce many thousands of lines of complex code that does lots of extra stuff.
For some projects, there are tools around that helps with identifying "which bit of the code is the problem", for example [bugpoint][1] that finds which "pass" in LLVM is guily of causing a crash.
If you have a version control system that supports this you can "bisect" code to come up with the version that introduced a particular fault [at least sometimes]. However, I had a case at work where some of my code "apparently broke things", but it turns out that some other code was "broken all the time since a long time back" because the other code was not clearing a pointer field that the API manual says should be set to NULL, and my code was inspecting the pointer to find out if it was pointing at the right type thing - which goes horribly wrong when the value is "whatever happens to be in that part of the stack", so it is not NULL and not a valid pointer. I just added the code that made this bug apparent rather than hiding itself.
[1] http://llvm.org/docs/CommandGuide/bugpoint.html

Fortran script only runs when print statement added

I am running an atmospheric model, and need to compile an executable to convert some files. If I compile the code as supplied, it runs but it gets stuck and doesn't ever complete. It doesn't give an error or anything like that.
After doing some testing by adding print statements to see where it was getting stuck, I've found that the executable only runs if I compile the code with a print statement in one of the subroutines.
The piece of code in question is the one here. Specifically, the code fails to run unless I put a print statement somewhere in the get_bottom_top_dim subroutine.
Does anyone know why this might be? It doesn't matter what the print statement is (currently I'm using print*, '!'). but as soon as I remove it or comment it out, the code no longer works.
I'm assuming it must have something to do with my machine or compiler (ifort 12.1.0), but I'm stumped as to what the problem is!
This is an extended comment rather than an answer:
The situation you describe, inserting a print statement which apparently fixes a program, often arises when the underlying problem is due to either
a) an attempt to access an element outside the declared bounds of an array; or
b) a mismatch between dummy and actual arguments to some procedure.
Recompile your program with the compiler options to check interfaces at compile-time and to check array bounds at run-time.
Fortran has evolved a LOT since I last used it but here's how to go about solving your problem.
Think of some hypotheses that could explain the symptoms, e.g. the compiler is optimizing the subroutine down to a no-op when it has no print side effect. Or a compiler bug is translating this code into something empty or an infinite loop or crashing code. (What exactly do you mean by "fails to run"?) Or the Linker is failing to link in some needed code unless the subroutine explicitly calls print.
Or there's a bug in this subroutine and the print statement alters its symptoms e.g. by changing which data gets overwritten by an index-out-of-bounds bug.
Think of ways to test these hypotheses. You might already have observations adequate to rule out of some of them. You could decompile the object code to see if this subroutine is empty. Or step through it in a debugger. Or replace the print statement with a different side effect like logging to a file or to an in-memory text buffer.
Or turn on all optional runtime memory checks and compile time warnings. Or simplify the code until the problem goes away, then binary search on bringing back code until the problem recurs.
Do the most likely or easiest tests first. Rule out some hypotheses, and iterate.
I had a similar bug and I found that the problem was in the dependencies on the makefile.
This was what I had:
I set a variable with a value and the program stops.
I write a print command and it works.
I delete the print statement and continues to work.
I alter the variable value and stops.
The thing is, the variable value is set in a parameters.f90
The print statement is in a file H3.f90 that depends on parameters.f90 but it was not declared on the makefile.
After correcting:
h3.o: h3.f90 variables.f90 parameters.f90
$(FC) -c h3.f90
It all worked properly.

Assertion Failed in C++ questions

Running into an issue with some code I'm working on. This code is being run on a linux-based system and the error I receive is the following:
/root/cvswork/pci_sync_card/Code/SSBSupport/src/CRCWbHfChannel/CRCWbHfMSBSimulator.cpp:447:
virtual void CCRCWbHfMSBSimulator::Process(): Assertion 'pcBasebandOutput' failed.
I've tried stepping through this code to figure out why this is failing and I can't seem to figure it out. Unfortunately I have too many files to really share the code on here (stepping through the pcBasebandOutput assignment takes quite some time). I understand this is a more complex issue than can really be asked about. My primary questions are these:
Is my assert(pcBasebandOutput); line of code necessary? I only ask because when running this code on Visual Studio, the results from my program were desirable.
When it is evaluating my pcBasebandOutput variable, why would it evaluate it as false? Is this saying that no value is actually assigned to pcBasebandOutput? Or that a value may be assigned to it, but it is not of the right type (pointer to a struct of two variables, both of which are doubles)?
Thanks!
assert checks a logical condition. Assertation fails if the condition is false. So writing assert(cond) is logically the same as writing:
if (!cond)
{
assert(false);
}
I don't suggest you to remove assert from the code, because it is a guard telling you that something went not the way it's intended to go. And it's not a god idea just to ignore that, because it may shoot you in a leg later
Only you can know that
What is the type of pcBasebandOutput ? Maybe it is not properly initialized?
assert primary purpose is to allow your IDE to enter debuging session in the place where assert has hit. From there you can read all variables and see callstack/threads. Other solution (than using debugger) is to add lots of logging, which in threaded environments can cause problems on its own (logging is quite slow).

Segmentation fault when adding a vector. (C++)

So I've got a pretty basic class that has a few methods and some class variables. Everythings working great up until I add a vector to the member variables in the header file:
std::vector <std::string> vectorofstuff;
If all I do is add this line then my program run perfectly but at the end, after all the output is there, I get a message about a seg fault.
My first guess is that I need to call the destructor on the vector, but that didn't seem to work. Plus my understanding is I don't need to call the destructor unless I use the word 'new'.
Any pushes in the right direction? Thanks bunches!
I guess the following either happened to you, or it was something similar involving unrealised dependencies/headers. Either way, I hope this answer might show up on Google and help some later, extremely confused programmer figure out why they're suddenly observing arbitrary crashes.
So, from experience, this can happen if you compile a new version of SomeObject.o but accidentally have another object file #include an old version of SomeObject.hpp. This leads to corruption, which'll be caused by the compiler referring to outdated member offsets, etc. Sometimes this mostly works and only produces segfaults when destructing objects - either related or seemingly distant ones - and other times the program segfaults right away or somewhere in-between; I've seen several permutations (regrettably!).
For anyone wondering why this can happen, maybe this is just a reflection of how little sleep I get while programming, but I've encountered this pattern in the context of Git submodules, e.g.:
MyRepo
/ GuiSubmodule
/ HelperSubmodule
/ / GuiSubmodule
If (A) you have a new commit in GuiSubmodule, which has not yet been pulled into HelperSubmodule's copy, (B) your makefile compiles MyRepo/uiSubmodule/SomeObject.o, and (C) another translation unit - either in a submodule or in the main repo via the perils of #include - links against an older version of SomeObject.hpp that has a different class layout... You're in for a fun time, and a lot of chasing red herrings until you finally realise the simple mistake.
Since I had cobbled together my build process from scratch, I might've just not been using Git/make properly - or strictly enough (forgetting to push/pull all submodules). Probably the latter! I see fewer odd bugs nowadays at least :)
You are probably corrupting the memory of the vectorofstuff member somewhere within your class. When the class destructor is called the destructor of the vector is called as well, which would try to point and/or delete to invalid memory.
I was fooling around with it and decided to, just to be sure, do an rm on everything and recompile. And guess what? That fixed it. I have no idea why, in the makefile I do this anyway, but whatever, I'm just glad I can move on and continue working on it. Thanks so much for all the help!