Visual Studio - correcting different behaviour in an optimized build - c++

I have a project I'm working on. I recently switched it to release mode with full optimization just to get an idea how some things will perform out of debug mode. On doing so however, I noticed that there were a few irregularities. In my particular case, I have a sprite who's alpha value is different (more transparent) in release mode then debug mode.
To check my findings, I made a copy of the release mode build options, but turned off optimizations (being sure that DEBUG and other related preprocessor options were removed) and it performed correctly. Something in the optimization process modifies the behaviour of my system. It's probably because there are variables I'm not initializing in my classes somewhere.
My question is, is there an alternative, aside from manually combing over my code, to making sure things are initialized properly? I've checked the warnings that pop up, but all are related to int to float/float to int conversion and possible loss of data and enum qualifiers, and none of them are related to the alpha on my sprite.
I'm using Visual Studio 2010 if it makes a difference.

This type of thing can be very difficult to debug. I suggest that you replace the optimization one-by-one, until you find the one that causes the anomly. You can then narrow the issue further by applying that optimization to each translation unit (file) one-by-one.
Another way to deal with this issue is essentially a data trace. Analyze code to determine what item of data is controling your alpha. Locate each statement that writes to the data. Put breakpoint or traces on those statements. Then determine the first point in the executable where the release data differs from the debug data.

Turn on as many warnings as you can to catch uninitialized variables. Run the code through a static LINT tool, like cppcheck. It's been a while since I've used VS, but I'm pretty sure you can set up your debugger to break on accessing uninitialized variable accesses and trace it from there.
If the code is multithreaded, then you likely have a race condition, or the compiler may be reordering things that wouldn't ordinarily make a difference if threads didn't interact, so if that's the case, double-check your proper use of locks.

Use an always-initialized template for all your member variables.
template<typename T> class always_initialized {
T t;
public:
operator T&() { return t; }
operator const T&() const { return t; }
always_initialized() : t(T()) {}
template<typename K> always_initialized(K&& ref) : t(std::forward<K>(ref)) {}
};

Related

C++ What could cause a line of code in one place to change behavior in an unrelated function?

Consider this mock-up of my situation.
in an external header:
class ThirdPartyObject
{
...
}
my code: (spread among a few headers and source files)
class ThirdPartyObjectWrapper
{
private:
ThirdPartyObject myObject;
}
class Owner
{
public:
Owner() {}
void initialize();
private:
ThirdPartyObjectWrapper myWrappedObject;
};
void Owner::initialize()
{
//not weird:
//ThirdPartyObjectWrapper testWrappedObject;
//weird:
//ThirdPartyObject testObject;
}
ThirdPartyObject is, naturally, an object defined by a third party (static precompiled) library I'm using. ThirdPartyObjectWrapper is a convenience class that eliminates a lot of boiler-plating for working with ThirdPartyObject. Owner::initialize() is called shortly after an instance of Owner is created.
Notice the two lines I have labeled as "weird" and "not weird" in Owner::initialize(). All I'm doing here is creating a couple of objects on the stack with their default constructors. I don't do anything with those objects and they get destroyed when they leave scope. There are no build or linker errors involved, I can uncomment either or both lines and the code will build.
However, if I uncomment "weird" then I get a segmentation fault, and (here's why I say it's weird) it's in a completely unrelated location. Not in the constructor of testObject, like you might expect, but in the constructor of Owner::myObjectWrapper::myObject. The weird line never even gets called, but somehow its presence or absence consistently changes the behavior of an unrelated function in a static library.
And consider that if I only uncomment "not weird" then it runs fine, executing the ThirdPartyObject constructor twice with no problems.
I've been working with C++ for a year so it's not really a surprise to me that something like this would be able happen, but I've about reached the limit of my ability to figure out how this gotcha is happening. I need the input of people with significantly more C++ experience than me.
What are some possibilities that could cause this to happen? What might be going on here?
Also, note, I'm not asking for advice on how to get rid of the segfault. Segfaults I understand, I suspect it's a simple race condition. What I don't understand is the behavior gotcha so that's the only thing I'm trying to get answers for.
My best lead is that it has to do with headers and macros. The third party library actually already has a couple of gotchas having to do with its headers and macros, for example the code won't build if you put your #include's in the wrong order. I'm not changing any #include's so strictly this still wouldn't make sense, but perhaps the compiler is optimizing includes based on the presence of a symbol here? (it would be the only mention of ThirdPartyObject in the file)
It also occurs to me that because I am using Qt, it could be that the Meta-Object Compiler (which generates supplementary code between compilations) might be involved in this. Very unlikely, as Qt has no knowledge of the third party library where the segfault is happening and this is not actually relevant to the functionality of the MOC (since at no point ThirdPartyObject is being passed as an argument), but it's worth investigating at least.
Related questions have suggested that it could be a relatively small buffer overflow or race condition that gets tripped up by compiler optimizations. Continuing to investigate but all leads are welcome.
Typical culprits:
Some build products are stale and not binary-compatible.
You have a memory bug that has corrupted the state of your process, and are seeing a manifestation of that in a completely unrelated location.
Fixing #1 is trivial: delete the build folder and build again. If you're not building in a shadow build folder, you've set yourself up for failure, hopefully you now know enough to stop :)
Fixing #2 is not trivial. View manual memory management and possible buffer overflows with suspicion. Use modern C++ programming techniques to leverage the compiler to help you out: store things by value, use containers, use smart pointers, and use iterators and range-for instead of pointers. Don't use C-style arrays. Abhor C-style APIs of the (Type * array, int count) kind - they don't belong in C++.
What fun. I've boiled this down to the bottom.
//#include <otherthirdpartyheader.h>
#include <thirdpartyobject.h>
int main(...)
{
ThirdPartyObject test;
return 0;
}
This code runs. If I uncomment the first include, delete all build artifacts, and build again, then it breaks. There's obviously a header/macro component, and probably some kind of compiler-optimization component. But, get this, according to the library documentation it should give me a segfault every time because I haven't been doing a required initialization step. So the fact that it runs at all indicates unexpected behavior.
I'm chalking this up to library-specific issues rather than broad spectrum C++ issues. I'll be contacting the vendor going forward from here, but thanks everyone for the help.

gdb - re-setting a const

I have
const int MAX_CONNECTIONS = 500;
//...
if(clients.size() < MAX_CONNECTIONS) {
//...
}
I'm trying to find the "right" choice for MAX_CONNECTIONS. So I fire up gdb and set MAX_CONNECTIONS = 750. But it seems my code isn't responding to this change. I wonder if it's because the const int was resolved at compile time even though it wound up getting bumped at runtime. Does this sound right, and, using GDB is there any way I can bypass this effect without having to edit the code in my program? It takes a while just to warm up to 500.
I suspect that the compiler, seeing that the variable is const, is inlining the constant into the assembly and not having the generated code actually read the value of the MAX_CONNECTIONS variable. The C++ spec is worded in a way where if a variable of primitive type is explicitly marked const, the compiler can make certain assumptions about it for the purposes of optimization, since any attempt to change that constant is either (1) illegal or (2) results in undefined behavior.
If you want to use GDB to do things like this, consider marking the variable volatile rather than const to indicate to the compiler that it shouldn't optimize it. Alternatively, have this information controlled by some other data source (say, a configuration option inside a file) so that you aren't blasting the program's memory out from underneath it in order to change the value.
Hope this helps!
By telling it it's const, you're telling the compiler it has freedom to not load the value, but to build it directly into the code when possible. An allocated copy may still exist for those times when the particular instructions chosen need to load a value rather than having an immediate value, or it could be omitted by the compiler as well. That's a bit of a loose answer short on standardese, but that's the basic idea.
As this post is quite old, my answer is more like a reference to my future self. Assuming you compiled in debug mode, running the following expression in the debugger (lldb in my case) works:
const_cast<int&>(MAX_CONNECTIONS) = 750
In case you have to change the constant often, e.g. in a loop, set a breakpoint and evaluate the expression each time the breakpoint is hit
breakpoint set <location>
breakpoint command add <breakpoint_id>
const_cast<int&>(MAX_CONNECTIONS) = 750
DONE

Error in release build only, copying structure from std::vector

I'm having a weird problem trying to get my release build working in Xcode 4.2 using llvm. I've turned off all optimisation settings for the release scheme, and as far as I can tell the release build matches all the settings of the debug build. Regardless of this, the following problem occurs when working with some structures from Box2D, a physics library - but I am unsure if the problem has anything todo specifically with that.
b2CircleShape* circleShape = new b2CircleShape();
circleShape->m_p.Set(0,0);
circleShape->m_radius = m_radius;
b2FixtureDef fixture;
fixture.shape = circleShape;
fixture.density = m_density;
m_fixtureDefs.push_back(fixture); // std::vector
b2FixtureDef fix2 = fixture;
b2FixtureDef fix3 = m_fixtureDefs[0] // EXC_BAD_ACCESS
When I remove all instances of access to m_fixtures, no problems occur. When I run in the development scheme no errors occur. I am really, really confused, if someone could point me in the right direction for errors to look for it would be much appreciated
EDIT:
More interesting stuff
for (vector<b2FixtureDef>::iterator i = m_fixtureDefs.begin() ; i != m_fixtureDefs.end(); ++i)
{
}
This appears to loop forever, making me very confused. It seems like the structure m_fixturesDef has some kind of problem, but I have no idea why whatever weird corruption is going on is only occurring in this particular variable.
By default POD objects are not initializes in C++ so their initial value (unless explicitly initialized) are inherently random.
When you build in debug mode the compiler usually inserts extra initialization code to zero out values. Thus you can easily see different behaviors between debug and release builds.
A quick way to find this kind of problem is to check your compiler warnings; see if you are using a variable before it has been initialization (you may need to turn on the warnings) or something similar.
Note: You can fix a lot of serious problems by making sure your code compiles with zero warnings with the warning level as high as reasonable (usually a step above default). (a warning is really a logical error in your code).

Why is passing a char* to this method failing?

I have a C++ method such as:
bool MyClass::Foo(char* charPointer)
{
return CallExternalAPIFunction(charPointer);
}
Now I have some static method somewhere else such as:
bool MyOtherClass::DoFoo(char* charPointer)
{
return _myClassObject.Foo(charPointer);
}
My issue is that my code breaks at that point. It doesn't exit the application or anything, it just never returns any value. To try and pinpoint the issue, I stepped through the code using the Visual Studio 2010 debugger and noticed something weird.
When I step into the DoFoo function and hover over charPointer, I actually see the value it was called with (an IP address string in this case). However, when I step into Foo and hover over charPointer, nothing shows up and the external API function call never returns (it's like it's just stepped over) and my program resumes it's execution after the call to DoFoo.
I also tried using the Exception... feature of the VS debugger (to pick up first chance exceptions) but it never picked up anything.
Has this ever happened to anyone? Am I doing something wrong?
Thank you.
You need to build the project with Debug settings. Release settings mean that optimizations are enabled and optimizations make debugging a beating.
Without optimizations, there is a very close correspondence between statements in your C++ code and blocks of machine code in the program. The program is slower (often far slower) but it's easier to debug because you can observe what each statement does.
The optimizer reorders your code, eliminates variables, inlines functions, unrolls loops, and does all sorts of other things to make the program fast. The program is faster (often much faster) but it's far more difficult to debug because the correspondence between the statements in your C++ code and the instructions in the machine code is no longer there.

What is causing boost::lower to fail an is_singular assertion?

I am occasionally getting odd behavior from boost::lower, when called on a std::wstring. In particular, I have seen the following assertion fail in a release build (but not in a debug build):
Assertion failed: !is_singular(), file C:\boost_1_40_0\boost/range/iterator_range.hpp, line 281
I have also seen what appear to be memory errors after calling boost::to_lower in contexts such as:
void test(const wchar_t* word) {
std::wstring buf(word);
boost::to_lower(buf);
...
}
Replacing the calls boost::tolower(wstr) with std::transform(wstr.begin(), wstr.end(), wstr.begin(), towlower) appears to fix the problem; but I'd like to know what's going wrong.
My best guess is that perhaps the problem has something to do with changing the case of unicode characters -- perhaps the encoding size of the downcased character is different from the encoding size of the source character?
Does anyone have any ideas what might be going on here? It might help if I knew what "is_singular()" means in the context of boost, but after doing a few google searches I wasn't able to find any documentation for it.
Relevant software versions: Boost 1.40.0; MS Visual Studio 2008.
After further debugging, I figured out what was going on.
The cause of my trouble was that one project in the solution was not defining NDEBUG (despite being in release mode), while all the other modules were. Boost allocates some extra fields in its data structures, which it uses to store debug information (such as whether a data structure has been initialized). If module A has debugging turned off, then it will create data structures that don't contain those fields. Then when module B, which has debugging turned on, gets its hands on that data structure, it will try to check those fields (which were not allocated), resulting in random memory errors.
Defining NDEBUG in all projects in the solution fixed the problem.
An iterator range should only be singular if it's been constructed with the default constructor (stores singular iterators, i.e doesn't represent a range). As it's rather hard to believe that the boost's to_lower function manages to create a singular range, it suggests that the problem might also be elsewhere (a result of some undefined behavior, such as using uninitialized variables which might be initialized to some known value in debug builds).
Read more on Heisenbugs.