I've sort of gone around the houses here, and I'd thought I'd found a solution. It certainly seems to correctly identify the problems I know about, but also leads to unexplained crashes in about half of all system test cases.
The problem is that our code needs to call client code as a dll. We have control over our code, but not the clients', and experience has shown that their code isn't always flawless. I have protected against segmentation faults by exiting the program with a clear message of what might have been going wrong, but I've also had a few divide-by-zero exceptions coming from the clients' code, which I would like to identify and then exit.
What I've been wanting to do is:
Just before running the clients' DLL, switch on floating-point introspection.
Run client code.
Check for any problems.
Switch off introspection for speed.
There are, in theory, several ways of doing this, but many don't seem to work in VS2010.
I have been trying to use the float_control pragma:
#pragma float_control(except, on, push)
// run client code
#pragma float_control(pop)
__asm fwait; // This forces the floating point unit to synchronise
if (_statusfp() & _SW_ZERODIVIDE)
{
// abort the program
}
This should be OK in theory, and in practice it works well 50% of the time.
I'm thinking that the problem might be that the float_control setting stays on and causes problems elsewhere in the code.
According to microsoft.com:
"The /fp:precise, /fp:fast, /fp:strict and /fp:except switches control
floating-point semantics on a file-by-file basis. The float_control
pragma provides such control on a function-by-function basis."
However, during compilation I get the warning:
warning C4177: #pragma 'float_control' should only be used at global
scope or namespace scope
Which on the face of it is a direct contradiction.
So my question is:
Is the documentation correct, or is the warning (I'm betting on the warning)?
Is there a reliable and safe way of doing this?
Should I be doing this at all, or is it just too dangerous?
You tried
#pragma float_control(except, on, push)
// run client code
#pragma float_control(pop)
That's not how it works. It's a compiler directive, and it means
#pragma float_control(except, on, push)
// This entire function is compiled with float_control exceptions on.
// Therefore, the pragma has to appear outside the function, at global scope.
#pragma float_control(pop)
And of course, this setting affects only the function(s) being compiled, not any functions that they may call, such as your clients'. There is no way that a #pragma can change already compiled code.
So, the answers:
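Both are correct: the pragma controls semantics function by function, but it must be written at global or namespace scope, outside the function.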
Yes, _controlfp_s
You're missing the SSE2 status, so it's at least incomplete
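The clear-call-check pattern from the question can be sketched with the standard `<cfenv>` status flags. This swaps in the portable standard-library interface rather than the MSVC-specific `_controlfp_s` named above, so treat it as an illustration of the idea, not the answerer's exact method. The status flags are sticky bits set at run time, so the check does not depend on how the called code was compiled. `clientDividesByZero`, `clientBehaves` and `raisesDivideByZero` are hypothetical names standing in for the client entry points and the wrapper.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical stand-in for a client routine that divides by zero.
// volatile keeps the compiler from folding the division at compile time.
inline double clientDividesByZero() {
    volatile double numerator = 1.0;
    volatile double denominator = 0.0;
    return numerator / denominator;
}

// Hypothetical stand-in for a well-behaved client routine.
inline double clientBehaves() {
    volatile double numerator = 1.0;
    volatile double denominator = 4.0;
    return numerator / denominator;
}

// Runs fn and reports whether it raised a floating-point divide-by-zero.
// The flags are cleared before the call and tested afterwards; no trap
// handler is installed, so the client code itself runs undisturbed.
template <typename Fn>
bool raisesDivideByZero(Fn fn) {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double result = fn();  // keep the call from being optimized away
    (void)result;
    return std::fetestexcept(FE_DIVBYZERO) != 0;
}

// Usage sketch: the faulty routine is flagged, the good one is not.
inline void demoDivideByZeroCheck() {
    assert(raisesDivideByZero(clientDividesByZero));
    assert(!raisesDivideByZero(clientBehaves));
}
```

Because the flags are only inspected after the call returns, this detects the problem without changing how the client code executes; aborting with a clear message is then an ordinary `if` in the caller.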
Using Visual Studio 2019 Professional on Windows 10 x64. I have several C++ DLL projects, some of which are multi-threaded. I'm using CRITICAL_SECTION objects for thread safety.
In DLL1:
CRITICAL_SECTION critDLL1;
InitializeCriticalSection(&critDLL1);
In DLL2:
CRITICAL_SECTION critDLL2;
InitializeCriticalSection(&critDLL2);
When I use critDLL1 with EnterCriticalSection or LeaveCriticalSection everything is fine in both _DEBUG and NDEBUG mode. But when I use critDLL2, I get an access violation in 'ntdll.dll' in NDEBUG (though not in _DEBUG).
After popping up message boxes in NDEBUG mode, I was eventually able to track the problem down to the first use of EnterCriticalSection.
What might be causing the CRITICAL_SECTION to fail in one project but work in others? The MSDN page was not helpful.
UPDATE 1
After comparing project settings of DLL1 (working) and DLL2 (not working), I've accidentally got DLL2 working. I've confirmed this by reverting to an earlier version (which crashes) and then making the project changes (no crash!).
This is the setting:
Project Properties > C/C++ > Optimization > Whole Program Optimization
Set this to Yes (/GL) and my program crashes. Change that to No and it works fine. What does the /GL switch do and why might it cause this crash?
UPDATE 2
The excellent answer from @Acorn and the comment from @RaymondChen provided the clues to track down and then resolve the issue. There were two problems (both programmer errors).
PROBLEM 1
The assumption of Whole Program Optimization (WPO) is that the MSVC compiler is compiling "the whole program". This is an incorrect assumption for my DLL project, which internally consumes a 3rd party library and is in turn consumed by an external application written in Delphi. This setting is set to Yes (/GL) by default but should be No. This feels like a bug in Visual Studio, but in any case, the programmer needs to be aware of this. I don't know all the details of what WPO is meant to do, but at least for DLLs meant to be consumed by other applications, the default should be changed.
PROBLEM 2
Serious programmer error. The error was in a call into a 3rd party library, which returned a 128-byte ASCII string:
// Before
// m_config::acSerial defined as "char acSerial[21]"
(void) m_pLib->GetPara(XPARA_PRODUCT_INFO, &m_config.acSerial[0]);
EnterCriticalSection(&crit); // Crash!
// After
#define SERIAL_LEN 20
// m_config::acSerial defined as "char acSerial[SERIAL_LEN+1]"
//...
char acSerial[128];
(void) m_pLib->GetPara(XPARA_PRODUCT_INFO, &acSerial[0]);
strncpy(m_config.acSerial, acSerial, SERIAL_LEN); // copy at most SERIAL_LEN characters
m_config.acSerial[SERIAL_LEN] = '\0'; // strncpy does not terminate on truncation
EnterCriticalSection(&crit); // Works!
The error, now obvious, is that the 3rd party library did not copy the serial number of the device into the char* I provided...it copied 128 bytes into my char* stomping over everything contiguous in memory after acSerial. This wasn't noticed before because m_pLib->GetPara(XPARA_PRODUCT_INFO, ...) was one of the first calls into the 3rd party library and the rest of the contiguous data was mostly NULL at that point.
The problem was never to do with the CRITICAL_SECTION. My thanks to Acorn and RaymondChen ... sanity has been restored to this corner of the universe.
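The staging-buffer fix can be sketched as a small bounded-copy helper. This is a minimal illustration under assumptions: `copyBounded` and the `Config` struct are hypothetical names, and `sentinel` merely plays the role of whatever happens to sit in memory after `acSerial`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies at most dstSize - 1 characters from a NUL-terminated src into dst
// and always terminates dst. Returns false if src had to be truncated.
inline bool copyBounded(char* dst, std::size_t dstSize, const char* src) {
    assert(dst != nullptr && src != nullptr && dstSize > 0);
    const std::size_t srcLen = std::strlen(src);
    const std::size_t n = srcLen < dstSize - 1 ? srcLen : dstSize - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return n == srcLen;
}

// Hypothetical stand-in for the config object in the question.
struct Config {
    char acSerial[21];
    int sentinel;  // stands in for the data that was being overwritten
};
```

The caller lets the library write into a buffer that is large enough for its worst case (128 bytes here), then copies only what fits into the destination field. The return value makes silent truncation visible.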
If your program crashes under WPO (an optimization that assumes that whatever you are compiling is the entire program), then either that assumption is incorrect or, even if it is correct, the optimizer is now exploiting some undefined behavior that it did not exploit before the optimization was applied.
In general, avoid enabling optimizations unless you are really sure you know you meet their requirements.
For further analysis, please provide a minimal reproducible example (MRE).
Consider this mock-up of my situation.
in an external header:
class ThirdPartyObject
{
...
};
my code: (spread among a few headers and source files)
class ThirdPartyObjectWrapper
{
private:
ThirdPartyObject myObject;
};
class Owner
{
public:
Owner() {}
void initialize();
private:
ThirdPartyObjectWrapper myWrappedObject;
};
void Owner::initialize()
{
//not weird:
//ThirdPartyObjectWrapper testWrappedObject;
//weird:
//ThirdPartyObject testObject;
}
ThirdPartyObject is, naturally, an object defined by a third party (static precompiled) library I'm using. ThirdPartyObjectWrapper is a convenience class that eliminates a lot of boiler-plating for working with ThirdPartyObject. Owner::initialize() is called shortly after an instance of Owner is created.
Notice the two lines I have labeled as "weird" and "not weird" in Owner::initialize(). All I'm doing here is creating a couple of objects on the stack with their default constructors. I don't do anything with those objects and they get destroyed when they leave scope. There are no build or linker errors involved, I can uncomment either or both lines and the code will build.
However, if I uncomment "weird" then I get a segmentation fault, and (here's why I say it's weird) it's in a completely unrelated location. Not in the constructor of testObject, like you might expect, but in the constructor of Owner::myWrappedObject::myObject. The weird line never even gets called, but somehow its presence or absence consistently changes the behavior of an unrelated function in a static library.
And consider that if I only uncomment "not weird" then it runs fine, executing the ThirdPartyObject constructor twice with no problems.
I've been working with C++ for a year, so it's not really a surprise to me that something like this could happen, but I've about reached the limit of my ability to figure out how this gotcha is happening. I need the input of people with significantly more C++ experience than me.
What are some possibilities that could cause this to happen? What might be going on here?
Also, note, I'm not asking for advice on how to get rid of the segfault. Segfaults I understand, I suspect it's a simple race condition. What I don't understand is the behavior gotcha so that's the only thing I'm trying to get answers for.
My best lead is that it has to do with headers and macros. The third party library actually already has a couple of gotchas having to do with its headers and macros, for example the code won't build if you put your #include's in the wrong order. I'm not changing any #include's so strictly this still wouldn't make sense, but perhaps the compiler is optimizing includes based on the presence of a symbol here? (it would be the only mention of ThirdPartyObject in the file)
It also occurs to me that because I am using Qt, it could be that the Meta-Object Compiler (which generates supplementary code between compilations) might be involved in this. Very unlikely, as Qt has no knowledge of the third party library where the segfault is happening and this is not actually relevant to the functionality of the MOC (since at no point ThirdPartyObject is being passed as an argument), but it's worth investigating at least.
Related questions have suggested that it could be a relatively small buffer overflow or race condition that gets tripped up by compiler optimizations. Continuing to investigate but all leads are welcome.
Typical culprits:
Some build products are stale and not binary-compatible.
You have a memory bug that has corrupted the state of your process, and are seeing a manifestation of that in a completely unrelated location.
Fixing #1 is trivial: delete the build folder and build again. If you're not building in a shadow build folder, you've set yourself up for failure, hopefully you now know enough to stop :)
Fixing #2 is not trivial. View manual memory management and possible buffer overflows with suspicion. Use modern C++ programming techniques to leverage the compiler to help you out: store things by value, use containers, use smart pointers, and use iterators and range-for instead of pointers. Don't use C-style arrays. Abhor C-style APIs of the (Type * array, int count) kind - they don't belong in C++.
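As a rough illustration of that advice, the sketch below replaces a `(Type * array, int count)` signature with a container and owns a heap object through a smart pointer. `sum`, `Sensor` and `makeSensor` are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// C-style signature the answer warns against: nothing ties count to array.
//   double sum(const double* array, int count);

// Container-based version: the size travels with the data.
inline double sum(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) {  // range-for: no index to get wrong
        total += v;
    }
    return total;
}

// Hypothetical resource that stores its samples by value.
struct Sensor {
    std::vector<double> samples;
};

// The unique_ptr releases the Sensor automatically when it goes out of scope.
inline std::unique_ptr<Sensor> makeSensor() {
    auto sensor = std::make_unique<Sensor>();
    sensor->samples = {1.0, 2.0, 3.5};
    return sensor;
}

// Usage sketch.
inline void demoSum() {
    const auto sensor = makeSensor();
    assert(sum(sensor->samples) == 6.5);
}
```

None of this prevents a third-party library from corrupting memory, but it removes the most common ways for your own code to do so.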
What fun. I've boiled this down to the bottom.
//#include <otherthirdpartyheader.h>
#include <thirdpartyobject.h>
int main(...)
{
ThirdPartyObject test;
return 0;
}
This code runs. If I uncomment the first include, delete all build artifacts, and build again, then it breaks. There's obviously a header/macro component, and probably some kind of compiler-optimization component. But, get this, according to the library documentation it should give me a segfault every time because I haven't been doing a required initialization step. So the fact that it runs at all indicates unexpected behavior.
I'm chalking this up to library-specific issues rather than broad spectrum C++ issues. I'll be contacting the vendor going forward from here, but thanks everyone for the help.
I have a C++ method such as:
bool MyClass::Foo(char* charPointer)
{
return CallExternalAPIFunction(charPointer);
}
Now I have some static method somewhere else such as:
bool MyOtherClass::DoFoo(char* charPointer)
{
return _myClassObject.Foo(charPointer);
}
My issue is that my code breaks at that point. It doesn't exit the application or anything, it just never returns any value. To try and pinpoint the issue, I stepped through the code using the Visual Studio 2010 debugger and noticed something weird.
When I step into the DoFoo function and hover over charPointer, I actually see the value it was called with (an IP address string in this case). However, when I step into Foo and hover over charPointer, nothing shows up and the external API function call never returns (it's as if it's just stepped over) and my program resumes its execution after the call to DoFoo.
I also tried using the Exception... feature of the VS debugger (to pick up first chance exceptions) but it never picked up anything.
Has this ever happened to anyone? Am I doing something wrong?
Thank you.
You need to build the project with Debug settings. Release settings mean that optimizations are enabled and optimizations make debugging a beating.
Without optimizations, there is a very close correspondence between statements in your C++ code and blocks of machine code in the program. The program is slower (often far slower) but it's easier to debug because you can observe what each statement does.
The optimizer reorders your code, eliminates variables, inlines functions, unrolls loops, and does all sorts of other things to make the program fast. The program is faster (often much faster) but it's far more difficult to debug because the correspondence between the statements in your C++ code and the instructions in the machine code is no longer there.
What approaches can you use when:
you work with several (e.g. 1-3) other programmers on a small C++ project, and you use a single repository
you create a class, declare its methods
you don't have time to implement all methods yet
you don't want other programmers to use your code yet (because it's not implemented yet); or don't want to use not-yet-implemented parts of the code
you don't have the time or opportunity to tell your co-workers about all such not-yet-implemented stuff
when your co-workers use your not-yet-implemented code you want them to immediately realize that they shouldn't use it yet - if they get an error you don't want them to wonder what's wrong, search for potential bugs etc.
The simplest answer is to tell them. Communication is key whenever you're working with a group of people.
A more robust (and probably the best) option is to create your own branch to develop the new feature and only merge it back in when it's complete.
However, if you really want your methods implemented in the main source tree but don't want people using them, stub them out with an exception or assertion.
I actually like the concept from .NET of a NotImplementedException. You can easily define your own, deriving from std::exception and overriding what() to return "not implemented".
It has the advantages of:
easily searchable.
allows current & dependent code to compile
can execute up to the point the code is needed, at which point, you fail (and you immediately have an execution path that demonstrates the need).
when it fails, it fails to a known state (so long as you're not swallowing all exceptions wholesale), rather than relying upon indeterminate state.
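A minimal sketch of such an exception and a stubbed method follows. `Report`, `pageCount` and `exportPdf` are hypothetical names used only to show the pattern.

```cpp
#include <cassert>
#include <cstring>
#include <exception>

// Hypothetical exception type modelled on .NET's NotImplementedException.
class NotImplementedException : public std::exception {
public:
    const char* what() const noexcept override { return "not implemented"; }
};

class Report {
public:
    int pageCount() const { return 1; }                          // implemented
    void exportPdf() const { throw NotImplementedException(); }  // stub
};

// Usage sketch: dependent code compiles and runs until it reaches the stub.
inline void demoNotImplemented() {
    Report report;
    assert(report.pageCount() == 1);
    bool caught = false;
    try {
        report.exportPdf();
    } catch (const NotImplementedException& e) {
        caught = std::strcmp(e.what(), "not implemented") == 0;
    }
    assert(caught);
}
```

Searching the source tree for `NotImplementedException` then lists every stub that still needs work.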
You should either not commit the code at all or, better yet, commit it to a development branch so that it is at least off your machine in case of catastrophic failure of your box.
This is what I do at work with my git repo. I push my work at the end of the day to a remote repo (not the master branch). My coworker is aware that these branches are super duper unstable and not to be touched with a ten foot pole unless he really likes to have broken branches.
Git is super handy for this situation, as are, I imagine, other DVCSs with cheap branching. Doing this in SVN or, worse yet, CVS would mean pain and suffering.
I would not check it into the repository.
Declare it. Don't implement it.
When a programmer tries to call the unimplemented part of the code, the linker complains, which is a clear hint to the programmer.
#include <iostream>

class myClass
{
    int i;
public:
    void print(); // Not yet implemented
    void display()
    {
        std::cout << "I am implemented" << std::endl;
    }
};
int main()
{
    myClass var;
    var.display();
    var.print(); // **This line gives the linking error and hints user at early stage.**
    return 0;
}
Assert is the best way. Assert that doesn't terminate the program is even better, so that a coworker can continue to test his code without being blocked by your function stubs, and he stays perfectly informed about what's not implemented yet.
In case your IDE doesn't support smart asserts or persistent breakpoints, here is a simple implementation (C++):
#ifdef _DEBUG
// __emit__ is a Borland C++ pseudo-function; MSVC offers __debugbreak() instead
// 0xCC - int 3 - breakpoint
// 0x90 - nop
#define DebugInt3 __emit__(0x90CC)
#define DEBUG_ASSERT(expr) ((expr)? ((void)0): (DebugInt3) )
#else
#define DebugInt3
#define DEBUG_ASSERT(expr) assert(expr)
#endif
//usage
void doStuff()
{
//here the debugger will stop if the function is called
//and your coworker will read your message
DEBUG_ASSERT(0); //TODO: will be implemented on the next week;
//postcondition number 2 of the doStuff is not satisfied;
//proceed with care /Johny J.
}
Advantages:
code compiles and runs
a developer gets a message about what's not implemented if and only if he runs into your code during his testing, so he'll not get overwhelmed with unnecessary information
the message points to the related code (not to exception catch block or whatever). Call stack is available, so one can trace down the place where he invokes unfinished piece of code.
a developer after receiving the message can continue his test run without restarting the program
Disadvantages:
To disable a message, one has to comment out a line of code. Such a change can sneak into a commit.
P.S. Credits for initial DEBUG_ASSERT implementation go to my co-worker E. G.
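For toolchains without a breakpoint instruction hook, the same non-terminating idea can be approximated portably by logging and continuing. This is a different technique from the breakpoint macro above, offered as a sketch: `SOFT_ASSERT` and `softAssertFailures` are hypothetical names, and the message goes to stderr instead of stopping in the debugger.

```cpp
#include <cassert>
#include <cstdio>

// Counts soft-assert failures so a test run can report them at the end.
inline int& softAssertFailures() {
    static int count = 0;
    return count;
}

// Hypothetical SOFT_ASSERT: reports the failed expression with its location
// and lets execution continue instead of terminating like assert().
#define SOFT_ASSERT(expr)                                                    \
    ((expr) ? (void)0                                                        \
            : (void)(++softAssertFailures(),                                 \
                     std::fprintf(stderr, "%s:%d: soft assert failed: %s\n", \
                                  __FILE__, __LINE__, #expr)))

inline int doStuff() {
    // The message is part of the expression, so it appears in the log line.
    SOFT_ASSERT(false && "doStuff: postcondition 2 is not implemented yet");
    return 42;  // execution continues past the stub warning
}
```

A co-worker who calls `doStuff()` sees the file, line and message once per call and can keep testing; the counter gives a cheap way to fail a test run at the end if any stub was hit.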
You can use pure virtual functions (= 0;) for inherited classes, or more commonly, declare them but not define them. A call to a function with no definition compiles but fails at link time.
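The pure virtual variant might look like the sketch below. `Shape` and `Circle` are hypothetical classes chosen only for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical interface: "= 0" leaves render() unimplemented here,
// so Shape itself cannot be instantiated.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string render() const = 0;
};

// A derived class must override every pure virtual before it can be created.
class Circle : public Shape {
public:
    std::string render() const override { return "circle"; }
};

// Shape s;  // would not compile: Shape is abstract
static_assert(std::is_abstract<Shape>::value, "Shape must stay abstract");
static_assert(!std::is_abstract<Circle>::value, "Circle is complete");
```

Here the compiler, not the linker, rejects any attempt to use the unfinished class directly, which tends to produce a clearer error message.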
General Question which may be of interest to others:
I ran into what I believe is a C++ compiler optimization problem (Visual Studio 2005) with a switch statement. What I want to know is whether there is any way to satisfy my curiosity and find out what the compiler is trying, but failing, to do. Is there any log I can spend some time (probably too much time) deciphering?
My specific problem for those curious enough to continue reading - I'd like to hear your thoughts on why I get problems in this specific case.
I've got a tiny program with about 500 lines of code containing a switch statement. Some of its cases contain some assignment of pointers.
double *ptx, *pty, *ptz;
double **ppt = new double*[3];
//some code initializing etc ptx, pty and ptz
ppt[0]=ptx;
ppt[1]=pty; //<----- this statement causes problems
ppt[2]=ptz;
The middle statement seems to hang the compiler. The compilation never ends. OK, I didn't wait for longer than it took to walk down the hall, talk to some people, get a cup of coffee and return to my desk, but this is a tiny program which usually compiles in less than a second. Remove a single line (the one indicated in the code above) and the problem goes away, as it also does when removing the optimization (on the whole program or using #pragma on the function).
Why does this middle line cause a problem? The compiler's optimizer doesn't like pty.
There is no difference in the vectors ptx, pty, and ptz in the program. Everything I do to pty I do to ptx and ptz. I tried swapping their positions in ppt, but pty was still the line causing a problem.
I'm asking about this because I'm curious about what is happening. The code is rewritten and is working fine.
Edit:
Almost two weeks later, I check out the closest version to the code I described above and I can't edit it back to make it crash. This is really annoying, embarrassing and irritating. I'll give it another try, but if I don't get it breaking anytime soon I guess this part of the question is obsolete and I'll remove it. Really sorry for taking your time.
If you need to make this code compilable without changing it too much consider using memcpy where you assign a value to ppt[1]. This should at least compile fine.
However, your problem looks more like something caused by another part of the source code.
What you can also try is to put this stuff:
ppt[0]=ptx;
ppt[1]=pty; //<----- this statement causes problems
ppt[2]=ptz;
in another function.
This should also help the compiler a bit to avoid the path it is taking to compile your code.
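Moving the three stores into their own function could look like this. `fillPointTable` is a hypothetical name, and whether the split actually changes what the optimizer does with the surrounding switch is an assumption that would need to be tried.

```cpp
#include <cassert>

// Hypothetical helper: moving the three stores out of the large switch
// gives the optimizer a small, separate function to work on.
inline void fillPointTable(double** ppt, double* ptx, double* pty, double* ptz) {
    assert(ppt != nullptr);
    ppt[0] = ptx;
    ppt[1] = pty;
    ppt[2] = ptz;
}
```

Inside the switch, the three assignments then become a single call such as `fillPointTable(ppt, ptx, pty, ptz);`.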
Did you try renaming pty to something else (i.e. pt_y)? I encountered a couple of times (i.e. with a variable "rect2") the problem that some names seem to be "reserved".
It sounds like a compiler bug. Have you tried re-ordering the lines? e.g.,
ppt[1]=pty;
ppt[0]=ptx;
ppt[2]=ptz;
Also, what happens if you juggle the values that are assigned (which will introduce bugs in your code, but may indicate whether it's the pointer or the array that's the issue), e.g.:
ppt[0] = pty;
ppt[1] = ptz;
ppt[2] = ptx;
(or similar).
It's probably due to your declaration of ptx, pty and ptz with them being optimised out to use the same address. Then this action is causing your compiler problems later in your code.
Try
static double *ptx;
static double *pty;
static double *ptz;