Visual Studio C++ Runtime Issue with Multithreading on the Release Configuration - c++

When I compile with the configuration set to release (for both x86 and x64), my program fails to complete. To clarify, there are no build errors or execution errors.
After looking for a cause and solution for the issue, I found "Program only crashes as release build -- how to debug?", which proposes that it is an array issue. Though this didn't solve my problem, it gave me some insight on the matter (which I leave here for the next person).
To further muddle matters, it's only when a subroutine on the main thread has an execution time greater than about 0ms.
Here are the relevant sections of code:
// Startup Progress Bar Thread
nPC_Current = 0; // global int
nPC_Max = nPC; // global int (max value nPC_Current will reach)
DWORD myThreadID;
HANDLE progressBarHandle = CreateThread(0, 0, printProgress, &nPC_Current, 0, &myThreadID);
/* Do stuff and time how long it takes (this is what increments nPC_Current) */
// Wait for Progress Bar Thread to Terminate
WaitForSingleObject(progressBarHandle, INFINITE);
Where the offending line that my program gets stuck on is that last statement, where the program waits for the created thread to terminate:
WaitForSingleObject(progressBarHandle, INFINITE);
And here is the code for the progress bar function:
DWORD WINAPI printProgress(LPVOID lpParameter)
{
int lastProgressPercent = -1; // Only reprint bar when there is a change to display.
// Core Progress Bar Loop
while (nPC_Current <= nPC_Max)
{
// Do stuff to print a text progress bar
}
return 0;
}
Where the 'Core' while loop generally won't get a single iteration if the execution time of the measured subroutine is about 0ms. To clarify this, if the execution time of the timed subroutine is about 0ms, nPC_Current will be greater than nPC_Max before printProgress executes once. This means that the thread will terminate before the main thread begins to wait for it.
If anyone would help with this, or provide some further insight on the matter, that would be fantastic as I'm having quite some trouble figuring this out.
Thanks!

My guess would be that you forgot to declare your shared global variables volatile (nPC_Current specifically). Since the thread function itself never modifies nPC_Current, in the release version of the code the compiler optimized your progress bar loop into an infinite loop with a never-changing value of nPC_Current.
This is why your progress bar never updates from 0% value in release version of the code and this is why your progress bar thread never terminates.
P.S. Also, it appears that you originally intended to pass your nPC_Current counter to the thread function as a thread parameter (judging by your CreateThread call). However, in the thread function you ignore the parameter and access nPC_Current directly as a global variable. It might be a better idea to stick to the original idea of passing and accessing it as a thread parameter.
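As a portable alternative to volatile, here is a minimal sketch of the same polling loop using std::atomic and std::thread instead of the Win32 calls. The names watch_progress and run_progress_demo are hypothetical, and the exit condition (stop once the counter reaches the maximum) is an assumption, not the asker's exact loop.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Polls the shared counter until it reaches max; returns the last percentage seen.
// Because `current` is atomic, the compiler cannot hoist the load out of the loop.
int watch_progress(const std::atomic<int>& current, int max) {
    assert(max > 0);  // avoids division by zero below
    int lastProgressPercent = -1;
    for (;;) {
        const int value = current.load(std::memory_order_acquire);
        const int percent = value * 100 / max;
        if (percent != lastProgressPercent) {
            lastProgressPercent = percent;  // a real program would redraw the bar here
        }
        if (value >= max) {
            break;
        }
        std::this_thread::yield();
    }
    return lastProgressPercent;
}

// Hypothetical driver: the calling thread plays the role of the timed subroutine.
int run_progress_demo(int steps) {
    std::atomic<int> current{0};
    int last = -1;
    std::thread progress([&] { last = watch_progress(current, steps); });
    for (int i = 0; i < steps; ++i) {
        current.fetch_add(1, std::memory_order_release);  // atomic increment
    }
    progress.join();  // returns promptly even if the work took about 0ms
    return last;
}
```

The watcher terminates whether it starts before or after the work finishes, because it always re-reads the counter.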

The number one rule in writing software is:
Leave nothing to chance; check for every single possible error, everywhere.
Note: this is the number one rule not when troubleshooting software; when there is trouble, it is already too late; this is the number one rule when writing software, that is, before there is even a need to troubleshoot.
There are a number of problems with your code; I cannot tell for sure that any one of those is what is causing the problem that you are experiencing, but I would be willing to bet that if you fixed those, and if you developed the mentality of fixing problems like those, then you would not have the problem you are experiencing.
The documentation for WaitForSingleObject says: "If this handle is closed while the wait is still pending, the function's behavior is undefined." However, you do not appear to be asserting that CreateThread() returned a valid handle. You are not even showing us where and how you are closing that handle. (And when you do close the handle, do you assert that CloseHandle() did not fail?)
Not only are you using global variables (which I would strongly advise against), but you also happily make a multitude of assumptions about their values, without ever asserting any one of those assumptions.
What guarantees do you have that nPC_Current is in fact less than nPC_Max at the beginning of your function?
What guarantees do you have that nPC_Current keeps incrementing over time?
What guarantees do you have that the calculation of lastProgressPercent does not in fact keep yielding -1 during your loop?
What guarantees do you have that nPC_Max is not zero? (Division by zero on a separate thread is kind of hard to catch.)
What guarantees do you have that nPC_Max does not get also modified while your thread is running?
What guarantees do you have that nPC_Current gets incremented atomically? (I hope you understand that if it does not get incremented atomically, then at the moment that you read it from another thread, you may read garbage.)
You have tagged this question with [C++], and I do see a few C++ features being used, but I do not really see any object-oriented programming. The thread function accepts an LPVOID parameter precisely so that you can pass an object to it and thus continue being object-oriented in your second thread, with all the benefits that this entails, like for example encapsulation. I would suggest that you use it.
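A rough sketch of that suggestion, written with std::thread so it stays portable: the state lives in one object, the thread procedure receives it through a void* parameter (the same shape as LPVOID), and the assumptions listed above are checked instead of trusted. ProgressState, progress_thread and run_with_state are hypothetical names, not anything from the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Everything the progress thread needs, in one place instead of in globals.
struct ProgressState {
    std::atomic<int> current{0};
    const int max;  // cannot be modified while the thread runs
    explicit ProgressState(int max_value) : max(max_value) {}
};

// Same shape as a Win32 thread procedure, minus the WINAPI/DWORD typedefs.
// Returns 0 on success and 1 when the parameter fails validation.
unsigned long progress_thread(void* param) {
    auto* state = static_cast<ProgressState*>(param);
    if (state == nullptr || state->max <= 0) {
        return 1;  // refuse to run on a null pointer or a zero/negative maximum
    }
    while (state->current.load() < state->max) {
        std::this_thread::yield();  // a real program would print the bar here
    }
    return 0;
}

// Hypothetical driver that does `max_value` units of work on the calling thread.
unsigned long run_with_state(int max_value) {
    assert(max_value >= 0);
    ProgressState state(max_value);
    unsigned long exit_code = 0;
    std::thread t([&] { exit_code = progress_thread(&state); });
    for (int i = 0; i < max_value; ++i) {
        state.current.fetch_add(1);  // atomic increment, never a torn read
    }
    t.join();
    return exit_code;
}
```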

You can use (with some limitations) breakpoints in release...
Does this part of the code:
/* Do stuff and time how long it takes (this is what increments nPC_Current) */
depend on what printProgress thread does? (If so, you have to assure time dependence, and order conveniently) Are you sure this is always incrementing nPC_Current? Is it a time dependent algorithm?
Have you tested the effect that a Sleep() has here?

Related

Why in my code cpp compare_exchange_strong updates and return false

The problem:
So I'm pretty new to CPP and I was trying to implement a simple comparison code using some atomicity concepts.
The problem is that I'm not getting a desired result, that is: even after the compare_exchange_strong function updates the value of the atomic variable (std::atomic), it returns false.
Below is the program code:
CPP:
Action::Action(Type type, Transfer *transfer)
: transfer(transfer),
type(type) {
Internal = 0;
InternalHigh = -1;
Offset = OffsetHigh = 0;
hEvent = NULL;
status = Action::Status::PENDING;
}
BOOL CancelTimeout(OnTimeoutCallback* rt)
{
auto expected = App::Action::Status::PENDING;
if (rt->action->status.compare_exchange_strong(expected, App::Action::Status::CANCEL))
{
CancelWaitableTimer(rt->hTimer);
return true;
}
return false;
}
HEADER:
struct Action : OVERLAPPED {
enum class Type : long {
SEND,
RECEIVE
};
enum class Status : long {
PENDING,
CANCEL,
TIMEOUT
};
atomic<Status> status;
Transfer *transfer = NULL;
Type type;
WSABUF *data = NULL;
OnTimeoutCallback *timeoutCallback;
Action(Type type, Transfer *transfer);
~Action();
}
Reviewing, the value of the variable rt->action->status is updated to the Action::Status::CANCEL enum, but the return of the compare_exchange_strong function is false.
See the problem in debug:
That said, the desired result is that the first breakpoint, referring to return true, would be triggered instead of return false, taking into account that it changed the value of the variable.
UPDATE: In the print I removed the first Breakpoint by accident, but I think it was understandable
Attempts already made
Modify the structure to: enum class Status : long
Modify the structure to: enum class Status : size_t
Modify the positions of all structure items
Similar topics already searched [but without success]
Link: Why does compare_exchange_strong fail with std::atomic<double>, std::atomic<float> in C++? Search term: compare_exchange_strong fail
Link: cpp compare_exchange_strong fails spuriously? Search term: compare exchange fails
Link: Don't really get the logic of std::atomic::compare_exchange_weak and compare_exchange_strong. Search term: std::atomic::compare_exchange_weak and compare_exchange_strong
Link: Does C++14 define the behavior of bitwise operators on the padding bits of unsigned int? Search term: Padding problem compare exchange
Among several other topics with different search words
Important Notes
The code is multi-threaded.
There is nowhere else in the code where the value of the atomic variable is being updated to the enum Action::Status::CANCEL.
I suspect it's something to do with padding (due to some Google searches), but as I'm new to CPP, I don't know how to modify my framework to solve the problem.
A new instance of the Action structure is generated at each request, and I also made sure that there is no concurrency occurring on the same pointer (Action*), because with each change, a new instance of the Action structure is generated.
WAIT!
It is worth mentioning that I am using Google Translate to post this question, in case something is not right, if my question is incomplete, or is formatted in an inappropriate way, please comment so I can adjust it, thank you in advance,
Lucas P.
Updates:
I was not able to replicate the problem using a minified version of the code, that being said, I have to post the entire solution (which in turn is already quite small, as it is a project for studies):
https://drive.google.com/file/d/13fP7OUCC6GeMgUtrPHSOnSGUEBwDGqBC/view?usp=sharing
TL:DR: race condition between another thread modifying it vs. the debugger getting control and reading memory of the process being debugged.
Or the value had been Action::Status::CANCEL for a long time, not expected = App::Action::Status::PENDING;, in which case a single thread running alone could have this behaviour. I assume your program expects this CAS to fail only when two threads are trying to do this around the same time, like only calling this function in the first place if something was pending.
I assume there's another thread that could call CancelTimeout at the same time, otherwise you wouldn't need an atomic RMW. (If this was the only thread that modified it, you'd just check the value, and do a pure store of the new value after a manual compare, like .store(CANCEL), perhaps with std::memory_order_release or relaxed.)
This would explain your observations:
Another thread won the race to modify rt->action->status, so its CAS returned true.
CAS_strong in this thread didn't modify the variable, and returned false.
The if body in this thread didn't run, so this thread hit your breakpoint.
After the debugger eventually got control and all threads of the process were paused, the debugger asked the kernel to read memory of the process being debugged. Since our CAS failed, the other thread's update of rt->action->status must have already happened, so the debugger will see it.
(Especially after all the time it takes for the debugger to get control, the dust will have time to settle. But assuming you're using an x86 or ARMv8, stores in one thread being visible to any other thread mean they're globally visible, to all threads; those ISAs are multi-copy atomic, no IRIW reordering.)
So CAS failed precisely because some other thread already changed the value. It wasn't changed by the thread where CAS failed. Your breakpoint will trigger whenever CAS fails, regardless of the value before or after the CAS.
For CAS_strong to actually return false and update the value, your compiler or CPU would have to be buggy. Those are possible (especially a compiler bug), but are extraordinary claims that require very carefully ruling out software causes of the same observations. That should never be your first guess when you haven't yet sorted out all the details and aren't sure you understand everything that's going on.
If you think a primitive operation didn't do what the docs said it does, it's almost always actually a bug somewhere else, or missing some possible explanation for what you're seeing that doesn't require a compiler bug to explain.
It's fine to ask a Stack Overflow question about what's going on, but keep in mind when writing your title that it's extremely unlikely that your C++ compiler is actually broken.
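To make the semantics concrete, here is a small single-threaded sketch of what compare_exchange_strong does on success and on failure. The Status enum mirrors the one in the question; try_cancel and the observed out-parameter are hypothetical names added for illustration.

```cpp
#include <atomic>
#include <cassert>

enum class Status : long { PENDING, CANCEL, TIMEOUT };

// Returns true only if this call moved the status from PENDING to CANCEL.
// On failure the atomic is left untouched and `observed` receives the value
// that was actually found there (the CAS writes it into `expected`).
bool try_cancel(std::atomic<Status>& status, Status& observed) {
    Status expected = Status::PENDING;
    const bool won = status.compare_exchange_strong(expected, Status::CANCEL);
    // A strong CAS only fails when the stored value differed from `expected`.
    assert(won || expected != Status::PENDING);
    observed = expected;  // still PENDING on success, the current value on failure
    return won;
}
```

Logging observed when the call returns false, rather than inspecting the variable in the debugger afterwards, shows which value the failing thread actually saw.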

Forcibly terminate method after a certain amount of time

Say I have a function whose prototype looks like this, belonging to class container_class:
std::vector<int> container_class::func(int param);
The function may or may not cause an infinite loop on certain inputs; it is impossible to tell which inputs will cause a success and which will cause an infinite loop. The function is in a library of which I do not have the source of and cannot modify (this is a bug and will be fixed in the next release in a few months, but for now I need a way to work around it), so solutions which modify the function or class will not work.
I've tried isolating the function using std::async and std::future, and using a while loop to constantly check the state of the thread:
container_class c; // note: "container_class c();" would declare a function, not an object
long start = get_current_time(); //get the current time in ms
auto future = std::async(&container_class::func, &c, 2);
while(future.wait_for(0ms) != std::future_status::ready) {
if(get_current_time() - start > 1000) {
//forcibly terminate future
}
sleep(2);
}
This code has many problems. One is that I can't forcibly terminate the std::future object (and the thread that it represents).
At the far extreme, if I can't find any other solution, I can isolate the function in its own executable, run it, and then check its state and terminate it appropriately. However, I would rather not do this.
How can I accomplish this? Is there a better way than what I'm doing right now?
You are out of luck, sorry.
First off, C++ doesn't even guarantee you there will be a thread for future execution. Although it would be extremely hard (probably impossible) to implement all std::async guarantees in a single thread, there is no direct prohibition of that, and also, there is certainly no guarantee that there will be a thread per async call. Because of that, there is no way to cancel the async execution.
Second, there is no such way even in the lowest level of thread implementation. While pthread_cancel exists, it won't protect you from infinite loops not visiting cancellation points, for example.
You cannot arbitrarily kill a thread in POSIX, and the C++ thread model is based on it. A process really can't be a scheduler of its own threads, and while sometimes it is a pain, it is what it is.
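Since the thread cannot be killed, one partial workaround is to stop waiting for it rather than stop it. The sketch below (C++17, hypothetical helper call_with_timeout) runs the call on a detached thread through std::packaged_task, whose future does not block in its destructor the way a std::async future does. It is only an illustration of that idea: a call that never returns keeps its thread and CPU until the process exits, so a separate process remains the only way to truly reclaim it.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <optional>
#include <thread>
#include <utility>

// Runs fn on a detached thread and waits at most `limit` for its result.
// Returns std::nullopt on timeout; the abandoned thread is NOT terminated.
template <typename T>
std::optional<T> call_with_timeout(std::function<T()> fn,
                                   std::chrono::milliseconds limit) {
    assert(fn != nullptr);
    std::packaged_task<T()> task(std::move(fn));
    std::future<T> result = task.get_future();
    std::thread(std::move(task)).detach();  // nothing will ever join this thread
    if (result.wait_for(limit) == std::future_status::ready) {
        return result.get();
    }
    return std::nullopt;  // give up waiting; the call may still be running
}
```

Anything the abandoned call touches must outlive the caller, so captured state should be owned by the callable (for example through std::shared_ptr) rather than referenced from the caller's stack.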

Executing function for some amount of time

I am sorry if this was asked before, but I didn't find anything related to this. This is for my own understanding; it's not homework.
I want to execute a function only for some amount of time. How do I do that? For example,
main()
{
....
....
func();
.....
.....
}
function func()
{
......
......
}
Here, my main function calls another function. I want that function to execute for at most a minute. In that function, I will be getting some data from the user, so if the user doesn't enter the data, I don't want to be stuck in that function forever. Irrespective of whether the function has completed by that time, I want to come back to the main function and execute the next operation.
Is there any way to do it ? I am on windows 7 and I am using VS-2013.
Under windows, the options are limited.
The simplest option would be for func() to explicitly and periodically check how long it has been executing (e.g. store its start time, periodically check the amount of time elapsed since that start time) and return if it has gone longer than you wish.
It is possible (C++11 or later) to execute the function within another thread, and for main() to signal that thread when the required time period has elapsed. That is best done cooperatively. For example, main() sets a flag, the thread function checks that flag and exits when required to. Such a flag is usually best protected by a critical section or mutex.
An extremely unsafe way under Windows is for main() to forcibly terminate the thread. That is unsafe, as it can leave the program (and, in worst cases, the operating system itself) in an unreliable state (e.g. if the terminated thread is in the process of allocating memory, if it is executing certain kernel functions, manipulating global state of a shared DLL).
If you want better/safer options, you will need a real-time operating system with strict memory and timing partitioning. To date, I have yet to encounter any substantiated documentation about any variant of Windows and unix (not even real time variants) with those characteristics. There are a couple of unix-like systems (e.g. LynxOS) with variants that have such properties.
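The cooperative option described above might look roughly like this. The sketch uses std::atomic<bool> for the flag rather than a critical section or mutex (a substitution, not what the answer prescribes), and run_until_stopped and run_for are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Worker: performs small units of work and checks the stop flag between them.
long run_until_stopped(const std::atomic<bool>& stop) {
    long iterations = 0;
    while (!stop.load(std::memory_order_acquire)) {
        ++iterations;  // stands in for one short, bounded unit of real work
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return iterations;
}

// Caller: lets the worker run for `budget`, then asks it to stop and waits.
long run_for(std::chrono::milliseconds budget) {
    assert(budget.count() >= 0);
    std::atomic<bool> stop{false};
    long done = 0;
    std::thread worker([&] { done = run_until_stopped(stop); });
    std::this_thread::sleep_for(budget);
    stop.store(true, std::memory_order_release);  // request, not force
    worker.join();  // returns once the worker notices the flag
    return done;
}
```

This only works if every unit of work is short; a worker blocked inside a single long call will not see the flag until that call returns.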
I think a part of your requirement can be met using multithreading and a loop with a stopwatch.
Create a new thread.
Start a stopwatch.
Start a loop with one minute as the condition for the loop.
During each iteration check if the user has entered the input and process.
when one minute is over, the loop quits.
I'm not sure about the feasibility of this idea; I'm just sharing it. I don't know much about C++, but in Node.js your requirement can be achieved using 'events'. Maybe such things exist in C++ too.
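The stopwatch loop from the steps above could be sketched in C++ with std::chrono::steady_clock. The helper poll_until and its poll callback are hypothetical; the callback is assumed to be a non-blocking check (a plain read from std::cin would block and defeat the timeout).

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Calls poll() repeatedly until it returns true or `limit` has elapsed.
// Returns true if poll() succeeded in time, false if the time ran out.
template <typename Poll>
bool poll_until(Poll poll, std::chrono::milliseconds limit,
                std::chrono::milliseconds interval = std::chrono::milliseconds(5)) {
    assert(limit.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + limit;  // start the stopwatch
    while (std::chrono::steady_clock::now() < deadline) {
        if (poll()) {
            return true;  // input arrived; process it and leave the loop
        }
        std::this_thread::sleep_for(interval);  // avoid spinning at 100% CPU
    }
    return false;  // time is up; return to the caller regardless
}
```

For the one-minute case in the question the call would be something like poll_until(check_input, std::chrono::milliseconds(60000)), where check_input is whatever non-blocking input test the platform offers.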

How do I use the stack but avoid a stack overflow in C++

I'm presently moving back to C++ from Java. There are some areas of C++ where higher performance can be achieved by doing more computation on the stack. And some recursive algorithms operate more efficiently on the stack than on the heap.
Obviously the stack is a resource, and if I am going to use it, I should ensure that I do not consume too much (to the point of crashing my program).
I'm running Xcode, and wrote the following simple program:
#include <csignal>
static bool interrupted = false;
long stack_test(long limit){
if((limit>0)&&(interrupted==false))
return stack_test(limit-1)+1; // program crashes here with EXC_BAD_ACCESS...
else
return 0;
}
void signal_handler(int sig){
interrupted = true;
}
int main(int argc, char* argv[]){
signal(SIGSEGV,&signal_handler);
stack_test(1000000);
signal(SIGSEGV,SIG_DFL);
}
The documentation states that running on BSD, stack limits can be checked by using getrlimit() and that when the stack limit is being reached, a SIGSEGV event is issued. I tried installing the above event handler for this event, but instead, my program stops at the next iteration with EXC_BAD_ACCESS (code=2, ...).
Am I taking the wrong approach here, or is there a better way?
This has the same problem in Java as it does in c++. You are way over-committing to the stack.
And some recursive algorithms operate more efficiently on the stack than on the heap.
Indeed, and they are commonly of the divide and conquer type.
The usefulness of recursion is to reduce the computation to a more manageable computation with each call. limit - 1 is not such a candidate.
If your question is only about the signal, I unfortunately can't offer you any advice on your system.
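On the over-commitment point, one way to avoid relying on a signal at all is to bound the recursion depth explicitly and report failure through the return value. This is a sketch of that idea (C++17), not a fix for the signal handler; the depth_budget parameter is a hypothetical addition to the question's stack_test.

```cpp
#include <cassert>
#include <optional>

// Same recursion as the question's stack_test, but it gives up cleanly once
// depth_budget frames have been used instead of overflowing the stack.
std::optional<long> stack_test(long limit, long depth_budget) {
    assert(depth_budget >= 0);
    if (limit <= 0) {
        return 0L;  // base case reached within budget
    }
    if (depth_budget == 0) {
        return std::nullopt;  // too deep: unwind instead of crashing
    }
    const std::optional<long> rest = stack_test(limit - 1, depth_budget - 1);
    if (!rest) {
        return std::nullopt;  // propagate the failure up the call chain
    }
    return *rest + 1;
}
```

A safe budget depends on the frame size and the platform's stack limit, so the value has to be chosen conservatively or derived from something like getrlimit().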
Your signal handler can't do much to fix the stack overflow. Setting your interrupted flag doesn't help. When your signal handler returns, the instruction that tried to write to an address beyond the end of the stack resumes and it's still going to attempt to write beyond the end of the stack. Your code won't get back to the part which checks your interrupted flag.
With great care and a lot of architecture-specific code, your signal handler could potentially change the context of the thread which encountered the signal such that, when it resumes, it will be at a different point in the code.
You could also use setjmp() and longjmp() to accomplish this at a coarser granularity.
A different approach would be to set up a thread to use a stack that your code allocated, using pthread_attr_setstackaddr() and pthread_attr_setstacksize() prior to pthread_create(). You would run your code in that secondary thread and not the main one. You could set the last page or two of the stack you allocated to be non-writable using mprotect(). Then, your signal handler could set the interrupted flag and also set those pages to be writable. That should give you enough headroom that the resumed code can execute without re-raising the signal, get far enough to check the flag, and return gracefully. Note that this is a one-time last resort, unless you can find a good point to set those guard pages non-writable again.

C++ Thread question - setting a value to indicate the thread has finished

Is the following safe?
I am new to threading and I want to delegate a time consuming process to a separate thread in my C++ program.
Using the boost libraries I have written code something like this:
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
Where finished_flag is a boolean member of my class. When the thread is finished it sets the value and the main loop of my program checks for a change in that value.
I assume that this is okay because I only ever start one thread, and that thread is the only thing that changes the value (except for when it is initialised before I start the thread)
So is this okay, or am I missing something, and need to use locks and mutexes, etc
You never mentioned the type of finished_flag...
If it's a straight bool, then it might work, but it's certainly bad practice, for several reasons. First, some compilers will cache the reads of the finished_flag variable, since the compiler doesn't always pick up the fact that it's being written to by another thread. You can get around this by declaring the bool volatile, but that's taking us in the wrong direction. Even if reads and writes are happening as you'd expect, there's nothing to stop the OS scheduler from interleaving the two threads half way through a read / write. That might not be such a problem here where you have one read and one write op in separate threads, but it's a good idea to start as you mean to carry on.
If, on the other hand, it's a thread-safe type, like a CEvent in MFC (or the equivalent in boost), then you should be fine. This is the best approach: use thread-safe synchronization objects for inter-thread communication, even for simple flags.
Instead of using a member variable to signal that the thread is done, why not use a condition? You are already using the boost libraries, and condition is part of the thread library.
Check it out. It allows the worker thread to 'signal' that it has finished, and the main thread can check during execution if the condition has been signaled and then do whatever it needs to do with the completed work. There are examples in the link.
As a general case I would never make the assumption that a resource will only be modified by the thread. You might know what it is for, however someone else might not, causing no end of grief as the main thread thinks that the work is done and tries to access data that is not correct! It might even delete it while the worker thread is still using it, causing the app to crash. Using a condition will help with this.
Looking at the thread documentation, you could also call thread.timed_join in the main thread. timed_join will wait for a specified amount of time for the thread to 'join' (join means that the thread has finished).
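The condition-based approach might look like the following sketch. It uses the standard library's std::condition_variable rather than the boost condition the answer refers to (a substitution so the example needs no external library), and Completion and run_worker_and_wait are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// A one-shot "work is finished" signal shared between two threads.
struct Completion {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void signal() {
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;  // the flag is only touched under the mutex
        }
        cv.notify_all();
    }

    // Waits up to `timeout`; returns true if the signal arrived in time.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m);
        return cv.wait_for(lock, timeout, [this] { return done; });
    }
};

// Hypothetical driver: the worker "works" for `work`, then signals completion.
bool run_worker_and_wait(std::chrono::milliseconds work,
                         std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    Completion completion;
    std::thread worker([&] {
        std::this_thread::sleep_for(work);
        completion.signal();
    });
    const bool finished = completion.wait_for(timeout);
    worker.join();  // still join, so the thread has really exited
    return finished;
}
```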
I don't mean to be presumptive, but it seems like the purpose of your finished_flag variable is to pause the main thread (at some point) until the thread thrd has completed.
The easiest way to do this is to use boost::thread::join
// launch the thread...
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
// ... do other things maybe ...
// wait for the thread to complete (thrd is a pointer, hence ->)
thrd->join();
If you really want to get into the details of communication between threads via shared memory, declaring a variable volatile won't be enough, even if the compiler does use appropriate access semantics to ensure that it won't get a stale version of data after checking the flag. The CPU can issue reads and writes out of order (x86 usually doesn't, but PPC definitely does), and there is nothing in C++98/03 that allows the compiler to generate code to order memory accesses appropriately.
Herb Sutter's Effective Concurrency series has an extremely in depth look at how the C++ world intersects the multicore/multiprocessor world.
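Since C++11 the language does have a memory model, and the ordering problem described above can be expressed with std::atomic. The following is a minimal sketch, with the hypothetical function compute_in_worker, of publishing plain data through a release store and reading it after an acquire load.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// The worker writes ordinary (non-atomic) data, then publishes it by setting
// the flag with release semantics; the reader's acquire load orders its
// later read of `result` after the worker's write.
int compute_in_worker() {
    int result = 0;
    std::atomic<bool> finished{false};
    std::thread worker([&] {
        result = 42;                                      // plain write
        finished.store(true, std::memory_order_release);  // publish it
    });
    while (!finished.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // the main loop would do other work here
    }
    const int seen = result;  // safe: happens after the worker's write
    worker.join();
    assert(finished.load(std::memory_order_relaxed));
    return seen;
}
```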
Having the thread set a flag (or signal an event) before it exits is a race condition. The thread has not necessarily returned to the OS yet, and may still be executing.
For example, consider a program that loads a dynamic library (pseudocode):
lib = loadLibrary("someLibrary");
fun = getFunction("someFunction");
fun();
unloadLibrary(lib);
And let's suppose that this library uses your thread:
void someFunction() {
volatile bool finished_flag = false;
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
while(!finished_flag) { // ignore the polling loop, it's beside the point
sleep();
}
delete thrd;
}
void myclass::mymethod() {
// do stuff
finished_flag = true;
}
When myclass::mymethod() sets finished_flag to true, myclass::mymethod() hasn't returned yet. At the very least, it still has to execute a "return" instruction of some sort (if not much more: destructors, exception handler management, etc.). If the thread executing myclass::mymethod() gets pre-empted before that point, someFunction() will return to the calling program, and the calling program will unload the library. When the thread executing myclass::mymethod() gets scheduled to run again, the address containing the "return" instruction is no longer valid, and the program crashes.
The solution would be for someFunction() to call thrd->join() before returning. This would ensure that the thread has returned to the OS and is no longer executing.
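The difference between "the flag is set" and "the thread has exited" can be seen in a small sketch with std::thread (standing in for boost::thread). The function name tail_work_done_after_join and the sleep that imitates the thread's exit path are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// The worker sets the flag first and only then finishes its remaining work,
// the way destructors and the return path run after a flag is set.
bool tail_work_done_after_join() {
    std::atomic<bool> finished_flag{false};
    bool tail_done = false;
    std::thread worker([&] {
        finished_flag.store(true);
        std::this_thread::sleep_for(std::chrono::milliseconds(20));  // "still returning"
        tail_done = true;
    });
    while (!finished_flag.load()) {
        std::this_thread::yield();
    }
    assert(finished_flag.load());
    // The flag is set, but the worker may still be executing here, so it is
    // not yet safe to unload its code or destroy what it uses.
    worker.join();     // only now has the worker really finished
    return tail_done;  // guaranteed true after join()
}
```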