Why are some threads deferred?

In a tutorial I am following, the author wrote a program that showed that the destructors of std::futures don't always execute the task. In the following program, 10 threads created with std::async() are moved into the vector, and then we wait for their destructors to run.
#include <iostream>
#include <future>
#include <thread>
#include <chrono>
#include <vector>
int main()
{
std::cout << "Main thread id: " << std::this_thread::get_id() << std::endl;
std::vector<std::future<void>> futures;
for (int i = 0; i < 10; ++i)
{
auto fut = std::async([i]
{
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << std::this_thread::get_id() << " ";
});
futures.push_back(std::move(fut));
}
}
The result is machine-dependent, but what we found was that only 6 threads were launched when the destructors ran (we only got 6 ids printed after the main thread id output). This meant that the other four were deferred, and deferred threads don't run during std::future's destructors.
My question is why were some threads forced to execute while others were deferred. What is the point of deferring them if the life of the std::future is ending?

the author wrote a program that showed that the destructors of std::futures don't always execute the task.
The destructors never execute the task. If the task is already executing in another thread the destructor waits for it to finish, but it does not execute it.
what we found was that only 6 threads were launched when the destructors ran
This is incorrect. The threads are not launched when the destructors run. They are launched when you call std::async (or some time after that) and are still running when the destructors start, so the destructors must wait for them.
What is the point of deferring them if the life of the std::future is ending?
Again, they are not deferred when the destructor runs. They are deferred when std::async is called and are still deferred when the destructor runs, so they are simply thrown away without being run, and the destructor doesn't have to wait for anything.
I don't know if you're quoting the tutorial and the author of that is confused, or if you're confused, but your description of what happens is misleading.
Each time you call std::async without a launch policy argument the C++ runtime decides whether to create a new thread or whether to defer the function (so it can be run later). If the system is busy the runtime might decide to defer the function because launching another thread would make the system even slower.
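The "thrown away without being run" behaviour can be reproduced with a minimal sketch. It forces std::launch::deferred rather than relying on the default policy (whose choice is implementation-dependent), so the outcome does not vary between machines. The function name task_ran is made up for this illustration.

```cpp
#include <future>

// Returns true if the task body was executed.
// With std::launch::deferred the task runs only when get() or wait() is called
// on the future; if the future is simply destroyed, the task is discarded.
bool task_ran(bool call_get)
{
    bool ran = false;
    {
        std::future<void> fut =
            std::async(std::launch::deferred, [&ran] { ran = true; });
        if (call_get)
            fut.get();   // runs the deferred task here, on the calling thread
    }                    // ~future(): does not run the task, has nothing to wait for
    return ran;
}
```

Calling task_ran(true) executes the task inside get(), while task_ran(false) lets the future die and the task never runs.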

Your async() calls use the default launch policy, which is launch::async|launch::deferred, meaning that "The function chooses the policy automatically (at some point). This depends on the system and library implementation, which generally optimizes for the current availability of concurrency in the system."
thread::hardware_concurrency may give you a hint about the maximum hardware concurrency of your system. This can help explain why some tasks are deferred (especially if your loop count is greater than the hardware concurrency) and others are not. However, beware that other running processes use the hardware concurrency as well.
Note also that your asynchronous tasks write to cout, and synchronization on the stream could delay some of them.
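One way to observe what the library decided is to ask each future for its status without waiting. The following is a sketch only; count_deferred is a hypothetical helper name, and the tasks are deliberately empty so the check returns quickly.

```cpp
#include <chrono>
#include <future>
#include <vector>

// Launches n trivial tasks with the given policy and counts how many of them
// the library deferred. wait_for() reports future_status::deferred for a
// deferred task without running it.
int count_deferred(int n, std::launch policy)
{
    std::vector<std::future<void>> futures;
    for (int i = 0; i < n; ++i)
        futures.push_back(std::async(policy, [] {}));

    int deferred = 0;
    for (auto& fut : futures)
        if (fut.wait_for(std::chrono::seconds(0)) == std::future_status::deferred)
            ++deferred;
    return deferred;
}
```

Passing std::launch::async | std::launch::deferred reproduces the default policy, so the count then depends on the implementation and the machine. Passing std::launch::async explicitly is the way to make sure no task is deferred.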

Related

Force program termination if threads block in C++

When a class is responsible for managing a thread, it is a common pattern (see for example here) to join this thread in the destructor after you have made sure that the thread will finish in time. However, this is not always trivial as outlined in the linked thread leading to a program that never terminates if done incorrectly. Given below is an example to reproduce such a situation:
#include <iostream>
#include <thread>
#include <chrono>
using namespace std::chrono_literals;
class Foo {
public:
Foo() {
mythread = std::thread([&](){
int i = 0;
while(running) {
std::cout << "hi" << std::endl;
if (i++ >= 2) {
// placeholder for e.g. a blocking condition variable
std::this_thread::sleep_for(1000h);
}
std::this_thread::sleep_for(500ms);
}
});
}
~Foo() {
running = false;
mythread.join();
}
private:
std::thread mythread;
bool running{true};
};
int main() {
Foo bar;
std::this_thread::sleep_for(1s);
// enabling this line will block the termination
//std::this_thread::sleep_for(2s);
std::cout << "ending" << std::endl;
}
What I am searching for is a solution that forcefully terminates the program if this situation occurs. Of course, one should always strive to finish the thread properly, but such a feature would be a good last resort for peace of mind, especially on unobserved embedded systems, where a crashing program can be restored and debugged more easily than a blocking one.
A rough solution draft would be to start a thread at the end of main that sleeps for a few seconds and, if the program has not ended after that time, calls std::terminate (and ideally reports a corresponding error). However, we have a chicken-or-egg problem because this new thread will of course keep the program from ending in time. I would highly appreciate any ideas.
EDIT: The solution should not require modifying the Foo class itself, so that it also covers such bugs in unmodified code, e.g. in external libraries. Ideally, it would even cover threads that no class is responsible for ending before main ends (objects with static storage duration, or objects with dynamic storage duration that are no longer referenced), but that might not be possible at all without in-depth OS hacking or an external process monitor.

There are several solutions:
Investigate and fix the root problem (this is the best and correct solution)
Workarounds:
Have the thread signal that it is exiting via a condition variable, and join only after that signal. If the condition variable's wait_for times out, kill the thread (a bad solution; it brings other problems).
Create a watchdog thread that checks a time counter. The application must reset the counter periodically. If the watchdog sees the counter grow too large, it restarts the whole application.
Move the suspicious code out of your application into a separate process and communicate with it via IPC. If there are problems, restart that process (the best of the workarounds).

In Visual Studio, `thread_local` variables' destructor not called when used with std::async, is this a bug?

The following code
#include <iostream>
#include <future>
#include <thread>
#include <mutex>
std::mutex m;
struct Foo {
Foo() {
std::unique_lock<std::mutex> lock{m};
std::cout <<"Foo Created in thread " <<std::this_thread::get_id() <<"\n";
}
~Foo() {
std::unique_lock<std::mutex> lock{m};
std::cout <<"Foo Deleted in thread " <<std::this_thread::get_id() <<"\n";
}
void proveMyExistance() {
std::unique_lock<std::mutex> lock{m};
std::cout <<"Foo this = " << this <<"\n";
}
};
int threadFunc() {
static thread_local Foo some_thread_var;
// Prove the variable initialized
some_thread_var.proveMyExistance();
// The thread runs for some time
std::this_thread::sleep_for(std::chrono::milliseconds{100});
return 1;
}
int main() {
auto a1 = std::async(std::launch::async, threadFunc);
auto a2 = std::async(std::launch::async, threadFunc);
auto a3 = std::async(std::launch::async, threadFunc);
a1.wait();
a2.wait();
a3.wait();
std::this_thread::sleep_for(std::chrono::milliseconds{1000});
return 0;
}
Compiled and run with clang on macOS:
clang++ test.cpp -std=c++14 -pthread
./a.out
Got result
Foo Created in thread 0x70000d9f2000
Foo Created in thread 0x70000daf8000
Foo Created in thread 0x70000da75000
Foo this = 0x7fd871d00000
Foo this = 0x7fd871c02af0
Foo this = 0x7fd871e00000
Foo Deleted in thread 0x70000daf8000
Foo Deleted in thread 0x70000da75000
Foo Deleted in thread 0x70000d9f2000
Compiled and run in Visual Studio 2015 Update 3:
Foo Created in thread 7180
Foo this = 00000223B3344120
Foo Created in thread 8712
Foo this = 00000223B3346750
Foo Created in thread 11220
Foo this = 00000223B3347E60
The destructors are not called.
Is this a bug or some undefined grey zone?
P.S.
If the sleep std::this_thread::sleep_for(std::chrono::milliseconds{1000}); at the end is not long enough, you may not see all 3 "Delete" messages sometimes.
When using std::thread instead of std::async, the destructors get called on both platforms, and all 3 "Delete" messages will always be printed.

Introductory Note: I have now learned a lot more about this and have therefore re-written my answer. Thanks to #super, #M.M and (latterly) #DavidHaim and #NoSenseEtAl for putting me on the right track.
tl;dr Microsoft's implementation of std::async is non-conformant, but they have their reasons and what they have done can actually be useful, once you understand it properly.
For those who don't want that, it is not too difficult to code up a drop-in replacement for std::async which works the same way on all platforms. I have posted one here.
Edit: Wow, how open MS are being these days, I like it, see: https://github.com/MicrosoftDocs/cpp-docs/issues/308
Let's begin at the beginning. cppreference has this to say (emphasis and strikethrough mine):
The template function async runs the function f asynchronously (potentially optionally in a separate thread which may be part of a thread pool).
However, the C++ standard says this:
If launch::async is set in policy, [std::async] calls [the function f] as if in a new thread of execution ...
So which is correct? The two statements have very different semantics as the OP has discovered. Well of course the standard is correct, as both clang and gcc show, so why does the Windows implementation differ? And like so many things, it comes down to history.
The (oldish) link that M.M dredged up has this to say, amongst other things:
... Microsoft has its implementation of [std::async] in the form of PPL (Parallel Pattern Library) ... [and] I can understand the eagerness of those companies to bend the rules and make these libraries accessible through std::async, especially if they can dramatically improve performance...
... Microsoft wanted to change the semantics of std::async when called with launch_policy::async. I think this was pretty much ruled out in the ensuing discussion ... (rationale follows, if you want to know more then read the link, it's well worth it).
And PPL is based on Windows' built-in support for ThreadPools, so #super was right.
So what does the Windows thread pool do and what is it good for? Well, it's intended to manage frequently scheduled, short-running tasks in an efficient way, so point 1 is don't abuse it, but my simple tests show that if this is your use case then it can offer significant efficiencies. It does, essentially, two things:
It recycles threads, rather than having to always start a new one for each asynchronous task you launch.
It limits the total number of background threads it uses, after which a call to std::async will block until a thread becomes free. On my machine, this number is 768.
So knowing all that, we can now explain the OP's observations:
A new thread is created for each of the three tasks started by main() (because none of them terminates immediately).
Each of these three threads creates a new thread-local variable Foo some_thread_var.
These three tasks all run to completion but the threads they are running on remain in existence (sleeping).
The program then sleeps for a short while and then exits, leaving the 3 thread-local variables un-destructed.
I ran a number of tests and in addition to this I found a few key things:
When a thread is recycled, the thread-local variables are re-used. Specifically, they are not destroyed and then re-created (you have been warned!).
If all the asynchronous tasks complete and you wait long enough, the thread pool terminates all the associated threads and the thread-local variables are then destroyed. (No doubt the actual rules are more complex than that but that's what I observed).
As new asynchronous tasks are submitted, the thread pool limits the rate at which new threads are created, in the hope that one will become free before it needs to perform all that work (creating new threads is expensive). A call to std::async might therefore take a while to return (up to 300ms in my tests). In the meantime, it's just hanging around, hoping that its ship will come in. This behaviour is documented but I call it out here in case it takes you by surprise.
Conclusions:
Microsoft's implementation of std::async is non-conformant but it is clearly designed with a specific purpose, and that purpose is to make good use of the Win32 ThreadPool API. You can beat them up for blatantly flouting the standard but it's been this way for a long time and they probably have (important!) customers who rely on it. I will ask them to call this out in their documentation. Not doing that is criminal.
It is not safe to use thread_local variables in std::async tasks on Windows. Just don't do it, it will end in tears.
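The question notes that with std::thread the destructors run on both platforms. A minimal sketch built on that observation follows. It is not the replacement linked above, and the names JoiningTask and launch_on_new_thread are invented for this illustration. Unlike std::async, the caller must call get() before the handle goes away.

```cpp
#include <future>
#include <thread>
#include <utility>

// A result handle that owns its worker thread. get() joins the thread first,
// so every thread_local object created by the task has been destroyed by the
// time the result is returned.
// Note: get() must be called exactly once before the handle is destroyed;
// destroying a still-joinable std::thread calls std::terminate().
template <class R>
struct JoiningTask {
    std::future<R> result;
    std::thread worker;

    R get()
    {
        worker.join();   // thread exit, including thread_local destructors, is complete
        return result.get();
    }
};

// Runs f on a freshly created thread (never a pooled one).
template <class F>
auto launch_on_new_thread(F f) -> JoiningTask<decltype(f())>
{
    using R = decltype(f());
    std::packaged_task<R()> task(std::move(f));
    std::future<R> result = task.get_future();
    return JoiningTask<R>{std::move(result), std::thread(std::move(task))};
}
```

With this, auto a1 = launch_on_new_thread(threadFunc); followed by a1.get(); would take the place of std::async plus wait() in the question's main, at the cost of one new thread per task.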

Looks like just another of many bugs in VC++.
Consider this quote from n4750
All variables declared with the thread_local keyword have thread
storage duration . The storage for these entities shall last for the
duration of the thread in which they are created. There is a distinct
object or reference per thread, and use of the declared name refers to
the entity associated with the current thread. 2 A variable with
thread storage duration shall be initialized before its first odr-use
(6.2) and, if constructed, shall be destroyed on thread exit.
plus this:
If the implementation chooses the launch::async policy, — (5.3) a call
to a waiting function on an asynchronous return object that shares the
shared state created by this async call shall block until the
associated thread has completed, as if joined, or else time out
(33.3.2.5);
I could be wrong ("thread exit" vs. "thread completed"), but I feel this means that thread_local variables need to be destroyed before the .wait() call unblocks.

Question about understanding "detach()" on threads in C++

I keep seeing this rule on the internet:
If you don't detach or join a thread, then abort will be called.
I need to know why that abort happens.
I can understand it for join: if you don't join a thread, main can finish before the thread does, and that can cause problems.
But detach doesn't do anything! It has no purpose (at least from what I've seen when running a thread with or without detaching it).
What exactly triggers the abort, and what exactly is the purpose of detach?
Here is a simple example that causes the abort:
#include <iostream> // std::cout
#include <thread> // std::thread, std::this_thread::sleep_for
#include <chrono> // std::chrono::seconds
void pause_thread(int n)
{
std::this_thread::sleep_for (std::chrono::seconds(n));
std::cout << "pause of " << n << " seconds ended\n";
}
int main()
{
std::cout << "Spawning and detaching 3 threads...\n";
std::thread (pause_thread,1);
std::cout << "Done spawning threads.\n";
// give the detached threads time to finish (but not guaranteed!):
pause_thread(5);
return 0;
}

A thread is a different beast from a process. Threads are not in a parent/child relation at all.
C++11 thread implementation has to be compatible with all major OS threads, so it had to make design decisions, which cannot be understood at first sight.
I'll describe pthread, which is used in the linux world.
When you create a thread, you can specify its detachstate. It can be two values:
DETACHED: when the thread exits, its allocated resources are released automatically. The thread cannot be joined.
JOINABLE: when the thread exits, some of its resources are not freed automatically, for example its return code (in pthread, a thread has a return value). The thread can be joined. The resources are freed at join().
C++ threads are created as JOINABLE, but you can detach them later.
Now, if you neither detach nor join a thread, what could a thread implementation do in ~thread()? This is an issue, because if it does nothing, some resources are silently leaked (when a JOINABLE thread exits, some of its resources are not freed automatically). The options are:
call join() automatically: not a good idea, as the program can stall on it (if the thread is still running). It is potentially a programming bug.
call detach() automatically: not a good idea either, as it could be a programming bug (the thread continues to run, but its thread object is destroyed; a programmer should call detach() explicitly in this case)
call abort(): this is the best that an implementation can do, to avoid programming errors
So the designers of std::thread chose to call abort() to avoid programming errors (strictly, ~thread() calls std::terminate(), which calls abort() by default).
(On Windows, the thread system is similar. You have to call CloseHandle for a thread so its resources can be released)
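If joining on destruction is what you actually want, that decision can be made explicit with a small wrapper. This is a minimal sketch, not part of the standard library; the class name JoiningThread is invented here.

```cpp
#include <thread>
#include <utility>

// Minimal RAII wrapper: joins the thread on destruction instead of letting
// ~thread() call std::terminate() on a still-joinable thread.
class JoiningThread {
public:
    explicit JoiningThread(std::thread t) : thread_(std::move(t)) {}
    JoiningThread(const JoiningThread&) = delete;
    JoiningThread& operator=(const JoiningThread&) = delete;

    ~JoiningThread()
    {
        if (thread_.joinable())
            thread_.join();   // blocks until the thread function returns
    }

private:
    std::thread thread_;
};
```

In the question's example, JoiningThread t(std::thread(pause_thread, 1)); would avoid the abort, at the price of blocking at the end of the scope until the thread is done. C++20 adds std::jthread, which joins in its destructor in a similar way.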

Run threads in parallel in C++ [duplicate]

I am trying to implement Dekker's algorithm for homework. I understand the concept, but I am not able to execute two threads in parallel using C++0x.
#include <thread>
#include <iostream>
using namespace std;
class Homework2 {
public:
void run() {
try {
thread c1(&Homework2::output_one, this);
thread c2(&Homework2::output_two, this);
} catch(int e) {
cout << e << endl;
}
}
void output_one() {
//cout << "output one" << endl;
}
void output_two() {
//cout << "output two" << endl;
}
};
int main() {
try {
Homework2 p2;
p2.run();
} catch(int e) {
cout << e << endl;
}
return 0;
}
My problem is that the threads will return this error:
terminate called without an active exception
Aborted
The only thing that has worked for me so far is adding c1.join(); c2.join(); or .detach();
The problem is that join() will wait for the threads to finish, and detach()... well, I'm not sure what detach does, because there is no error but also no output. I guess it leaves the threads on their own...
So all this to say:
Does anybody know how I can make both threads run in parallel and not sequentially?
The help is much appreciated!
Thanks.
P.S:
here is what I do for build:
g++ -o output/Practica2.out main.cpp -pthread -std=c++11

The only thing that has worked for me so far is adding c1.join(); c2.join(); or .detach();...
After you have spawned the 2 threads, your main thread continues on and, based on your code, ends 'pretty' quick (p2.run() then return 0; are relatively close in CPU instruction 'time'). Depending on how quickly the threads started, they might not have had enough CPU time to fully 'spawn' before the program terminated or if they did fully spawn, there might not have been enough time to do the proper cleanup by the kernel. This is also known as a race condition.
Calling join on the spawned threads from the thread you spawned them from allows the threads to finish and clean up properly (under the hood) before your program exits (a good thing). Calling detach works in this scenario too as it releases all resources (under the hood) from your thread object, but keeps the thread active. In the case of calling detach there were no errors reported because the thread objects were detached from the executing threads, so when your program exited, the kernel (nicely) cleaned up the threads for you (or at least that's what might happen, depends on OS/compiler implementation, etc.) so you didn't see your threads ending 'uncleanly'.
So all this to say: Does anybody know how I can make both threads run in parallel and not sequentially?
I think you might have some confusion about how threads work. Your threads already run in 'parallel' (so to speak); that is the nature of a thread. The code you posted does not contain anything that is 'parallel' in nature (i.e. parallel computing of data), but your threads are running concurrently (at the same time, or 'parallel' to each other).
If you want your main thread to continue without putting the join in the run function, that would require a little more code than what you currently have and I don't want to assume how your code's future should look, but you could take a look at these two questions regarding the std::thread as a member of a class (and executing within such).
I hope that can help.

OK, this is a bit more complex, but I will try to explain some things in your code.
When you create the threads in the run method, you want to print two things (imagine you uncomment the lines), but the thread objects are destroyed when the method that created them (run) returns and its stack is unwound.
You actually need to do two things: first, create the threads and keep them alive (for example, by holding them through pointers), and second, call join to release all the memory and other resources they used once they have finished.
You can store your threads in a vector, something like std::vector<std::thread*>
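Since std::thread is movable, the threads can also be stored in the vector by value, which avoids manual memory management. The following is a minimal sketch of the "start them all, then join them all" pattern; run_concurrently is a made-up helper name.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Starts every task on its own thread, then waits for all of them.
// The threads run concurrently; join() only blocks the caller until each
// thread has finished.
void run_concurrently(const std::vector<std::function<void()>>& tasks)
{
    std::vector<std::thread> threads;   // std::thread is movable, no pointers needed
    threads.reserve(tasks.size());
    for (const auto& task : tasks)
        threads.emplace_back(task);     // the thread starts running here

    for (auto& t : threads)
        t.join();
}
```

In the question's class, run() could pass two lambdas that call output_one() and output_two(). Both threads are started before the first join(), so joining does not make them run one after the other.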

How do I terminate a thread in C++11?

I don't need to terminate the thread correctly, or make it respond to a "terminate" command. I am interested in terminating the thread forcefully using pure C++11.

1. You could call std::terminate() from any thread and the thread you're referring to will forcefully end.
2. You could arrange for ~thread() to be executed on the object of the target thread, without an intervening join() or detach() on that object. This will have the same effect as option 1.
3. You could design an exception which has a destructor which throws an exception. And then arrange for the target thread to throw this exception when it is to be forcefully terminated. The tricky part on this one is getting the target thread to throw this exception.
Options 1 and 2 don't leak intra-process resources, but they terminate every thread.
Option 3 will probably leak resources, but is partially cooperative in that the target thread has to agree to throw the exception.
There is no portable way in C++11 (that I'm aware of) to non-cooperatively kill a single thread in a multi-thread program (i.e. without killing all threads). There was no motivation to design such a feature.
A std::thread may have this member function:
native_handle_type native_handle();
You might be able to use this to call an OS-dependent function to do what you want. For example on Apple's OS's, this function exists and native_handle_type is a pthread_t. If you are successful, you are likely to leak resources.

#Howard Hinnant's answer is both correct and comprehensive. But it might be misunderstood if it's read too quickly, because std::terminate() (whole process) happens to have the same name as the "terminating" that #Alexander V had in mind (1 thread).
Summary: "terminate 1 thread + forcefully (target thread doesn't cooperate) + pure C++11 = No way."

I guess the thread that needs to be killed is either in any kind of waiting mode, or doing some heavy job.
I would suggest using a "naive" way.
Define some global boolean:
std::atomic_bool stop_thread_1{false}; // brace-init: copy-initialization from bool does not compile before C++17
Put the following code (or similar) in several key points, in a way that it will cause all functions in the call stack to return until the thread naturally ends:
if (stop_thread_1)
return;
Then to stop the thread from another (main) thread:
stop_thread_1 = true;
thread1.join ();
stop_thread_1 = false; //(for next time. this can be when starting the thread instead)
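The same idea can be packaged so that the flag and the thread live together. This is a minimal sketch under the assumption that the work can be split into short units between checks; the class name StoppableWorker and the iteration counter are invented for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// A worker thread that stops cooperatively when asked to.
class StoppableWorker {
public:
    StoppableWorker() : worker_([this] { run(); }) {}
    ~StoppableWorker() { stop(); }

    // Asks the worker to finish and waits for it. May be called repeatedly
    // from the owning thread.
    void stop()
    {
        stop_requested_ = true;
        if (worker_.joinable())
            worker_.join();
    }

    long iterations() const { return iterations_; }

private:
    void run()
    {
        while (!stop_requested_) {   // the "key point" where the flag is checked
            ++iterations_;           // placeholder for one bounded unit of real work
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> stop_requested_{false};
    std::atomic<long> iterations_{0};
    std::thread worker_;   // declared last: the flags must exist before the thread starts
};
```

A thread that sits in a long blocking wait will not see the flag until the wait returns, so such waits need a timeout or a wake-up of their own. C++20 offers std::jthread with std::stop_token for the same pattern.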

Tips on using an OS-dependent function to terminate a C++ thread:
std::thread::native_handle() can only return the thread’s valid native handle before join() or detach() is called. After that, native_handle() returns 0 and pthread_cancel() will core dump.
To call a native thread termination function (e.g. pthread_cancel()) effectively, you need to save the native handle before calling std::thread::join() or std::thread::detach(), so that your native terminator always has a valid native handle to use.
For more explanation, please refer to: http://bo-yang.github.io/2017/11/19/cpp-kill-detached-thread .

This question actually has a deeper nature, and a good understanding of multithreading concepts in general will give you insight into this topic. In fact, there is no language or operating system that provides facilities for asynchronous, abrupt thread termination without warning you not to use them. All these execution environments strongly advise developers, or even require them, to build multithreaded applications on the basis of cooperative (synchronous) thread termination. The reason for these common decisions and pieces of advice is that all of them are built on the same general multithreading model.
Let's compare the multiprocessing and multithreading concepts to better understand the advantages and limitations of the second one.
Multiprocessing assumes splitting the entire execution environment into a set of completely isolated processes controlled by the operating system. A process incorporates and isolates the execution environment state, including the local memory of the process and the data inside it, and all system resources such as files, sockets, and synchronization objects. Isolation is a critically important characteristic of a process, because it limits the propagation of faults to the process borders. In other words, no process can affect the consistency of any other process in the system. The same is true for process behaviour, but in a less restricted and more blurred way. In such an environment any process can be killed at any "arbitrary" moment because, firstly, each process is isolated; secondly, the operating system has full knowledge of all resources used by the process and can release all of them without leaking; and finally, the process is killed by the OS not really at an arbitrary moment, but at one of a number of well-defined points where the state of the process is well known.
In contrast, multithreading assumes running multiple threads in the same process. All these threads share the same isolation box, and the operating system has no control over the internal state of the process. As a result, any thread is able to change the global process state, as well as corrupt it. At the same time, the points at which the state of a thread is known to be safe for killing it depend completely on the application logic and are known neither to the operating system nor to the programming language runtime. As a result, terminating a thread at an arbitrary moment means killing it at an arbitrary point of its execution path, which can easily lead to process-wide data corruption, memory and handle leaks, thread leaks, and spinlocks and other intra-process synchronization primitives left in a locked state, preventing other threads from making progress.
Because of this, the common approach is to force developers to implement synchronous or cooperative thread termination, where one thread can request the termination of another thread, and the other thread can check for this request at a well-defined point and start the shutdown procedure from a well-defined state, releasing all global system-wide resources and local process-wide resources in a safe and consistent way.

Maybe TerminateThread? Windows only.
WINBASEAPI WINBOOL WINAPI TerminateThread (HANDLE hThread, DWORD dwExitCode);
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-terminatethread

You can't use the C++ std::thread destructor to terminate a single thread in a multithreaded program. Here's the relevant code snippet of the std::thread destructor, located in the thread header file (Visual C++):
~thread()
{
if (joinable())
std::terminate();
}
If you call the destructor of a joinable thread, the destructor calls std::terminate(), which acts on the process, not on the thread. Otherwise, it does nothing.
It is possible to terminate the thread forcefully (C++11 std::thread) by using an OS function. On Windows, you can use TerminateThread. "TerminateThread is a dangerous function that should only be used in the most extreme cases." (Microsoft Learn)
TerminateThread(tr.native_handle(), 1);
For TerminateThread to take effect, you should not call join() / detach() beforehand, since such a call will nullify native_handle().
You should call detach() (or join()) after TerminateThread. Otherwise, as described in the first paragraph, std::terminate() will be called in the thread destructor and the whole process will be terminated.
Example:
#include <chrono>   // std::chrono::milliseconds, std::chrono::seconds
#include <cstdint>  // uint8_t
#include <iostream>
#include <thread>
#include <Windows.h>
void Work10Seconds()
{
std::cout << "Work10Seconds - entered\n";
for (uint8_t i = 0; i < 20; ++i) {
std::this_thread::sleep_for(std::chrono::milliseconds(500));
std::cout << "Work10Seconds - working\n";
}
std::cout << "Work10Seconds - exited\n";
}
int main() {
std::cout << "main - started\n";
std::thread tr{};
std::cout << "main - Run 10 seconds work thread\n";
tr = std::thread(Work10Seconds);
std::cout << "main - Sleep 2 seconds\n";
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << "main - TerminateThread\n";
TerminateThread(tr.native_handle(), 1);
tr.detach(); // After TerminateThread
std::cout << "main - Sleep 2 seconds\n";
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << "main - exited\n";
}
Output:
main - started
main - Run 10 seconds work thread
main - Sleep 2 seconds
Work10Seconds - entered
Work10Seconds - working
Work10Seconds - working
Work10Seconds - working
main - TerminateThread
main - Sleep 2 seconds
main - exited
