std::threads are managed in user or kernel space? - c++

As I wrote in the title, I would like to know if c++ stantard threads are managed in user or kernel space.
Thank you.

As happens almost always, the standard doesn't mandate any particular implementation, it just requires that the exhibited behavior conforms to its rules.
Thus, the particular implementation is free to choose; on the other hand, probably many implementations will be based on boost.thread (on which the std::thread proposal is based), so we can look at it to have an idea.
This library uses pthreads on POSIX and Windows threads on Win32. Win32 threads are definitely kernel threads, but pthreads on their own are just yet another interface, which could be implemented both in user space and in kernel space (although almost any recent UNIX kernel provides facilities to implement them in kernel space).
So: std::thread can be anything, although, on "mainstream" PC operating systems/implementations, it's very likely that you'll get kernel threads. If for some reason you need to know more, check your compiler's documentation.

The interface is designed around pthreads, but it is up to the implementer of the libc++ to decide what to use.

Related

How to create a single process mutex within C++?

So I'm reading about monitors vs mutexes and finding mentions that suggest that monitors are faster mutexes because they don't lock system wide but rather only across the threads of a given process.
Is there some way in C++ to accomplish or simulate this?
Edit: I'm curious now what the difference is between system wide mutex and one restricted to a specific process.
C++ Standard does not define system-wide vs per-process primitives. So C++ does not specify whether std::mutex is system-wide.
Reasonable implementations have efficient per-process std::mutex; to have system-wide mutex you'll need to use libraries or operating system objects for your platform
The difference is that per-process mutex may use any memory operations to avoid system calls, as the process memory is shared among process's threads. Atomic operation on that memory are more efficient, and system call is often avoided via them. System-wide mutex will either start with system calls (not efficient), or will have to use shared memory (might be unsafe, also still may have some overhead).
The answer by #Alex Guteniev is as accurate as one can get (and should be considered the accepted answer). It states that the c++ standard doesn't define a system wide concept, and that mutexes for all practical purposes are per process i.e for synchronization between threads (execution agents) in a single process (and therefore according to your needs). The C++ makes it clear what a thread (std::thread) is (33.3 - ... intended to map one-to-one with OS threads (in my draft, at least...N4687)).
Microsoft post VC2015 has improved their implementation to use windows primitives as stated here. This is also indicated here in the most upvoted answer. I've also looked at the boost library implementations (which often precedes/influences the c++ standard) for microsoft and (AFAICT) it doesn't use any inter-process calls.
So to answer your question. In C++ threads and monitors are practically the same thing if this definition is to be considered accurate.
Update, stumbled across the answer to this while researching something related.
On Windows, Critical Sections can be used for single processes instead of system wide mutexes and are often faster:
Edit:
While the above statement is correct, c++ doesn't have the concept system wide mutex. This concept only exists when using OS specific primitives such as win32 CreateMutex and is not relevant to std c++.
Source:
std::mutex performance compared to win32 CRITICAL_SECTION
On Linux, pthreads are for processes.

Deduce if a program is going to use threads

Thread-safe or thread-compatible code is good.
However there are cases in which one could implement things differently (more simply or more efficiently) if one knows that the program will not be using threads.
For example, I once heard that things like std::shared_ptr could use different implementations to optimize the non-threaded case (but I can't find a reference).
I think historically std::string in some implementation could use Copy-on-write in non-threaded code.
I am not in favor or against these techniques but I would like to know if that there is a way, (at least a nominal way) to determine at compile time if the code is being compiled with the intention of using threads.
The closest I could get is to realize that threaded code is usually (?) compiled with the -pthreads (not -lpthreads) compiler option.
(Not sure if it is a hard requirement or just recommended.)
In turn -pthreads defines some macros, like _REENTRANT or _THREAD_SAFE, at least in gcc and clang.
In some some answers in SO, I also read that they are obsolete.
Are these macros the right way to determine if the program is intended to be used with threads? (e.g. threads launched from that same program). Are there other mechanism to detect this at compile time? How confident would the detection method be?
EDIT: since the question can be applied to many contexts apparently, let me give a concrete case:
I am writing a header only library that uses another 3rd party library inside. I would like to know if I should initialize that library to be thread-safe (or at least give a certain level of thread support). If I assume the maximum level of thread support but the user of the library will not be using threads then there will be cost paid for nothing. Since the 3rd library is an implementation detail I though I could make a decision about the level of thread safety requested based on a guess.
EDIT2 (2021): By chance I found this historical (but influential) library Blitz++ which in the documentation says (emphasis mine)
8.1 Blitz++ and thread safety
To enable thread-safety in Blitz++, you need to do one of these
things:
Compile with gcc -pthread, or CC -mt under Solaris. (These options define_REENTRANT,which tells Blitz++ to generate thread-safe code).
Compile with -DBZ_THREADSAFE, or #define BZ_THREADSAFE before including any Blitz++ headers.
In threadsafe mode, Blitz++ array reference counts are safeguarded by
a mutex. By default, pthread mutexes are used. If you would prefer a
different mutex implementation, add the appropriate BZ_MUTEX macros to
<blitz/blitz.h> and send them toblitz-dev#oonumerics.org for
incorporation. Blitz++ does not do locking for every array element
access; this would result in terrible performance. It is the job of
the library user to ensure that appropriate synchronization is used.
So it seems that at some point _REENTRANT was used as a clue for the need of multi-threading code.
Maybe it is a very old reference to take seriously.
I support the other answer in that thread-safety decision ideally should not be done on whole program basis, rather they should be for specific areas.
Note that boost::shared_ptr has thread-unsafe version called boost::local_shared_ptr. boost::intrusive_ptr has safe and unsafe counter implementation.
Some libraries use "null mutex" pattern, that is a mutex, which does nothing on lock / unlock. See boost or Intel TBB null_mutex, or ATL CComFakeCriticalSection. This is specifically to substitute real mutex for threqad-safe code, and a fake one for thread-unsafe.
Even more, sometimes it may make sense to use the same objects in thread-safe and thread-unsafe way, depending on current phase of execution. There's also atomic_ref which serves the purpose of providing thread-safe access to underlying type, but still letting work with it in thread unsafe.
I know a good example of runtime switches between thread-safe and thread-unsafe. See HeapCreate with HEAP_NO_SERIALIZE, and HeapAlloc with HEAP_NO_SERIALIZE.
I know also a questionable example of the same. Delphi recommends calling its BeginThread wrapper instead of CreateThread API function. The wrapper sets a global variable telling that from now on Delphi Memory Manager should be thread-safe. Not sure if this behavior is still in place, but it was there for Delphi 7.
Fun fact: in Windows 10, there are virtually no single-threaded programs. Before the first statement in main is executed, static DLL dependencies are loaded. Current Windows version makes this DLL loading paralleled where possible by using thread pool. Once program is loaded, thread pool threads are waiting for other tasks that could be issued by using of Windows API calls or std::async. Sure if program by itself will not use threads and TLS, it will not notice, but technically it is multi-threaded from the OS perspective.
How confident would the detection method be?
Not really. Even if you can unambiguously detect if code is compiled to be used with multiple threads, not everything must be thread safe.
Making everything thread-safe by default, even though it is only ever used only by a single thread would defeat the purpose of your approach. You need more fine grainded control to turn on/off thread safety if you do not want to pay for what you do not use.
If you have class that has a thread-safe and a non-thread-safe version then you could use a template parameter
class <bool isThreadSafe> Foo;
and let the user decide on a case for case basis.

Why doesn't the modern C++ library support threads with priority?

Many third party C/C++ libraries providing multithreading support threads' priority, corresponding scheduler, etc. Why doesn't the modern C++ standard support this useful feature?
The short answer, I think, is that if the standard included a way to specify priorities, it would also have to specify what would happen as a result. This, unfortunately would lead to one of two possibilities: either you'd force people to completely re-implement threads from the ground up on systems that had different semantics, or else you'd limit the platforms to which code that used std::thread could be ported.
Just for example, on some systems, threads of sufficiently high priority (e.g., "real time priority") use round-robin scheduling. Other systems don't--when a thread of sufficiently high priority starts, it will continue to be scheduled until it runs to completion or is interrupted by a thread of even higher priority. Specifying either behavior would lead to problems porting to systems that use the other.
Many (most?) systems also include some mechanism to prevent starving lower priority threads, so they can continue to receive some CPU time even while higher priority threads are ready to run. Again, the details vary and a few (especially smaller/simpler) systems simply don't include any such mechanism at all. As above, attempting to specify any one behavior would lead to difficulty in porting to systems that implement different behavior.
It would be easy to include a set_priority(int) (or something similar), but specifying what it meant/did portably would be next to impossible.
Such feature is not specified in the standard, which means that as of today, a "thread" as described by the C++ standard has no priority.
For POSIX systems you can use pthread_setschedparam
For Windows you can use SetThreadPriority
It is easy enough to write a simple wrapper class around those calls (and possibly others if you use other platforms) for your program.
(you would do so by retrieving the native thread handle with std::thread::native_handle)
Boost.Thread offer this note about it :
Thread launched in this way are created with implementation defined
thread attributes as stack size, scheduling, priority, ... or any
platform specific attributes. It is not evident how to provide a
portable interface that allows the user to set the platform specific
attributes. Boost.Thread stay in the middle road through the class
thread::attributes which allows to set at least in a portable way the
stack size as follows [...]

GNU pth vs. pthread

I want to build a portable and efficient server in C++; it will have lots of clients trying to connect at the same time, so it must be able of handling each request parallel.
I have been trying to find documentation, guides... etc. for multithreading. I have found a lot about POSIX Pthread, but almost nothing for GNU Pth (apart from the official manual in gnu.org).
So, can anyone explain me the difference between POSIX Pthread and GNU Pth? Please, I want the response not to be a copy of Wikipedia's contents (keep in mind that I'm an absolute newbie to multithreading). I want my server to be portable and efficient between all *nix-based systems, keeping away of using heavy fork()s.
Thanks for your help.
PS: I think it's better to ask this here: what about Windows? Are Pthreads or Pth an option there? If not, what is the API for that operating system?
Use Pthreads, it's much more widely used, so there is far more information and support available for it. I've never met anyone who actually uses GNU Pth. Or better yet if you are using C++11 use std::thread and if not then use boost::thread.
So, can anyone explain me the difference between POSIX Pthread and GNU Pth?
Pthreads is a cross-platform standard for pre-emptible multithreading, meaning (usually) the OS kernel manages the threads and the OS scheduler decides when each thread gets to run (if you have a single core only one thread can run at a time, if you have multiple cores multiple threads can run at a time). The OS scheduler could pause any thread at (almost) any time and let another thread run, so each thread gets a limited "time slice" and then other threads get to run.
GNU Pth is a non-preemptible user-space threading library, meaning the threads and which ones run at which time are decided in user-space not by the kernel. Some people say programs using non-preemptible threading libraries are easier to understand, because your thread won't get paused at arbitrary times for another thread to run.
I want my server to be portable and efficient between all *nix-based systems, keeping away of using heavy fork()s.
fork is not heavy on UNIX.
what about W*ndows? Are Pthreads or Pth an option there? If not, what is the API for that operating system?
There are pthreads APIs for Windows, but they're not native to the Windows OS. I don't know if GNU Pth works on Windows - I doubt it, unless you use Cygwin. Windows has its own Win32 thread model.
Using std::thread or boost::thread is portable to POSIX platforms and Windows, and makes certain parts of the API easier to use (specifically, locking and unlocking mutexes can be easily done in an exception safe way and condition variables are easier to use.)
Gnu PTH is for a very limited use case: you want to use a multi-threaded implementation paradigm but you don't want to use multiple CPUs or cores and you don't want to rely on any OS or kernel-level support. Since almost all general-purpose CPUs now have multiple cores, this use case is increasingly irrelevant.
Windows has a separate threading model from POSIX; if you want your application to be cross-platform it is best to use a cross-platform threading library such as boost::thread.
I think GNUs PTH is meaned for C in the first place. You can use it on C++ too but C++ have its own anyway.
There are quite some applications using pth like low-level burning tools (and so GUI-Tools like K3B and Brasero depend on pth), also GnuPG uses PTH, the package management of Archlinux and some multimedia stuff.
On Windows its always a bit complicated. Microsoft did never get over the fact that C is the Programming Language from/for UNIX-Systems and so is suffering the NIH Symptome (Not Invented Here)
So they do a lot of stuff without any advantage just to be different.
If you use an Application which should run everywhere and its not low-level, use Qt with its QThreads and QThreadPool
Its 100% the same on all operating systems
You need much less code
If you write an "low-level" application i recommend to split your applications into backends and frontends and write a own backend for each OS and use the library which will do the least problems.

How make a threading mechanism in C++?

I know there are some threading libraries for C++, like Pthread, Boost etc out there, but how are they working? There must be an implementation of the logic somewhere.
Let's say that I would like to write my own threading mechanism in C++, not using any library, how would I start? What should I have in mind when writing it?
You'd directly call the underlying API calls in the operating system. For example, CreateThread. Naturally, this is cumbersome and platform-specific, which is why we like to use portable C++ threading libraries...
In C++98/03, there is no notion of a "thread", so the question cannot be answered within the language. In C++11, the answer is to use <thread>.
On the implementation side, threading is an operating system feature. The operating system already has to schedule multiple processes (i.e. separate programs), and a multi-threading OS adds to that the ability to schedule multiple threads within one process. A the very heart, the OS may or may not take advantage of having physically more than one CPU (though that also applies to simple multi-processing; and conversely you can schedule multiple threads on a single CPU). At the heart of the programming, you will need hardware support for synchronisation primitives like atomic read/writes and atomic compare-and-swap to implement correct memory access. (This is not needed for only multi-processing, because separate processes have distinct memory; although it will be needed by the OS itself if there are multiple physical CPUs in use.)
Well, you need something which is able to run several threads.
If you are working on developing an operating system kernel on the bare metal, I think that current multi-core processors have only one core working after their power-on reset. Even the BIOS on most PCs probably keep only one core working (and the other cores idle). So you'll need to write (assembly, non-portable) code to start other cores.
And (as James reminded you), most of the time you are using some operating system kernel. For instance, on Linux (I don't know about Windows), threads are known by the kernel (because the tasks it is scheduling are threads) and they need to be initiated by the Linux clone(2) system call.
Often, kernel threads are quite heavy, and the system has a library (NPTL for Linux Posix threads) which may use fewer kernel threads than user threads (actually Linux NPTL is a 1:1 mapping between kernel and user threads, but on some other systems, like probably Solaris, things are different).
You can't write your own threading mechanism, unless you mean pseudo-threads like co-routines and not actual concurrently executing threads. This is because the fundamental thread mechanism is defined by the kernel and you can't change it nor implement your own. Any library you write must fall back, eventually, to the operating system.