Threading With Multiple Cores in C++

I don't know how to create threads in C++, and beyond learning that: is there a way I can force a thread onto a different core? Also, how would I find out how many cores the user has?

Binding a thread to a particular CPU is called setting its affinity. It's a platform-dependent operation.
For Windows: SetThreadAffinityMask (per thread) or SetProcessAffinityMask (whole process)
For pthreads: pthread_attr_setaffinity_np(3) and pthread_setaffinity_np(3)
For Boost threads, you can use native_handle() to get the platform-specific thread handle and use it with the functions above.
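As a rough sketch (assuming Linux, where std::thread is backed by pthreads; pinning to CPU 0 below is an arbitrary choice), you can query the core count with std::thread::hardware_concurrency() and pass the thread's native_handle() to pthread_setaffinity_np():

#include <thread>
#include <iostream>
#include <pthread.h>   // non-portable affinity API (g++ defines _GNU_SOURCE for us)
#include <sched.h>

int main() {
    unsigned cores = std::thread::hardware_concurrency();  // may return 0 if unknown
    std::cout << "hardware threads: " << cores << "\n";

    std::thread worker([] {
        // ... work that should stay on one core ...
    });

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);  // allow the worker to run on CPU 0 only
    int rc = pthread_setaffinity_np(worker.native_handle(), sizeof(set), &set);
    if (rc != 0)
        std::cerr << "pthread_setaffinity_np failed: " << rc << "\n";

    worker.join();
}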


How to create a user space thread? [duplicate]

I have just started coding device drivers and am new to threading. I went through many documents to get an idea about threads, but I still have some doubts.
What is a kernel thread?
How does it differ from a user thread?
What is the relationship between the two?
How can I implement kernel threads?
Where can I see the output of the implementation?
Can anyone help me?
Thanks.
A kernel thread is a task_struct with no userspace components.
Besides the lack of userspace, it has different ancestors (kthreadd kernel thread instead of the init process) and is created by a kernel-only API instead of sequences of clone from fork/exec system calls.
Any two kernel threads have kthreadd as their parent. Apart from that, kernel threads enjoy the same "independence" from one another as userspace processes do.
Use the kthread_run function/macro from the kthread.h header. You will most probably have to write a kernel module in order to call this function, so you should take a look at the Linux Device Drivers book.
If you are referring to the text output of your implementation (via printk calls), you can see this output in the kernel log using the dmesg command.
A kernel thread is a kernel task running only in kernel mode; it usually has not been created by fork() or clone() system calls. An example is kworker or kswapd.
You probably should not implement kernel threads if you don't know what they are.
Google gives many pages about kernel threads, e.g. Frey's page.
User threads & stack:
Each thread has its own stack so that it can use its own local variables; threads share global variables, which are part of the .data or .bss sections of a Linux executable.
Because threads share global variables, we use synchronization mechanisms such as mutexes when we want to access or modify global variables in a multithreaded application. Local variables live on each thread's individual stack, so they need no synchronization.
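A short illustration of that point (the counter and thread count below are arbitrary): the shared global needs a mutex, while each thread's stack-local variable does not:

#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;              // shared global (.data/.bss): needs synchronization
std::mutex counter_mutex;

void worker() {
    int local = 0;            // lives on this thread's own stack: no locking needed
    for (int i = 0; i < 100000; ++i) {
        ++local;
        std::lock_guard<std::mutex> lock(counter_mutex);
        ++counter;            // protected access to the shared global
    }
}

int main() {
    std::thread t1(worker), t2(worker);
    t1.join();
    t2.join();
    std::cout << "counter = " << counter << "\n";  // reliably 200000 with the mutex
}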
Kernel threads
Kernel threads have emerged from the need to run kernel code in process context. Kernel threads are the basis of the workqueue mechanism. Essentially, a kernel thread is a thread that only runs in kernel mode and has no user address space or other user attributes.
To create a kernel thread, use kthread_create():
#include <linux/kthread.h>
struct task_struct *kthread_create(int (*threadfn)(void *data),
                                   void *data, const char namefmt[], ...);
Kernel threads & stack:
Kernel threads are used to do background processing tasks for the kernel, such as the pdflush threads, workqueue threads, etc.
Kernel threads are basically new processes without a user address space (they can be created using the clone() call with the required flags), which means they cannot switch to user space. Kernel threads are schedulable and preemptible like normal processes.
Kernel threads have their own stacks, which they use to manage local state.
More about kernel stacks:
https://www.kernel.org/doc/Documentation/x86/kernel-stacks
Since you're comparing kernel threads with user[land] threads, I assume you mean something like the following.
The normal way of implementing threads nowadays is to do it in the kernel, so those can be considered "normal" threads. It's however also possible to do it in userland, using signals such as SIGALRM, whose handler will save the current process state (registers, mostly) and change them to another one previously saved. Several OSes used this as a way to implement threads before they got proper kernel thread support. They can be faster, since you don't have to go into kernel mode, but in practice they've faded away.
There's also cooperative userland threads, where one thread runs until it calls a special function (usually called yield), which then switches to another thread in a similar way as with SIGALRM above. The advantage here is that the program is in total control, which can be useful when you have timing concerns (a game for example). You also don't have to care much about thread safety. The big disadvantage is that only one thread can run at a time, and therefore this method is also uncommon now that processors have multiple cores.
Kernel threads are implemented in the kernel. Perhaps you meant how to use them? The most common way is to call pthread_create.
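To make the cooperative userland-thread idea above concrete, here is a minimal sketch using the deprecated but widely available POSIX ucontext API (the stack size and function names are arbitrary). Only one context runs at a time, and control changes hands only at the explicit swapcontext() calls:

#include <ucontext.h>
#include <iostream>

static ucontext_t main_ctx, fiber_ctx;

void yield_to_main() { swapcontext(&fiber_ctx, &main_ctx); }   // the "yield"

void fiber_func() {
    std::cout << "fiber: step 1\n";
    yield_to_main();                  // hand control back voluntarily
    std::cout << "fiber: step 2\n";
}                                     // returning resumes uc_link (main_ctx)

int main() {
    static char stack[64 * 1024];     // stack for the cooperative thread

    getcontext(&fiber_ctx);
    fiber_ctx.uc_stack.ss_sp = stack;
    fiber_ctx.uc_stack.ss_size = sizeof(stack);
    fiber_ctx.uc_link = &main_ctx;    // where to go when fiber_func returns
    makecontext(&fiber_ctx, fiber_func, 0);

    swapcontext(&main_ctx, &fiber_ctx);   // run the fiber until it yields
    std::cout << "main: fiber yielded\n";
    swapcontext(&main_ctx, &fiber_ctx);   // run it to completion
    std::cout << "main: done\n";
}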

Can fibers migrate between threads?

Can a fiber created in thread A switch to another fiber created in thread B? To make the question more specific, some operating systems have fibers natively implemented (Windows fibers),
others need to implement them themselves (using setjmp/longjmp on Linux, etc.).
Libcoro, for example, wraps this all up in a single API (on Windows it's just a wrapper for native fibers, on Linux it implements them itself, etc.).
So, if it's possible to migrate fibers between threads, can you give me example usage on Windows (or Linux) in C/C++?
I found something about fiber migration in the Boost library documentation, but it's not specific enough about its implementation and platform dependence. I still want to understand how to do it myself using only Windows fibers, for example (or using Libcoro on Linux).
If it's not possible in a general way, why so?
I understand that fibers are meant to be used as lightweight threads for cooperative multitasking over a single thread, they have cheap context switching compared to regular threads, and they simplify the programming.
An example usage is a system with several threads, each having several fibers doing some kind of work hierarchy on their parent thread (never leaving the parent thread).
Even though it's not the intended use, I still want to learn how to do it if it's possible in a general way, because I think I can optimize the workload on my job system by migrating fibers between threads.
The mentioned boost.fiber uses boost.context (callcc/continuation) to implement context switching.
Up to boost-1.64, callcc was implemented in assembler only; boost-1.65 lets you choose between assembler, Windows Fibers (Windows), or ucontext (POSIX where available; an API deprecated by POSIX).
The assembler implementation is faster than the other two (two orders of magnitude compared to ucontext).
boost.fiber uses callcc to implement lightweight threads/fibers - the library provides fiber schedulers that allow fibers to be migrated between threads.
For instance one provided scheduler steals fibers from other threads if its run-queue goes out of work (fibers that are ready/that can be resumed).
(so you can choose Windows Fibers that get migrated between threads).
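As a rough sketch of that setup (the thread count, fiber count, and fiber bodies below are arbitrary choices): every participating thread installs boost::fibers::algo::work_stealing, and fibers launched on the main thread can then be stolen and resumed by the worker threads, i.e. they migrate between threads:

#include <boost/fiber/all.hpp>
#include <boost/fiber/algo/work_stealing.hpp>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    constexpr std::uint32_t thread_count = 4;   // main thread + 3 workers

    boost::fibers::mutex mtx;
    boost::fibers::condition_variable cv;
    bool done = false;

    // Worker threads: install the same work-stealing scheduler, then suspend
    // their main fiber so the scheduler is free to run stolen fibers.
    std::vector<std::thread> workers;
    for (std::uint32_t i = 1; i < thread_count; ++i) {
        workers.emplace_back([&] {
            boost::fibers::use_scheduling_algorithm<
                boost::fibers::algo::work_stealing>(thread_count);
            std::unique_lock<boost::fibers::mutex> lk(mtx);
            cv.wait(lk, [&] { return done; });
        });
    }

    // The main thread joins the same scheduler group.
    boost::fibers::use_scheduling_algorithm<
        boost::fibers::algo::work_stealing>(thread_count);

    // Fibers launched here may be stolen by idle worker threads.
    std::vector<boost::fibers::fiber> fibers;
    for (int i = 0; i < 16; ++i) {
        fibers.emplace_back([i] {
            std::cout << "fiber " << i << " ran on thread "
                      << std::this_thread::get_id() << "\n";
        });
    }
    for (auto& f : fibers) f.join();

    {   // release the workers and shut down
        std::unique_lock<boost::fibers::mutex> lk(mtx);
        done = true;
    }
    cv.notify_all();
    for (auto& t : workers) t.join();
}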

Logging and multithreading

I noticed that most loggers are advertised as thread safe.
What does it mean?
Are they safe against a specific threading library or can they be safe in any multithreading environment (e.g. PThread, Boost threads, C++11 threads, Win32 threads, OpenMP threads, ...)?
It means you won't get something like this in your log files:
this is the line from the firsThis is line from the second thread
t thread
Usually it means that the logger takes the required locks when it writes to the stream, in any supported environment.
If a logger is thread safe, that means you can call its functions from any thread (be it pthread, Boost, or OpenMP). That is usually done by using mutexes to prevent simultaneous output. Without them, your program may output mixed lines or even crash if the log is used from different threads.
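A minimal sketch of what such a thread-safe logger does internally (the class and function names here are made up): a mutex serializes whole lines, so output from different threads cannot interleave mid-line:

#include <iostream>
#include <mutex>
#include <string>
#include <thread>

class Logger {
public:
    void log(const std::string& line) {
        std::lock_guard<std::mutex> lock(mutex_);   // one writer at a time
        std::clog << line << '\n';
    }
private:
    std::mutex mutex_;
};

int main() {
    Logger logger;
    std::thread a([&] { for (int i = 0; i < 5; ++i) logger.log("line from the first thread"); });
    std::thread b([&] { for (int i = 0; i < 5; ++i) logger.log("line from the second thread"); });
    a.join();
    b.join();
}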

creating thread on another core? (WinAPI)

I was wondering if there was a way to run a thread on a separate core, instead of just running another thread on the same core?
Thanks
If you create a thread, by default you have no control over which core it will run on. The operating system's scheduling algorithm takes care of that, and it is pretty good at its job. However, you can use the SetThreadAffinityMask WinAPI function to specify the logical cores a thread is allowed to run on.
Don't do that unless you have very good reasons. Quoting MSDN:
Setting an affinity mask for a process or thread can result in threads receiving less processor time, as the system is restricted from running the threads on certain processors. In most cases, it is better to let the system select an available processor.
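If you do need it, a minimal sketch of the call (Windows only; pinning the current thread to logical core 1 is an arbitrary choice for illustration):

#include <windows.h>
#include <iostream>

int main() {
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    std::cout << "logical processors: " << si.dwNumberOfProcessors << "\n";

    // Restrict the current thread to logical core 1 (bit 1 of the mask).
    DWORD_PTR previous = SetThreadAffinityMask(GetCurrentThread(), DWORD_PTR(1) << 1);
    if (previous == 0)
        std::cerr << "SetThreadAffinityMask failed: " << GetLastError() << "\n";
}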

fcntl() for thread or process synchronization?

Is it possible to use the fcntl() system call on a file to achieve thread/process synchronization (instead of semaphores)?
Yes. Unix fcntl locks (and filesystem resources in general) are system-wide, so any two threads of execution (be they separate processes or not) can use them. Whether that's a good idea or not is context-dependent.
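A minimal sketch of inter-process serialization with an fcntl() write lock (the lock-file path below is made up):

#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = open("/tmp/my.lock", O_CREAT | O_RDWR, 0666);   // hypothetical lock file
    if (fd < 0) { perror("open"); return 1; }

    struct flock fl{};
    fl.l_type   = F_WRLCK;   // exclusive (write) lock
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;
    fl.l_len    = 0;         // 0 = lock the whole file

    fcntl(fd, F_SETLKW, &fl);   // blocks until the lock is granted
    // ... critical section shared with other processes ...
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);    // release the lock

    close(fd);
}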
That's one way of synchronizing between processes, but if you don't want to use semaphores, you could use process-shared mutexes and condition variables, created with the PTHREAD_PROCESS_SHARED attribute on POSIX-based platforms (see pthread_mutexattr_setpshared() and pthread_condattr_setpshared()). Another option is to use an event-based IPC mechanism (sockets, etc.) that blocks until an event you define is demultiplexed (e.g. via select()). There are several other shared-memory-based options as well.
However, since you're using C++ I'd recommend using a C++ framework that greatly simplifies this sort of interprocess synchronization across multiple platforms like boost.interprocess or ACE.
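For completeness, a rough sketch of the process-shared mutex approach mentioned above (POSIX; the shared-memory name is made up, error handling is omitted, and a real program would let only one process initialize the mutex):

#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
    // Create (or open) a shared-memory object sized to hold the mutex.
    int fd = shm_open("/demo_mutex", O_CREAT | O_RDWR, 0666);
    ftruncate(fd, sizeof(pthread_mutex_t));
    pthread_mutex_t* mtx = static_cast<pthread_mutex_t*>(
        mmap(nullptr, sizeof(pthread_mutex_t),
             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));

    // One process initializes the mutex with the process-shared attribute.
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(mtx, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(mtx);
    // ... critical section visible to any process that maps /demo_mutex ...
    pthread_mutex_unlock(mtx);

    munmap(mtx, sizeof(pthread_mutex_t));
    close(fd);
}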
fcntl() and flock() locks are per-process, not per-thread, so they cannot be used for thread synchronization.