creating thread on another core? (WinAPI) - c++

I was wondering if there is a way to run a thread on a separate core, instead of just another thread on the same core?
Thanks

If you create a thread, you have by default no control over which core it will run on. The operating system's scheduling algorithm takes care of that, and it is pretty good at its job. However, you can use the SetThreadAffinityMask WinAPI function to specify the logical cores a thread is allowed to run on.
Don't do that unless you have very good reasons. Quoting MSDN:
Setting an affinity mask for a process or thread can result in threads receiving less processor time, as the system is restricted from running the threads on certain processors. In most cases, it is better to let the system select an available processor.
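For reference, here is a minimal sketch of what such a call could look like. The helper name pin_current_thread_to_core is made up for illustration, and the Linux branch (sched_setaffinity) is an assumption added only so the snippet also builds outside Windows. The WinAPI part is simply SetThreadAffinityMask applied to GetCurrentThread().

```cpp
#include <cstddef>
#if defined(_WIN32)
#  include <windows.h>
#elif defined(__linux__)
#  include <sched.h>   // sched_setaffinity, cpu_set_t (glibc)
#endif

// Hypothetical helper: restrict the calling thread to one logical core.
// Returns false if the index is out of range, the OS rejects the request,
// or the platform is not handled here.
bool pin_current_thread_to_core(std::size_t core)
{
#if defined(_WIN32)
    // The mask has one bit per logical processor in the thread's processor group.
    if (core >= sizeof(DWORD_PTR) * 8) return false;
    const DWORD_PTR mask = static_cast<DWORD_PTR>(1) << core;
    // SetThreadAffinityMask returns the previous mask, or 0 on failure.
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
#elif defined(__linux__)
    if (core >= static_cast<std::size_t>(CPU_SETSIZE)) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
#else
    (void)core;
    return false;
#endif
}
```

Calling pin_current_thread_to_core(1) at the top of a thread function would restrict that thread to logical core 1. It does not keep other threads off that core, so the MSDN warning still applies.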

Related

Allocating specific logical cores to specific processes exclusively, Windows, C++

If possible, I would like to allocate a logical core to a single process exclusively.
I am aware that Winbase.h contains Get/SetProcessAffinityMask and SetThreadAffinityMask.
I can enumerate all processes that are running when my process starts and set their affinities to other logical cores. However, I do not want to check all processes periodically, for instance to deal with processes launched after mine.
Furthermore, there will be other processes that need exclusive use of specific logical cores (no other process shall waste resources on that logical core). For instance, my process shall run on core 15, but another shall run only on core 14.
Is there a better and more permanent way to allocate specific logical cores to specific processes than the Get/SetProcessAffinityMask scheme mentioned above?
Windows is not a real-time operating system. Windows is designed to do preemptive multitasking with isolated processes, like basically any other modern desktop OS. A process is not supposed to just lock out every other process from a particular core, therefore, there is no API to explicitly do so (at least I'm not aware of one). It's up to the OS scheduler to decide which threads get to run when and where. That's the whole idea. You can use thread priorities to tell the scheduler that certain threads should be given a chance to run over others. You can use affinity masks to tell the scheduler which cores a thread can be scheduled to. You can even set a preferred core for your thread. But you don't get to schedule threads yourself.
Note that there's apparently a way to get something a bit like what you're looking for to work on Linux (see this question for more). I don't think similar possibilities exist on Windows. Yes you could try to hack together some solution based on a background task that continuously monitors and adjusts the priorities and affinity masks of all the threads in the system to approximate the desired behavior (like the person in the question linked by Ben Voigt above has apparently tried, and failed to achieve). But why would you want to do that? It goes completely against the very nature of everything an OS like Windows is designed to do. To me, what you are asking sounds a lot like what you're really looking for is a completely different kind of operating system, or maybe even no operating system at all. Boot the CPU straight into your own image and you get to drive all the cores in whatever way you fancy…

Ensure that each thread gets a chance to execute in a given time period using C++11 threads

Suppose I have a multi-threaded program in C++11, in which each thread controls the behavior of something displayed to the user.
I want to ensure that for every time period T during which one of the threads of the given program have run, each thread gets a chance to execute for at least time t, so that the display looks as if all threads are executing simultaneously. The idea is to have a mechanism for round robin scheduling with time sharing based on some information stored in the thread, forcing a thread to wait after its time slice is over, instead of relying on the operating system scheduler.
Preferably, I would also like to ensure that each thread is scheduled in real time.
In case there is no way other than relying on the operating system, is there any solution for Linux?
Is it possible to do this? How?
No, that is not possible in a cross-platform way with C++11 threads. How often and for how long a thread runs isn't up to the application. It's up to the operating system you're using.
However, there are functions that let you tell the OS that a particular thread or process is especially important, so you can influence the timing somewhat for your purposes.
You can obtain the platform-dependent thread handle in order to call OS functions:
native_handle_type std::thread::native_handle //(since C++11)
Returns the implementation defined underlying thread handle.
To stress this again: this requires an implementation that is different for each platform!
Microsoft Windows
According to the Microsoft documentation:
SetThreadPriority function
Sets the priority value for the specified thread. This value, together
with the priority class of the thread's process determines the
thread's base priority level.
Linux/Unix
On Linux things are more difficult, because there are several different policies under which threads can be scheduled. Microsoft Windows uses a priority system, but on Linux that doesn't seem to be the default scheduling policy.
For more information, please take a look at this Stack Overflow question (it should be the same for std::thread because of this).
I want to ensure that for every time period T during which one of the threads of the given program have run, each thread gets a chance to execute for at least time t, so that the display looks as if all threads are executing simultaneously.
You are using threads to make it seem as though different tasks are executing simultaneously. That is not recommended for the reasons stated in Arthur's answer, to which I really can't add anything.
If, instead of long-lived threads each doing its own task, you can express the work as tasks that can be executed without mutual exclusion, you can have a single queue of tasks and a thread pool dequeuing and executing them.
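A rough sketch of that queue-plus-pool arrangement, assuming tasks are plain std::function<void()> callables. The class name TaskPool and its interface are invented for illustration and are not taken from any particular library.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: a fixed set of workers draining one shared queue.
class TaskPool {
public:
    explicit TaskPool(std::size_t workers) {
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Lets the workers finish every task still queued, then joins them.
    ~TaskPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    TaskPool(const TaskPool&) = delete;
    TaskPool& operator=(const TaskPool&) = delete;

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // only reached once stopping_ is set
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

Usage would be along the lines of TaskPool pool(4); pool.submit([] { /* update one display element */ });. Which worker picks up which task, and when, is still decided by the OS scheduler.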
If you cannot, you might want to look into wait-free data structures and algorithms. In a wait-free algorithm or data structure, every thread is guaranteed to complete its work in a finite (and even specified) number of steps. I can recommend the book The Art of Multiprocessor Programming, where this topic is discussed at length. The gist of it is: every lock-free algorithm or data structure can be modified to be wait-free by adding communication between threads, through which a thread that's about to do work makes sure that no other thread is starved or stalled. Basically, you prefer fairness over the total throughput of all threads. In my experience this is usually not a good compromise.

Make sure that main thread run on it's own core alone

I have a main thread which does some not-so-heavy work, and I'm also creating worker threads which do very heavy work. All the documentation and examples show how to create a number of threads equal to std::thread::hardware_concurrency(). But since the main thread already exists, the total number of threads becomes std::thread::hardware_concurrency() + 1. For example:
My machine supports 2 hardware threads.
In the main thread I create these 2 threads, and the total number of threads becomes 3.
The core running the main thread does its own job plus (probably) a worker's job.
Of course I don't want this, because the UI (which is handled in the main thread) becomes unresponsive due to latency. What will happen if I create std::thread::hardware_concurrency() - 1 threads? Will that guarantee that the main thread, and only the main thread, runs on a single core? How can I check it?
P.S.: I'm using a sort of pool: I start the threads at program start and stop them on exit. During execution, all worker threads run an infinite while loop.
As others have written in the comments, you should carefully consider whether you can do a better job than the OS.
That being said, it is technically possible:
Use the native_handle method to get the OS's handle to your thread.
Consult your OS's documentation for setting the thread affinity. E.g., using pthreads on Linux, you'd want pthread_setaffinity_np.
This gives you full control over where each thread runs. In particular, you can give one of the threads a core of its own.
Note that this isn't part of the standard, as it is a level that is not portable. This might serve as another hint that it's possibly not what you're looking for.
No - std::thread::hardware_concurrency() only gives you a hint about the number of cores potentially available for multithreading. You might be interested in CPU Affinity Masks (Putting Threads on different CPUs). This works at the pthread level, which you can reach via std::thread::native_handle (http://en.cppreference.com/w/cpp/thread/thread/native_handle)
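Since the value is only a hint (and is allowed to be 0 when it cannot be determined), a pool that wants to leave some headroom for the main thread could size itself roughly like this. Both function names are made up for the example.

```cpp
#include <thread>

// Hypothetical helper: number of workers that leaves one hardware thread's
// worth of capacity for the main (UI) thread. `hint` is normally the result
// of std::thread::hardware_concurrency(), which may be 0 if unknown.
unsigned worker_count_for(unsigned hint)
{
    if (hint <= 1) return 1;   // unknown or single core: still need one worker
    return hint - 1;
}

unsigned default_worker_count()
{
    return worker_count_for(std::thread::hardware_concurrency());
}
```

This only reduces oversubscription. It does not reserve a core: the scheduler is still free to run a worker on the same core as the main thread.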
Depending on your OS, you can get the thread's native handle, and control their priority levels using pthread_setschedparam(), for example giving the worker threads a lower priority than the main thread. This can be one solution to the UI problem. In general, number of threads need not match number of available HW cores.
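A sketch of lowering a worker's priority through its native handle. The function name lower_priority is hypothetical. The Windows branch assumes native_handle() yields a Win32 HANDLE (as with MSVC's std::thread), and the Linux branch assumes the Linux-specific SCHED_IDLE policy is acceptable for the workers. Other platforms are simply reported as unsupported.

```cpp
#include <thread>
#if defined(_WIN32)
#  include <windows.h>
#elif defined(__linux__)
#  include <pthread.h>
#  include <sched.h>
#endif

// Hypothetical helper: ask the OS to favor other threads over `worker`.
// Returns false if the thread is not joinable, the request is rejected,
// or the platform is not handled here.
bool lower_priority(std::thread& worker)
{
    if (!worker.joinable()) return false;   // no underlying OS thread
#if defined(_WIN32)
    // SetThreadPriority returns nonzero on success.
    return SetThreadPriority(static_cast<HANDLE>(worker.native_handle()),
                             THREAD_PRIORITY_BELOW_NORMAL) != 0;
#elif defined(__linux__)
    // SCHED_IDLE accepts only a static priority of 0.
    sched_param param{};
    param.sched_priority = 0;
    return pthread_setschedparam(worker.native_handle(), SCHED_IDLE, &param) == 0;
#else
    return false;
#endif
}
```

The idea would be to call lower_priority(t) on each worker right after creating it, so the main thread wins whenever both want the same core. Whether the request succeeds depends on the OS and on the process's permissions.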
There are definitely cases where you want to be able to gain full control, and reliably analyze what is going on. You are using Windows, but as an example, it is possible on a multicore machine to exclude e.g. one core from the normal Linux OS scheduler, and use that core for time-critical hard real-time tasks. In essence, you will own that core and handle interrupts for it, thereby enabling something close to hard real-time response times and predictability. Requires careful programming and analysis, and takes a significant effort. But very attractive if done right.

how to run each thread on other core?

I have a UDP server that receives data and processes it.
I have two threads for each role.
My CPU has 8 cores, and I send data at various speeds.
But at most I use only 14% of my CPU: two cores at 50% each. If I send a larger volume of data, my buffer fills up and the CPU usage does not increase.
Why does each core only reach 50% and not more?
I am thinking of dividing these two roles across multiple cores.
I want to be sure that each one runs on a different core.
How can I explicitly choose that each thread runs on a different core?
My program is written in C++ with Visual Studio 9, runs on Windows 7, and uses boost::thread.
The scheduler decides where your threads run. This is OS specific, so if you want to alter how your code is run, you need an OS-specific API that lets you set a thread's affinity and so on.
Also, it depends on what your application is like. It looks like a client/server application, so it's not totally CPU bound. How many threads do you have in total? You mention 2 per role. A thread can only run on one CPU at a time. Try to make units of work that can truly run in parallel; that way they can run independently, ideally on different cores.
The OS will generally do a good job of running your code since it will have a better overall picture.
You cannot make one thread use more than one core. To achieve better CPU utilization you need to redesign your program to create more threads and let the OS schedule them for you. There's no need to manually restrict the threads to specific cores. OSes are really good at figuring out how to allocate cores to threads.
In your case, if the data computing tasks are CPU heavy, you could spawn a new thread per request or have a worker thread pool that would be picking incoming tasks and processing them. This is just one of ideas. It's difficult to say without knowing more about your application architecture and the problems it's trying to solve.
In each thread you can use SetThreadAffinityMask to choose the CPUs that the thread should run on. But I suggest you create a new worker thread for each incoming request (and if you use a thread pool, you will see a considerable performance boost).
Take care that the compiler and linker settings enable multithreading.
Best practice is also not to start many threads, but to use long-lived threads which work through queued tasks such as computations or downloads.

Allocate more processor cycles to my program

I've been working with Win32, C and C++ for a while. I code in Visual Studio. Most of the time I see that the System Idle Process shows the highest CPU utilization. Is there a way to allocate more processor cycles to my program so that it runs faster? I understand there might be limitations from I/O; in those cases this question doesn't make any sense.
OR
Did I misunderstand the Task Manager numbers? I'm confused, please help me out.
I want to do something in the program itself, and I will be happy if answers are specific to Windows.
Thanks in advance
~calvin
If your program is the only program that has something to do (not waiting for I/O), its thread will always be assigned to a processor core.
However, if you have a multi-core processor and a single-threaded program, the CPU usage of your process displayed in the Task Manager will always be limited to 100/Ncores percent.
For example, if you have a quad-core machine, your process will be at 25% (using one core), and the idle process at around 75%. You can only get additional CPU power by dividing your task into chunks that can be worked on by separate threads, which will then be run on the idle cores.
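To make the "divide into chunks" idea concrete, here is a small sketch that sums a vector by giving each thread one contiguous chunk. The function name parallel_sum and the summing workload are just stand-ins for whatever CPU-bound work the program really does.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical example: sum `data` using `workers` threads, one chunk each.
long long parallel_sum(const std::vector<int>& data, unsigned workers)
{
    if (workers == 0) workers = 1;
    // One result slot per thread, so no locking is needed.
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> threads;
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        threads.emplace_back([&data, &partial, w, begin, end] {
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (std::thread& t : threads) t.join();   // wait for every chunk
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Passing std::thread::hardware_concurrency() as the worker count lets the OS spread the chunks over the idle cores. For small inputs the cost of starting threads will likely outweigh the gain.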
The idle process only "runs" when no other process needs to. If you want to use more CPU cycles, then use them.
If your program is idling, it doesn't do anything, i.e. there is nothing that could be done any faster. So the CPU is probably not the bottle-neck in your case.
Are you maybe waiting for data coming from the disk or network?
In case your processor has multiple cores and your program uses only one core to its full extent, making your program multi-threaded could work.
In a multitasking, multithreaded OS, processor time is split among threads.
If you want a specific thread to get a bigger share of time, you can set its priority with the SetThreadPriority function. It is not wise to do so, though.
Only special software should mess with those settings.
It's common for windowed applications to show a low CPU usage percentage in the Task Manager, because most of the time they are just waiting for messages.
Use threads to:
abstract away all the I/O waits.
assign work to all cores.
also, remove all sleep-wait states from main thread.
Defer all I/O to a separate thread, so that wait states are confined to it. Keep the actual computations in the foreground thread, and use synchronization mechanisms that make the I/O thread wait for your main thread when they communicate.
If your CPU is multi-core and your problem is parallelizable, create as many threads as you have cores, look into the "set affinity" functions to distribute them across the cores, and still keep a separate thread for all I/O.
Also pay attention not to wait in your main thread - usleep(1) doesn't send you into background for 1 microsecond, but for "no less than..." and that may mean anything between 1ms and 100ms but hardly ever less than that, and never anything close to a microsecond.
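You can check the "no less than" behaviour on your own machine with something like the following. It uses std::this_thread::sleep_for as a portable stand-in for usleep, and the function name measure_sleep is invented for the example.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: request a sleep of `requested` and return how long
// the calling thread was actually away, measured with a monotonic clock.
std::chrono::nanoseconds measure_sleep(std::chrono::nanoseconds requested)
{
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);   // blocks for at least `requested`
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
}
```

Printing measure_sleep(std::chrono::microseconds(1)).count() a few times should show how far the real delay is from the requested microsecond. The exact figures depend on the OS, its timer resolution and the current load.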