Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 months ago.
Improve this question
I am using Pagmo, a C++ API, for my optimization problem. Pagmo launches a new thread each time an optimization is started via island.evolve(). My point is that I don't have fine-grained control here over the type of thread that's launched "under the hood" by Pagmo. I can query Pagmo threads for their status, i.e. whether they have completed their run. I have a machine with 28 physical cores, and I'm thinking that the optimal number of running threads would be on the order of 28. Right now, my code just dumps the whole lot of threads onto the machine - substantially more than the number of cores - which is likely very inefficient.
I'm thinking of using a std::counting_semaphore (C++20) with the counter set to 28. Each time I launch a thread via Pagmo, I would decrement the semaphore counter. When it hits 0, the remaining threads would block and wait until the counter was incremented again.

I could then run a loop that queries the Pagmo threads for their status and increments the semaphore counter each time a thread goes idle (meaning its task has completed). The Pagmo threads are ultimately joined. Each time the counter rises above 0, a new thread is allowed to start - so I believe.
My questions are:
Is this the best practice for limiting/throttling the number of running threads in modern C++?
Is there a better way I haven't thought of?
Is there a way to query Linux in real time, as to the number of running threads?
Thanks in advance!
Phil
I had tried a simple loop to launch and throttle thread creation, but it didn't work well and threads were launched too quickly.
First of all, your post could use some editing, and perhaps a code snippet would help us understand the problem better. Right now I'm only going through the documentation based on a guess at what you are doing.

I've quickly checked what Pagmo is about, and my first advice would be to be careful about limiting, from outside the library, any library that is designed for parallel computation.
I will try to answer your questions:
I do not think that this is the best way to throttle threads created by an external library
Yes. First of all, I've checked the Pagmo API documentation, and if I understand you correctly, you are using the island class. Based on what they state in their documentation, the default class that inherits from island and is constructed by the default ctor is thread_island (at least on non-POSIX systems, which may not be your case). However, thread_island can be constructed via the thread_island(bool use_pool) ctor, which indicates that you can tell these island classes to use a common thread pool. And if that is done for non-POSIX systems, it is most likely done for POSIX systems as well. So if you wish to limit the number of threads, I would do it via the thread pool.
You can limit the maximum number of threads running on Linux via /proc/sys/kernel/threads-max, and you can also instruct a Linux system to treat some processes as less important via niceness.
Hope it helps!
EDIT: As a footnote, I will also mention that the documentation actually encourages the use of thread_island even on POSIX systems. See this link here.
EDIT2: In case you must use fork_island due to the issues they mention when thread safety cannot be guaranteed, another option would be to limit available resources via setrlimit (see this link right here) - you are interested in setting RLIMIT_NPROC.
I have around 50 similar tasks - each one performs network calls to some server and writes the response to the db. Each task has its own set of calls it should make. Each task talks to one server, if that's relevant.
I can have one process with multiple threads, each thread running its own task with its own set of calls (so no memory sharing is needed between the threads). Another option is a separate process for each task. Or a combination.
Should I choose multiple threads because switching between processes is more costly?
It runs on Ubuntu.
I don't care about the cost of the creation of the thread/process. I want to start them and let them run for a long time.
Strictly in terms of performance, it comes down to initialization time because as far as execution time goes, in this particular case that there is no interprocess communication required, execution time will be very similar.
Threads are faster to initialize, since there is less duplication of resources.
The oft-advertised claim that "threads switch faster than processes", as in this StackOverflow question/answer, is a non sequitur here. If a batch is running at 100% CPU and there are enough free cores to meet demand, the OS will never bother to move threads onto new cores. That only happens when threads block waiting for input (or a chance to output) from the network or disk - in which case the 3-20 microsecond price of the block completely overshadows the few nanoseconds of advantage from one fewer memory-cache miss.

The only reason the operating system would preempt a thread or process, other than an I/O block, is when the machine is running out of computing resources, i.e. running at more than 100% on every available CPU. But this is such a special, degenerate case that it should not drive a general answer.
However, if the running time is much larger than the initialization time - say the process runs for several minutes while it takes a few milliseconds to start up - then performance will be essentially the same, and your deciding criteria should be something else, such as ease of administering the batch. Another point to keep in mind: if one process dies, the others run to completion, while if one thread crashes, the entire batch dies with it.
There might be some extra benefits of using threads as well due to the lesser use of system resources but I believe these would be negligible.
What are the different types of threads in C++?

I already know multiprocessing and multithreading. I know how to create threads in standard C++ and VC++, but I'm not sure what is meant by different types of threads.
From the software point of view, there are no "different types of threads". A thread is a thread. Are there different types of pixels on the screen? No - it's similar. However, in some contexts you MAY differentiate threads by their intended purpose. You can have:
os threads (or kernel threads) vs user threads (or application threads)
main thread vs ad-hoc threads vs pooled threads
background threads vs high-priority threads
etc
but a thread is a thread. They are all the same in terms of their basic properties: they run some specified code. The difference lies in how they are used, what priority they have (= how often they get processor time to do their work), or what code they are allowed to run.
...OK, thinking a bit more about the terms used in different contexts: ACTUALLY, there are two types of threads, and both are just called "threads":
software threads
hardware threads
The difference is that the former is what the operating system's scheduler manages (creates/wakes/puts to sleep/kills/etc.). The number of those is limited virtually only by available memory; you may have 100, 1000, or 10000 software threads, no problem. The latter refers to the actual electronic structures that execute them, and there is always a much lower limit. Not long ago, each CPU could execute just a single thread - if you wanted to run 8 threads, you needed an 8-CPU motherboard. Today, most CPUs have multiple cores, each able to run one or a few hardware threads, for perhaps 4-16 hardware threads per chip.

However, in my circles, when someone says "a thread" they mean a software thread, and when someone wants to refer to the latter they say explicitly "a hardware thread". That's why I didn't think of this at first - I'm probably more of a software guy - whereas on a hardware team, "thread" may mean "hardware thread" by default.
In general, there are two types of multitasking: process-based and thread-based.

Process-based multitasking handles the concurrent execution of separate programs - something like two people doing the same task, or one person doing a task while a second person does a sub-task of it.

Thread-based multitasking deals with the concurrent execution of pieces of the same program - something like using different parts of your body for one piece of work (multitasking, so to speak).

I don't know whether my analogies above match your understanding.
For further information, you can follow this link.
Linux: how do I detect the process that consumes the most memory and kill it, using std::thread? I am new to C++ coding, so an explanation with C++ code implementing the function would be highly appreciated.

The exact text of the assignment is to write C++ code that monitors the memory usage of the device and is aware when the device has reached its targeted maximum memory usage. When the thread detects this condition, it shall identify the process which is taking the most memory and do the following: check the process against an application priority list. If the process is in the low-priority category, stop the process and restart it. Otherwise, inform the user that the memory overrun happened because of the identified process, and restart based on the user's confirmation. The restart shall be either a device restart or a process restart, decided based on the nature of the process that caused this condition. The details shall be captured in the logging file.
You might want to look at the Linux OOM (Out-of-memory) killer.
From this link:
It is the job of the linux 'oom killer' to sacrifice one or more processes in order to free up memory for the system when all else fails.
So, technically, you don't need to do anything about it. ;-)
But if you still want to write it yourself, with your own criteria for choosing and killing the victim process, you may create a Linux service (which runs in the background all the time) to do that. Sample code is in the linked article.
Regarding your std::thread point, if you already have an executable and you want to spawn a dedicated thread to do this, yes, you can do that also. The logic will simply move into that thread.
Additional reading:
How to Configure the Linux Out-of-Memory Killer
Sorry, my first question here. I'm not sure I'm the first to ask this, but I could not find answers anywhere.

Modern CPUs are heavily multi-threaded/multi-core, but Linux does not guarantee that processes/threads physically run at the same time (time sharing).

I'd like my (C++) programs to take advantage of this hardware: spawn small tasks (update a hash, copy some data) while the main thread carries on. The goal is to make the program run faster. Since it makes no sense to spawn a 500 ns task and then wait 1 ms for its execution, I'd like to be (almost) sure that the task will really execute at the same time as the main thread.
I could not find any paper or discussion on this subject, but I'm not sure I'm searching properly - I just don't know what this would be called.

Could someone tell me:

- what is the name of such parallel (simultaneous) execution?
- is this possible on Linux (or which kind of OS offers such a service)?
Thanks
I realized that my question was more OS- than programming-oriented, and that I should ask it on a more appropriate site, here:
https://softwareengineering.stackexchange.com/questions/325257/possiblity-to-request-several-linux-threads-scheduled-together-in-the-same-time
Thanks for the answers; they helped me make progress and better define what I'm looking for.
What you are looking for is cpu thread affinity and cgroups under Linux. There is a lot of complexity to it and you will need to experiment with your particular requirements.
A common strategy in low-latency applications is to assign a CPU solely to a particular process or thread. The thread runs "hot", never releasing the CPU to any other process, including the kernel.

The Reactor pattern is useful for this model. A queue is set up on a hot thread, with other threads feeding the queue. The queue itself can become a bottleneck, but that's a whole other discussion. So in your example, the main (hot?) thread would write events into the queues of the other hot worker threads, and some sort of rendezvous event would signal to the main thread that the workers are finished.
This strategy is only useful for CPU bound applications. If your application is I/O bound then the kernel will likely do a better job than a custom algorithm.
To directly answer your question, yes this is possible in Linux with C/C++ but it is not trivial.
I have some old articles on my blog that may be of interest.
http://matthewericfisher.tumblr.com/post/6462629082/low-latency-highly-scalable-robust-systems
--Matt
I am developing a C++ application using the pthreads library. Every thread in the program accesses a common unordered_map. The program runs slower with 4 threads than with 1. I commented out all the code in the thread body and left only the part that tokenizes a string. The single-threaded execution was still faster, so I concluded that the map wasn't the problem.

After that, I printed the thread IDs to the screen, and they appeared to execute sequentially.

In the function that launches the threads, I have a while loop that creates threads into an array whose size is the number of threads (say 'tn'). Each time tn threads have been created, I execute a for loop to join them (pthread_join). The while loop runs many times (not just 4).
What may be wrong?
If you are running a small, trivial program, this tends to be the case: the work to start the threads, schedule them, run, context-switch, and then synchronize can actually take more time than running the job single-threaded.

The point here is that for trivial problems, multithreading can run slower. BUT another factor is how many cores your CPU actually has.
When you run a multithreaded program on a single core, the threads are processed sequentially, interleaved according to the CPU clock.

You only get true parallelism with multiple cores, and even then roughly one thread runs per core at a time.

Now, given that you (most likely) have several threads sharing a core, keep in mind the overhead the CPU incurs for:

allocating clock time to each thread

synchronizing thread access to various internal CPU operations

other thread-priority operations

So, in other words, for a simple application, multithreading is actually a downgrade in terms of performance.

Multithreading comes in handy when you need an asynchronous operation (meaning you don't want to wait for a result, such as loading an image from a URL or streaming geometry from an HDD, which is slower than RAM).

In such scenarios, applying multithreading leads to a better user experience, because your program won't hang when a slow operation occurs.
Without seeing the code it's difficult to tell for sure, but there could be a number of issues.
Your threads might not be doing enough work to justify their creation. Creating and running threads is expensive, so if your workload is too small, they won't pay for themselves.
Execution time could be spent mostly doing memory accesses on the map, in which case mutually excluding the threads means that you aren't really doing much parallel work in practice (Amdahl's Law).
If most of your code runs under a mutex, it will run serially, not in parallel.