Is there a way to limit CPU usage in C++?
I mean, can I write a program that takes a target CPU percentage as input and uses at most that much CPU?
I basically want to add the functionality of cpulimit command internally in the program.
If there is, how do I do it?
Edits:
Environment: Linux (Debian) with gcc 6.1. It should support as wide a range of values as possible, i.e. 1%-100%. If the OS cannot honour a given value, an error can be logged and the nearest supported value used, or any other behaviour that is recommended when the OS restricts that number.
Linux does not provide the means to set a specific percentage. However, the nice(2) system call lowers the priority of the process in relation to other processes on the system, thus achieving some sort of a relative percentage of CPU, in relation to other processes on the system.
You can also use the setrlimit(2) system call to set your process's RLIMIT_CPU, though note that it caps the total CPU time consumed (in seconds), not a percentage.
Use getrusage(), see Linux commands to detect the computer resource usage of a program
And when you check and find you've used as many milliseconds of CPU time as your budget allows, use nanosleep() to sleep for a few milliseconds. Adjust the ratio of work to sleep to match your target percentage.
Related
I'm interested in finding the total activity of the system between two instants in time. Under Linux, I can do this by reading /proc/stat twice. Under MacOS, I can call host_statistics twice. Under BSDs, I can also manage with a sysctl.
However, I have no idea how to do this under Windows. The closest equivalent I have found is \Processor(*)\% Processor Time, but this only gives me some kind of "current activity" (10ms-scale, if I understand correctly), rather than a total activity.
I am coding in regular (unmanaged) C++. My code needs to check the system use every ~10-15 seconds.
Ah, it seems that GetSystemTimes will provide the data I need.
My program is in C++ and I have one server listening to a number of clients. Clients send small packets to server. I'm running my code on Ubuntu.
I want to measure the CPU utilization and possibly total number of CPU cycles on both sides, ideally with a breakdown on cycles/utilization spent on networking (all the way from NIC to the user space and vice versa), kernel space, user space, context switches, etc.
I did some searching, but I couldn't figure out whether this should be done inside my C++ code, with an external profiler, or perhaps some other way.
Your best friend/helper in this case is the /proc file system in Linux. In /proc you will find CPU usage, memory usage, power usage etc. Have a look at this link
http://www.linuxhowtos.org/System/procstat.htm
You can even check each process's CPU usage by looking at the file /proc/process_id/stat.
Take a look at the RDTSCP instruction and other ways to measure performance metrics. System simulators like SniperSim, Gem5, etc. can also give the total cycle count of your running program (however, they may not be very accurate; certain conditions need to be met, e.g. core frequencies being equal).
As I commented, you probably should consider using oprofile. I am not very familiar with it (and it may be complex to use and may require system-wide configuration).
I've written a C++ program that does some benchmarking on a couple of algorithms. Some of these algorithms use other libraries for their calculations. These external libraries (which I don't have control over) use multi-threading, which makes it difficult to get a proper benchmark (some algorithms are single-threaded, some are multi-threaded).
So when doing the benchmarking, I want to limit the threads to 1. Is there any way I can start a program in Linux and tell it to use a maximum of 1 thread, instead of the default in the external libraries (which is equal to the number of cores)?
I'm not sure what you ask is possible from an OS perspective (what could the OS do? return failure when the next thread is requested?).
I'd search in the libraries documentation, they will probably provide configuration values for the number of threads they're using.
Alternatively, you can set processor affinity using the taskset command
It is not exactly what you requested, but it ensures that your program will be running on a given CPU (or a set of CPUs). So regardless of the number of threads spawned, the program will be using at any time at most one CPU (or the number of CPUs specified), so the effective parallelism will be controlled.
Alternative #2, triggered by the comment from @goocreations... One way of fooling the programs into believing they have a different number of CPUs available is to run them in a virtual machine (e.g. VirtualBox)
I need to get exact timestamps every couple of ms (20, 30, 40ms) over a long period of time (a couple of hours). The function in which the timestamp is taken is invoked as a callback by a 3rd-party library.
Using GetSystemTime() one can get the correct system timestamp but only with milliseconds accuracy, which is not precise enough for me. Using QueryPerformanceTimer() yields more accurate timestamps but is not synchronous to the system timestamp over a long period of time (see http://msdn.microsoft.com/en-us/magazine/cc163996.aspx).
The solution provided at the site linked above somehow works only on older computers; it hangs while synchronizing when I try to use it with newer computers.
It seems to me that Boost also only offers millisecond accuracy.
If possible, I'd like to avoid using external libraries, but if there's no other choice I'll go with it.
Any suggestions?
Deleted article from CodeProject; this seems to be the copy: DateTimePrecise C# Class. The idea is to use the QueryPerformanceCounter API for accurate small increments and periodically adjust it in order to keep long-term accuracy. This gives roughly microsecond accuracy ("roughly" because it's still not exactly precise, but still quite usable).
See also: Microsecond resolution timestamps on Windows
Which language are you using?
In Java (1.5 or above) I'd suggest 'System.nanoTime()' which requires no import.
Remember in Windows that time-slice granularity is 1000ms / 64 = 15.625ms.
This will affect inter-process communication, especially on uni-processor machines or machines that run several CPU-heavy processes 'concurrently'.
In fact, I just got DOS 6.22 and Windows for Workgroups 3.11/3.15 via eBay, so I can screenshot the original timeslice configuration for uni-processor Windows machines of the era when I started to get into it. (Although it might not be visible in versions above 3.0).
You'll be hard pressed to find anything better than QueryPerformanceTimer() on Windows.
On modern hardware it uses the HPET as a source which replaces the RTC interrupt controller. I would expect QueryPerformanceTimer() and the System clock to be synchronous.
There is no such QueryPerformanceTimer() on Windows. The API is named QueryPerformanceCounter(). It provides a counter value counting at some higher frequency.
Its incrementing frequency can be retrieved by a call to QueryPerformanceFrequency().
Since this frequency is typically in the MHz range, microsecond resolution can be observed.
There are some implementations around, e.g. this thread or the Windows Timestamp Project.
At my company, we often test the performance of our USB and FireWire devices under CPU strain.
There is test code we run that loads the CPU, and it is often used in really simple, informal tests to see what happens to our device's performance.
I took a look at the code for this, and it's a simple loop that increments a counter and does a calculation based on the new value, storing the result in another variable.
Running a single instance will use 1/X of the CPU, where X is the number of cores.
So, for instance, if we're on an 8-core PC and we want to see how our device runs under 50% CPU usage, we can open four instances of this at once, and so forth...
I'm wondering:
What decides how much of the CPU gets used up? Does it just run everything as fast as it can on a single thread, in a single-threaded application?
Is there a way to voluntarily limit the maximum CPU usage your program can use? I can think of some "sloppy" ways (adding sleep calls or something), but is there a way to limit it to, say, some specified percentage of the available CPU?
CPU quotas on Windows 7 and on Linux.
Also on QNX (e.g. the BlackBerry Tablet OS) and LynuxWorks
In case of broken links, the articles are named:
Windows -- "CPU rate limits in Windows Server 2008 R2 and Windows 7"
Linux -- "CPU Usage Limiter for Linux"
QNX -- "Adaptive Partitioning"
LynuxWorks - "Partitioning Operating Systems" and "ARINC 653"
The OS usually decides how to schedule processes and on which CPUs they should run. It basically keeps a ready queue of processes which are ready to run (not marked for termination and not blocked waiting for I/O, an event, etc.). Whenever a process uses up its timeslice or blocks, it frees a processing core and the OS selects another process to run. Now if you have a process which is always ready to run and never blocks, then this process essentially runs whenever it can, pushing the utilization of a processing unit to 100%. Of course, this is a somewhat simplified description (there are things like process priorities, for example).
There is usually no generic way to achieve this. The OS you are using might offer some mechanism to do it (some kind of CPU quota). You could try measuring how much wall-clock time has passed vs. how much CPU time your process used up, and then put your process to sleep for certain periods to approximate the desired CPU utilization.
You've essentially answered your own questions!
The key trait of code that burns a lot of CPU is that it never does anything that blocks (e.g. waiting for network or file I/O), and never voluntarily yields its time slice (e.g. sleep(), etc.).
The other trick is that the code must do something that the compiler cannot optimize away. So, most likely your CPU burn code outputs something based on the loop calculation at the end, or is simply compiled without optimization so that the optimizer isn't tempted to remove the useless loop. Since you're trying to load the CPU, there's no sense in optimizing anyways.
As you hypothesized, single-threaded code matching this description will saturate a CPU core, unless the OS has more such processes than cores to run them; in that case it will round-robin schedule them and the utilization of each will be some fraction of 100%.
The issue isn't how much time the CPU spends idle, but rather how long it takes for your code to start executing. Who cares if it's idle or doing low-priority busywork, as long as the latency is low?
Your problem is fundamentally a consequence of using a synthetic benchmark, presumably in an attempt to obtain reproducible results. But synthetic benchmarks tend to produce meaningless results, so reproducibility is moot.
Look at your bug database, find actual customer complaints, and use actual software and test hardware to reproduce a situation that actually made someone dissatisfied. Develop the performance test in parallel with hard, meaningful performance requirements.