I'm trying to make a time meter. My OS is Windows. Here is a small piece of code that gives a strange result: if a thread sleeps for 1 ms and does this 5000 times, I would expect it to take roughly 5 seconds, but instead I get test_time = 12.8095.
I don't understand why.
How can I fix the code so that I get a time meter that can measure durations of about 1 millisecond?
std::atomic_bool work = true;
size_t cnt{};
std::chrono::duration<double> operation_time;
double test_time;
auto start_time_ = std::chrono::high_resolution_clock::now();
std::thread counter_ = std::thread([&work, &cnt]() {
    while (work) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        cnt++;
        if (cnt >= 5000)
            work = false;
    }
});
if (counter_.joinable())
    counter_.join();
operation_time = std::chrono::duration<double>(std::chrono::high_resolution_clock::now() - start_time_);
test_time = operation_time.count();
std::cout << "test_time = " << test_time << std::endl;
As the cppreference documentation (linked below) states, sleep_for:
"Blocks the execution of the current thread for at least the specified sleep_duration."
The answer to your question is to use sleep_until instead:
"Blocks the execution of the current thread until the specified sleep_time has been reached."
That means: take a starting timestamp, advance it by 1 ms each iteration, and sleep until that absolute time, so a late wake-up in one iteration is compensated in the next.
See:
https://en.cppreference.com/w/cpp/thread/sleep_for
https://en.cppreference.com/w/cpp/thread/sleep_until
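For example, here is a sketch of the counter loop from the question rewritten with an absolute deadline (reusing the question's variable names; this is one way to apply the suggestion, not the only one):

#include <atomic>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <thread>

int main() {
    std::atomic_bool work = true;
    std::size_t cnt{};
    auto start_time_ = std::chrono::high_resolution_clock::now();
    std::thread counter_ = std::thread([&work, &cnt]() {
        auto wake_time = std::chrono::steady_clock::now();
        while (work) {
            wake_time += std::chrono::milliseconds(1);   // next absolute 1 ms deadline
            std::this_thread::sleep_until(wake_time);    // late wake-ups are compensated on later iterations
            cnt++;
            if (cnt >= 5000)
                work = false;
        }
    });
    if (counter_.joinable())
        counter_.join();
    std::chrono::duration<double> operation_time = std::chrono::high_resolution_clock::now() - start_time_;
    std::cout << "test_time = " << operation_time.count() << std::endl;
}

Individual wake-ups can still be late because of the Windows timer granularity, but because the deadlines are absolute, a late wake-up makes the following iterations return sooner, so the 5000 iterations should total roughly 5 seconds.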
I am a beginner to C++, trying to improve my skills by working on a project.
I am trying to have my program call a certain function 100 times a second for 30 seconds.
I thought this would be a common, well-documented problem, but so far I have not managed to find a solution.
Could anyone provide me with an implementation example or point me towards one?
Notes: my program is intended to be single-threaded and to use only the standard library.
There are two reasons you couldn't find a trivial answer:
This statement "I am trying to have my program call a certain function 100 times a second for 30 seconds" is not well-defined.
Timing and scheduling is a very complicated problem.
In a practical sense, if you just want something to run approximately 100 times a second for 30 seconds, assuming the function doesn't take long to run, you can say something like:
for (int i = 0; i < 3000; i++) {
    do_something();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
This is an approximate solution.
Problems with this solution:
If do_something() takes any appreciable time, that time is added on top of every 10 ms sleep, so the error accumulates over the 3000 iterations and your timing will eventually be way off.
Most operating systems do not have very accurate sleep timing. There is no guarantee that asking to sleep for 10 milliseconds will wait for exactly 10 milliseconds. It will usually be approximately accurate.
You can use std::this_thread::sleep_until and calculate the end time of the sleep according to desired frequency:
#include <chrono>
#include <iostream>
#include <thread>
#include <type_traits>

void f()
{
    static int counter = 0;
    std::cout << counter << '\n';
    ++counter;
}

int main() {
    using namespace std::chrono_literals;
    using Clock = std::chrono::steady_clock;
    constexpr auto period = std::chrono::duration_cast<std::chrono::milliseconds>(1s) / 100; // conversion to ms needed to prevent truncation in integral division
    constexpr auto repetitions = 30s / period;
    auto const start = Clock::now();
    for (std::remove_const_t<decltype(repetitions)> i = 1; i <= repetitions; ++i)
    {
        f();
        std::this_thread::sleep_until(start + period * i);
    }
}
Note that this code will not keep up if f() takes more than 10 ms to complete.
Note: The exact duration of each sleep_until call may be off, but because the wake-up times are computed from the fixed start point rather than from "now", the errors do not accumulate.
You can't time it perfectly, but you can try like this:
using std::chrono::steady_clock;
using namespace std::this_thread;

auto running{ true };
auto frameTime{ std::chrono::duration_cast<steady_clock::duration>(std::chrono::duration<float>{1.0F / 100.0F}) };  // 1/100 s per call
auto delta{ steady_clock::duration::zero() };                  // time "owed" to call_your_function

while (running) {
    auto t0{ steady_clock::now() };
    while (delta >= frameTime) {                               // catch up on any owed calls
        call_your_function(frameTime);
        delta -= frameTime;
    }
    if (const auto dt{ delta + steady_clock::now() - t0 }; dt < frameTime) {
        sleep_for(frameTime - dt);
        delta += steady_clock::now() - t0;                     // includes the time spent sleeping
    }
    else {
        delta = dt;                                            // already behind; dt already includes the old delta
    }
}
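To satisfy the question's requirement of stopping after 30 seconds, the same idea can be wrapped into a self-contained program. This is only a sketch: call_your_function is replaced by a stand-in lambda that just counts calls, and the 30-second cutoff is checked with the same clock.

#include <chrono>
#include <iostream>
#include <thread>

int main() {
    using Clock = std::chrono::steady_clock;
    using namespace std::this_thread;

    int calls = 0;
    auto call_your_function = [&](Clock::duration) { ++calls; };   // stand-in for the real function

    const auto frameTime = std::chrono::duration_cast<Clock::duration>(std::chrono::duration<float>{1.0F / 100.0F});
    const auto stopTime = Clock::now() + std::chrono::seconds{30};
    auto delta = Clock::duration::zero();

    while (Clock::now() < stopTime) {
        const auto t0 = Clock::now();
        while (delta >= frameTime) {            // catch up on any owed calls
            call_your_function(frameTime);
            delta -= frameTime;
        }
        if (const auto dt = delta + Clock::now() - t0; dt < frameTime) {
            sleep_for(frameTime - dt);
            delta += Clock::now() - t0;         // includes the sleep
        } else {
            delta = dt;                         // already behind, so don't sleep
        }
    }
    std::cout << "calls made: " << calls << '\n';   // expect roughly 3000
}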
I have a program that runs a computation for 10+ hours. It's an entry-by-entry based task that reads a file line by line and computes on the input. At the moment it stays silent for 10 hours before spitting out a "Time elapsed: xxx minutes" message.
I would like to get updates as I go, but I would also like to over-engineer the problem such that I get updates at regular intervals. Clearly I can do some kind of
if (++tasks_processed % 100000 == 0)
    cout << tasks_processed << " entries processed...\n";
But I expect I may improve my algorithms in the near future, or simple advances in processor/disk speeds will cause my program to spam out a dozen of these messages per second in 2-3 years' time. So instead I want to future-proof my reporting intervals.
Now the alternative is to have some chrono based solution where I say
high_resolution_clock::time_point t_start = high_resolution_clock::now();
while (...) {
    // processing...
    high_resolution_clock::time_point t_now = high_resolution_clock::now();
    auto duration = duration_cast<seconds>(t_now - t_start).count();
    if (duration >= 3)
        cout << tasks_processed << " entries processed...\n";
}
But this adds a lot of overhead to a tight loop. Are there any other facilities I could make use of to achieve the desired effect?
Check this self-explanatory pseudo code as a solution:
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <conio.h>   // getch(); Windows-specific, used here only to simulate a GUI loop

void LongComputation(std::atomic<bool>& running, std::atomic<float>& progress)
{
    // do long computation
    while (running)
    {
        // update progress
    }
}

void ProgressCounter(std::atomic<bool>& running, std::atomic<float>& progress)
{
    while (running)
    {
        std::cout << progress << "\n";
        std::this_thread::sleep_for(std::chrono::seconds(3));
    }
}

int main() {
    std::atomic<bool> running{true};
    std::atomic<float> progress{0};
    std::thread t1([&running, &progress]() { LongComputation(running, progress); });
    std::thread t2([&running, &progress]() { ProgressCounter(running, progress); });
    // simulating GUI loop
    while (!getch())
    {
    }
    running = false;
    t1.join();
    t2.join();
    return 0;
}
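For the line-by-line file processing described in the question, the "update progress" placeholder in LongComputation might be filled in roughly like this. This is a sketch only: the file name, total_lines, and the per-line computation are assumptions, and it additionally needs <fstream> and <string>.

void LongComputation(std::atomic<bool>& running, std::atomic<float>& progress)
{
    std::ifstream in("input.txt");                 // assumed input file
    const std::size_t total_lines = 10000000;      // assumed to be known (or estimated) up front
    std::size_t processed = 0;
    std::string line;
    while (running && std::getline(in, line))
    {
        // ... compute on 'line' ...
        ++processed;
        progress = static_cast<float>(processed) / total_lines;   // ProgressCounter prints this every 3 s
    }
}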
I am writing a program that needs to loop regularly, with about 100 microseconds per iteration. I have found how to loop regularly over a fixed interval, but I run into a problem when the interval is set too small.
The following demo code (not the complete code) is meant to:
Increment the count every 100 microseconds.
Show the count every 1 second.
The expected result is to show approximately 10000 every second.
But the result shows only about four thousand per second.
void f2(int input)
{
    auto start = std::chrono::system_clock::now();
    auto displayStart = std::chrono::system_clock::now();
    while (true) { //keep looping quickly
        auto now = std::chrono::system_clock::now();
        auto interval = std::chrono::duration_cast<std::chrono::microseconds>(now - start);
        if (interval.count() > input) { //if 100 microsecond do
            count++;
            start = std::chrono::system_clock::now();
        }
        auto displayNow = std::chrono::system_clock::now();
        auto displayInterval = std::chrono::duration_cast<std::chrono::microseconds>(displayNow - displayStart);
        if (displayInterval.count() > 1000000) { //if 1 second do
            std::cout << "1 second count: " << count << std::endl;
            count = 0;
            displayStart = std::chrono::system_clock::now();
        }
    }
}
After seeing that, I think CPU scheduling may be the cause. I have checked that the program works normally in every loop: each iteration takes about 100 microseconds, which is accurate. But a problem can occur when the program/thread is paused and has to wait to be rescheduled onto the CPU.
For example, let's magnify the values for clearer illustration: suppose the thread is paused for 1 second. Normally the count would be incremented about 10000 times in that period. But with the code above, the next iteration only checks whether more than 100 microseconds have passed, so count++ runs once and the timer is reset, even though a whole second went by. In this case the count is incremented only by 1.
With the modified code below, I can reach 10000 count++ operations per second, but those 10000 increments are not evenly distributed within the second. This is only a demo program for testing; what I actually want is to perform an action accurately every 100 microseconds. Because of the thread being paused, I still have not found a solution for that.
void f2(int input)
{
    auto start = std::chrono::system_clock::now();
    auto displayStart = std::chrono::system_clock::now();
    while (true) { //keep looping quickly
        auto now = std::chrono::system_clock::now();
        auto interval = std::chrono::duration_cast<std::chrono::microseconds>(now - start);
        if (interval.count() > input) { //if 100 microsecond do
            for (int i = 0; i < interval.count() / input; i++) { //modified part
                count++;
            }
            start = std::chrono::system_clock::now();
        }
        auto displayNow = std::chrono::system_clock::now();
        auto displayInterval = std::chrono::duration_cast<std::chrono::microseconds>(displayNow - displayStart);
        if (displayInterval.count() > 1000000) { //if 1 second do
            std::cout << "1 second count: " << count << std::endl;
            count = 0;
            displayStart = std::chrono::system_clock::now();
        }
    }
}
Is there any way, e.g. making the process non-preemptible so it stays on the CPU (probably not possible), to make the counting action in the demo program run every 100 microseconds?
Thank you very much.
Have N buckets, where N is large enough that scheduling delay won't be a problem.
Keep track of the last time your decay code ran.
When a new packet "goes out", put it in a bucket based on the last time your decay code ran (if less than 100 ms, bucket 0, if 200 ms, bucket 1, etc).
When your decay code runs, calculate the current value after decaying everything properly and update the timestamp.
Note that contention (the thread updating and the thread decaying) will remain a problem. You can fix this to some extent with double or triple buffering of the counters, atomic flags and pointers, and busy-loops in the non-performance-sensitive code (say, the decay code).
Alternatively, instead of recording counts, record time stamps. Consume the buffer of time stamps, doing decay at that point. Similar issues involving size of buffer and multiple threads remain, with similar solutions.
Alternatively, do the decay math in the code that is doing the counting.
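Here is a minimal sketch of the bucket idea, under the assumptions that the decay pass runs roughly every 100 ms and that one decay factor is applied per 100 ms slot; the names (PacketBuckets, record_packet, run_decay) and the slot width are illustrative, not taken from the answer above.

#include <array>
#include <atomic>
#include <chrono>

struct PacketBuckets {
    static constexpr int N = 64;                        // large enough to outlast any scheduling delay
    std::array<std::atomic<long>, N> counts{};          // packets recorded per 100 ms slot
    std::atomic<std::chrono::steady_clock::rep> lastDecayTicks{
        std::chrono::steady_clock::now().time_since_epoch().count()};  // when run_decay last ran
    double value = 0.0;                                 // the decayed counter (touched only by the decay thread)

    // Hot path: just bump the bucket chosen by time since the last decay pass.
    void record_packet() {
        auto nowTicks = std::chrono::steady_clock::now().time_since_epoch().count();
        auto elapsed = std::chrono::steady_clock::duration(nowTicks - lastDecayTicks.load(std::memory_order_relaxed));
        auto slot = std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count() / 100;
        if (slot < 0) slot = 0;
        if (slot >= N) slot = N - 1;                    // clamp if the decay thread fell far behind
        counts[slot].fetch_add(1, std::memory_order_relaxed);
    }

    // Decay thread: fold each slot into the value, applying one decay step per slot.
    void run_decay(double decayPerSlot) {
        for (int i = 0; i < N; ++i)
            value = value * decayPerSlot + counts[i].exchange(0, std::memory_order_relaxed);
        lastDecayTicks.store(std::chrono::steady_clock::now().time_since_epoch().count(),
                             std::memory_order_relaxed);
    }
};

The contention caveat above still applies: a record_packet racing with run_decay may land in a slot that is just being folded in, which is the kind of small error the double/triple buffering tricks mentioned earlier are meant to remove.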
I am trying to find a way to wait for a signal or maximum duration such that the duration is wallclock time instead of time the machine is spent awake. For example, for the following order of events:
A wait() function is called for a maximum of 24 hours
12 hours pass
Machine is put to sleep
12 hours pass
Machine is woken up out of sleep
I would like the wait() call to return as soon as the process gets to run since 24 hours of wallclock time have passed. I've tried using std::condition_variable::wait_until but that uses machine awake time. I've also tried WaitForSingleObject() on windows and pthread_cond_timedwait() on mac to no avail. I would prefer something cross-platform (e.g. in the STL) if possible. As a backup, it looks like SetThreadpoolTimer() for windows and dispatch_after() (using dispatch_walltime()) on mac could work, but I would of course prefer a single implementation. Does anybody know of one?
Thanks!
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <tchar.h>   // _tmain / _TCHAR (Windows)

using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    condition_variable cv;
    mutex m;
    unique_lock<mutex> lock(m);
    auto start = chrono::steady_clock::now();
    cv_status result = cv.wait_until(lock, start + chrono::minutes(5));
    // put computer to sleep here for 5 minutes, should wake up immediately
    if (result == cv_status::timeout)
    {
        auto end = chrono::steady_clock::now();
        chrono::duration<double> diff = end - start;
        cerr << "wait duration: " << diff.count() << " seconds\n";
    }
    return 0;
}
I have been looking at the performance of our C++ server application running on embedded Linux (ARM). The pseudo code for the main processing loop of the server is this -
for i = 1 to 1000
    Process item i
    Sleep for 20 ms
The processing for one item takes about 2 ms. The "Sleep" here is really a call to the Poco library to do a "tryWait" on an event. If the event is fired (which it never is in my tests) or the time expires, it returns. I don't know what system call this equates to. Although we ask for a 2 ms block, it turns out to be roughly 20 ms. I can live with that - that's not the problem. The sleep is just an artificial delay so that other threads in the process are not starved.
The loop takes about 24 seconds to go through 1000 items.
The problem is, we changed the way the sleep is used so that we had a bit more control. I mean - 20ms delay for 2ms processing doesn't allow us to do much processing. With this new parameter set to a certain value it does something like this -
For i = 1 to 1000
    Process item i
    if i % 50 == 0 then sleep for 1000ms
That's the rough code, in reality the number of sleeps is slightly different and it happens to work out at a 24s cycle to get through all the items - just as before.
So we are doing exactly the same amount of processing in the same amount of time.
Problem 1 - the CPU usage for the original code is reported at around 1% (it varies a little but that's about average) and the CPU usage reported for the new code is about 5%. I think they should be the same.
Well, perhaps this CPU reporting isn't accurate, so I thought I'd sort a large text file at the same time and see how much it's slowed down by our server. This is a CPU-bound process (98% CPU usage according to top). The results are very odd: with the old code, the time taken to sort the file goes up by 21% when our server is running.
Problem 2 - If the server is only using 1% of the CPU then wouldn't the time taken to do the sort be pretty much the same?
Also, the time taken to go through all the items doesn't change - it's still 24 seconds with or without the sort running.
Then I tried the new code, it only slows the sort down by about 12% but it now takes about 40% longer to get through all the items it has to process.
Problem 3 - Why do the two ways of introducing an artificial delay cause such different results? It seems that the server which sleeps more frequently, but for a minimum time, is getting more priority.
I have a half-baked theory on the last one - whatever system call is used to do the "sleep" switches back to the server process when the time has elapsed. This gives the process another bite at the time slice on a regular basis.
Any help appreciated. I suspect I'm just not understanding it correctly and that things are more complicated than I thought. I can provide more details if required.
Thanks.
Update: replaced tryWait(2) with usleep(2000) - no change. In fact, sched_yield() does the same.
Well I can at least answer problem 1 and problem 2 (as they are the same issue).
After trying out various options in the actual server code, we came to the conclusion that the CPU reporting from the OS is incorrect. It's quite a surprising result, so to make sure, I wrote a stand-alone program that doesn't use Poco or any of our code, just plain Linux system calls and standard C++ features. It implements the pseudo code above. The processing is replaced with a tight loop that just checks the elapsed time to see if 2 ms is up. The sleeps are proper sleeps.
The small test program shows exactly the same problem, i.e. doing the same amount of processing but splitting up the way the sleep function is called produces very different results for CPU usage. In the case of the test program, the reported CPU usage was 0.0078 seconds using 1000 sleeps of 20 ms, but 1.96875 seconds when the less frequent 1000 ms sleep was used. The amount of processing done is the same.
Running the test on a Linux PC did not show the problem. Both ways of sleeping produced exactly the same CPU usage.
So this is clearly a problem with our embedded system and the way it measures CPU time when a process yields so often (you get the same problem with sched_yield instead of a sleep).
Update: Here's the code. RunLoop is where the main bit is done -
#include <time.h>       // clock_gettime, CLOCK_PROCESS_CPUTIME_ID
#include <sys/time.h>   // gettimeofday, timeval
#include <unistd.h>     // usleep
#include <iostream>

int sleepCount;

double getCPUTime( )
{
    clockid_t id = CLOCK_PROCESS_CPUTIME_ID;
    struct timespec ts;
    if ( id != (clockid_t)-1 && clock_gettime( id, &ts ) != -1 )
        return (double)ts.tv_sec +
               (double)ts.tv_nsec / 1000000000.0;
    return -1;
}

double GetElapsedMilliseconds(const timeval& startTime)
{
    timeval endTime;
    gettimeofday(&endTime, NULL);
    double elapsedTime = (endTime.tv_sec - startTime.tv_sec) * 1000.0; // sec to ms
    elapsedTime += (endTime.tv_usec - startTime.tv_usec) / 1000.0; // us to ms
    return elapsedTime;
}

void SleepMilliseconds(int milliseconds)
{
    timeval startTime;
    gettimeofday(&startTime, NULL);
    usleep(milliseconds * 1000);
    double elapsedMilliseconds = GetElapsedMilliseconds(startTime);
    if (elapsedMilliseconds > milliseconds + 0.3)
        std::cout << "Sleep took longer than it should " << elapsedMilliseconds;
    sleepCount++;
}

void DoSomeProcessingForAnItem()
{
    timeval startTime;
    gettimeofday(&startTime, NULL);
    double processingTimeMilliseconds = 2.0;
    double elapsedMilliseconds;
    do
    {
        elapsedMilliseconds = GetElapsedMilliseconds(startTime);
    } while (elapsedMilliseconds <= processingTimeMilliseconds);
    if (elapsedMilliseconds > processingTimeMilliseconds + 0.1)
        std::cout << "Processing took longer than it should " << elapsedMilliseconds;
}

void RunLoop(bool longSleep)
{
    int numberOfItems = 1000;
    timeval startTime;
    gettimeofday(&startTime, NULL);
    timeval startMainLoopTime;
    gettimeofday(&startMainLoopTime, NULL);
    for (int i = 0; i < numberOfItems; i++)
    {
        DoSomeProcessingForAnItem();
        double elapsedMilliseconds = GetElapsedMilliseconds(startTime);
        if (elapsedMilliseconds > 100)
        {
            std::cout << "Item count = " << i << "\n";
            if (longSleep)
            {
                SleepMilliseconds(1000);
            }
            gettimeofday(&startTime, NULL);
        }
        if (longSleep == false)
        {
            // Does 1000 * 20 ms sleeps.
            SleepMilliseconds(20);
        }
    }
    double elapsedMilliseconds = GetElapsedMilliseconds(startMainLoopTime);
    std::cout << "Main loop took " << elapsedMilliseconds / 1000 << " seconds\n";
}

void DoTest(bool longSleep)
{
    timeval startTime;
    gettimeofday(&startTime, NULL);
    double startCPUtime = getCPUTime();
    sleepCount = 0;
    int runLoopCount = 1;
    for (int i = 0; i < runLoopCount; i++)
    {
        RunLoop(longSleep);
        std::cout << "**** Done one loop of processing ****\n";
    }
    double endCPUtime = getCPUTime();
    std::cout << "Elapsed time is " << GetElapsedMilliseconds(startTime) / 1000 << " seconds\n";
    std::cout << "CPU time used is " << endCPUtime - startCPUtime << " seconds\n";
    std::cout << "Sleep count " << sleepCount << "\n";
}

void testLong()
{
    std::cout << "Running testLong\n";
    DoTest(true);
}

void testShort()
{
    std::cout << "Running testShort\n";
    DoTest(false);
}