On my computer, running Windows 7, the following code, compiled with Visual C++ 2010 and Boost 1.53, outputs
no timeout
elapsed time (ms): 1000
The same code compiled with GCC 4.8 (online link) outputs
timeout
elapsed time (ms): 1000
In my opinion the VC++ output is not correct and it should be timeout. Does anyone get the same output (i.e. no timeout) in VC++? If yes, is it a bug in the Win32 implementation of boost::condition_variable?
The code is
#include <boost/thread.hpp>
#include <iostream>
int main(void) {
    boost::condition_variable cv;
    boost::mutex mx;
    boost::unique_lock<decltype(mx)> lck(mx);
    boost::chrono::system_clock::time_point start = boost::chrono::system_clock::now();
    // Nothing ever notifies cv, so the only expected outcome is a timeout after ~1000 ms.
    const auto cv_res = cv.wait_for(lck, boost::chrono::milliseconds(1000));
    boost::chrono::system_clock::time_point end = boost::chrono::system_clock::now();
    const auto count = (boost::chrono::duration_cast<boost::chrono::milliseconds>(end - start)).count();
    const std::string str = (cv_res == boost::cv_status::no_timeout) ? "no timeout" : "timeout";
    std::cout << str << std::endl;
    std::cout << "elapsed time (ms): " << count << std::endl;
    return 0;
}
If we read the documentation we see:
Atomically call lock.unlock() and blocks the current thread. The
thread will unblock when notified by a call to this->notify_one() or
this->notify_all(), after the period of time indicated by the rel_time
argument has elapsed, or spuriously... [Emphasis mine]
What you are almost certainly seeing is that the VS implementation treats this as a spurious wakeup that happens to occur right at the end of the requested duration, while the other implementation treats it as a timeout. Both results are allowed by the wording quoted above.
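Since either result is permitted, portable code should not rely on cv_status alone when no notifier exists; re-checking the clock (or, better, using the predicate overload of wait_for) removes the ambiguity. A minimal sketch, reusing the cv and lck names from the question (the commented-out your_condition check is hypothetical):
// Decide "timed out" by checking the clock, not by trusting cv_status alone.
const auto deadline = boost::chrono::steady_clock::now() + boost::chrono::milliseconds(1000);
bool timed_out = false;
while (!timed_out /* && !your_condition */) {
    if (cv.wait_until(lck, deadline) == boost::cv_status::timeout) {
        timed_out = true;                                         // wait_until reported a timeout
    } else if (boost::chrono::steady_clock::now() >= deadline) {
        timed_out = true;                                         // spurious wakeup at/after the deadline
    }
}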
Related
I'm trying to write a kind of thread pool in C++. The code works fine on OS X, but under Linux I'm experiencing strange behavior.
After a bit of debugging, I found that the problem is due to a call to std::condition_variable::wait_until that I must be using in the wrong way.
With the code below I expect the loop to iterate once every three seconds:
#include <mutex>
#include <chrono>
#include <iostream>
#include <memory>
#include <condition_variable>
#include <thread>

using namespace std;

typedef std::chrono::steady_clock my_clock;
typedef std::chrono::duration<float, std::ratio<1> > seconds_duration;
typedef std::chrono::time_point<my_clock, seconds_duration> timepoint;

timepoint my_begin = my_clock::now();

float timepointToFloat(timepoint time) {
    return time.time_since_epoch().count() - my_begin.time_since_epoch().count();
}

void printNow(std::string mess) {
    timepoint now = my_clock::now();
    cout << timepointToFloat(now) << " " << mess << endl;
}

void printNow(std::string mess, timepoint time) {
    timepoint now = my_clock::now();
    cout << timepointToFloat(now) << " " << mess << " " << timepointToFloat(time) << endl;
}

int main() {
    mutex _global_mutex;
    condition_variable _awake_global_execution;
    auto check_predicate = []() {
        cout << "predicate called" << endl;
        return false;
    };

    while (true) {
        { // Expected to loop every three seconds
            unique_lock<mutex> lock(_global_mutex);
            timepoint planned_awake = my_clock::now() + seconds_duration(3);
            printNow("wait until", planned_awake);
            _awake_global_execution.wait_until(lock, planned_awake, check_predicate);
        }
        printNow("finish wait, looping");
    }

    return 0;
}
However, sometimes I get as output:
<X> wait until <X+3>
predicate called
(...hangs here for a long time)
(where X is a number), so it seems the timeout is not scheduled after three seconds. Sometimes instead I get:
<X> wait until <X+3>
predicate called
predicate called
<X> finish wait, looping
<X> wait until <X+3> (another loop)
predicate called
predicate called
<X> finish wait, looping
(...continue looping without waiting)
so it seems the timeout is scheduled after a small fraction of a second. I think I'm messing up something with the timeout timepoint, but I cannot figure out what I'm doing wrong.
In case it is relevant: this code works fine on OS X, while on Linux (Ubuntu 16.04, gcc 5.4, compiled with "g++ main.cc -std=c++11 -pthread") I'm experiencing the strange behavior.
How can I get it to work?
Try casting your timeout to your clock's duration:
auto planned_awake = my_clock::now() +
    std::chrono::duration_cast<my_clock::duration>(seconds_duration(3));
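If the float-based timepoint is only needed for printing (my assumption), an alternative sketch is to keep the deadline in the clock's native representation and convert only for display:
// Keep the deadline in my_clock's native duration; convert to the float-based
// timepoint only when printing.
auto planned_awake = my_clock::now() + std::chrono::seconds(3);
printNow("wait until", timepoint(planned_awake));
_awake_global_execution.wait_until(lock, planned_awake, check_predicate);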
Consider this code:
#include <iostream>
#include <vector>
#include <functional>
#include <map>
#include <atomic>
#include <memory>
#include <chrono>
#include <cstdint>
#include <cstdlib>
#include <thread>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/asio/high_resolution_timer.hpp>

static const uint32_t FREQUENCY = 5000; // Hz
static const uint32_t MKSEC_IN_SEC = 1000000;

std::chrono::microseconds timeout(MKSEC_IN_SEC / FREQUENCY);
boost::asio::io_service ioservice;
boost::asio::high_resolution_timer timer(ioservice);

static std::chrono::high_resolution_clock::time_point lastCallTime = std::chrono::high_resolution_clock::now();
static uint64_t deviationSum = 0;
static uint64_t deviationMin = 100000000;
static uint64_t deviationMax = 0;
static uint32_t counter = 0;

void timerCallback(const boost::system::error_code &err) {
    auto actualTimeout = std::chrono::high_resolution_clock::now() - lastCallTime;
    std::chrono::microseconds actualTimeoutMkSec = std::chrono::duration_cast<std::chrono::microseconds>(actualTimeout);
    long timeoutDeviation = actualTimeoutMkSec.count() - timeout.count();
    deviationSum += std::abs(timeoutDeviation);
    if (std::abs(timeoutDeviation) > deviationMax) {
        deviationMax = std::abs(timeoutDeviation);
    } else if (std::abs(timeoutDeviation) < deviationMin) {
        deviationMin = std::abs(timeoutDeviation);
    }
    ++counter;
    //std::cout << "Actual timeout: " << actualTimeoutMkSec.count() << "\t\tDeviation: " << timeoutDeviation << "\t\tCounter: " << counter << std::endl;
    timer.expires_from_now(timeout);
    timer.async_wait(timerCallback);
    lastCallTime = std::chrono::high_resolution_clock::now();
}

using namespace std::chrono_literals;

int main() {
    std::cout << "Frequency: " << FREQUENCY << " Hz" << std::endl;
    std::cout << "Callback should be called each: " << timeout.count() << " mkSec" << std::endl;
    std::cout << std::endl;

    ioservice.reset();
    timer.expires_from_now(timeout);
    timer.async_wait(timerCallback);
    lastCallTime = std::chrono::high_resolution_clock::now();

    auto thread = new std::thread([&] { ioservice.run(); });
    std::this_thread::sleep_for(1s);

    ioservice.stop();   // stop the io loop so the counters are not updated while printing
    thread->join();
    delete thread;

    std::cout << std::endl << "Messages posted: " << counter << std::endl;
    std::cout << "Frequency deviation: " << FREQUENCY - counter << std::endl;
    std::cout << "Min timeout deviation: " << deviationMin << std::endl;
    std::cout << "Max timeout deviation: " << deviationMax << std::endl;
    std::cout << "Avg timeout deviation: " << deviationSum / counter << std::endl;

    return 0;
}
It runs a timer that calls timerCallback(..) periodically at the specified frequency. In this example, the callback must be called 5000 times per second. One can play with the frequency and see that the actual (measured) frequency of calls differs from the desired one. In fact, the higher the frequency, the higher the deviation. I did some measurements with different frequencies and here is a summary:
https://docs.google.com/spreadsheets/d/1SQtg2slNv-9VPdgS0RD4yKRnyDK1ijKrjVz7BBMSg24/edit?usp=sharing
When the desired frequency is 10000 Hz, the system misses about 10% (~1000) of calls.
When the desired frequency is 100000 Hz, the system misses about 40% (~40000) of calls.
Question: Is it possible to achieve better accuracy in a Linux/C++ environment? How? I need it to work without significant deviation at a frequency of 500000 Hz.
P.S. My first idea was that the body of the timerCallback(..) method itself causes the delay. I measured it: it stably takes less than 1 microsecond to execute, so it does not affect the process.
I have no experience with this problem myself, but I guess (as the references below explain) that the OS scheduler interferes with your callback somehow.
So you could try to use the real-time scheduler and raise the priority of your task.
Hope this gives you a direction to find your answer.
Scheduler:
http://gumstix.8.x6.nabble.com/High-resolution-periodic-task-on-overo-td4968642.html
Priority:
https://linux.die.net/man/3/setpriority
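As a rough illustration of that direction, here is a minimal sketch (my assumptions: Linux, sufficient privileges such as root or CAP_SYS_NICE, and a priority value of 80 chosen only as an example) that requests the SCHED_FIFO real-time policy for the current thread before the timer loop starts:
#include <pthread.h>
#include <sched.h>
#include <cstring>
#include <iostream>

int main() {
    // Request the real-time FIFO scheduler for the calling thread.
    sched_param param{};
    param.sched_priority = 80;   // example value; the valid range is given by sched_get_priority_min/max(SCHED_FIFO)
    int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if (rc != 0) {
        std::cerr << "pthread_setschedparam failed: " << std::strerror(rc) << "\n";
        return 1;
    }
    std::cout << "running under SCHED_FIFO\n";
    // ... start the io_service / timer loop here ...
    return 0;
}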
If you need to achieve one call every two microseconds, you had better anchor to absolute time positions rather than adding a delay to however long each request happens to take.... You still run into the problem that the processing required in each timeslot could demand more CPU than the slot itself provides.
If you have a multicore CPU, I'd divide the timeslots between the cores (in a multithreaded approach) so that each core gets a longer slot. Suppose you have this requirement on a four-core CPU: then you can allow each thread to execute one call per 8 usec, which is probably more affordable. In this case you use absolute timers (an absolute timer is one that waits until the wall clock reaches a specific absolute time, not one that waits a delay from the time you called it) and offset them by an amount equal to the thread number times 2 usec: with four cores you start thread #1 at time T, thread #2 at time T + 2 usec, thread #3 at time T + 4 usec, ... and thread #N at time T + 2*(N-1) usec. Each thread then reschedules itself for the absolute time oldT + period instead of doing some kind of nanosleep(3) call, as in the sketch below. That way the processing time does not accumulate into the delay, which is most probably what you are experiencing. The pthread library timers are all absolute-time timers, so you can use them. I think this is the only way you will be able to reach such a hard spec (and be prepared to see how the battery suffers from it, assuming you are in an Android environment).
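A minimal sketch of such a per-thread loop, assuming Linux and clock_nanosleep with TIMER_ABSTIME (the function name and parameters are hypothetical):
#include <time.h>

// Wait on absolute deadlines spaced period_ns apart, starting offset_ns after t0,
// so the time spent doing the work never accumulates into the schedule.
void absolute_tick_loop(timespec t0, long offset_ns, long period_ns) {
    timespec next = t0;
    next.tv_nsec += offset_ns;
    for (;;) {
        // Normalize the timespec after each addition.
        while (next.tv_nsec >= 1000000000L) {
            next.tv_sec += 1;
            next.tv_nsec -= 1000000000L;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, nullptr);
        // ... do the work for this timeslot here ...
        next.tv_nsec += period_ns;   // schedule relative to the previous deadline, not to "now"
    }
}
In the four-core example above, thread k would be given offset_ns = k * 2000 and period_ns = 8000, with t0 obtained once from clock_gettime(CLOCK_MONOTONIC, ...) and shared by all threads.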
NOTE
In this approach the external bus can be a bottleneck, so even if you get it working it would probably be better to synchronize several machines with NTP (this can be done to the microsecond level at the speed of actual GBit links) and use different processors running in parallel. As you don't describe anything about the process you have to repeat so densely, I cannot provide more help on the problem.
I've recently started trying C++. In the thing I'm making, I want a variable with the value of 20 to be decremented by 1 every second, but I also need the program to be waiting for input from the user. I tried using for loops, but they won't proceed until the input is entered or until the variable runs out. I looked at clock functions, but they don't seem to fit my need, or maybe I just misunderstood their purpose.
Any suggestions?
As has already been suggested in the comments, threading is one way to do this. There is a nice self-contained example here (which I've borrowed from in the code below).
In the code below an asynchronous function is launched. Details on these here. This returns a future object which will contain the result once the job has finished.
In this case the job is listening to cin (typically the terminal input) and will return when some data is entered (i.e. when enter is pressed).
In the meantime the while loop will be running, keeping track of how much time has passed and decrementing the counter, and it also returns if the asynchronous job finishes. It wasn't clear from your question whether this is exactly the behaviour you want, but it gives you the idea: it prints the value of the decremented variable, but the user can enter text and it will be printed once the user presses enter.
#include <iostream>
#include <thread>
#include <future>
#include <chrono>

int main() {
    // Enable standard literals such as 2s and ""s.
    using namespace std::literals;

    // Execute lambda asynchronously (waiting for user input).
    auto f = std::async(std::launch::async, [] {
        auto s = ""s;
        std::cin >> s;   // blocks until the user presses enter
        return s;
    });

    // Continue execution in main thread, run countdown and timer:
    int countdown = 20;
    int countdownPrev = 0;
    std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
    std::chrono::steady_clock::time_point end;
    double elapsed;
    while ((f.wait_for(5ms) != std::future_status::ready) && countdown >= 0) {
        end = std::chrono::steady_clock::now();
        elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
        countdown = 20 - (int) (elapsed / 1000);
        if (countdown != countdownPrev) {
            std::cout << "Counter now: " << std::fixed << countdown << std::endl;
            countdownPrev = countdown;
        }
    }

    if (countdown == -1) {
        std::cout << "Countdown elapsed" << std::endl;
        return -1;
    } else {
        std::cout << "Input was: " << f.get() << std::endl;
        return 0;
    }
}
P.S. To get this to work on my compiler I have to compile it with g++ -pthread -std=c++14 file_name.cpp to correctly link the threading library and allow the use of C++14 features.
I'm running a couple of threads in parallel, and I want to measure both the time it takes to execute one thread and the time it takes to execute the whole program. I'm using VC++ on Windows 7.
I tried to measure it while debugging but then I saw this question: https://stackoverflow.com/questions/38971267/improving-performance-using-parallelism-in-c?noredirect=1#comment65299718_38971267 and in the answer given by Schnien it says:
Debugging of multiple threads is somehow "special" - when your Debugger halts at a breakpoint, the other threads will not be stopped - they will go on
Is this true? And if yes, how can I measure the time otherwise?
Thanks
That statement is indeed true: only the thread that hits a breakpoint will be paused.
However, to measure execution times you do not have to use debugging at all. More information on measuring execution time can be found in the question below:
Measure execution time in C (on Windows)
What you want to do is measure the time inside the threads' functions (by subtracting the time at the beginning of each function from the time at its end). You can do the same for the whole program; use thread.join to make sure all of the threads have finished before taking the final measurement, as in the sketch below.
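A minimal sketch of that idea (the worker function and the millisecond granularity are placeholders):
#include <chrono>
#include <iostream>
#include <thread>

// Hypothetical worker: times its own body with std::chrono::steady_clock.
void worker() {
    auto t0 = std::chrono::steady_clock::now();
    // ... the thread's actual work goes here ...
    auto t1 = std::chrono::steady_clock::now();
    std::cout << "worker took "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count()
              << " ms\n";
}

int main() {
    auto start = std::chrono::steady_clock::now();
    std::thread th1(worker);
    std::thread th2(worker);
    th1.join();
    th2.join();   // every thread has finished past this point
    auto end = std::chrono::steady_clock::now();
    std::cout << "whole program took "
              << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()
              << " ms\n";
}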
Use a simple timer class to create a stopwatch capability, then capture the time within each thread. Also, creating system threads is slower than using std::async, and the latter can both return values and propagate exceptions, which with raw threads cause program termination unless caught within the thread.
#include <thread>
#include <iostream>
#include <atomic>
#include <chrono>
#include <future>

// Stopwatch. Returns time in seconds.
class timer {
public:
    std::chrono::time_point<std::chrono::high_resolution_clock> lastTime;
    timer() : lastTime(std::chrono::high_resolution_clock::now()) {}
    inline double elapsed() {
        std::chrono::time_point<std::chrono::high_resolution_clock> thisTime = std::chrono::high_resolution_clock::now();
        double deltaTime = std::chrono::duration<double>(thisTime - lastTime).count();
        lastTime = thisTime;
        return deltaTime;
    }
};

// For expository clarity; generally avoid global variables.
const int count = 1000000;
double timerResult1;
double timerResult2;

void f1() {
    volatile int i = 0; // volatile prevents the loop from being optimized away
    timer stopwatch;
    while (i++ < count);
    timerResult1 = stopwatch.elapsed();
}

void f2() {
    volatile int i = 0; // volatile prevents the loop from being optimized away
    timer stopwatch;
    while (i++ < count);
    timerResult2 = stopwatch.elapsed();
}

int main()
{
    std::cout.precision(6); std::cout << std::fixed;
    f1(); std::cout << "f1 execution time " << timerResult1 << std::endl;

    timer stopwatch;
    {
        std::thread thread1(f1);
        std::thread thread2(f2);
        thread1.join();
        thread2.join();
    }
    double elapsed = stopwatch.elapsed();
    std::cout << "f1 with f2 execution time " << elapsed << std::endl;
    std::cout << "thread f1 execution time " << timerResult1 << std::endl;
    std::cout << "thread f2 execution time " << timerResult2 << std::endl;

    {
        stopwatch.elapsed(); // reset stopwatch
        auto future1 = std::async(std::launch::async, f1); // spins up a thread; the future's destructor automatically joins
        auto future2 = std::async(std::launch::async, f2);
    }
    elapsed = stopwatch.elapsed();
    std::cout << "async f1 with f2 execution time " << elapsed << std::endl;
    std::cout << "async thread f1 execution time " << timerResult1 << std::endl;
    std::cout << "async thread f2 execution time " << timerResult2 << std::endl;
}
On my machine, creating threads adds about 0.3 ms per thread, whereas async adds only about 0.05 ms per thread because it is implemented with a thread pool.
f1 execution time 0.002076
f1 with f2 execution time 0.002791
thread f1 execution time 0.002018
thread f2 execution time 0.002035
async f1 with f2 execution time 0.002131
async thread f1 execution time 0.002028
async thread f2 execution time 0.002018
[EDIT] Had incorrect f calls in front of statements (cut and paste error)
See the following code.
#include <future>
#include <iostream>
#include <thread>
#include <chrono>
#include <ctime>

int main()
{
    std::future<int> future = std::async(std::launch::deferred, []() {
        std::this_thread::sleep_for(std::chrono::seconds(5));
        return 100;
    });

    std::cout << "waiting...\n";
    clock_t start = clock();
    std::future_status status = future.wait_for(std::chrono::seconds(20));
    std::cout << "result is " << future.get() << std::endl;
    clock_t end = clock();
    std::cout << "Time Cost : " << (double)(end - start) / CLOCKS_PER_SEC << " seconds." << std::endl;
}
The execution result is very confusing. Yes, the main thread waits only around 5 seconds and then prints "100". But why does "Time Cost" show 0? The test environment is Cygwin with g++ 4.9.3.
Then I tested it in VS2013. The result is 25 seconds. Strange!
It doesn't show 0 on my machine but a very small value: 0.000156 s. But as clock() measures processor time and your main thread does not consume any CPU (the wait is not an active loop), the result is almost 0.
clock() returns the processor time spent. It doesn't have any guarantee of advancement whatsoever: if your CPU sleeps, the value it returns will not advance. To measure intervals properly, use the clocks from std::chrono, for example std::chrono::steady_clock.
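A minimal sketch of the question's program with clock() replaced by std::chrono::steady_clock, which advances with wall-clock time even while the thread sleeps; with std::launch::deferred, wait_for returns immediately (future_status::deferred) and get() runs the task on the calling thread, so the measured time should be roughly the 5-second sleep:
#include <chrono>
#include <future>
#include <iostream>
#include <thread>

int main()
{
    std::future<int> future = std::async(std::launch::deferred, []() {
        std::this_thread::sleep_for(std::chrono::seconds(5));
        return 100;
    });

    std::cout << "waiting...\n";
    auto start = std::chrono::steady_clock::now();             // wall-clock style measurement
    future.wait_for(std::chrono::seconds(20));                 // deferred: returns immediately
    std::cout << "result is " << future.get() << std::endl;    // the deferred task runs here, on this thread
    auto end = std::chrono::steady_clock::now();
    std::cout << "Time Cost : "
              << std::chrono::duration<double>(end - start).count()
              << " seconds." << std::endl;
}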