Timer working improperly - c++

I have a problem with a timer class based on an SDL timer:
class CTimer
{
public:
CTimer(): startTick(0), endTick(0), curTime(0), running(false) {};
void Start() { startTick += SDL_GetTicks() - endTick; running = true; };
void Stop() { endTick = SDL_GetTicks(); running = false; };
void Reset() { startTick = 0; endTick = 0; curTime = 0; };
inline int operator()() { return running ? curTime = ((int) SDL_GetTicks - startTick) / 1000 + 1 : curTime; };
private:
int startTick;
int endTick;
int curTime;
bool running;
};
The () operator should return time in seconds (stored in curTime). But it always returns 4202 (curTime is always equal to that). What am I doing wrong?
Test code:
int main()
{
SDL_Init (SDL_INIT_TIMER);
CApp::CTimer timer;
timer.Start();
for (int i = 0; i < 15; ++i)
{
SDL_Delay (1000);
std::cout << timer() << '\n';
}
return 0;
}

This is a perfect example of why you don't want to use old-style C casts in C++.
(int) SDL_GetTicks
The missing parentheses on the function call mean you're casting a pointer to the function to an int, not its return value. Unsurprisingly, the address of the function never changes, so you always get the same number.
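To illustrate the point about casts, here is a minimal C++17 sketch of how static_cast would have rejected the typo at compile time. The names are assumptions made for this example: fake_ticks is a hypothetical stand-in for SDL_GetTicks (so the snippet builds without SDL), and can_static_cast is a small helper trait, not a standard library facility.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for SDL_GetTicks so the snippet builds without SDL.
std::uint32_t fake_ticks() { return 5000; }

// Helper trait for this example: true when static_cast<To>(From) compiles.
template <class To, class From, class = void>
struct can_static_cast : std::false_type {};

template <class To, class From>
struct can_static_cast<To, From,
    std::void_t<decltype(static_cast<To>(std::declval<From>()))>>
    : std::true_type {};

// Casting the function's return value is fine...
static_assert(can_static_cast<int, std::uint32_t>::value,
              "a tick count converts to int");
// ...but casting the function itself (the forgotten parentheses) is rejected.
static_assert(!can_static_cast<int, std::uint32_t (*)()>::value,
              "a function pointer cannot be static_cast to int");

// With the call parentheses in place, the arithmetic uses real tick values.
int elapsed_seconds(std::uint32_t startTick) {
    return static_cast<int>(fake_ticks() - startTick) / 1000;
}
```

Had the original line used static_cast<int>(SDL_GetTicks), the compiler would have refused it instead of silently converting the function's address.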

Are you missing parentheses for SDL_GetTicks?
inline int operator()() { return running ? curTime = ((int) SDL_GetTicks - startTick) / 1000 + 1 : curTime; };

For starters,
inline int operator()() { return running ? curTime =
((int) SDL_GetTicks - startTick) / 1000 + 1 : curTime; };
should be
inline int operator()() { return running ? curTime =
((int) SDL_GetTicks() - startTick) / 1000 + 1 : curTime; };
I would think.
Did you get a compiler warning or error about this?

In addition to SDL_GetTicks being written without the call parentheses, which takes the address of the function (and therefore always yields the same constant, as you observed), it looks like calling Start and then Stop before calling operator() won't give a meaningful result, because curTime is only updated inside operator() while the timer is running.
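A minimal sketch of a timer that avoids both problems, under a few assumptions: the tick source is passed in as a function pointer with the same shape as SDL2's SDL_GetTicks (returning a 32-bit millisecond count), the class name PausableTimer is made up for this example, and test_ticks is a hypothetical fake clock so the snippet builds without SDL. Unlike the original, it returns whole elapsed seconds without the "+ 1".

```cpp
#include <cstdint>

// Same shape as SDL2's SDL_GetTicks: milliseconds as a 32-bit unsigned value.
using TickFn = std::uint32_t (*)();

// Hypothetical fake clock so the example runs without SDL.
static std::uint32_t test_now_ms = 0;
std::uint32_t test_ticks() { return test_now_ms; }

class PausableTimer {
public:
    explicit PausableTimer(TickFn ticks) : ticks_(ticks) {}

    void Start() {
        if (!running_) { startTick_ = ticks_(); running_ = true; }
    }
    void Stop() {
        // Bank the elapsed time so it is not lost if operator() was never called.
        if (running_) { accumulatedMs_ += ticks_() - startTick_; running_ = false; }
    }
    void Reset() { accumulatedMs_ = 0; running_ = false; }

    // Whole seconds spent in the running state so far.
    std::uint32_t operator()() const {
        std::uint32_t ms = accumulatedMs_;
        if (running_) ms += ticks_() - startTick_;  // note the call parentheses
        return ms / 1000;
    }

private:
    TickFn ticks_;
    std::uint32_t startTick_ = 0;
    std::uint32_t accumulatedMs_ = 0;
    bool running_ = false;
};
```

In real code you would presumably construct it as PausableTimer timer(SDL_GetTicks); this assumes an SDL version where SDL_GetTicks returns a 32-bit value.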

Related

How to reduce the (writer side) performance penalty for sharing a small structure between two threads?

I'm adding a profiler to an existing evaluator. The evaluator is implemented in C++ (C++17, x86, MSVC) and I need to store 3 int32, 1 uint16 and 1 uint8 to represent the context of the execution frame. As this is interpreted code, I cannot write a kernel-level driver to take snapshots, so we have to add it to the eval loop. As with all profiling, we want to avoid slowing down the actual computation. So much for the motivation and context of this question.
At a high level, the behavior is as follows: before every instruction is executed, the evaluator calls a small function ("hey, I'm here right now"). The sampling profiler is only interested in that frame position once every X ms (or μs, but that is most likely pushing it). So we have 2 threads (let's ignore the serialization for now) that need to share data: a very frequent writer (one single thread) and an infrequent reader (a different single thread). We would like to minimize the performance penalty on the write side. Note that the writer sometimes becomes slow and may be stuck on a frame for a few seconds; we would like to be able to observe this.
So to help this question, I've written a small benchmark setup.
#include <memory>
#include <chrono>
#include <string>
#include <iostream>
#include <thread>
#include <immintrin.h>
#include <atomic>
#include <cstring>
using namespace std;
typedef struct frame {
int32_t a;
int32_t b;
uint16_t c;
uint8_t d;
} frame;
class ProfilerBase {
public:
virtual void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) = 0;
virtual void Stop() = 0;
virtual ~ProfilerBase() {}
virtual string Name() = 0;
};
class NoOp : public ProfilerBase {
public:
void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) override {}
void Stop() override {}
string Name() override { return "NoOp"; }
};
class JustStore : public ProfilerBase {
private:
frame _current = { 0 };
public:
string Name() override { return "OnlyStoreInMember"; }
void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) override {
_current.a = a;
_current.b = b;
_current.c = c;
_current.d = d;
}
void Stop() override {
if ((_current.a + _current.b + _current.c + _current.d) == _current.a) {
cout << "Make sure optimizer keeps the record around";
}
}
};
class WithSampler : public ProfilerBase {
private:
unique_ptr<thread> _sampling;
atomic<bool> _keepSampling = true;
protected:
const chrono::milliseconds _sampleEvery;
virtual void _snap() = 0;
virtual string _subname() = 0;
public:
WithSampler(chrono::milliseconds sampleEvery): _sampleEvery(sampleEvery) {
_sampling = make_unique<thread>(&WithSampler::_sampler, this);
}
void Stop() override {
_keepSampling = false;
_sampling->join();
}
string Name() override {
return _subname() + to_string(_sampleEvery.count()) + "ms";
}
private:
void _sampler() {
auto nextTick = chrono::steady_clock::now();
while (_keepSampling)
{
const auto sleepTime = nextTick - chrono::steady_clock::now();
if (sleepTime > chrono::milliseconds(0))
{
this_thread::sleep_for(sleepTime);
}
_snap();
nextTick += _sampleEvery;
}
}
};
struct checkedFrame {
frame actual;
int32_t check;
};
// https://rigtorp.se/spinlock/
struct spinlock {
std::atomic<bool> lock_ = { 0 };
void lock() noexcept {
for (;;) {
// Optimistically assume the lock is free on the first try
if (!lock_.exchange(true, std::memory_order_acquire)) {
return;
}
// Wait for lock to be released without generating cache misses
while (lock_.load(std::memory_order_relaxed)) {
// Issue X86 PAUSE or ARM YIELD instruction to reduce contention between
// hyper-threads
_mm_pause();
}
}
}
void unlock() noexcept {
lock_.store(false, std::memory_order_release);
}
};
class Spinlock : public WithSampler {
private:
spinlock _loc;
checkedFrame _current;
public:
using WithSampler::WithSampler;
string _subname() override { return "Spinlock"; }
void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) override {
_loc.lock();
_current.actual.a = a;
_current.actual.b = b;
_current.actual.c = c;
_current.actual.d = d;
_current.check = a + b + c + d;
_loc.unlock();
}
protected:
void _snap() override {
_loc.lock();
auto snap = _current;
_loc.unlock();
if ((snap.actual.a + snap.actual.b + snap.actual.c + snap.actual.d) != snap.check) {
cout << "Corrupted snap!!\n";
}
}
};
static constexpr int32_t LOOP_MAX = 1000 * 1000 * 1000;
int measure(unique_ptr<ProfilerBase> profiler) {
cout << "Running profiler: " << profiler->Name() << "\n ";
cout << "\tProgress: ";
auto start_time = std::chrono::steady_clock::now();
int r = 0;
for (int32_t x = 0; x < LOOP_MAX; x++)
{
profiler->EnterFrame(x, x + x, x & 0xFFFF, x & 0xFF);
r += x;
if (x % (LOOP_MAX / 1000) == 0)
{
this_thread::sleep_for(chrono::nanoseconds(10)); // simulate that sometimes we do other work besides storing
}
if (x % (LOOP_MAX / 10) == 0)
{
cout << static_cast<int>((static_cast<double>(x) / LOOP_MAX) * 10);
}
if (x % 1000 == 0) {
_mm_pause(); // give the other threads some time
}
if (x == (LOOP_MAX / 2)) {
// the first half of the loop we take as warmup
// so now we take the actual time
start_time = std::chrono::steady_clock::now();
}
}
cout << "\n";
const auto done_calc = std::chrono::steady_clock::now();
profiler->Stop();
const auto done_writing = std::chrono::steady_clock::now();
cout << "\tcalc: " << chrono::duration_cast<chrono::milliseconds>(done_calc - start_time).count() << "ms\n";
cout << "\tflush: " << chrono::duration_cast<chrono::milliseconds>(done_writing - done_calc).count() << "ms\n";
return r;
}
int main() {
measure(make_unique<NoOp>());
measure(make_unique<JustStore>());
measure(make_unique<Spinlock>(chrono::milliseconds(1)));
measure(make_unique<Spinlock>(chrono::milliseconds(10)));
return 0;
}
Compiling this with /O2 in x86 mode on my machine gives this output:
Running profiler: NoOp
Progress: 0123456789
calc: 1410ms
flush: 0ms
Running profiler: OnlyStoreInMember
Progress: 0123456789
calc: 1368ms
flush: 0ms
Running profiler: Spinlock1ms
Progress: 0123456789
calc: 3952ms
flush: 4ms
Running profiler: Spinlock10ms
Progress: 0123456789
calc: 3985ms
flush: 11ms
(While this was compiled with MSVC in VS2022, I think g++ --std=c++17 -O2 -m32 -pthread -o testing small-test-case.cpp should come close enough.)
Here we see that the Spinlock-based sampler adds a ~2.5x overhead compared to the one without any. I've profiled it and, as expected, a lot of time is spent taking the lock (where in most cases there was no need for it).
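One pattern that is often suggested for a single frequent writer and a rare reader is a sequence lock (seqlock), where the writer never waits and the reader retries if it caught a half-written record. The following is only a sketch under stated assumptions: exactly one writer thread, the names Frame, SeqlockFrame, Store and Load are invented for this example, and it has not been measured against the benchmark above.

```cpp
#include <atomic>
#include <cstdint>

struct Frame {
    std::int32_t a;
    std::int32_t b;
    std::uint16_t c;
    std::uint8_t d;
};

// Single-writer sequence lock: the writer never blocks, the reader retries.
class SeqlockFrame {
public:
    // Must only be called from the one writer thread.
    void Store(const Frame& f) noexcept {
        const std::uint32_t s = seq_.load(std::memory_order_relaxed);
        seq_.store(s + 1, std::memory_order_relaxed);         // odd: write in progress
        std::atomic_thread_fence(std::memory_order_release);  // data stores stay after the odd mark
        a_.store(f.a, std::memory_order_relaxed);
        b_.store(f.b, std::memory_order_relaxed);
        c_.store(f.c, std::memory_order_relaxed);
        d_.store(f.d, std::memory_order_relaxed);
        seq_.store(s + 2, std::memory_order_release);         // even: record complete
    }

    // Called from the sampler thread; loops until it reads a consistent record.
    Frame Load() const noexcept {
        for (;;) {
            const std::uint32_t before = seq_.load(std::memory_order_acquire);
            if (before & 1u) continue;  // writer is mid-update, try again
            Frame f;
            f.a = a_.load(std::memory_order_relaxed);
            f.b = b_.load(std::memory_order_relaxed);
            f.c = c_.load(std::memory_order_relaxed);
            f.d = d_.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire);  // data loads stay before the re-check
            const std::uint32_t after = seq_.load(std::memory_order_relaxed);
            if (before == after) return f;  // no write happened in between
        }
    }

private:
    std::atomic<std::uint32_t> seq_{0};
    std::atomic<std::int32_t> a_{0};
    std::atomic<std::int32_t> b_{0};
    std::atomic<std::uint16_t> c_{0};
    std::atomic<std::uint8_t> d_{0};
};
```

The fields are relaxed atomics so that a read racing with a write is well defined in C++. The trade-off is that a reader can spin for a while if the writer updates continuously; whether that is acceptable for a sampler that fires every few milliseconds would need to be measured.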

First output of delta time always wrong using chrono

I am trying to make a library so that I can reuse my classes in the future. Below is a class that calculates delta time for FPS and other things I may need later. I noticed that the first value printed when running is always tiny, on the order of 1e-7 (for example 2e-07). Is this imprecision caused by using float for the calculation?
class Timer {
public:
Timer();
~Timer();
void Update();
float GetDeltaTime() const;
private:
std::chrono::time_point<std::chrono::high_resolution_clock> myCurrentTime;
std::chrono::time_point<std::chrono::high_resolution_clock> myLastTime;
};
Timer::Timer()
{
myCurrentTime = std::chrono::high_resolution_clock::now();
myLastTime = std::chrono::high_resolution_clock::now();
}
void Timer::Update()
{
myLastTime = myCurrentTime;
myCurrentTime = std::chrono::high_resolution_clock::now();
}
float Timer::GetDeltaTime() const
{
std::chrono::duration<float> deltaTime = std::chrono::duration_cast<std::chrono::duration<float> >(myCurrentTime - myLastTime);
return deltaTime.count();
}
int main()
{
Timer myTimer;
while (true) {
myTimer.Update();
std::cout << "Delta time: " << myTimer.GetDeltaTime() << std::endl; // The first output is always around "Delta time: 2e-07"
Sleep(1000);
}
}
I tried casting the duration to float before the function runs, but it still produces the same result.
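Reading the code above, both time points are set in the constructor, so the first Update() measures only the time between constructing the Timer and that first call, which is a few hundred nanoseconds and prints as roughly 2e-07 seconds. That would make the small first value an accurate measurement rather than a float precision problem. The sketch below reproduces this with injectable time points; DeltaTimer and the optional now parameters are assumptions added for this example so the behavior can be checked deterministically.

```cpp
#include <chrono>

class DeltaTimer {
public:
    using Clock = std::chrono::steady_clock;

    // Both time points start at the same instant, as in the original constructor.
    explicit DeltaTimer(Clock::time_point now = Clock::now())
        : last_(now), current_(now) {}

    // The optional argument exists only so tests can supply a known time.
    void Update(Clock::time_point now = Clock::now()) {
        last_ = current_;
        current_ = now;
    }

    // Seconds between the two most recent time points.
    float GetDeltaTime() const {
        return std::chrono::duration<float>(current_ - last_).count();
    }

private:
    Clock::time_point last_;
    Clock::time_point current_;
};
```

If the first frame's delta should be ignored, one option is to call Update() once right before entering the loop, or to skip the first reading.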

Visual Studio is reporting an error, but still compiling the program and everything works as it should. Have I made a mistake in my code?

I have made a small header-only timer struct for testing the speed of functions. I made it purely out of curiosity because I am learning C++.
I have a function called "TimeTask" with 2 overloads. The first takes a std::function<void()> as a parameter and records how long the function takes to execute. The other is a template function that takes a void function with any number of parameters and records how long it takes.
struct Timer
{
private:
std::chrono::steady_clock::time_point begin, end;
float Duration = 0.0f;
public:
const float& getDuration() const
{
return Duration;
}
void StartTimer()
{
begin = std::chrono::steady_clock::now();
}
void StopTimer()
{
end = std::chrono::steady_clock::now();
std::chrono::duration<float, std::milli> ChronoDuration = end - begin;
Duration = ChronoDuration.count();
}
template<class...Args>
void TimeTask(std::function<void(Args...)> task, Args&&...args)
{
StartTimer();
task(std::forward<Args>(args)...);
StopTimer();
}
void TimeTask(std::function<void()> task)
{
StartTimer();
task();
StopTimer();
}
};
Quite simple really. I tested it with the following code:
void task_parameters(double j) {
for (double i = 0; i < j * j * j * j * j; i++)
{
};
std::cout << "Done\n";
}
void task_void()
{
return;
}
int main()
{
Timer timer;
timer.TimeTask<double>(task_parameters,15.0);
std::cout << timer.getDuration() << std::endl;
timer.TimeTask(task_void);
std::cout << timer.getDuration() << std::endl;
std::cin.get();
return 0;
}
The program compiles and runs as expected but the first time I run the TimeTask function with "double" as my parameter type, it gives me an error:
E0304 no instance of overloaded function "Timer::TimeTask" matches the argument list
The program still runs, but I was wondering if I could get rid of this error. The code appears to be correct and behaves as expected, so I don't understand why the error is appearing. Thank you.
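For context, error codes beginning with E in the Visual Studio Error List are reported by IntelliSense, while compiler errors begin with C, which would be consistent with the build succeeding anyway. One way to sidestep the question of deducing Args through std::function is to accept any callable directly. This is a sketch under that assumption, not the only possible fix; CallableTimer is a name invented for this example and it requires C++17 for std::invoke.

```cpp
#include <chrono>
#include <functional>
#include <utility>

struct CallableTimer {
    // Times any callable with any arguments; no conversion to std::function,
    // so nothing has to be deduced from a function pointer.
    template <class F, class... Args>
    void TimeTask(F&& task, Args&&... args) {
        const auto begin = std::chrono::steady_clock::now();
        std::invoke(std::forward<F>(task), std::forward<Args>(args)...);
        const auto end = std::chrono::steady_clock::now();
        durationMs_ = std::chrono::duration<float, std::milli>(end - begin).count();
    }

    // Duration of the most recent task in milliseconds.
    float getDuration() const { return durationMs_; }

private:
    float durationMs_ = 0.0f;
};
```

With this shape the calls become timer.TimeTask(task_parameters, 15.0) and timer.TimeTask(task_void), with no explicit template argument.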

Why is my rate limiter enforcing a different rate than specified?

The Problem: I'm writing a game (as a programming exercise) from scratch. I'm trying to limit the number of game logic loops ("ticks") per second. I've set it to an arbitrary 100 ticks/second. But no matter what I do, it seems to run at ~130 ticks/second. Could it possibly be rounding errors adding up? Something else? Thanks in advance for any help you can give.
Note: my codebase is much larger than this, but for the purposes of this question, I've stripped it down as much as possible without breaking the rate limiter.
The Output:
counter 1 sleep_for(5ms)
counter 2 sleep_for(2ms)
[snip]
counter 132 sleep_for(3ms)
counter 133 sleep_for(3ms)
133 TPS last 1003ms
counter 134 sleep_for(3ms)
counter 135 sleep_for(3ms)
[snip]
counter 265 sleep_for(3ms)
counter 266 sleep_for(3ms)
133 TPS last 1004ms
counter 267 sleep_for(3ms)
counter 268 sleep_for(3ms)
[snip]
counter 399 sleep_for(3ms)
counter 400 sleep_for(3ms)
134 TPS last 1006ms
The Code:
(The two main functions to look at are ThreadRateLimiter::Tock() and TickRateCounter::Tock())
#include <chrono>
#include <exception>
#include <fstream>
#include <iostream>
#include <thread>
#include <vector>
using namespace std;
const int TICK_RATE = 100;
const chrono::milliseconds TIME_PER_TICK =
chrono::duration_cast<chrono::milliseconds>(chrono::seconds(1)) / TICK_RATE;
template <class T>
class TickTocker
{
public:
virtual void Tick() = 0;
virtual T Tock() = 0;
virtual T TickTock()
{
Tick();
return Tock();
}
};
Ticker:
class Ticker : public TickTocker<long>
{
public:
friend ostream& operator<<(ostream& stream, const Ticker& counter);
Ticker() :
Ticker(0)
{}
Ticker(long counter) :
mTicks(counter),
mTicksLast(mTicks)
{}
Ticker(const Ticker& counter) :
Ticker(counter.mTicks)
{}
bool operator==(const long i)
{
return mTicks == i;
}
void Tick() override
{
mTicks++;
}
long Tock() override
{
long diff = mTicks - mTicksLast;
mTicksLast = mTicks;
return diff;
}
// private:
long mTicks;
long mTicksLast;
};
ostream& operator<<(ostream& stream, const Ticker& counter)
{
return (stream << "counter " << counter.mTicks);
}
TickTracker:
class TickTracker : public TickTocker<chrono::milliseconds>
{
public:
TickTracker() :
mTime(chrono::steady_clock::now()),
mLastTime(mTime)
{}
void Tick() override
{
mTime = chrono::steady_clock::now();
}
chrono::milliseconds Tock() override
{
chrono::milliseconds diff = chrono::duration_cast<chrono::milliseconds>(mTime - mLastTime);
mLastTime = mTime;
return diff;
}
protected:
chrono::time_point<chrono::steady_clock> mTime;
chrono::time_point<chrono::steady_clock> mLastTime;
};
ThreadRateLimiter:
class ThreadRateLimiter : public TickTracker
{
public:
ThreadRateLimiter() : TickTracker(),
mMsFast(chrono::milliseconds(0))
{}
void Tick() override
{
mCounter.Tick();
TickTracker::Tick();
if (mCounter == 1)
{
TickTracker::Tock();
}
}
chrono::milliseconds Tock()
{
chrono::milliseconds diff = TickTracker::Tock();
chrono::milliseconds remaining = TIME_PER_TICK - diff;
/*
* If we always sleep the full remaining time, we'll alternate between sleeping for "minimum" and "maximum" sleep
* times. Sleeping the full remaining time only when we exceed the average makes for more stable sleep times.
*/
bool fullSleep = (mMsFast.count() > (TIME_PER_TICK.count() / 2));
mMsFast += remaining;
if (mMsFast.count() > 0)
{
chrono::milliseconds sleep = fullSleep ? mMsFast : (chrono::milliseconds(mMsFast.count() / 2));
cout << mCounter << " sleep_for(" << sleep.count() << "ms)" << endl;
this_thread::sleep_for(mMsFast);
mMsFast -= sleep;
}
mCounter.Tock();
return remaining;
}
private:
Ticker mCounter;
chrono::milliseconds mMsFast;
};
TickRateCounter:
class TickRateCounter : public TickTracker
{
public:
TickRateCounter(string rateLabel) : TickTracker(),
mRateLabel(rateLabel)
{}
void Tick() override
{
mCounter.Tick();
TickTracker::Tick();
}
chrono::milliseconds Tock() override
{
if (chrono::duration_cast<chrono::seconds>(mTime - mLastTime).count() >= 1)
{
chrono::milliseconds duration = TickTracker::Tock();
cout << (mCounter.Tock() / chrono::duration_cast<chrono::seconds>(duration).count()) << " " << mRateLabel
<< " last " << duration.count() << "ms" << endl;
return duration;
}
return chrono::milliseconds(0);
}
// private:
Ticker mCounter;
string mRateLabel;
};
Main:
int main()
{
ThreadRateLimiter mRateLimiter;
TickRateCounter mTpsCounter("TPS"); // TPS = Ticks per second. Tick = one game loop
while (mTpsCounter.mCounter.mTicks < 400)
{
mRateLimiter.TickTock();
mTpsCounter.TickTock();
}
return 0;
}
The proper way to do rate limiting is:
//pseudocode follows
const frame_duration = something;
last = now();
while(true)
{
process_your_frame_here()
do
{
t = now();
sleep(0); // or whatever fits your system
}
while(t < last + frame_duration);
last = last + frame_duration; // THIS IS KEY
// last = now; // would not produce the right framerate
}
Basically, you delay your frame, as you seem to do. But when bookkeeping the time spent, you only add the time you wanted (frame_duration), so over multiple frames it evens out.
To elaborate: if now() starts at 1000 (whatever unit) and you set frame_duration to 200, frame 1 will only run after 1200, frame 2 after 1400, ... and frame 100 after 21000, giving exact frame rates over long periods of time.
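The same bookkeeping can be sketched in C++ with std::chrono. This is a minimal illustration rather than a drop-in replacement for the classes in the question: FixedRateLimiter, Wait and NextDeadline are names made up here, and it uses std::this_thread::sleep_until in place of the polling loop in the pseudocode.

```cpp
#include <chrono>
#include <thread>

class FixedRateLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit FixedRateLimiter(Clock::duration period,
                              Clock::time_point start = Clock::now())
        : period_(period), next_(start + period) {}

    // Blocks until the current deadline, then advances the deadline by exactly
    // one period (not to "now"), so oversleeping on one tick is absorbed by
    // shorter sleeps on the following ticks.
    void Wait() {
        std::this_thread::sleep_until(next_);
        next_ += period_;
    }

    Clock::time_point NextDeadline() const { return next_; }

private:
    Clock::duration period_;
    Clock::time_point next_;
};
```

A game loop would do its per-tick work and then call Wait() once per iteration, for example with FixedRateLimiter limiter(std::chrono::milliseconds(10)) for 100 ticks per second. If a tick runs long, sleep_until returns immediately until the loop has caught up.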

How do I return a closure from a function?

I want my getEnd function to return a closure that captures start.
When I call this closure, it should return the time difference.
How can I implement this in C++?
Something like the following:
using namespace std;
long microtime() {
timeval time;
gettimeofday(&time, NULL);
long microsec = ((unsigned long long)time.tv_sec * 1000000) + time.tv_usec;
return microsec;
}
std::function<void()> getEnd (){
long start = microtime();
long end() {
return microtime() - start;
}
return end;
};

#include <functional>
std::function<long()> getEnd()
{
long const start = microtime();
return [=]{ return microtime() - start; };
}
Please note that std::function type-erases the closure and may allocate memory on the heap, so for most practical applications a better alternative would be
struct timer {
long const start;
timer(): start(microtime()) {}
long operator()() const { return microtime() - start; }
};
timer getEnd() { return timer(); }
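A third variant, sketched here under the assumption that C++14 or later is available, returns the lambda itself through a deduced return type, which keeps the closure syntax without wrapping it in std::function. For portability this sketch also swaps gettimeofday for std::chrono::steady_clock; that substitution is not part of the answer above.

```cpp
#include <chrono>

// Returns a closure that remembers when it was created; calling it yields
// the number of microseconds elapsed since then.
inline auto getEnd() {
    const auto start = std::chrono::steady_clock::now();
    return [start] {
        const auto elapsed = std::chrono::steady_clock::now() - start;
        return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    };
}
```

Usage would look like auto end = getEnd(); followed later by end() to read the elapsed microseconds. The caller has to use auto, since the lambda's type cannot be spelled out.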