Why is my rate limiter enforcing a different rate than specified? - c++

The Problem: I'm writing a game (as a programming exercise) from scratch. I'm trying to limit the number of game logic loops ("ticks") per second. I've set it to an arbitrary 100 ticks/second. But no matter what I do, it seems to run at ~130 ticks/second. Could it possibly be rounding errors adding up? Something else? Thanks in advance for any help you can give.
Note: my codebase is much larger than this, but for the purposes of this question, I've stripped it down as much as possible without breaking the rate limiter.
The Output:
counter 1 sleep_for(5ms)
counter 2 sleep_for(2ms)
[snip]
counter 132 sleep_for(3ms)
counter 133 sleep_for(3ms)
133 TPS last 1003ms
counter 134 sleep_for(3ms)
counter 135 sleep_for(3ms)
[snip]
counter 265 sleep_for(3ms)
counter 266 sleep_for(3ms)
133 TPS last 1004ms
counter 267 sleep_for(3ms)
counter 268 sleep_for(3ms)
[snip]
counter 399 sleep_for(3ms)
counter 400 sleep_for(3ms)
134 TPS last 1006ms
The Code:
(The two main functions to look at are ThreadRateLimiter::Tock() and TickRateCounter::Tock())
#include <chrono>
#include <exception>
#include <fstream>
#include <iostream>
#include <thread>
#include <vector>
using namespace std;
const int TICK_RATE = 100;
const chrono::milliseconds TIME_PER_TICK =
chrono::duration_cast<chrono::milliseconds>(chrono::seconds(1)) / TICK_RATE;
template <class T>
class TickTocker
{
public:
virtual void Tick() = 0;
virtual T Tock() = 0;
virtual T TickTock()
{
Tick();
return Tock();
}
};
Ticker:
class Ticker : public TickTocker<long>
{
public:
friend ostream& operator<<(ostream& stream, const Ticker& counter);
Ticker() :
Ticker(0)
{}
Ticker(long counter) :
mTicks(counter),
mTicksLast(mTicks)
{}
Ticker(const Ticker& counter) :
Ticker(counter.mTicks)
{}
bool operator==(const long i)
{
return mTicks == i;
}
void Tick() override
{
mTicks++;
}
long Tock() override
{
long diff = mTicks - mTicksLast;
mTicksLast = mTicks;
return diff;
}
// private:
long mTicks;
long mTicksLast;
};
ostream& operator<<(ostream& stream, const Ticker& counter)
{
return (stream << "counter " << counter.mTicks);
}
TickTracker:
class TickTracker : public TickTocker<chrono::milliseconds>
{
public:
TickTracker() :
mTime(chrono::steady_clock::now()),
mLastTime(mTime)
{}
void Tick() override
{
mTime = chrono::steady_clock::now();
}
chrono::milliseconds Tock() override
{
chrono::milliseconds diff = chrono::duration_cast<chrono::milliseconds>(mTime - mLastTime);
mLastTime = mTime;
return diff;
}
protected:
chrono::time_point<chrono::steady_clock> mTime;
chrono::time_point<chrono::steady_clock> mLastTime;
};
ThreadRateLimiter:
class ThreadRateLimiter : public TickTracker
{
public:
ThreadRateLimiter() : TickTracker(),
mMsFast(chrono::milliseconds(0))
{}
void Tick() override
{
mCounter.Tick();
TickTracker::Tick();
if (mCounter == 1)
{
TickTracker::Tock();
}
}
chrono::milliseconds Tock()
{
chrono::milliseconds diff = TickTracker::Tock();
chrono::milliseconds remaining = TIME_PER_TICK - diff;
/*
* If we always sleep the full remaining time, we'll alternate between sleeping for "minimum" and "maximum" sleep
* times. Sleeping the full remaining time only when we exceed the average makes for more stable sleep times.
*/
bool fullSleep = (mMsFast.count() > (TIME_PER_TICK.count() / 2));
mMsFast += remaining;
if (mMsFast.count() > 0)
{
chrono::milliseconds sleep = fullSleep ? mMsFast : (chrono::milliseconds(mMsFast.count() / 2));
cout << mCounter << " sleep_for(" << sleep.count() << "ms)" << endl;
this_thread::sleep_for(mMsFast);
mMsFast -= sleep;
}
mCounter.Tock();
return remaining;
}
private:
Ticker mCounter;
chrono::milliseconds mMsFast;
};
TickRateCounter:
class TickRateCounter : public TickTracker
{
public:
TickRateCounter(string rateLabel) : TickTracker(),
mRateLabel(rateLabel)
{}
void Tick() override
{
mCounter.Tick();
TickTracker::Tick();
}
chrono::milliseconds Tock() override
{
if (chrono::duration_cast<chrono::seconds>(mTime - mLastTime).count() >= 1)
{
chrono::milliseconds duration = TickTracker::Tock();
cout << (mCounter.Tock() / chrono::duration_cast<chrono::seconds>(duration).count()) << " " << mRateLabel
<< " last " << duration.count() << "ms" << endl;
return duration;
}
return chrono::milliseconds(0);
}
// private:
Ticker mCounter;
string mRateLabel;
};
Main:
int main()
{
ThreadRateLimiter mRateLimiter;
TickRateCounter mTpsCounter("TPS"); // TPS = Ticks per second. Tick = one game loop
while (mTpsCounter.mCounter.mTicks < 400)
{
mRateLimiter.TickTock();
mTpsCounter.TickTock();
}
return 0;
}

The proper way to have rate limiting is:
//pseudocode follows
const frame_duration = something;
last = now();
while(true)
{
process_your_frame_here()
do
{
t = now();
sleep(0); // or whatever fits your system
}
while(t < last + frame_duration);
last = last + frame_duration; // THIS IS KEY
// last = now; // would not produce the right framerate
}
Basically, you delay your frame, as you seem to do. But when bookkeeping the time spent, you only add the time you wanted (frame_duration). So, over multiple frames, it evens out.
To elaborate, if now() starts at 1000 (whatever unit) and you set frame_duration to 200, frame 1 will only run after 1200, frame 2 after 1400, ... frame 100 after 21000, giving exact frame rates over long periods of time.
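The pseudocode above could be sketched in C++ roughly as follows. This is a minimal illustration, not the asker's code: run_fixed_rate is a hypothetical helper name, and it uses std::this_thread::sleep_until instead of the sleep(0) polling loop, on the assumption that an ordinary blocking sleep is acceptable on your system.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: calls `tick` `count` times, one call per `period`,
// and returns the total elapsed time.
template <class Tick>
std::chrono::steady_clock::duration
run_fixed_rate(int count, std::chrono::milliseconds period, Tick tick)
{
    assert(period.count() > 0);
    const auto start = std::chrono::steady_clock::now();
    auto deadline = start;
    for (int i = 0; i < count; ++i)
    {
        tick();
        // Advance by the ideal period, not by now(): oversleeping on one
        // tick shortens the next sleep, so the error does not accumulate.
        deadline += period;
        // Returns immediately if the deadline has already passed.
        std::this_thread::sleep_until(deadline);
    }
    return std::chrono::steady_clock::now() - start;
}
```

For a 100 ticks/second loop, the period would be std::chrono::milliseconds(10). If a single tick overruns badly, this sketch simply runs the following ticks back to back until it has caught up; whether that is desirable depends on the game.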

Related

How to reduce the (writer side) performance penalty for sharing a small structure between two threads?

I'm adding a profiler to an existing evaluator. The evaluator is implemented in C++ (C++17, x86, MSVC) and I need to store 3 int32, 1 uint16 and 1 uint8 to represent the context of the execution frame. As this is interpreted code, I cannot write any kernel level driver to take snapshots, so we have to add it to the eval loop. As with all profiling, we want to avoid slowing down the actual computation. So far for the motivation/context of this question.
At a high level, the behavior is as follows: before every instruction gets executed by the evaluator, it calls a small function ("hey, I'm here right now"). The sampling profiler is only interested in that frame position once every X ms (or μs, but that's most likely pushing it). So we need 2 threads (let's ignore the serialization for now) that want to share data. We have a very frequent writer (one single thread), and an infrequent reader (a different single thread). We would like to minimize the performance penalty on the write side. Note that sometimes the writer becomes slow, so it might be stuck on a frame for a few seconds; we would like to be able to observe this.
So to help this question, I've written a small benchmark setup.
#include <memory>
#include <chrono>
#include <string>
#include <iostream>
#include <thread>
#include <immintrin.h>
#include <atomic>
#include <cstring>
using namespace std;
typedef struct frame {
int32_t a;
int32_t b;
uint16_t c;
uint8_t d;
} frame;
class ProfilerBase {
public:
virtual void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) = 0;
virtual void Stop() = 0;
virtual ~ProfilerBase() {}
virtual string Name() = 0;
};
class NoOp : public ProfilerBase {
public:
void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) override {}
void Stop() override {}
string Name() override { return "NoOp"; }
};
class JustStore : public ProfilerBase {
private:
frame _current = { 0 };
public:
string Name() override { return "OnlyStoreInMember"; }
void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) override {
_current.a = a;
_current.b = b;
_current.c = c;
_current.d = d;
}
void Stop() override {
if ((_current.a + _current.b + _current.c + _current.d) == _current.a) {
cout << "Make sure optimizer keeps the record around";
}
}
};
class WithSampler : public ProfilerBase {
private:
unique_ptr<thread> _sampling;
atomic<bool> _keepSampling = true;
protected:
const chrono::milliseconds _sampleEvery;
virtual void _snap() = 0;
virtual string _subname() = 0;
public:
WithSampler(chrono::milliseconds sampleEvery): _sampleEvery(sampleEvery) {
_sampling = make_unique<thread>(&WithSampler::_sampler, this);
}
void Stop() override {
_keepSampling = false;
_sampling->join();
}
string Name() override {
return _subname() + to_string(_sampleEvery.count()) + "ms";
}
private:
void _sampler() {
auto nextTick = chrono::steady_clock::now();
while (_keepSampling)
{
const auto sleepTime = nextTick - chrono::steady_clock::now();
if (sleepTime > chrono::milliseconds(0))
{
this_thread::sleep_for(sleepTime);
}
_snap();
nextTick += _sampleEvery;
}
}
};
struct checkedFrame {
frame actual;
int32_t check;
};
// https://rigtorp.se/spinlock/
struct spinlock {
std::atomic<bool> lock_ = { 0 };
void lock() noexcept {
for (;;) {
// Optimistically assume the lock is free on the first try
if (!lock_.exchange(true, std::memory_order_acquire)) {
return;
}
// Wait for lock to be released without generating cache misses
while (lock_.load(std::memory_order_relaxed)) {
// Issue X86 PAUSE or ARM YIELD instruction to reduce contention between
// hyper-threads
_mm_pause();
}
}
}
void unlock() noexcept {
lock_.store(false, std::memory_order_release);
}
};
class Spinlock : public WithSampler {
private:
spinlock _loc;
checkedFrame _current;
public:
using WithSampler::WithSampler;
string _subname() override { return "Spinlock"; }
void EnterFrame(int32_t a, int32_t b, uint16_t c, uint8_t d) override {
_loc.lock();
_current.actual.a = a;
_current.actual.b = b;
_current.actual.c = c;
_current.actual.d = d;
_current.check = a + b + c + d;
_loc.unlock();
}
protected:
void _snap() override {
_loc.lock();
auto snap = _current;
_loc.unlock();
if ((snap.actual.a + snap.actual.b + snap.actual.c + snap.actual.d) != snap.check) {
cout << "Corrupted snap!!\n";
}
}
};
static constexpr int32_t LOOP_MAX = 1000 * 1000 * 1000;
int measure(unique_ptr<ProfilerBase> profiler) {
cout << "Running profiler: " << profiler->Name() << "\n ";
cout << "\tProgress: ";
auto start_time = std::chrono::steady_clock::now();
int r = 0;
for (int32_t x = 0; x < LOOP_MAX; x++)
{
profiler->EnterFrame(x, x + x, x & 0xFFFF, x & 0xFF);
r += x;
if (x % (LOOP_MAX / 1000) == 0)
{
this_thread::sleep_for(chrono::nanoseconds(10)); // simulate that we sometimes do work other than storing
}
if (x % (LOOP_MAX / 10) == 0)
{
cout << static_cast<int>((static_cast<double>(x) / LOOP_MAX) * 10);
}
if (x % 1000 == 0) {
_mm_pause(); // give the other threads some time
}
if (x == (LOOP_MAX / 2)) {
// the first half of the loop we take as warmup
// so now we take the actual time
start_time = std::chrono::steady_clock::now();
}
}
cout << "\n";
const auto done_calc = std::chrono::steady_clock::now();
profiler->Stop();
const auto done_writing = std::chrono::steady_clock::now();
cout << "\tcalc: " << chrono::duration_cast<chrono::milliseconds>(done_calc - start_time).count() << "ms\n";
cout << "\tflush: " << chrono::duration_cast<chrono::milliseconds>(done_writing - done_calc).count() << "ms\n";
return r;
}
int main() {
measure(make_unique<NoOp>());
measure(make_unique<JustStore>());
measure(make_unique<Spinlock>(chrono::milliseconds(1)));
measure(make_unique<Spinlock>(chrono::milliseconds(10)));
return 0;
}
Compiling this with /O2 in x86 mode on my machine gives this output:
Running profiler: NoOp
Progress: 0123456789
calc: 1410ms
flush: 0ms
Running profiler: OnlyStoreInMember
Progress: 0123456789
calc: 1368ms
flush: 0ms
Running profiler: Spinlock1ms
Progress: 0123456789
calc: 3952ms
flush: 4ms
Running profiler: Spinlock10ms
Progress: 0123456789
calc: 3985ms
flush: 11ms
(while this was compiled with msvc in VS2022, I think g++ --std=c++17 -O2 -m32 -pthread -o testing small-test-case.cpp should come close enough).
Here we see that the Spinlock based sampler adds a ~2.5x overhead to the one without any. I've profiled it, and as expected, a lot of time is spent on taking a lock (where in most cases, there was no need for the lock).

First output of delta time always wrong using chrono

I am trying to make a library so that I can reuse my classes in the future. Below is a class that calculates delta time for FPS and other things I might need later. I noticed that the first value printed is always on the order of 1e-07 (for example 2e-07). Is this imprecision caused by using float for the calculation?
class Timer {
public:
Timer();
~Timer();
void Update();
float GetDeltaTime() const;
private:
std::chrono::time_point<std::chrono::high_resolution_clock> myCurrentTime;
std::chrono::time_point<std::chrono::high_resolution_clock> myLastTime;
};
Timer::Timer()
{
myCurrentTime = std::chrono::high_resolution_clock::now();
myLastTime = std::chrono::high_resolution_clock::now();
}
void Timer::Update()
{
myLastTime = myCurrentTime;
myCurrentTime = std::chrono::high_resolution_clock::now();
}
float Timer::GetDeltaTime() const
{
std::chrono::duration<float> deltaTime = std::chrono::duration_cast<std::chrono::duration<float> >(myCurrentTime - myLastTime);
return deltaTime.count();
}
int main()
{
Timer myTimer;
while (true) {
myTimer.Update();
std::cout << "Delta time: " << myTimer.GetDeltaTime() << std::endl; //Delta time: 2e-07 more or less it always become like this for first output
Sleep(1000);
}
}
I tried casting the duration to float before the function runs, but it still produces the same result.

Linker error on Timer based on Singleton pattern

I'm trying to write a Singleton for the first time. This is a timer with a single function that handles both starting and stopping the timer, and also prints the result.
When compiling, I'm getting linker errors like this one:
:-1: error: CMakeFiles/some_algorithms.dir/timer_singleton.cpp.obj:timer_singleton.cpp:(.rdata$.refptr._ZN15timer_singleton7counterE[.refptr._ZN15timer_singleton7counterE]+0x0): undefined reference to `timer_singleton::counter'
What causes this error and how do I fix it?
Here is my source code:
timer_singleton.h
#ifndef TIMER_SINGLETON_H
#define TIMER_SINGLETON_H
#pragma once
#include <iostream>
#include <chrono>
class timer_singleton
{
public:
timer_singleton(timer_singleton & other) = delete;
void operator=(const timer_singleton& other) = delete;
static timer_singleton * getInstance();
static void hit_the_clock();
private:
timer_singleton();
static timer_singleton * instance;
static std::chrono::high_resolution_clock clock;
static std::chrono::high_resolution_clock::time_point start;
static std::chrono::high_resolution_clock::time_point stop;
static size_t counter;
};
#endif // TIMER_SINGLETON_H
timer_singleton.cpp
#include "timer_singleton.h"
timer_singleton::timer_singleton()
{
clock = std::chrono::high_resolution_clock();
start = clock.now();
stop = clock.now();
counter = 0;
}
timer_singleton * timer_singleton::getInstance()
{
if (instance == nullptr)
{
instance = new timer_singleton();
}
return instance;
}
void timer_singleton::hit_the_clock()
{
if (counter % 2 == 1)
{
// Clocks start ticking
start = clock.now();
++counter;
}
else
{
// Clocks stop ticking and print the measured time
stop = clock.now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
std::cout << "Measured time = " << duration.count() << " microseconds" << std::endl;
++counter;
}
}
main.cpp
#include "timer_singleton.h"
// ...
timer_singleton * timer = timer_singleton::getInstance();
timer->hit_the_clock();
// some calculations
timer->hit_the_clock();
The problem: Most non-constant static data members need to be defined outside of the class definition so that there is exactly one instance shared by all objects of the class. Normally this means that in timer_singleton.cpp you would have to add a definition like this for counter (and likewise for the other static members):
size_t timer_singleton::counter = 0; // allocate and initialize
But...
A singleton is effectively already static so the only static member should be the function that gets the instance. This makes the whole problem go away.
New code with comments about other useful changes:
class timer_singleton
{
public:
timer_singleton(timer_singleton &other) = delete;
void operator=(const timer_singleton &other) = delete;
static timer_singleton* getInstance();
void hit_the_clock(); // shouldn't be static
private:
timer_singleton();
// None of these should have been static
std::chrono::high_resolution_clock clock; // This clock could jump around,
// including backward, in time.
// Safer with a steady_clock
std::chrono::high_resolution_clock::time_point start;
std::chrono::high_resolution_clock::time_point stop;
size_t counter;
};
timer_singleton::timer_singleton():
start(clock.now()),
stop(start), // guaranteed to be same as start
counter(0)
{ // assignments replaced with initializations in member initializer list
}
timer_singleton* timer_singleton::getInstance()
{ // now using Meyers singleton
static timer_singleton instance;
return &instance; // consider adjusting to return a reference.
// Often a bit cleaner thanks to the no null guarantee
}
void timer_singleton::hit_the_clock()
{
auto now = clock.now(); // if timing is critical, the first thing
// you do is get the current time.
//if (counter % 2 == 1) // remember counter starts at 0, so first hit
// would stop, not start, the timer.
if (counter % 2 == 0)
{
// Clocks start ticking
start = now;
}
else
{
// Clocks stop ticking and print the measured time
stop = now;
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
std::cout << "Measured time = " << duration.count() << " microseconds" << std::endl;
}
++counter; // don't repeat yourself
}
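As a rough sketch of the two suggestions in the comments above (return a reference rather than a pointer, and prefer steady_clock), a variant might look like the following. The class name stopwatch and its start()/stop() members are hypothetical and chosen only for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical variant of the timer above; all names are illustrative.
class stopwatch
{
public:
    stopwatch(const stopwatch&) = delete;
    stopwatch& operator=(const stopwatch&) = delete;

    // Meyers singleton: constructed on first use, and a reference can
    // never be null, so callers need no null check.
    static stopwatch& instance()
    {
        static stopwatch s;
        return s;
    }

    void start()
    {
        mStart = std::chrono::steady_clock::now();
        mRunning = true;
    }

    // Returns the time elapsed since the matching start().
    std::chrono::microseconds stop()
    {
        const auto now = std::chrono::steady_clock::now();
        assert(mRunning); // stop() without start() is a usage error
        mRunning = false;
        return std::chrono::duration_cast<std::chrono::microseconds>(now - mStart);
    }

private:
    stopwatch() = default;
    std::chrono::steady_clock::time_point mStart{};
    bool mRunning = false;
};
```

Splitting the single toggle function into start() and stop() is a design choice of this sketch, not something the answer requires; it removes the need for the even/odd counter.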

left of '.expression' must have class/struct/union

In this code I have an object that contains two variables, hours and minutes. I'm trying to consecutively add 15 minutes and then 20 minutes to an object called t1, but I get the error: left of '.plus' must have class/struct/union.
Thanks in advance.
#include <iostream>
#include <iomanip>
using namespace std;
class Time {
public:
Time(int u, int m);
Time(int g);
void plus(Time t);
void plus(int x);
void print();
private:
int min;
int hour;
};
void Time::plus(Time t) {
min += t.min;
if (min > 59) {
hour++;
min = min - 60;
}
}
void Time::plus(int x) {
min += x;
if (min > 59) {
hour++;
min = min - 60;
}
}
void Time::print() {
cout << setw(2) << hour << " hour and " << min << " minutes";
}
Time::Time(int u, int m) : hour(u), min(m) {
}
Time::Time(int m) : hour(0), min(m) {
}
int main() {
Time t1(1, 10);
const Time kw(15);
t1.plus(kw).plus(20);
cout << "t1 = "; t1.print(); cout << endl;
cin.get();
return 0;
}
Right now, your plus returns void, or nothing! So you can't do t1.plus(kw).plus(20);. If you want to, you need to have your .plus() return a Time:
class Time {
public:
Time &plus(Time t);
Time &plus(int x);
...
};
I've gone ahead and made it return a Time by reference so that when you chain the function like you are, the next plus will still modify the original object the first plus was called on! We can do this if you implement the plus functions like:
Time &Time::plus(Time t) {
min += t.min;
if (min > 59) {
hour++;
min = min - 60;
}
return *this; //return ourselves so that the next func will be called on us too!
}
Time &Time::plus(int x) {
min += x;
if (min > 59) {
hour++;
min = min - 60;
}
return *this; //return ourselves so that the next func will be called on us too!
}

Timer working improperly

I have a problem with a timer class based on a SDL timer.
class CTimer
{
public:
CTimer(): startTick(0), endTick(0), curTime(0), running(false) {};
void Start() { startTick += SDL_GetTicks() - endTick; running = true; };
void Stop() { endTick = SDL_GetTicks(); running = false; };
void Reset() { startTick = 0; endTick = 0; curTime = 0; };
inline int operator()() { return running ? curTime = ((int) SDL_GetTicks - startTick) / 1000 + 1 : curTime; };
private:
int startTick;
int endTick;
int curTime;
bool running;
};
The () operator should return time in seconds (stored in curTime). But it always returns 4202 (curTime is always equal to that). What am I doing wrong?
Test code:
int main()
{
SDL_Init (SDL_INIT_TIMER);
CApp::CTimer timer;
timer.Start();
for (int i = 0; i < 15; ++i)
{
SDL_Delay (1000);
std::cout << timer() << '\n';
}
return 0;
}
This is a perfect example of why you don't want to use old-style C casts in C++.
(int) SDL_GetTicks
The missing parentheses on the function call mean you're casting a pointer to the function to an int, not the return value. Surprisingly enough the pointer to the function never changes.
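To illustrate the point about casts, here is a small sketch that builds without SDL. GetTicks is a hypothetical stand-in for SDL_GetTicks and returns a fixed value; elapsed_seconds is likewise an invented name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for SDL_GetTicks(), so the example builds without SDL.
std::uint32_t GetTicks() { return 5000; }

int elapsed_seconds(int startTick)
{
    assert(startTick >= 0);
    // With a C++-style cast the original mistake is a compile error:
    //   static_cast<int>(GetTicks)   // ill-formed: function pointer to int
    // whereas the C-style cast (int) GetTicks is accepted and silently
    // converts the function's address.
    return (static_cast<int>(GetTicks()) - startTick) / 1000;
}
```

With the stand-in value above, elapsed_seconds(2000) evaluates to 3.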
Are you missing parentheses for SDL_GetTicks?
inline int operator()() { return running ? curTime = ((int) SDL_GetTicks - startTick) / 1000 + 1 : curTime; };
For starters,
inline int operator()() { return running ? curTime =
((int) SDL_GetTicks - startTick) / 1000 + 1 : curTime; };
should be
inline int operator()() { return running ? curTime =
((int) SDL_GetTicks() - startTick) / 1000 + 1 : curTime; };
I would think.
Did you get a compiler warning about this?
In addition to the attempted call to SDL_GetTicks instead of SDL_GetTicks() causing it to take the address of that function (and always returning the constant as you observed), it looks like if you call Start and then Stop before calling operator() you won't get a meaningful result.