Building a Timeline from Lossy Time Stamps - C++

I'm working with a product from Velodyne called the VLP-16 (the manual, with details, is available from their website) and I'm trying to build a timeline of the data it sends. The data arrives over UDP, so packets may appear out of order, and each packet carries a 32-bit microsecond timestamp that is synced with UTC time. The timestamp wraps back to zero at the top of each UTC hour (one hour is 3,600,000,000 µs, which still fits in a uint32_t, so the reset happens at the hour boundary rather than at integer overflow). Since UDP packets may arrive out of order, it is difficult to know which hour a packet belongs to.
Here's a snippet of code that generally describes the problem at hand:
struct LidarPacket
{
    uint32_t microsecond;
    /* other data */
};

struct LidarTimelineEntry
{
    uint32_t hour;
    LidarPacket packet;
};

using LidarTimeline = std::vector<LidarTimelineEntry>;

void InsertAndSort(LidarTimeline& timeline, uint32_t hour, const LidarPacket&);

void OnLidarPacket(LidarTimeline& timeline, LidarPacket& newestPacket)
{
    /* Where to insert 'newestPacket'? */
}
The simplest approach would be to assume that the packets come in order.
void OnLidarPacket(LidarTimeline& timeline, LidarPacket& newestPacket)
{
    if (timeline.empty()) {
        timeline.emplace_back(LidarTimelineEntry{0, newestPacket});
        return;
    }
    auto& lastEntry = timeline.back();
    if (newestPacket.microsecond < lastEntry.packet.microsecond) {
        InsertAndSort(timeline, lastEntry.hour + 1, newestPacket);
    } else {
        InsertAndSort(timeline, lastEntry.hour, newestPacket);
    }
}
This approach fails if even one packet arrives out of order, though. A slightly more robust variant also checks whether the apparent wrap occurs near the end of the hour.
bool NearEndOfHour(const LidarPacket& lidarPacket)
{
    const uint32_t packetDuration = 1344;  // <- approximate duration of one packet
    const uint32_t oneHour = 3600000000u;  // <- one hour in microseconds
    return (lidarPacket.microsecond < packetDuration)
        || (lidarPacket.microsecond > (oneHour - packetDuration));
}

void OnLidarPacket(LidarTimeline& timeline, LidarPacket& newestPacket)
{
    if (timeline.empty()) {
        timeline.emplace_back(LidarTimelineEntry{0, newestPacket});
        return;
    }
    auto& lastEntry = timeline.back();
    if ((newestPacket.microsecond < lastEntry.packet.microsecond) && NearEndOfHour(lastEntry.packet)) {
        InsertAndSort(timeline, lastEntry.hour + 1, newestPacket);
    } else {
        InsertAndSort(timeline, lastEntry.hour, newestPacket);
    }
}
But it's difficult to tell whether this will really cut it. What is the best way to build a multi-hour timeline from microsecond-stamped data arriving over a UDP stream?
(It doesn't have to be answered in C++.)

The approach I ended up going with is the following (a sketch follows the list):
1. If the timeline is empty, add the time point to the timeline with hour = 0.
2. Otherwise, create three candidate time points: one using the hour of the last time point, one using the hour before it, and one using the hour after it.
3. Compute the absolute difference of each candidate with the last time point (using a 64-bit integer).
4. Select the hour that yields the smallest absolute difference with the last time point.
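Here's a minimal sketch of that selection step. The helper ToMicros64 is my own illustrative name, and since the tests below produce hour = -1, I've assumed the hour field and InsertAndSort take a signed integer (unlike the uint32_t in the question):
// Hypothetical helper: absolute 64-bit time of an (hour, microsecond) pair.
// Needs <cstdint> for the fixed-width types and INT64_MAX.
int64_t ToMicros64(int64_t hour, uint32_t microsecond)
{
    const int64_t oneHour = 3600000000LL; // one hour in microseconds
    return hour * oneHour + microsecond;
}

void OnLidarPacket(LidarTimeline& timeline, LidarPacket& newestPacket)
{
    if (timeline.empty()) {
        timeline.emplace_back(LidarTimelineEntry{0, newestPacket});
        return;
    }
    const auto& lastEntry = timeline.back();
    const int64_t lastTime = ToMicros64(lastEntry.hour, lastEntry.packet.microsecond);

    // Candidate hours: same as the last entry, one earlier, one later.
    int32_t bestHour = lastEntry.hour;
    int64_t bestDiff = INT64_MAX;
    for (int32_t hour = lastEntry.hour - 1; hour <= lastEntry.hour + 1; hour++) {
        int64_t diff = ToMicros64(hour, newestPacket.microsecond) - lastTime;
        if (diff < 0) diff = -diff; // absolute difference
        if (diff < bestDiff) {
            bestDiff = diff;
            bestHour = hour;
        }
    }
    InsertAndSort(timeline, bestHour, newestPacket);
}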
This approach passed the following tests:
uint32_t inputTimes[] = { 0, 750, 250 };
// gets sorted to:
uint32_t outputTimes[] = { 0, 250, 750 };
int32_t outputHours[] = { 0, 0, 0 };

uint32_t inputTimes[] = { 1500, oneHour - 500, 250 };
// gets sorted to:
uint32_t outputTimes[] = { oneHour - 500, 250, 1500 };
int32_t outputHours[] = { -1, 0, 0 };

uint32_t inputTimes[] = { oneHour - 500, 1500, 250 };
// gets sorted to:
uint32_t outputTimes[] = { oneHour - 500, 250, 1500 };
int32_t outputHours[] = { 0, 1, 1 };

Related

Fastest way to process HTTP requests

I am currently working on creating a network of multi-sensors (measuring temperature, humidity, etc.). There will be tens, or in some buildings even hundreds, of sensors measuring at the same time. All these sensors send their data via an HTTP GET request to a local ESP32 server that processes the data and converts it into whatever the building's control system can work with (KNX, BACnet, Modbus). I stress-tested this server and found that it can process around 1400 requests per minute before the sender stops getting a response. This seems like a high amount, but if a sensor sends its data every 2 seconds it means there will be a limit of around 45 sensors. I need to find a way to process such a request more quickly. This is the code I currently use:
server.on("/get-data", HTTP_GET, [](AsyncWebServerRequest *request)
          { handle_get_data(request); request->send(200); });

void handle_get_data(AsyncWebServerRequest *request)
{
    packetval++;
    sensorData.humidity = request->arg("humidity").toFloat();
    sensorData.temperature = request->arg("temperature").toFloat();
    sensorData.isMovement = request->arg("isMovement");
    sensorData.isSound = request->arg("isSound");
    sensorData.luxValue = request->arg("luxValue").toDouble();
    sensorData.RSSI = request->arg("signalValue").toInt();
    sensorData.deviceID = request->arg("deviceID");
    sensorData.btList = request->arg("btList");
    if (deviceList.indexOf(sensorData.deviceID) == -1)
    {
        deviceList += sensorData.deviceID;
        activeSensors++;
    }
    if (sensorData.isMovement || sensorData.isSound)
    {
        sendDataFlag = true;
    }
}
I use the AsyncTCP library.
Now I measured the execution time of handle_get_data() and it turns out it is only ~175 µs, which is very quick. However, the time between two calls of handle_get_data() is around 6 ms, which is really slow, but it still doesn't explain why I can only process 1400 requests per minute, i.e. about 24 per second (6 ms ≈ 155 Hz, so why is my limit 24 Hz?). Other than that I do not run any other code during the processing of a request. Is it perhaps a limitation of the library? Is there another way to process such a request?
A request looks like this: http://192.168.6.51:80/get-data?humidity=32.0&temperature=32.0&isMovement=1&isSound=1&luxValue=123&RSSI=32&deviceID=XX:XX:XX:XX:XX:XX&btList=d1d2d3d4d5d6d7
If there is really nothing I can do, I can always switch to a Raspberry Pi to process everything, but I would rather stick with the ESP32 since I want to easily create my own PCB.
Thanks for all the help!
Creating a WebSocket instead of using HTTP requests solved the issue for me:
AsyncWebSocket ws("/ws");

void setup()
{
    ws.onEvent(onWsEvent);
    server.addHandler(&ws);
}

AsyncWebSocketClient *wsClient;

void onWsEvent(AsyncWebSocket *server, AsyncWebSocketClient *client, AwsEventType type, void *arg, uint8_t *data, size_t len)
{
    if (type == WS_EVT_DATA)
    {
        AwsFrameInfo *info = (AwsFrameInfo *)arg;
        String msg = "";
        packetval++;
        if (info->final && info->index == 0 && info->len == len)
        {
            if (info->opcode == WS_TEXT)
            {
                for (size_t i = 0; i < info->len; i++)
                {
                    msg += (char)data[i];
                }
            }
        }
        sensorData.humidity = msg.substring(msg.indexOf("<hum>") + 5, msg.indexOf("</hum>")).toFloat();
        sensorData.temperature = msg.substring(msg.indexOf("<tem>") + 5, msg.indexOf("</tem>")).toFloat();
        sensorData.isMovement = (msg.substring(msg.indexOf("<isMov>") + 7, msg.indexOf("</isMov>")) == "1");
        sensorData.isSound = (msg.substring(msg.indexOf("<isSnd>") + 7, msg.indexOf("</isSnd>")) == "1");
        sensorData.luxValue = msg.substring(msg.indexOf("<lux>") + 5, msg.indexOf("</lux>")).toDouble();
        sensorData.RSSI = msg.substring(msg.indexOf("<RSSI>") + 6, msg.indexOf("</RSSI>")).toInt();
        sensorData.deviceID = msg.substring(msg.indexOf("<dID>") + 5, msg.indexOf("</dID>"));
        sensorData.btList = msg.substring(msg.indexOf("<bt>") + 4, msg.indexOf("</bt>"));
        if (deviceList.indexOf(sensorData.deviceID) == -1)
        {
            deviceList += sensorData.deviceID;
            activeSensors++;
        }
        if (sensorData.isMovement || sensorData.isSound)
        {
            sendDataFlag = true;
        }
    }
}
This will process more than 11000 packets per minute (200 kB/s). The execution time of onWsEvent() is now ~500 µs, which means there is definitely optimising to do in this function, but the time between two calls is reduced all the way to 1 ms.
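A quick note on that optimisation: the per-character msg += (char)data[i]; loop can reallocate the String repeatedly. A small tweak I'd expect to help (assuming the standard Arduino String API's reserve(); untested here) is to pre-allocate the buffer once up front:
if (info->opcode == WS_TEXT)
{
    msg.reserve(info->len); // pre-allocate once instead of growing per character
    for (size_t i = 0; i < info->len; i++)
    {
        msg += (char)data[i]; // now appends without further reallocation
    }
}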

Why is the xTimerPeriodInTicks affecting the code's runtime?

So for a report I'm trying to time the number of cycles it takes to execute a couple of separate functions in a program running on an ESP32, using FreeRTOS. The code looks something like this, with init_input_generator() being the first function called; here I'm timing onesecwindow.fifo_iir_filter_datapoint():
float geninputarray[SAMPLING_RATE]; // stores the values of an artificial signal we can use to test the algorithm
fifo_buffer onesecwindow = fifo_buffer(SAMPLING_RATE);

esp_err_t init_input_generator() {
    dsps_tone_gen_f32(geninputarray, SAMPLING_RATE, 4, 0.04, 0); // 10Hz
    TimerHandle_t input_signal_timer =
        xTimerCreate("input_timer", pdMS_TO_TICKS(10), pdTRUE, (void *)0, input_generator_callback);
    xTimerStart(input_signal_timer, 0);
    return ESP_OK;
}

void input_generator_callback(TimerHandle_t xTimer)
{
    xTaskCreatePinnedToCore(
        input_counter,
        "input_generator",
        4096,
        NULL,
        tskIDLE_PRIORITY + 11,
        NULL,
        PRO_CPU_NUM);
}

void input_counter(void *pvParameters) // task functions take void*, not TimerHandle_t
{
    stepcounter = (stepcounter + 1) % SAMPLING_RATE;
    insert_new_data(geninputarray[stepcounter]);
    vTaskDelete(nullptr); // to gracefully end the task, as returning is not allowed
}

esp_err_t insert_new_data(float datapoint)
{
    onesecwindow.fifo_write(datapoint); // writes the new datapoint into the one-second window
    unsigned int start_time = dsp_get_cpu_cycle_count();
    onesecwindow.fifo_iir_filter_datapoint();
    unsigned int end_time = dsp_get_cpu_cycle_count();
    printf("%u\n", (end_time - start_time));
    return ESP_OK; // was missing; a non-void function must return a value
}
The thing that I'm noticing is that whenever I change the xTimerPeriodInTicks to a larger number, I get significantly longer time results, which really confuses me and gets in the way of proper timing. Although the function doesn't scale with SAMPLING_RATE and should therefore give quite consistent results for each loop, the output in cycles with a 10 ms timer period looks something like this (every 5 or 6 "loops", for some reason, it's longer):
1567, 630, 607, 624, 591, 619, 649, 1606, 607
With a 40 ms timer period I get this as a typical output:
1904, 600, 1894, 616, 1928, 1928, 607, 1897, 628
As such I'm confused by the output: both cases use the same FreeRTOS task creation, running with the same priority on the same core, so I don't see why there would be any difference. Perhaps I'm misunderstanding something basic/fundamental here, so any help would be greatly appreciated.
Update
So, based on the comments by @Tarmo, I have restructured the approach to use a recurring task; unfortunately the output still seems to suffer from the same problem. The code now looks like this:
#include <esp_dsp.h> // Official ESP-DSP library

float geninputarray[SAMPLING_RATE]; // stores the values of an artificial signal we can use to test the algorithm
fifo_buffer onesecwindow = fifo_buffer(SAMPLING_RATE);

esp_err_t init_input_generator() {
    dsps_tone_gen_f32(geninputarray, SAMPLING_RATE, 4, 0.04, 0); // 10Hz
    xTaskCreatePinnedToCore(
        input_counter, // the actual function to be called
        "input_generator",
        4096,
        NULL,
        tskIDLE_PRIORITY + 5,
        NULL,
        APP_CPU_NUM);
    return ESP_OK;
}

void input_counter(void *pvParameters) // task functions take void*, not TimerHandle_t
{
    while (true)
    {
        stepcounter = (stepcounter + 1) % SAMPLING_RATE;
        insert_new_data(geninputarray[stepcounter]);
        vTaskDelay(4);
    }
}

esp_err_t insert_new_data(float datapoint)
{
    onesecwindow.fifo_write(datapoint); // writes the new datapoint into the one-second window
    unsigned int start_time = dsp_get_cpu_cycle_count();
    onesecwindow.fifo_iir_filter_datapoint();
    unsigned int end_time = dsp_get_cpu_cycle_count();
    printf("%u\n", (end_time - start_time));
    return ESP_OK; // was missing; a non-void function must return a value
}
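One measurement-hygiene trick that may help separate the function's own cost from cache and interrupt noise (a sketch of a common technique, not a fix for the scheduling question itself): time the call several times back-to-back and keep the minimum. Note this advances the filter state on every repetition, so it only makes sense as a one-off benchmark.
#include <climits> // for UINT_MAX

unsigned int best = UINT_MAX;
for (int i = 0; i < 8; i++) // repeat count chosen arbitrarily
{
    unsigned int start_time = dsp_get_cpu_cycle_count();
    onesecwindow.fifo_iir_filter_datapoint();
    unsigned int end_time = dsp_get_cpu_cycle_count();
    if (end_time - start_time < best)
        best = end_time - start_time; // the minimum approximates the warm-cache cost
}
printf("%u\n", best);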

DPDK 17.11.1 - drops seen when doing destination based rate limiting

Editing the problem statement to highlight more of the core logic.
We are seeing performance issues when doing destination-based rate limiting.
We maintain state for every {destination, source} pair (a max of 100 destinations and 2^16 sources). We have an array of 100 nodes, and at each node we have an rte_hash*. This hash table maintains the state of every source IP seen by that destination. We have a mapping for every destination seen (0 to 100), and this is used to index into the array. If a particular source exceeds the threshold defined for its destination within a second, we block the source; otherwise we allow it. At runtime, when we see traffic for only 2 or 3 destinations, there are no issues, but when we go beyond 5 we are seeing a lot of drops. Our function has to do a lookup and identify the flow matching the dest_ip and src_ip, process the flow, and decide whether it needs dropping. If the flow is not found, it is added to the hash.
struct flow_state {
    struct rte_hash* hash;
};

struct flow_state flow_state_arr[100];

// I am going to create these hash tables using rte_hash_create at pipeline_init
// and free them during pipeline_free.
I am outlining what we do in pseudocode:
run()
{
    1) do rx
    2) from the pkt, get the index into flow_state_arr and retrieve the rte_hash* handle
    3) rte_hash_lookup_data(hash, src_ip, flow_data)
    4) if an entry is found, take a decision on the flow (the decision is simply rate limiting the flow)
    5) else rte_hash_add_data(hash, src_ip, new_flow_data) to add the flow to the table and forward
}
Please advise whether we can have these multiple hash table objects in the data path, or what the best way is if we need to handle state for every destination separately.
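For reference, a minimal sketch of how those per-destination tables could be created at pipeline_init (an illustrative fragment, not our production init code; the table sizing and the choice of rte_jhash are assumptions):
#include <stdio.h>
#include <rte_hash.h>
#include <rte_jhash.h>

#define MAX_DESTINATIONS 100
#define TABLE_MAX_CAPACITY 65536

/* Create one rte_hash per destination slot; returns 0 on success. */
static int create_flow_tables(struct flow_state* arr, int socket_id)
{
    char name[RTE_HASH_NAMESIZE];
    for (int i = 0; i < MAX_DESTINATIONS; i++) {
        snprintf(name, sizeof(name), "flow_hash_%d", i);
        struct rte_hash_parameters params = {
            .name = name,
            .entries = TABLE_MAX_CAPACITY,  /* illustrative sizing */
            .key_len = sizeof(uint32_t),    /* keyed by source IP */
            .hash_func = rte_jhash,         /* assumed hash function */
            .hash_func_initial_val = 0,
            .socket_id = socket_id,
        };
        arr[i].hash = rte_hash_create(&params);
        if (arr[i].hash == NULL)
            return -1; /* creation failed; caller frees what was created */
    }
    return 0;
}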
Edit
Thanks for answering. I will be glad to share the code snippets and our gathered results. I don't have comparison results for other DPDK versions, but below are some of the results of our tests using 17.11.1.
Test Setup
I am using an IXIA traffic generator (using two 10G links to generate 12 Mpps) for 3 destinations, 14.143.156.x (in this case: 101, 102, 103). Each destination's traffic comes from 2^16 different sources. This is the traffic-gen setup.
Code Snippet
struct flow_state_t {
    struct rte_hash* hash;
    uint32_t size;
    uint64_t threshold;
};

struct flow_data_t {
    uint8_t curr_state; // 0 if blocked, 1 if allowed
    uint64_t pps_count;
    uint64_t src_first_seen;
};

struct pipeline_ratelimit {
    struct pipeline p;
    struct pipeline_ratelimit_params params;
    rte_table_hash_op_hash f_hash;
    uint32_t swap_field0_offset[SWAP_DIM];
    uint32_t swap_field1_offset[SWAP_DIM];
    uint64_t swap_field_mask[SWAP_DIM];
    uint32_t swap_n_fields;
    pipeline_msg_req_handler custom_handlers[2]; // handlers for add and del
    struct flow_state_t flow_state_arr[100];
    struct flow_data_t flows[100][65536];
} __rte_cache_aligned;
/*
 * add_handler(pipeline, msg) -- msg includes index and threshold
 * In the add handler:
 *   - a rule/threshold is added for a destination
 *   - rte_hash_create is called and the rte_hash* is stored in flow_state_arr[index]
 *   - a max of 100 destinations or rules is allowed
 *   - previous pipelines add the ID (index) to the packet, used to look up
 *     the rule in flow_state_arr
 */

/*
 * del_handler(pipeline, msg) -- msg includes index
 * In the del handler:
 *   - rule/threshold #index is deleted
 *   - the associated rte_hash* is also freed
 *   - the slot is made free
 */
#define ALLOWED 1
#define BLOCKED 0
#define TABLE_MAX_CAPACITY 65536

int do_rate_limit(struct pipeline_ratelimit* ps, uint32_t id, unsigned char* pkt)
{
    uint64_t curr_time_stamp = rte_get_timer_cycles();
    struct iphdr* iph = (struct iphdr*)pkt;
    uint32_t src_ip = rte_be_to_cpu_32(iph->saddr);

    struct flow_state_t* node = &ps->flow_state_arr[id];
    struct flow_data_t* flow = NULL;
    rte_hash_lookup_data(node->hash, &src_ip, (void**)&flow);

    if (flow != NULL)
    {
        if (flow->curr_state == ALLOWED)
        {
            if (flow->pps_count++ > node->threshold)
            {
                uint64_t seconds_elapsed = (curr_time_stamp - flow->src_first_seen) / CYCLES_IN_1_SEC;
                if (seconds_elapsed)
                {
                    flow->src_first_seen += seconds_elapsed * CYCLES_IN_1_SEC;
                    flow->pps_count = 1;
                    return ALLOWED;
                }
                else
                {
                    flow->pps_count = 0;
                    flow->curr_state = BLOCKED;
                    return BLOCKED;
                }
            }
            return ALLOWED;
        }
        else
        {
            uint64_t seconds_elapsed = (curr_time_stamp - flow->src_first_seen) / CYCLES_IN_1_SEC;
            if (seconds_elapsed > 120)
            {
                flow->curr_state = ALLOWED;
                flow->pps_count = 0;
                flow->src_first_seen += seconds_elapsed * CYCLES_IN_1_SEC;
                return ALLOWED;
            }
            return BLOCKED;
        }
    }

    int index = node->size;
    // If the entry was not found and we have reached capacity,
    // reset the table and start filling it again from slot 0
    if (node->size == TABLE_MAX_CAPACITY)
    {
        rte_hash_reset(node->hash);
        index = node->size = 0;
    }
    // Add the new element at packet_flows[mit_id][index]
    struct flow_data_t* flow_data = &ps->flows[id][index];
    *flow_data = (struct flow_data_t){ ALLOWED, 1, curr_time_stamp };
    node->size++;
    // Add the new key to the hash
    rte_hash_add_key_data(node->hash, (void*)&src_ip, (void*)flow_data);
    return ALLOWED;
}
static int pipeline_ratelimit_run(void* pipeline)
{
    struct pipeline_ratelimit* ps = (struct pipeline_ratelimit*)pipeline;
    struct pipeline* p = &ps->p;
    struct rte_port_in* port_in = p->port_in_next;
    struct rte_port_out* port_out = &p->ports_out[0];
    struct rte_port_out* port_drop = &p->ports_out[2];

    uint8_t valid_pkt_cnt = 0, invalid_pkt_cnt = 0;
    struct rte_mbuf* valid_pkts[RTE_PORT_IN_BURST_SIZE_MAX];
    struct rte_mbuf* invalid_pkts[RTE_PORT_IN_BURST_SIZE_MAX];
    memset(valid_pkts, 0, sizeof(valid_pkts));
    memset(invalid_pkts, 0, sizeof(invalid_pkts));

    uint64_t n_pkts;
    if (unlikely(port_in == NULL)) {
        return 0;
    }

    /* Input port RX */
    n_pkts = port_in->ops.f_rx(port_in->h_port, p->pkts,
                               port_in->burst_size);
    if (n_pkts == 0)
    {
        p->port_in_next = port_in->next;
        return 0;
    }

    char* rx_pkt = NULL;
    for (uint64_t j = 0; j < n_pkts; j++) {
        struct rte_mbuf* m = p->pkts[j];
        rx_pkt = rte_pktmbuf_mtod(m, char*);
        uint32_t id = rte_be_to_cpu_32(*(uint32_t*)(rx_pkt - sizeof(uint32_t)));
        unsigned short packet_len = rte_be_to_cpu_16(*((unsigned short*)(rx_pkt + 16)));
        struct flow_state_t* node = &(ps->flow_state_arr[id]);
        if (node->hash && node->threshold != 0)
        {
            // Decide whether to allow or drop the packet
            // (returns allow - 1, drop - 0)
            if (do_rate_limit(ps, id, (unsigned char*)(rx_pkt + 14)))
                valid_pkts[valid_pkt_cnt++] = m;
            else
                invalid_pkts[invalid_pkt_cnt++] = m;
        }
        else
            valid_pkts[valid_pkt_cnt++] = m;
    }

    if (invalid_pkt_cnt) {
        p->pkts_mask = 0;
        rte_memcpy(p->pkts, invalid_pkts, sizeof(invalid_pkts));
        p->pkts_mask = RTE_LEN2MASK(invalid_pkt_cnt, uint64_t);
        rte_pipeline_action_handler_port_bulk_mod(p, p->pkts_mask, port_drop);
    }

    p->pkts_mask = 0;
    memset(p->pkts, 0, sizeof(p->pkts));
    if (valid_pkt_cnt != 0)
    {
        rte_memcpy(p->pkts, valid_pkts, sizeof(valid_pkts));
        p->pkts_mask = RTE_LEN2MASK(valid_pkt_cnt, uint64_t);
    }
    rte_pipeline_action_handler_port_bulk_mod(p, p->pkts_mask, port_out);

    /* Pick candidate for next port IN to serve */
    p->port_in_next = port_in->next;
    return (int)n_pkts;
}
RESULTS
When traffic was generated for only one destination from 60000 sources with a threshold of 14 Mpps, there were no drops. We were able to send 12 Mpps from IXIA and receive 12 Mpps.
Drops were observed after adding 3 or more destinations (each configured to receive traffic from 60000 sources). The throughput was only 8-9 Mpps. When traffic was sent for 100 destinations (60000 sources each), only 6.4 Mpps were handled; a 50% drop was seen.
On running it through the VTune profiler, it reported rte_hash_lookup_data as the hotspot, and mostly memory bound (DRAM bound). I will attach the VTune report soon.
Based on the update from internal testing, the rte_hash library is not causing the performance drops. Hence, as suggested in the comments, it is more likely due to the current access pattern and algorithm design, which may be leading to cache misses and fewer instructions per cycle.
To identify whether it is a frontend stall, a backend pipeline stall, or a memory stall, please use either perf or VTune. Also try to minimize branching, and use likely()/unlikely() hints and prefetching too.
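To illustrate the prefetch suggestion (an illustrative fragment, not the poster's code), the burst loop in pipeline_ratelimit_run() could warm the cache for the next packet one iteration ahead, using DPDK's rte_prefetch0() and the likely() hint from rte_branch_prediction.h:
#include <rte_prefetch.h>
#include <rte_branch_prediction.h>

for (uint64_t j = 0; j < n_pkts; j++) {
    /* Prefetch the next packet's header while processing the current one,
       so the header load has (hopefully) completed by the next iteration. */
    if (likely(j + 1 < n_pkts))
        rte_prefetch0(rte_pktmbuf_mtod(p->pkts[j + 1], char*));

    struct rte_mbuf* m = p->pkts[j];
    /* ... classify m exactly as in pipeline_ratelimit_run() above ... */
}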

How to write a function for multiple analog pins? (Arduino)

So I'm writing this little function for some pot pins. The pot sends a value only when it's being turned; at rest, it sends nothing, which is how I want it to function.
It works fine with one pin.
I've gotten it to a point where it half works with multiple pins. If I call it twice in the loop with two pins, I get back the right values on both those pins, but I lose the functionality of the if statement. Basically, I can't figure out the last half of this. Arrays have been suggested; I'm just unsure how to proceed.
Suggestions? Thank you.
byte pots[2] = {A0, A2};
int lastPotVal = 0;

void setup(){
    Serial.begin(9600);
}

void loop(){
    // get the pin out of the array
    rePot(pots[0]);
    rePot(pots[1]);
    delay(10);
}

void rePot(const int potPin){
    // there is probably an issue around here somewhere...
    int potThresh = 2;
    int potFinal = 0;
    int potVal = 0;
    // set and map potVal
    potVal = (analogRead(potPin));
    potVal = map(potVal, 0, 664, 0, 200);
    if(abs(potVal - lastPotVal) >= potThresh){
        potFinal = (potVal/2);
        Serial.println(potFinal);
        lastPotVal = potVal;
    } // end of if statement
} // end of rePot
This uses a struct to manage a pot and the data associated with it (the pin it's on, the last reading, the threshold, etc.). Then the rePot() function is changed to take one of those structs as input, instead of just the pin number.
struct Pot {
    byte pin;
    int threshold;
    int lastReading;
    int currentReading;
};

// Defining an array of 2 Pots, one with pin A0 and threshold 2, the
// other with pin A2 and threshold 3. Everything else is automatically
// initialized to 0 (i.e. lastReading, currentReading). The order that
// the fields are entered determines which variable they initialize, so
// {A1, 4, 5} would be pin = A1, threshold = 4 and lastReading = 5.
struct Pot pots[] = { {A0, 2}, {A2, 3} };

void rePot(struct Pot * pot) {
    int reading = map(analogRead(pot->pin), 0, 664, 0, 200);
    if(abs(reading - pot->lastReading) >= pot->threshold) {
        pot->currentReading = (reading/2);
        Serial.println(pot->currentReading);
        pot->lastReading = reading;
    }
}

void setup(){
    Serial.begin(9600);
}

void loop() {
    rePot(&pots[0]);
    rePot(&pots[1]);
    delay(10);
}
A slightly different take on this is to change rePot() into a function that takes the whole array as input, and then just updates the whole thing. Like this:
void readAllThePots(struct Pot * pot, int potCount) {
    for(int i = 0; i < potCount; i++) {
        int reading = map(analogRead(pot[i].pin), 0, 664, 0, 200);
        if(abs(reading - pot[i].lastReading) >= pot[i].threshold) {
            pot[i].currentReading = (reading/2);
            Serial.println(pot[i].currentReading);
            pot[i].lastReading = reading;
        }
    }
}

void loop() {
    readAllThePots(pots, 2);
    delay(10);
}

StepTimer.GetTotalSeconds within StepTimer.h returning 0 - C++, Visual Studio 2013, DirectX app

I'm trying to use the given code within StepTimer.h to set up code that will run every two seconds. However, with the code below, timer.GetTotalSeconds() always returns 0.
Unfortunately there isn't much information readily available on StepTimer.h (at least, I believe so, due to a lack of useful search results), so I was hoping someone might be able to shed some light on why the timer isn't recording the elapsed seconds. Am I using it incorrectly?
Code from Game.h, Game.cpp and StepTimer.h are included below. Any help is greatly appreciated.
From Game.cpp:
double time = timer.GetTotalSeconds();
if (time >= 2) {
    laser_power++;
    timer.ResetElapsedTime();
}
Initialised in Game.h:
DX::StepTimer timer;
From Common/StepTimer.h:
#pragma once

#include <wrl.h>

namespace DX
{
    // Helper class for animation and simulation timing.
    class StepTimer
    {
    public:
        StepTimer() :
            m_elapsedTicks(0),
            m_totalTicks(0),
            m_leftOverTicks(0),
            m_frameCount(0),
            m_framesPerSecond(0),
            m_framesThisSecond(0),
            m_qpcSecondCounter(0),
            m_isFixedTimeStep(false),
            m_targetElapsedTicks(TicksPerSecond / 60)
        {
            if (!QueryPerformanceFrequency(&m_qpcFrequency))
            {
                throw ref new Platform::FailureException();
            }

            if (!QueryPerformanceCounter(&m_qpcLastTime))
            {
                throw ref new Platform::FailureException();
            }

            // Initialize max delta to 1/10 of a second.
            m_qpcMaxDelta = m_qpcFrequency.QuadPart / 10;
        }

        // Get elapsed time since the previous Update call.
        uint64 GetElapsedTicks() const { return m_elapsedTicks; }
        double GetElapsedSeconds() const { return TicksToSeconds(m_elapsedTicks); }

        // Get total time since the start of the program.
        uint64 GetTotalTicks() const { return m_totalTicks; }
        double GetTotalSeconds() const { return TicksToSeconds(m_totalTicks); }

        // Get total number of updates since start of the program.
        uint32 GetFrameCount() const { return m_frameCount; }

        // Get the current framerate.
        uint32 GetFramesPerSecond() const { return m_framesPerSecond; }

        // Set whether to use fixed or variable timestep mode.
        void SetFixedTimeStep(bool isFixedTimestep) { m_isFixedTimeStep = isFixedTimestep; }

        // Set how often to call Update when in fixed timestep mode.
        void SetTargetElapsedTicks(uint64 targetElapsed) { m_targetElapsedTicks = targetElapsed; }
        void SetTargetElapsedSeconds(double targetElapsed) { m_targetElapsedTicks = SecondsToTicks(targetElapsed); }

        // Integer format represents time using 10,000,000 ticks per second.
        static const uint64 TicksPerSecond = 10000000;

        static double TicksToSeconds(uint64 ticks) { return static_cast<double>(ticks) / TicksPerSecond; }
        static uint64 SecondsToTicks(double seconds) { return static_cast<uint64>(seconds * TicksPerSecond); }

        // After an intentional timing discontinuity (for instance a blocking IO operation)
        // call this to avoid having the fixed timestep logic attempt a set of catch-up
        // Update calls.
        void ResetElapsedTime()
        {
            if (!QueryPerformanceCounter(&m_qpcLastTime))
            {
                throw ref new Platform::FailureException();
            }

            m_leftOverTicks = 0;
            m_framesPerSecond = 0;
            m_framesThisSecond = 0;
            m_qpcSecondCounter = 0;
        }

        // Update timer state, calling the specified Update function the appropriate number of times.
        template<typename TUpdate>
        void Tick(const TUpdate& update)
        {
            // Query the current time.
            LARGE_INTEGER currentTime;
            if (!QueryPerformanceCounter(&currentTime))
            {
                throw ref new Platform::FailureException();
            }

            uint64 timeDelta = currentTime.QuadPart - m_qpcLastTime.QuadPart;

            m_qpcLastTime = currentTime;
            m_qpcSecondCounter += timeDelta;

            // Clamp excessively large time deltas (e.g. after paused in the debugger).
            if (timeDelta > m_qpcMaxDelta)
            {
                timeDelta = m_qpcMaxDelta;
            }

            // Convert QPC units into a canonical tick format. This cannot overflow due to the previous clamp.
            timeDelta *= TicksPerSecond;
            timeDelta /= m_qpcFrequency.QuadPart;

            uint32 lastFrameCount = m_frameCount;

            if (m_isFixedTimeStep)
            {
                // Fixed timestep update logic

                // If the app is running very close to the target elapsed time (within 1/4 of a millisecond) just clamp
                // the clock to exactly match the target value. This prevents tiny and irrelevant errors
                // from accumulating over time. Without this clamping, a game that requested a 60 fps
                // fixed update, running with vsync enabled on a 59.94 NTSC display, would eventually
                // accumulate enough tiny errors that it would drop a frame. It is better to just round
                // small deviations down to zero to leave things running smoothly.
                if (abs(static_cast<int64>(timeDelta - m_targetElapsedTicks)) < TicksPerSecond / 4000)
                {
                    timeDelta = m_targetElapsedTicks;
                }

                m_leftOverTicks += timeDelta;

                while (m_leftOverTicks >= m_targetElapsedTicks)
                {
                    m_elapsedTicks = m_targetElapsedTicks;
                    m_totalTicks += m_targetElapsedTicks;
                    m_leftOverTicks -= m_targetElapsedTicks;
                    m_frameCount++;

                    update();
                }
            }
            else
            {
                // Variable timestep update logic.
                m_elapsedTicks = timeDelta;
                m_totalTicks += timeDelta;
                m_leftOverTicks = 0;
                m_frameCount++;

                update();
            }

            // Track the current framerate.
            if (m_frameCount != lastFrameCount)
            {
                m_framesThisSecond++;
            }

            if (m_qpcSecondCounter >= static_cast<uint64>(m_qpcFrequency.QuadPart))
            {
                m_framesPerSecond = m_framesThisSecond;
                m_framesThisSecond = 0;
                m_qpcSecondCounter %= m_qpcFrequency.QuadPart;
            }
        }

    private:
        // Source timing data uses QPC units.
        LARGE_INTEGER m_qpcFrequency;
        LARGE_INTEGER m_qpcLastTime;
        uint64 m_qpcMaxDelta;

        // Derived timing data uses a canonical tick format.
        uint64 m_elapsedTicks;
        uint64 m_totalTicks;
        uint64 m_leftOverTicks;

        // Members for tracking the framerate.
        uint32 m_frameCount;
        uint32 m_framesPerSecond;
        uint32 m_framesThisSecond;
        uint64 m_qpcSecondCounter;

        // Members for configuring fixed timestep mode.
        bool m_isFixedTimeStep;
        uint64 m_targetElapsedTicks;
    };
}
Alrighty, got what I wanted with the code below. I was missing the .Tick(####) call.
timer.Tick([&]() {
    double time = timer.GetTotalSeconds();
    if (time >= checkpt) {
        laser_power++;
        checkpt += 2;
    }
});
I just fixed an integer checkpt to increment by 2 each time so that it runs every 2 seconds. There's probably a better way to do it, but it's 3.30am so I'm being lazy for the sake of putting my mind at ease.
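For what it's worth, the StepTimer shown above already has a built-in mechanism for this: fixed timestep mode invokes the update lambda once per target interval. A sketch, assuming a second, dedicated timer instance so the main game loop keeps its own timing:
// In Game.h, alongside the existing timer:
DX::StepTimer laserTimer;

// During initialisation:
laserTimer.SetFixedTimeStep(true);
laserTimer.SetTargetElapsedSeconds(2.0); // fire the lambda every 2 seconds

// Called once per frame (e.g. from Game::Update):
laserTimer.Tick([&]() {
    laser_power++; // runs once for every 2 seconds of elapsed time
});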