Implementation of a "hits in last [second/minute/hour]" data structure - c++

I think this is a fairly common question, but I can't seem to find an answer by googling around (maybe there's a more precise name for the problem that I don't know?).
You need to implement a structure with a "hit()" method used to report a hit, plus hitsInLastSecond|Minute|Hour methods. You have a timer with, say, nanosecond accuracy. How do you implement this efficiently?
My thought was something like this (in pseudo-C++):
class HitCounter {
void hit() {
hits_at[now()] = ++last_count;
}
int hitsInLastSecond() {
auto before_count = hits_at.lower_bound(now() - 1 * second);
if (before_count == hits_at.end()) { return last_count; }
return last_count - before_count->second;
}
// etc for Minute, Hour
map<time_point, int> hits_at;
int last_count = 0;
};
Does this work? Is it good? Is something better?
Update: Added pruning and switched to a deque as per comments:
class HitCounter {
void hit() {
hits.push_back(make_pair(now(), ++last_count));
}
int hitsInLastSecond() {
auto before = lower_bound(hits.begin(), hits.end(), make_pair(now() - 1 * second, -1));
if (before == hits.end()) { return last_count; }
return last_count - before->second;
}
// etc for Minute, Hour
void prune() {
auto old = upper_bound(hits.begin(), hits.end(), make_pair(now() - 1 * hour, -1));
if (old != hits.end()) {
hits.erase(hits.begin(), old);
}
}
}
deque<pair<time_point, int>> hits;
int last_count = 0;
};

What you are describing is called a histogram.
Using a hash (or map) keyed at nanosecond accuracy will eat up much of your CPU. You probably want a ring buffer for storing the data.
Use std::chrono to achieve the timing precision you require, but frankly hits per second seems like the finest granularity you need, and if you are looking at the overall big picture, the exact precision doesn't seem like it will matter terribly.
This is a partial, introductory sample of how you might go about it:
#include <array>
#include <algorithm>
template<size_t RingSize>
class Histogram
{
std::array<size_t, RingSize> m_ringBuffer;
size_t m_total;
size_t m_position;
public:
Histogram() : m_total(0), m_position(0)
{
std::fill_n(m_ringBuffer.begin(), RingSize, 0);
}
void addHit()
{
++m_ringBuffer[m_position];
++m_total;
}
void incrementPosition()
{
if (++m_position >= RingSize)
m_position = 0;
m_total -= m_ringBuffer[m_position];
m_ringBuffer[m_position] = 0;
}
double runningAverage() const
{
return (double)m_total / (double)RingSize;
}
size_t runningTotal() const { return m_total; }
};
Histogram<60> secondsHisto;
Histogram<60> minutesHisto;
Histogram<24> hoursHisto;
Histogram<7> weeksHisto;
This is a naive implementation which assumes you will call incrementPosition() every second, and will transpose runningTotal from one histogram to the next every RingSize ticks (so every 60 s, add secondsHisto's runningTotal to minutesHisto).
Hopefully it will be a useful introductory place to start from.
If you want to track a longer histogram of hits per second, you can do that with this model by increasing the ring size and adding a second total that tracks the last N ring-buffer entries, so that m_subTotal = sum(m_ringBuffer[m_position - N .. m_position]), similar to the way m_total works.
size_t m_10sTotal = 0; // running total of the last 10 entries
...
void addHit()
{
++m_ringBuffer[m_position];
++m_total;
++m_10sTotal;
}
void incrementPosition()
{
// subtract the data from >10 sample intervals ago
m_10sTotal -= m_ringBuffer[(m_position + RingSize - 10) % RingSize];
// for the naive total, do the subtraction after we
// advance the position, since it will then coincide with the
// location of the value from RingSize intervals ago
if (++m_position >= RingSize)
m_position = 0;
m_total -= m_ringBuffer[m_position];
m_ringBuffer[m_position] = 0;
}
You don't have to make the histograms these sizes; this is simply a naive scraping model. There are various alternatives, such as incrementing each histogram at the same time:
secondsHisto.addHit();
minutesHisto.addHit();
hoursHisto.addHit();
weeksHisto.addHit();
Each rolls over independently, so all have current values. Size each histogram as far back as you want the data at that granularity to go.

Related

How to make steps of 10 points without saving a value outside?

Actually, what I need is this: I have progress bar logic that tracks a per-second count of files, but I don't need to update on every trigger (1%, 3%, 7%, ...). I would like to define some kind of step_progress variable (for example 10) and have the progress bar update in steps like 10%, 20%, 30%, ...
So, for this case I wrote the following logic:
...
->set_progress_bar_callback([&](int count, int copied_file) {
const int PROGRESS_UPDATE_STEP = 10;
const int MAX_PERCENTAGE = 100;
float progress_step = (float)MAX_PERCENTAGE / (float)count;
float progress_result = progress_step * copied_file;
if ((int)progress_result % PROGRESS_UPDATE_STEP == 0)
{
if (progress_callback != nullptr)
{
printf("HERE!!! execute_copy_process Progress count %f\n", progress_result);
progress_callback(progress_result);
}
}
});
and I need to mention that I don't want to define an outside variable (like a class member); I would like all of these values to live inside the lambda.
So, as you can see, count is the full amount (for example 124), and copied_file is the progress (for example 34).
Thus I calculate progress_result and take the modulo (%) to detect a step of 10, but the issue is that progress_result does not always come out to a round number, so I end up with updates like 10%, 40%, 60%, ...
Is there some kind of trick to do it without saving outside values?
You should stay in integer arithmetic.
[&](int count, int copied_file) {
const int PROGRESS_UPDATE_STEP = 10;
const int MAX_PERCENTAGE = 100;
// number of files per 10% step; clamp to 1 so small totals still report
int step = std::max(count * PROGRESS_UPDATE_STEP / MAX_PERCENTAGE, 1);
if ((copied_file % step) == 0)
{
if (progress_callback != nullptr)
{
float progress_result = static_cast<float>(copied_file) / count;
printf("HERE!!! execute_copy_process Progress count %f\n", progress_result);
progress_callback(progress_result);
}
}
}
(The parameter names are kept in the original callback's order: count is the total, copied_file is the current progress.)

Efficient way to retrieve count of number of times a flag set since last n seconds

I need to track how many times a flag is enabled in the last n seconds. Below is the example code I came up with. StateHandler maintains the value of the flag in the active array for the last n (360 here) seconds. In my case the update function is called from outside every second. So when I need to know how many times it was set in the last 360 seconds, I call getEnabledInLast360Seconds. Is it possible to do this more efficiently, e.g. without using an array of n booleans?
#include <map>
#include <iostream>
class StateHandler
{
bool active[360];
int index;
public:
StateHandler() :
active(),
index(0)
{
}
void update(bool value)
{
if (index >= 360)
{
index = 0;
}
active[index % 360] = value;
index++;
}
int getEnabledInLast360Seconds()
{
int value = 0;
for (int i = 0; i < 360; i++)
{
if (active[i])
{
value++;
}
}
return value;
}
};
int main()
{
StateHandler handler;
handler.update(true);
handler.update(true);
handler.update(true);
std::cout << handler.getEnabledInLast360Seconds();
}
Yes. Use the fact that numberOfOccurrences(0,360) and numberOfOccurrences(1,361) have 359 common terms. So remember the sum (an int count member, initially 0), calculate the common term, and calculate the new sum.
void update(bool value)
{
if (index >= 360)
{
index = 0;
}
// invariant: count reflects t-360...t-1
if (active[index]) count--;
// invariant: count reflects t-359...t-1
active[index] = value;
if (value) count++;
// invariant: count reflects t-359...t
index++;
}
(Note that the if block resetting index removes the need for the modulo operator %, so I removed it.)
Another approach would be to use subset sums:
subsum[0] = count(0...19)
subsum[1] = count(20...39)
...
subsum[17] = count(340...359)
Now you only have to add 18 numbers each time, and you can entirely replace a subsum every 20 seconds.
Instead of a fixed buffer, you can simply use a std::set<timestamp> (or perhaps a std::queue). Every time you check, pop off the elements older than 360 s and count the remaining ones.
If you check rarely but update often, you might want to add the pruning to the update itself, to prevent the set from growing too big.

Recursion Depth Cut Off Strategy: Parallel QuickSort

I have a parallel quicksort algorithm implemented. To avoid the overhead of excess parallel threads I had a cut-off strategy to turn the parallel algorithm into a sequential one when the vector size was smaller than a particular threshold. However, now I am trying to set the cut-off strategy based on recursion depth, i.e. I want my algorithm to turn sequential when a certain recursion depth is reached. I employed the following code, but it doesn't work. I'm not sure how to proceed. Any ideas?
template <class T>
void ParallelSort::sortHelper(typename vector<T>::iterator start, typename vector<T>::iterator end, int level = 0) // THIS IS THE QUICKSORT INTERFACE
{
static int depth =0;
const int insertThreshold = 20;
const int threshold = 1000;
if(start<end)
{
if(end-start < insertThreshold) // threshold for insertion sort
{
insertSort<T>(start, end);
}
else if((end-start) >= insertThreshold && depth<threshold) // threshold for non-parallel quicksort
{
int part = partition<T>(start,end);
depth++;
sortHelper<T>(start, start + (part - 1), level+1);
depth--;
depth++;
sortHelper<T>(start + (part + 1), end, level+1);
depth--;
}
else
{
int part = partition<T>(start,end);
#pragma omp task
{
depth++;
sortHelper<T>(start, start + (part - 1), level+1);
depth--;
}
depth++;
sortHelper<T>(start + (part + 1), end, level+1);
depth--;
}
}
}
I tried the static variable depth and also the non-static variable level, but neither works.
NOTE: The above snippet only depends on depth; level is included to show both of the methods I tried.
A static depth written to from two threads is a data race, which makes your code exhibit undefined behavior: there is no guarantee what those unsynchronized writes do.
As it happens, you are already passing down level, which is your recursion depth. At each level you double the number of tasks, so a limit on level equal to 6 (say) corresponds to at most 2^6 tasks. Your code is only half parallel, because the partition step runs in the spawning thread, so you will probably have fewer than the theoretical maximum number of threads going at once.
template <class T>
void ParallelSort::sortHelper(typename vector<T>::iterator start, typename vector<T>::iterator end, int level = 0) // THIS IS THE QUICKSORT INTERFACE
{
const int insertThreshold = 20;
const int treeDepth = 6; // at most 2^6 = 64 tasks
if(start<end)
{
if(end-start < insertThreshold) // threshold for insertion sort
{
insertSort<T>(start, end);
}
else if(level>=treeDepth) // only 2^treeDepth threads, after which we run in sequence
{
int part = partition<T>(start,end);
sortHelper<T>(start, start + (part - 1), level+1);
sortHelper<T>(start + (part + 1), end, level+1);
}
else // launch two tasks, creating an exponential number of threads:
{
int part = partition<T>(start,end);
#pragma omp task
{
sortHelper<T>(start, start + (part - 1), level+1);
}
sortHelper<T>(start + (part + 1), end, level+1);
}
}
}
Alright, I figured it out. It was a silly mistake on my part.
The algorithm should fall back to the sequential code when the recursion depth is greater than some threshold, not smaller. Doing so solves the problem and gives me a speedup.

Speedier std::insert, or, How to optimize a call that Instruments says is slow?

I'm attempting to use Xcode Instruments to find ways to speed up my app enough to run well on legacy devices. Most of the time is spent in an audio callback, specifically:
void Analyzer::mergeWithOld(tones_t& tones) const {
tones.sort();
tones_t::iterator it = tones.begin();
// Iterate over old tones
for (tones_t::const_iterator oldit = m_tones.begin(); oldit != m_tones.end(); ++oldit) {
// Try to find a matching new tone
while (it != tones.end() && *it < *oldit) ++it;
// If match found
if (it != tones.end() && *it == *oldit) {
// Merge the old tone into the new tone
it->age = oldit->age + 1;
it->stabledb = 0.8 * oldit->stabledb + 0.2 * it->db;
it->freq = 0.5 * oldit->freq + 0.5 * it->freq;
} else if (oldit->db > -80.0) {
// Insert a decayed version of the old tone into new tones
Tone& t = *tones.insert(it, *oldit);
t.db -= 5.0;
t.stabledb -= 0.1;
}
}
}
I feel a bit like a dog who finally catches a squirrel and then realizes he has no idea what to do next. Can I speed this up, and if so, how do I go about doing it?
EDIT: Of course. tones_t is
typedef std::list<Tone> tones_t;
And Tone is a struct:
struct Tone {
static const std::size_t MAXHARM = 48; ///< The maximum number of harmonics tracked
static const std::size_t MINAGE = TONE_AGE_REQUIRED; // The minimum age required for a tone to be output
double freq; ///< Frequency (Hz)
double db; ///< Level (dB)
double stabledb; ///< Stable level, useful for graphics rendering
double harmonics[MAXHARM]; ///< Harmonics' levels
std::size_t age; ///< How many times the tone has been detected in row
double highestFreq;
double lowestFreq;
int function;
float timeStamp;
Tone();
void print() const; ///< Prints Tone to std::cout
bool operator==(double f) const; ///< Compare for rough frequency match
/// Less-than compare by levels (instead of frequencies like operator< does)
static bool dbCompare(Tone const& l, Tone const& r) {
return l.db < r.db;
}
};
Optimization is a complex thing. You may need to try several approaches.
1: Merge m_tones and tones into a new container, then assign that container back to m_tones (if you use a vector, reserve its capacity beforehand).
This adds two copies into the mix but eliminates all the mid-list inserts. You would have to test to see if it's faster.
2: Dump the list for a different structure. Can you store m_tones as a std::set instead of a list? A std::set keeps its elements ordered as you insert, so the manual sorted-merge logic goes away; alternatively, with an unordered container you would call std::sort only when you actually need ordered iteration, which may win if that is infrequent.
These are just ideas for how to think about the problem differently; you will have to test, test, test to see which option has the best performance.

Get top 5 algorithm from a container?

I have a class (object), User. This user has 2 private attributes, "name" and "popularity". I store the objects in a vector (container).
From the container, I need to find the top 5 most popular users. How do I do that? (I have some ugly code, which I will post here; if you have a better approach, please let me know. Feel free to use another container if you think vector is a bad choice, but please use only map or multimap, list, vector, or array, because those are the only ones I know how to use.) My current code is:
int top5 = 0, top4 = 0, top3 = 0, top2 = 0, top1 = 0;
vector<User>::iterator it;
for (it = user.begin(); it != user.end(); ++it)
{
if( it->getPopularity() > top5){
if(it->getPopularity() > top4){
if(it->getPopularity() > top3){
if(it->getPopularity() > top2){
if(it->getPopularity() > top1){
top1 = it->getPopularity();
continue;
} else {
top2 = it->getPopularity();
continue;
}
} else {
top3 = it->getPopularity();
continue;
}
}
} else {
top4 = it->getPopularity();
continue;
}
} else {
top5 = it->getPopularity();
continue;
}
}
I know the code is ugly and might be prone to error; if you have better code, please do share it with us (us == C++ newbies). Thanks!
You can use the std::partial_sort algorithm to sort your vector so that the first five elements are sorted and the rest remain unsorted. Something like this (untested code; the vector is taken by value so the caller's copy is untouched):
bool compareByPopularity( User a, User b ) {
return a.getPopularity() > b.getPopularity();
}
vector<User> getMostPopularUsers( vector<User> users, size_t num ) {
num = min( num, users.size() );
partial_sort( users.begin(), users.begin() + num, users.end(),
compareByPopularity );
return vector<User>( users.begin(), users.begin() + num );
}
Why don't you sort the vector (std::sort, or your own implementation of quicksort) based on popularity and take the first 5 values?
Example:
bool UserCompare(User a, User b) { return a.getPopularity() > b.getPopularity(); }
...
std::sort(user.begin(), user.end(), UserCompare);
// Print first 5 users
If you just want the top 5 popular users, then use std::partial_sort().
class User
{
private:
string name_m;
int popularity_m;
public:
User(const string& name, int popularity) : name_m(name), popularity_m(popularity) { }
friend ostream& operator<<(ostream& os, const User& user)
{
return os << "name:" << user.name_m << "|popularity:" << user.popularity_m << "\n";
}
int Popularity() const
{
return popularity_m;
}
};
bool Compare(const User& lhs, const User& rhs)
{
return lhs.Popularity() > rhs.Popularity();
}
int main()
{
vector<User> users; // ... fill with at least 5 users before sorting ...
// c++0x. ignore if you don't want it.
auto compare = [](const User& lhs, const User& rhs) -> bool
{ return lhs.Popularity() > rhs.Popularity(); };
partial_sort(users.begin(), users.begin() + 5, users.end(), Compare);
copy(users.begin(), users.begin() + 5, ostream_iterator<User>(std::cout, "\n"));
}
First off, cache that it->getPopularity() so you don't have to keep repeating it.
Secondly (and this is much more important): Your algorithm is flawed. When you find a new top1 you have to push the old top1 down to the #2 slot before you save the new top1, but before you do that you have to push the old top2 down to the #3 slot, etc. And that is just for a new top1. You are going to have to do something similar for a new top2, a new top3, etc. The only one you can paste in without worrying about pushing things down the list is when you get a new top5. The correct algorithm is hairy. That said, the correct algorithm is much easier to implement when your topN is an array rather than a bunch of separate values.
Thirdly (and this is even more important than the second point): You shouldn't care about performance, at least not initially. The easy way to do this is to sort the entire list and pluck off the first five off the top. If this suboptimal but simple algorithm doesn't affect your performance, done. Don't bother with the ugly but fast first N algorithm unless performance mandates that you toss the simple solution out the window.
Finally (and this is the most important point of all): That fast first N algorithm is only fast when the number of elements in the list is much, much larger than five. The default sort algorithm is pretty dang fast. It has to be wasting a lot of time sorting the dozens / hundreds of items you don't care about before a pushdown first N algorithm becomes advantageous. In other words, that pushdown insertion sort algorithm may well be a case of premature disoptimization.
Sort your objects, maybe with the standard library if that is allowed, and then simply select the first 5 elements. If your container gets too big, you could probably use a std::list for the job.
Edit: @itsik you beat me to it by a sec :)
Do this pseudocode.
Declare top5 as an array of int[5] // or use a min-heap
Initialize top5 with five -INF values
For each element A
if A > top5[4] // or A > root of the min-heap
Remove top5[4] from top5 // or pop the min element from the heap
Insert A into top5, keeping it sorted // or push A onto the heap
Well, I advise you to improve your code by using an array (or list, or vector) to store the top five, like this:
struct TopRecord
{
int index;
int pop;
} Top5[5];
for(int i = 0; i<5; i++)
{
Top5[i].index = -1;
// Set pop to a value low enough
Top5[i].pop = -1;
}
for(int i = 0; i < (int)users.size(); i++)
{
int currentpop = users[i].getPopularity();
int currentindex = i;
// Walk the whole array: once the current record out-ranks a slot it swaps
// in, and the displaced record keeps shifting down the remaining slots.
for(int j = 0; j < 5; j++)
{
if(Top5[j].pop < currentpop)
{
int temp = Top5[j].pop;
Top5[j].pop = currentpop;
currentpop = temp;
temp = Top5[j].index;
Top5[j].index = currentindex;
currentindex = temp;
}
}
}
You may also consider using randomized select (e.g. std::nth_element) if your aim is performance, since randomized select solves order statistics in expected linear time. Or use the partial_sort solution provided above; either way works, depending on your aim.