Strategy to reduce time of gettimeofday? - c++

I am writing a stats server that counts visit data per day, so I have to clear the data in the db (memcached) every day.
Currently I call gettimeofday frequently to get the date and compare it with the cached date to check whether they fall on the same day.
Sample code is below:
void report_visits(...) {
std::string date = CommonUtil::GetStringDate(); // through gettimeofday
if (date != static_cached_date_) {
flush_db_date();
static_cached_date_ = date;
}
}
The problem is that I have to call gettimeofday every time a client reports visit information, and gettimeofday is time-consuming.
Is there any solution for this problem?

The gettimeofday system call (now obsolete in favor of clock_gettime) is among the shortest system calls to execute. The last time I measured it was on an Intel i486, where it took around 2us. The kernel-internal version is used to timestamp network packets and, on read, write, and chmod system calls, to update the timestamps in the filesystem inodes, and the like. If you want to measure how much time you spend in the gettimeofday system call, you just have to do several (the more, the better) pairs of calls, one immediately after the other, noting the timestamp difference within each pair and finally taking the minimum value of the samples as the proper value. That will be a good approximation to the ideal value.
Think that if the kernel uses it to timestamp each read you do to a file, you can freely use it to timestamp each service request without serious penalty.
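The measurement described above (back-to-back pairs of calls, keeping the minimum difference) could be sketched roughly as follows. The function name min_gettimeofday_gap_us is made up for this illustration, and the sketch assumes a POSIX system that provides <sys/time.h>.

```cpp
#include <sys/time.h>
#include <cstdint>

// Smallest observed gap, in microseconds, between two back-to-back
// gettimeofday() calls over `samples` pairs. With microsecond resolution a
// result of 0 simply means one call costs less than a microsecond.
// Returns -1 if no usable sample was taken (for example samples <= 0).
std::int64_t min_gettimeofday_gap_us(int samples) {
    std::int64_t best = -1;
    for (int i = 0; i < samples; ++i) {
        timeval a, b;
        gettimeofday(&a, nullptr);
        gettimeofday(&b, nullptr);
        std::int64_t gap =
            (static_cast<std::int64_t>(b.tv_sec) - a.tv_sec) * 1000000 +
            (b.tv_usec - a.tv_usec);
        if (gap < 0) continue;  // wall clock stepped backwards; discard the sample
        if (best < 0 || gap < best) best = gap;
    }
    return best;
}
```

Calling min_gettimeofday_gap_us(100000) once at startup and logging the result gives a rough per-call cost for the machine the server runs on.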
Another thing: don't use (as suggested by other responses) a routine to convert the gettimeofday result to a string, as this indeed consumes a lot more resources. Instead, you can compare the timestamps directly (call them t1 and t2):
gettimeofday(&t2, NULL);
if (t2.tv_sec - t1.tv_sec > 86400) { /* 86400 is one day in seconds */
erase_cache();
t1 = t2;
}
or, if you want it to occur every day at the same time:
gettimeofday(&t2, NULL);
if (t2.tv_sec / 86400 > t1.tv_sec / 86400) {
/* tv_sec / 86400 is the number of whole days since 1/1/1970, so
* if it varies, a change of date has occurred */
erase_cache();
}
t1 = t2; /* now, we made it outside, so we tie to the change of date */
You can even use the time() system call for this, as it has one-second resolution (so you don't need to cope with the microseconds or with the overhead of struct timeval).
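A minimal sketch of the time()-based variant, assuming a POSIX-style time_t that counts seconds since 1970. The helper names day_number and day_changed are made up for this example. Note that the day boundary here is midnight UTC, not local midnight.

```cpp
#include <ctime>

// Whole days since 1970-01-01 00:00 UTC (assumes a non-negative time_t
// that counts seconds, as on POSIX systems).
long day_number(std::time_t t) {
    return static_cast<long>(t / 86400);
}

// Returns true when `now` falls on a later UTC day than `last_seen`, and
// always stores `now` as the new reference point.
bool day_changed(std::time_t now, std::time_t& last_seen) {
    bool changed = day_number(now) > day_number(last_seen);
    last_seen = now;
    return changed;
}
```

In the request handler this would be used as: if (day_changed(std::time(nullptr), last_seen)) erase_cache();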

(This is an old question, but there is an important answer missing:)
You need to define the TZ environment variable and export it to your program. If it is not set, you will incur a stat(2) call of /etc/localtime... for every single call to gettimeofday(2), localtime(3), etc.
Of course these will get answered without going to disk, but the frequency of the calls and the overhead of the syscall is enough to make an appreciable difference in some situations.
Supporting documentation:
How to avoid excessive stat(/etc/localtime) calls in strftime() on linux?
https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/
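If exporting TZ from the environment that launches the program is not convenient, one possible alternative is to set it at startup, before the first local-time conversion. This is only a sketch: the helper name ensure_tz_is_set is made up, and the value ":/etc/localtime" is the one commonly suggested for glibc-based Linux systems, so treat the path as an assumption about your system.

```cpp
#include <stdlib.h>
#include <time.h>

// Make sure TZ is set before the first local-time conversion. An existing
// value is left alone (third argument 0 means "do not overwrite").
// ":/etc/localtime" is an assumed default; adjust it for your system.
bool ensure_tz_is_set() {
    if (setenv("TZ", ":/etc/localtime", 0) != 0) return false;
    tzset();  // let the C library pick up the (possibly new) TZ value
    return true;
}
```

Call it once at the top of main(), before any threads that use localtime() or strftime() are started.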

To summarise:
The check, as you say, is done up to a few thousand times per second.
You're flushing a cache once every day.
Assuming that the exact time at which you flush is not critical and can be seconds (or even minutes perhaps) late, there is a very simple/practical solution:
void report_visits(...)
{
static unsigned int counter;
if ((counter++ % 1000) == 0)
{
std::string date = CommonUtil::GetStringDate();
if (date != static_cached_date_)
{
flush_db_date();
static_cached_date_ = date;
}
}
}
Just do the check once every N calls to report_visits(). In the above example N is 1000. With up to a few thousand checks per second, you'll be less than a second (or 0.001% of a day) late.
Don't worry about counter wrap-around: it only happens once in about 20+ days (assuming a few thousand checks/s maximum, with a 32-bit int), and it does not hurt.

Related

Best option to profile CPU use in my program?

I am profiling CPU usage on a simple program I am writing. I have different algorithms I want to try, and I also want to know what's the impact on the total system performance.
Currently, I am using ualarm() to execute some instructions at 30Hz; every 15 of those interruptions (every 0.5s) I record the CPU time with getrusage() (in microseconds), so I have an estimate of the total CPU time consumed up to that point. But to get context, I also need to know the total time elapsed in the system in that period, so I can compute the percentage used by my program.
/* Main Loop */
while(1)
{
alarm = 0;
/* Waiting Loop: */
for(i=0; !alarm; i++){
}
count++;
/* Do my things */
/* Check if it's time to store cpu log: */
if ((count%count_max) == 0)
{
getrusage(RUSAGE_SELF, &ru);
store_cpulog(f,
(int64_t) ru.ru_utime.tv_sec,
(int64_t) ru.ru_utime.tv_usec,
(int64_t) ru.ru_stime.tv_sec,
(int64_t) ru.ru_stime.tv_usec);
}
}
I have different options, but I don't know which one will provide the most exact result:
Use ualarm for the timing. Currently it's programmed to signal every 0.5 seconds, so I can take those 0.5 seconds as the CPU time. Seems quite obvious to use, but is it the best option?
Use clock_gettime(CLOCK_MONOTONIC): it provides readings with a nanosec resolution.
Use gettimeofday(): provides readings with a usec resolution. I've found opinions against using it.
Any recommendation? Thanks.
A possible solution is to use the system's time command and not use a busy loop in your program (as @Hasturkun says). Run in a console:
time /path/to/my/program
and after execution of it you get something like:
real 0m1.465s
user 0m0.000s
sys 0m1.210s
Not sure about precision, if it is enough for you.
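To get similar user/sys versus real figures from inside the program, one option is to pair the asker's getrusage() sample with a clock_gettime(CLOCK_MONOTONIC) sample taken at the same moment. The sketch below assumes a POSIX system; the helper names cpu_seconds, monotonic_seconds and cpu_percent are made up for illustration.

```cpp
#include <sys/resource.h>
#include <time.h>

// CPU seconds (user + system) consumed so far by this process.
double cpu_seconds() {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6 +
           ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
}

// Seconds on the monotonic clock, which is not affected by wall-clock changes.
double monotonic_seconds() {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

// Percentage of one CPU used between two samples; 0 if no time has elapsed.
double cpu_percent(double cpu0, double wall0, double cpu1, double wall1) {
    double wall = wall1 - wall0;
    return wall > 0 ? 100.0 * (cpu1 - cpu0) / wall : 0.0;
}
```

Sampling both values every 0.5 s and passing consecutive pairs to cpu_percent() gives the share of that interval spent in the program. On a multi-core machine a multi-threaded process can exceed 100%.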
Callgrind is possibly the best application for profiling C/C++ code under linux. Use it with pride:)

How can I periodically execute some function if this function takes a long time to run (less than the period)

I want to run a function, for example func(), exactly 1 time per second. However, the running time of func() is about 500 ms. How can I do that? I know that if the running time of the function is low, I can write a while loop that calls func() and then sleep() for 1 second after each execution. But now the running time is high. What should I do to ensure that func() runs exactly 1 time per second? Thanks.
You do:
Take the current time in start_time.
Perform your job
Take the current time in end_time
Wait for (1 second + start_time - end_time)
That way, you can perform your task every second reliably. If the task takes less time, you will wait longer, and vice versa. Note however that this assumes that your task always takes less than 1 sec. to execute. In real code, you want to check for that before the sleep statement.
Implementation details depend on the platform.
Note that using this method still results in a small drift due to the time it takes to compute step 4. A more accurate alternative would be to synchronize on integer multiple of one second. That way, over 1000s of cycles you would not drift.
It depends on the level of accuracy you need.
If you want a brute-force, easy-to-code solution, you can get the time before the first run of the function and save it in a variable (start_time). Create a repeat counter (repeat_number) that stores the next repeat number. Then you can do something like this:
1) next_run_time = ++repeat_number*1sec + start_time;
2) func();
3) wait_time = next_run_time - current_time;
4) sleep(wait_time)
5) goto 1;
This approach disables accumulation of time error on each iteration.
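The numbered steps above could be sketched in standard C++11 as follows. This is only an illustration: run_periodically is a made-up name, and std::chrono::steady_clock with std::this_thread::sleep_until stands in for whatever clock and sleep call your platform offers.

```cpp
#include <chrono>
#include <thread>

// Calls func() `repeats` times, aiming at start + k * period for the k-th
// deadline. Deadlines are computed from the fixed start time, so the timing
// error of one iteration is not carried into the next.
template <typename Func>
void run_periodically(Func func, std::chrono::milliseconds period, int repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (int k = 1; k <= repeats; ++k) {
        func();
        // If func() overran the period, the deadline is already in the past
        // and sleep_until() returns immediately.
        std::this_thread::sleep_until(start + k * period);
    }
}
```

For the case in the question this would be called as run_periodically(func, std::chrono::milliseconds(1000), n).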
But for the real application you should find some event framework or library.

Limit iterations per time unit

Is there a way to limit iterations per time unit? For example, I have a loop like this:
for (int i = 0; i < 100000; i++)
{
// do stuff
}
I want to limit the loop above so there will be maximum of 30 iterations per second.
I would also like the iterations to be evenly positioned in the timeline so not something like 30 iterations in first 0.4s and then wait 0.6s.
Is that possible? It does not have to be completely precise (though the more precise it will be the better).
@FredOverflow My program is running very fast. It is sending data over wifi to another program which is not fast enough to handle them at the current rate. – Richard Knop
Then you should probably have the program you're sending data to send an acknowledgment when it's finished receiving the last chunk of data you sent then send the next chunk. Anything else will just cause you frustrations down the line as circumstances change.
Suppose you have a good Now() function (GetTickCount() is bad example, it's OS specific and has bad precision):
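for (int i = 0; i < 1000; i++) {
    DWORD have_to_sleep_until = GetTickCount() + EXPECTED_ITERATION_TIME_MS;
    // do stuff
    // DWORD arithmetic is unsigned, so convert the difference to a signed
    // value before testing whether any time is left
    LONG remaining = (LONG)(have_to_sleep_until - GetTickCount());
    if (remaining > 0)
        Sleep((DWORD)remaining);
}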
You can check the elapsed time inside the loop, but that may not be the usual solution. Computation time depends entirely on the performance of the machine and the algorithm, so people usually optimize it during development (e.g. many game programmers require at least 25-30 frames per second for properly smooth animation).
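The easiest way (for Windows) is to use QueryPerformanceCounter(). A sketch:
LARGE_INTEGER freq, c1, c2, c3, c4;
QueryPerformanceFrequency(&freq);
const double timeWanted = 1000.0 / 30.0; // milliseconds per iteration at 30 iterations / sec
for (int i = 0; i < 100000; i++) {
    QueryPerformanceCounter(&c1);
    // do stuff
    QueryPerformanceCounter(&c2);
    double timeElapsed = (double)(c2.QuadPart - c1.QuadPart) * 1e3 / (double)freq.QuadPart; // time in milliseconds
    double timeDiff = timeWanted - timeElapsed;
    if (timeDiff > 0) {
        // busy-wait for the rest of this iteration's time slot
        QueryPerformanceCounter(&c3);
        QueryPerformanceCounter(&c4);
        while ((double)(c4.QuadPart - c3.QuadPart) * 1e3 / (double)freq.QuadPart < timeDiff)
            QueryPerformanceCounter(&c4);
    }
}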
EDIT: You must make sure that the 'do stuff' area takes less time than one iteration's time slot, or else the limit has no effect. Also, instead of 1e3 for milliseconds, you can go all the way to nanoseconds with 1e9 (if you want that much accuracy), as long as timeWanted is scaled to the same unit.
WARNING... this will eat your CPU but give you good 'software' timing... Do it in a separate thread (and only if you have more than 1 processor) so that any GUIs won't lock up. You can put a conditional in there to stop the loop if this is a multi-threaded app too.
@FredOverflow My program is running very fast. It is sending data over wifi to another program which is not fast enough to handle them at the current rate. – Richard Knop
What you might need is a buffer or queue on the receiver side. The thread that receives the messages from the client (e.g. through a socket) gets each message and puts it in the queue. The actual consumer of the messages reads/pops from the queue. Of course you need concurrency control for your queue.
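A minimal sketch of such a queue with concurrency control, using a mutex and two condition variables from the C++11 standard library. The class name BoundedQueue is made up for this example, and making the queue bounded (so that the receiving thread blocks when the consumer falls behind) is an assumption on top of the description above.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical bounded queue: push() blocks while the queue is full,
// pop() blocks while it is empty.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    void push(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(value));
        not_empty_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return value;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    std::size_t capacity_;
    std::mutex mutex_;
    std::condition_variable not_full_, not_empty_;
    std::queue<T> items_;
};
```

The socket-reading thread would call push() for each received message, and the consumer thread would call pop() in a loop.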
Besides the flow control methods mentioned, you may also need to maintain an accurate, specific sending rate on the sender side. Usually it can be done like this.
E.g. if you want to send at 10Mbps, create a timer with a 1ms interval so that it calls a predefined function every 1ms. Then, in the timer handler function, by keeping track of 2 static variables, 1) the time elapsed since you began sending data and 2) how much data in bytes has been sent up to the last call, you can easily calculate how much data needs to be sent in the current call (or just sleep and wait for the next call).
This way, you can "stream" data in a very stable way with very little jitter, and this is usually how video streaming is done. Of course it also depends on how accurate the timer is.
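The calculation done in the timer handler could look roughly like this. The function name bytes_due and its parameters are made up for the illustration; it only computes how many bytes are owed, and leaves the actual timer and send call to the surrounding program.

```cpp
#include <cstdint>

// Bytes that may be sent now so that the running total matches
// `bits_per_second` after `elapsed_us` microseconds, given that
// `already_sent` bytes have gone out so far. Returns 0 when the sender is
// ahead of schedule. The multiplication can overflow for very long runs
// (months at 10 Mbps), which a real implementation would need to handle.
std::uint64_t bytes_due(std::uint64_t elapsed_us,
                        std::uint64_t bits_per_second,
                        std::uint64_t already_sent) {
    std::uint64_t bytes_per_second = bits_per_second / 8;
    std::uint64_t target = elapsed_us * bytes_per_second / 1000000;  // bytes that should be out by now
    return target > already_sent ? target - already_sent : 0;
}
```

At 10 Mbps with a 1 ms timer, each call owes 1250 bytes as long as the sender keeps up. If a timer tick is late, the next call owes correspondingly more, which is what keeps the average rate stable.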

How can I set tens of thousands of tasks to each trigger at a different defined time?

I'm constructing a data visualisation system that visualises over 100,000 data points (visits to a website) across a time period. The time period (say 1 week) is then converted into simulation time (1 week = 2 minutes in simulation), and a task is performed on each and every piece of data at the specific time it happens in simulation time (the time each visit occurred during the week in real time). With me? =p
In other programming languages (eg. Java) I would simply set a timer for each datapoint. After each timer is complete it triggers a callback that allows me to display that datapoint in my app. I'm new to C++ and unfortunately it seems that timers with callbacks aren't built-in. Another method I would have done in ActionScript, for example, would be using custom events that are triggered after a specific timeframe. But then again I don't think C++ has support for custom events either.
In a nutshell: say I have 1000 pieces of data that span across a 60 second period. Each piece of data has its own time in relation to that 60 second period. For example, one needs to trigger something at 1 second, another at 5 seconds, etc.
Am I going about this the right way, or is there a much easier way to do this?
Ps. I'm using Mac OS X, not Windows
I would not use timers to do that. Sounds like you have too many events and they may lie too close to each other. Performance and accuracy may be bad with timers.
A simulation is normally done like this:
You simply loop (or iterate), and on every loop you add either a measured (for real time) or a constant (non real time) amount to your simulation time.
Then you manually check all your events and execute those that are due.
In your case it would help to have them sorted by execution time, so you would not have to loop through them all on every iteration.
Time measuring can be done with the gettimer() C function for low accuracy, or there are better functions for higher accuracy, e.g. QueryPerformanceCounter() on Windows - I don't know the equivalent for Mac.
Just make a "timer" mechanism yourself, that's the best, fastest and most flexible way.
-> make an array of events (linked to each object event happens to) (std::vector in c++/STL)
-> sort the array on time (std::sort in c++/STL)
-> then just loop on the array and trigger the object action/method upon time inside a range.
Roughly that gives in C++:
// action upon data + data itself
class Object{
public:
    Object(Data d) : data(d) {}
    void Action() { display(data); }
    Data data;
};
// event time + object upon which the event acts
class Event{
public:
    Event(double t, Object o) : time(t), object(o) {}
    // useful for std::sort
    bool operator<(const Event& e) const { return time < e.time; }
    double time;
    Object object;
};
//init
std::vector<Event> myEvents;
myEvents.push_back(Event(1.0, Object(data0)));
//...
myEvents.push_back(Event(54.0, Object(data10000)));
// could be removed if push_back() is guaranteed to be in the correct order
std::sort(myEvents.begin(), myEvents.end());
// the way you handle time... period is for some fuzziness/animation ?
const double period = 0.5;
const double endTime = 60;
std::vector<Event>::iterator itLastFirstEvent = myEvents.begin();
for (double currtime = 0.0; currtime < endTime; currtime+=0.1)
{
    for (std::vector<Event>::iterator itEvent = itLastFirstEvent ; itEvent != myEvents.end();++itEvent)
    {
        if (itEvent->time < currtime - period)
            itLastFirstEvent = itEvent; // already past: so that next loop start is optimised
        else if (itEvent->time < currtime + period)
            itEvent->object.Action(); // inside the window: action speaks louder than words
        else
            break; // as it's sorted, won't be any more tick this loop
    }
}
ps: About custom events, you might want to read/search about delegates in c++ and function/method pointers.
If you are using native C++, you should look at the Timers section of the Windows API on the MSDN website. They should tell you exactly what you need to know.

Getting the current time (in milliseconds) from the system clock in Windows?

How can you obtain the system clock's current time of day (in milliseconds) in C++? This is a windows specific app.
The easiest (and most direct) way is to call GetSystemTimeAsFileTime(), which returns a FILETIME, a struct which stores the 64-bit number of 100-nanosecond intervals since midnight Jan 1, 1601.
At least at the time of Windows NT 3.1, 3.51, and 4.01, the GetSystemTimeAsFileTime() API was the fastest user-mode API able to retrieve the current time. It also offers the advantage (compared with GetSystemTime() -> SystemTimeToFileTime()) of being a single API call, that under normal circumstances cannot fail.
To convert a FILETIME ft_now; to a 64-bit integer named ll_now, use the following:
ll_now = (LONGLONG)ft_now.dwLowDateTime + ((LONGLONG)(ft_now.dwHighDateTime) << 32LL);
You can then divide by the number of 100-nanosecond intervals in a millisecond (10,000 of those) and you have milliseconds since the Win32 epoch.
To convert to the Unix epoch, subtract 116444736000000000LL (while the value is still in 100-nanosecond units, i.e. before dividing) to reach Jan 1, 1970.
You mentioned a desire to find the number of milliseconds into the current day. Because the Win32 epoch begins at a midnight, the number of milliseconds passed so far today can be calculated from the filetime with a modulus operation. Specifically, because there are 24 hours/day * 60 minutes/hour * 60 seconds/minute * 1000 milliseconds/second = 86,400,000 milliseconds/day, you could use the system time in milliseconds modulo 86400000LL.
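The arithmetic above can be sketched in portable C++, with the two 32-bit halves of the FILETIME passed in as plain integers so the example does not depend on <windows.h>. The function and constant names are made up for illustration.

```cpp
#include <cstdint>

// 100-nanosecond ticks between 1601-01-01 and 1970-01-01 (the constant quoted above).
const std::int64_t kEpochDiffTicks = 116444736000000000LL;
const std::int64_t kTicksPerMs = 10000;     // 100 ns intervals in one millisecond
const std::int64_t kMsPerDay = 86400000LL;  // 24 * 60 * 60 * 1000

// Combine dwLowDateTime and dwHighDateTime into one 64-bit tick count.
std::int64_t filetime_ticks(std::uint32_t low, std::uint32_t high) {
    return static_cast<std::int64_t>((static_cast<std::uint64_t>(high) << 32) | low);
}

// Milliseconds since the Unix epoch: subtract first, then divide.
std::int64_t ticks_to_unix_ms(std::int64_t ticks) {
    return (ticks - kEpochDiffTicks) / kTicksPerMs;
}

// Milliseconds since the most recent midnight UTC.
std::int64_t ticks_to_ms_of_day(std::int64_t ticks) {
    return (ticks / kTicksPerMs) % kMsPerDay;
}
```

On Windows the inputs would come from ft_now.dwLowDateTime and ft_now.dwHighDateTime after calling GetSystemTimeAsFileTime(&ft_now).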
For a different application, one might not want to use the modulus. Especially if one is calculating elapsed times, one might have difficulties due to wrap-around at midnight. These difficulties are solvable; the best example I am aware of is Linus Torvalds's line in the Linux kernel which handles counter wrap-around.
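A sketch of wrap-around-safe comparison for a 32-bit counter, written in the spirit of the kernel's time_after() macro rather than copied from it. The names tick_after and ticks_elapsed are made up, and the signed conversion assumes a two's complement platform (which C++20 guarantees and earlier compilers provide in practice).

```cpp
#include <cstdint>

// True if tick value `a` is later than `b`, assuming the two are less than
// half the counter range (2^31 ticks) apart. The unsigned subtraction wraps,
// and reinterpreting the result as signed gives the direction.
bool tick_after(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::int32_t>(a - b) > 0;
}

// Elapsed ticks from `start` to `now`; unsigned arithmetic absorbs one wrap.
std::uint32_t ticks_elapsed(std::uint32_t start, std::uint32_t now) {
    return now - start;
}
```

The key point is to subtract first and compare the difference, instead of comparing the two raw counter values.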
Keep in mind that the system time is returned as a UTC time (both in the case of GetSystemTimeAsFileTime() and simply GetSystemTime()). If you require the local time as configured by the Administrator, then you could use GetLocalTime().
To get the time expressed as UTC, use GetSystemTime in the Win32 API.
SYSTEMTIME st;
GetSystemTime(&st);
SYSTEMTIME is documented as having these relevant members:
WORD wYear;
WORD wMonth;
WORD wDayOfWeek;
WORD wDay;
WORD wHour;
WORD wMinute;
WORD wSecond;
WORD wMilliseconds;
As shf301 helpfully points out below, GetLocalTime (with the same prototype) will yield a time corrected to the user's current timezone.
You have a few good answers here, depending on what you're after. If you're looking for just time of day, my answer is the best approach -- if you need solid dates for arithmetic, consider Alex's. There's a lot of ways to skin the time cat on Windows, and some of them are more accurate than others (and nobody has mentioned QueryPerformanceCounter yet).
A cut-to-the-chase example of Jed's answer above:
std::string currentDateTime() {
SYSTEMTIME st;
GetSystemTime(&st);
char currentTime[84] = "";
sprintf(currentTime,"%d/%d/%d %d:%d:%d %d",st.wDay,st.wMonth,st.wYear, st.wHour, st.wMinute, st.wSecond , st.wMilliseconds);
return std::string(currentTime); }
Use GetSystemTime, first; then, if you need that, you can call SystemTimeToFileTime on the SYSTEMTIME structure that the former fills for you. A FILETIME is a 64-bit count of 100-nanosecs intervals since an epoch, and so more suitable for arithmetic; a SYSTEMTIME is a structure with all the expected fields (year, month, day, hour, etc, down to milliseconds). If you want to know "how many milliseconds have elapsed since midnight", for example, subtracting two FILETIME structures (one for the current time, one obtained by converting the same SYSTEMTIME after zeroing out the appropriate fields) and dividing by the appropriate power of ten is probably the simplest available approach.
Depending on the needs of your application there are six common options. This Dr Dobbs Journal article will give you all the information (and more) you need on choosing the best one.
In your specific case, from this article:
GetSystemTime() retrieves the current system time and instantiates a SYSTEMTIME structure, which is composed of a number of separate fields including year, month, day, hours, minutes, seconds, and milliseconds.
Here is some code that works in Windows, which I've used in an Open Watcom C project. It should work in C++. It returns seconds (not milliseconds) using _dos_gettime or gettime.
double seconds(void)
{
#ifdef __WATCOMC__
struct dostime_t t;
_dos_gettime(&t);
return ((double)t.hour * 3600 + (double)t.minute * 60 + (double)t.second + (double)t.hsecond * 0.01);
#else
struct time t;
gettime(&t);
return ((double)t.ti_hour * 3600 + (double)t.ti_min * 60 + (double)t.ti_sec + (double)t.ti_hund * 0.01);
#endif
}
While it's not what the question asks, it's worth considering why you want this info.
If all you want to do is keep track of how long something takes to calculate or the time passed since the last user interaction, consider using the uptime (milliseconds since boot), which is much simpler to get: GetTickCount() or GetTickCount64(). This is all I wanted to do, but I went down the epoch rabbit hole first because that's how you do it under Unix.
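For elapsed-time measurements like these, a portable alternative to GetTickCount64() is std::chrono::steady_clock from C++11. This swaps in a different API from the one named above, and the Stopwatch class is a made-up helper for illustration.

```cpp
#include <chrono>

// Hypothetical helper: milliseconds elapsed since construction or the last reset().
class Stopwatch {
public:
    Stopwatch() : start_(std::chrono::steady_clock::now()) {}

    void reset() { start_ = std::chrono::steady_clock::now(); }

    long long elapsed_ms() const {
        return std::chrono::duration_cast<std::chrono::milliseconds>(
                   std::chrono::steady_clock::now() - start_).count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};
```

Like the tick count, steady_clock is monotonic, so the result is not affected when the user or a time sync changes the system clock.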