Convert historical time to GMT - C++

I need to convert some string times in the format "2011061411322100" into GMT; my first attempt is below. The problem is that the times come from another PC and are historical, so I am not receiving them in real time and cannot simply derive GMT from the local time on the box my code runs on.
If my code is running across a time change, the change will have occurred on my box but not on the remote box the times come from. I can, however, query the remote box for its current time at any point.
So, to give more detail:
1. I start the job on a remote box
2. The job completes
3. I get some times related to the job running
4. I convert the times to GMT
If a time change (daylight saving) occurs between 1 and 2, I am screwed: my GMT conversion will break. I guess after 2 I need to get the current remote box time, check whether it differs from my local time by more than 58 minutes, and then apply that difference to the conversion. But I cannot figure out a reliable method of doing this.
string GMTConverter::strToGMT(const string& timeToConvert)
{
    // Set time zone from TZ environment variable.
    _tzset();

    // Input layout: 2011 06 14 11 32 21 00
    // (strToInt is just a wrapper for atoi)
    int year  = strToInt(timeToConvert.substr(0, 4));
    int month = strToInt(timeToConvert.substr(4, 2));
    int day   = strToInt(timeToConvert.substr(6, 2));
    int hour  = strToInt(timeToConvert.substr(8, 2));
    int min   = strToInt(timeToConvert.substr(10, 2));
    int sec   = strToInt(timeToConvert.substr(12, 2));
    cout << "Time after parsing: " << year << "/" << month << "/" << day
         << " " << hour << ":" << min << ":" << sec << endl;

    // Fill the tm struct; zero-initialize first so tm_isdst and the
    // unused fields are not left indeterminate.
    struct tm tmTime = {};
    tmTime.tm_hour  = hour;
    tmTime.tm_min   = min;
    tmTime.tm_sec   = sec;
    tmTime.tm_mday  = day;
    tmTime.tm_mon   = month - 1;
    tmTime.tm_year  = year - 1900;
    tmTime.tm_isdst = -1; // let mktime determine whether DST applies
    cout << "Time in TM: " << tmTime.tm_year << "/" << tmTime.tm_mon << "/"
         << tmTime.tm_mday << " " << tmTime.tm_hour << ":" << tmTime.tm_min
         << ":" << tmTime.tm_sec << endl;

    // For logging
    char currDateTime[64];
    strftime(currDateTime, sizeof(currDateTime), "%c", &tmTime);
    cout << "Actual time: " << currDateTime << endl;

    time_t remotePCTime = mktime(&tmTime);
    struct tm* gmt = gmtime(&remotePCTime);
    cout << "gmt = " << asctime(gmt) << endl;

    char datebuf_2[12];
    char timebuf_2[13];
    strftime(datebuf_2, sizeof(datebuf_2), "%Y-%m-%d", gmt);
    strftime(timebuf_2, sizeof(timebuf_2), "%H:%M:%S", gmt);
    return string(datebuf_2) + "T" + string(timebuf_2) + ".000";
}

The obvious reliable solution would be to use UTC (which has no daylight saving) for the timestamp you're sending over. Using any time system with inherent ambiguity (there is one hour of overlap each year during which you can get the same timestamp at two different times) makes a fool-proof method impossible, since information is lost.
If you have no control over the time format the remote machine sends, you can only extrapolate from the information you do have: for instance, if the end time is lower than the start time, add one hour. This again introduces ambiguity if the job took longer than one hour, but at least time won't move backwards.
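A minimal sketch of that extrapolation rule (the function name and the fixed 3600-second adjustment are mine, assuming a one-hour DST shift):

#include <ctime>

time_t adjustEnd(time_t start, time_t end)
{
    // If the job appears to end before it started, assume a DST
    // fall-back happened mid-job and add the hour back.
    if (end < start)
        end += 3600;
    return end;
}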

Time change happens twice a year - why should you bother? Anyway, can't you change the time format to include the time-change event? Or check whether the job ran during a time change by comparing against the fixed date and time at which the change occurs?

Get the local time in UTC at the start and end of the remote job. Get the remote job time and convert it to UTC at the start and end of the job. Convert the collection of "historic" times to GMT/UTC as stated in your original post. Keep this data together in a struct or class, and give the additional start/end times a clear name like LocalDLSValidation etc.
We must now check for the following scenarios (a sketch of this check follows the list):
1. The start and end time deltas between local and remote are within the allowed threshold (50 mins?).
This is the gold case. No modification is required to our collection of historical times.
2. Both the start and end deltas are outside the threshold.
This is the second simplest case. It means that we can add or subtract an hour to our entire collection of times.
3. The start delta is within the threshold, but the end delta is outside it.
This is our worst-case scenario, as it means the change occurred in the middle of our job. If the job lasts less than an hour, it will be easy to see which times in our collection need the hour added or subtracted.
If it is greater than 1 hour....mmmmm. This is where we run into problems.
4. The start delta is outside the threshold, but the end delta is within it.
According to the use case in the OP, this should not occur.
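A sketch of that classification; all names and the 50-minute threshold are assumptions taken from the list above:

#include <cstdlib>
#include <ctime>

const long kThresholdSec = 50 * 60; // allowed local/remote drift

long deltaSec(time_t a, time_t b) { return std::labs(static_cast<long>(a - b)); }

// Returns which of the four scenarios above applies.
int classify(time_t localStart, time_t remoteStart,
             time_t localEnd,   time_t remoteEnd)
{
    bool startOk = deltaSec(localStart, remoteStart) <= kThresholdSec;
    bool endOk   = deltaSec(localEnd,   remoteEnd)   <= kThresholdSec;
    if (startOk && endOk)   return 1; // gold case: keep times as-is
    if (!startOk && !endOk) return 2; // shift the whole collection +/- 1 hour
    if (startOk && !endOk)  return 3; // change occurred mid-job
    return 4;                         // should not occur per the OP
}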

How can I convert a UTC timestamp to local time, seconds past the hour?

I have a large data set whose timestamps are in UTC time in milliseconds. I'm synchronizing this data set with another whose timestamps are microseconds past the hour, so I need to convert the first to local time, in seconds past the hour.
Most of the similar questions I've read on this subject get UTC time from the time() function which gets the current time.
I've tried implementing the following which was pulled from C++ Reference.
The timestamp I'm trying to convert is a double, but I'm not sure how to actually use this value.
An example ts from my data set: 1512695257869
#include <cstdio>
#include <ctime>

#define MST (-7) // UTC offset for Mountain Standard Time, as in the reference example

int main ()
{
    double my_utc_ts; // value acquired from the data set
    time_t rawtime;
    struct tm * ptm;

    time ( &rawtime ); // getting current time
    //rawtime = my_utc_ts; // this is what I tried and is wrong, and results in a nullptr

    ptm = gmtime ( &rawtime );

    puts ("Current time around the World:");
    printf ("Phoenix, AZ (U.S.) : %2d:%02d\n", (ptm->tm_hour+MST)%24, ptm->tm_min);
    return 0;
}
After I'm able to convert it to a usable gmtime object or whatever, I need to get seconds past the hour... I think I'll be able to figure this part out if I can get the UTC timestamps to successfully convert, but I haven't thought this far ahead.
Guidance would be much appreciated. Thanks in advance.
Here is how you can convert a double representing milliseconds since 1970-01-01 00:00:00 UTC to seconds past the local hour using Howard Hinnant's free, open-source C++11/14/17 timezone library, which is based on <chrono>:
#include "date/tz.h"
#include <iostream>
int
main()
{
    using namespace std::chrono;
    using namespace date;
    double my_utc_ts = 1512695257869;
    using ms = duration<double, std::milli>;
    sys_time<milliseconds> utc_ms{round<milliseconds>(ms{my_utc_ts})};
    auto loc_s = make_zoned(current_zone(), floor<seconds>(utc_ms)).get_local_time();
    auto sec_past_hour = loc_s - floor<hours>(loc_s);
    std::cout << utc_ms << " UTC\n";
    std::cout << sec_past_hour << " past the local hour\n";
}
This outputs for me:
2017-12-08 01:07:37.869 UTC
457s past the local hour
If your local time zone is not an integral number of hours offset from UTC, the second line of output will be different for you.
Explanation of code:
We start with your input my_utc_ts.
The next line creates a custom std::chrono::duration that has double as the representation and milliseconds as the precision. This type-alias is named ms.
The next line constructs utc_ms which is a std::chrono::time_point<system_clock, milliseconds> holding 1512695257869, and represents the time point 2017-12-08 01:07:37.869 UTC. So far, no actual computation has been performed. Simply the double 1512695257869 has been cast into a type which represents an integral number of milliseconds since 1970-01-01 00:00:00 UTC.
This line starts the computation:
auto loc_s = make_zoned(current_zone(), floor<seconds>(utc_ms)).get_local_time();
This creates a {time_zone, system_time} pair capable of mapping between UTC and a local time, using time_zone as that map. It uses current_zone() to find the computer's current time zone, and truncates the time point utc_ms from a precision of milliseconds to a precision of seconds. Finally the trailing .get_local_time() extracts the local time from this mapping, with a precision of seconds, and mapped into the current time zone. That is, loc_s is a count of seconds since 1970-01-01 00:00:00 UTC, offset by your-local-time-zone's UTC offset that was in effect at 2017-12-08 01:07:37 UTC.
Now if you truncate loc_s to a precision of hours, and subtract that truncated time point from loc_s, you'll get the seconds past the local hour:
auto sec_past_hour = loc_s - floor<hours>(loc_s);
The entire computation is just the two lines of code above. The next two lines simply stream out utc_ms and sec_past_hour.
Assuming that your local time zone was offset from UTC by an integral number of hours at 2017-12-08 01:07:37 UTC, you can double-check that:
457 == 7*60 + 37
Indeed, if you can assume that your local time zone is always offset from UTC by an integral number of hours, the above program can be simplified by not mapping into local time at all:
sys_time<milliseconds> utc_ms{round<milliseconds>(ms{my_utc_ts})};
auto utc_s = floor<seconds>(utc_ms);
auto sec_past_hour = utc_s - floor<hours>(utc_s);
The results will be identical.
(Warning: Not all time zones are offset from UTC by an integral number of hours)
And if your database is known to be generated with a time zone that is not the computer's current local time zone, that can be taken into account by replacing current_zone() with the IANA time zone identifier that your database was generated with, for example:
auto loc_s = make_zoned("America/New_York", floor<seconds>(utc_ms)).get_local_time();
Update
This entire library is based on the std <chrono> library introduced with C++11. The types of utc_ms and loc_s above are instantiations of std::chrono::time_point, and sec_past_hour has type std::chrono::seconds (which is itself an instantiation of std::chrono::duration).
durations can be converted to their "representation" type using the .count() member function. For seconds, this representation type will be a signed 64 bit integer.
For a more detailed video tutorial on <chrono>, please see this Cppcon 2016 presentation. This presentation will encourage you to avoid using the .count() member function as much as humanly possible.
For example instead of converting sec_past_hour to a long so that you can compare it to other values of your dataset, convert other values of your dataset to std::chrono::durations so that you can compare them to sec_past_hour.
For example:
long other_data = 123456789; // past the hour in microseconds
if (microseconds{other_data} < sec_past_hour)
// ...
This snippet shows how <chrono> will take care of units conversions for you. This means you won't make mistakes like dividing by 1,000,000 when you should have multiplied, or spelling "million" with the wrong number of zeroes.
I'd start by converting the floating-point number to a time_t. A time_t is normally a count of seconds since an epoch (most often the POSIX epoch: midnight, 1 Jan 1970), so it sounds like that's going to take little more than a bit of fairly simple math.
But let's assume, for the sake of argument, that your input uses a different epoch: say, midnight, 1 Jan 1900 (and, as noted, it's in milliseconds instead of seconds).
So, to convert that to a time_t, you'd start by dividing by 1000 to convert from milliseconds to seconds. Then you'd subtract off the number of seconds between midnight 1 Jan 1900 and midnight 1 Jan 1970. Now you have a value you can treat as a time_t that the standard library can deal with¹.
Then use localtime to get that same time as a struct tm.
Then zero out the minutes and seconds from that tm, and use mktime to get a time_t representing that time.
Finally, use difftime to get the difference between the two.
1. For the moment, I'm assuming your standard library is based around a standard POSIX epoch, but that's a pretty safe assumption.
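Putting those steps together, here's a minimal sketch; it assumes the input really is milliseconds since the POSIX epoch (insert the 1900 adjustment if your epoch differs), and the variable names are illustrative:

#include <cstdio>
#include <ctime>

int main()
{
    double my_utc_ts = 1512695257869.0; // milliseconds since 1970-01-01 UTC

    // Milliseconds -> seconds; subtract an epoch offset here if needed.
    time_t t = static_cast<time_t>(my_utc_ts / 1000.0);

    // Local broken-down time, then zero minutes/seconds to get the hour mark.
    struct tm local = *localtime(&t);
    local.tm_min = 0;
    local.tm_sec = 0;
    time_t hour_start = mktime(&local);

    // difftime gives the seconds past the local hour.
    printf("%.0f seconds past the local hour\n", difftime(t, hour_start));
    return 0;
}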

What's the most efficient way to programmatically check if the year has changed

I am trying to capture packets from the NIC and save part of the packet payload as a string.
One part of the packet that must be stored is its log time, known as SysLog. Each packet has a SysLog in the following format:
Nov 01 03 14:50:25 TCP...[other parts of packet payload]
As can be seen, the packet SysLog has no year number. My program must run all year round, so I need to add the year number to the packet SysLog and convert the SysLog to epoch time. The final string that I have to store is like this:
1478175389-TCP, ….
I use the following piece of code to convert the SysLog to epoch time:
tm* tm_date = new tm();
std::string time = currentYear();
time += " ";
time += packet.substr(0, 18);
strptime(time.c_str(), "%Y %b %d %T", tm_date);
EpochTime = timegm(tm_date);
The currentYear method:
std::string currentYear() {
    std::stringstream now;
    auto tp = std::chrono::system_clock::now();
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(tp.time_since_epoch());
    size_t modulo = ms.count() % 1000;
    time_t seconds = std::chrono::duration_cast<std::chrono::seconds>(ms).count();
#if HAS_STD_PUT_TIME
#else
    char buffer[25]; // holds "2013-12-01 21:31:42"
    if (strftime(buffer, 25, "%Y", localtime(&seconds))) {
        now << buffer;
    }
#endif // HAS_STD_PUT_TIME
    return now.str();
}
The above operations are what I have to do for every packet. The packet rate is 100,000-1,000,000 pps, and the above piece of code is very time-consuming, especially currentYear().
One possible optimization is to remove the currentYear() method and save the year number as a constant value. But as said earlier, my program must run all year round, and as you know, 2017 is coming. We cannot change our binary at 31/12/2016 23:59:00, and we don't want to waste time calculating the year number either!
I need a more efficient way to calculate the current year number without running the calculation for each packet.
Is it possible? What is your suggestion?
Once you have obtained the current date and time, it shouldn't be too difficult to calculate the epoch time for midnight of next January 1st.
After calculating the expected epoch time for when the year rolls around, all you have to do going forward is compare it to the current time when making a log entry. If the current time hasn't reached the precalculated Jan 1 midnight time, you know that the year hasn't rolled around yet.
So you don't need to calculate the year for every packet at all. Just check the current time against the precalculated January 1st midnight time, which shouldn't change unless the politicians decide to change your timezone while all of this is running... A sketch of this precomputation is below.
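A minimal sketch of that idea (all names are mine; it uses POSIX localtime_r and the usual mktime semantics):

#include <ctime>

static time_t next_rollover = 0; // epoch time of next local Jan 1 midnight
static int current_year = 0;     // e.g. 2016

void precompute_rollover()
{
    time_t now = time(nullptr);
    struct tm tmv;
    localtime_r(&now, &tmv);
    current_year = tmv.tm_year + 1900;

    // Midnight, Jan 1 of the following year, in local time.
    tmv.tm_year += 1;
    tmv.tm_mon = 0;
    tmv.tm_mday = 1;
    tmv.tm_hour = tmv.tm_min = tmv.tm_sec = 0;
    tmv.tm_isdst = -1; // let mktime decide whether DST applies
    next_rollover = mktime(&tmv);
}

int year_for_packet(time_t packet_time) // per packet: one comparison
{
    if (packet_time >= next_rollover)
        precompute_rollover(); // year rolled over (or first call); redo once
    return current_year;
}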
The year is changed for log entries beginning with Jan, and only those log entries.
Log entries sometimes come out of order, or carry a timestamp saved during previous processing.
Attaching the year from the PC clock will give bad results, such as
2016 Dec 31 23:59:58 normal
2016 Jan 01 00:01:01 printing time placed in packet by remote device, remote clock is running a bit fast
2017 Dec 31 23:59:59 printing timestamp saved locally two seconds before logging occurred
2017 Jan 01 00:00:03 back to normal
You can't just concatenate the year of the local clock with the month...second of the log message. You have to assign the year that avoids large clock jumps.
Since you're trying to produce Unix time (seconds since the epoch) anyway, start by turning the log-message time into a "Julian" value (seconds since the start of the year) and test whether it is less than or greater than, say, 10 million (roughly 4 months). A month-granularity sketch of this rule is below.
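A hedged sketch of that jump-avoidance rule, reduced to month granularity for clarity (names are illustrative, not from the post):

// Pick the year for a log month so the resulting timestamp doesn't
// jump by nearly a year relative to the local clock.
int year_for_log(int log_month /*1-12*/, int local_year, int local_month)
{
    // January entry while the local clock still shows December: next year.
    if (log_month == 1 && local_month == 12)
        return local_year + 1;
    // December entry while the local clock already shows January: last year.
    if (log_month == 12 && local_month == 1)
        return local_year - 1;
    return local_year;
}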
You can "cache" the string you generate and only change it when the year changes. It may be though just a "little" improvement depending on what operations take the most time.
//somewhere
static int currentYear = 0;
static std::string yearStr = "";
//in your function
auto now = std::chrono::system_clock::now();
auto tnow = system_clock::to_time_t(now);
auto lt = localtime(&tnow); //or gmtime depends on your needs.
if(currentYear != lt.tm_year)
{
yearStr = std::to_string(lt.tm_year + 1900);
currentYear = t.tm_year;
}
return yearStr;
I am not sure whether static has any negative or positive effect on the performance of reading the string; a member variable may be better here due to cache locality. You have to test this.
If you use this from multiple threads, you have to use a mutex, which will probably reduce performance (again, you have to measure this).
First, you might consider having currentYear() return an int (e.g. 2016), probably using time(2), localtime_r(3), and the tm_year field... You'll then avoid making C++ strings.
Then, you speak of a high packet rate, so you probably have some event loop. You don't explain how it is done (hopefully you use some library à la libevent, or at least your own loop around poll(2)....), but you might compute the current year only once every tenth of a second in that event loop. Or have some other thread compute the current year once in a while (you'll probably need a mutex, or use std::atomic<int> as the type of the current year...). A sketch of the atomic variant is below.
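A sketch of the atomic variant (names are assumed; requires C++11 and POSIX localtime_r):

#include <atomic>
#include <ctime>

std::atomic<int> g_year{0};

void refresh_year() // call from the event loop, e.g. every tenth of a second
{
    time_t t = time(nullptr);
    struct tm tmv;
    localtime_r(&t, &tmv);
    g_year.store(tmv.tm_year + 1900, std::memory_order_relaxed);
}

int packet_year() // per packet: a relaxed atomic load, no syscalls
{
    return g_year.load(std::memory_order_relaxed);
}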

How to get nanoseconds from boost::chrono::high_resolution_clock::time_point?

I am new to Boost and Chrono. I am writing a logger that logs the timestamps of API calls, entry and exit. I tried using boost::xtime first, but it wasn't giving the high-resolution values I needed, so I am thinking about using Chrono. I declared a boost::chrono::high_resolution_clock::time_point x; variable for the timestamp and assigned it boost::chrono::high_resolution_clock::now();. Now I need to get the nanoseconds from this variable and put them in my log file (that's the requirement). So I tried to cast it with boost::chrono::duration_cast(x), but it just wouldn't let me do that: it apparently needs two parameters, and I only have one. Is there a way to get around this? Is it possible to create another time_point variable, assign zero to it, and use that? I tried assigning zero, but it's not working. Kindly help me out.
Thanks,
Sam
If tagged c++11, is there any reason not to use std::chrono?

// Using std::chrono
#include <chrono>
#include <iostream>

int main()
{
    auto start = std::chrono::high_resolution_clock::now();        // start timer
    /* do some work */
    auto diff = std::chrono::high_resolution_clock::now() - start; // get difference
    auto nsec = std::chrono::duration_cast<std::chrono::nanoseconds>(diff);
    std::cout << "it took: " << nsec.count() << " nanoseconds" << std::endl;
    return 0;
}
boost::chrono::duration_cast converts a duration into the specified units, but you've given it a boost::chrono::time_point, not a duration.
There's really no such thing as "the current time in nanoseconds". To get a duration, you need to specify the time since which you want to know how many nanoseconds have elapsed (an "epoch"). Different clocks will measure their time based on different epochs.
boost::chrono::system_clock (currently) uses the Unix epoch (midnight Jan 1, 1970) as its epoch, but it's not steady and it may not have the resolution you need (it's in nanoseconds on my Ubuntu box, but in 1/10,000,000ths of a second on my Windows box).
boost::chrono::high_resolution_clock uses boot up as its epoch, is steady, and measures time in nanoseconds on both boxes I tested on.
Boost also provides other clocks like process_cpu_clock that use other epochs and count in other units.
Thus you can get nanos since Jan 1, 1970 using system_clock, though it may not actually be nanosecond-accurate and may go backwards if the user changes the system time or the computer syncs with network time. Alternatively, you can get nanos since some other point in time (boot) using high_resolution_clock.
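Concretely, the fix for the question's compile error is to cast the result of time_since_epoch(), which is a duration, rather than the time_point itself. A minimal sketch:

#include <boost/chrono.hpp>
#include <iostream>

int main()
{
    boost::chrono::high_resolution_clock::time_point x =
        boost::chrono::high_resolution_clock::now();

    // time_since_epoch() yields a duration, which duration_cast accepts.
    boost::chrono::nanoseconds ns =
        boost::chrono::duration_cast<boost::chrono::nanoseconds>(x.time_since_epoch());
    std::cout << ns.count() << " ns since this clock's epoch\n";
}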

Strategy to reduce time of gettimeofday?

I'm writing a stats server to count the visit data of each day, so I have to clear the data in the db (memcached) every day.
Currently, I call gettimeofday to get the date and compare it with the cached date to check whether they are the same day, and this check happens frequently.
Sample code as below:
void report_visits(...) {
    std::string date = CommonUtil::GetStringDate(); // through gettimeofday
    if (date != static_cached_date_) {
        flush_db_date();
        static_cached_date_ = date;
    }
}
The problem is that I have to call gettimeofday every time a client reports visit information, and gettimeofday is time-consuming.
Any solution for this problem?
The gettimeofday system call (now obsolete in favor of clock_gettime) is among the shortest system calls to execute. The last time I measured it was on an Intel i486, where it took around 2us. The kernel-internal version is used to timestamp network packets, to update the timestamps in filesystem inodes on read, write, and chmod system calls, and the like. If you want to measure how much time you spend in the gettimeofday system call, just do several (the more, the better) pairs of calls, one immediately after the other, record the timestamp differences between them, and finally take the minimum of the samples as the proper value. That will be a good approximation to the ideal value; a sketch of the measurement is below.
Consider: if the kernel uses it to timestamp every read you do on a file, you can freely use it to timestamp each service request without serious penalty.
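A sketch of that measurement (the loop count and names are mine):

#include <sys/time.h>
#include <cstdio>

int main()
{
    long best_us = 1000000; // minimum observed delta, in microseconds
    for (int i = 0; i < 100000; ++i) {
        struct timeval a, b;
        gettimeofday(&a, NULL);
        gettimeofday(&b, NULL);
        long d = (b.tv_sec - a.tv_sec) * 1000000L + (b.tv_usec - a.tv_usec);
        if (d > 0 && d < best_us)
            best_us = d; // keep the smallest positive sample
    }
    printf("~%ld us per gettimeofday() call\n", best_us);
    return 0;
}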
Another thing: don't use (as suggested in other answers) a routine that converts the gettimeofday result to a string, as that indeed consumes a lot more resources. You can compare the timestamps directly (call them t1 and t2):
gettimeofday(&t2, NULL);
if (t2.tv_sec - t1.tv_sec > 86400) { /* 86400 is one day in seconds */
    erase_cache();
    t1 = t2;
}
or, if you want it to occur every day at the same time:
gettimeofday(&t2, NULL);
if (t2.tv_sec / 86400 > t1.tv_sec / 86400) {
    /* tv_sec / 86400 is the number of whole days since 1/1/1970, so
     * if it varies, a change of date has occurred */
    erase_cache();
}
t1 = t2; /* we assign outside the if, tying the update to the change of date */
You can even use the time() system call for this, as it has one-second resolution (and you don't need to cope with the usecs or with the overhead of struct timeval).
(This is an old question, but there is an important answer missing:)
You need to define the TZ environment variable and export it to your program. If it is not set, you will incur a stat(2) call on /etc/localtime... for every single call to gettimeofday(2), localtime(3), etc.
Of course these will get answered without going to disk, but the frequency of the calls and the overhead of the syscall is enough to make an appreciable difference in some situations.
Supporting documentation:
How to avoid excessive stat(/etc/localtime) calls in strftime() on linux?
https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/
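A one-time fix at startup might look like the sketch below; the ":/etc/localtime" value is the trick described in the linked blog post, and whether it helps depends on your libc, so treat it as an assumption to verify:

#include <cstdlib>
#include <ctime>

int main()
{
    // Pin the timezone once so glibc's time functions stop
    // re-stat()ing /etc/localtime on every call.
    setenv("TZ", ":/etc/localtime", 1); // or a zone name like "Europe/Berlin"
    tzset(); // read and cache the zone now

    // ... run the server ...
    return 0;
}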
To summarise:
The check, as you say, is done up to a few thousand times per second.
You're flushing a cache once every day.
Assuming that the exact time at which you flush is not critical and can be seconds (or even minutes perhaps) late, there is a very simple/practical solution:
void report_visits(...)
{
    static unsigned int counter;
    if ((counter++ % 1000) == 0)
    {
        std::string date = CommonUtil::GetStringDate();
        if (date != static_cached_date_)
        {
            flush_db_date();
            static_cached_date_ = date;
        }
    }
}
Just do the check once every N calls to report_visits(). In the above example N is 1000. With up to a few thousand checks per second, you'll be less than a second (or 0.001% of a day) late.
Don't worry about counter wrap-around; it only happens once in about 20+ days (assuming a few thousand checks/s maximum, with a 32-bit int) and does not hurt.

Timestamp in milliseconds gives me 10 digits in C++?

I am trying to retrieve the current time in milliseconds using the Boost library. Below is the code I am using:
boost::posix_time::ptime time = boost::posix_time::microsec_clock::local_time();
boost::posix_time::time_duration duration( time.time_of_day() );
std::cout << duration.total_milliseconds() << std::endl;
uint64_t timestampInMilliseconds = duration.total_milliseconds(); // will this work or not?
std::cout << timestampInMilliseconds << std::endl;
But this prints out a value like 17227676, around 10 digits. I am running my code on my Ubuntu machine, and I believed it would always be a 13-digit value. Isn't that so?
After computing the timestamp in milliseconds, I need to apply the formula below to it:
int end = (timestampInMilliseconds / (60 * 60 * 1000 * 24)) % 14;
But somehow I am not sure whether the timestampInMilliseconds I am getting is right or not.
First of all, should I be using boost::posix_time or not? I am assuming there might be a better way. I am running the code on my Ubuntu machine.
Update:
This piece of bash script prints out a timestampInMilliseconds that is 13 digits long:
date +%s%N | cut -b1-13
The problem here is that you use time_of_day() which returns (from this reference)
Get the time offset in the day.
So from the value you provided in the question I can deduce that you ran this program at 4:47 am.
Instead you might want to use, e.g., to_tm() to get a struct tm and construct your time in milliseconds from there.
Also note that the %s format to the date command (and the strftime function) is the number of seconds since the epoch, not the number of milliseconds.
If you look at the tm structure, you will see that it has the number of years (since 1900, so subtract 70 here), days into the year, and then hours, minutes, and seconds into the day. All these can be used to calculate the time in seconds easily.
And that "in seconds" is the problem here. If you look at e.g. the POSIX time function, you see that it
shall return the value of time in seconds since the Epoch
If you want accurate millisecond resolution you simply can't use ptime (where the p stands for POSIX). You either have to use system functions that return the time at higher resolution (like gettimeofday), or see e.g. this old SO answer.
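For completeness, a minimal sketch of the gettimeofday route to a 13-digit millisecond timestamp (POSIX only; names are mine):

#include <sys/time.h>
#include <cstdint>
#include <iostream>

int main()
{
    struct timeval tv;
    gettimeofday(&tv, NULL); // seconds + microseconds since the epoch

    // Combine into milliseconds since 1970-01-01 UTC: a 13-digit
    // value for current dates, e.g. 1512695257869.
    uint64_t ms = static_cast<uint64_t>(tv.tv_sec) * 1000 + tv.tv_usec / 1000;
    std::cout << ms << std::endl;
    return 0;
}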