Library to discover dates from text? - c++

I need to pull a date out of a string. Since not everyone uses the official ISO format when printing their dates, it is impractical to write a date parser for every possible date format that could be used, and I need to handle as many date formats as possible - I don't control the data and can't expect it to come in a specific format.
This seems like a problem that has probably already been solved ages ago, but my Google-fu is too weak to find the solution. :(
Does there already exist a C++ library that, given a string, will return the month, day, year, hour, minute, second, etc that is referenced in that string, if any?
Pseudocode:
string s1 = "There is an expected meteor shower this Thursday, "
            "August 15th 2013 at 4:39 AM.";
string s2 = "20130815T04:39:00";
date d1 = magicConverter(s1);
date d2 = magicConverter(s2);
assert(d1 == d2);

You might use the code from here, but you need to configure a mask that tells the code which time format is used. If you write a class routine that takes a mask and a string, extracts the time, and can print it in any format you like, you should be well prepared. You will have to check in more detail whether it also supports day names and month names. I got it to work in Python with a module providing a function that seems pretty much the same.
For more detail:
Please look at the example 2013-08-03 again. Nobody, and therefore no computer, can tell you whether this date falls in August or March without a mask saying YYYY-MM-DD or YYYY-DD-MM. This library may also only understand standard masks, so it might give you August in this case. But as you said, it can be any date notation, so it need not follow standards, and it could just as well mean March. Another possibility is to infer the format from context (e.g. a table with a column of dates all in the same format, by looking at which field increases), but that would also fail if you only have one day per month for a single year.
Another example... if I ask you 2013-05-04... to which month does it belong? You might tell me... April. I would reply "no, to the 4th of May", and vice versa for May and the 5th of April. If you tell me how to solve this puzzle with two possible solutions I would understand your downvote... please think before downvoting someone trying to help you.
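The mask idea above can be illustrated with the standard library's std::get_time, which parses a string according to a format you supply. This is only a sketch of the ambiguity, not the library the answer links to; parseWithMask is a hypothetical helper name, and a C++11 compiler is assumed.

```cpp
#include <cassert>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: parse `text` according to `mask` (strftime-style
// conversion specifiers). Returns false if the text does not match the mask.
bool parseWithMask(const std::string& text, const char* mask, std::tm& out) {
    assert(mask != nullptr);
    out = std::tm{};                      // start from an all-zero struct
    std::istringstream in(text);
    in >> std::get_time(&out, mask);      // fills the tm fields named in the mask
    return !in.fail();
}
```

With the same input "2013-08-03", the mask "%Y-%m-%d" yields tm_mon == 7 (August) and tm_mday == 3, while "%Y-%d-%m" yields tm_mon == 2 (March) and tm_mday == 8. The string alone cannot tell you which one was meant.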


How to convert imported date variable to the original format in Stata?

My original date variable is like this 19jun2015 16:52:04. After importing, it looks like this: 1.77065e+12
The storage type for the new imported variable is str11 and display format is %11s
I wonder how I can restore it back to date format?
William Lisowski gives excellent advice in his comment. For anyone using date-times in Stata, there is a minimal level of understanding without which confusion and outright error are unavoidable. Only study of the help so that your specific needs are understood can solve your difficulty.
There is a lack of detail in the question which makes precise advice difficult (imported -- from what kind of file? using which commands and/or third party programs?), except to diagnose that your dates are messed up and can only be corrected by going back to the original source.
Date strings such as "19jun2015 16:52:04" can be held in Stata as strings but to be useful they need to be converted to double numeric variables which hold the number of milliseconds since the beginning of 1960. This is a number that people cannot interpret, but Stata provides display formats so that displayed dates are intelligible.
Your example, when converted, is a number of the order of a trillion, but if it is held as a string with only 6 significant figures you have, at a minimum, lost detail irretrievably.
These individual examples make my points concrete. di is an abbreviation for the display command.
clock() (and also Clock(), not shown or discussed here: see the help) converts string dates to milliseconds since Stata's origin. With a variable, you would use generate double.
. di %23.0f clock("19jun2015 16:52:04", "DMY hms")
1750351924000
If displayed with a specific format, you can check that Stata is interpreting your date-times correctly. There are also many small variations on the default %tc format to control precise display of date-time elements.
. di %tc clock("19jun2015 16:52:04", "DMY hms")
19jun2015 16:52:04
The first example shows that even date-times which are recent dates (~2016) and in integer seconds need 10 significant figures to be accurate; the default display gives 4; somehow you have 6, but that is not enough.
. di clock("19jun2015 16:52:04", "DMY hms")
1.750e+12
You need to import the dates again. If you import them exactly as shown, the rest can be done in Stata.
See https://en.wikipedia.org/wiki/Significant_figures if that phrase is unfamiliar.
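The arithmetic behind the clock() result above can be cross-checked outside Stata. The following is a minimal C++ sketch, not anything Stata provides: msSince1960 and roundToSignificant are hypothetical helper names, and no leap seconds are counted, which matches the value shown above.

```cpp
#include <cassert>
#include <cstdint>

// Gregorian leap-year rule.
bool isLeapYear(int y) { return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0; }

// Milliseconds from 01jan1960 00:00:00 to the given date-time (year >= 1960).
std::int64_t msSince1960(int year, int month, int day,
                         int hour, int minute, int second) {
    static const int daysBefore[12] =
        {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};
    assert(year >= 1960 && month >= 1 && month <= 12);
    std::int64_t days = 0;
    for (int y = 1960; y < year; ++y) days += isLeapYear(y) ? 366 : 365;
    days += daysBefore[month - 1] + (day - 1);
    if (month > 2 && isLeapYear(year)) ++days;   // 29 Feb already passed
    std::int64_t secs = days * 86400 + hour * 3600 + minute * 60 + second;
    return secs * 1000;
}

// Round a positive integer to the given number of significant figures.
std::int64_t roundToSignificant(std::int64_t value, int digits) {
    assert(value > 0 && digits > 0);
    int length = 0;
    for (std::int64_t v = value; v > 0; v /= 10) ++length;
    if (length <= digits) return value;
    std::int64_t scale = 1;
    for (int i = 0; i < length - digits; ++i) scale *= 10;
    return (value + scale / 2) / scale * scale;
}
```

For 19jun2015 16:52:04 this gives 1750351924000. Rounded to 6 significant figures it becomes 1750350000000, which is 1924 seconds (about 32 minutes) earlier, so the time of day can no longer be recovered.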

Quantlib USDLibor, how does it knows which is the correct fixing date?

When pricing floating rate bonds, one needs to work with instances of the USDLibor class and add new fixings for a given date (which is the last reset date minus two business days). However, sometimes it complains at runtime, telling the user that the fixing for a specified date is not available (meaning that one has provided the fixing for the wrong date).
How do instances of USDLibor know which date is correct? I ask because maybe I can solve this problem by retrieving the correct date directly, the same way USDLibor gets it, instead of having to figure out the correct date myself.
The fixing date is two business days before the reset date, as you said (the implementation of the logic is in the FloatingRateCoupon::fixingDate() method, if you want to check it).
However, you might be using the wrong business days. USD LIBOR is fixed in London, so holidays are determined according to the UK calendar, not the US calendar.
In any case, once you have built the bond, you can ask the cashflows themselves for their fixing dates with something like this (which I haven't tested, so it might not even compile, but you should get the idea):
using namespace QuantLib;
Leg cashflows = bond.cashflows();
std::vector<Date> fixingDates;
for (Size i=0; i<cashflows.size(); ++i) {
    // Only floating-rate coupons have a fixing date; other cashflows cast to null.
    boost::shared_ptr<FloatingRateCoupon> coupon =
        boost::dynamic_pointer_cast<FloatingRateCoupon>(cashflows[i]);
    if (coupon)
        fixingDates.push_back(coupon->fixingDate());
}
after which the fixingDates vector will contain (not surprisingly) the fixing dates.

Getting current date and time in C++

I am doing a school project which basically records the in and out times of the employees of a particular company. When checking in or out, an employee should enter a unique key generated specially for him, so that no one else can check in or out for him. Then, depending on the employee's position (a worker, a manager, or something like that), his total working time for each day, week and month is calculated. The first shift runs from 8 am to 5 pm and the second shift from 3.30 pm to 2.30 am, with Saturday and Sunday off.
After a month the program reads the working time data stored in a file and calculates his pay (workers are paid per hour, managers are not). If an employee is consistently late, the details are forwarded to the HR account so that the necessary steps can be taken. This will help the company log the details of their employees' duty time and give enough detail to take action if someone is always late.
I'm doing this as a school project and it needn't be enterprise class, but I want the code to perform as it should. Also, I'm forced to use the old Turbo C++.
Now I'm stuck at the point where the employees' in and out times are logged.
This code does the job:
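#include <iostream.h>   // pre-standard stream header used by Turbo C++
#include <conio.h>      // clrscr(), getch()
#include <time.h>       // _strdate(), _strtime()

int main()
{
    clrscr();
    char dateStr[9], timeStr[9];   // 8 characters plus the terminating '\0'
    _strdate(dateStr);             // fills dateStr with the date as MM/DD/YY
    cout << " The current date is " << dateStr << '\n';
    _strtime(timeStr);             // fills timeStr with the time as HH:MM:SS
    cout << " The current time is " << timeStr << '\n';
    getch();                       // wait for a key press before exiting
    return 0;
}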
I saw it somewhere on the web, but can someone help me understand how it works?
I also saw another piece of code:
// For reference only: SYSTEMTIME as declared by <Windows.h>.
// Do not redeclare it in your own code.
typedef struct _SYSTEMTIME {
    WORD wYear;
    WORD wMonth;
    WORD wDayOfWeek;
    WORD wDay;
    WORD wHour;
    WORD wMinute;
    WORD wSecond;
    WORD wMilliseconds;
} SYSTEMTIME;

#include <Windows.h>
#include <stdio.h>
int main()
{
    SYSTEMTIME st;
    GetSystemTime(&st);   // current system time, expressed in UTC
    printf("Year:%d\nMonth:%d\nDate:%d\nHour:%d\nMin:%d\nSecond:%d\n",
           st.wYear, st.wMonth, st.wDay, st.wHour, st.wMinute, st.wSecond);
    return 0;
}
I think the second one is better, as it gives me not only the date but also the day of the week, so I can easily check for weekends.
So help me understand how these time functions work. Also, if you have any suggestions for my project, they are welcome.
You need to decide the format in which you want to store these clock "events", both for in-memory storage and manipulation and for persistent (on-disk) storage. When you use different formats for in-memory and on-disk (or in-database) storage, you would use methods to "marshal" or "serialize"/"de-serialize" the data (look up and read about these terms). You also want to decide whether these datetime "events" will be stored or displayed in UTC (Zulu time, GMT) or local time. You may find that storing these 'timestamps' in UTC is best, and then you need functions/methods/routines to convert human-readable, displayable values between local time and UTC.
Consider defining a "class" that has the above methods. Your class should have a method to record the current time, convert to human readable, and serialize/de-serialize the data.
Though printf works in C++, you might want to use the stream operators you have used in your first example, as they are more in the spirit of C++. Consider defining a parse method to de-serialize the data, and a to_string method (ruby uses to_s) to serialize (though reading up on stream operator overloading, and overloading the '<<' operator is more the C++ way).
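The class described above could look roughly like the following. This is a minimal sketch under stated assumptions: ClockEvent and its member names are hypothetical, it needs a standard C++11 compiler rather than Turbo C++, and it assumes (as on common platforms) that time_t counts seconds in UTC. The on-disk format is simply the number of seconds as text.

```cpp
#include <cassert>
#include <ctime>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical class: one check-in or check-out event, stored in UTC.
class ClockEvent {
public:
    explicit ClockEvent(std::time_t utcSeconds) : utc_(utcSeconds) {}

    // Record the current time.
    static ClockEvent now() { return ClockEvent(std::time(nullptr)); }

    // Serialize to the persistent format: seconds since the epoch, as text.
    std::string serialize() const {
        std::ostringstream out;
        out << static_cast<long long>(utc_);
        return out.str();
    }

    // De-serialize from the persistent format.
    static ClockEvent parse(const std::string& text) {
        std::istringstream in(text);
        long long seconds = 0;
        in >> seconds;
        assert(!in.fail());
        return ClockEvent(static_cast<std::time_t>(seconds));
    }

    std::time_t utcSeconds() const { return utc_; }

private:
    std::time_t utc_;
};

// Human-readable form via the stream operator, always shown in UTC.
std::ostream& operator<<(std::ostream& os, const ClockEvent& e) {
    std::time_t t = e.utcSeconds();
    char buf[32];
    std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S UTC", std::gmtime(&t));
    return os << buf;
}
```

Keeping the stored value in UTC and converting only for display means a change of time zone or daylight saving time on the machine does not alter the logged times.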
The first uses C library functions (though they are Microsoft extensions to the standard libc). The second uses the winapi function GetSystemTime.
Both will give the system time.
The first thing I'd look at is what the rest of your code uses. You should distinguish between what is winapi code, C code and C++ code; currently your question uses a mixture of all three.
The C++ method is preferred (if you are intending to write in C++), which would be to use the newer library. The C method is as per your first example, though without mixing libc functions with stream operators (a C++ feature). The winapi method is as per your second example (I'll forgive the use of printf, as FormatMessage is a pain).
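As an illustration of a C++ approach, here is a minimal sketch that assumes a C++11 compiler (so not Turbo C++) and the standard <chrono> and <ctime> headers. The function names are hypothetical. It reads the current local time and uses tm_wday for the weekend check the question asks about.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

// Current local time broken down into year, month, day, weekday and so on.
std::tm currentLocalTime() {
    std::time_t now =
        std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
    return *std::localtime(&now);   // copy out of the library's static buffer
}

// tm_wday counts days since Sunday: 0 = Sunday, 6 = Saturday.
bool isWeekend(const std::tm& t) {
    assert(t.tm_wday >= 0 && t.tm_wday <= 6);
    return t.tm_wday == 0 || t.tm_wday == 6;
}
```

Calling isWeekend(currentLocalTime()) then tells you whether a check-in is happening on a Saturday or Sunday.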

C++ and Windows: Is SYSTEMTIME always based on the Gregorian calendar?

I have a SYSTEMTIME struct. This struct may contain either a UTC time or a local time that was returned from a Windows API function at some prior point in time.
In C++ I am calculating the day of the year based on the SYSTEMTIME that a function returns, in other words how many days have passed since Jan 1. In order to do that I need to be mindful of the extra day during leap years, February 29. That would all be easy enough if I knew that the SYSTEMTIME is always based on the Gregorian calendar.
If a user in a foreign country uses some other calendar system wouldn't I have a problem calculating the day of the year? I can't seem to do this on my machine to test the theory, and I don't even know if it's plausible. Any Microsoft experts that can help me out here?
Maybe a better question would be is there already a Windows API function that calculates the day of the year based on a SYSTEMTIME? I can't find one.
The closest thing I could find searching is this javascript question, which is interesting but I think very different from what I'm asking. I won't see any replies to this question until tomorrow (Monday) so if there are any follow up questions I will answer them then.
Thanks!
edit: I found this article but it still doesn't answer the question:
OS level support for non-Gregorian calendars? - Sorting it all Out - Site Home - MSDN Blogs
In looking at GetSystemTime on MSDN, it says:
Retrieves the current system date and time. The system time is expressed in Coordinated Universal Time (UTC).
It seems that regardless, SYSTEMTIME works in the Gregorian calendar.
Best of luck, I hope that I was of help.
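Assuming the fields are Gregorian, as concluded above, the day of the year can be computed in portable C++. This is a minimal sketch: the function names are hypothetical, and the plain int parameters stand in for the wYear, wMonth and wDay members of a SYSTEMTIME.

```cpp
#include <cassert>

// Gregorian rule: every 4th year, except centuries not divisible by 400.
bool isGregorianLeapYear(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Returns 1 for January 1. month is 1..12 and day is 1..31,
// the same ranges SYSTEMTIME uses for wMonth and wDay.
int dayOfYear(int year, int month, int day) {
    static const int daysBeforeMonth[12] =
        {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};
    assert(month >= 1 && month <= 12);
    int n = daysBeforeMonth[month - 1] + day;
    if (month > 2 && isGregorianLeapYear(year)) ++n;   // count February 29
    return n;
}
```

If you want "days since Jan 1" in the zero-based sense, subtract 1 from the result, e.g. dayOfYear(st.wYear, st.wMonth, st.wDay) - 1.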

c++ function -- which date is first / last?

One of my C++ functions does some calculations based on the values of other variables. The program asks for a bunch of information, including the start date and end date for 2 separate events.
p1.start_date and p2.start_date; p1.end_date and p2.end_date each of which have a day, month and year stored inside.
I need to set combined.start_date to whichever happens earlier (p1.start_date or p2.start_date) and I need to set combined.end_date to whichever happens later.
Could I please have some help in getting this started? Here is what I have now: http://pastebin.com/huJprtHj.
At least assuming the dates involved are reasonably current [1], stuff the month/day/year into a struct tm and use mktime to convert to a time_t, then you can compare the two time_ts directly.
If you need/want to support a wider range of dates, you might consider Ray Gardner's Julian Date routines.
[1] At least in a typical case, dates from 1970 to at least 2038 will work.
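The mktime approach described above can be sketched as follows. The helper names toTimeT and isEarlier are hypothetical. Noon is used instead of midnight as an assumption to stay clear of daylight saving transitions, and the dates must fall in the range mktime supports on your platform.

```cpp
#include <cassert>
#include <ctime>

// Convert a calendar date to time_t (local time, at noon).
std::time_t toTimeT(int year, int month, int day) {
    std::tm t = std::tm();
    t.tm_year = year - 1900;   // struct tm counts years from 1900
    t.tm_mon = month - 1;      // and months from 0
    t.tm_mday = day;
    t.tm_hour = 12;
    t.tm_isdst = -1;           // let the library decide about daylight saving
    return std::mktime(&t);
}

// True if the first date is strictly before the second.
bool isEarlier(int y1, int m1, int d1, int y2, int m2, int d2) {
    std::time_t a = toTimeT(y1, m1, d1);
    std::time_t b = toTimeT(y2, m2, d2);
    assert(a != static_cast<std::time_t>(-1));
    assert(b != static_cast<std::time_t>(-1));
    return std::difftime(a, b) < 0;   // difftime(a, b) is a minus b, in seconds
}
```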
Generally, calculations based on dates can be done in two ways:
1. Convert the date into a "number of days since some fixed date (e.g. 1 Jan 1970)".
2. Use the date components (year, month, day).
If this is all you need to do, just comparing each part (with the "highest first") will work just fine - you just need a compare function that can tell you if date1 is less than date2.
The rest of your question should be really simple programming.
Edit: to clarify: for DATE calculations, days from a set date is fine. The system library has functions that use seconds [and in some systems, fractions of a second] for a complete time down to seconds. This is not required for comparing dates where the time of day is not involved.
Make this function. I'm guessing that your dates are stored in an object named Date, since you don't specify.
bool operator< ( const Date& left, const Date &right )
{
// ...
}
Then you can compare your date objects the same as if they were built-in types like int.
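One way to fill in that operator and use it for the combined range is sketched below. The Date and Period structs are hypothetical stand-ins, since the question's actual types are only on pastebin; the member names start_date and end_date follow the question. A C++11 compiler is assumed for std::tie.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>

// Hypothetical date type with the three fields the question mentions.
struct Date {
    int day;
    int month;
    int year;
};

// Compare the most significant part first: year, then month, then day.
bool operator<(const Date& left, const Date& right) {
    return std::tie(left.year, left.month, left.day) <
           std::tie(right.year, right.month, right.day);
}

// Hypothetical event type holding a start and an end date.
struct Period {
    Date start_date;
    Date end_date;
};

// Earliest start and latest end of the two events.
Period combine(const Period& p1, const Period& p2) {
    assert(!(p1.end_date < p1.start_date));
    assert(!(p2.end_date < p2.start_date));
    Period combined;
    combined.start_date = std::min(p1.start_date, p2.start_date);
    combined.end_date = std::max(p1.end_date, p2.end_date);
    return combined;
}
```

std::min and std::max only need operator<, so no further comparison operators are required for this particular task.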