Turn a string variable containing dates into a date variable in Stata

I have three variables: day, month and year. month is a string (the month name), while day and year are numeric. What is the fastest way to convert these into a proper date variable (displayed as day, month, year)? I know I could write code to convert the month name into a month number (which might be a little long) and then use the mdy() function. Is there a faster way, or a way to use the original month and year directly?
I appreciate any suggestions.

Consider concatenating the variables (converting the numeric ones to strings) and then converting the result with date() and the "DMY" mask:
Input Data
month day year
January 21 2016
February 13 2016
March 6 2016
Stata script
gen fulldate = date(string(day) + " " + month + " " + string(year), "DMY")
format fulldate %td
Result
month day year fulldate
January 21 2016 21jan2016
February 13 2016 13feb2016
March 6 2016 06mar2016
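As a cross-check of the concatenate-then-parse idea outside Stata, here is a minimal Python sketch. It is an illustration only, not part of the Stata answer: the helper name to_date is hypothetical, and it assumes English month names under Python's default C locale.

```python
from datetime import date, datetime

def to_date(day: int, month: str, year: int) -> date:
    """Join the parts with spaces and parse them in day-month-year order."""
    text = f"{day} {month} {year}"
    # %d accepts one- or two-digit days; %B is the full month name
    return datetime.strptime(text, "%d %B %Y").date()

# Same three rows as the input data above: (month, day, year)
rows = [("January", 21, 2016), ("February", 13, 2016), ("March", 6, 2016)]
full_dates = [to_date(d, m, y) for m, d, y in rows]
```

The pattern is the same as in the Stata line: build one string from the parts, then let a date parser interpret it with an explicit day-month-year order.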

Related

Extract Month from Date Power BI

I currently have a date field in the format 01 January 2022, 04 February 2022, etc.
I am trying to create a column that pulls out the month in MMM format (JAN, FEB).
I am using Month = Month('Current Week'[Created On])
This returns the month number (1, 2).
I tried adding a format, but that does not work because Month only accepts one argument.
What else can I do to pull out the month format that I am looking for?
Try this one:
Month String =
UPPER (
    FORMAT (
        DATEVALUE ( 'Current Week'[Created On] ),
        "mmm"
    )
)

Looking to create a cumulative column in SAS

Example Dataset:
record_id admin_dt_1
1 June 7th 2022
2 August 25th 2022
3 August 23rd 2022
4 July 8th 2022
5 August 5th 2022
My output already has a first column listing September 1st, 2nd, and so on through the 30th. I would like a second column showing the number of people eligible on each day in September. Eligible means anyone who is more than 28 days past their admin_dt_1. The column should also be cumulative: since there are 5 data points, the frequency column should reach 5 by the end of the month. It should look something like this:
Date Frequency eligible
September 1st 3
September 30th 5
data dose2eligible;
set request;
/*create September 1st to September 30th date*/
do date= '01sep2022'd to '30sep2022'd;
output;
end;
format date date9.;
run;
proc freq data=dose2eligible; table date; run;
You were very close. Count the number of days from admin_dt_1 to date, then create a 1/0 flag using the shortcut var = (boolean comparison). This assumes admin_dt_1 is stored as a numeric SAS date:
eligible = (date - admin_dt_1 > 28);
data dose2eligible;
set request;
/*create September 1st to September 30th date*/
do date= '01sep2022'd to '30sep2022'd;
/*1 if more than 28 days have passed since admin_dt_1, else 0*/
eligible = (date - admin_dt_1 > 28);
output;
end;
format date date9.;
run;
You can then count the number of eligible people on each date:
proc sql;
select date
, sum(eligible) as total_eligible
from dose2eligible
group by date;
quit;
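To check the expected counts independently of SAS, here is a minimal Python sketch of the eligibility rule described in the question (more than 28 days since admin_dt_1, using a strict > 28 comparison). The function name eligible_count is hypothetical; the five dates are the ones from the example dataset.

```python
from datetime import date, timedelta

# admin_dt_1 values from the example dataset
admin_dates = [
    date(2022, 6, 7),
    date(2022, 8, 25),
    date(2022, 8, 23),
    date(2022, 7, 8),
    date(2022, 8, 5),
]

def eligible_count(on: date, admins, wait_days: int = 28) -> int:
    """Number of people for whom more than wait_days have passed by `on`."""
    return sum((on - admin).days > wait_days for admin in admins)

# One row per day in September 2022, mirroring the SAS do-loop
september = [date(2022, 9, 1) + timedelta(days=i) for i in range(30)]
counts = {d: eligible_count(d, admin_dates) for d in september}
```

Because eligibility never lapses once reached, the per-day count is already cumulative: it can only stay the same or grow, and it reaches 5 by 30 September.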

PowerBI: Weekday or weekend

I have a table AW_Calendar
Date Year Month name Day Name weekend
Friday, January 1, 2016 2016 January Friday
Saturday, January 2, 2016 2016 January Saturday
and so on
I am trying to write a DAX expression that flags whether a day is a weekend or not:
Weekend = if(AW_Calendar[Day Name]='Saturday'||'Sunday',1,0)
but I am getting an error:
'Cannot find table 'Saturday''
What could be wrong?
It expects double quotes " instead of single quotes ' for strings. It uses the latter for table names.
Note that using the WEEKDAY function may be more efficient here:
Weekend = IF ( WEEKDAY ( AW_Calendar[Date], 2 ) > 5, 1, 0 )
When you use a logical operator, you have to repeat the full column reference in every condition. Also, the ' character is used for table names. To compare text values, use " instead of '.
This should work:
Weekend = if(AW_Calendar[Day Name]="Saturday"|| AW_Calendar[Day Name]="Sunday",1,0)
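The weekday-number approach can be cross-checked outside DAX. Below is a minimal Python sketch, not DAX: isoweekday() numbers Monday as 1 through Sunday as 7, which is the same numbering WEEKDAY uses with return type 2. The function name is_weekend is hypothetical.

```python
from datetime import date

def is_weekend(d: date) -> int:
    """Return 1 for Saturday or Sunday, 0 otherwise."""
    # isoweekday(): Monday=1 ... Saturday=6, Sunday=7
    return 1 if d.isoweekday() > 5 else 0

# First rows of the calendar table in the question
sample = [date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 3)]
flags = [is_weekend(d) for d in sample]
```

Working from the date itself avoids comparing day-name strings at all, so spelling or language differences in the Day Name column cannot affect the result.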

Stata: How to modify some values in a string variable but keep original values?

I am working with a very large dataset (1 million obs.).
I have a string date that looks like this
key seq startdate (string)
AD07 1 August 2011
AD07 2 June 2011
AD07 3 February 2004
AD07 4 November 2004
AD07 5 2001
AD07 6 January 1998
AD5c23 1 January 2014
AD5c235 2 February 2014
AD5c235 3 2014
These are self-reported employment dates.
Some did not report the month at which they started.
For AD07, I would like to change the date “2001” to “January 2001”. I cannot simply replace the whole value, because I want to keep the original year and only add a month to the string variable.
I started with:
levelsof start if start<="2016", local(levels)
which gives me all the years without the month from 1900 to 2016.
Now I would like to add "January" for the years without the month and keep original years.
How should I do that without using replace for every year? foreach loop?
You have a serious data quality problem if people are claiming to have started work in 1900 and every year since then! Even considering early employment starts and delayed retirement, that implies people older than the oldest established age.
Also, imputing "January" will impart bias as almost all job durations will be longer than they would have been. Real January starts will be correct, but no others: "June" or "July" or random months would make more obvious statistical sense.
That said, there is no loop needed here. You're asking for one line, say
replace startdate = "January " + startdate if length(trim(startdate)) == 4
or
replace startdate = "January " + startdate if real(startdate) < .
This assumes a follow-up step converting to numeric dates. The logic is that all year-only dates trim down to 4 characters, or (better) that feeding month names to real() yields missing, so real(startdate) < . is true only for year-only values.
That said in turn, creating a new variable is better practice than over-writing one. Also, consider throwing away the month detail. Is it needed?
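The rule itself is simple enough to check outside Stata. Here is a minimal Python sketch, given as an illustration only: the function name add_default_month is hypothetical, and the result is written to a new list rather than over the original values.

```python
def add_default_month(startdate: str, month: str = "January") -> str:
    """Prefix a month name when the value is a bare 4-digit year."""
    text = startdate.strip()
    if len(text) == 4 and text.isdigit():
        return f"{month} {text}"
    return startdate  # already has a month, leave untouched

# startdate values for key AD07 from the question
original = ["August 2011", "June 2011", "February 2004",
            "November 2004", "2001", "January 1998"]
imputed = [add_default_month(s) for s in original]
```

Passing a different month argument, for example "June", applies the same rule with a mid-year default.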
EDIT
You may have another problem if there are people with two or more jobs in the same year without month specifications. You don't want to impute all months in question as "January". You can check for such observations by
gen byte incomplete = real(startdate) < .
gen year = substr(trim(startdate), -4, 4)
bysort key year incomplete : gen byte multiplebad = incomplete & _N > 1

In a SAS application, set a date parameter and compare it against concatenated columns to retrieve data for a certain period

The date in the table is not stored in a single column: days are in the day column, months in the month column and years in the year column.
I concatenated the columns, used the concatenation in the where clause and compared it with the parameter I created, but I got no results.
I assume you are querying a date dimension table, and you want to extract the record that matches a certain date.
Solution:
I created a dates table to match with,
data dates;
input key day month year ;
datalines;
1 19 2 2018
2 20 2 2018
3 21 2 2018
4 22 2 2018
;;;
run;
In the where clause I split the date '20feb2018'd into its parts using the day, month and year functions. In SAS, a date literal is written as a quoted string followed by d, as in '20feb2018'd.
proc sql;
select * from dates
/*if you want to match todays' date: replace '20feb2018'd with today()*/
where day('20feb2018'd)=day and month('20feb2018'd)=month and year('20feb2018'd)=year;
quit;
If you need to compare a date built from day, month and year, use the mdy function in the where clause as shown below. It is not totally clear what you are looking for.
proc sql;
select * from dates
where mdy(month,day, year) between '19feb2018'd and '21feb2018'd ;
quit;
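The same idea, building one real date from the three columns and then filtering on a range, can be sketched in Python as a cross-check. This is an illustration only, not SAS: the helper name in_period is hypothetical, and the rows mirror the dates table created above.

```python
from datetime import date

# Same rows as the dates table: key, day, month, year
rows = [
    {"key": 1, "day": 19, "month": 2, "year": 2018},
    {"key": 2, "day": 20, "month": 2, "year": 2018},
    {"key": 3, "day": 21, "month": 2, "year": 2018},
    {"key": 4, "day": 22, "month": 2, "year": 2018},
]

def in_period(row: dict, start: date, end: date) -> bool:
    """Build a date from the parts and test it against an inclusive range."""
    built = date(row["year"], row["month"], row["day"])
    return start <= built <= end

selected = [r["key"] for r in rows
            if in_period(r, date(2018, 2, 19), date(2018, 2, 21))]
```

Comparing real dates rather than concatenated strings is what makes the range test reliable, since string comparison does not follow calendar order.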