I am attempting to round times to the nearest 15-minute interval in Stata, so for instance Dec 31, 2017 23:58 would become Jan 01, 2018 00:00. My times are stored (based on my understanding of the documentation) as the number of milliseconds since the start of 1960. So I thought this would do it:
gen round = round(datetime, 60000*15)
However, this doesn't quite work. For instance, Nov 03, 2017 19:45:27 becomes Nov 03, 2017 19:46:01, when I think it should become 19:45:00. Does anyone know what I'm missing here?
Let's show a worked example illustrating my comment that you need to store datetime values as double rather than float.
. clear
. set obs 1
number of observations (_N) was 0, now 1
. gen double datetime = clock("Nov 03, 2017 19:45:27","MDYhms")
. gen round_f = round(datetime, 60000*15)
. gen double round_d = round(datetime, 60000*15)
. format datetime round_f round_d %tc
. list, clean noobs
              datetime              round_f              round_d
    03nov2017 19:45:27   03nov2017 19:46:01   03nov2017 19:45:00
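To see why the float column drifts, here is a minimal C++ sketch of the same arithmetic outside Stata. The function names are made up for illustration, and it assumes IEEE 754 single and double precision. At roughly 1.8e12 milliseconds a 32-bit float can only represent multiples of 131,072 ms (about 131 seconds), so a correctly rounded value still gets nudged when it is stored as float.

```cpp
#include <cassert>
#include <cmath>

// Round a millisecond timestamp to the nearest multiple of `step` and keep
// the result in double precision (comparable to `gen double` in Stata).
double roundToStepDouble(double ms, double step) {
    assert(step > 0.0);
    return std::round(ms / step) * step;
}

// Same rounding, but the result is squeezed into a 32-bit float
// (comparable to a plain `gen`, which stores float by default).
float roundToStepFloat(double ms, double step) {
    return static_cast<float>(roundToStepDouble(ms, step));
}
```

With 1825357527000 ms (03nov2017 19:45:27) and a step of 900000 ms, the double version returns 1825357500000 (19:45:00), while the float version lands on 1825357561856, which is 61,856 ms later and displays as 19:46:01.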
Simple question, I think.
I have a checkin_date_time variable in a database with thousands of unique records.
Database
ID checkin_date_time
1 January 01, 2019 11:36:50
2 January 01, 2019 11:36:55
....
60000 December 31, 2019 11:36:50
60001 December 31, 2019 11:36:55
I would like to create a 'week' variable based on the checkin_date_time variable. So for example 'January 01, 2019 11:36:55' would equal week 1 and 'December 31, 2019 15:16:57' would equal week 52.
Desired Output
ID datetime Week
1 January 01, 2019 11:36:50 1
2 January 01, 2019 11:36:55 1
....
60000 December 31, 2019 11:36:50 52
60001 December 31, 2019 11:36:55 52
I tried using the following code, but the log reports that missing values were generated:
data testl;
set ed_tat;
week=week(checkin_date_time);
run;
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
WEEK() operates on a date value, not a datetime. Use DATEPART() to extract the date first and then determine the week.
week = week(datepart(checkin_date_time));
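The reason is the unit: a SAS datetime counts seconds since the start of 1960, while a SAS date counts days. As a rough illustration only, here is that conversion in C++; `datePart` is a hypothetical stand-in for DATEPART(), not SAS code.

```cpp
#include <cassert>

constexpr long long kSecondsPerDay = 86400;

// Hypothetical stand-in for DATEPART(): drop the time of day and convert
// seconds since 1960-01-01 00:00:00 into whole days since 1960-01-01.
long long datePart(long long sasDatetime) {
    assert(sasDatetime >= 0);  // this sketch only covers 1960 onward
    return sasDatetime / kSecondsPerDay;
}
```

For January 01, 2019 11:36:50 the datetime value is 1861961810 seconds, and `datePart` turns it into day 21550. Passing the raw seconds count to a function that expects a day count is what produces the missing values.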
I am trying to create a variance measure in PowerBI.
This is the data that I have,
Month Year MonthNo Value
Jan 2016 1 700
Feb 2016 2 800
March 2016 3 900
April 2016 4 750
.
.
Jan 2017 13 690
Feb 2017 14 730
And my variance for month number 7 should be like,
`{Avg(Value(4), Value(5), Value(6)) - Value(7)} / Value(7)`
i.e., (average of the last 3 months' values - current month value) / current month value
How to do this in Power BI? Thanks.
If it is okay for you to use a column, I believe you could add one with this code to get what you want:
Variance =
(
    CALCULATE(
        AVERAGEX(Sheet1, Sheet1[Value]),
        FILTER(
            FILTER(Sheet1, Sheet1[MonthNo] <= EARLIER(Sheet1[MonthNo]) - 1),
            Sheet1[MonthNo] >= EARLIER(Sheet1[MonthNo]) - 3
        )
    ) - Sheet1[Value]
) / Sheet1[Value]
You'll need to replace all instances of Sheet1 with the name of your table.
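If it helps to check the numbers by hand, the arithmetic behind that column can be sketched in C++. This is only an illustration of the formula, not DAX. The `variance` function and the choice to return no value for the first month are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// values[i] holds the Value for MonthNo i + 1.
// Returns (average of up to 3 preceding months - current) / current,
// or std::nullopt when there is no preceding month or current is zero.
std::optional<double> variance(const std::vector<double>& values, std::size_t monthNo) {
    assert(monthNo >= 1 && monthNo <= values.size());
    const std::size_t cur = monthNo - 1;
    const std::size_t first = cur >= 3 ? cur - 3 : 0;
    if (first == cur || values[cur] == 0.0) return std::nullopt;

    double sum = 0.0;
    for (std::size_t i = first; i < cur; ++i) sum += values[i];
    const double avg = sum / static_cast<double>(cur - first);
    return (avg - values[cur]) / values[cur];
}
```

With the sample values 700, 800, 900, 750, month 4 gives (800 - 750) / 750, roughly 0.067, and month 2 gives (700 - 800) / 800 = -0.125.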
I'm trying to improve the processing time of an existing for-loop in a *.jsl file that my classmates and I are using in our SAS programming course. My question: does SAS offer a PROC or a sequence of statements that can replicate a search-and-match condition? Or is there a way to go through unsorted files without going line by line looking for matching conditions?
Our current script file is below:
if( roadNumber_Fuel[n]==roadNumber_TO[m] &
fuelDate[n]>=tripStart[m] & fuelDate[n]<=TripEnd[m],
newtripID[n] = tripID[m];
);
I have 2 sets of data simplified below.
DATA1:
ID1 Date1
1 May 1, 2012
2 Jun 4, 2013
3 Aug 5, 2013
..
.
&
DATA2:
ID2 Date2 Date3 TRIP_ID
1 Jan 1 2012 Feb 1 2012 9876
2 Sep 5 2013 Nov 3 2013 931
1 Dec 1 2012 Dec 3 2012 236
3 Mar 9 2013 May 3 2013 390
2 Jun 1 2013 Jun 9 2013 811
1 Apr 1 2012 May 5 2012 76
...
..
.
I need to check a lot of iterations but my goal is to have the code
check:
Data1.ID1 = Data2.ID2 AND (Date1 >Date2 and Date1 < Date3)
My desired output dataset would be:
ID1 Date1 TRIP_ID
1 May 1, 2012 76
2 Jun 4, 2013 811
Thanks for any insight!
You can do range matches in two ways. First off, you can match using PROC SQL if you're familiar with SQL:
proc sql;
create table C as
select * from A
left join B
on A.id=B.id and A.date > B.date1 and A.date < B.date2
;
quit;
Second, you can create a format. This is usually the faster option if it's possible to do this. This is tricky when you have IDs, but you can do it.
First, create a new variable, ID+date. Dates are numbers around 18,000-20,000, so multiply your ID by 100,000 and you're safe.
Second, create a dataset from the range dataset where START=lower date plus id*100,000, END=higher date + id*100,000, FMTNAME=some string that will become the format name (must start with A-Z or _ and have A-Z, _, digits only). LABEL is the value you want to retrieve (Trip_ID in the above example).
data b_fmts;
set b;
start=id*100000+date1;
end =id*100000+date2;
label=value_you_want_out;
fmtname='MYDATEF';
run;
Then use PROC FORMAT with the CNTLIN= option to import the formats.
proc format cntlin=b_fmts;
quit;
Make sure your date ranges don't overlap - if they do this will fail.
Then you can use it easily:
data a_match;
set a;
trip_id=put(id*100000+date,MYDATEF.);
run;
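For anyone who wants to see the composite-key idea outside SAS, here is a minimal C++ sketch of the same lookup. Everything in it is illustrative: the names are made up, dates are written as yyyymmdd integers (so the multiplier has to be 100,000,000 rather than 100,000), and range bounds are treated as inclusive. It relies on the ranges not overlapping, just like the format approach.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <utility>

// Assumption: dates are yyyymmdd integers, so the multiplier must exceed 99991231.
constexpr long long kIdMultiplier = 100000000LL;

long long compositeKey(int id, int yyyymmdd) {
    return id * kIdMultiplier + yyyymmdd;
}

// Maps range start -> (range end, trip id). Ranges must not overlap.
using RangeTable = std::map<long long, std::pair<long long, int>>;

void addRange(RangeTable& table, int id, int from, int to, int tripId) {
    assert(from <= to);
    table[compositeKey(id, from)] = {compositeKey(id, to), tripId};
}

// Returns the trip whose range contains (id, date), if any.
std::optional<int> lookupTrip(const RangeTable& table, int id, int date) {
    const long long key = compositeKey(id, date);
    auto it = table.upper_bound(key);       // first range starting after key
    if (it == table.begin()) return std::nullopt;
    --it;                                   // last range starting at or before key
    if (key <= it->second.first) return it->second.second;
    return std::nullopt;
}

// The DATA2 rows from the question.
RangeTable sampleTable() {
    RangeTable t;
    addRange(t, 1, 20120101, 20120201, 9876);
    addRange(t, 2, 20130905, 20131103, 931);
    addRange(t, 1, 20121201, 20121203, 236);
    addRange(t, 3, 20130309, 20130503, 390);
    addRange(t, 2, 20130601, 20130609, 811);
    addRange(t, 1, 20120401, 20120505, 76);
    return t;
}
```

Looking up ID 1 on May 1, 2012 returns trip 76, ID 2 on Jun 4, 2013 returns 811, and ID 3 on Aug 5, 2013 finds nothing, which matches the desired output. Each lookup is a binary search instead of a scan over every range.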
I have an assignment that I've been working on and I'm stuck on the last function.
Use the function void Increment(int numDays = 1)
This function should move the date forward by the number of calendar days given in the argument. Default value on the parameter is 1 day. Examples:
Date d1(10, 31, 1998); // Oct 31, 1998
Date d2(6, 29, 1950); // June 29, 1950
d1.Increment(); // d1 is now Nov 1, 1998
d2.Increment(5); // d2 is now July 4, 1950
I do not understand how to do this.
void Date::Increment(int numDays = 1)
I'm stuck. I know how to make the function increment with the ++ operator, but I get confused when the function has to roll over from the last day of one month to the first day of the next, or land on a date in the following month; for example, Oct 31 to Nov 1, or June 29 to July 4. I can do July 5 to July 8, but the changing months confuse me.
You will need to store a list (or array) of how many days are in each month. If you add numDays to the current date and it becomes bigger than this, you need to increment the month as well.
For example, suppose we have a date object representing 29 March 2010. We call Increment(4) and add 4 to the day variable, ending up with 33 March 2010. We now check how many days March has and find that it's 31 (e.g. daysInMonth[3] == 31). Since 33 is greater than 31, we need to subtract 31 from 33 and increase the month, ending up with 2 April 2010.
You will need special handling for February in leap years (any year divisible by 4 and not divisible by 100 unless it's also divisible by 400) and for incrementing past the end of December.
30 days has September, April, June, and November. The rest have 31 days, except for February, which has 28 days except on a leap year (every 4 years, and 2008 was the last one) when it has 29 days.
This should be plenty to get you going.
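Putting those hints together, a minimal sketch might look like the following. The member names, the accessors, and the restriction to non-negative numDays are assumptions, since the rest of the class isn't shown. Note that the default argument goes on the declaration only; repeating `= 1` on the out-of-class definition will not compile.

```cpp
#include <cassert>

class Date {
public:
    Date(int month, int day, int year) : month_(month), day_(day), year_(year) {}

    void Increment(int numDays = 1);  // default argument lives here only

    int month() const { return month_; }
    int day() const { return day_; }
    int year() const { return year_; }

private:
    static bool isLeapYear(int y) {
        return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    }
    static int daysInMonth(int m, int y) {
        static const int kDays[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
        return (m == 2 && isLeapYear(y)) ? 29 : kDays[m - 1];
    }

    int month_, day_, year_;
};

void Date::Increment(int numDays) {
    assert(numDays >= 0);  // this sketch only moves forward
    day_ += numDays;
    // Carry overflow days into later months (and years) until the day fits.
    while (day_ > daysInMonth(month_, year_)) {
        day_ -= daysInMonth(month_, year_);
        if (++month_ > 12) {
            month_ = 1;
            ++year_;
        }
    }
}
```

The loop, rather than a single if, is what lets a large numDays cross several month or year boundaries in one call.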
First, construct a function like
int numDaysSinceBeginning( Date );
which counts the number of days elapsed from a well-known date (e.g. Jan 1 1900) to the specific Date.
Next, construct another function which converts that day delta back to a Date:
Date createDateWithDelta( int );
From your example,
Date d2(6, 29, 1950); // June 29, 1950
int d2Delta = numDaysSinceBeginning( d2 );
Date d2Incremented = createDateWithDelta( d2Delta + 5 ); // d2Incremented is July 4, 1950
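A rough sketch of those two helpers, assuming Jan 1 1900 as day 0 and a bare-bones Date with public month, day, and year fields (both are assumptions for illustration, not the asker's actual class):

```cpp
#include <cassert>

struct Date {
    int month, day, year;
    Date(int m, int d, int y) : month(m), day(d), year(y) {}
};

bool isLeapYear(int y) {
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

int daysInYear(int y) { return isLeapYear(y) ? 366 : 365; }

int daysInMonth(int m, int y) {
    static const int kDays[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return (m == 2 && isLeapYear(y)) ? 29 : kDays[m - 1];
}

// Days elapsed from Jan 1 1900 (day 0) to the given date.
int numDaysSinceBeginning(const Date& d) {
    assert(d.year >= 1900);
    int n = 0;
    for (int y = 1900; y < d.year; ++y) n += daysInYear(y);
    for (int m = 1; m < d.month; ++m) n += daysInMonth(m, d.year);
    return n + d.day - 1;
}

// Inverse of numDaysSinceBeginning: peel off whole years, then whole months.
Date createDateWithDelta(int delta) {
    assert(delta >= 0);
    int y = 1900;
    while (delta >= daysInYear(y)) delta -= daysInYear(y++);
    int m = 1;
    while (delta >= daysInMonth(m, y)) { delta -= daysInMonth(m, y); ++m; }
    return Date(m, delta + 1, y);
}
```

The appeal of this design is that Increment no longer needs any rollover logic of its own: it converts to a day count, adds numDays, and converts back.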