Converting daily data to weekly data in SAS

I have the DAILY returns of industry portfolios in SAS.
I would like to calculate the WEEKLY returns.
The daily returns are in percentages, so I think the weekly return should just be the sum of the daily returns within each week.
The obvious problem I am facing is that weeks can contain different numbers of trading days.
The table I have in SAS is in the following format:
INDUSTRY_NUMBER DATE DAILY_RETURN
Any help would be greatly appreciated.
I have tried this:
proc expand data=Day_result
out=Week_result from=day to=week;
Industry_Number Trading_Date;
convert Value_weighted_return / method=aggregate observed=total;
run;
The daily data is in Day_result. When I remove the third line (the one naming Industry_Number and Trading_Date), i.e.
proc expand data=Day_result
out=Week_result from=day to=week;
convert Value_weighted_return / method=aggregate observed=total;
run;
This works in the sense that it aggregates to weekly values, but it does so for the whole table rather than for each category.
So if I have 40 categories, I want the weekly returns for each category.
The second set of code provides a single weekly return across all categories combined.
EXAMPLE DATA:
data have;
format trading_date date9.;
infile datalines dlm=',';
input trading_date:ddmmyy10. industry_number value_weighted_return;
datalines;
19/01/2000,1, -0.008
20/01/2000,1, 0.008
23/01/2000,1, 0.008
24/01/2000,1, -0.007
25/01/2000,1, -0.009
26/01/2000,1, 0.008
27/01/2000,1, -0.008
30/01/2000,1, 0.003
31/01/2000,1, -0.001
01/02/2000,1, 0.004
02/02/2000,2, -0.008
03/02/2000,2, -0.005
06/02/2000,2, -0.004
07/02/2000,2, -0.009
08/02/2000,2, 0.002
09/02/2000,2, 0.006
10/02/2000,2, 0.008
13/02/2000,2, 0.008
14/02/2000,2, 0.002
15/02/2000,2, 0.01
16/02/2000,2, -0.008
;
run;

Sort your data by INDUSTRY_NUMBER and Trading_Date, then use INDUSTRY_NUMBER as a by-group and Trading_Date as the time ID variable.
proc sort data=have;
by industry_number trading_date;
run;
Next, convert your data into a time-series to remove any time gaps. Set any missing days as the previous value since it does not change on those trading days (e.g. weekends, bank holidays, etc.).
proc timeseries data=have
out=have_ts;
by industry_number;
id trading_date interval=day
setmissing=previous
accumulate=average
;
var value_weighted_return;
run;
Finally, take the time-series output and convert it from day to week. Since you are using weights, you may want to use average rather than total.
proc expand data=have_ts
out=have_ts_week
from=day
to=week
;
by industry_number;
id trading_date;
convert Value_weighted_return / method=aggregate observed=average;
run;
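If you only need per-industry weekly totals and do not need the gaps filled, a plain PROC SQL aggregation is a possible alternative. This is a minimal sketch, not the method above: it assumes the example data set have, groups on the Sunday-based week start from intnx, and the names week_sum, week_start, weekly_sum, weekly_compound and n_days are made up for illustration. The compounded column assumes the returns are stored as decimal fractions (0.008 = 0.8%) and are all greater than -1.

```sas
proc sql;
    create table week_sum as
    select industry_number,
           /* first day (Sunday) of the week that contains trading_date */
           intnx('week', trading_date, 0, 'b') as week_start format=date9.,
           /* simple sum of the daily returns in that week */
           sum(value_weighted_return) as weekly_sum,
           /* compounded return: product of (1 + r) minus 1 */
           exp(sum(log(1 + value_weighted_return))) - 1 as weekly_compound,
           /* number of trading days that fed the week */
           count(value_weighted_return) as n_days
    from have
    group by industry_number, calculated week_start
    order by industry_number, week_start;
quit;
```

Because the grouping is done per industry and per week, weeks with fewer trading days simply contribute fewer rows to the sum; n_days shows how many.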

Related

How to determine the frequency of a time series?

For an if-query I would like to create a macro variable that holds the frequency of the underlying time series. I tried to get some descriptive statistics from proc timeseries. Unfortunately, they do not include the frequency.
The underlying time series does not necessarily include all periods of the frequency. From my point of view, that rules out a simple count with proc sql.
Does anyone know an efficient procedure to determine the frequency without computing it on my own (in a data step or proc sql code)?
You can use the OUTSPECTRA= option to help learn what kind of seasonality the series has. Based on the data, give PROC TIMESERIES your best guess of the interval (day, month, etc.). In the example below, we know we want to forecast by month but we do not know what seasonality the series has.
proc timeseries data=sashelp.air outspectra=spectra;
id date interval=month;
var air;
run;
Plot this spectra dataset with proc sgplot:
proc sgplot data=spectra;
where NOT missing(period);
series x=period y=p;
run;
This line will naturally increase as the period gets longer, but we're looking for bumps in the line. Notice the large bump somewhere between 0 and 24 months and the several smaller bumps before it. Let's zoom in on that by filtering out the longer periods.
proc sgplot data=spectra;
where period < 24 and NOT missing(period);
series x=period y=p;
run;
It's pretty clear that there is a strong seasonality of 12, with potentially smaller cycles at 3 and 6 months. From this plot, we can conclude that our seasonality should be 12.
You can turn this into a macro to help identify the season if you'd like. Simply search for the largest bump within a reasonable timeframe. In our case we'll choose 36 because we do not suspect that we have any seasonality > 36 months.
proc sort data=spectra;
by period;
run;
data identify_period;
set spectra;
by period;
where NOT missing(period) AND period LE 36;
delta = abs(p - lag(p) );
run;
proc sql;
select period, max(delta) as max_delta
from identify_period
having delta = max(delta)
;
quit;
Output:
PERIOD max_delta
12 163712
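To use the result in later macro logic, the same query can write the period into a macro variable instead of printing it. This is a minimal sketch that assumes the identify_period data set built above; the macro variable name season is made up for illustration, and if several periods tie for the largest delta only the first one is kept.

```sas
proc sql noprint;
    /* store the period with the largest jump in a macro variable */
    select period into :season trimmed
    from identify_period
    having delta = max(delta);
quit;

%put &=season;
```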
I don't know how to do this without data step logic, but you could wrap the data step in a macro as follows:
%macro get_frequency(data,date_variable,output_variable);
proc sort data=&data (keep=&date_variable) out=__tempsorted;
by &date_variable;
run;
data _null_;
set __tempsorted end=lastobs;
prevdate=lag(&date_variable);
if _n_ > 1 then do;
interval_number+1;
interval_total + (&date_variable - prevdate);
end;
if lastobs then do;
average_interval = interval_total/interval_number;
frequency = round(365.25/average_interval);
call symputx("&output_variable", frequency, 'G'); /* 'G' = global scope, so the variable still exists after the macro ends */
end;
run;
proc datasets nolist;
delete __tempsorted;
run;
quit;
%mend get_frequency;
Then you can call the macro on your original data set timeseries to examine the variable date and create a new macro variable frequency1 with the required frequency.
data work.timeseries;
input date date. value;
format date date9.;
datalines;
01Oct18 3000
01Nov18 4000
01Dec18 6500
01Jan19 7000
01Feb19 4000
01Mar19 5000
01Apr19 7500
01May19 4800
01Jun19 4500
;
run;
%get_frequency(timeseries,date,frequency1)
%put &=frequency1;
This seems to work on your sample data, where each date is the first of the month. If your dates are evenly distributed (e.g. always near month start/end, or always near mid-month) then this macro should work OK. Obviously, if you have multiple observations per date, it will give a completely incorrect frequency.

Missing values in VARMAX

I have a dataset with visitors and weather variables. I'm trying to forecast visitors based on the weather variables. Since the dataset only covers visitors in season, there are missing values and gaps in every year. Running proc reg in SAS works fine, but the issue comes when I use proc VARMAX: I cannot run the regression due to the missing values. How can I tackle this?
proc varmax data=tivoli4 printall plots=forecast(all);
id obs interval=day;
model lvisitors = rain sunshine averagetemp
dfebruary dmarch dmay djune djuly daugust doctober dnovember ddecember
dwednesday dthursday dfriday dsaturday dsunday
d_24Dec2016 d_05Dec2013 d_24Dec2017 d_24Dec2014 d_24Dec2015 d_24Dec2019
d_24Dec2018 d_24Sep2012 d_06Jul2015
d_08feb2019 d_16oct2014 d_15oct2019 d_20oct2016 d_15oct2015 d_22sep2017 d_08jul2015
d_20Sep2019 d_08jul2016 d_16oct2013 d_01aug2012 d_18oct2012 d_23dec2012 d_30nov2013 d_20sep2014 d_17oct2012 d_17jun2014
dFrock2012 dFrock2013 dFrock2014 dFrock2015 dFrock2016 dFrock2017 dFrock2018 dFrock2019
dYear2015 dYear2016 dYear2017
/p=7 q=2 Method=ml dftest;
garch p=1 q=1 form=ccc OUTHT=CONDITIONAL;
restrict
ar(3,1,1)=0, ar(4,1,1)=0, ar(5,1,1)=0,
XL(0,1,13)=0, XL(0,1,14)=0, XL(0,1,13)=0, XL(0,1,27)=0, XL(0,1,38)=0, XL(0,1,42)=0;
output lead=10 out=forecast;
run;
As with any forecast, you will first need to prepare your time series. Run your data through PROC TIMESERIES first to fill in or impute missing values. The most appropriate imputation depends on your variables. The code below will:
Sum lvisitors by day and set missing values to 0
Set missing values of averagetemp to average
Set missing values of rain, sunshine, and your variables starting with d to 0 (assuming these are indicators)
Code:
proc timeseries data=have out=want;
id obs interval = day
setmissing = 0
notsorted
;
var lvisitors / accumulate=total;
crossvar averagetemp / accumulate=none setmissing=average;
crossvar rain sunshine d: / accumulate=none;
run;
Important Time Interval Consideration
Depending on your data, this could bias your error rate and estimates since you always know no one will be around in the off-season. If you have many missing values for off-season data, you will want to remove those rows.
Since PROC VARMAX does not support custom time intervals, you can instead create a simple sequential time identifier. Alternatively, you can build a format with proc format that maps time_id back to a date, and convert time_id at the end.
data want;
set have;
time_id+1;
run;
proc varmax data=want;
id time_id interval=day;
...
output lead=10 out=myforecast;
run;
data myforecast;
merge myforecast
want(keep=time_id date)
;
by time_id;
run;
Or, if you made a format:
data myforecast;
set myforecast;
date = put(time_id, timeid.);
drop time_id;
run;
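The timeid. format used above has to be created first. One possible way is to build it from the data with a CNTLIN= control data set. This is a minimal sketch that assumes want contains both time_id and a SAS date variable named date (as the merge above also assumes); the data set name timeid_fmt is made up for illustration.

```sas
/* One row per time_id: START is the value to format, LABEL is the text shown */
data timeid_fmt;
    set want(keep=time_id date);
    retain fmtname 'timeid' type 'N';   /* numeric format named TIMEID. */
    start = time_id;
    label = put(date, date9.);          /* e.g. 24DEC2016 */
    keep fmtname type start label;
run;

/* Load the control data set as a format */
proc format cntlin=timeid_fmt;
run;
```

Note that put() returns text, so date in the step above becomes a character variable. Wrap it as input(put(time_id, timeid.), date9.) if you need a numeric SAS date.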

SAS: Change time series frequency (Proc Expand)

I have a stock price dataset with observations in milliseconds (variables: STOCK, DATE, TIME (in ms), PRICE). It is sorted by stock, date, and time.
I now need a dataset where the frequency is 1-second intervals. The price variable should be the prevailing price at each second.
I tried proc expand:
proc expand data=have out=want to=second;
id stock date time; run;
But it does not work that way.
Any help is appreciated!
M
got it: proc timeseries with id time and interval=second works!
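The comment above does not show the final code, so the following is only a sketch of what such a call might look like. It assumes time is a SAS time value (seconds since midnight, with milliseconds as the fractional part) and that have is sorted by stock, date and time. The accumulate= and setmissing= choices are assumptions picked to give the prevailing price, not something the poster confirmed.

```sas
proc timeseries data=have out=want;
    by stock date;
    id time interval=second
            accumulate=last       /* keep the last price observed within each second */
            setmissing=previous;  /* carry the prevailing price into seconds with no trade */
    var price;
run;
```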

SAS: Calculating rolling skew of previous 30 days

I want to calculate the skew of a timeseries (stock returns) of the previous 30 days on a rolling basis (thus, getting daily values).
Dataset looks like:
Stock date month year return
1SF7 1/07/2016 7 2016 0.94
1SF7 5/07/2016 7 2016 0.91
1SF7 6/07/2016 7 2016 0.82
1SF7 7/07/2016 7 2016 0.95
..........
So far I have tried proc means, which only calculates month-end skewness:
proc means data=have; by year month;
output out= want (drop= _freq_ _type_ ) skew(return)=Skew_monthly;
run;
Does anyone have an idea for rolling skewness? I know there is a question here that asks for rolling skewness, but its answer only outputs one value per 30 days, and I want daily values.
Thankful for any input!
Marc
Thanks, I managed it with the array version:
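data want;
array p{0:29} _temporary_; /* circular buffer holding the last 30 returns */
set have;
by stock; /* the dataset above names this variable Stock */
if first.stock then call missing(of p{*}); /* reset the window at the start of each stock */
p{mod(_n_,30)} = return; /* overwrite the oldest slot */
skew = skewness(of p{*}); /* needs at least 3 non-missing values in the window */
run;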

sas aggregate weekly, monthly

For the data set below (the actual one is several thousand rows long) I would like SAS to aggregate the income by machine at daily (there are many income lines per machine every day), weekly, and monthly levels. The week starts on Monday and the month starts on the 1st in any given year. Is there straightforward code for this? Any help is appreciated.
MachineNo Date income
1 01Jan2012 1500
1 02Jan2012 2000
1 27Aug2012 300
2 02Jan2012 1200
2 15Jun2012 50
3 03Mar2012 1000
4 08Apr2012 500
proc expand and proc timeseries are excellent tools for accumulation and aggregation to different frequencies of series. You can combine both with by-group processing to convert to any time period that you need.
Step 1: Sort by MachineNo and Date
proc sort data=have;
by MachineNo Date;
run;
Step 2: Find the min/max end dates of your series for date alignment
The format=date9. option is important. For whatever reason, some SAS/ETS and HPF procedures require date literals for certain arguments.
proc sql noprint;
select min(date) format=date9.,
max(date) format=date9.
into :min_date,
:max_date
from have;
quit;
Step 3: Align each MachineNo by start/end date, and accumulate days per MachineNo
The below code will get you aligned daily accumulation, remove duplicate days per machine, and set Income on any missing days to 0. This step will also guarantee that your series has equal time intervals per by-group, allowing you to run hierarchical time-series analyses without violating the equal-spaced interval assumption.
proc timeseries data=have
out=want_day;
by MachineNo;
id date interval=day
align=both
start="&min_date"d
end="&max_date"d;
var income / accumulate=total setmiss=0;
run;
Step 4: Aggregate aligned Daily to Weekly shifted by 1 day, Monthly
SAS time intervals can be both multiplied and shifted. Since the standard week starts on a Sunday, we want to shift by 1 day to have it start on a Monday.
Standard Week
2 3 4 5 6 7 1
Mon Tue Wed Thu Fri Sat Sun
Shifted
1 2 3 4 5 6 7
Mon Tue Wed Thu Fri Sat Sun
Intervals follow the format:
TimeInterval<Multiplier>.<Shift>
The standard shift interval is 1. For all intents and purposes, consider 1 as 0: 1 means it's unshifted. 2 means it's shifted by 1 period. Thus, for a week to start on a Monday, we want to use the interval Week.2.
proc expand data=want_day
out=want_week
from=day
to=week.2;
by MachineNo; /* want_day holds one series per machine */
id date;
convert income / method=aggregate observed=total;
run;
Step 5: Convert Week to Month
proc expand data=want_week
out=want_month
from=week.2
to=month;
by MachineNo; /* keep each machine's series separate */
id date;
convert income / method=aggregate observed=total;
run;
In case you don't have a license for SAS/ETS here's another way.
For the monthly data you can format the date in a proc means output.
I think WeekW. starts on Monday but it may not be in a format you want, so you'll need to create a new variable for week first if you wanted to use this method.
proc means data=have nway noprint;
class machineno date;
format date monyy7.;
var income;
output out=want sum(income)=income;
run;
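For the weekly totals without SAS/ETS, one option is to create the week variable with intnx and the shifted interval week.2, then summarize on it. This is a minimal sketch that assumes the have data set from the question; the names have_week, week_start and want_week_means are made up for illustration.

```sas
/* Tag each row with the Monday that starts its week */
data have_week;
    set have;
    week_start = intnx('week.2', date, 0, 'b');
    format week_start date9.;
run;

/* Sum income per machine and week */
proc means data=have_week nway noprint;
    class machineno week_start;
    var income;
    output out=want_week_means(drop=_type_ _freq_) sum(income)=income;
run;
```

Unlike the PROC TIMESERIES approach, this only returns weeks that actually have data. Weeks with no income lines for a machine are absent rather than set to 0.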