SAS iterpolation : alternative to proc expand - sas

From annual data :
I would like to create the the data per day but I can't use the proc expand because the SAS ETS is not available.
Thank you for your suggestions.

Something like this is a basic approach perhaps:
create a list of dates for interpolation
merge have data (shown above, not included in code below)
Plot to see if linear pattern, (looks somewhat exponential/curved)
run linear regression, saving predicted values
plot interpolated values against actual values
data years;
do date='30Jun2017'd to '30Jun2022'd;
output;
end;
run;
data have;
merge years have;
by date;
format date date9.;
run;
proc sgplot data=have;
series x=date y=px_last;
run;
proc reg data=have plots;
model px_last = date;
output out=pred p=predicted_value;
run;
proc sgplot data=pred;
series x=date y=predicted_Value;
scatter x=date y=px_last;
run;

Related

Overlay the average trend on group by trends using Proc sgplot

I want to create a line graph that includes the overall trend of a disease rate and the specific trends for males and females. I use the following code for to create the group by trends. How to add he average trend to this line graph. Thanks for your help.
proc sgplot data=have ;
vline year/response=disease_rate group=sex stat=mean datalabel=disease_rate ;
yaxis values=(0,1) label="Percentage";
run;
Here's an example of summarizing it and then displaying it on the graph. There are more than one way to do this though, this is just one.
data have;
set sashelp.heart(in=a);
year=round(2021-ageAtStart, 10);
disease_rate= status="Dead";
run;
proc means data=have mean noprint;
class sex year;
types sex sex*year;
var disease_rate;
output out=summary_stats mean=average_value;
run;
proc sort data=summary_stats;
by sex year;
run;
data graph_data;
merge summary_stats(where=(_type_=2) rename=average_value=mean_sex_year)
summary_stats(where=(_type_=3) rename=average_value = mean_sex);
by sex;
format mean_sex: percent12.1;
run;
proc sgplot data=graph_data ;
*where year > 1990;
vline year/response=mean_sex_year group=sex stat=mean datalabel=mean_sex_year ;
vline year/response=mean_sex group=sex stat=mean datalabel=mean_sex ;
run;
Use series instead of vline so that you can overlay a regression on top of it to get an average trend line. For example:
proc sql;
create table have as
select date
, region
, sum(sale) as sale
from sashelp.pricedata
group by region, date
order by region, date
;
quit;
proc sgplot data=have;
series x=date y=sale / group=region;
reg x=date y=sale / group=region;
xaxis fitpolicy=rotatethin;
run;

How do I put conditions around Proc Freq statements in SAS?

I have the following statement
Proc Freq data =test;
tables gender;
run;
I want this to generate an output based on a condition applied to the gender variable. For example - if count of gender greater than 2 then output.
How can I do this in SAS?
Thanks
If you mean an output dataset, you can put a where clause directly in the output dataset options.
Proc Freq data =sashelp.class;
tables sex/out=sex_freq(where=(count>9));
run;
I'm not aware of how you can accomplish this only using proc freq but you can redirect the output to a data set and then print the results.
proc freq data=test;
tables gender / noprint out=tmp;
run;
proc print data=tmp;
where count > 2;
run;
Alternatively you could use proc summary, but this still requires two steps.
proc summary data=test nway;
class gender;
output out=tmp(where=(_freq_ > 2));
run;
proc print data=tmp;
run;

how to print dataset without formats in SAS

I ´have a dataset with formats attached to it and I dont want to remove the formats from the dataset and when I use proc freq or proc print, I want the original values and not the formats attached.
Proc print data=mylib.data;
run;
is there any format=no option?
proc freq data=mylib.data;
tables gender;
format?????
run;
You can remove a format by specifying a null format on the PROC strep:
proc freq data=mylib.data ;
tables gender ;
format _ALL_ ;
run ;
_ALL_ is a list of all variables in the dataset.

SAS: Printing monthly and weekly average

How can I print (and export to file) monthly and weekly average of value? The data is stored in a library and the form is following:
Obs. Date Value
1 08FEB2016:00:00:00 29.00
2 05FEB2016:00:00:00 29.30
3 04FEB2016:00:00:00 29.93
4 03FEB2016:00:00:00 28.65
5 02FEB2016:00:00:00 28.40
(...)
3078 08MAR2004:00:00:00 32.59
3079 05MAR2004:00:00:00 32.75
3080 04MAR2004:00:00:00 32.05
3081 03MAR2004:00:00:00 31.82
EDIT: I somehow managed to get the monthly data but I'm returning average for each month separately. I would to have it done as one result, namely Month-Average+export it to a file or a data set. And still I have no idea how to deal with weeks.
%macro printAvgM(start,end);
proc summary data=sur1.dane(where=(Date>=&start
and Date<=&end)) nway;
var Value;
output out=want (drop=_:) mean=;
proc print;
run;
%mend printAvgM;
%printAvgM('01jan2003'd,'31jan2003'd);
EDIT2: Here is my code, step by step:
libname sur 'C:\myPath';
run;
proc import datafile="C:\myPath\myData.csv"
out=SUR.DANE
dbms=csv replace;
getnames=yes;
run;
proc sort data=sur.dane out=sur.dane;
by Date;
run;
libname sur1 "C:\myPath\myDB.accdb";
run;
proc datasets;
copy in=sur out=sur1;
select dane;
run;
data sur1.dane2;
set sur1.dane;
date2=datepart(Date);
format date2 WEEKV11.;
run;
The last step results in NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables. and the format of dane2 variable is DATETIME19..
Ok, it's small enough to handle easily then. I would recommend first converting your datetime variable to a date variable using DATEPART() function and then use a format within PROC MEANS. You can look up the WEEKU and WEEKV formats to see if they meet your needs. The code below should be enough to get you started. You could do the monthly without the date conversion, but I couldn't find a weekly format for the datetime variable.
*Fake data generated;
data fd;
start=datetime();
do i=1 to 3000000 by 120;
datetime=start+(i-1)*30;
var=rand('normal', 25, 5);
output;
end;
keep datetime var;
format datetime datetime21.;
run;
*Get date variable;
data fd_date;
set fd;
date_var = datepart(datetime);
date_month = put(date_var, yymon7,);
Date_week = put(date_var, weekv11.);
run;
*Monthly summary;
proc means data=fd_date noprint nway;
class date_var;
var var;
output out=want_monthly mean(var)=avg_var std(var)=std_var;
format date_var monyy7.;
run;
*Weekly summary;
proc means data=fd_date noprint nway;
class date_var;
var var;
output out=want_weekly mean(var)=avg_var std(var)=std_var;
format date_var weekv11.;
run;
Replace date_var with the new monthly and weekly variables. Because these are character variables they won't sort properly.

How to construct histograms with unequal class widths in SAS?

I am trying to create histograms in sas with the help of proc univariate in sas. But it gives me histograms with equal class widths. Suppose i want to have a histogram with first class interval from 1 to 10 and second class interval from 10 to 100.
I tried using-
proc univariate data=sasdata1.dataone;
var sum;
histogram sum/ midpoints=0 to 10 by 10 10 to 100 by 90 ;run;
But this does not work. What is the correct way of doing this?
You can't do it with UNIVARIATE as far as I know, but any of the SGPLOT/GPLOT/etc. procedures will work; just bin your data into a categorical variable and VBAR that variable.
If you're okay with frequencies (not percents), this would work:
data test;
set sashelp.class;
do _t = 1 to floor(ranuni(7)*20);
age=age+floor(ranuni(7)*10);
output;
end;
run;
proc format;
value agerange
low-12 = "Pre-Teen"
13-14 = "Early Teen"
15-18 = "Teen"
19-21 = "Young Adult"
22-high = "Adult";
quit;
ods graphics on;
ods preferences;
proc sgplot data=test;
format age agerange.;
vbar age;
run;
I believe if you need percents, you'd want to PROC FREQ or TABULATE your data first and then SGPLOT (or GPLOT) the results.
I did find a macro that can be used to create histograms with unequal endpoints.
The code can be found in the NESUG 2008 proceedings